World Community Grid Forums
Thread Status: Active Total posts in this thread: 42
Author |
|
Drago75
Cruncher Joined: May 17, 2020 Post Count: 25 Status: Offline |
Now that is a great heads up! Thanks for the info
|
||
|
PecosRiverM
Veteran Cruncher The Great State of Texas Joined: Apr 27, 2007 Post Count: 1054 Status: Offline |
Have 2 systems up and waiting. 256 threads ea
|
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1061 Status: Offline |
"Have 2 systems up and waiting. 256 threads ea"
That's an impressive thread count -- as per your post on the first page of the other thread about this Beta, it might help towards your [5-year] Beta badge!
However, please leave enough work for some other systems -- otherwise how will the Beta test find out if there are hosts where the application won't run properly?
Cheers - Al
P.S. I don't care about badges, one way or the other; I just want to see reliability proven...
[Edit 2 times, last edit by alanb1951 at Apr 8, 2025 1:43:46 PM] |
||
|
catchercradle
Senior Cruncher England Joined: Jan 16, 2009 Post Count: 158 Status: Offline |
I have a number of beta tasks (MAM) running across 3 hosts (one physical machine: host OS Ubuntu, guests Ubuntu and Win10). On the Linux guest, a couple have just errored out when resumed after a suspension meant to let the system run a bit cooler. Running 3 hosts was to try to get a few more ARP tasks, which are not available in abundance. I got caught out by the number I downloaded.
|
||
|
f300
Cruncher Joined: Jan 14, 2014 Post Count: 2 Status: Offline |
I had 10 MAM running (plus 2 ARP) but needed to reboot. I suspended BOINC, gave it 30 seconds, then exited before rebooting. On restarting BOINC, all 10 MAM errored with Out Of Memory, e.g.:
https://www.worldcommunitygrid.org/contribution/workunit/705959011
- Unhandled Exception Record
- Reason: Out Of Memory (C++ Exception) (0xe06d7363) at address 0x000000000101AF29
The ARP resumed fine. Having just rebooted with nothing else running, unless the memory usage for the tasks spiked massively I can't actually have been out of memory (32GB). |
||
|
Link64
Advanced Cruncher Joined: Feb 19, 2021 Post Count: 142 Status: Offline |
It seems MAM tasks do not survive restarting, in particular those which seem to run fine.
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12561 Status: Offline |
I have 8 Windows versions on each of 2 i7-3770s.
Machine 1 is halfway through 3 beta jobs in 12 hours - no visible problems.
Machine 2 has completed 2 jobs in the same time, but the 3rd job has been on 0.500% for hours and the next 2 jobs are slowly progressing, forecasting 10 days to finish.
All 3 continuing jobs have had a user abort already. Should I do the same?
I have restricted both machines in app_config to 3 beta30, 3 arp1 & 2 mcm1.
Mike |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1993 Status: Offline |
Looks like most of the BETA are turning into a turd. Out of a total of 83, 4 have been returned in a day: 2 valid, 2 PVa.
On several others, the WUs are stopping at either 0.5% or at 2.6%, with a remaining runtime estimate ranging from 8 days to a whopping 52 days...
Will let all run until the morning before I abort those that don't seem to make any reasonable progress...
Ralf |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 837 Status: Offline |
I aborted a couple at the 15-18 hour mark after they stopped making progress; they were at about 15% and 50% but had stalled.
The only ones that have actually finished successfully (and validated) ran at about 6%/minute, i.e. very fast overall. I have one that is stuck at 21 hours at 41.845% (zero checkpoint). I want to assume it's stuck and abort it, but I'll let it run overnight.
[Edit 1 times, last edit by hchc at Apr 26, 2025 5:25:52 AM] |
||
|
geophi
Advanced Cruncher U.S. Joined: Sep 3, 2007 Post Count: 110 Status: Offline |
@Mike Gibson
Could you post your app_config.xml for WCG that you are using now? I'm trying to restrict the number of beta that can run at one time. Thank you. George |
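For reference, here is a minimal sketch of how such a file could be generated. It is not Mike's actual app_config.xml. It assumes the standard BOINC app_config.xml layout (one `<app>` element per application, with `<name>` and `<max_concurrent>`) and uses the app names and limits mentioned earlier in the thread (3 beta30, 3 arp1, 2 mcm1). The helper name `build_app_config` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(limits):
    """Return app_config.xml text that caps concurrent tasks per app.

    `limits` maps a BOINC app name to the maximum number of tasks
    of that app allowed to run at the same time.
    """
    root = ET.Element("app_config")
    for app_name, max_tasks in limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_tasks)
    ET.indent(root, space="  ")  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Assumption: names and limits as quoted in the thread, not a verified file.
LIMITS = {"beta30": 3, "arp1": 3, "mcm1": 2}

if __name__ == "__main__":
    print(build_app_config(LIMITS))
```

The printed XML would be saved as app_config.xml in the World Community Grid project directory inside the BOINC data directory. The client should pick it up after "Read config files" in BOINC Manager or after a client restart. To limit only the beta, keep just the beta30 entry.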
||
|