| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 264
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The download servers are serving lots of large files right now and clients are having to back off to help allow files to be served as quickly as possible. Wouldn't it be better not to give out work that the servers can't cope with? As it stands, your systems are allocating work and then no actual work gets done if even a single file can't be downloaded. The first of my machines to get one of the new batch has just been able to start, almost an hour later ...
18-Nov-2013 14:44:38 [World Community Grid] Scheduler request succeeded: got 1 new tasks
[ ... ]
18-Nov-2013 15:29:53 [World Community Grid] Starting task MCM1_0000084_8385_0 using mcm1 version 726 |
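For anyone who wants to measure this delay from their own event log, here is a minimal sketch in Python. It assumes the timestamp layout seen in the two log lines quoted above (`DD-Mon-YYYY HH:MM:SS`) and an English locale for the month abbreviation. The helper names `parse_log_time` and `delay_between` are made up for illustration and are not part of BOINC.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, taken from the two log lines quoted in the post.
# %b is locale-dependent; this sketch assumes English month abbreviations.
LOG_TIME_FORMAT = "%d-%b-%Y %H:%M:%S"


def parse_log_time(line: str) -> datetime:
    """Parse the leading date and time fields of an event-log line."""
    # The stamp is the first two whitespace-separated fields of the line.
    date_part, time_part = line.split()[:2]
    return datetime.strptime(f"{date_part} {time_part}", LOG_TIME_FORMAT)


def delay_between(request_line: str, start_line: str) -> timedelta:
    """Return the time elapsed between two event-log lines."""
    return parse_log_time(start_line) - parse_log_time(request_line)


request = ("18-Nov-2013 14:44:38 [World Community Grid] "
           "Scheduler request succeeded: got 1 new tasks")
start = ("18-Nov-2013 15:29:53 [World Community Grid] "
         "Starting task MCM1_0000084_8385_0 using mcm1 version 726")

delay = delay_between(request, start)
print(delay)  # 0:45:15
```

For the two lines quoted, the gap between the scheduler reply and the task start works out to 45 minutes 15 seconds.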
||
|
|
duanebong
Advanced Cruncher Singapore Joined: Apr 25, 2009 Post Count: 134 Status: Offline
|
I had 1 WU that was stuck on 100% for 16 hours (the original estimate by BOINC was that the WU should take 2.5 hours). After restarting BOINC, the counter went back down to 0 hours, so I aborted it. Seems rare - just 1 out of several hundred WUs I've processed since the project started. Or perhaps not rare enough, seeing that quite a few people are encountering the problem.
|
||
|
|
rebirther
Cruncher Germany Joined: Nov 19, 2005 Post Count: 29 Status: Offline
|
The problem is back:
20/11/2013 19:59:35 World Community Grid [error] Error reported by file upload server: Server is out of disk space
20/11/2013 19:59:44 World Community Grid [error] Error reported by file upload server: can't write file /usr/local/boinc/data/upload/2c6/MCM1_0000127_6577_1_0: No space left on server |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
---------------------------------------- [Edit 1 times, last edit by Former Member at Nov 20, 2013 7:10:33 PM] |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline
|
Got this on Ubuntu 13.10 X64
<core_client_version>7.2.28</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.26_i686-pc-linux-gnu -SettingsFile MCM1_0000115_9535.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing wcg_learn_limit = 500000 Running
wcgrid_mcm1_7.26_i686-pc-linux-gnu: SVMModel.cpp:326: virtual void BinarySVM::trainModel(const std::string&, unsigned int, int*): Assertion `modelID >= 0' failed.
SIGABRT: abort called
Stack trace (2 frames):
[0x80d8c0d]
[0xf76ff400]
Exiting...
</stderr_txt>
]]>
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
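A side note on the three numbers in "code 193 (0xc1, -63)": they appear to be the same 8-bit exit status shown as unsigned decimal, hexadecimal, and signed decimal. The sketch below only illustrates that arithmetic. The function name `exit_code_views` is hypothetical, and nothing here explains why the task failed (the assertion in stderr is the real clue).

```python
def exit_code_views(code: int) -> tuple[int, str, int]:
    """Return one exit status as (unsigned, hex, signed 8-bit) values."""
    # Keep only the low 8 bits, the usual width of a process exit status.
    unsigned = code & 0xFF
    # Reinterpret the same 8 bits as a two's-complement signed value.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed


print(exit_code_views(193))  # (193, '0xc1', -63)
```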
||
|
|
TimAndHedy
Senior Cruncher Joined: Jan 27, 2009 Post Count: 267 Status: Offline
|
I have a work unit that has been running for more than 25 hours.
It shows as 100% complete but keeps running anyway.
MCM1_0000014_2298_2 -- PC4770K In Progress 11/19/13 15:10:15 11/22/13 15:10:15 0.00 / 0.00 0.0 / 0.0
It looks like two others attempted it and resulted in no reply. How should I proceed? Is any information needed on it? |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I wonder if it could possibly be a leftover version 7.24 WU? If so, that would explain it.
I wouldn't expect the new version 7.26 jobs to behave in this way. |
||
|
|
breathesgelatin
Advanced Cruncher Joined: Aug 5, 2006 Post Count: 117 Status: Offline
|
So virtually all of my workunits are still erroring out. I've only had one validate through so far. I fiddled with some settings - updated BOINC client, turned on the 'LAIM' setting, and also allowed tasks to run 24/7. Still can't figure out what to do to fix this issue. Some of the tasks error out immediately, with 0.00 CPU time, others go for about 3 hours and then error out.
Any thoughts? Anything I could post to help troubleshoot? The tasks are erroring out on my husband's computer, so I don't always watch what's going on with that machine very closely. I could try to keep a better eye on it tomorrow to see what happens. |
||
|
|
TimAndHedy
Senior Cruncher Joined: Jan 27, 2009 Post Count: 267 Status: Offline
|
"I wonder if it could possibly be a leftover version 7.24 WU? If so, that would explain it. I wouldn't expect the new version 7.26 jobs to behave in this way."
Yes, it is a 7.24. It has been hanging around a while. It may have run for 10 days on a couple of other systems. It's hard to tell. |
||
|
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline
|
Falconet,
Looks like a wingman finished successfully, so that rules out a workunit issue. If you see any more errors like this, let me know.
Thanks,
armstrdj |
||
|
|
|