| World Community Grid Forums
Thread Status: Active Total posts in this thread: 26
Amr Adam
Advanced Cruncher Egypt Joined: Aug 13, 2012 Post Count: 74 Status: Offline Project Badges:
|
I got stuck last night, but updating the project solved it this morning.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello Amr,
I got stuck a couple of times as well. I ended up aborting the particular work unit that seemed to hang while downloading. Luckily, I am able to check on my computers fairly regularly and clear the jams. However, it has been frustrating at times, as it means that 12 CPUs have been down for a length of time. It is valuable computational time, after all, especially on this critically important project. The only time I run at less than 100% is on my home machine, which generates heat like crazy when all 12 CPUs are going full tilt and the fans are running at 100%. I usually have to knock the CPU time setting back to 40% overnight just to slow the fans down and reduce the heat. At all other times it's at 100%. The work machine is always at 100%, since it is in an air-conditioned section of the building. [Edit 3 times, last edit by Former Member at Sep 22, 2012 5:43:56 AM] |
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
@kismetix,
If the noise is a serious issue for you, I would recommend reconsidering the fans installed in your hosts. The fans bundled with a CPU are usually too noisy (especially in the high-frequency band). Switching to professional 12 cm CPU fans would help a lot. Likewise, the power supply can contribute significantly to the noise. I have several hosts at home running at 100% day and night all year round, including during hot summer weeks, and the noise is acceptable. Initially (some years ago) I did not pay attention to the noise, and the result was really stressful. Now I am more careful and the hosts are quiet. Enjoy, Yves --- PS: Crunching machines need to be cleaned regularly (every 2 or 3 months at least); this reduces the noise and increases the host's life expectancy. |
||
|
|
Byteball_730a2960
Senior Cruncher Joined: Oct 29, 2010 Post Count: 318 Status: Offline Project Badges:
|
I would say that this issue has passed now.
I have checked 80 computers that have run about 330 core-days in the last 4 days and found only one stuck WU, which was a leftover from earlier last week. Do we know why it happened? I hope it was particular to that batch of HCMD2 and won't spread to other projects. Checking 80 computers was a pain, but sometimes you need to take a little pain for the love of WCG. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Speculation: It may be related to the file corruption that occurred on 6/7 September. The techs did their best to identify all files that could have been affected, but with all the HCMD2 work already loaded long before, those files had a higher chance of still causing download issues weeks later. Other sciences have a shorter pre-load period, so any file issues were flushed out earlier.
|
||
|
|
slakin
Advanced Cruncher Joined: Jul 4, 2008 Post Count: 79 Status: Offline Project Badges:
|
I had one of these WUs stuck downloading and caught it just in time to avoid cores sitting idle.
CMD2_2533-MYH6.clustersOccur-2QOV_G.clustersOccur_80_321259_322379_321657_321898_1-- 640 User Aborted 9/23/12 10:55:15 9/26/12 02:55:16 0.00 0.0 / 0.0
I noticed someone else has also aborted it. |
||
|
|
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1460 Status: Offline Project Badges:
|
Just got a WU with a download error:
CMD2_2517-2QOU_D.clustersOccur-2QOV_G.clustersOccur_0_16212_17998_16789_17192
[Edit 1 times, last edit by rilian at Oct 5, 2012 1:09:45 PM] |
||
|
|
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1460 Status: Offline Project Badges:
|
And another stuck download: a machine dedicated to WCG was idle for a couple of days because of ONE stuck 2QOV WU:
CMD2_2522-2PA2_A.clustersOccur-2QOV_G.clustersOccur_0_10693_19462_17387_19462_4--
[Edit 2 times, last edit by rilian at Oct 6, 2012 8:25:10 AM] |
||
|
|
Sabrina Tarson
Advanced Cruncher United States Joined: Jun 27, 2012 Post Count: 149 Status: Offline Project Badges:
|
I am also having this problem.
|
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
This problem has already been addressed in other threads.
You should abort the task as well as the related download. |
||
|
|
|