Total posts in this thread: 26
This topic has been viewed 7376 times and has 25 replies
JCMarsh [U.S. Army]
Cruncher
Joined: Feb 8, 2012
Post Count: 5
Status: Offline
HCMD2 failing to download

I have a problem with HCMD2 failing to download work units. It tries, but the WUs get stuck at 0% downloaded, time out, and end up in Project Backoff. I've tried restarting BOINC, restarting the machine, aborting the transfer and task, resetting the project, cursing and throwing things, all to no avail.

Windows 7 Pro (64 bit)
BOINC 7.0.28(x64)

I have another machine (32 bit XP pro with 7.0.28) that is crunching and downloading just fine, also only on HCMD2.

From Event Log of offending machine...
9/17/2012 9:24:53 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_P.clustersOccur.pdb.gzb: transient HTTP error
9/17/2012 9:24:53 AM | World Community Grid | Backing off 7 min 4 sec on download of hcmd2.2QOV_P.clustersOccur.pdb.gzb
9/17/2012 9:24:57 AM | | Project communication failed: attempting access to reference site
9/17/2012 9:24:58 AM | | Internet access OK - project servers may be temporarily down.
9/17/2012 9:25:53 AM | World Community Grid | Started download of hcmd2.2QOV_G.clustersOccur.pdb.gzb
9/17/2012 9:25:55 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_G.clustersOccur.pdb.gzb: transient HTTP error
9/17/2012 9:25:55 AM | World Community Grid | Backing off 4 min 58 sec on download of hcmd2.2QOV_G.clustersOccur.pdb.gzb
9/17/2012 9:25:58 AM | | Project communication failed: attempting access to reference site
9/17/2012 9:25:59 AM | | Internet access OK - project servers may be temporarily down.
9/17/2012 9:30:54 AM | World Community Grid | Started download of hcmd2.2QOV_G.clustersOccur.pdb.gzb
9/17/2012 9:30:56 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_G.clustersOccur.pdb.gzb: transient HTTP error
9/17/2012 9:30:56 AM | World Community Grid | Backing off 13 min 53 sec on download of hcmd2.2QOV_G.clustersOccur.pdb.gzb
9/17/2012 9:30:59 AM | | Project communication failed: attempting access to reference site
9/17/2012 9:31:01 AM | | Internet access OK - project servers may be temporarily down.
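
For anyone who wants to check a longer Event Log for the same pattern, here is a minimal sketch in Python. It assumes the log lines look exactly like the excerpt above (pipe-separated, with the message "Temporarily failed download of <file>: <reason>"). The function names, the sample list, and the threshold are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches lines shaped like the Event Log excerpt above:
# "<timestamp> | <project> | Temporarily failed download of <file>: <reason>"
FAILED = re.compile(
    r"\|\s*(?P<project>[^|]*?)\s*\|\s*Temporarily failed download of "
    r"(?P<file>\S+): (?P<reason>.+)$"
)


def count_failed_downloads(log_lines):
    """Return a Counter mapping file name -> number of temporary failures."""
    failures = Counter()
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            failures[match.group("file")] += 1
    return failures


def repeatedly_failing(log_lines, threshold=2):
    """Return the sorted file names that failed at least `threshold` times.

    The threshold is an arbitrary illustrative choice, not a BOINC setting.
    """
    counts = count_failed_downloads(log_lines)
    return sorted(name for name, n in counts.items() if n >= threshold)


# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = [
    "9/17/2012 9:24:53 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_P.clustersOccur.pdb.gzb: transient HTTP error",
    "9/17/2012 9:24:57 AM | | Project communication failed: attempting access to reference site",
    "9/17/2012 9:25:55 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_G.clustersOccur.pdb.gzb: transient HTTP error",
    "9/17/2012 9:30:56 AM | World Community Grid | Temporarily failed download of hcmd2.2QOV_G.clustersOccur.pdb.gzb: transient HTTP error",
]

if __name__ == "__main__":
    for name in repeatedly_failing(SAMPLE_LOG):
        print("Repeatedly failing download:", name)
```

A file that keeps showing up here across several backoff cycles is a candidate for aborting by hand.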

____UPDATE___
I reset the project twice and it kept drawing the same series of WUs and getting stuck in the same manner. I left it alone for a bit and was about to add HFCC so I could crunch something, then it finally woke up and started downloading a different series of WUs. All is fine now on my end, so I suppose that series of WUs may have been pulled. Happy crunching!
----------------------------------------
[Edit 1 times, last edit by JCMarsh [U.S. Army] at Sep 17, 2012 3:15:01 PM]
[Sep 17, 2012 2:42:28 PM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Status: Offline
Re: HCMD2 failing to download

All of those files are 2QOV_? input files. I also could not get the 2QOV_? files to download, so I just aborted those downloads and all is working fine now. It appears the 2QOV_? input files may be corrupt on the server.
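
To see whether a set of stuck downloads all belong to the same input series, the file names can be grouped by their ID. This is only a sketch: it assumes names follow the shape seen in the log (`hcmd2.2QOV_P.clustersOccur.pdb.gzb`, i.e. the second dot-separated field is `<ID>_<suffix>`), which is a guess from these few examples rather than a documented naming scheme. The helper names are hypothetical.

```python
from collections import defaultdict


def structure_id(file_name):
    """Guess the series ID from a name like 'hcmd2.2QOV_P.clustersOccur.pdb.gzb'.

    Assumes the second dot-separated field looks like '<ID>_<suffix>'.
    Returns None when the name does not fit that assumed shape.
    """
    parts = file_name.split(".")
    if len(parts) < 2 or "_" not in parts[1]:
        return None
    return parts[1].split("_", 1)[0]


def group_by_structure(file_names):
    """Group file names by their guessed series ID."""
    groups = defaultdict(list)
    for name in file_names:
        groups[structure_id(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    stuck = [
        "hcmd2.2QOV_P.clustersOccur.pdb.gzb",
        "hcmd2.2QOV_G.clustersOccur.pdb.gzb",
    ]
    for series, names in group_by_structure(stuck).items():
        print(series, "->", len(names), "stuck file(s)")
```

If every stuck file lands in one group, that points at a bad input series rather than a general connection problem.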
----------------------------------------
[Edit 1 times, last edit by BobCat13 at Sep 17, 2012 2:50:24 PM]
[Sep 17, 2012 2:49:45 PM]
52 Aces
Cruncher
United States
Joined: Sep 19, 2009
Post Count: 29
Status: Offline
Re: HCMD2 failing to download

Sounds different from what I saw, so this is just fyi for others:

About a week ago my machine sat idle for a few hours while one actual FILE was trying to download, but it somehow mapped to TWO WUs. I didn't study it (i.e., maybe I was my own wingman); instead I just aborted the WUs directly, and the system immediately grabbed and downloaded a new batch of WUs and began crunching again.

Soooo, what looked like a server download problem, where the servers 'looked' unresponsive, was something else going on entirely. And the wedged download prevented my system from asking for other WUs.
----------------------------------------
[Edit 1 times, last edit by 52 Aces at Sep 17, 2012 7:00:54 PM]
[Sep 17, 2012 6:58:52 PM]
Paul Schlaffer
Senior Cruncher
USA
Joined: Jun 12, 2005
Post Count: 279
Status: Offline
Re: HCMD2 failing to download

I had the same issue today with a 2516_2QOV work unit. Thanks to this post I aborted the WU and the problem was resolved. Unfortunately it idled 11 processors until then. Thanks for the info.
----------------------------------------
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
[Sep 18, 2012 3:47:00 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCMD2 failing to download

I started crunching again after an idle summer due to cooling issues, and I've been getting these corrupted work units too. Same issue: I had one while I was at work, and all 12 threads idled out because of it. I aborted it, got 12 new clean ones, and it's crunching again. I had to clear another one this morning on another box. I hope there is a fix for these soon.
[Sep 20, 2012 10:41:50 AM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline
Re: HCMD2 failing to download

Don't expect a fix as this project is also ending. I've had one stuck file on a different machine every day the past three days. Just abort it and move on.

Keep a (relatively) close eye on your machines and this becomes a non-issue. Since it (usually) only affects one file it shouldn't bring the machine to a halt.
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Sep 20, 2012 3:49:24 PM]
Paul Schlaffer
Senior Cruncher
USA
Joined: Jun 12, 2005
Post Count: 279
Status: Offline
Re: HCMD2 failing to download

I encountered another one of these today (CMD2_2533_MYH6.clusterOccur_2QOV).

Again, I aborted the problem WU and downloads returned to normal.
----------------------------------------
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
[Sep 20, 2012 11:13:04 PM]
Byteball_730a2960
Senior Cruncher
Joined: Oct 29, 2010
Post Count: 318
Status: Offline
Re: HCMD2 failing to download

I had the same issue today with a 2516_2QOV work unit. Thanks to this post I aborted the WU and the problem was resolved. Unfortunately it idled 11 processors until then. Thanks for the info.


I have the same issue (although not 12 cores). I have a number of machines that are left unattended around the world (family and friends).
My buffer on these machines is 0 days, and a stuck download idles cores that I don't want to lose.
I have no idea which computers have stuck work units, as the computers are on so randomly that I cannot detect a pattern.
Will these WUs eventually time out so the computer comes back, or will I be losing these machines?
[Sep 21, 2012 12:19:33 AM]
Paul Schlaffer
Senior Cruncher
USA
Joined: Jun 12, 2005
Post Count: 279
Status: Offline
Re: HCMD2 failing to download

These work units should eventually be aborted by the server due to the expired return deadline (or earlier if detected).
----------------------------------------
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
----------------------------------------
[Edit 1 times, last edit by Paul Schlaffer at Sep 21, 2012 1:07:14 AM]
[Sep 21, 2012 1:05:34 AM]
Byteball_730a2960
Senior Cruncher
Joined: Oct 29, 2010
Post Count: 318
Status: Offline
Re: HCMD2 failing to download

Nice. Thanks for that.
[Sep 21, 2012 6:36:25 AM]