This topic has been viewed 25933 times and has 117 replies.
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Clean Energy Project - Phase 2 BETA test new workunits - Aug 15, 2014 [ Issues Thread ]

Now I'm really fed up with seeing more "exited with zero status but no 'finished' file" and wasting processing since the last checkpoint, so I've added <name>beta11</name> <max_concurrent>2</max_concurrent> to app_config; if the problem still continues, it'll be down to only 1 at a time.
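
For anyone wanting to try the same limit, here is a minimal sketch of what the complete file might look like. The application name beta11 and the limit of 2 are taken from the post above; the surrounding <app_config> and <app> wrappers are the usual BOINC layout for app_config.xml, and the file location noted in the comment is an assumption about a typical install, so check it against your own setup.

```xml
<!-- app_config.xml: a sketch only. Assumed location: the project's folder
     inside the BOINC data directory. -->
<app_config>
  <app>
    <!-- short application name, as quoted in the post above -->
    <name>beta11</name>
    <!-- run at most this many tasks of this application at the same time -->
    <max_concurrent>2</max_concurrent>
  </app>
</app_config>
```

Dropping to one task at a time would just mean changing the 2 to a 1. The change should be picked up after a client restart or after asking BOINC Manager to re-read its config files.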
[Aug 19, 2014 9:33:37 PM]
pcwr
Ace Cruncher
England
Joined: Sep 17, 2005
Post Count: 10903

Have noticed that if I suspend a wu, it starts from the beginning again.

Patrick
[Aug 19, 2014 9:51:19 PM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326

> Now I'm really fed up with seeing more "exited with zero status but no 'finished' file" and wasting processing since the last checkpoint, so I've added <name>beta11</name> <max_concurrent>2</max_concurrent> to app_config; if the problem still continues, it'll be down to only 1 at a time.

I understand your frustration, Tony. However, I consider you lucky, because I haven't been able to get any beta tasks for some time now. I am not complaining; I am just letting my thoughts out.
[Aug 19, 2014 10:02:18 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

> Have noticed that if I suspend a wu, it starts from the beginning again.

Patrick, that happens only if the wu hasn't finished Job#0 (which admittedly is a long job). Checkpoints occur in a CEP2 wu at the end of each job. It also sounds like you need to turn on LAIM (Leave Applications In Memory while suspended) under Memory Usage in the Device Profile you're using. With LAIM on, a wu can continue from where it left off rather than from a previous checkpoint.
[Aug 19, 2014 10:03:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

> ... I consider you lucky because I haven't been able to get any beta tasks for some time now.

It's more down to micromanagement than luck :D
[Aug 19, 2014 10:05:48 PM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326

> > ... I consider you lucky because I haven't been able to get any beta tasks for some time now.
> It's more down to micromanagement than luck :D

I understand. I am sure I will pick some tasks up at some point.
[Aug 19, 2014 10:25:04 PM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585

So, I got a "Beta - The Clean Energy Project - Phase 2 needs 922.99MB more disk space. You currently have 1125.01 MB available and it needs 2048.00 MB." message on my 24-core Xeon server. Odd, considering it was running maybe 5 work units total (19 cores idle).

On an unrelated note, my 16-core Xeon server has reached single-result reliability on CEP Beta. Whodathunkit?
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Aug 19, 2014 11:51:29 PM]
pcwr
Ace Cruncher
England
Joined: Sep 17, 2005
Post Count: 10903

> > Have noticed that if I suspend a wu, it starts from the beginning again.
> Patrick, that happens only if the wu hasn't finished Job#0 (which admittedly is a long job). Checkpoints occur in a CEP2 wu at the end of each job. It also sounds like you need to turn on LAIM (Leave Applications In Memory while suspended) under Memory Usage in the Device Profile you're using - then a wu can continue from where it left off rather than from a previous checkpoint

It already is. The computer also runs 24/7.

Patrick
[Aug 20, 2014 6:13:05 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

I see that the validator has been turned on for these beta WUs. Many of my results are now Valid, but I noticed one in Pending Verification: BETA_E225108_587_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.14_s1_14_0--, which finished with RC = 0x1 in Job#0. The wingman's _1 finished in Job#6, and _2 is In Progress. So it appears that convergence is run-dependent or machine-dependent in extreme cases. Is that to be expected?
[Aug 20, 2014 7:44:25 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Patrick, so LAIM is on but a suspended wu still restarts from the beginning? Hmmm, I can't explain that :(
[Aug 20, 2014 7:51:53 AM]