World Community Grid - View Thread - Workunits start at the beginning again and again

World Community Grid Forums

Category: Completed Research

Forum: The Clean Energy Project - Phase 2 Forum

Thread: Workunits start at the beginning again and again

Thread Status: Active
Total posts in this thread: 16

This topic has been viewed 74271 times and has 15 replies

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Workunits start at the beginning again and again

I have a host with Win7 and a dual-core CPU. Two virtual machines with WinXP are running on that host. Each machine has BOINC installed and works only on The Clean Energy Project - Phase 2.
On machine 1 everything works fine.
On machine 2 the first work unit was computed fine. Then it began computing from the beginning again, telling me that it exited with zero but there was no success file or something like that.
I canceled the first work unit and got a new one: same problem.
I reset the project. The error is still there. I detached from World Community Grid and attached again. The error is still there.

Where's the problem? On virtual machine 1 everything works fine, and machine 2 also worked fine for the first work unit.

Thanks for help.

[Dec 18, 2010 1:57:40 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Workunits start at the beginning again and again

Hello Strom_umsonst,
I am puzzled. I probably can not help, but I think you should give some more information, hoping that some other member can help.
First, just what software are you using to create 2 virtual machines?
Second, please post the start of the Messages tab on virtual machine #2 to show what BOINC thinks its environment is.
Third, you should compare this posted list of messages with the Messages tab on virtual machine #1. There should not be any significant difference.
Fourth, check that both virtual machines show the correct date and time, since BOINC is very sensitive to this.

Lawrence

[Dec 18, 2010 2:20:41 PM]

jfpz
Cruncher
Joined: Apr 7, 2005
Post Count: 8
Status: Offline

Re: Workunits start at the beginning again and again

I'm getting similar errors repeatedly on HFCC, e.g.
12/18/2010 11:15:38 AM|World Community Grid|Task HFCC_L3_01569924_L3_0001_0 exited with zero status but no 'finished' file
12/18/2010 11:15:38 AM|World Community Grid|If this happens repeatedly you may need to reset the project.

Win Vista, no VMs. I have reset the project.
date/time is correct.
details:
12/17/2010 10:41:27 AM||Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
12/17/2010 10:41:27 AM||Data directory: C:\Program Files\BOINC
12/17/2010 10:41:27 AM||Processor: 2 AuthenticAMD AMD Turion(tm) 64 X2 Mobile Technology TL-50 [x86 Family 15 Model 72 Stepping 2] [fpu tsc pae nx sse sse2 sse3 3dnow mmx]
12/17/2010 10:41:27 AM||Memory: 1.37 GB physical, 3.00 GB virtual
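
For anyone who wants to check how often this warning shows up per task, the message lines quoted above can be tallied with a short script. This is only a minimal sketch in Python, assuming the `timestamp|project|message` layout seen in the lines above; the names `parse_line` and `count_warnings` are made up for illustration and are not part of BOINC.

```python
from collections import Counter

# Text of the warning as it appears in the messages quoted in this thread.
WARNING = "exited with zero status but no 'finished' file"


def parse_line(line):
    """Split one message line into (timestamp, project, message).

    Returns None when the line does not have the two '|' separators
    that the quoted lines use.
    """
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    return tuple(parts)


def count_warnings(lines):
    """Count the 'no finished file' warnings per task name."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        message = parsed[2]
        # Warning lines look like: "Task <name> exited with zero status ..."
        if message.startswith("Task ") and WARNING in message:
            task_name = message.split()[1]
            counts[task_name] += 1
    return counts


# Sample lines copied from the post above.
SAMPLE = [
    "12/18/2010 11:15:38 AM|World Community Grid|Task HFCC_L3_01569924_L3_0001_0 "
    "exited with zero status but no 'finished' file",
    "12/18/2010 11:15:38 AM|World Community Grid|If this happens repeatedly "
    "you may need to reset the project.",
    r"12/17/2010 10:41:27 AM||Data directory: C:\Program Files\BOINC",
]

if __name__ == "__main__":
    for task, n in count_warnings(SAMPLE).items():
        print(task, n)
```

A task that shows a count well above one or two is the kind of repeat the second message line refers to.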

[Dec 18, 2010 4:33:08 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Workunits start at the beginning again and again

Hello jfpz,

exited with zero status but no 'finished' file

This is a common warning message that should be ignored unless your work unit starts erroring out. There is a FAQ in Start Here that lists a large number of factors that can cause this warning message to print. Personally, I wish that it would not print by default but only if the user requested for the warning to be printed.

Lawrence

[Dec 18, 2010 5:09:05 PM]

anhhai
Veteran Cruncher
Joined: Mar 22, 2005
Post Count: 839
Status: Offline

Re: Workunits start at the beginning again and again

"exited with zero status" is normal

All systems have that every once in a while. It usually causes no problems that I am aware of. Sekerob has replied about it many times, saying causes are things like OS updates, ....

[Dec 18, 2010 5:10:18 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Workunits start at the beginning again and again

If this error is normal, why does the work unit start at the beginning? That way a lot of CPU time is wasted for nothing.

I use VirtualBox for my virtual machines.
Date and time are in sync with the host.

I'll post the BOINC messages later, when both machines are running again.

[Dec 18, 2010 9:45:11 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Workunits start at the beginning again and again

I don't know if this is the same thing, but when I have problems with BOINC, stopping and starting BOINC does not help. If I reboot, all information in memory is cleared and the BOINC problems seem to disappear. That is what I have experienced.

Edit for additional information: I decided to review further the WUs in Inconclusive status. In both cases my execution time was over 8 hours, while my wingman's execution time was 1.86 hours in one case. I displayed my wingman's processing for the 1.86 hour case and saw:

[04:03:34] Starting job 10,CPU time has been restored to 6135.140625.
Application exited with RC = 0x3
[04:09:58] Finished Job #10
[04:09:58] Starting job 11,CPU time has been restored to 6507.671875.
[04:09:58] Skipping Job #11
[04:09:58] Starting job 12,CPU time has been restored to 6507.671875.
Application exited with RC = 0x80
[04:13:25] Finished Job #12
[04:13:25] Starting job 13,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #13
[04:13:25] Starting job 14,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #14
[04:13:25] Starting job 15,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #15
04:13:31 (9692): called boinc_finish
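
The excerpt above can also be summarized mechanically, which makes the pattern easier to see: some jobs exit with a return code and others are skipped. The following is a rough sketch in Python that only extracts what the log lines say. The function name `summarize` and the regular expressions are my own assumptions based on this one excerpt, and the script does not interpret what the RC values mean.

```python
import re

# Patterns inferred from the quoted excerpt only.
START = re.compile(r"Starting job (\d+),")
EXIT = re.compile(r"Application exited with RC = 0x([0-9A-Fa-f]+)")
SKIP = re.compile(r"Skipping Job #(\d+)")

LOG = """\
[04:03:34] Starting job 10,CPU time has been restored to 6135.140625.
Application exited with RC = 0x3
[04:09:58] Finished Job #10
[04:09:58] Starting job 11,CPU time has been restored to 6507.671875.
[04:09:58] Skipping Job #11
[04:09:58] Starting job 12,CPU time has been restored to 6507.671875.
Application exited with RC = 0x80
[04:13:25] Finished Job #12
[04:13:25] Starting job 13,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #13
[04:13:25] Starting job 14,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #14
[04:13:25] Starting job 15,CPU time has been restored to 6705.593750.
[04:13:25] Skipping Job #15
04:13:31 (9692): called boinc_finish
"""


def summarize(log_text):
    """Return ({job: return_code}, [skipped jobs]) for a result log."""
    current_job = None
    return_codes = {}
    skipped = []
    for line in log_text.splitlines():
        started = START.search(line)
        if started:
            current_job = int(started.group(1))
            continue
        exited = EXIT.search(line)
        if exited and current_job is not None:
            # The RC is printed in hexadecimal, e.g. 0x80 is 128.
            return_codes[current_job] = int(exited.group(1), 16)
            continue
        skip = SKIP.search(line)
        if skip:
            skipped.append(int(skip.group(1)))
    return return_codes, skipped


if __name__ == "__main__":
    codes, skipped_jobs = summarize(LOG)
    print("return codes:", codes)
    print("skipped jobs:", skipped_jobs)
```

For this excerpt it reports return codes for jobs 10 and 12 and lists jobs 11, 13, 14 and 15 as skipped.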

The other WU had one job exiting with an RC that caused the remaining jobs to be skipped. I think this is why these WUs are Inconclusive.

----------------------------------------
[Edit 2 times, last edit by Former Member at Dec 19, 2010 7:49:50 AM]

[Dec 19, 2010 6:43:35 AM]

jfpz
Cruncher
Joined: Apr 7, 2005
Post Count: 8
Status: Offline

Re: Workunits start at the beginning again and again

Thanks for the reply, Lawrence. As for others, it is frustrating for me to see this error message on multiple results totaling hundreds of hours of work. It wastes my time and wastes time that could be put into meaningful processing. I will read the FAQ and see if it contains a fix for HFCC.

[Dec 21, 2010 7:49:24 PM]

KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline


Re: Workunits start at the beginning again and again

If this error is normal, why does the work unit start at the beginning? That way a lot of CPU time is wasted for nothing.

This project has some very long intervals between checkpoints. The "Leave applications in memory" setting will address this problem on a normal machine. As for the virtual machine, I can't begin to say how it affects it, other than to say that any time the work unit is unloaded from memory it will revert to the previous checkpoint. If it hasn't run for two to three hours from the beginning, this checkpoint will be the start of the work unit.
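
The revert-to-checkpoint behavior described here can be illustrated with a few lines of Python. This is a simplified sketch, not how BOINC itself is implemented: the checkpoint times are hypothetical values in hours (the first one chosen to fit the "two to three hours" mentioned above), and `resume_point` and `lost_work` are illustrative names.

```python
import bisect


def resume_point(checkpoints, interrupted_at):
    """Return the run time the work unit falls back to after being unloaded.

    checkpoints: sorted list of run times (hours) at which a checkpoint
    was written. If no checkpoint exists yet, the work unit restarts
    from the beginning (0.0).
    """
    index = bisect.bisect_right(checkpoints, interrupted_at)
    if index == 0:
        return 0.0
    return checkpoints[index - 1]


def lost_work(checkpoints, interrupted_at):
    """Hours of computation thrown away by one interruption."""
    return interrupted_at - resume_point(checkpoints, interrupted_at)


# Hypothetical checkpoint times, in hours of run time.
CHECKPOINTS = [2.5, 5.0, 7.5]

if __name__ == "__main__":
    for t in (2.0, 6.0):
        print(t, resume_point(CHECKPOINTS, t), lost_work(CHECKPOINTS, t))
```

With these numbers, an interruption after 2.0 hours falls back to 0.0 (the start of the work unit), while an interruption after 6.0 hours falls back to the 5.0 hour checkpoint and loses one hour.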

----------------------------------------

Distributed computing volunteer since September 27, 2000

[Dec 21, 2010 9:15:47 PM]

sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline

Re: Workunits start at the beginning again and again

I think this may be host name related; BOINC may be seeing both systems as the same host and resetting one. Can you first confirm that you are not using BAM and that your system is not restarting (due to automatic updates or a system failure)?

[Dec 21, 2010 10:20:30 PM]
