Total posts in this thread: 60
Posts: 60   Pages: 6
This topic has been viewed 11007 times and has 59 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Does offline calculation results in invalids?

Three days ago I downloaded about 40 WUs. The first 8 WUs started, sent trickles and uploaded files, but were still running when the PC was disconnected from the internet, so all 8 finished while offline. All other WUs started and finished offline, too. Today I restored the internet connection, and all WUs uploaded their files and were reported as complete. But only the 8 WUs that had already sent trickles/files three days ago turned valid. All the others turned invalid, although they were reported in time (12 hours before the expiry date).
So is it not possible to process WUs while offline the whole time?
Or is it related to the expiry date?
And does it mean that it is better to abort all unstarted WUs if the internet connection fails?
[Oct 8, 2015 5:26:43 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Does offline calculation results in invalids?

12 hours before the deadline is not good enough. The server has to know that the jobs were started and have completed about 70% by 24 hours before the deadline. Otherwise they are considered lost/No Reply, holding up the processing of the sequential steps. So if you run offline at all, connect at least every 24 hours for this science.
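
A rough Python sketch of the check as described in this post. The 70% and 24-hour figures come from the post and are not confirmed server settings; the function names and the trickle list format are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Figures quoted in the post above; not confirmed server settings.
MIN_PROGRESS = 0.70
CHECK_BEFORE_DEADLINE = timedelta(hours=24)


def progress_known_at(trickles, when):
    """Highest progress fraction the server has received by time `when`.

    trickles: list of (received_time, progress_fraction) tuples.
    """
    known = [progress for received, progress in trickles if received <= when]
    return max(known, default=0.0)


def passes_checkpoint(trickles, deadline):
    """True if enough progress was reported by the pre-deadline checkpoint."""
    checkpoint = deadline - CHECK_BEFORE_DEADLINE
    return progress_known_at(trickles, checkpoint) >= MIN_PROGRESS


deadline = datetime(2015, 10, 8, 18, 0)
# Everything reported 12 hours before the deadline, nothing earlier.
offline = [(deadline - timedelta(hours=12), 1.0)]
# 70% trickled 30 hours before the deadline, the rest later.
online = [(deadline - timedelta(hours=30), 0.7),
          (deadline - timedelta(hours=12), 1.0)]
print(passes_checkpoint(offline, deadline))  # False
print(passes_checkpoint(online, deadline))   # True
```

Under this reading, a task that first contacts the server only 12 hours before its deadline fails the check even though it is 100% complete.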
[Oct 8, 2015 6:24:03 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Does offline calculation results in invalids?

But no repair job was started before I returned the WUs. Only after I reported them were two repair jobs created for each WU (still waiting to be sent). So the WU is still holding up the sequential steps processing, even more so by being considered invalid and restarted twice...
So what sense does it make to consider a WU lost (and therefore invalid) before it reaches its deadline, but to wait until that WU expires before sending a repair job?
And all of the 8 successful WUs were offline for nearly three days as well before becoming valid.
[Oct 8, 2015 6:44:05 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Does offline calculation results in invalids?

Reads like the process flow needs a bit more honing ** by the techs.

I'm computing online, i.e. the trickles flow every hour or two. I'm only allowed 2 running concurrently but have 6 in total, meaning the latest won't finish until the 24-hour critical deadline test point. I think they'll have done 70% by then, and presume they will then be left alone to finish and will not receive either a soft or hard stop instruction. If not, they'll get a soft stop and be allowed to finish the last trickle block of 10k steps. That would not bother me, as credit is given for good trickles, and someone else picks up where mine left off.

** The admin may test her linguistic prowess on whether 'honing' is used correctly :D
[Oct 8, 2015 7:06:28 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Does offline calculation results in invalids?

Some cross-linked side info: 95% of the tasks sent out manage to return the full 100K steps of an assignment: https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=504350

Right now I have 2 tasks running that were buffered > 48 hours ago, now at 11% and 18%, running at high priority. By extrapolation, they're not going to make the 70%, as the 24-hour threshold arrives in 12 hours; soft stop expected. I've not had one yet, so I will watch how it evolves.
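
The extrapolation can be made concrete with a little arithmetic, sketched here in Python. The 11%, 18%, 70% and 12-hour figures are from the post; the function name is hypothetical, and progress is assumed to be linear over the remaining time.

```python
def required_rate(current_pct, target_pct, hours_left):
    """Percentage points per hour needed to reach target_pct in hours_left."""
    if hours_left <= 0:
        raise ValueError("hours_left must be positive")
    return max(target_pct - current_pct, 0.0) / hours_left


for pct in (11, 18):
    rate = required_rate(pct, 70, 12)
    print(f"{pct}% now: needs {rate:.2f} points/hour")
# 11% now: needs 4.92 points/hour
# 18% now: needs 4.33 points/hour
```

Whether a machine can sustain that rate depends on how long a full task takes on it, which the post does not state.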
[Oct 8, 2015 8:28:59 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Does offline calculation results in invalids?

And more parallel posting on invalids, soft stops [not ATM] and more: https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=504385

It boils down to this: my 2 late FABH tasks are seemingly going to be allowed to run on even if they don't manage the 70% hurdle at 72 hours. The scientists prefer full 100K tasks, for the least wavering, I suppose, in the series of 30x100K (30+ processed by different machines). My guess is that they are testing the waters to see how many fully complete if allowed to be out the full 96 hours... Then, if a task at deadline has trickled the 90% intermediate file, I suppose a soft stop becomes moot, as the wait is then programmed to continue through 100%. We'll see, observe and learn what's going on in the heads of the techs 8-)

edit: to be
----------------------------------------
[Edit 1 times, last edit by SekeRob* at Oct 8, 2015 11:31:11 AM]
[Oct 8, 2015 11:07:05 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Does offline calculation results in invalids?

Your result could have gone invalid if you sent back a trickle message but did not upload the intermediate upload files within 3 hours. At that point, you would get a hard stop and the result would be marked for validation on the back end. If zero trickle messages were completed, it would mark your result invalid and send another copy to another computer from step 0.

Thanks,
-Uplinger


I allowed about 15 WUs to upload all of their intermediate upload files first. Then I forced a scheduler request (not asking for new work) to upload all trickle messages. Then the remaining WUs uploaded their intermediate files and were reported. But that made no difference: all WUs (apart from the 8 already started WUs) went invalid.
When I asked for new work some hours later, I got this message for each already reported WU (i.e. 42 message lines):

08.10.2015 12:58:22|World Community Grid|[error] handle_trickle_down failed: null pointer

I will see if I can reproduce this behaviour.
[Oct 8, 2015 11:19:52 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Does offline calculation results in invalids?

So I disconnected another (different) machine from the internet and waited for some WUs to complete. A few minutes ago I uploaded three WUs: one that had already uploaded and trickled yesterday, and two that started and finished completely offline. And, surprise, the first one became valid while both offline WUs turned invalid:

FAH2_avx101121-ls_000028_0016_003_1   --   714   Invalid   06.10.15 23:24:32   09.10.15 14:21:19   17.65   383.6 / 0.0
FAH2_avx101122-ls_000009_0019_002_0   --   714   Invalid   07.10.15 04:17:42   09.10.15 14:21:19   17.73   382.0 / 0.0

Both WUs were due on 11th October, so there should be no (hard or soft) stop involved. Also, the log looks as usual (at least to me).

Perhaps you should add to the system requirements for this project that one should never process WUs offline unless one wants to produce invalid WUs.
(Maybe it is stated there already, I never read system requirements... ;-) )

I backed up the BoincData folder for the first occurrence right before reconnecting, so if you need any further information, just ask.
And yes, I tried it on different BOINC versions (5.10.45 and 7.2.42), both with Windows 7.

Matthias
[Oct 9, 2015 2:46:03 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Does offline calculation results in invalids?

So some invalids later...
The local intermediate upload files are changed as soon as a WU finishes. That is, up to 90% they are generated once (and hopefully uploaded) and then remain unchanged. On reaching 100%, the last intermediate files and a result file are generated, and all intermediate files that are still available locally (not yet uploaded) are changed somehow: they get a new timestamp and differ in a binary comparison. Since I do not know their structure, I cannot tell what the differences are, but if they are uploaded and the WU is reported, it becomes invalid. If I replace the changed upload files with the original versions before uploading, the WU becomes valid, as it does if there was already an upload for that WU before (well, replacing the files only simulates that situation...). So going online before reaching 100% is sufficient to get a valid result, but going online after the WU has finished makes it invalid...
Also, if an intermediate file (e.g. the first one, for 10%) and the associated trickle are uploaded, the processed time and some claimed credit show up on the result status page after approx. two minutes. After completion of the WU, the same action with the now-changed intermediate/trickle file has no influence on the result status page: the values still remain 0.
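
For anyone who wants to repeat the file comparison described above, here is a minimal Python sketch. It hashes every file in a folder before and after a WU reaches 100% and lists the files whose contents changed. The function names are hypothetical, and which folder holds the intermediate upload files depends on the local BOINC setup, so the path is left to the user.

```python
import hashlib
from pathlib import Path


def snapshot(directory):
    """Map each file name in `directory` to the SHA-256 of its contents."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


def changed_files(before, after):
    """Names present in both snapshots whose contents differ."""
    return sorted(
        name for name in before
        if name in after and before[name] != after[name]
    )
```

Call `snapshot()` on the folder while the WU is still below 100%, call it again after completion, and pass both results to `changed_files()`. Files that appear only in the second snapshot (such as the final result file) are ignored on purpose.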

These are only observations, maybe they have nothing to do at all with the offline problem. Will do more testing if time allows (need 18 hours for the next WU to complete before I can start testing...).

Matthias
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 11, 2015 4:35:37 PM]
[Oct 11, 2015 4:29:37 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Does offline calculation results in invalids?

mweisensee,
Thank you for stating your observations. I don't run without a 24/7 internet connection, but I have in the past and may again in the future, so your observations concerning FAHB 100% upload behavior when offline are a little disconcerting. If this is going to remain standard procedure, then perhaps we need a footnote in the System Requirements warning of this.
[Oct 12, 2015 1:59:04 PM]