Total posts in this thread: 6
This topic has been viewed 1679 times and has 5 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
HDC Work Unit Paused and next started without interference

An HDC WU, per the log below, started and then paused by itself, and the next HDC WU started after 24 minutes of CPU time:
2006-11-16 21:05:32 [---] Starting B10277_0006_CTMA1Aa-4-0-2_1
2006-11-16 21:05:33 [World Community Grid] Starting task B10277_0006_CTMA1Aa-4-0-2_1 using hdc version 514
2006-11-16 21:05:35 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_0
2006-11-16 21:05:35 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_1
2006-11-16 21:05:38 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_1
2006-11-16 21:05:38 [World Community Grid] Throughput 132 bytes/sec
2006-11-16 21:05:38 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_2
2006-11-16 21:05:39 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_0
2006-11-16 21:05:39 [World Community Grid] Throughput 79 bytes/sec
2006-11-16 21:05:39 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_3
2006-11-16 21:05:41 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_2
2006-11-16 21:05:41 [World Community Grid] Throughput 91 bytes/sec
2006-11-16 21:05:41 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_4
2006-11-16 21:05:43 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_3
2006-11-16 21:05:43 [World Community Grid] Throughput 87 bytes/sec
2006-11-16 21:05:43 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_5
2006-11-16 21:05:49 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_4
2006-11-16 21:05:49 [World Community Grid] Throughput 6076 bytes/sec
2006-11-16 21:05:49 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_6
2006-11-16 21:05:51 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_5
2006-11-16 21:05:51 [World Community Grid] Throughput 5947 bytes/sec
2006-11-16 21:05:51 [World Community Grid] Started upload of file B10277_0011_CTMA1Aa-4-0-7_1_7
2006-11-16 21:05:55 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_6
2006-11-16 21:05:55 [World Community Grid] Throughput 2366 bytes/sec
2006-11-16 21:05:57 [World Community Grid] Finished upload of file B10277_0011_CTMA1Aa-4-0-7_1_7
2006-11-16 21:05:57 [World Community Grid] Throughput 1548 bytes/sec
2006-11-16 21:10:20 [World Community Grid] Sending scheduler request: Requested by user
2006-11-16 21:10:20 [World Community Grid] Reporting 1 tasks
2006-11-16 21:10:20 [---] [http_debug] HTTP_OP::init_post(): https://secure.worldcommunitygrid.org/boinc/wcg_cgi/fcgi
2006-11-16 21:10:25 [World Community Grid] Scheduler RPC succeeded [server version 507]
2006-11-16 21:42:52 [---] Starting B10277_0003_CTMA1Aa-4-0-13_0
2006-11-16 21:42:52 [World Community Grid] Starting task B10277_0003_CTMA1Aa-4-0-13_0 using hdc version 514
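
To compare the 24 minutes of CPU time with the wall-clock gap between the two task starts, a minimal Python sketch follows. It assumes the "YYYY-MM-DD HH:MM:SS [project] message" layout seen in the log above. The function names `task_starts` and `gaps_between_starts` are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

# Two lines copied from the log above. Any message log in the same
# "YYYY-MM-DD HH:MM:SS [project] message" layout is assumed to work.
LOG = """\
2006-11-16 21:05:33 [World Community Grid] Starting task B10277_0006_CTMA1Aa-4-0-2_1 using hdc version 514
2006-11-16 21:42:52 [World Community Grid] Starting task B10277_0003_CTMA1Aa-4-0-13_0 using hdc version 514
"""


def task_starts(log_text):
    """Return (timestamp, task_name) for every 'Starting task' line."""
    starts = []
    for line in log_text.splitlines():
        if "Starting task " not in line:
            continue
        # The first 19 characters hold the date and time.
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        # The task name is the first word after 'Starting task '.
        name = line.split("Starting task ", 1)[1].split()[0]
        starts.append((stamp, name))
    return starts


def gaps_between_starts(starts):
    """Return (earlier_task, later_task, wall-clock timedelta) triples."""
    return [(a[1], b[1], b[0] - a[0]) for a, b in zip(starts, starts[1:])]


for earlier, later, gap in gaps_between_starts(task_starts(LOG)):
    print(f"{earlier} -> {later}: {gap}")
```

Feeding the full message log into `task_starts` instead of the two sample lines would list every task switch, which makes unexpected pre-emptions easier to spot.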

There are no entries in the error log. Suspending the subsequent unit did not resume the paused WU. Although pre-empting is on, the WU was unloaded from memory. Exiting BOINC and restarting did not resume the paused WU either; it resumed the next one instead. According to the result status page, the other two in the quorum have already completed the paused WU.
Workunit Name              Status              Sent Time            Time Due / Return Time  CPU Time (hours)  Claimed / Granted BOINC Credit
B10277_0006_CTMA1Aa-4-0-2  Pending Validation  11/14/2006 15:28:48  11/14/2006 19:06:35     2.85              61 / 0
B10277_0006_CTMA1Aa-4-0-2  In Progress         11/14/2006 15:25:51  11/21/2006 15:25:51     0.00              0 / 0
B10277_0006_CTMA1Aa-4-0-2  Pending Validation  11/14/2006 15:25:44  11/15/2006 00:34:48     5.90              57 / 0

The full message logs in the txt files have no entries for the problem WU.

Is there a sequence to make it restart, or should it be cancelled?

Device 34409
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 3 times, last edit by Sekerob at Nov 16, 2006 10:34:56 PM]
[Nov 16, 2006 10:31:59 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HDC Work Unit Paused and next started without interference

Are you using the alpha client?

What's the deadline on the work unit that preempted it?
[Nov 16, 2006 10:36:34 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: HDC Work Unit Paused and next started without interference

Both have identical deadlines, to the second, on Nov. 21. When I suspended the subsequent WU in progress (at 58 minutes), it remained in memory and BOINC started downloading new work rather than resuming the paused one. What concerns me is that the first WU was removed from memory and the next one in the queue was started.

Alpha 5.7.2, but it has done 20 WUs or so without a single problem.

I read that the new version has done away with EDF and has better checkpoint switching, but suspending all the other projects did not make them resume... they are overworked according to BOINCview :o

It's stuck at exactly 1.500%, if that helps to tell where it is in the startup sequence.

I guess, unless knreed et al. can check whether the 2 WUs already sent in for the quorum have an exception in their logs, I'll cancel and see what happens. No rush, as it is seemingly dead.

Nighty night... I'll see if dreams moved it on tomorrow morning.
[Nov 16, 2006 11:02:33 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: HDC Work Unit Paused and next started without interference

Well, dreams moved it on... after about 5 wall-clock hours, it suspended the very long HDC #2 (after 4:22 hrs and 48%), which has the same date/time stamp, and then went back to the first, suggesting it's not recognizing that both are from the same project?

There is a third with the same date/time stamp, to further obscure the logic.

There are absolutely no entries in the message screen, other than one saying it restarted HDC #1, which after 4:21 hours is only at 50%.

Interestingly, if the new versions are able to wait for an exact checkpoint before switching projects, 'keep in memory' could be turned off, reducing RAM needs considerably.

I'll see what I can find out at the BOINC dev forum. The second one was not unloaded from memory, so now I have 2 memory-hungry HDCs occupying RAM on 1 thread.

Edit: Found a thread reporting scheduling issues with long WUs and added the observation to it: http://boinc.berkeley.edu/dev/forum_thread.php?id=1312
[Edit 1 times, last edit by Sekerob at Nov 17, 2006 8:33:14 AM]
[Nov 17, 2006 6:59:52 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HDC Work Unit Paused and next started without interference

"The second one was not unloaded from memory, so now i have 2 memory hungry HDC's occupying ram on 1 thread."

That is what Virtual Memory is for. Applications left in memory but not being accessed end up on the hard drive, leaving RAM available for other processes.
[Nov 17, 2006 7:50:32 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: HDC Work Unit Paused and next started without interference

Regrettably no, Lawrence. With keep-in-memory on, the portion that was in physical RAM remains in physical RAM (that's what the Task Manager has been telling me all along); the paused HDC #2 is still taking 125 MB. With k.i.m. off, it used to just unload the RAM part and lose the crunching time since the last checkpoint. Anyway, the new method is fine... I'm switching off the k.i.m. setting, now that it theoretically functions without time loss in the alpha, and will observe.
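
The trade-off described here can be put into a toy Python model. This is only a sketch of the behaviour as described in this post, not of the actual BOINC client logic. The `Task` class, the `preempt` function and all numbers except the 125 MB figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    cpu_seconds: float              # CPU time accumulated so far
    last_checkpoint_seconds: float  # CPU time at the most recent checkpoint
    ram_mb: float                   # physical RAM held while loaded


def preempt(task, keep_in_memory):
    """Toy model: return (ram_still_held_mb, cpu_seconds_lost)."""
    if keep_in_memory:
        # The task stays loaded: nothing is lost, but RAM is not released.
        return task.ram_mb, 0.0
    # The task is unloaded: RAM is freed, and work done since the last
    # checkpoint has to be redone on resume.
    return 0.0, task.cpu_seconds - task.last_checkpoint_seconds


# Hypothetical timings; only the 125 MB figure comes from the post.
paused = Task(cpu_seconds=3600.0, last_checkpoint_seconds=3300.0, ram_mb=125.0)
print(preempt(paused, keep_in_memory=True))
print(preempt(paused, keep_in_memory=False))
```

If the client only switches right after a checkpoint, `cpu_seconds` equals `last_checkpoint_seconds` at switch time, so the second case would lose nothing while still freeing the RAM.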

ciao

and happy birthday, to WCG of course :D
[Nov 17, 2006 8:01:15 AM]