Total posts in this thread: 179
This topic has been viewed 470488 times and has 178 replies.
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Interesting ones In Progress:
BETA_ avx101118-031_ r6_ 1_ wcgfahb00060000_ 0--
BETA_ avx101118-075_ r8_ 1_ wcgfahb00070000_ 0--
Yeah, interesting ones.
This numbering was expected to show up at some point.
Parent tasks must be from the 1st batch, where the user either aborted the task or exceeded the deadline (more probably) when the tasks were at 60% or 70% completion.

It would be nice to know whether, when the deadline is not met, the task will be aborted by the server or left to run to 100%.
I think Keith is the one to enlighten us on that.
----------------------------------------

----------------------------------------
[Edit 1 times, last edit by Crystal Pellet at Aug 28, 2015 8:10:17 AM]
[Aug 28, 2015 8:09:37 AM]
imakuni
Advanced Cruncher
Joined: Jun 11, 2009
Post Count: 102
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

So, it seems that I got one of those Beta tasks. The WU name is BETA_avx101118-008_r13_1_2, and so far, it's going fine, I guess. It seems to send the trickle message once every ~3h.

I'm just wondering, is that WU supposed to be huge? It's 21h in, and BOINC reports 2/3 of the work done, so I guess it'll take 7h more (though BOINC predicts 10h). I don't know if the run time is really supposed to be THAT big, or if it's just my A6-3500 CPU that's weak.
----------------------------------------

Want to have an image of yourself like this on? Check this thread: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,29840
[Aug 28, 2015 2:49:31 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Worry not, imakuni, this is the way these are supposed to be.

My Intel E5-26xx @ 2.4 GHz took about 31 hours and another @ 3.1 GHz took 26 hours, so your A6-3500 seems to be coping well.
----------------------------------------

[Aug 28, 2015 3:39:45 PM]
Yarensc
Advanced Cruncher
USA
Joined: Sep 24, 2011
Post Count: 134
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Got a resend for one from the initial batch; it looks like the first user to get it had download issues on an 'older' client too. The rest of the error log was just the other files also erroring out with the same -197 code. I'm running the _1 copy on 7.4.36 with Win 8.1 and it seems to be fine. 120 MB virtual memory and a bit over 8 hours on an AMD FX-8120.

Result Name: BETA_ avx101118-023_ r15_ 1_ 0--
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_beta21_bedam_7.10_windows_intelx86</file_name>
<error_code>-197</error_code>
<error_message>user requested transfer abort</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>wcgrid_FAHB_graphics_prod_32.exe.7.10</file_name>
<error_code>-197</error_code>
<error_message>user requested transfer abort</error_message>
</file_xfer_error>
[Aug 28, 2015 3:51:35 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Interesting ones In Progress:
BETA_ avx101118-031_ r6_ 1_ wcgfahb00060000_ 0--
BETA_ avx101118-075_ r8_ 1_ wcgfahb00070000_ 0--
Yeah, interesting ones.
This numbering was expected to show up at some point.
Parent tasks must be from the 1st batch, where the user either aborted the task or exceeded the deadline (more probably) when the tasks were at 60% or 70% completion.

It would be nice to know whether, when the deadline is not met, the task will be aborted by the server or left to run to 100%.
I think Keith is the one to enlighten us on that.


This is normal. There are a few instances where this happens. One is that the user returned a trickle message that was determined to be invalid. When that happens, the next work unit is generated from the last valid point of the work unit. This allows the larger run of, say, 3 million steps that is needed to proceed without having to restart a given work unit from the beginning. This cuts down on wasted CPU cycles.

Thanks,
-Uplinger
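
The resume rule described above can be sketched roughly as follows. This is only an illustration, not the actual server code: the function name `restart_step` and the list-of-booleans input are made up for the example, and the 10,000-step trickle interval is an assumption taken from the stderr logs posted in this thread.

```python
def restart_step(trickle_valid, steps_per_trickle=10_000):
    """Return the step a replacement work unit would resume from.

    trickle_valid: booleans in arrival order, True if that trickle validated.
    steps_per_trickle: assumed interval between trickles (hypothetical default).
    """
    valid_count = 0
    for ok in trickle_valid:
        if not ok:
            break  # everything after the first invalid trickle is discarded
        valid_count += 1
    return valid_count * steps_per_trickle


# Six good trickles followed by a bad one: resume from step 60000.
print(restart_step([True] * 6 + [False]))
```

Under these assumptions, a task whose seventh trickle failed validation would be re-issued from the 60% mark rather than from step 0.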
[Aug 28, 2015 4:36:30 PM]
Psalm103
Cruncher
Joined: Jan 6, 2007
Post Count: 24
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

I noticed that after I rebooted, the execution time of my WUs jumped dramatically. Before I rebooted I was expecting them to finish in about 11 hours (CPU time), but after the reboot they ended up taking about 18 hours (CPU time). I rebooted about halfway through. Below is an example of the log:

Thanks,
Ed

<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>
[09:31:27] INFO:Turning trickle messaging on.
[09:31:27] INFO:Turning intermediate uploads on.
%IMPACT-I: Requested file to open for appending md.out Does not exist.
Opening it as a new file.
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic
Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 1.00000
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
[09:39:37] INFO: Checkpointed. Progress 1000 of 100000 steps complete CPU time 426.522334
[09:47:12] INFO: Checkpointed. Progress 2000 of 100000 steps complete CPU time 829.363716
[09:55:02] INFO: Checkpointed. Progress 3000 of 100000 steps complete CPU time 1245.979987
[10:03:05] INFO: Checkpointed. Progress 4000 of 100000 steps complete CPU time 1659.585438
[10:11:09] INFO: Checkpointed. Progress 5000 of 100000 steps complete CPU time 2075.484104
[10:19:14] INFO: Checkpointed. Progress 6000 of 100000 steps complete CPU time 2488.699553
[10:27:21] INFO: Checkpointed. Progress 7000 of 100000 steps complete CPU time 2909.325049
[10:35:16] INFO: Checkpointed. Progress 8000 of 100000 steps complete CPU time 3329.045740
[10:43:06] INFO: Checkpointed. Progress 9000 of 100000 steps complete CPU time 3745.568410
[10:50:52] INFO: Sending trickle message to server.
[10:50:52] INFO: Starting intermediate upload, index = 1
[10:50:52] INFO: Checkpointed. Progress 10000 of 100000 steps complete CPU time 4162.013079
[10:58:17] INFO: Checkpointed. Progress 11000 of 100000 steps complete CPU time 4581.406168
[11:05:43] INFO: Checkpointed. Progress 12000 of 100000 steps complete CPU time 5002.250066
[11:13:06] INFO: Checkpointed. Progress 13000 of 100000 steps complete CPU time 5419.833542
[11:20:27] INFO: Checkpointed. Progress 14000 of 100000 steps complete CPU time 5835.373406
[11:27:53] INFO: Checkpointed. Progress 15000 of 100000 steps complete CPU time 6254.953696
[11:35:20] INFO: Checkpointed. Progress 16000 of 100000 steps complete CPU time 6675.345190
[11:42:44] INFO: Checkpointed. Progress 17000 of 100000 steps complete CPU time 7096.282689
[11:50:10] INFO: Checkpointed. Progress 18000 of 100000 steps complete CPU time 7516.658583
[11:57:39] INFO: Checkpointed. Progress 19000 of 100000 steps complete CPU time 7938.953290
[12:05:03] INFO: Sending trickle message to server.
[12:05:03] INFO: Starting intermediate upload, index = 2
[12:05:03] INFO: Checkpointed. Progress 20000 of 100000 steps complete CPU time 8357.628774
[12:12:31] INFO: Checkpointed. Progress 21000 of 100000 steps complete CPU time 8780.344684
[12:19:56] INFO: Checkpointed. Progress 22000 of 100000 steps complete CPU time 9198.255763
[12:27:21] INFO: Checkpointed. Progress 23000 of 100000 steps complete CPU time 9617.243249
[12:34:46] INFO: Checkpointed. Progress 24000 of 100000 steps complete CPU time 10037.057540
[12:42:13] INFO: Checkpointed. Progress 25000 of 100000 steps complete CPU time 10459.726649
[12:49:44] INFO: Checkpointed. Progress 26000 of 100000 steps complete CPU time 10883.612566
[12:57:14] INFO: Checkpointed. Progress 27000 of 100000 steps complete CPU time 11308.122488
[13:04:49] INFO: Checkpointed. Progress 28000 of 100000 steps complete CPU time 11734.301619
[13:12:27] INFO: Checkpointed. Progress 29000 of 100000 steps complete CPU time 12159.997148
[13:19:54] INFO: Sending trickle message to server.
[13:19:54] INFO: Starting intermediate upload, index = 3
[13:19:54] INFO: Checkpointed. Progress 30000 of 100000 steps complete CPU time 12578.875433
[13:27:46] INFO: Checkpointed. Progress 31000 of 100000 steps complete CPU time 12999.594530
[13:35:40] INFO: Checkpointed. Progress 32000 of 100000 steps complete CPU time 13421.202833
[13:43:24] INFO: Checkpointed. Progress 33000 of 100000 steps complete CPU time 13841.063924
[13:51:00] INFO: Checkpointed. Progress 34000 of 100000 steps complete CPU time 14262.734627
[13:58:39] INFO: Checkpointed. Progress 35000 of 100000 steps complete CPU time 14685.107335
[14:06:14] INFO: Checkpointed. Progress 36000 of 100000 steps complete CPU time 15106.232034
[14:13:58] INFO: Checkpointed. Progress 37000 of 100000 steps complete CPU time 15525.203920
[14:21:45] INFO: Checkpointed. Progress 38000 of 100000 steps complete CPU time 15948.512633
[14:29:25] INFO: Checkpointed. Progress 39000 of 100000 steps complete CPU time 16368.857328
[14:37:15] INFO: Sending trickle message to server.
[14:37:15] INFO: Starting intermediate upload, index = 4
[14:37:15] INFO: Checkpointed. Progress 40000 of 100000 steps complete CPU time 16791.152035
[14:44:59] INFO: Checkpointed. Progress 41000 of 100000 steps complete CPU time 17212.105133
[14:52:45] INFO: Checkpointed. Progress 42000 of 100000 steps complete CPU time 17629.173807
[15:00:32] INFO: Checkpointed. Progress 43000 of 100000 steps complete CPU time 18048.130092
[15:08:23] INFO: Checkpointed. Progress 44000 of 100000 steps complete CPU time 18470.502800
[15:16:07] INFO: Checkpointed. Progress 45000 of 100000 steps complete CPU time 18889.568286
[15:23:56] INFO: Checkpointed. Progress 46000 of 100000 steps complete CPU time 19308.430971
[15:31:39] INFO: Checkpointed. Progress 47000 of 100000 steps complete CPU time 19732.394889
[15:39:18] INFO: Checkpointed. Progress 48000 of 100000 steps complete CPU time 20152.864384
[15:47:08] INFO: Checkpointed. Progress 49000 of 100000 steps complete CPU time 20574.691088
[15:54:58] INFO: Sending trickle message to server.
[15:54:58] INFO: Starting intermediate upload, index = 5
[15:54:58] INFO: Checkpointed. Progress 50000 of 100000 steps complete CPU time 20994.942182
[16:02:40] INFO: Checkpointed. Progress 51000 of 100000 steps complete CPU time 21417.299289
[16:10:28] INFO: Checkpointed. Progress 52000 of 100000 steps complete CPU time 21839.983999
[16:17:29] INFO: Checkpointed. Progress 53000 of 100000 steps complete CPU time 22246.772207
[16:25:41] INFO:Turning trickle messaging on.
[16:25:41] INFO:Turning intermediate uploads on.
%IMPACT-I: Softcore binding energy with umax = 1000.00000
%IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic
Non-Polar Hydration Model
%IMPACT-I: Hybrid potential for binding with lambda = 1.00000
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
[16:39:34] INFO: Checkpointed. Progress 54000 of 100000 steps complete CPU time 23075.928512
[16:53:34] INFO: Checkpointed. Progress 55000 of 100000 steps complete CPU time 23913.435481
[17:07:25] INFO: Checkpointed. Progress 56000 of 100000 steps complete CPU time 24742.003592
[17:21:20] INFO: Checkpointed. Progress 57000 of 100000 steps complete CPU time 25576.406141
[17:35:13] INFO: Checkpointed. Progress 58000 of 100000 steps complete CPU time 26406.939865
[17:49:03] INFO: Checkpointed. Progress 59000 of 100000 steps complete CPU time 27234.275568
[18:02:52] INFO: Sending trickle message to server.
[18:02:52] INFO: Starting intermediate upload, index = 6
[18:02:52] INFO: Checkpointed. Progress 60000 of 100000 steps complete CPU time 28061.767273
[18:16:42] INFO: Checkpointed. Progress 61000 of 100000 steps complete CPU time 28891.536592
[18:30:31] INFO: Checkpointed. Progress 62000 of 100000 steps complete CPU time 29717.827088
[18:44:18] INFO: Checkpointed. Progress 63000 of 100000 steps complete CPU time 30543.758783
[18:58:17] INFO: Checkpointed. Progress 64000 of 100000 steps complete CPU time 31381.484153
[19:12:07] INFO: Checkpointed. Progress 65000 of 100000 steps complete CPU time 32210.925870
[19:25:58] INFO: Checkpointed. Progress 66000 of 100000 steps complete CPU time 33041.209992
[19:39:44] INFO: Checkpointed. Progress 67000 of 100000 steps complete CPU time 33866.829684
[19:53:30] INFO: Checkpointed. Progress 68000 of 100000 steps complete CPU time 34691.154568
[20:07:11] INFO: Checkpointed. Progress 69000 of 100000 steps complete CPU time 35510.986624
[20:20:48] INFO: Sending trickle message to server.
[20:20:48] INFO: Starting intermediate upload, index = 7
[20:20:48] INFO: Checkpointed. Progress 70000 of 100000 steps complete CPU time 36325.717446
[20:34:39] INFO: Checkpointed. Progress 71000 of 100000 steps complete CPU time 37155.892368
[20:48:34] INFO: Checkpointed. Progress 72000 of 100000 steps complete CPU time 37989.031308
[21:02:33] INFO: Checkpointed. Progress 73000 of 100000 steps complete CPU time 38823.012654
[21:16:31] INFO: Checkpointed. Progress 74000 of 100000 steps complete CPU time 39641.253500
[21:30:26] INFO: Checkpointed. Progress 75000 of 100000 steps complete CPU time 40474.844843
[21:44:16] INFO: Checkpointed. Progress 76000 of 100000 steps complete CPU time 41301.431742
[21:58:06] INFO: Checkpointed. Progress 77000 of 100000 steps complete CPU time 42130.998259
[22:11:55] INFO: Checkpointed. Progress 78000 of 100000 steps complete CPU time 42957.912760
[22:25:48] INFO: Checkpointed. Progress 79000 of 100000 steps complete CPU time 43789.366890
[22:39:35] INFO: Sending trickle message to server.
[22:39:35] INFO: Starting intermediate upload, index = 8
[22:39:35] INFO: Checkpointed. Progress 80000 of 100000 steps complete CPU time 44613.426572
[22:53:22] INFO: Checkpointed. Progress 81000 of 100000 steps complete CPU time 45440.169472
[23:07:14] INFO: Checkpointed. Progress 82000 of 100000 steps complete CPU time 46270.578395
[23:20:58] INFO: Checkpointed. Progress 83000 of 100000 steps complete CPU time 47092.438463
[23:34:54] INFO: Checkpointed. Progress 84000 of 100000 steps complete CPU time 47928.104620
[23:48:49] INFO: Checkpointed. Progress 85000 of 100000 steps complete CPU time 48761.789564
[00:02:43] INFO: Checkpointed. Progress 86000 of 100000 steps complete CPU time 49594.351301
[00:16:31] INFO: Checkpointed. Progress 87000 of 100000 steps complete CPU time 50420.626198
[00:30:20] INFO: Checkpointed. Progress 88000 of 100000 steps complete CPU time 51248.461104
[00:44:12] INFO: Checkpointed. Progress 89000 of 100000 steps complete CPU time 52078.511225
[00:57:58] INFO: Sending trickle message to server.
[00:57:58] INFO: Starting intermediate upload, index = 9
[00:57:58] INFO: Checkpointed. Progress 90000 of 100000 steps complete CPU time 52902.992110
[01:12:19] INFO: Checkpointed. Progress 91000 of 100000 steps complete CPU time 53735.850249
[01:26:22] INFO: Checkpointed. Progress 92000 of 100000 steps complete CPU time 54568.333985
[01:39:56] INFO: Checkpointed. Progress 93000 of 100000 steps complete CPU time 55369.133919
[01:53:53] INFO: Checkpointed. Progress 94000 of 100000 steps complete CPU time 56201.882857
[02:07:44] INFO: Checkpointed. Progress 95000 of 100000 steps complete CPU time 57031.511775
[02:21:39] INFO: Checkpointed. Progress 96000 of 100000 steps complete CPU time 57863.324707
[02:35:28] INFO: Checkpointed. Progress 97000 of 100000 steps complete CPU time 58687.883593
[02:49:13] INFO: Checkpointed. Progress 98000 of 100000 steps complete CPU time 59508.308452
[03:02:44] INFO: Checkpointed. Progress 99000 of 100000 steps complete CPU time 60316.206430
[03:16:43] INFO: Checkpointed. Progress 100000 of 100000 steps complete CPU time 61122.653600
%IMPACT-I: Species 1 written to SQL file md-out1.dms
%IMPACT-I: Species 2 written to SQL file md-out2.dms
03:16:44 (2176): called boinc_finish(0)

</stderr_txt>
]]>
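
For anyone who wants to put a number on the slowdown, here is a minimal sketch that reads the checkpoint lines and works out CPU seconds per 1000 steps. The sample lines are copied from the log above (two checkpoints before the restart at 16:25:41 and two after); the function names are just for this example.

```python
import re

# Lines copied from the stderr log above, around the restart.
SAMPLE = """\
[16:10:28] INFO: Checkpointed. Progress 52000 of 100000 steps complete CPU time 21839.983999
[16:17:29] INFO: Checkpointed. Progress 53000 of 100000 steps complete CPU time 22246.772207
[16:25:41] INFO:Turning trickle messaging on.
[16:39:34] INFO: Checkpointed. Progress 54000 of 100000 steps complete CPU time 23075.928512
[16:53:34] INFO: Checkpointed. Progress 55000 of 100000 steps complete CPU time 23913.435481
"""

CHECKPOINT = re.compile(r"Progress (\d+) of \d+ steps complete CPU time ([\d.]+)")


def checkpoints(text):
    """Return (steps, cpu_seconds) pairs for every checkpoint line."""
    return [(int(m.group(1)), float(m.group(2))) for m in CHECKPOINT.finditer(text)]


def cpu_seconds_per_1000_steps(points):
    """CPU seconds per 1000 steps between consecutive checkpoints."""
    return [
        (s1, (t1 - t0) * 1000.0 / (s1 - s0))
        for (s0, t0), (s1, t1) in zip(points, points[1:])
    ]


rates = cpu_seconds_per_1000_steps(checkpoints(SAMPLE))
for steps, rate in rates:
    print(f"up to step {steps}: {rate:.1f} CPU s per 1000 steps")
```

Over the whole log the same arithmetic gives roughly 420 CPU seconds per 1000 steps before the restart (22246.77 s for 53000 steps) and roughly 827 afterwards (38875.88 s for the remaining 47000 steps), so the per-step cost about doubled. The wall-clock gap between checkpoints grew by a similar factor, from about 7.5 to about 14 minutes.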
[Aug 29, 2015 4:38:41 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

The runtime projections can be wrong to very wrong, depending on the client version or on how good or bad the initial flop estimate was that the techs put into the feeder. There's an app_config option in the newest clients, <fraction_done_exact>, that helps to better estimate the remaining time. Read the configuration wiki and/or the forums on how to set this parameter at the project/science application level.
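
As a rough illustration of what an estimate based only on the reported fraction done looks like (which, as far as I understand it, is the idea behind <fraction_done_exact>), here is a minimal sketch. The function name is made up for the example, and the figures are the ones reported earlier in this thread: 21 hours elapsed with 2/3 of the work done.

```python
def remaining_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: assume the rest runs at the same pace so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1.0 - fraction_done) / fraction_done


estimate = remaining_hours(21.0, 2.0 / 3.0)
print(f"{estimate:.1f} h remaining")
```

With those figures the extrapolation gives about 10.5 hours remaining, close to the 10 hours the client predicted. It only holds if the task keeps running at the same speed, which the log above shows is not guaranteed.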
[Aug 29, 2015 4:55:06 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Interesting ones In Progress:
BETA_ avx101118-031_ r6_ 1_ wcgfahb00060000_ 0--
BETA_ avx101118-075_ r8_ 1_ wcgfahb00070000_ 0--
Yeah, interesting ones.
This numbering was expected to show up at some point.
Parent tasks must be from the 1st batch, where the user either aborted the task or exceeded the deadline (more probably) when the tasks were at 60% or 70% completion.

It would be nice to know whether, when the deadline is not met, the task will be aborted by the server or left to run to 100%.
I think Keith is the one to enlighten us on that.

This is normal. There are a few instances where this happens. One is that the user returned a trickle message that was determined to be invalid. When that happens, the next work unit is generated from the last valid point of the work unit. This allows the larger run of, say, 3 million steps that is needed to proceed without having to restart a given work unit from the beginning. This cuts down on wasted CPU cycles.

Thanks,
-Uplinger

That doesn't fully answer my question.
What happens to the task on the client's machine?
- When a trickle turns out to be invalid, will the rest of the client task be aborted by the server?
- When the task is at e.g. 60% when the deadline is reached, will the server abort the task for the remaining 40%?

Another interesting point.
The trickle message is sent before the corresponding upload files are received by the server.
Wouldn't it be better to upload the files successfully first, before sending/requesting a trickle message?

29 Aug 18:32:06 [trickle] read trickle file projects/www.worldcommunitygrid.org/trickle_up_BETA_avx101118-002_r0_1_wcgfahb00200000_0_1440865872.xml
29 Aug 18:32:06 Sending scheduler request: To send trickle-up message.
29 Aug 18:34:40 Started upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_3
29 Aug 18:34:40 Started upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_13
29 Aug 18:34:44 Finished upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_3
29 Aug 18:34:44 Finished upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_13
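
To put a number on that ordering, here is a small sketch that parses the client messages above and measures how long after the trickle request the last upload finished. The year is not in the log, so 2015 is filled in from the post date; the `parse` helper is just for this example, and the month abbreviation is assumed to parse under the default English locale.

```python
from datetime import datetime

# Lines copied from the client event log above.
LOG = """\
29 Aug 18:32:06 Sending scheduler request: To send trickle-up message.
29 Aug 18:34:40 Started upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_3
29 Aug 18:34:40 Started upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_13
29 Aug 18:34:44 Finished upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_3
29 Aug 18:34:44 Finished upload of BETA_avx101118-002_r0_1_wcgfahb00200000_0_13
"""


def parse(line):
    """Split a log line into (timestamp, message)."""
    day, month, clock, message = line.split(" ", 3)
    # The log has no year; 2015 is an assumption based on the post date.
    stamp = datetime.strptime(f"{day} {month} 2015 {clock}", "%d %b %Y %H:%M:%S")
    return stamp, message


events = [parse(line) for line in LOG.splitlines()]
trickle_sent = next(t for t, m in events if m.startswith("Sending scheduler request"))
last_upload_done = max(t for t, m in events if m.startswith("Finished upload"))
gap_seconds = (last_upload_done - trickle_sent).total_seconds()
print(f"uploads finished {gap_seconds:.0f} s after the trickle request")
```

In this excerpt the uploads complete a little over two and a half minutes after the scheduler request that carries the trickle.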
----------------------------------------

[Aug 29, 2015 5:00:45 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Could it be that the server file indexer is primed, so it knows what is coming? I think (but am unsure) I have seen at CPDN that the trickle requests, or to word it differently, the requests to trickle, went before the upload.

Yes, agreed: if a result step is invalid and another task is generated from the preceding 10K block, the conclusion seems to be that the one with the fault has a broken chain, i.e. nothing good could come out of the remainder. A server-forced abort would be in order.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 29, 2015 5:09:17 PM]
[Aug 29, 2015 5:07:57 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316
Status: Offline
Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Does it surprise you?

BETA_ avx101118-033_ r18_ 1_ 1-- - No Reply Sent Time 8/29/15 18:01:10 Due Time 8/29/15 18:01:10 0.00 0.0 / 0.0
----------------------------------------

[Aug 29, 2015 7:08:40 PM]