| World Community Grid Forums
Thread Status: Active Total posts in this thread: 18
|
|
| Author |
|
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges:
|
Please post issues for this beta here.
http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=525207 Thanks, armstrdj |
||
|
|
dango
Senior Cruncher Joined: Jul 27, 2009 Post Count: 307 Status: Offline Project Badges:
|
got 2, started well....
|
||
|
|
JimWork
Cruncher Canada Joined: Oct 11, 2005 Post Count: 35 Status: Offline Project Badges:
|
got 12 ! woohoo --- if they work I get my little ruby badge and selfie pat on the back
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Guess what, same machine as before, the first download session (3 beta units) failed with error -120 (RSA key check failed for file), all subsequent download sessions of beta units succeeded. It does now have a scan exclusion on the BOINC ProgramData folder, so bye-bye to that hypothesis.

Failed download (all these 3 errored with -120):
17/06/2016 21:18:13 | World Community Grid | Scheduler request completed: got 3 new tasks
17/06/2016 21:18:15 | World Community Grid | Started download of wcgrid_beta22_gromacs_7.21_windows_x86_64
17/06/2016 21:18:15 | World Community Grid | Started download of wcgrid_HST1_graphics_prod_64.exe.7.21
17/06/2016 21:18:15 | World Community Grid | Started download of beta22_image01_7.21.tga
17/06/2016 21:18:15 | World Community Grid | Started download of beta22_image02_7.21.tga
17/06/2016 21:18:17 | World Community Grid | Finished download of beta22_image01_7.21.tga
17/06/2016 21:18:17 | World Community Grid | Finished download of beta22_image02_7.21.tga
17/06/2016 21:18:17 | World Community Grid | Started download of beta22_image03_7.21.tga
17/06/2016 21:18:17 | World Community Grid | Started download of beta22_image04_7.21.tga
17/06/2016 21:18:18 | World Community Grid | Finished download of beta22_image03_7.21.tga
17/06/2016 21:18:18 | World Community Grid | Finished download of beta22_image04_7.21.tga
17/06/2016 21:18:18 | World Community Grid | Started download of 51c7514d41a369294878918786403cd6.tpr
17/06/2016 21:18:18 | World Community Grid | Started download of 921d09821a96f169f77376f133f4a067.tpr
17/06/2016 21:18:19 | World Community Grid | Finished download of wcgrid_HST1_graphics_prod_64.exe.7.21
17/06/2016 21:18:19 | World Community Grid | Started download of e3c05a80b997297c5d05e51266bf4e08.tpr
17/06/2016 21:18:41 | World Community Grid | Finished download of 921d09821a96f169f77376f133f4a067.tpr
17/06/2016 21:18:49 | | Project communication failed: attempting access to reference site
17/06/2016 21:18:49 | World Community Grid | Temporarily failed download of wcgrid_beta22_gromacs_7.21_windows_x86_64: transient HTTP error
17/06/2016 21:18:49 | World Community Grid | Finished download of 51c7514d41a369294878918786403cd6.tpr
17/06/2016 21:18:50 | | Internet access OK - project servers may be temporarily down.
17/06/2016 21:18:50 | World Community Grid | Started download of wcgrid_beta22_gromacs_7.21_windows_x86_64
17/06/2016 21:18:50 | World Community Grid | Finished download of e3c05a80b997297c5d05e51266bf4e08.tpr
17/06/2016 21:18:51 | World Community Grid | Finished download of wcgrid_beta22_gromacs_7.21_windows_x86_64

Subsequent successful download (these 3 downloaded ok):
17/06/2016 21:23:19 | World Community Grid | Scheduler request completed: got 3 new tasks
17/06/2016 21:23:21 | World Community Grid | Started download of wcgrid_beta22_gromacs_7.21_windows_x86_64
17/06/2016 21:23:21 | World Community Grid | Started download of 8c805b3f4f07dd1c0f322724e785cf44.tpr
17/06/2016 21:23:21 | World Community Grid | Started download of 9a3569ba9e3ef061792b6cda5e3beeae.tpr
17/06/2016 21:23:21 | World Community Grid | Started download of 419e0dc8d42fc9adeb1dfddc4d42071f.tpr
17/06/2016 21:23:36 | World Community Grid | Finished download of 8c805b3f4f07dd1c0f322724e785cf44.tpr
17/06/2016 21:23:37 | World Community Grid | Finished download of 9a3569ba9e3ef061792b6cda5e3beeae.tpr
17/06/2016 21:23:37 | World Community Grid | Finished download of 419e0dc8d42fc9adeb1dfddc4d42071f.tpr
17/06/2016 21:23:43 | World Community Grid | Finished download of wcgrid_beta22_gromacs_7.21_windows_x86_64

Any ideas? |
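For anyone wanting to compare their own event log against the one above, here is a rough Python sketch that counts the Started / Finished / Temporarily failed download events per file and lists the files that had to be retried. The function names (`summarize_downloads`, `retried_files`) and the `SAMPLE` excerpt are made up for illustration, and the regular expression only assumes the message wording seen in the log above. It shows which files were retried, nothing more; it does not prove that the retry is what triggers the -120 error.

```python
import re
from collections import defaultdict

# Matches only the three download messages seen in the log above;
# every other event-log line is ignored.
EVENT = re.compile(
    r"\| (?P<kind>Started|Finished|Temporarily failed) download of (?P<name>[^\s:]+)"
)


def summarize_downloads(log_text):
    """Count Started / Finished / Temporarily failed events per file name."""
    counts = defaultdict(
        lambda: {"Started": 0, "Finished": 0, "Temporarily failed": 0}
    )
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if match:
            counts[match.group("name")][match.group("kind")] += 1
    return dict(counts)


def retried_files(log_text):
    """Return the sorted names of files with at least one temporary failure."""
    summary = summarize_downloads(log_text)
    return sorted(
        name for name, c in summary.items() if c["Temporarily failed"] > 0
    )


# Hypothetical excerpt, cut down from the failed session in the post above.
SAMPLE = """\
17/06/2016 21:18:15 | World Community Grid | Started download of wcgrid_beta22_gromacs_7.21_windows_x86_64
17/06/2016 21:18:18 | World Community Grid | Started download of 51c7514d41a369294878918786403cd6.tpr
17/06/2016 21:18:49 | World Community Grid | Temporarily failed download of wcgrid_beta22_gromacs_7.21_windows_x86_64: transient HTTP error
17/06/2016 21:18:49 | World Community Grid | Finished download of 51c7514d41a369294878918786403cd6.tpr
17/06/2016 21:18:50 | World Community Grid | Started download of wcgrid_beta22_gromacs_7.21_windows_x86_64
17/06/2016 21:18:51 | World Community Grid | Finished download of wcgrid_beta22_gromacs_7.21_windows_x86_64
"""

if __name__ == "__main__":
    for name in retried_files(SAMPLE):
        print("retried:", name)
```

Paste a copy of your own event log in place of `SAMPLE` (or read it from a file) to see whether the retried files line up with the tasks that errored.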
||
|
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
What's the 32 bit version number? Got an x86_64 on the test W10 with v7.21.
HST issues I know of: Validation 32 bit AMD CPUs
And is all this effort working up towards getting feeder levels above 11K a day? Outstanding issues, such as FAH2 going invalid when crunching offline through the 10th trickle, would be higher on my priority list to resolve. I'm simply not crunching them, as internet stability is iffy here.
[Edit 1 times, last edit by SekeRob* at Jun 17, 2016 9:15:40 PM] |
||
|
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 786 Status: Offline Project Badges:
|
On Windows XP 32 bit 2 core I have 2 T000 betas running, using CPU and % done increasing but CPU% zero in BoincTasks. Windows task manager shows 20MB Mem usage and 1GB VM size, 6,000 page faults.
1st: No checkpoint after 45 minutes, 5% done.
2nd: I tried to suspend but did not see CPU drop, reset to start, now 1.2% after 10 mins.
AuthenticAMD, AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ [Family 15 Model 107 Stepping 2]
Memory 1.87 Gb, Virtual: 4.65 Gb
Disk Used: 15.62 Gb, Free: 0.44 Gb
Paul.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Had three so far on WinXP32 with BOINC 7.6.22 and all went like this:

Result Name: BETA_HST1_004073_000084_AC0018_T325_F00008_S00005_0--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
The access code is invalid. (0xc) - exit code 12 (0xc)
</message>
<stderr_txt>
INFO: result number = 0
INFO: No state to restore. Start from the beginning.
[21:33:54] INFO: Running initial simulation
-------------------------------------------------------
Program projects/www.worldcommunitygrid.org/wcgrid_beta22_gromacs_7, VERSION 4.6.1
Source code file: .\src\gmxlib\smalloc.c, line: 247
Fatal error:
Not enough memory. Failed to realloc 527008 bytes for nl->gid, nl->gid=0x0
(called from file .\src\mdlib\ns.c, line 122)
For more information and tips for troubleshooting, please check the GROMACS website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
: Not enough space
</stderr_txt>
]]>

I can't say I understand the memory error, as Windows Task Manager says the Commit Charge is 1658M / 3168M and, as we had a couple of power failures less than 6 hours ago, the machine is pretty "fresh". One of the betas at 7.20 is still running alongside a production 7.16 -- could there be some interaction?
I've also noticed a bunch of soft_link files getting updated rather often to the current time. Where are those coming from? |
||
|
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 786 Status: Offline Project Badges:
|
An Intel XP 32 completed OK with CPU logged.
On the dual core XP 32 one is valid and other PV. CPU is zero on both.

Extract from log of the unit I restarted:
Result Name: BETA_HST1_004078_000001_AT0012_T000_F00056_S00006_0--
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<stderr_txt>
22:37:23 (28764): start_timer_thread(): CreateThread() failed, errno 0
INFO: result number = 0
INFO: No state to restore. Start from the beginning.
[22:37:23] INFO: Running initial simulation
[22:43:35] INFO: Completed step 100000 of initial simulation
[22:49:28] INFO: Completed step 200000 of initial simulation
[22:55:06] INFO: Completed step 300000 of initial simulation
22:58:59 (26716): start_timer_thread(): CreateThread() failed, errno 0
INFO: result number = 0
INFO: No state to restore. Start from the beginning.
[22:58:59] INFO: Running initial simulation
Back Off! I just backed up md.log to ./#md.log.1#
Back Off! I just backed up traj.xtc to ./#traj.xtc.1#
Back Off! I just backed up ener.edr to ./#ener.edr.1#
[23:05:24] INFO: Completed step 100000 of initial simulation
...
[03:41:32] INFO: Completed step 5000000 of initial simulation
[03:41:32] INFO: Finished initial simulation.
[03:41:32] INFO: Running secondary simulation
[03:41:35] INFO: Run complete, CPU time: 16736.515625
03:41:35 (26716): called boinc_finish(0)
</stderr_txt>
]]>

BETA_HST1_004078_000001_AT0012_T000_F00056_S00006_0-- unknown2 Pending Validation 17/06/16 20:20:35 18/06/16 02:40:07 0.00 / 4.71 98.6 / 0.0

Paul.
Edit: Add result status.
[Edit 1 times, last edit by PMH_UK at Jun 18, 2016 8:54:52 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Today I completed a repair job for the 17 June beta, with one wingman turning Invalid. The only difference in the Invalid Result Log was a couple of restarts from a checkpoint file.

BETA_HST1_004074_000024_AC0031_T400_F00045_S00005_2-- Microsoft Windows 10 Professional x64 Edition, (10.00.10586.00) 721 Valid 20/06/16 14:00:15 20/06/16 18:43:07 3.62 143.2 / 146.2
BETA_HST1_004074_000024_AC0031_T400_F00045_S00005_1-- Microsoft Windows 8.1 x64 Edition, (06.03.9600.00) 721 Invalid 17/06/16 20:39:32 20/06/16 14:00:06 7.43 214.7 / 146.2
BETA_HST1_004074_000024_AC0031_T400_F00045_S00005_0-- Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00) 721 Valid 17/06/16 20:39:25 18/06/16 06:45:19 5.99 149.2 / 146.2

Result Name: BETA_HST1_004074_000024_AC0031_T400_F00045_S00005_1--
<core_client_version>7.6.9</core_client_version>
<![CDATA[
<stderr_txt>
INFO: result number = 1
INFO: No state to restore. Start from the beginning.
[16:27:19] INFO: Running initial simulation
Writing checkpoint at step 380.
Writing checkpoint at step 780.
Writing checkpoint at step 1230.
Writing checkpoint at step 1750.
[16:49:35] INFO: Completed step 2000 of initial simulation
... (snipped)
[19:10:34] INFO: Completed step 38000 of initial simulation
Writing checkpoint at step 38410.
Writing checkpoint at step 39510.
[19:19:29] INFO: Completed step 40000 of initial simulation
INFO: result number = 1
[11:13:36] INFO: Running initial simulation
Reading checkpoint file state.cpt generated: Fri Jun 17 19:17:21 2016
[11:15:08] INFO: Completed step 40000 of initial simulation
... (snipped)
[15:05:20] INFO: Completed step 86000 of initial simulation
Writing checkpoint at step 86710.
INFO: result number = 1
[07:51:55] INFO: Running initial simulation
Reading checkpoint file state.cpt generated: Sat Jun 18 15:08:44 2016
[07:56:17] INFO: Completed step 88000 of initial simulation
Writing checkpoint at step 88170.
... (snipped)
Writing checkpoint at step 98980.
[08:46:14] INFO: Completed step 100000 of initial simulation
Writing checkpoint at step 100000.
[08:46:17] INFO: Finished initial simulation.
[08:46:17] INFO: Running secondary simulation
[08:58:53] INFO: Run complete, CPU time: 26744.542500
08:58:53 (4288): called boinc_finish(0) |
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
All WUs of that batch returned, including one that was received on a 32bit Windows Server 2003 box, with only one other resulting in an error (BETA_HST1_004074_000080_AC0032_T400_F00001_S00005, on Windows 7/64bit), but so did 3 wingmen (1 Windows XP 32bit, 1 Windows 7/64bit, 1 Windows 8.1/64)...
|
||
|
|
|