World Community Grid Forums
Category: Beta Testing
Forum: Beta Test Support Forum
Thread: Clean Energy Project - Phase 2 Beta Feb 24, 2016 [ Issues Thread ]
Thread Status: Active Total posts in this thread: 114
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline Project Badges: |
2 GB is allowed for 1 slot, and that is being exceeded.
I got 3 resends, only from Linux machines, with this "Maximum disk usage exceeded" error and will probably run into the same error on my machine.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
BETA_E236293_985_S.320.C37H29N1O4S1Si1.VYS...UHFFFAOYSA-N.13_s1_14 started as a 3:34 hour WU. I restarted BOINC and my computer within the initially projected timeframe without error, but the WU did not finish within that time. After 15+ hours of computing, I shut down the computer for the night. The next morning, the WU restarted from the beginning with no messages in the event log. It is still running after 2:40 minutes with a remaining (estimated) time of 3:05 minutes.
The event log does not have any messages about the beta WU, not even a checkpoint. Messages from other WUs appear, but nothing from this beta. Partial event log:
2/27/2016 8:45:22 AM | | Starting BOINC client version 7.6.22 for windows_x86_64
2/27/2016 8:45:22 AM | | log flags: file_xfer, sched_ops, task, checkpoint_debug
2/27/2016 8:45:22 AM | | Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8
2/27/2016 8:45:22 AM | | Data directory: C:\ProgramData\BOINC
2/27/2016 8:45:22 AM | | Running under account trodr
2/27/2016 8:45:26 AM | | CUDA: NVIDIA GPU 0: GeForce GTX 960 (driver version 361.91, CUDA version 8.0, compute capability 5.2, 2048MB, 1636MB available, 2412 GFLOPS peak)
2/27/2016 8:45:26 AM | | OpenCL: NVIDIA GPU 0: GeForce GTX 960 (driver version 361.91, device version OpenCL 1.2 CUDA, 2048MB, 1636MB available, 2412 GFLOPS peak)
2/27/2016 8:45:26 AM | | OpenCL: Intel GPU 0: Intel(R) HD Graphics 4600 (driver version 20.19.15.4331, device version OpenCL 1.2, 1630MB, 1630MB available, 200 GFLOPS peak)
2/27/2016 8:45:26 AM | | OpenCL CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 5.2.0.10094, device version OpenCL 1.2 (Build 10094))
2/27/2016 8:45:26 AM | | All projects have zero resource share; setting to 100
2/27/2016 8:45:26 AM | | Host name: Mango
2/27/2016 8:45:26 AM | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz [Family 6 Model 60 Stepping 3]
2/27/2016 8:45:26 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx tm2 pbe fsgsbase bmi1 smep bmi2
2/27/2016 8:45:26 AM | | OS: Microsoft Windows 10: Core x64 Edition, (10.00.10586.00)
2/27/2016 8:45:26 AM | | Memory: 15.92 GB physical, 18.29 GB virtual
2/27/2016 8:45:26 AM | | Disk: 237.92 GB total, 81.93 GB free
2/27/2016 8:45:26 AM | | Local time is UTC -8 hours
2/27/2016 8:45:26 AM | | VirtualBox version: 5.0.10
2/27/2016 8:45:26 AM | GPUGRID | URL http://www.gpugrid.net/; Computer ID 292928; resource share 100
2/27/2016 8:45:26 AM | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 3467956; resource share 100
2/27/2016 8:45:26 AM | World Community Grid | General prefs: from World Community Grid (last modified 19-Feb-2016 06:48:36)
2/27/2016 8:45:26 AM | World Community Grid | Host location: none
2/27/2016 8:45:26 AM | World Community Grid | General prefs: using your defaults
2/27/2016 8:45:26 AM | | Reading preferences override file
2/27/2016 8:45:26 AM | | Preferences:
2/27/2016 8:45:26 AM | | max memory usage when active: 16300.71MB
2/27/2016 8:45:26 AM | | max memory usage when idle: 16300.71MB
2/27/2016 8:45:26 AM | | max disk usage: 21.41GB
2/27/2016 8:45:26 AM | | suspend work if non-BOINC CPU load exceeds 45%
2/27/2016 8:45:26 AM | | (to change preferences, visit a project web site or select Preferences in the Manager)
…
2/27/2016 11:46:34 AM | World Community Grid | [checkpoint] result FAH2_000071_avx17558_000096_0056_012_wcgfahb00020000_0 checkpointed
--------------
Aborted the job last night after a suspend/resume cycle reset the WU to the beginning. The WU was not going anywhere.
[Edit 1 times, last edit by Former Member at Feb 28, 2016 9:23:36 PM]
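One way to double-check that a task really logged no checkpoints is to scan a saved copy of the event log for the "[checkpoint] result ... checkpointed" messages that appear when the checkpoint_debug log flag is on. The following is only a rough Python sketch: the function names are made up for this example, and it assumes the log has been saved as plain text in the same message format as the excerpt in this post.

```python
import re

# Message text written when the checkpoint_debug log flag is enabled, e.g.
# "... | World Community Grid | [checkpoint] result <task name> checkpointed"
CHECKPOINT_RE = re.compile(r"\[checkpoint\] result (\S+) checkpointed")

def checkpoint_counts(log_text):
    """Return {task name: number of checkpoint messages} for a log dump."""
    counts = {}
    for name in CHECKPOINT_RE.findall(log_text):
        counts[name] = counts.get(name, 0) + 1
    return counts

def has_checkpointed(log_text, name_prefix):
    """True if any task whose name starts with name_prefix logged a checkpoint."""
    return any(name.startswith(name_prefix) for name in checkpoint_counts(log_text))

# The only checkpoint line in the excerpt is from a FAH2 task, not the beta.
sample = (
    "2/27/2016 11:46:34 AM | World Community Grid | [checkpoint] result "
    "FAH2_000071_avx17558_000096_0056_012_wcgfahb00020000_0 checkpointed\n"
)
counts = checkpoint_counts(sample)
print(counts)
print("beta checkpointed:", has_checkpointed(sample, "BETA_"))
```

Run against a full log, an empty result for the BETA_ prefix would match what is described here: other tasks report checkpoints, the beta task does not.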
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
"BETA_E236293_985_S.320.C37H29N1O4S1Si1.VYS...UHFFFAOYSA-N.13_s1_14 started as a 3:34 hour WU. I restarted BOINC and my computer within the initially projected timeframe without error, but the WU did not finish within that time. After 15+ hours of computing, I shut down the computer for the night. The next morning, the WU restarted from the beginning with no messages in the event log. It is still running after 2:40 minutes with a remaining (estimated) time of 3:05 minutes."
FWIW, I looked for checkpointing on many of the tasks I received. I didn't find any that checkpointed before 9+ hours of run time.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
|
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline Project Badges: |
2 GB is allowed for 1 slot, and that is being exceeded. I got 3 resends, only from Linux machines, with this "Maximum disk usage exceeded" error and will probably run into the same error on my machine.
As expected, the first of the three mentioned tasks errored out:
27 Feb 22:17:27 Aborting task BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_2: exceeded disk limit: 2144.85MB > 2048.00MB
[OT]: I can't report the task at the moment because of the famous:
27 Feb 21:24:56 Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates
after a Linux upgrade. I have to wait to restart BOINC/reboot the machine until 4 other CEP Betas are ready/errored out. [/OT]
[Edit 1 times, last edit by Crystal Pellet at Feb 27, 2016 9:41:38 PM]
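To see by how much a task overshot its limit, the abort message can be picked apart with a regular expression. This is a hedged Python sketch, not anything BOINC itself provides: the pattern is inferred only from the single message quoted in this post, and the function name and dictionary keys are invented for the example.

```python
import re

# Pattern inferred from the one abort message quoted in this post.
DISK_ABORT_RE = re.compile(
    r"Aborting task (\S+): exceeded disk limit: (\d+(?:\.\d+)?)MB > (\d+(?:\.\d+)?)MB"
)

def parse_disk_abort(line):
    """Return task name, usage, limit and overshoot in MB, or None if no match."""
    match = DISK_ABORT_RE.search(line)
    if match is None:
        return None
    used = float(match.group(2))
    limit = float(match.group(3))
    return {
        "task": match.group(1),
        "used_mb": used,
        "limit_mb": limit,
        "over_mb": round(used - limit, 2),
    }

line = (
    "27 Feb 22:17:27 Aborting task "
    "BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_2: "
    "exceeded disk limit: 2144.85MB > 2048.00MB"
)
info = parse_disk_abort(line)
print(info)
```

For the task above, the overshoot works out to a little under 100 MB beyond the 2048 MB limit.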
||
|
Trotador
Senior Cruncher Joined: Mar 26, 2009 Post Count: 154 Status: Offline Project Badges: |
"BETA_E236293_985_S.320.C37H29N1O4S1Si1.VYS...UHFFFAOYSA-N.13_s1_14 started as a 3:34 hour WU. I restarted BOINC and my computer within the initially projected timeframe without error, but the WU did not finish within that time. After 15+ hours of computing, I shut down the computer for the night. The next morning, the WU restarted from the beginning with no messages in the event log. It is still running after 2:40 minutes with a remaining (estimated) time of 3:05 minutes."
"FWIW, I looked for checkpointing on many of the tasks I received. I didn't find any that checkpointed before 9+ hours of run time."
Yeah, no checkpoint in the first many hours of crunching.
||
|
pvh513
Senior Cruncher Joined: Feb 26, 2011 Post Count: 260 Status: Offline Project Badges: |
I finished five WUs from the new batch so far. Three ended with "Application exited with RC = 0x100", the other two with "Application exited with RC = 0xb".
One of these five units had a "Maximum disk usage exceeded" with my wingman, but I did not get that error. I am running openSUSE 42.1 on all rigs. This WU is called BETA_E236294_325_S.314.C30H22N8O3S2.FNMFXVKTVDLVCB-UHFFFAOYSA-N.7_s1_14_1--
[Edit 1 times, last edit by pvh513 at Feb 27, 2016 10:55:40 PM]
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline Project Badges: |
How often will tasks of this workunit be resent when all of them give the same error: "Maximum disk usage exceeded"?
BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_4-- | Linux 3.16.0-38-generic | - | In Progress | 2/28/16 08:00:29 | 2/29/16 17:36:28 | 0.00 | 0.0 / 0.0
BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_3-- | Linux 2.6.32-504.el6.centos.plus.x86_64 | - | In Progress | 2/27/16 21:30:05 | 2/29/16 07:06:04 | 0.00 | 0.0 / 0.0
BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_2-- | Linux 3.2.0-98-generic | 700 | Error | 2/27/16 07:54:36 | 2/28/16 08:00:15 | 12.86 | 371.3 / 0.0
BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_1-- | Linux 3.16.0-38-generic | 700 | Error | 2/26/16 20:50:06 | 2/27/16 07:54:28 | 7.18 | 169.4 / 0.0
BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14_0-- | Linux 4.2.0-1-amd64 | 700 | Error | 2/26/16 20:49:23 | 2/27/16 21:30:03 | 4.63 | 166.8 / 0.0
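The status rows for this workunit can be tallied to see how many copies have gone out and how many have failed so far. The Python sketch below is illustrative only: the helper name is invented, the result names and statuses are typed in from the rows in this post, and it assumes that the trailing _N in a result name is the copy number and that the "--" after it is display padding.

```python
from collections import Counter

# Workunit name and (result name, status) pairs copied from the rows in this post.
WORKUNIT = "BETA_E236295_747_S.314.C30H22N6O5S1Si1.IQWNYZMZQUSGER-UHFFFAOYSA-N.13_s1_14"
rows = [
    (WORKUNIT + "_4--", "In Progress"),
    (WORKUNIT + "_3--", "In Progress"),
    (WORKUNIT + "_2--", "Error"),
    (WORKUNIT + "_1--", "Error"),
    (WORKUNIT + "_0--", "Error"),
]

def split_result_name(name):
    """Split 'workunit_N--' into (workunit, N); assumes N is the copy number."""
    workunit, _, copy_number = name.rstrip("-").rpartition("_")
    return workunit, int(copy_number)

status_counts = Counter(status for _name, status in rows)
copies = sorted(split_result_name(name)[1] for name, _status in rows)

print(dict(status_counts))
print("copies sent so far:", copies)
```

This only summarizes what the rows show (three errors, two copies still in progress); it says nothing about where the server stops resending.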
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Dreading it: got 8 of the 295 batch, 6 running, and some into the 17th hour with 3 recorded checkpoints, all _0 and _1.
Just curious: a while ago there was talk of real biggies coming to the grid [suppose they need opt-in to opt-in]... could these be it, with the disk_bound really needing upping? Anyway, 5 copies would be the stop sign, but I'm not going to wait for it. As soon as the first one goes MDUE, the rest get aborted by hand... beta hours just for the heck of getting them credited on known failed jobs are not of particular interest to me.
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline Project Badges: |
Next one returned:
28 Feb 10:38:47 Aborting task BETA_E236295_221_S.318.C26H26N4O8S1Si2.FAWZKWOIBUVPJU-UHFFFAOYSA-N.8_s1_14_3: exceeded disk limit: 2104.93MB > 2048.00MB
I can't imagine that in production one should need >2GB for 1 slot. As far as I can see the errors are only happening on Linux machines. Maybe something wrong with purging temporary files from the slot?
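If leftover temporary files are the suspect, one rough check is to total the size of everything in the slot directory while the task runs and compare it with the 2048 MB limit from the abort message. The Python sketch below makes several assumptions: the helper name is invented, 1 MB is taken as 1024*1024 bytes, and the demo measures a throwaway temporary directory rather than a real slot, so the path of an actual slot directory would have to be substituted by hand.

```python
import os
import tempfile

SLOT_LIMIT_MB = 2048.0  # limit quoted in the abort message

def dir_size_mb(path):
    """Sum the sizes of all regular files under path, in MB (1 MB = 1024*1024 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file disappeared while scanning, e.g. a temp file being purged
    return total / (1024 * 1024)

# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "scratch.tmp"), "wb") as handle:
        handle.write(b"\0" * (3 * 1024 * 1024))
    demo_mb = dir_size_mb(demo_dir)

print(f"{demo_mb:.2f} MB used, limit {SLOT_LIMIT_MB:.2f} MB")
```

Sampling the real slot directory every few minutes would show whether usage grows steadily or jumps when one job in the task hands over to the next.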
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Oh, well mine run on a W7-64. The purging issue / true obliteration of slot content of old jobs was resolved somewhere around 7.6.9. Have 7.6.22 on the Ubuntu.
2GB is the minimum setting on the System Requirement page... just had a rejection of UGM work requests, because the full 5GB allowed for BOINC had been used, so upped it to 6GB...
10531 2/28/2016 9:50:24 AM Message from server: Uncovering Genome Mysteries needs 500.00MB more disk space. You currently have 0.00 MB available and it needs 500.00 MB.
and later
10959 2/28/2016 11:02:29 AM Message from server: Uncovering Genome Mysteries needs 4.83MB more disk space. You currently have 495.17 MB available and it needs 500.00 MB.
So gave it another 1GB... not usually running 6 CEP2 concurrently... 1 or 2 max, because as of now, got a slug under my hands with very poor efficiency.
edit: First finished of 295 with 4 checkpoints, skipping #4 (job 5), earlier observed by others. Guess this is same same continuation with an RC = end ... no science mileage to be made, or gone between nothingness and eternity.
Result Name: BETA_E236295_676_S.314.C30H22N6O5S1Si1.YTTCTFKEFVTPBZ-UHFFFAOYSA-N.1_s1_14_0--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[17:14:34] Number of jobs = 5
[17:14:34] Starting job 0,CPU time has been restored to 0.000000.
[06:42:39] Finished Job #0
[06:42:39] Starting job 1,CPU time has been restored to 43698.625718.
[08:10:48] Finished Job #1
[08:10:48] Starting job 2,CPU time has been restored to 48849.139134.
[08:41:17] Finished Job #2
[08:41:17] Starting job 3,CPU time has been restored to 50549.394033.
09:25:55 (23264): No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[09:26:28] Number of jobs = 5
[09:26:28] Starting job 3,CPU time has been restored to 50549.394033.
09:31:57 (21456): No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[09:34:31] Number of jobs = 5
[09:34:31] Starting job 3,CPU time has been restored to 50549.394033.
Application exited with RC = 0xc0000005
[11:35:22] Finished Job #3
[11:35:22] Starting job 4,CPU time has been restored to 56878.619805.
[11:35:22] Skipping Job #4
11:35:28 (21352): called boinc_finish
</stderr_txt>
]]>
[Edit 1 times, last edit by SekeRob* at Feb 28, 2016 10:46:32 AM]
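The "CPU time has been restored to" figures in the stderr output give a rough per-job breakdown. The Python sketch below is an illustration with an invented function name; it assumes each figure is the cumulative CPU time in seconds at the moment that job starts, so the difference between two consecutive jobs approximates the CPU time charged to the earlier one. Work lost when job 3 restarted from the same figure would not show up in these numbers.

```python
import re

# Matches e.g. "[06:42:39] Starting job 1,CPU time has been restored to 43698.625718."
START_RE = re.compile(
    r"Starting job (\d+),\s*CPU time has been restored to (\d+(?:\.\d+)?)"
)

def cpu_seconds_per_job(stderr_text):
    """Return {job: CPU seconds between its start figure and the next job's}."""
    restored = {}
    for job, seconds in START_RE.findall(stderr_text):
        restored[int(job)] = float(seconds)  # restarts of a job repeat the same value
    jobs = sorted(restored)
    return {a: restored[b] - restored[a] for a, b in zip(jobs, jobs[1:])}

stderr_text = """
[17:14:34] Starting job 0,CPU time has been restored to 0.000000.
[06:42:39] Starting job 1,CPU time has been restored to 43698.625718.
[08:10:48] Starting job 2,CPU time has been restored to 48849.139134.
[08:41:17] Starting job 3,CPU time has been restored to 50549.394033.
[09:26:28] Starting job 3,CPU time has been restored to 50549.394033.
[09:34:31] Starting job 3,CPU time has been restored to 50549.394033.
[11:35:22] Starting job 4,CPU time has been restored to 56878.619805.
"""
per_job = cpu_seconds_per_job(stderr_text)
for job, seconds in per_job.items():
    print(f"job {job}: {seconds / 3600:.2f} h CPU")
```

Under that assumption, job 0 accounts for the bulk of the run, which fits the reports in this thread of no checkpoint for many hours if checkpoints only happen at job boundaries (also an assumption, not something the log states).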
||
|
|