World Community Grid Forums
Thread Status: Active Total posts in this thread: 156
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline |
It might help; I was thinking about suggesting it.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I don't think that limiting the number of concurrent CEP2 WUs is going to help... I had just one running on my quad-core yesterday alongside 3 C4CW WUs, and the CEP2 WU still got shorted, exactly as all the others have for me over the last few weeks. Also, before this problem started, there were plenty of times when I was running 4 at once on the quad-core and got perfectly acceptable CPU times reported.
|
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Please understand that limiting the number of concurrent CEP2 tasks was not meant to imply that the 'big' time losses would be erased by it. That's an issue to be resolved before a release for all supported platforms is put into production. The concurrency limit is there to further reduce the inefficiency caused by the large disk I/O: the more tasks running, the more bottlenecking. On my quad, running 2 versus 4 makes a difference of more than 10 minutes per task.
edit: strike "i.e."
WCG
Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at Oct 18, 2010 4:17:00 PM] |
||
|
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline |
Just a note to say that when I suspended one of these tasks, it immediately errored out:
E200464_ 778_ A.24.C20H15NOSSi.230.0.set1d06_ 1-- four Error 17/10/10 19:20:09 18/10/10 18:57:13 3.06 60.6 / 0.0 Kubuntu x64 Q6600 |
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline |
Yikes, here's the log for a CEP2 job that came up "valid." Look at how many times it encountered some kind of problem during job #2 and started job 2 over again, resetting the CPU time every time. It ended up with 6.2 hrs CPU time, but must have spent more than that on job #2 alone. (This is on a "reliable machine" that has been crunching nearly 24/7 since Feb.)
I'm going to stop crunching CEP2 for a while. I look forward to participating in its next Linux beta and rejoining when there's a better-running version.
World Community Grid Result Log
Result Name: E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0--
<core_client_version>6.10.17</core_client_version> <![CDATA[ <stderr_txt> INFO: No state to restore. Start from the beginning. [17:04:26] Number of jobs = 16 [17:04:26] Starting job 0,CPU time has been restored to 0.000000. [17:04:26] Starting new Job [17:04:26] Qink name = fldman [17:04:26] Qink name = gesman [17:04:26] Qink name = scfman [17:07:06] Qink name = anlman [17:07:09] End of Job [17:07:11] Finished Job #0 [17:07:11] Starting job 1,CPU time has been restored to 88.290000. [17:07:11] Starting new Job [17:07:11] Qink name = fldman [17:07:12] Qink name = gesman [17:07:12] Qink name = scfman [17:13:42] Qink name = anlman [17:14:18] End of Job [17:14:21] Finished Job #1 [17:14:21] Starting job 2,CPU time has been restored to 345.950000. [17:14:21] Starting new Job [17:14:21] Qink name = fldman [17:14:21] Qink name = gesman [17:14:21] Qink name = scfman [17:19:00] Qink name = anlman [17:19:00] Qink name = drvman [17:20:19] Qink name = optman [17:20:19] Qink name = fldman [17:20:19] Qink name = gesman [17:20:19] Qink name = scfman [17:29:00] Qink name = anlman [17:29:00] Qink name = drvman [17:30:19] Qink name = optman [17:30:19] Qink name = fldman [17:30:19] Qink name = gesman [17:30:20] Qink name = scfman Parent was killed, exiting [17:33:48] Number of jobs = 16 [17:33:48] Starting job 2,CPU time has been restored to 345.950000. [17:33:48] Starting new Job [17:33:48] Qink name = fldman [17:33:48] Qink name = gesman [17:33:48] Qink name = scfman [17:35:00] Number of jobs = 16 [17:35:00] Starting job 2,CPU time has been restored to 345.950000. 
Error reading in TMP file 44/0 (1120): No such file or directory [17:35:00] Starting new Job [17:35:00] Qink name = fldman [17:35:01] Qink name = gesman [17:35:01] Qink name = scfman Parent was killed, exiting [17:35:41] Number of jobs = 16 [17:35:41] Starting job 2,CPU time has been restored to 345.950000. [17:35:41] Starting new Job [17:35:41] Qink name = fldman [17:35:41] Qink name = gesman [17:35:41] Qink name = scfman Parent was killed, exiting [17:37:24] Number of jobs = 16 [17:37:24] Starting job 2,CPU time has been restored to 345.950000. [17:37:24] Starting new Job [17:37:24] Qink name = fldman [17:37:24] Qink name = gesman [17:37:24] Qink name = scfman [17:39:06] Number of jobs = 16 [17:39:06] Starting job 2,CPU time has been restored to 345.950000. [17:39:06] Starting new Job [17:39:07] Qink name = fldman [17:39:07] Qink name = gesman [17:39:07] Qink name = scfman [17:40:44] Number of jobs = 16 [17:40:44] Starting job 2,CPU time has been restored to 345.950000. Error reading in TMP file 44/0 (1120): No such file or directory [17:40:45] Starting new Job [17:40:45] Qink name = fldman [17:40:45] Qink name = gesman [17:40:45] Qink name = scfman Parent was killed, exiting [17:43:51] Number of jobs = 16 [17:43:51] Starting job 2,CPU time has been restored to 345.950000. [17:43:51] Starting new Job [17:43:51] Qink name = fldman [17:43:52] Qink name = gesman [17:43:52] Qink name = scfman Parent was killed, exiting [17:47:48] Number of jobs = 16 [17:47:48] Starting job 2,CPU time has been restored to 345.950000. [17:47:48] Starting new Job [17:47:48] Qink name = fldman [17:47:49] Qink name = gesman [17:47:49] Qink name = scfman [17:52:08] Qink name = anlman [17:52:08] Qink name = drvman [17:53:23] Qink name = optman [17:53:23] Qink name = fldman [17:53:23] Qink name = gesman [17:53:24] Qink name = scfman [17:57:43] Number of jobs = 16 [17:57:43] Starting job 2,CPU time has been restored to 345.950000. 
[17:57:43] Starting new Job [17:57:43] Qink name = fldman [17:57:44] Qink name = gesman [17:57:44] Qink name = scfman [18:02:03] Qink name = anlman [18:02:03] Qink name = drvman [18:03:17] Qink name = optman [18:03:17] Qink name = fldman [18:03:17] Qink name = gesman [18:03:17] Qink name = scfman [18:11:25] Qink name = anlman [18:11:25] Qink name = drvman [18:12:41] Qink name = optman [18:12:41] Qink name = fldman [18:12:41] Qink name = gesman [18:12:43] Qink name = scfman Parent was killed, exiting [18:16:33] Number of jobs = 16 [18:16:33] Starting job 2,CPU time has been restored to 345.950000. [18:16:34] Starting new Job [18:16:34] Qink name = fldman [18:16:34] Qink name = gesman [18:16:34] Qink name = scfman [18:20:59] Qink name = anlman [18:20:59] Qink name = drvman [18:22:14] Qink name = optman [18:22:14] Qink name = fldman [18:22:14] Qink name = gesman [18:22:14] Qink name = scfman [18:30:11] Qink name = anlman [18:30:11] Qink name = drvman [18:31:22] Qink name = optman [18:31:22] Qink name = fldman [18:31:22] Qink name = gesman [18:31:23] Qink name = scfman [18:39:07] Qink name = anlman [18:39:07] Qink name = drvman [18:40:18] Qink name = optman [18:40:18] Qink name = fldman [18:40:18] Qink name = gesman [18:40:19] Qink name = scfman [18:48:17] Qink name = anlman [18:48:17] Qink name = drvman [18:49:27] Qink name = optman [18:49:27] Qink name = fldman [18:49:27] Qink name = gesman [18:49:28] Qink name = scfman [18:56:27] Qink name = anlman [18:56:27] Qink name = drvman [18:57:41] Qink name = optman [18:57:42] Qink name = fldman [18:57:42] Qink name = gesman [18:57:42] Qink name = scfman [19:05:52] Qink name = anlman [19:05:52] Qink name = drvman [19:07:06] Qink name = optman [19:07:06] Qink name = fldman [19:07:06] Qink name = gesman [19:07:06] Qink name = scfman [19:14:32] Qink name = anlman [19:14:32] Qink name = drvman [19:15:44] Qink name = optman [19:15:44] Qink name = fldman [19:15:44] Qink name = gesman [19:15:45] Qink name = scfman [19:22:57] Qink 
name = anlman [19:22:57] Qink name = drvman [19:24:09] Qink name = optman [19:24:09] Qink name = fldman [19:24:09] Qink name = gesman [19:24:09] Qink name = scfman [19:31:47] Qink name = anlman [19:31:47] Qink name = drvman [19:32:58] Qink name = optman [19:32:58] Qink name = fldman [19:32:58] Qink name = gesman [19:32:58] Qink name = scfman [19:40:43] Qink name = anlman [19:40:43] Qink name = drvman [19:41:53] Qink name = optman [19:41:53] Qink name = fldman [19:41:53] Qink name = gesman [19:41:54] Qink name = scfman [19:49:35] Qink name = anlman [19:49:35] Qink name = drvman [19:50:45] Qink name = optman [19:50:45] Qink name = fldman [19:50:45] Qink name = gesman [19:50:46] Qink name = scfman [19:57:24] Qink name = anlman [19:57:24] Qink name = drvman [19:58:34] Qink name = optman [19:58:34] Qink name = fldman [19:58:34] Qink name = gesman [19:58:35] Qink name = scfman [20:04:51] Qink name = anlman [20:04:51] Qink name = drvman [20:06:02] Qink name = optman [20:06:02] Qink name = fldman [20:06:02] Qink name = gesman [20:06:02] Qink name = scfman [20:11:34] Qink name = anlman [20:11:34] Qink name = drvman [20:12:46] Qink name = optman [20:12:46] Qink name = fldman [20:12:46] Qink name = gesman [20:12:46] Qink name = scfman [20:18:45] Qink name = anlman [20:18:45] Qink name = drvman [20:19:55] Qink name = optman [20:19:55] Qink name = fldman [20:19:55] Qink name = gesman [20:19:56] Qink name = scfman [20:25:07] Qink name = anlman [20:25:07] Qink name = drvman [20:26:20] Qink name = optman [20:26:20] Qink name = fldman [20:26:20] Qink name = gesman [20:26:20] Qink name = scfman [20:31:34] Qink name = anlman [20:31:34] Qink name = drvman [20:32:47] Qink name = optman [20:32:47] Qink name = fldman [20:32:47] Qink name = gesman [20:32:48] Qink name = scfman [20:38:31] Qink name = anlman [20:38:31] Qink name = drvman [20:39:43] Qink name = optman [20:39:43] Qink name = fldman [20:39:43] Qink name = gesman [20:39:44] Qink name = scfman [20:45:54] Qink name = anlman 
[20:45:54] Qink name = drvman [20:47:07] Qink name = optman [20:47:07] Qink name = fldman [20:47:07] Qink name = gesman [20:47:08] Qink name = scfman [20:53:21] Qink name = anlman [20:53:21] Qink name = drvman [20:54:38] Qink name = optman [20:54:38] Qink name = fldman [20:54:38] Qink name = gesman [20:54:39] Qink name = scfman [21:00:45] Qink name = anlman [21:00:45] Qink name = drvman [21:02:01] Qink name = optman [21:02:01] Qink name = fldman [21:02:01] Qink name = gesman [21:02:02] Qink name = scfman [21:07:10] Qink name = anlman [21:07:10] Qink name = drvman [21:08:22] Qink name = optman [21:08:22] Qink name = anlman [21:08:58] End of Job [21:09:00] Finished Job #2 [21:09:00] Starting job 3,CPU time has been restored to 6444.680000. [21:09:00] Starting new Job [21:09:00] Qink name = fldman [21:09:01] Qink name = gesman [21:09:01] Qink name = scfman [21:16:31] Qink name = anlman [21:17:08] End of Job [21:17:12] Finished Job #3 [21:17:12] Starting job 4,CPU time has been restored to 6727.660000. [21:17:12] Starting new Job [21:17:12] Qink name = fldman [21:17:13] Qink name = gesman [21:17:13] Qink name = scfman [21:22:37] Qink name = anlman [21:23:15] End of Job [21:23:17] Finished Job #4 [21:23:17] Starting job 5,CPU time has been restored to 6947.350000. [21:23:17] Starting new Job [21:23:17] Qink name = fldman [21:23:18] Qink name = gesman [21:23:18] Qink name = scfman [21:29:03] Qink name = anlman [21:29:40] End of Job [21:29:43] Finished Job #5 [21:29:43] Starting job 6,CPU time has been restored to 7177.480000. [21:29:43] Starting new Job [21:29:43] Qink name = fldman [21:29:43] Qink name = gesman [21:29:43] Qink name = scfman [21:35:09] Qink name = anlman [21:35:46] End of Job [21:35:49] Finished Job #6 [21:35:49] Starting job 7,CPU time has been restored to 7396.620000. 
[21:35:49] Starting new Job [21:35:49] Qink name = fldman [21:35:49] Qink name = gesman [21:35:49] Qink name = scfman [21:43:29] Qink name = anlman [21:44:03] End of Job [21:44:06] Finished Job #7 [21:44:06] Starting job 8,CPU time has been restored to 7700.540000. [21:44:06] Starting new Job [21:44:06] Qink name = fldman [21:44:06] Qink name = gesman [21:44:06] Qink name = scfman [21:49:40] Qink name = anlman [21:50:16] End of Job [21:50:19] Finished Job #8 [21:50:19] Starting job 9,CPU time has been restored to 7924.880000. [21:50:20] Starting new Job [21:50:20] Qink name = fldman [21:50:20] Qink name = gesman [21:50:20] Qink name = scfman [21:56:07] Qink name = anlman [21:57:06] End of Job [21:57:09] Finished Job #9 [21:57:09] Starting job 10,CPU time has been restored to 8163.430000. [21:57:09] Starting new Job [21:57:09] Qink name = fldman [21:57:10] Qink name = gesman [21:57:10] Qink name = scfman [22:09:49] Qink name = anlman [22:10:43] End of Job [22:10:45] Finished Job #10 [22:10:45] Starting job 11,CPU time has been restored to 8659.740000. [22:10:45] Starting new Job [22:10:45] Qink name = fldman [22:10:46] Qink name = gesman [22:10:46] Qink name = scfman [22:18:03] Qink name = anlman [22:18:58] End of Job [22:19:00] Finished Job #11 [22:19:00] Starting job 12,CPU time has been restored to 8947.300000. [22:19:00] Starting new Job [22:19:01] Qink name = fldman [22:19:04] Qink name = gesman [22:19:04] Qink name = scfman [22:53:09] Qink name = anlman [22:59:27] End of Job [22:59:31] Finished Job #12 [22:59:31] Starting job 13,CPU time has been restored to 10433.970000. [22:59:32] Starting new Job [22:59:32] Qink name = fldman [22:59:34] Qink name = gesman [22:59:35] Qink name = scfman [00:32:15] Number of jobs = 16 [00:32:15] Starting job 13,CPU time has been restored to 10433.970000. 
[00:32:15] Starting new Job [00:32:15] Qink name = fldman [00:32:23] Qink name = gesman [00:32:23] Qink name = scfman Parent was killed, exiting [02:05:43] Qink name = anlman [02:11:41] End of Job [02:11:45] Finished Job #13 [02:11:45] Starting job 14,CPU time has been restored to 14228.950000. [02:11:46] Starting new Job [02:11:46] Qink name = fldman [02:11:48] Qink name = gesman [02:11:49] Qink name = scfman [03:45:50] Qink name = anlman [03:51:41] End of Job [03:51:44] Finished Job #14 [03:51:44] Starting job 15,CPU time has been restored to 18085.450000. [03:51:45] Starting new Job [03:51:45] Qink name = fldman [03:51:47] Qink name = gesman [03:51:50] Qink name = scfman [05:28:40] Qink name = anlman [05:36:43] End of Job [05:36:47] Finished Job #15 called boinc_finish Exiting 0 </stderr_txt> ]]> |
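For anyone who wants to tally the restarts in a result log like the one above rather than eyeball them, here is a minimal Python sketch. It assumes the log text has been pasted into a string and that every (re)start of a job shows up as a "Starting job N,CPU time has been restored to X" entry, as it does in this log. The function names and the short `sample` string are made up for illustration; the sample reuses a few entries from the log above.

```python
import re
from collections import Counter

# Matches entries such as:
# "[17:33:48] Starting job 2,CPU time has been restored to 345.950000."
START_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] Starting job (\d+),"
    r"CPU time has been restored to (\d+\.\d+)"
)


def count_job_starts(stderr_text):
    """Return a Counter mapping job number -> number of times it was started."""
    starts = Counter()
    for match in START_RE.finditer(stderr_text):
        starts[int(match.group(2))] += 1
    return starts


def restarted_jobs(stderr_text):
    """Return {job number: extra starts} for every job started more than once."""
    counts = count_job_starts(stderr_text)
    return {job: n - 1 for job, n in counts.items() if n > 1}


# A few entries copied from the log above (hypothetical, shortened sample).
sample = (
    "[17:14:21] Starting job 2,CPU time has been restored to 345.950000. "
    "[17:30:20] Qink name = scfman Parent was killed, exiting "
    "[17:33:48] Number of jobs = 16 "
    "[17:33:48] Starting job 2,CPU time has been restored to 345.950000. "
    "[21:09:00] Starting job 3,CPU time has been restored to 6444.680000. "
)

print(restarted_jobs(sample))               # jobs that were started again
print(sample.count("Parent was killed"))    # how often the parent was killed
```

Because each restart restores the same CPU time (345.950000 for job 2 here), whatever CPU time was spent between the restore point and the kill is not counted, which is consistent with the shortfall described in the post.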
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Well, that is a novel one, and it draws attention. **
[17:30:20] Qink name = scfman Parent was killed, exiting [17:33:48] Number of jobs = 16 [17:33:48] Starting job 2,CPU time has been restored to 345.950000.
I wonder what caused this. I'd be interested in the Linux Event Log and the stdoutdae.txt and stderrdae.txt logs from around the time that happened.
Meantime, I've just set the swap file use of the client to zero (0). I've done that in the past under Windows 7 (to be more precise, I disabled the swap file in the OS) and only once saw an out-of-memory condition. Hoping something changes (for the better, of course ;-). There are 2 CEP2 tasks running concurrently, plus 2 HCMD2 (those memory-use minis).
Edit: ** Something was nagging, and sure enough, I posted the very same thing on October 5: http://www.worldcommunitygrid.org/forums/wcg/...29985&offset=0#297929, but it occurred only once during that run, and I may have been playing with the system, which could have caused it. It's the only other post showing this.
[Edit 1 times, last edit by Sekerob at Oct 19, 2010 8:17:56 AM] |
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline |
Here's a portion of the stdoutdae.txt. I don't seem to have a stderrdae.txt. My Internet service provider was having trouble that evening. It looks as if every time BOINC made an unsuccessful attempt to contact the Internet, the running jobs had problems.
17-Oct-2010 17:32:53 [World Community Grid] Computation for task E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0 finished
17-Oct-2010 17:32:53 [World Community Grid] Resuming task HFCC_n1_02464410_n1_0001_0 using hfcc version 611
17-Oct-2010 17:32:55 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_0
17-Oct-2010 17:32:55 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_1
17-Oct-2010 17:32:58 [World Community Grid] Finished upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_0
17-Oct-2010 17:32:58 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_2
17-Oct-2010 17:33:01 [World Community Grid] Finished upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_1
17-Oct-2010 17:33:01 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_3
17-Oct-2010 17:33:04 [World Community Grid] Finished upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_2
17-Oct-2010 17:33:04 [World Community Grid] Finished upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_3
17-Oct-2010 17:33:04 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4
17-Oct-2010 17:33:47 [World Community Grid] Task HFCC_n1_02464410_n1_0001_0 exited with zero status but no 'finished' file
17-Oct-2010 17:33:47 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:33:47 [World Community Grid] Temporarily failed upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4: can't resolve hostname
17-Oct-2010 17:33:47 [World Community Grid] Backing off 1 min 0 sec on upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4
17-Oct-2010 17:33:47 [World Community Grid] Restarting task HFCC_n1_02464410_n1_0001_0 using hfcc version 611
17-Oct-2010 17:33:48 [World Community Grid] Task E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0 exited with zero status but no 'finished' file
17-Oct-2010 17:33:48 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:34:17 [---] Project communication failed: attempting access to reference site
17-Oct-2010 17:34:59 [World Community Grid] Task HFCC_n1_02464410_n1_0001_0 exited with zero status but no 'finished' file
17-Oct-2010 17:34:59 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:34:59 [---] BOINC can't access Internet - check network connection or proxy configuration.
17-Oct-2010 17:34:59 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4
17-Oct-2010 17:34:59 [World Community Grid] Restarting task HFCC_n1_02464410_n1_0001_0 using hfcc version 611
17-Oct-2010 17:35:00 [World Community Grid] Task E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0 exited with zero status but no 'finished' file
17-Oct-2010 17:35:00 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:35:40 [World Community Grid] Task HFCC_n1_02464410_n1_0001_0 exited with zero status but no 'finished' file
17-Oct-2010 17:35:40 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:35:40 [World Community Grid] Temporarily failed upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4: can't resolve hostname
17-Oct-2010 17:35:40 [World Community Grid] Backing off 1 min 0 sec on upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4
17-Oct-2010 17:35:40 [World Community Grid] Restarting task HFCC_n1_02464410_n1_0001_0 using hfcc version 611
17-Oct-2010 17:35:41 [World Community Grid] Task E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0 exited with zero status but no 'finished' file
17-Oct-2010 17:35:41 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:35:41 [World Community Grid] Restarting task E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0 using cep2 version 619
17-Oct-2010 17:36:40 [World Community Grid] Started upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4
17-Oct-2010 17:37:22 [World Community Grid] Task HFCC_n1_02464410_n1_0001_0 exited with zero status but no 'finished' file
17-Oct-2010 17:37:22 [World Community Grid] If this happens repeatedly you may need to reset the project.
17-Oct-2010 17:37:22 [World Community Grid] Temporarily failed upload of E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4: can't resolve hostname
The same thing happened on my other machine that runs CEP2, but not on the machine that was running just C4CW, CMD2, and HCC. |
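The observation that task exits line up with failed network attempts can be checked mechanically. The sketch below is only an illustration under stated assumptions: it assumes the client log uses the "DD-Mon-YYYY HH:MM:SS [source] message" layout shown above, that month abbreviations are in English, and that a 60-second window is a reasonable meaning of "at the same time". The function names, marker strings, and window are choices made here, not anything provided by BOINC. Closeness in time does not prove cause.

```python
import re
from datetime import datetime, timedelta

# One log entry: timestamp, [source], then the message up to the next timestamp.
ENTRY_RE = re.compile(
    r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[([^\]]*)\] "
    r"(.*?)(?=\s*\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2} \[|\Z)",
    re.S,
)

# Message fragments taken from the log excerpt above.
NETWORK_MARKERS = (
    "can't resolve hostname",
    "Project communication failed",
    "can't access Internet",
)
EXIT_MARKER = "exited with zero status but no 'finished' file"


def parse_entries(log_text):
    """Split log text (one entry per line or run together) into tuples."""
    entries = []
    for stamp, source, message in ENTRY_RE.findall(log_text):
        when = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")
        entries.append((when, source, message.strip()))
    return entries


def exits_near_network_failures(log_text, window_seconds=60):
    """Return task-exit entries that fall within the window of a network failure."""
    entries = parse_entries(log_text)
    failures = [
        when for when, _, msg in entries
        if any(marker in msg for marker in NETWORK_MARKERS)
    ]
    window = timedelta(seconds=window_seconds)
    return [
        (when, msg) for when, _, msg in entries
        if EXIT_MARKER in msg and any(abs(when - f) <= window for f in failures)
    ]


# Three entries copied from the excerpt above.
sample = (
    "17-Oct-2010 17:33:47 [World Community Grid] Temporarily failed upload of "
    "E200459_053_A.24.C17H13N3S2Si2.38.0.set1d06_0_4: can't resolve hostname "
    "17-Oct-2010 17:33:48 [World Community Grid] Task "
    "E200459_340_A.24.C19H11NOS2Se.187.3.set1d06_0 exited with zero status "
    "but no 'finished' file "
    "17-Oct-2010 17:34:17 [---] Project communication failed: attempting "
    "access to reference site"
)

for when, msg in exits_near_network_failures(sample):
    print(when.time(), msg)
```

Running it over the whole stdoutdae.txt for that evening, and over a quiet evening for comparison, would show whether the exits cluster around the network failures or happen regardless.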
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Good info... there is a long history with BOINC of jobs resetting and/or crashing completely when the Internet goes. Was the BOINC Manager loaded at the time? I have a long-standing practice of closing this sucker... a quick in and out, but never left to sit on the panel or in the icon tray (notification area, which in any case is not a feature available on Linux for BOINC). If there is security software, check that out too. I'm still on the GetDeb 6.10.58 64-bit version. I've been on scheduled networking for a long time, and if I know WCG or the ISP is off on a maintenance trip, I always take the client offline so it does not waste time trying to knock its head through the closed door.
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Quote: "Meantime, I've just set the Swapfile use of the client to zero (0). Done that in past under Windows 7, to be more precise, disabled Swapfile in the OS, and only once saw an out of memory condition. Hoping something changes (for the better of course ;-). There are concurrent 2 CEP2 running and 2 HCMD2 (those memory use mini's)"
After some half dozen tasks ran mostly in "left alone" mode, the hard conclusion is that the gaps between 'elapsed' and CPU time increased. Now THAT's puzzling.
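To put a number on the gap between 'elapsed' and CPU time, one rough approach is to divide the CPU time a job gained by the wall-clock time between its log timestamps. The sketch below is a minimal illustration in Python. The helper names are made up, and the figures are taken from the result log posted earlier in this thread (job 3 started at 21:09:00 with 6444.68 s restored, job 4 started at 21:17:12 with 6727.66 s). It assumes the restored CPU values in the log are accurate, which is exactly what is in question here, so treat the result as indicative only.

```python
from datetime import datetime, timedelta


def wall_seconds(start, end):
    """Wall-clock seconds between two HH:MM:SS stamps, allowing one midnight rollover."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the log ran past midnight
    return delta.total_seconds()


def cpu_efficiency(start, end, cpu_start, cpu_end):
    """Fraction of the elapsed wall-clock time that was counted as CPU time."""
    return (cpu_end - cpu_start) / wall_seconds(start, end)


# Job 3 from the posted log: CPU time went from 6444.68 s to 6727.66 s.
eff = cpu_efficiency("21:09:00", "21:17:12", 6444.68, 6727.66)
print(f"CPU time / elapsed time for job 3: {eff:.1%}")
```

Comparing this ratio across tasks run with 2 versus 4 concurrent CEP2 tasks, or with and without swap, would make the "gap increased" observation easier to compare between machines.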
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline |
Quote (Sekerob): "Good info... long history with BOINC and jobs resetting and or crashing completely when the internet goes. Was the BOINC Manager loaded at the time? I've a long standing practice of closing this sucker... a quick in and out, but never left to sit on panel or in icon tray (notification area, which anyway is not a feature available on Linux for BOINC). If there is security software, check that out too. I'm still on the GetDeb 6.10.58 64 bit version. I'm on scheduled networking for a long time and if you know WCG or ISP is off on a maint trip, I always take the client off-line so it does not waste time trying to knock its head through the closed door."
BOINC Manager was mostly closed during that period; it was open only when I occasionally peeked to see how things were progressing. Interesting to learn that BOINC jobs tend to have problems when the Internet is out. I'll take mine offline in future at times when I know there are Internet issues. |
||
|