| World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
|
|
| Author |
|
|
hoanqui
Cruncher Joined: Aug 14, 2007 Post Count: 15 Status: Offline |
The BOINC CPU throttle of 30% means it runs 3 seconds at 100% and pauses for 7 seconds.... on all projects, both 5.8 and 5.10.
I thought that only happened on Windows, as other platforms had better ways to deal with throttling and priorities. Anyway, the cycling must be MUCH faster, since the peaks I am seeing during normal BOINC work are shorter than 1 sec.
At some point the throttling itself caused problems in the science, and the only workaround offered was to set it at 100%. Perhaps that happened in this project?
I have been running BOINC with SETI and Folding@home for years on various computers with Windows and OS X, and it has rarely been running at 100%.
Like Didactylos already mentioned, during checkpointing the throttle is ignored as disk I/O takes priority.
During I/O-bound operation, the CPU should be rather idle, shouldn't it?
As for the original message: Task lf432_00028_11 exited with zero status but no 'finished' file
One known cause is the system clock being synchronized by the system. If that happens in the middle of the science, particularly backward, BOINC considers it a timekeeping problem and tries to return the project to the last checkpoint. Just wonder if the throttling alternation 3:7 and the system timekeeping might have something to do with this.
Logs don't show any NTP activity yesterday or today. Apart from that, it seems quite an esoteric problem (in fact, esoteric enough not to bother with it without some direct hint pointing that way. Wouldn't some useful logging from the app be wonderful?).
Anyway, FWIW, I don't feel that could be the cause. Very few I've heard of set it to 30%, considering that the processes use pure idle time and are at lowest priority, i.e. even at 100% it should hardly impair use.
Except if, for example, somehow it manages to max out the CPU and stay half-alive when it should just be saving whatever to disk and GTFO because I just came back to the computer and I'm trying to do whatever, like, use a VMware machine that needs plenty of RAM and CPU, which are being uselessly hogged by an HPF2 process that will output nothing. At best, the processor will still be maxed out because of an unwanted process (high temp, fan blowing). At worst, the still-in-use RAM will cause paging and slow things down.
Full tilt is known to cause the fan to run all the time, but another Mac user reported that 90% was enough for that not to happen. Perhaps it was not the same model of computer. Or perhaps he was in a noisier environment and didn't mind or notice. Or perhaps I'm whiny about noise and temp. The thing is, 30% - 50% is good for me most of the time.
(I mentioned my iBook G4 before and should now elaborate, because it's related: on my iBook G4 I could leave BOINC at 100% without problem, as somehow the power management was intelligent enough not to switch the CPU to high speed mode when only a low priority process was running. That way, the fan never went on just because of BOINC - no matter how badly it behaved!! With the Intel processors or the MacBook, that CPU management intelligence is sadly lost. So now I must be more careful with BOINC if I want to keep a relatively noise-free computer. And, if a low priority process goes haywire, the fan does the same.) [Edit 3 times, last edit by hoanqui at Aug 15, 2007 2:22:53 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm not convinced there is a problem that needs solving here.
hoanqui, if you can't stand the fan noise, then your options are:
1) lower your fan speed
2) opt out of HPF2
3) opt out of WCG (obviously, I hope you don't choose this option, but I'm afraid that avoiding fan noise is not a high priority for WCG).
For "Task lf432_00028_11 exited with zero status but no 'finished' file" to be a problem requiring action, one of these situations must occur:
1) the message is appearing very frequently - many times in a single work unit.
2) the work is not progressing: work units are stuck at the same point for hours or days.
3) you are returning invalid results.
From what you say, it seems that none of these problems are happening. So, I don't think this problem needs solving. Sekerob has suggested the common causes for this message showing up from time to time, and if none of those are relevant in your case, then there is nothing we can do (pending further troubleshooting). |
||
|
|
hoanqui
Cruncher Joined: Aug 14, 2007 Post Count: 15 Status: Offline |
I'm not convinced there is a problem that needs solving here. hoanqui, if you can't stand the fan noise, then your options are: 1) lower your fan speed 2) opt out of HPF2 3) opt out of WCG (obviously, I hope you don't choose this option, but I'm afraid that avoiding fan noise is not a high priority for WCG).
1: Check. 2: Being considered. 3: Being considered, if other projects behave as badly as this one. There is another option, though it is not mine (and this one would void the need for the previous 3 options: again, look at SETI@Home):
--What about HPF2 respecting the prefs I set and following its "advertised" behaviour?
For "Task lf432_00028_11 exited with zero status but no 'finished' file" to be a problem requiring action, one of these situations must occur: 1) the message is appearing very frequently - many times in a single work unit. 2) the work is not progressing: work units are stuck at the same point for hours or days. 3) you are returning invalid results.
Does this qualify as frequently enough for you?
Mon 6 Aug 06:50:40 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 06:50:40 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 6 Aug 06:57:01 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 06:57:01 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 6 Aug 07:28:44 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 07:28:44 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 6 Aug 07:34:52 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 07:34:52 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 6 Aug 07:40:56 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 07:40:56 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 6 Aug 10:22:06 2007|World Community Grid|Task lf204_00134_8 exited with zero status but no 'finished' file
Mon 6 Aug 10:22:06 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 13 Aug 04:04:15 2007|World Community Grid|Task lf383_00018_15 exited with zero status but no 'finished' file
Mon 13 Aug 04:04:15 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 13 Aug 05:33:57 2007|World Community Grid|Task lf383_00018_15 exited with zero status but no 'finished' file
Mon 13 Aug 05:33:57 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Tue 14 Aug 17:21:20 2007|World Community Grid|Task lf813_00017_3 exited with zero status but no 'finished' file
Tue 14 Aug 17:21:20 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Tue 14 Aug 17:36:23 2007|World Community Grid|Task lf813_00017_3 exited with zero status but no 'finished' file
Tue 14 Aug 17:36:23 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Tue 14 Aug 18:35:18 2007|World Community Grid|Task lf813_00017_3 exited with zero status but no 'finished' file
Tue 14 Aug 18:35:18 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Wed 15 Aug 03:41:15 2007|World Community Grid|Task lf829_00097_17 exited with zero status but no 'finished' file
Wed 15 Aug 03:41:15 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Wed 15 Aug 03:46:26 2007|World Community Grid|Task lf829_00097_17 exited with zero status but no 'finished' file
Wed 15 Aug 03:46:26 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Wed 15 Aug 04:08:19 2007|World Community Grid|Task lf829_00097_17 exited with zero status but no 'finished' file
Wed 15 Aug 04:08:19 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
As I said, my main complaint is the fan noise. That means that I'm giving quite a lot of freedom to the project to use the computer however they please; just don't disturb me. The other side of that is: if my main complaint were the uselessly wasted unpaid CPU time (and energy and whatnot), HPF2 would have been kicked out of my computers some weeks ago.
Just to try to drive all of this to something useful: is there a place where I can post a proper bug report or complaint or whatever so this can be fixed? |
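For anyone who wants to put a number on "very frequently", a minimal Python sketch along these lines could tally the message per task. It assumes only the pipe-separated layout (timestamp|project|text) and the message wording visible in the log lines pasted in this thread; `count_restarts` and `sample` are hypothetical names for illustration, not part of BOINC or WCG.

```python
import re
from collections import Counter

# Matches the message text quoted in this thread and captures the task name.
# The leading "|" relies on the timestamp|project|text layout seen in the logs.
PATTERN = re.compile(r"\|Task (\S+) exited with zero status but no 'finished' file")


def count_restarts(log_text: str) -> Counter:
    """Return how many times each task logged the 'no finished file' message."""
    return Counter(PATTERN.findall(log_text))


# A short excerpt copied from the log lines posted in this thread.
sample = """\
Mon 13 Aug 04:04:15 2007|World Community Grid|Task lf383_00018_15 exited with zero status but no 'finished' file
Mon 13 Aug 04:04:15 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Mon 13 Aug 05:33:57 2007|World Community Grid|Task lf383_00018_15 exited with zero status but no 'finished' file
Mon 13 Aug 05:33:57 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Tue 14 Aug 17:21:20 2007|World Community Grid|Task lf813_00017_3 exited with zero status but no 'finished' file
"""

for task, n in count_restarts(sample).most_common():
    print(task, n)
```

Counting by hand over the full excerpt posted here gives 6 occurrences for lf204_00134_8, 2 for lf383_00018_15, 3 for lf813_00017_3 and 3 for lf829_00097_17.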
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
The intervals between the errors could coincide with checkpoint saves, and all I see is HPF2, so, as suggested earlier, deselect HPF2 and see whether FA@H produces the error.
For a place to file a bug report, send mail to support@worldcommunitygrid.org
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'll ask the techs to take a look at this individual case.
I'm sorry, but the preferences are - well, preferences. They are not strict limits (although the BOINC developers are working to enforce them more strictly). Oh, and comparing one BOINC project with another is invalid: SETI@Home, for example, has very different computing requirements to HPF2. I know it's frustrating when one project seems to work and another doesn't for no obvious reason. Anyway: if you have anything to add to your report, please post it here. For example, you could provide the stderr log from one of the work units that experienced this problem. |
||
|
|
hoanqui
Cruncher Joined: Aug 14, 2007 Post Count: 15 Status: Offline |
Anyway: if you have anything to add to your report, please post it here. For example, you could provide the stderr log from one of the work units that experienced this problem. I guess you're referring to stderr.txt files appearing in Slots/x/ directories. OK, I'll keep an eye on the thing to see if any appears. Right now there is one such file, but it's 0 KB. I guess yesterday's resets got rid of it. Right now I have cleared the local prefs, and re-set the web prefs: 40% CPU, 1 CPU, 2 GB disk, 75% mem, only work when computer 1 min idle. I'll try letting HPF2 run - though not while I sleep, that's for sure :P. If the problem happens again, I'll post the file here and switch off HPF2 to see if FA@H causes the same problem. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello hoanqui,
You get the shortest run time for a core using the BOINC threshold using 50%. The core should run BOINC for 1 second, then idle for 1 second. 40% will cause the core to run for 2 seconds, then idle for 3 seconds. Lawrence |
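The run/idle figures quoted in this thread (30% as 3:7, 40% as 2:3, 50% as 1:1) are what you get by reducing the percentage to lowest terms. The sketch below only reproduces that arithmetic; it assumes whole time units per cycle, as the posts describe, and is not taken from the BOINC source. The real switching interval may well be shorter than one second, as observed earlier in the thread. `throttle_cycle` is a hypothetical helper name.

```python
from fractions import Fraction


def throttle_cycle(percent: int) -> tuple[int, int]:
    """Smallest whole-number (run, idle) pair whose run share equals percent."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    share = Fraction(percent, 100)  # Fraction reduces itself, e.g. 40/100 -> 2/5
    run = share.numerator
    idle = share.denominator - share.numerator
    return run, idle


for p in (30, 40, 50):
    run, idle = throttle_cycle(p)
    print(f"{p}%: run {run}, idle {idle}, average load {run / (run + idle):.0%}")
```

Whatever the length of one time unit, the average load depends only on the ratio, so under this model 40% still means less busy time than 50% over the long run.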
||
|
|
hoanqui
Cruncher Joined: Aug 14, 2007 Post Count: 15 Status: Offline |
You get the shortest run time for a core using the BOINC threshold using 50%. The core should run BOINC for 1 second, then idle for 1 second. 40% will cause the core to run for 2 seconds, then idle for 3 seconds.
Two problems with this:
--This is the second time someone has mentioned this kind of behaviour in this thread, so I'm feeling unsure. But the fact remains that the behaviour of BOINC I am seeing on my Mac is not like that: as I already said, the "on-off switching" must be MUCH faster, since I cannot see peaks lasting longer than half a sec during normal work. (If you talk about "time units" instead of "seconds", then that would be consistent with my observations.) Last time I checked BOINC on a Windows XP machine, it also seemed to be switching too fast to be measured in seconds.
--And anyway, over the long run, 40% should still generate less heat than 50%, no matter how you slice that time. If you or anyone has any first-hand information about why this is not the case, please elaborate. |
||
|
|
hoanqui
Cruncher Joined: Aug 14, 2007 Post Count: 15 Status: Offline |
Continuing with the dying HPF2 processes saga:
Wed 15 Aug 23:52:22 2007|World Community Grid|Task lf829_00097_17 exited with zero status but no 'finished' file
Wed 15 Aug 23:52:22 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
Thu 16 Aug 15:24:01 2007|World Community Grid|Task lf829_00097_17 exited with zero status but no 'finished' file
Thu 16 Aug 15:24:01 2007|World Community Grid|If this happens repeatedly you may need to reset the project.
In either case, the file Slots/x/stderr.txt remained at 0 KB. As stated before, I will now proceed to disable HPF2 and try running FA@H to see if it fails in the same way. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Please will you go to https://secure.worldcommunitygrid.org/ms/viewBoincResults.do and retrieve the error log from there, by clicking on the link in the "Status" column.
|
||
|
|
|