| World Community Grid Forums
Thread Status: Active Total posts in this thread: 51
Torchwood 4
Cruncher Great Britain Joined: Apr 30, 2010 Post Count: 8 Status: Offline Project Badges:
|
Hi

Your post of the log on [Mar 6, 2013 4:06:32 PM] suggests that BOINC shuts down when CPU load is over 50%. Try 100% (or 95%) and see if that still happens.

Re Ubuntu and tracking down excess daemons (daemons are Ubuntu-speak for what Windows-speak calls background programs): look on the forums, and you should find a guide on how to reduce resource drain.

Try increasing the buffers as well. I had some weird results (not downloading many WUs, then waiting) on my Windows and Linux installations until I increased the buffers.

Finally, try the memory stress test I suggested: reboot your machine and, when the GRUB bootloader comes up, select MEMTEST and leave it running for 24 hours (or at least overnight). If there are memory glitches, this should find them. Memory faults could make the system prone to general weirdness. Linux is VERY dependent on your system RAM.

Torchwood 4
----------------------------------------
Join Team Torchwood 4: Use advanced technology to prepare humanity for the future. Remember: The 21st Century is when everything changes...you have got to be ready...
|
||
|
|
RicktheBrick
Senior Cruncher Joined: Sep 23, 2005 Post Count: 206 Status: Offline Project Badges:
|
I am running 7 computers. I shut down one computer every night but usually restart it less than 8 hours later. Out of every 168 hours, my computers are running for close to 160 hours. Just now I was watching a computer with a working GPU. It has around 15 work units that will use the GPU, but it is running 4 work units that rely only on the CPU. It will stop the execution of a work unit to start another before the first one completes. I watched one unit that reported 32 hours to completion, but after just 5 minutes the estimate had dropped by more than 3 hours to around 29 hours. It then stopped the execution of that work unit and started another, which it reported as high priority. This computer too has a ratio of CPU time to real time of at most 70%. I still suspect that this problem started after the introduction of work units that use the GPU.
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
How is your "switch applications every XX minutes" setting configured?
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz
||
|
|
RicktheBrick
Senior Cruncher Joined: Sep 23, 2005 Post Count: 206 Status: Offline Project Badges:
|
I am only doing WCG. There are no other projects, so why should there be any switching between applications? Still, I changed that field to 9960 so there would be no doubt. I looked at my computer that has a working GPU. I set up a special profile for it so it would do only HCC, but today I found a Clean Energy work unit being executed, and it reports that it will take 140 hours to complete. The computer has 16 work units that will take less than an hour each, and it is working on one that will take almost a week to complete.
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
The "switch applications every XX minutes" setting applies to the applications (or work units) that are running, not to different projects. CEP2 work units have a cutoff at 12 CPU hours, so it won't take a week to complete, at least not a CPU one. I don't know why you received that CEP2 WU, since you say you selected only HCC in your device profile and, AFAIK, CEP2 isn't covered by the "If no work is available for the projects I selected..." setting in the device profile, which you probably selected. So maybe you did something wrong there.
||
|
|
RicktheBrick
Senior Cruncher Joined: Sep 23, 2005 Post Count: 206 Status: Offline Project Badges:
|
You are right. I checked the computer today and the Clean Energy work unit is gone. Now I wonder if that is the reason I am losing CPU time. Did the Clean Energy unit work for 12 hours and then quit, giving me nothing for the effort? Also, in the messages I found one from early today stating that there were no HCC work units available, which tells me WCG is not generating HCC work units fast enough. I would assume that those who are doing more than a thousand units a day are getting the vast majority of them, since they are reporting them so much faster.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
For the general understanding of all readers: 'Switch Every...' *only* applies if you have two or more active projects attached to a client, say WCG + FightMalaria@Home. If that is the case and time's up, the switch *normally* only takes place at a checkpoint, not earlier. *Only* when there's a rush job under deadline threat will a switch also take place *immediately*, including within the same project, e.g. WCG. At that time, *not* having LAIM (Leave Applications In Memory) on will unload any *pre-empted* task, regardless of whether a checkpoint was made.

Added: As far as I know, a task that has not reached its first checkpoint will *not* be unloaded, regardless of whether LAIM is on or not. Not that this does explain anything at all of this thread, but it certainly eliminates !

2 edits: grammar and emphasizing and addition
[Edit 2 times, last edit by Former Member at Mar 17, 2013 12:50:41 PM]
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
Didn't know that, thanks.
||
|
|
RicktheBrick
Senior Cruncher Joined: Sep 23, 2005 Post Count: 206 Status: Offline Project Badges:
|
Here is what I suspect is going on. I have been closely watching my results for the last month. I have not checked Clean Energy, and yet I am getting work units for it. I suspect that these are being executed for a number of hours and then aborted without any benefit to me. I suspect that since the GPU units for HCC started, all other units have been increased in length. I suspect that this is being done to keep the total number of downloads/uploads down, to reduce the bandwidth needed for this project. The end result is that I am getting at most 70% of the results I used to get from my computers without a working GPU. The two computers that I do have with a working GPU are not being supplied with enough GPU work units, so they are spending a lot of time doing non-GPU work. I know this because I have a Clean Energy unit running now, and I have seen messages telling me that there are no work units available for my profile (HCC only, but will accept others if none are available). I just wish others would go to device installations and compute what they are getting for CPU time/real time.
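For anyone who wants to run that comparison, here is a minimal sketch of the arithmetic in Python. The function names, the 4-core count and the 470.4 CPU hours are made-up illustrations, not figures from any device in this thread; only the 160-out-of-168-hours uptime comes from the earlier post. It also assumes the reported CPU time is summed over all cores, which is why the real time is multiplied by the core count.

```python
def uptime_fraction(hours_running, hours_elapsed):
    """Fraction of wall-clock time the machine was powered on."""
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    return hours_running / hours_elapsed


def cpu_to_real_ratio(cpu_hours, real_hours, cores=1):
    """CPU time divided by the real time available across all cores.

    Assumes cpu_hours is the total summed over every core.
    """
    if real_hours <= 0 or cores <= 0:
        raise ValueError("real_hours and cores must be positive")
    return cpu_hours / (real_hours * cores)


if __name__ == "__main__":
    # 160 hours of running per 168-hour week, as described earlier.
    print(f"uptime:   {uptime_fraction(160, 168):.1%}")
    # Hypothetical device: 4 cores, 470.4 CPU hours credited in a week.
    print(f"cpu/real: {cpu_to_real_ratio(470.4, 168, cores=4):.1%}")
```

If the second number is well below the first, the gap is time the machine was switched on but not crunching, which is the thing worth explaining.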
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
No project's length/runtime/whatever was increased other than HCC itself, where the time required to complete one WU basically doubled. All the other sciences have the same normal runtimes they had before HCC GPU came into the game (obviously those runtimes vary, not because they were increased but because the compounds/proteins/molecules/etc. aren't all the same). If those computers aren't getting enough GPU work units, increase their buffers. Why would CEP2 work units be aborted without passing their deadlines and without two people completing and validating the work unit? CEP2 work units will run for a maximum of 12 CPU hours. After that, they end and report back to the server for validation. You will get runtime and credit for them. It's how they work. Could you please be succinct and explain your problem again?
Please post the event log and tell us how your device profile is configured. Thanks

Edit: Spelling.
[Edit 3 times, last edit by Falconet at Mar 18, 2013 8:33:02 PM]
||
|
|
|