World Community Grid Forums
Thread Status: Active | Total posts in this thread: 315

adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
I'm continuing the graphic app problem here ...
nanoprobe
Master Cruncher | Classified | Joined: Aug 29, 2008 | Post Count: 2998 | Status: Offline
I posted a log from an errored-out task earlier in this thread. This machine has for some reason produced 2 more tasks that errored out, one after almost 8 hours of runtime and the second after 1 minute. This is a dedicated cruncher with a week of nothing but valid results, and I've not found any reason for this to happen. Can a tech offer any explanation?
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
----------------------------------------
[Edit 1 time, last edit by nanoprobe at Jun 25, 2019 4:03:18 PM]
armstrdj
Former World Community Grid Tech | Joined: Oct 21, 2004 | Post Count: 695 | Status: Offline
nanoprobe,
We haven't seen any work unit issues to this point. I checked your results: for one of them, another machine had already returned a successful result. For the other one there haven't been any results back yet, but I will keep an eye on them to make sure others come back successful. There are six other hosts with the same exit code as yours, but they have similar errors across multiple projects. Have you run a hardware test against the memory lately?

Thanks,
armstrdj
nanoprobe
Master Cruncher | Classified | Joined: Aug 29, 2008 | Post Count: 2998 | Status: Offline
armstrdj wrote:
> We haven't seen any work unit issues to this point. I checked your results and for one of them there was another machine that had returned a successful result. For the other one there haven't been any results back yet but I will keep an eye on the results to make sure others come back successful. There are six other hosts with the same exit code as yours however they have similar errors across multiple projects. Have you run a hardware test against the memory lately? Thanks, armstrdj

Thanks for the reply. I have not run a test on the memory for that machine because it only runs when there is beta work or a new project. Assuming that there are more betas coming, I will swap out the memory and see what happens during the next round. Thanks again.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
----------------------------------------
NixChix
Veteran Cruncher | United States | Joined: Apr 29, 2007 | Post Count: 1187 | Status: Offline
I had 2 ARP betas that ran for 5 days. On one that I tried restarting, it reset to 4 days and I lost a day of processing. The wingman had a similar run time. It would seem that more frequent checkpointing is needed.
----------------------------------------
Cheers
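The restart loss described above is bounded by how often a task checkpoints: if it checkpoints roughly once per simulated day, a restart can throw away up to a day of work. A minimal, hypothetical sketch of step-based checkpointing (this is illustrative only, not WCG's or BOINC's actual code; `run`, `checkpoint_every`, and the accumulator are all invented for the example):

```python
# Hypothetical sketch, not WCG/BOINC code: step-based checkpointing.
# The work lost on a restart is bounded by the checkpoint interval,
# which is why a coarse interval can cost a full day of processing.

def run(total_steps, checkpoint_every, state=0, start_step=0):
    """Run steps `start_step`..`total_steps`-1, recording a (step, state)
    checkpoint every `checkpoint_every` steps; resume via state/start_step."""
    checkpoints = []
    for step in range(start_step, total_steps):
        state += step  # stand-in for the real per-step computation
        if (step + 1) % checkpoint_every == 0:
            checkpoints.append((step + 1, state))  # stand-in for a disk write
    return state, checkpoints

# Restarting from the last checkpoint repeats at most
# `checkpoint_every - 1` steps of work.
```

Shrinking `checkpoint_every` trades a little extra disk I/O for a much smaller worst-case loss on restart.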
sptrog1
Master Cruncher | Joined: Dec 12, 2017 | Post Count: 1593 | Status: Offline
The work units for the new beta test seem large. How much disk space should be reserved for BOINC in order to run this new beta test?
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
sptrog1 wrote:
> How much space on disc should be reserved for BOINC in order to run this new beta test?

Ah, the initial beta announcement didn't say, did it? I remember the software ensuring that there was enough space, but I'm not sure I remember the numbers exactly. I think it was at least 1 GB/WU, and may have been a little higher. Did anyone make a note of it?

Edit: Let me clarify that. The announcement did give the size of each WU in terms of the disk space needed to store the input and output files, but it did not say how much working space was needed to actually run them. This being a beta test, we only get a maximum of one WU per thread. The size I gave was therefore the size I think was required per active thread. IIRC, it did go up and down because of compression, but if they are all at the same point at the same time ...

[Edit 1 time, last edit by Former Member at Jun 27, 2019 1:56:26 PM]
armstrdj
Former World Community Grid Tech | Joined: Oct 21, 2004 | Post Count: 695 | Status: Offline
The disk limit is 1.5 GB.

Thanks,
armstrdj
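Since the 1.5 GB limit is per task, the total disk to reserve in BOINC's preferences scales with how many beta tasks can run at once (one per thread during beta, per the post above). A hypothetical back-of-the-envelope helper; the 20% safety margin is my own assumption to cover the compression-driven size swings mentioned up-thread, not an official figure:

```python
# Hypothetical helper, not part of BOINC: estimate disk space to allow BOINC,
# given the 1.5 GB per-task limit from this thread and the number of tasks
# that can run concurrently. The 1.2 margin is an assumption, not official.

def disk_reserve_gb(per_task_gb=1.5, concurrent_tasks=1, margin=1.2):
    """Total gigabytes to reserve: per-task limit x concurrency x margin."""
    return per_task_gb * concurrent_tasks * margin

# An 8-thread host running one beta task per thread:
# disk_reserve_gb(1.5, 8) -> 14.4
```

So an 8-thread machine would want roughly 14-15 GB free for BOINC if every thread picked up a beta task at once.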
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1317 | Status: Offline
NixChix wrote:
> I had 2 ARP betas that ran for 5 days. On one that I tried restarting, it reset to 4 days and I lost a day of processing. The wingman had similar run time. It would seem that more frequent checkpointing is needed.

It's an unfortunate side-effect of the way Beta tasks are issued that large tasks will get sent to machines that wouldn't be expected to run them as production tasks. This project is one of them, and you only need to think back as far as CEP2 for another one... Furthermore, I'd be extremely surprised if the other climate projects don't result in similar performance if/when they hit Beta...

I would hope that when this project goes live it will have similar warnings about the type of systems needed to run the tasks within reasonable times (along with a recommendation to use LAIM to help counter the restart issue.) And I think it will need an explicit opt-in to the project, so no "automatic opt in" when it goes live...

[Edit for completeness -- I had remembered aright (though I left out "to the" in the paragraph above, possibly confusing my point -- oops!) - see up-thread and the armstrdj reply to this post made before I got back to add the link!]

As regards execution times for this application -- I'm currently running machines with Kaby Lake processors and at least 2GB RAM per core, and my machines get through these tasks in long but not unreasonable times -- they seem to take about 20% longer than the longest-running HST1 tasks in each case. My slowest machine (clocked at 2.8GHz) gets through one of these in under 13 hours, as does my main workstation (clocked at 4GHz but hyperthreaded and running LOTS of other stuff at the same time...); my small server box (clocked at 3.5GHz but not hyperthreaded) runs them in around 9.5 to 10.5 hours.

If I think back to some of my now-decommissioned machines, I suspect the i3-2100 would've taken over 24 hours for one of these, and I wouldn't even have considered running this on my ancient MacBook Pro (which would've taken 3 or 4 days, I suspect)! One thing I can say for sure, though, is that despite the apparent size of the calculations being performed, these don't seem to reduce overall machine throughput in the way MIP1 does.

Cheers - Al.

[Edit 2 times, last edit by alanb1951 at Jun 27, 2019 3:32:12 PM]
armstrdj
Former World Community Grid Tech | Joined: Oct 21, 2004 | Post Count: 695 | Status: Offline
alanb1951 wrote:
> I would hope that when this project goes live it will have similar warnings about the type of systems needed to run the tasks within reasonable times (along with a recommendation to use LAIM to help counter the restart issue.) And I think it will need an explicit opt-in project, so no "automatic opt in" when it goes live...

This project will be opt-in only when launched and will call out the caveats on the project selection page. Also, it is likely that when launched the default will be set to only allow one task per host at a time.

Thanks,
armstrdj