| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 20
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My WCG task is not doing anything except reporting that it is pausing.
I had the same problem last week; I then reset the project and it worked fine for 3 days. After reporting a completed task it stopped again.
Thu 29 Jun 2006 00:37:15 CEST|World Community Grid|Starting task za071_00381_2 using hpf2 version 507
Thu 29 Jun 2006 01:37:15 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__707_S5R1a_0 using einstein_S5R1 version 401
Thu 29 Jun 2006 01:37:15 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Thu 29 Jun 2006 01:55:17 CEST|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Thu 29 Jun 2006 01:55:17 CEST|Einstein@Home|Reason: To fetch work
Thu 29 Jun 2006 01:55:17 CEST|Einstein@Home|Requesting 10 seconds of new work
Thu 29 Jun 2006 01:55:22 CEST|Einstein@Home|Scheduler request succeeded
Thu 29 Jun 2006 01:55:24 CEST||Rescheduling CPU: files downloaded
Thu 29 Jun 2006 02:02:23 CEST|World Community Grid|Sending scheduler request to https://secure.worldcommunitygrid.org/boinc/wcg_cgi/fcgi
Thu 29 Jun 2006 02:02:23 CEST|World Community Grid|Reason: To report completed tasks
Thu 29 Jun 2006 02:02:23 CEST|World Community Grid|Reporting 1 tasks
Thu 29 Jun 2006 02:02:28 CEST|World Community Grid|Scheduler request succeeded
Thu 29 Jun 2006 02:55:24 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__707_S5R1a_0 (removed from memory)
Thu 29 Jun 2006 02:55:24 CEST|World Community Grid|Restarting task za071_00381_2 using hpf2 version 507
Thu 29 Jun 2006 03:55:25 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__707_S5R1a_0 using einstein_S5R1 version 401
Thu 29 Jun 2006 03:55:25 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Thu 29 Jun 2006 04:01:41 CEST||Rescheduling CPU: application exited
Thu 29 Jun 2006 04:01:41 CEST|Einstein@Home|Computation for task h1_0112.0_S5R1__707_S5R1a_0 finished
Thu 29 Jun 2006 04:01:41 CEST|Einstein@Home|Starting task h1_0112.0_S5R1__694_S5R1a_1 using einstein_S5R1 version 401
Thu 29 Jun 2006 04:01:44 CEST|Einstein@Home|Started upload of file h1_0112.0_S5R1__707_S5R1a_0_0
Thu 29 Jun 2006 04:01:48 CEST|Einstein@Home|Finished upload of file h1_0112.0_S5R1__707_S5R1a_0_0
Thu 29 Jun 2006 04:01:48 CEST|Einstein@Home|Throughput 25624 bytes/sec
Thu 29 Jun 2006 05:01:42 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 06:01:42 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Thu 29 Jun 2006 06:01:42 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__694_S5R1a_1 using einstein_S5R1 version 401
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Reason: To report completed tasks
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Reporting 1 tasks
Thu 29 Jun 2006 06:25:58 CEST|Einstein@Home|Scheduler request succeeded
Thu 29 Jun 2006 07:01:43 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 08:01:44 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory) |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hmm, would you please check your BOINC profile on WCG and see what operating times, days of the week, etc. it is set to, and also the weight of the project? Since Einstein also appears in your log, you seem to be sharing with one or more non-WCG projects. Once WCG has used its portion of time, BOINC switches to the next project(s) until they have had their share, after which it resumes WCG.
Make sure that enough time is given to WCG to finish the WCG Work Unit(s) in your BOINC queue within 1 week (the closing dates are listed there); otherwise WCG starts sending extra copies of the same work to someone else to crunch, which would be wasteful duplication. In theory BOINC manages this automatically, i.e. it returns to WCG to finish WUs before their deadlines pass, then gives extra time to the other projects so that all get the time share you assigned them.
WCG
Please help to make the Forums an enjoyable experience for All! [Edit 5 times, last edit by Sekerob at Jul 1, 2006 8:15:47 AM] |
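The time-share idea described in the post above can be illustrated with a small calculation. This is only a sketch of the long-run target fractions, assuming each project's share of CPU time is proportional to its resource share; the function name `share_fractions` and the example values are hypothetical, and the real BOINC scheduler also weighs deadlines and other factors that are not modelled here.

```python
def share_fractions(resource_shares):
    """Return each project's long-run fraction of CPU time.

    Assumption: the fraction is simply share / sum(all shares).
    """
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in resource_shares.items()}


# Illustrative values: two projects with equal resource shares.
fractions = share_fractions({"World Community Grid": 100, "Einstein@Home": 100})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")
```

With equal shares each project is expected to get about half of the crunching time over the long run, which is consistent with the roughly hourly switching visible in the log.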
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sekerob,
There are a few curious entries in his log. Notice that here Einstein pauses and WCG does not start, yet WCG pauses 1 hour later. How can WCG pause when it wasn't started?
Thu 29 Jun 2006 05:01:42 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 06:01:42 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Now Einstein starts, as it should:
Thu 29 Jun 2006 06:01:42 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__694_S5R1a_1 using einstein_S5R1 version 401
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Reason: To report completed tasks
Thu 29 Jun 2006 06:25:53 CEST|Einstein@Home|Reporting 1 tasks
Thu 29 Jun 2006 06:25:58 CEST|Einstein@Home|Scheduler request succeeded
Now Einstein pauses; WCG does not start, but it pauses 1 hour later:
Thu 29 Jun 2006 07:01:43 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 08:01:44 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
There is a very good explanation for all this and I wish I knew what that explanation is. [Edit 2 times, last edit by Former Member at Jul 1, 2006 9:31:22 AM] |
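The check made by eye in the post above (a "Pausing" entry with no earlier "Starting" or "Restarting" entry for the same task) can be automated. The following is a minimal sketch, assuming the `timestamp|project|message` layout seen in this thread's log; the function name `find_unmatched_pauses` is hypothetical and is not part of BOINC. The sample lines are copied from the log in the first post.

```python
import re

# Matches the three message types that change whether a task is running.
EVENT = re.compile(r"^(Starting|Restarting|Pausing) task (\S+)")

LOG = """\
Thu 29 Jun 2006 02:55:24 CEST|World Community Grid|Restarting task za071_00381_2 using hpf2 version 507
Thu 29 Jun 2006 03:55:25 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__707_S5R1a_0 using einstein_S5R1 version 401
Thu 29 Jun 2006 03:55:25 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Thu 29 Jun 2006 04:01:41 CEST|Einstein@Home|Starting task h1_0112.0_S5R1__694_S5R1a_1 using einstein_S5R1 version 401
Thu 29 Jun 2006 05:01:42 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 06:01:42 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
Thu 29 Jun 2006 06:01:42 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__694_S5R1a_1 using einstein_S5R1 version 401
Thu 29 Jun 2006 07:01:43 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__694_S5R1a_1 (removed from memory)
Thu 29 Jun 2006 08:01:44 CEST|World Community Grid|Pausing task za071_00381_2 (removed from memory)
"""


def find_unmatched_pauses(lines):
    """Return (timestamp, project, task) for every pause of a task that
    the log never showed as started or restarted since its last pause."""
    running = set()
    unmatched = []
    for line in lines:
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue  # not a timestamp|project|message line
        timestamp, project, message = parts
        match = EVENT.match(message)
        if not match:
            continue  # scheduler requests, uploads, etc. are ignored
        action, task = match.groups()
        if action == "Pausing":
            if task not in running:
                unmatched.append((timestamp, project, task))
            running.discard(task)
        else:
            running.add(task)
    return unmatched


for timestamp, project, task in find_unmatched_pauses(LOG.splitlines()):
    print(f"{timestamp}: {project} paused {task} without a logged start")
```

On this excerpt the sketch flags the 06:01:42 and 08:01:44 pauses of za071_00381_2, the same two entries questioned above. A log that begins mid-run would also flag the first pause of any task that was already running, so the excerpt should start at a point where the state is known.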
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
On the off-chance... maybe the RAM conditions changed between downloading the task and starting it. If the recently discussed video steal took place after the download, APO01 could have sunk below the required 256 MB of RAM. Did he maybe disable page swapping? Wild guesses.
PS: the WCG task is started at 00:37:15, but is it really unloaded at these hourly intervals, or is it merely sitting there waiting for sufficient memory? Over to the BOINC specialists...
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The video steal happens immediately after power-on, I think, so that the kernel knows where it can load the OS. For sure the available physical RAM is determined by the time the OS has finished loading and before any apps run.
I think if you disable paging it's not really disabled until you restart? I don't know, I haven't done that in years, but I can't see how it could be done safely without at least closing all running apps. Also, for compiled C/C++ programs the size of the code and static data is fixed at compile time, so that part of the memory requirement is known when the program starts. Dynamically sized arrays cannot be sized at compile time, so they are allocated from virtual memory at run time (though the pointers and indices into them are ordinary variables, often kept in a CPU register, if available, for speed). If there is not enough virtual memory to satisfy a request, that error condition is trapped and reported, and the app either attempts to recover or exits with an error code like those we saw reported recently. I think one of APO01's state files has been corrupted, or else he's administering an IQ test. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi,
Nothing changed in the memory settings as far as I know. Einstein and WCG each have 50% of the resources (100 (50%) in the Resource share column). The problem may be that the task is not really taken out of the system: while Einstein is running I still have some WCG processes running (or actually standing still). Below is the output from free -t and the list of (in)active boinc processes. Thanks for looking into my problem.
# free -t
             total       used       free     shared    buffers     cached
Mem:        515868     497180      18688          0      24108     106508
-/+ buffers/cache:     366564     149304
Swap:       498004     287160     210844
Total:     1013872     784340     229532
# ps -ef|grep boinc
boinc 30301 1 0 May21 ? 00:05:58 /usr/bin/boinc_client -redirectio -dir /var/lib/boinc-client
boinc 1612 30301 1 Jun29 ? 00:58:15 wcg_hpf2_rosetta_5.07_i686-pc-linux-gnu -abrelax -protein za07 -chain 1 -series 00381 -nstruct 20 -constant_seed -jran 305540 -silent -farlx -ex1 -ex2 -output_silent_gz -output_chi_silent
boinc 1613 1612 0 Jun29 ? 00:00:00 wcg_hpf2_rosetta_5.07_i686-pc-linux-gnu -abrelax -protein za07 -chain 1 -series 00381 -nstruct 20 -constant_seed -jran 305540 -silent -farlx -ex1 -ex2 -output_silent_gz -output_chi_silent
boinc 1614 1613 0 Jun29 ? 00:00:00 wcg_hpf2_rosetta_5.07_i686-pc-linux-gnu -abrelax -protein za07 -chain 1 -series 00381 -nstruct 20 -constant_seed -jran 305540 -silent -farlx -ex1 -ex2 -output_silent_gz -output_chi_silent
hans 29015 1 0 09:07 ? 00:01:52 /usr/bin/boincmgr
boinc 15052 30301 88 12:06 ? 00:33:08 einstein_S5R1_4.01_i686-pc-linux-gnu @conf --IFO=LHO --Freq=112.147058824 --FreqBand=0.0147058823529 --startTime=819486815 --endTime=819607696 --f1dot=3.55616e-10 --f1dotBand=-3.91178e-09 --skyGridFile=grid_0120_h_T12_S5R1.dat --metricMismatch=0.15 --WUfpops=1.8476e+13 --NumCandidatesToKeep=1000
root 18522 28915 0 12:43 pts/3 00:00:00 grep boinc |
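For readers unsure how to read the free -t figures in the post above, here is a rough sketch that recomputes the "-/+ buffers/cache" row from the Mem row. It assumes the older free layout shown in this thread, with values in kilobytes; the helper names `parse_free` and `summarize` are hypothetical.

```python
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:        515868     497180      18688          0      24108     106508
-/+ buffers/cache:     366564     149304
Swap:       498004     287160     210844
Total:     1013872     784340     229532
"""


def parse_free(text):
    """Parse old-style `free -t` output into dicts for the Mem and Swap rows."""
    lines = text.strip().splitlines()
    columns = lines[0].split()
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        rows[label.strip()] = [int(value) for value in rest.split()]
    mem = dict(zip(columns, rows["Mem"]))
    swap = dict(zip(columns, rows["Swap"]))  # only total, used, free are present
    return mem, swap


def summarize(mem, swap):
    """Separate memory held by programs from reclaimable buffers and cache."""
    reclaimable = mem["buffers"] + mem["cached"]
    return {
        "used_by_programs_kb": mem["used"] - reclaimable,
        "available_to_programs_kb": mem["free"] + reclaimable,
        "swap_used_kb": swap["used"],
    }


mem, swap = parse_free(FREE_OUTPUT)
for key, value in summarize(mem, swap).items():
    print(f"{key}: {value}")
```

The recomputed values (366564 KB used by programs, 149304 KB available) match the "-/+ buffers/cache" row in the post. The 287160 KB of swap in use on a 512 MB machine may point to memory pressure, though on its own it does not explain why the WCG task is never restarted.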
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I think you are correct. The rosetta process is the wcg process. I don't know Linux well enough to say more but someone more knowledgeable should spot this and jump in. My guess is the rosetta process is not unloading properly. It's somehow fooling BOINC into thinking it's already running so BOINC doesn't try to start it and that's why the log doesn't mention starting rosetta when Einstein pauses.
Thanks for being patient. I'm sure the experts here will help you get this sorted out soon. [Edit 1 times, last edit by Former Member at Jul 1, 2006 11:18:02 AM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Eureka (not really).
I wonder: since the CAs say that the 15% of WCG crunchers who use BOINC take 85% of total support, shouldn't we be getting some sort of problem-reporting datasheet? Windows, Linux or Mac is one of those variables (it only came out in the 6th post of this thread). It could be attached to my favorite feature not (yet) used on our forum, 'sticky posts', and titled something like "BOINC problem reporting form".
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
With you 100% on that, Sekerob.
1. Sticky posts.
2. A problem reporting template in a sticky in each help forum:
- CPU speed
- # cores/processors
- RAM
- operating system
- router? firewall? proxy server? which antivirus?
- etc.
Another thing that might be nice for CAs and other volunteer helpers is a tool for retrieving a poster's global BOINC/UD preferences. The tool would read preferences but not allow making changes. Or would that violate existing member confidentiality agreement(s)? With 300 members coming onboard every day it's obvious the current support system is going to break before long. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
After a reboot the thing worked fine for almost a day. After a successful scheduler request the WCG task is hanging again (see messages below).
Answers to the reporting template proposal:
- CPU speed: Pentium III 799 MHz
- # cores/processors: No core-dump found, 1 Intel processor
- RAM: 512 MB
- operating system: Debian testing (kernel 2.6.15-1-686)
- router? firewall? proxy server? which antivirus?: Belkin ADSL modem/router, firewall included, no antivirus.
- Boinc version: Boinc 5.4.10
- WCG task: hpf2 5.07
- Other tasks: einstein_S5R1 4.01
Wed 05 Jul 2006 10:09:54 CEST|World Community Grid|Sending scheduler request to https://secure.worldcommunitygrid.org/boinc/wcg_cgi/fcgi
Wed 05 Jul 2006 10:09:54 CEST|World Community Grid|Reason: To report completed tasks
Wed 05 Jul 2006 10:09:54 CEST|World Community Grid|Reporting 1 tasks
Wed 05 Jul 2006 10:10:00 CEST|World Community Grid|Scheduler request succeeded
Wed 05 Jul 2006 10:57:37 CEST|Einstein@Home|Pausing task h1_0112.0_S5R1__514_S5R1a_0 (removed from memory)
Wed 05 Jul 2006 11:57:38 CEST|World Community Grid|Pausing task za094_00222_0 (removed from memory)
Wed 05 Jul 2006 11:57:38 CEST|Einstein@Home|Restarting task h1_0112.0_S5R1__514_S5R1a_0 using einstein_S5R1 version 401 |
||
|
|
|