Thread Status: Active
Total posts in this thread: 33
This topic has been viewed 6147 times and has 32 replies
d_a_dempsey
Cruncher
USA
Joined: Jun 17, 2008
Post Count: 19
Status: Offline
Project Badges:
Re: WU time completes, but 0% completed. Not finishing

Sorry for the delay in updating, but it's taken a while to get firm results.

1. After resetting the project, it appeared to work. Times looked normal, progress was incrementing, life appeared to be good. Appearances are deceiving. After reaching 0.232% progress, it hung. Elapsed time was increasing, time to completion was increasing, progress remained at 0.232%. I let it go on that way for about 6 hours. I stopped the client and noticed that all WCG and other projects left memory, but CEP2 did not. I rebooted. Same behavior. I aborted the packet.

2. I now have the exact same behavior as originally posted.

3. The stderr.txt and stdout.txt files are 0 bytes.

@skgiven
According to Task Manager, I am consuming ~87% of CPU and have 5,800+ MB of my 8,192 MB of physical RAM available. Seven processes are consuming CPU, all BOINC-related. I have 607 GB of 688 GB disk space free, with no fragmentation.

I think resource starvation can be ruled out.

@seippel
I have four CEP2 processes:

wcgrid_cep2_6.40_windows_intel86*32 @ 1,232K
wcgrid_cep2_qchem_6.40_windows_intel86*32 @ 120K
wcgrid_cep2_qchem_6.40_windows_intel86*32 @ 116K (two of these)

None of them have accumulated any CPU time.
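For anyone who wants to confirm the "no CPU time" observation outside Task Manager, here is a minimal Python sketch that parses the output of `tasklist /V /FO CSV`. The function name `idle_processes` and the sample rows are hypothetical, and it assumes the English column headers "Image Name", "PID" and "CPU Time" with an H:MM:SS time format.

```python
import csv
import io


def idle_processes(tasklist_csv, name_prefix="wcgrid_cep2"):
    """Return (image name, PID) for matching processes whose CPU time is all zeros."""
    rows = csv.DictReader(io.StringIO(tasklist_csv))
    idle = []
    for row in rows:
        name = row["Image Name"]
        if not name.lower().startswith(name_prefix):
            continue
        # CPU Time is assumed to look like H:MM:SS; all-zero fields mean no CPU time.
        parts = row["CPU Time"].split(":")
        if all(int(p) == 0 for p in parts):
            idle.append((name, row["PID"]))
    return idle


# Hypothetical sample in the shape produced by: tasklist /V /FO CSV
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage","Status","User Name","CPU Time","Window Title"
"boinc.exe","1200","Console","1","9,000 K","Unknown","PC\\user","0:01:10","N/A"
"wcgrid_cep2_6.40_windows_intel86","3400","Console","1","1,232 K","Unknown","PC\\user","0:00:00","N/A"
"wcgrid_cep2_qchem_6.40_windows_intel86","3404","Console","1","120 K","Unknown","PC\\user","0:00:00","N/A"
'''

if __name__ == "__main__":
    for name, pid in idle_processes(SAMPLE):
        print(f"{name} (PID {pid}) has accumulated no CPU time")
```

On a real machine the CSV text would come from running `tasklist /V /FO CSV` and capturing its output, rather than from the sample string.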

This is a Win7 64-bit platform.

Hope this helps,

DAD
----------------------------------------
David

---------------------------------


----------------------------------------
[Edit 1 times, last edit by d_a_dempsey at Apr 30, 2011 4:27:39 AM]
[Apr 30, 2011 4:24:57 AM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Project Badges:

Did you exclude the BOINC folders from your virus-scanning software, and did you set up firewall exceptions for BOINC?

Anything in your event logs?

You could check for chipset/BIOS updates and enable extra logging to try to get a better picture of what is (or isn't) happening (it's a bit of work, though):

To do this you would need to create/edit a cc_config.xml file:

Create (or edit) the cc_config.xml file in this folder (Win7),
C:\ProgramData\BOINC\

It's usually handiest to just right-click in the folder and create a new text document, or open an existing one as a text document. When you have entered the text, save it as cc_config.xml (select "All Files" as the file type when you save, so it does not end up as cc_config.xml.txt).

<cc_config>
<log_flags>
<task>1</task>
<cpu_sched>1</cpu_sched>
<task_debug>1</task_debug>
<app_msg_receive>1</app_msg_receive>
<app_msg_send>1</app_msg_send>
</log_flags>
</cc_config>

The following flags might also be useful,
<checkpoint_debug>
<cpu_sched_debug>
<mem_usage_debug>
<rr_simulation>

One of the techs could probably advise what to look for better than I can, if they think it's worth doing.

Note, you need to restart (close and reopen) BOINC for these settings to take effect.
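If hand-editing the file is error-prone, the same cc_config.xml can be generated with a short script. This is only a sketch: the helper names `build_cc_config` and `write_cc_config` are made up for illustration, and the flag list is simply the one from the post above.

```python
import xml.etree.ElementTree as ET

# Flags taken from the post above; each is switched on with the value 1.
LOG_FLAGS = ["task", "cpu_sched", "task_debug", "app_msg_receive", "app_msg_send"]


def build_cc_config(flags):
    """Build <cc_config><log_flags>...</log_flags></cc_config> as a string."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        ET.SubElement(log_flags, flag).text = "1"
    return ET.tostring(root, encoding="unicode")


def write_cc_config(path, flags):
    """Write the generated configuration to the given file path."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_cc_config(flags))


if __name__ == "__main__":
    # Only prints the XML; saving it into the BOINC data folder is left to the reader.
    print(build_cc_config(LOG_FLAGS))
```

Calling `write_cc_config` with the full path to cc_config.xml in the BOINC data folder would save the file in place; BOINC still needs a restart afterwards.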

The last resort would be just to run other tasks for now, and periodically try these ones again (with default settings); there might be app/task changes that make them work in the future.

Good luck,
[Apr 30, 2011 9:13:37 AM]
d_a_dempsey
Cruncher
USA
Joined: Jun 17, 2008
Post Count: 19
Status: Offline
Project Badges:

Nothing in the event logs looks even remotely useful; in fact, there is not much of anything at all from BOINC.

I'll try the cc_config.xml bit.

The last few WUs have accumulated 14 seconds of CPU time to wcgrid_cep2_6.40_windows_intel86*32, but that seems to be the uniform cut-off point.
----------------------------------------
David

---------------------------------


[May 7, 2011 10:57:37 PM]
d_a_dempsey
Cruncher
USA
Joined: Jun 17, 2008
Post Count: 19
Status: Offline
Project Badges:

I have a log file, but it grows very quickly with all those options on.

I imagine you'd like me to provide a link to the file as opposed to posting it here. :D
----------------------------------------
David

---------------------------------


[May 10, 2011 3:31:03 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Hi,

you can mail to support@worldcommunitygrid.com with reference to this thread. Put in there f.a.o. knreed.

If you go to the Result Status page and hit one of the failed result links ("Error", not "User Aborted"), we might have a log with an error message. Starts like this:
Result Log

Result Name: E202059_ 139_ C.25.C19H10N4OS.00176681.0.set1d06_ 0--
<core_client_version>6.10.59</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[18:33:19] Number of jobs = 16
[18:33:19] Starting job 0,CPU time has been restored to 0.000000.
[18:33:19] Starting new Job
...

--//--
[May 10, 2011 7:13:43 AM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline
Project Badges:

d_a_dempsey,

Something is preventing the qchem process, wcgrid_cep2_qchem_6.40_windows_intel86*32, from getting any CPU time. Normally I would say it is antivirus software or something similar, but that would typically result in the process crashing with an error. I would not normally recommend looking too closely at the slot directory, but in this case we need to. If you look in the slot directory of a running CEP2 task, you should see two directories. The first is a "qcaux" directory; ignore this one. The second is a working directory for the current qchem job, named something like "A.24.C18H13N5S.32.4.bp86.svp.n.pbe0.tzvp.n.sp"; the name will differ depending on your task. This directory is not created until initialization is complete, so it may take a little time to appear. In this directory, check whether there is a "stdout.txt" file. If so, please open it in Notepad, copy the contents to another file, and send that file to the support email address given by SekeRob. Thanks for your patience in trying to resolve this issue.
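The slot check described above can also be scripted. The following Python sketch lists every subdirectory of a slot other than "qcaux" and reports whether it contains a stdout.txt; the function name `find_qchem_stdout` is hypothetical, and the layout it assumes is only the one described in this post.

```python
from pathlib import Path


def find_qchem_stdout(slot_dir):
    """Map each working directory name in a slot to its stdout.txt path, or None."""
    results = {}
    for entry in sorted(Path(slot_dir).iterdir()):
        # Skip plain files and the "qcaux" directory, as advised above.
        if not entry.is_dir() or entry.name == "qcaux":
            continue
        stdout_file = entry / "stdout.txt"
        results[entry.name] = stdout_file if stdout_file.is_file() else None
    return results
```

Pointing it at the slot folder of the running CEP2 task returns a dictionary; an entry of None means the job directory exists but no stdout.txt has been written yet.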

Thanks,
armstrdj
[May 10, 2011 3:31:05 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Just a thought: Would the permissions correctly propagate during the slot sub dir creation? If not done already, uninstalling and reinstalling as earlier proposed would set that right and is lossless, meaning that the data_dir content and everything else stays in place. The ideal place is per your log correct... C:\ProgramData\BOINC, but since the user name is shown as the account BOINC is run from, this is not a Protected Application Execution (service) install, much better, combined with electing the "Allow all users" during the install. Then BOINC will run even if no one is logged in.

22-Apr-2011 00:02:23 [---] Running under account David Dempsey

ttyl
[May 10, 2011 4:00:57 PM]
d_a_dempsey
Cruncher
USA
Joined: Jun 17, 2008
Post Count: 19
Status: Offline
Project Badges:

Hi,

you can mail to support@worldcommunitygrid.com with reference to this thread. Put in there f.a.o. knreed.

If you go to the Result Status page and hit one of the failed result links ("Error", not "User Aborted"), we might have a log with an error message. Starts like this:

Mail sent.

The work units do not complete, do not error. There are no error results for this project from this device.

@armstrdj
I bravely ignored the qcaux directory and went directly to C.24.C22H14N2.00112949.4.noopt.bp86.sto6g.n.sp. Unfortunately, there are only two files in it:

 Directory of C:\ProgramData\BOINC\slots\9\C.24.C22H14N2.00112949.4.noopt.bp86.sto6g.n.sp

05/11/2011  12:24 PM    <DIR>          .
05/11/2011  12:24 PM    <DIR>          ..
12/14/2010  07:12 AM               438 C.24.C22H14N2.00112949.4.noopt.bp86.sto6g.n.sp.in
12/14/2010  07:12 AM             1,881 molecule
               2 File(s)          2,319 bytes


...The ideal place is per your log correct... C:\ProgramData\BOINC, but since the user name is shown as the account BOINC is run from, this is not a Protected Application Execution (service) install, much better, combined with electing the "Allow all users" during the install. Then BOINC will run even if no one is logged in.

I'm not sure I understand what you're telling me to do. Are you asking me to:

  • Reinstall allowing PAE?
  • Reinstall not as a PAE but allowing all users?
  • Something else?

I am the only user account on this computer.

Thank you,
d_a_dempsey
----------------------------------------
David

---------------------------------


[May 12, 2011 2:31:55 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Hi,

a few words fell off my typewriter: a Service / PAE install is better, even with a single user. This way the client is sandboxed under a limited special user account, and if it ever went wild it could not touch anything else. Then with "Allow all Users", you're granted rights to manage the BOINC install.

There's presently one downside to a service install in Windows: so-called GPU computing would not work, but I don't know whether this pertains only to the NVIDIA-exclusive CUDA or also to OpenCL (which works on brands such as ATI as well), which WCG will use for their GPU version. At any rate, as said, if you boot your computer with a service install, BOINC will load before you sign in.

Just 2 files in the slot... ouch... there would have to be about 6,600 files in the structure, with a further 6 sub dirs in the main slot, some of them branching out further into more subdirs. It's a major complex.
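To compare a slot against that rough figure of about 6,600 files without counting by hand, a recursive count is enough. This is a minimal Python sketch; `count_tree` is a made-up helper name, and it counts everything under the folder it is given, not only CEP2 files.

```python
import os


def count_tree(root):
    """Return (number of files, number of subdirectories) under root, recursively."""
    n_files = 0
    n_dirs = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_files, n_dirs
```

Run against the slot folder of the stalled task, a result of only a handful of files would suggest that the initialization step never unpacked its data.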

--//--

edit: spelling and as per previous post, uninstall and install is needed. When installed already, there are only repair and uninstall offered, for as long as I can remember.:D
----------------------------------------
[Edit 1 times, last edit by Former Member at May 12, 2011 10:06:52 AM]
[May 12, 2011 7:16:58 AM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Project Badges:

.
----------------------------------------
[Edit 3 times, last edit by skgiven at May 19, 2011 10:13:32 PM]
[May 12, 2011 9:54:17 AM]