Thread Status: Active
Total posts in this thread: 6
Spiderman
Advanced Cruncher
United States
Joined: Jul 13, 2020
Post Count: 117
Never-ending OPNG tasks?

I'm uncertain whether to post here, or under the GPU forum, or OpenPandemics...

Q: Anyone else noticed a few OPN GPU tasks that never end? It's not all, but enough to be troublesome.

--

I happened to notice one of my machines with a "No Reply" when I downloaded my Results into a spreadsheet yesterday morning.

When I looked at that particular box, there was an OPNG WU (OPNG_0193257_00054_0) with over 2 days of runtime and still ticking. It was overdue and had another WU behind it that was within an hour of being overdue as well.

This is one of my Windows machines and has the integrated AMD GPU enabled (no add-on card), set to run without interruption. There are no other processes on it and even if so, it's told to go-forward and compute CPU/GPU no matter what. [Local Preferences]

I rebooted it and it went about its business, re-running from scratch (not sure where the checkpoint disappeared to?).

This morning it was still going so I aborted the task. It started running another OPNG that was due yesterday (which I will probably abort also if it doesn't finish it soon).

Sadly, this box is now treated as suspect by the WCG server, and I'm seeing "This computer has finished a daily quota of 1 tasks" in the Event Logs.

--

Another Windows machine has an integrated *Intel* GPU and I found it with the same issue, but instead of rebooting, I suspended it and then told it to resume.

It eventually finished the WU.

--

I'm not seeing anything in the Event Logs to tell me anything.

The only common denominator of these are:

1) Windows

2) Latest v7.24.1 BOINC Client

3) Whatever it is doing, it isn't honoring the WCG due date and stopping once the time limit has passed.

4) Doesn't matter if Intel or AMD GPU

5) v7.22.2 BOINC Client doesn't appear to do this (I only have one other Windows with GPU, all others are CPU-only or Linux).

6) This only happens on select OPN GPU tasks (not all) but holds-up the chain since this is an integrated graphics processor.

--

Very odd and nothing in the Event Logs to give a hint what the issue is.

I could turn GPU off, but like most, I try to squeeze every portion of processing power out of these systems.

--

Before I start down the path of reporting this to BOINC Support over at Berkeley, I wanted to see if anyone else had noticed it.

Thanks...
[Sep 21, 2023 7:08:01 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Re: Never-ending OPNG tasks?

Never seen this with any OPNG WUs, and honestly I don't remember ever having seen it with OPN1 either.

I just see this occasionally with MCM1 or SCC1 WUs, where most of the time they run up to 99.xxx% and then stop without finishing properly. Usually, when I check the properties of those WUs in the BOINC Manager in those cases, the CPU time shows just dashes. A couple of times recently, however, I saw some MCM1 or SCC1 tasks that ran to maybe 10-15% and then just kept running up the clock with no obvious progress. But those are 1 or 2 out of the 5000-6000 WUs in the Results list at that point, so nothing that has me worried so far. And WCG's reaction to any report, if I were even able to make one in time, would probably be to ignore it anyway...

Ralf
[Sep 21, 2023 9:09:33 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12392
Re: Never-ending OPNG tasks?

Spiderman

I have an i7 3770 with an Intel GPU. Its current OPNG unit has not reached its deadline but looking at its properties I see that it has a fraction done of 78.899%, elapsed time of 2:52:41 and CPU time of just 0:10:35.

This indicates to me that it is only active part time. Is that what is happening with you?
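
For anyone who wants to run the same comparison on their own task, here is a minimal Python sketch of the arithmetic, using the two times quoted above. The function names are made up for illustration and are not part of BOINC. GPU tasks normally do much of their work off the CPU, so a low ratio on its own is only a hint, not proof of a problem.

```python
def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' string, as shown in the task properties, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_fraction(elapsed, cpu):
    """Return CPU time as a fraction of elapsed time (0.0 if nothing has elapsed)."""
    elapsed_s = hms_to_seconds(elapsed)
    if elapsed_s == 0:
        return 0.0
    return hms_to_seconds(cpu) / elapsed_s


# Times taken from the task properties quoted in this post.
ratio = cpu_fraction("2:52:41", "0:10:35")
print(f"CPU time is {ratio:.1%} of elapsed time")  # prints: CPU time is 6.1% of elapsed time
```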

Incidentally, I find that BOINC does not stop units that exceed their deadline. It is possible to earn the credit if it reports its results before its replacement.

The only server aborts that I get are of re-sends where the late running units report before my unit reaches its first check-point.

Mike
[Sep 22, 2023 12:09:33 AM]
Spiderman
Advanced Cruncher
United States
Joined: Jul 13, 2020
Post Count: 117
Re: Never-ending OPNG tasks?

Mike,

Thanks -- perhaps I should've left it running just to see if it ever does stop. I'm unsure. However, I've now aborted a total of 3 that were 2-3 days overdue (and that surely had new Wingmen assigned who reported back -- I know my first one did).

This last one that I stopped this morning had run for over 20-hours -- normally they take 1.5 hours.
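
As a rough rule of thumb for spotting these, the comparison above can be sketched in Python. The function name and the cut-off factor of 5 are arbitrary assumptions for illustration, not anything BOINC or WCG defines; only the 20-hour and 1.5-hour figures come from this post.

```python
def looks_stuck(elapsed_hours, typical_hours, factor=5.0):
    """Flag a task whose runtime exceeds `factor` times its usual runtime.

    `factor` is a hypothetical threshold; tune it to taste.
    """
    return elapsed_hours > factor * typical_hours


# The task described above: over 20 hours elapsed, normally about 1.5 hours.
print(looks_stuck(20.0, 1.5))   # prints: True
# A task running slightly long would not be flagged.
print(looks_stuck(2.0, 1.5))    # prints: False
```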

The machine it happens on most is an Intel i7. The other machine I noticed it on is an Intel i5. I double-checked, and Power & Sleep modes are turned off wherever possible on those systems.

If it continues, I'll see if I can dig-up the associated logs + files and send to BOINC-Berkeley for them to analyze & determine if there's a bug in this latest release of the Client.

Appreciate the reply!
[Sep 22, 2023 2:09:19 PM]
Paul Schlaffer
Senior Cruncher
USA
Joined: Jun 12, 2005
Post Count: 244
Re: Never-ending OPNG tasks?

I have one AMD APU that will stay on 99% until it times out. I just moved that machine into a profile which doesn't allow GPU work.
----------------------------------------

“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
[Sep 22, 2023 8:42:57 PM]
Spiderman
Advanced Cruncher
United States
Joined: Jul 13, 2020
Post Count: 117
Re: Never-ending OPNG tasks?

This Dell Intel i7-3770 machine has continually been hanging on OPNG GPU tasks. Most of the other Dells and HPs in the stack don't have that issue, but after aborting several OPNG WUs over the past 1-2 months I decided to turn GPU processing off for WCG on this single machine. (What is so strange is that there are no GPU issues with my backup project when WCG is down.)

I don't use Profiles, so I added a 'cc_config.xml' file that tells BOINC not to allow GPU processing for WCG on this machine while still allowing it for other projects.

Running 'boinccmd --read_cc_config' from the command line reloaded the configuration and let me confirm that GPU processing was off for WCG on this single machine.

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://www.worldcommunitygrid.org</url>
    </exclude_gpu>
  </options>
</cc_config>
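
If you would rather generate the file than type it by hand, here is a minimal Python sketch that builds the same structure and reads it back to check it is well-formed. The helper names are hypothetical, and where the resulting text gets saved (your BOINC data directory) is left to you since it varies by install.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url):
    """Return cc_config.xml text with one <exclude_gpu> entry for project_url."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    return ET.tostring(root, encoding="unicode")


def excluded_urls(config_text):
    """Parse cc_config.xml text and list the project URLs under <exclude_gpu>."""
    root = ET.fromstring(config_text)
    return [url.text for url in root.findall("./options/exclude_gpu/url")]


config_text = build_cc_config("http://www.worldcommunitygrid.org")
print(config_text)
print(excluded_urls(config_text))
```

Parsing the text back is just a sanity check that the tags are balanced before the client is told to re-read its configuration.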

--

That fixed the issue.
[Nov 5, 2023 1:21:34 PM]