Thread Status: Active
Total posts in this thread: 21
Posts: 21   Pages: 3   [ Previous Page | 1 2 3 | Next Page ]
This topic has been viewed 5465 times and has 20 replies
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 300
Status: Offline
Re: unclear Invalids

This morning I had a WU that ended with an error after some 15 hours:

Result Name: ARP1_ 0034214_ 083_ 1--
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 3221225477 (0xc0000005)</message>
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[17:03:30] INFO: Checkpoint taken at 2018-12-14_06:00:00
[22:15:36] INFO: Checkpoint taken at 2018-12-14_12:00:00
[03:41:26] INFO: Checkpoint taken at 2018-12-14_18:00:00

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0245475A read attempt to address 0x1B18EE14

Engaging BOINC Windows Runtime Debugger...


No idea what the problem is. I have not had such problems with other WCG subprojects, nor with other projects (like LHC).
For some reason, ARP does not run properly on this machine. So I might change back to other WCG subprojects.
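For readers decoding the log above: the decimal exit code and the hexadecimal code in the exception record are the same number, and 0xC0000005 is the Windows NTSTATUS value for an access violation. The sketch below only does that conversion; the function name is made up for illustration, and it says nothing about what caused the crash.

```python
# Windows NTSTATUS value for an access violation (STATUS_ACCESS_VIOLATION)
STATUS_ACCESS_VIOLATION = 0xC0000005


def describe_exit_code(code: int) -> str:
    """Format an exit code as 8-digit hex and label it if it is an access violation.

    Hypothetical helper for illustration; not part of BOINC.
    """
    label = " (access violation)" if code == STATUS_ACCESS_VIOLATION else ""
    return f"0x{code:08x}{label}"


# Exit code exactly as reported in the <message> element of the log
print(describe_exit_code(3221225477))  # prints 0xc0000005 (access violation)
```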
[Aug 20, 2021 3:27:54 AM]
maeax
Advanced Cruncher
Joined: May 2, 2007
Post Count: 144
Status: Offline
Re: unclear Invalids

For some reason, ARP does not run properly on this machine. So I might change back to other WCG subprojects.

The system requirements for ARP are at the limit of what your PC can handle.
To run another project of WCG is a good idea!
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
[Aug 20, 2021 8:41:30 AM]
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 300
Status: Offline
Re: unclear Invalids

To run another project of WCG is a good idea!

I have now changed to MCM. Runs without any problems, as before :-)
[Aug 20, 2021 10:59:23 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: unclear Invalids

erich56

How many threads does your machine have and how many ARP were you running?

Mike
[Aug 20, 2021 5:39:36 PM]
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 300
Status: Offline
Re: unclear Invalids

erich56

How many threads does your machine have and how many ARP were you running?

Mike

The CPU has 4 cores/4 threads, and I had 2 ARP running concurrently.
[Aug 21, 2021 11:41:48 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: unclear Invalids

That shouldn't be a problem. Maybe you haven't enough RAM to run 2.

Mike
[Aug 21, 2021 3:49:30 PM]
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 300
Status: Offline
Re: unclear Invalids

That shouldn't be a problem. Maybe you haven't enough RAM to run 2.

Mike

Windows Task Manager as well as MemInfo show a usage of about 300-400MB RAM per WU.
The few other running apps are minor, so out of the total system RAM of 8 GB, more than half was unused all the time.

I suspect the system components are too old to cope with ARP.
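The RAM figures quoted above can be checked with simple arithmetic. This is a minimal sketch using only the numbers from the post (8 GB total, up to 400 MB per WU, 2 WUs); the function and the `other_mb` parameter are hypothetical, and it covers capacity only, not whether the memory itself is faulty.

```python
def ram_headroom_mb(total_gb: float, per_wu_mb: float, n_wu: int,
                    other_mb: float = 0.0) -> float:
    """Return the RAM left over (in MB) after the given number of work units.

    other_mb is an assumed allowance for other running applications.
    """
    return total_gb * 1024 - per_wu_mb * n_wu - other_mb


# Upper end of the reported 300-400 MB per WU, two WUs, 8 GB system
print(ram_headroom_mb(8, 400, 2))  # prints 7392.0
```

Even at the upper end of the reported usage, the two WUs together account for well under 1 GB, which matches the observation that more than half of the RAM stayed unused.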
[Aug 21, 2021 5:59:47 PM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107
Status: Offline
Re: unclear Invalids

RAM: 8 GB, DDR3, non-ECC, has undergone Memtest recently for a different reason. Test was okay.
The mainboard is an old Fujitsu D3041, Chipset Intel G41
Processor is an old Intel Core2 Quad Q9550 @ 2.83GHz, no overclocking.

This 8 GB is more than enough memory to run 2 ARP1, so something else is going wrong. My Linux Debian 11 machine (Intel Atom N270, 2 GB RAM) is able to run 1 ARP1 just fine and got a Valid, but the CPU is too slow: 7.1 hours CPU time.

I am still kind of guessing it is some sort of hardware problem causing random errors or invalids. Possibly the CPU, possibly the motherboard if it has blown capacitors, or possibly RAM that is mismatched, overheating, or just failing. Non-ECC RAM can cause silent memory errors at any time, undetected and not logged, and these can cause random crashes or invalids. You can pull out half of the memory to check whether the computer runs more stably, then try the other half of the RAM to see which half is more stable.

I have had random reboots and WHEA errors with an Asus B550-E, Ryzen 3900X, and 2x16GB unbuffered ECC 3200MT/s NEMIX memory, caused by my own mistake: wrong DDR4 voltage (use 1.2, not 1.35 volts). It is working much better now.
[Aug 21, 2021 6:16:43 PM]
erich56
Senior Cruncher
Austria
Joined: Feb 24, 2007
Post Count: 300
Status: Offline
Re: unclear Invalids

...
I am still kind of guessing it is some sort of hardware problem causing random errors or invalids. Possibly the CPU, possibly the motherboard if it has blown capacitors, or possibly RAM that is mismatched, overheating, or just failing. ...

This is a fairly old PC, so I am not too surprised that it cannot meet the challenges that ARP poses to a system.
[Aug 22, 2021 1:33:57 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: unclear Invalids

Just one last thought. ARP is very intensive at checkpoints and upload. Did your 2 units happen to do that at about the same time? If so, you only need to keep them apart by occasionally suspending one until their progress shows a difference of, say, 6%. That wouldn't need doing very often.

Mike
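The staggering rule suggested above can be written down as a small decision function. This is only a sketch of the idea: the function, the task names, and the choice to suspend the task that is behind are assumptions for illustration, and the suspending itself would still be done by hand in the BOINC client as described.

```python
from typing import Dict, Optional


def pick_task_to_suspend(progress: Dict[str, float],
                         min_gap: float = 0.06) -> Optional[str]:
    """Decide whether one of two running ARP tasks should be suspended for a while.

    progress maps a task name to its fraction done (0.0 to 1.0).
    Returns the name of the task to suspend, or None if the two tasks
    are already at least min_gap apart.
    """
    (name_a, p_a), (name_b, p_b) = progress.items()
    if abs(p_a - p_b) >= min_gap:
        return None
    # Assumption: suspend the task that is behind, so the gap grows
    # while the other one keeps running.
    return name_a if p_a <= p_b else name_b


# Hypothetical task names and progress values
print(pick_task_to_suspend({"task_A": 0.50, "task_B": 0.52}))  # prints task_A
print(pick_task_to_suspend({"task_A": 0.50, "task_B": 0.60}))  # prints None
```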
[Aug 22, 2021 3:06:41 PM]