Thread Status: Active
Total posts in this thread: 6
Author
This topic has been viewed 3630 times and has 5 replies
XrtX
Cruncher
Joined: Mar 25, 2006
Post Count: 8
Status: Offline
Repeated Computation Errors only on Mapping Cancer Markers

I'm getting repeated computation errors reported in the Tasks view of BOINC. This only seems to happen with Mapping Cancer Markers (MCM) WUs. Right now I can see 7 such failures, each showing 100% progress, with run times ranging from just over a minute to 2 hours 22 minutes. The tasks have just disappeared (presumably uploaded to WCG). Is there a problem with MCM WUs? I have a load lined up and I don't want to waste computing time if they are all subject to some issue.
[Apr 5, 2020 4:26:17 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7844
Status: Offline
Re: Repeated Computation Errors only on Mapping Cancer Markers

Please post a couple of samples from the error messages.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Apr 5, 2020 4:45:34 PM]
XrtX
Cruncher
Joined: Mar 25, 2006
Post Count: 8
Status: Offline
Re: Repeated Computation Errors only on Mapping Cancer Markers

Thanks for the feedback. As far as I can see, all my MCM WUs were getting the computation error, many of them, I think, at the point of completion. I've now turned MCM off (i.e. I'm no longer contributing to the project) and have aborted all remaining MCM WUs on my machine (all 56 of them!). WUs from my other projects don't error out, and I don't know how many MCM WUs had already failed before I noticed. My stats on MCM are not as high as some (8,613,562 points, 10,717 results returned, 4:274:19:36:05 time donated), but this doesn't feel good. If I can help discover anything, please let me know what details you need from me. I usually just let the machine work in the background, but I have been paying a bit more attention since SETI@Home stopped sending work for my GPU.
[Apr 6, 2020 12:00:55 PM]
Dayle Diamond
Senior Cruncher
Joined: Jan 31, 2013
Post Count: 452
Status: Offline
Re: Repeated Computation Errors only on Mapping Cancer Markers

Joe is asking you to go into your list of completed tasks, filter for the ones that errored out, and click on them. Each one opens a new window with much more detail about the problem, which you can copy and paste here.

If the errors are no longer in the system, it may be worthwhile to allow in a few more cancer units so if/when they fail you can revisit this post.
[Apr 7, 2020 8:34:53 AM]
XrtX
Cruncher
Joined: Mar 25, 2006
Post Count: 8
Status: Offline
Re: Repeated Computation Errors only on Mapping Cancer Markers

Thanks for your reply. I decided to leave the MCM project because of the issue I mentioned above, and also because of others that I describe in a new post.
[Apr 12, 2020 9:31:48 AM]
NorthernRaider
Cruncher
Canada
Joined: Dec 10, 2008
Post Count: 12
Status: Offline
Re: Repeated Computation Errors only on Mapping Cancer Markers

XrtX was not very helpful here.

I got the same errors on a whole bunch of work units, but this time from Smash Childhood Cancer. Here are the last 3 WUs: in each one the process got signal 11 (segmentation violation), which likely means that the WUs are bad.

Result Name: SCC1_0003838_FoxO1-A_21638_0--
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

Result Name: SCC1_0003804_Prdm-C_87258_0--
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

Result Name: SCC1_0003773_Prdm-B_10089_0--
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>
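
For anyone comparing a larger batch of failed results, the signal number can be pulled out of the pasted result text with a short script. This is only a rough sketch in Python, assuming the text has the same shape as the three results above; `extract_signal` and `SAMPLE` are made-up names for illustration and are not part of BOINC or WCG.

```python
import re
import signal

# One result block, copied from the text above (assumed to be representative).
SAMPLE = """<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>"""


def extract_signal(result_text):
    """Return (number, name) for the signal reported in the message, or None."""
    match = re.search(r"process got signal (\d+)", result_text)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        # Look up the symbolic name known to this operating system.
        name = signal.Signals(number).name
    except ValueError:
        name = "unknown"
    return number, name


if __name__ == "__main__":
    print(extract_signal(SAMPLE))
```

If every failed result reports the same signal, that is at least consistent with a common cause, although it does not by itself tell you whether the work units or the host are at fault.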

I have 20 GB of disk available to BOINC, of which only about 3 GB is in use. The machine has 32 GB of memory, of which only 20% is in use, and 8 CPUs, of which only 6 are in use at the moment.
You could look up the WUs in the logs, and I will dig up some more to see if it is a system problem at my end.

UPDATE:
I had 5 SCC1 WUs in the queue.
They all started and then quit with a computation error. I bet they are all the SIGSEGV computation error that shows in the WCG log. CONFIRMED after transmit!
They were transmitted back to WCG, and here is the part of the BOINC log that shows the missing output files:

Mon 13 Apr 2020 07:26:32 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:33 PM EDT | World Community Grid | Output file SCC1_0003829_FoxO1-A_78908_0_r1617702272_0 for task SCC1_0003829_FoxO1-A_78908_0 absent
Mon 13 Apr 2020 07:26:33 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:34 PM EDT | World Community Grid | Started upload of MCM1_0162034_6606_1_r1609257074_0
Mon 13 Apr 2020 07:26:34 PM EDT | World Community Grid | Output file SCC1_0003778_Prdm-B_84465_1_r521594503_0 for task SCC1_0003778_Prdm-B_84465_1 absent
Mon 13 Apr 2020 07:26:34 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:35 PM EDT | World Community Grid | Output file SCC1_0003772_Prdm-B_54836_1_r1181842435_0 for task SCC1_0003772_Prdm-B_54836_1 absent
Mon 13 Apr 2020 07:26:35 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:36 PM EDT | World Community Grid | Finished upload of MCM1_0162034_6606_1_r1609257074_0
Mon 13 Apr 2020 07:26:36 PM EDT | World Community Grid | Output file SCC1_0003804_Prdm-C_88955_0_r1259533660_0 for task SCC1_0003804_Prdm-C_88955_0 absent
Mon 13 Apr 2020 07:26:36 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:37 PM EDT | World Community Grid | Output file SCC1_0003772_Prdm-B_54842_1_r609797721_0 for task SCC1_0003772_Prdm-B_54842_1 absent
Mon 13 Apr 2020 07:26:37 PM EDT | | [mem_usage] enforce: available RAM 24050.60MB swap 49626.40MB
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] ARP1_0016230_008_0: WS 742.51MB, smoothed 742.51MB, swap 815.76MB, 0.00 page faults/sec, user CPU 53219.750, kernel CPU 39.320
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] ARP1_0010974_008_1: WS 742.05MB, smoothed 742.05MB, swap 815.76MB, 0.00 page faults/sec, user CPU 51869.840, kernel CPU 37.340
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] ARP1_0034711_008_0: WS 742.06MB, smoothed 742.06MB, swap 815.76MB, 0.00 page faults/sec, user CPU 35291.110, kernel CPU 19.330
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] MCM1_0162036_1741_1: WS 85.52MB, smoothed 85.52MB, swap 86.35MB, 0.00 page faults/sec, user CPU 7277.610, kernel CPU 4765.910
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] MCM1_0161971_0040_1: WS 73.93MB, smoothed 73.93MB, swap 74.70MB, 0.00 page faults/sec, user CPU 3344.580, kernel CPU 2112.880
Mon 13 Apr 2020 07:26:42 PM EDT | World Community Grid | [mem_usage] MIP1_00289572_6543_0: WS 66.85MB, smoothed 33.42MB, swap 133.16MB, 0.00 page faults/sec, user CPU 2.940, kernel CPU 0.080
Mon 13 Apr 2020 07:26:42 PM EDT | | [mem_usage] BOINC totals: WS 2452.92MB, smoothed 2419.49MB, swap 2741.49MB, 0.00 page faults/sec
Mon 13 Apr 2020 07:26:42 PM EDT | | [mem_usage] All others: WS 4311.16MB, swap 271351.43MB, user 5738.470s, kernel 3786.640s
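
To see at a glance which project the missing output files belong to, the "absent" lines can be counted per task-name prefix. Again this is just a sketch in Python, assuming the log lines look like the ones above; `count_absent_outputs` and `SAMPLE_LOG` are hypothetical names, and the prefix is simply whatever comes before the first underscore in the task name.

```python
import re
from collections import Counter

# Matches event log lines of the form
# "... | Output file <file> for task <task> absent"
ABSENT = re.compile(r"Output file \S+ for task (\S+) absent")

# Three lines copied from the log above.
SAMPLE_LOG = [
    "Mon 13 Apr 2020 07:26:33 PM EDT | World Community Grid | Output file SCC1_0003829_FoxO1-A_78908_0_r1617702272_0 for task SCC1_0003829_FoxO1-A_78908_0 absent",
    "Mon 13 Apr 2020 07:26:34 PM EDT | World Community Grid | Started upload of MCM1_0162034_6606_1_r1609257074_0",
    "Mon 13 Apr 2020 07:26:34 PM EDT | World Community Grid | Output file SCC1_0003778_Prdm-B_84465_1_r521594503_0 for task SCC1_0003778_Prdm-B_84465_1 absent",
]


def count_absent_outputs(log_lines):
    """Count 'output file absent' messages per task-name prefix (e.g. SCC1)."""
    counts = Counter()
    for line in log_lines:
        match = ABSENT.search(line)
        if match:
            task_name = match.group(1)
            prefix = task_name.split("_", 1)[0]
            counts[prefix] += 1
    return counts


if __name__ == "__main__":
    for prefix, count in count_absent_outputs(SAMPLE_LOG).items():
        print(prefix, count)
```

Run against a full event log, a tally like this would show whether only SCC1 tasks lose their output files while MCM1 and the others upload normally.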

Let me know if you have an idea on how to solve this.
----------------------------------------
[Edit 4 times, last edit by DutchRaider at Apr 13, 2020 11:55:09 PM]
[Apr 13, 2020 11:06:30 PM]