pvh513
Senior Cruncher
Joined: Feb 26, 2011
Post Count: 260
heap corruption in one of my WUs

My WU ugm1_ugm1_00446_0851_1 crashed due to heap corruption:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Unable to open checkpoint file starting from 0
500 query sequences compared.
1000 query sequences compared.
1500 query sequences compared.
2000 query sequences compared.
2500 query sequences compared.
3000 query sequences compared.
3500 query sequences compared.
4000 query sequences compared.
4500 query sequences compared.
5000 query sequences compared.
5500 query sequences compared.
*** glibc detected *** ../../projects/www.worldcommunitygrid.org/wcgrid_ugm1_7.22_x86_64-pc-linux-gnu: free(): invalid next size (normal): 0x0000000002d1ba00 ***
======= Backtrace: =========
[0x510d42]
[0x5137af]
[0x51648b]
[0x516850]
[0x411f34]
[0x415280]
[0x4028d7]
[0x44d825]
[0x4e4cdb]
[0x400449]
======= Memory map: ========
00400000-005f8000 r-xp 00000000 09:01 5636714 /data1/pvh/BOINC/projects/www.worldcommunitygrid.org/wcgrid_ugm1_7.22_x86_64-pc-linux-gnu
007f7000-00804000 rw-p 001f7000 09:01 5636714 /data1/pvh/BOINC/projects/www.worldcommunitygrid.org/wcgrid_ugm1_7.22_x86_64-pc-linux-gnu
00804000-0083d000 rw-p 00000000 00:00 0
01670000-03b57000 rw-p 00000000 00:00 0 [heap]
2b9ba1e70000-2b9ba1e72000 rw-s 00000000 09:01 5636559 /data1/pvh/BOINC/slots/4/boinc_mmap_file
2b9ba1e72000-2b9ba1e73000 ---p 00000000 00:00 0
2b9ba1e73000-2b9ba1e7a000 rw-p 00000000 00:00 0 [stack:10630]
2b9ba1e7a000-2b9ba1e7b000 rw-s 00000000 09:01 5636613 /data1/pvh/BOINC/slots/4/boinc_ugm1_4
2b9ba1e7b000-2b9ba27a5000 rw-p 00000000 00:00 0
2b9ba3058000-2b9ba3059000 rw-p 00000000 00:00 0
7fff17227000-7fff17248000 rw-p 00000000 00:00 0 [stack]
7fff173df000-7fff173e1000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
SIGABRT: abort called
Stack trace (15 frames):
[0x4657cd]
[0x4e34a0]
[0x4e336b]
[0x4ed725]
[0x50b227]
[0x510d42]
[0x5137af]
[0x51648b]
[0x516850]
[0x411f34]
[0x415280]
[0x4028d7]
[0x44d825]
[0x4e4cdb]
[0x400449]

Exiting...

</stderr_txt>
]]>


This is under openSUSE 13.1, kernel 3.11.10, glibc version 2.18. Has anybody else seen these?
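For anyone reading the log above, a quick way to make sense of the raw addresses is to look them up in the memory map that glibc printed. The following is only a minimal sketch: the helper names `parse_map` and `locate` are made up for illustration, and the map text is a subset of the lines in the log. It suggests that the pointer passed to `free()` lies inside the `[heap]` mapping, while the backtrace frames fall inside the executable's own `r-xp` mapping. With no separate libc mapping listed, that would be consistent with a statically linked binary, though nothing in this thread confirms it.

```python
import re

# Selected lines copied from the "Memory map" section of the log above.
MEMORY_MAP = """\
00400000-005f8000 r-xp 00000000 09:01 5636714 /data1/pvh/BOINC/projects/www.worldcommunitygrid.org/wcgrid_ugm1_7.22_x86_64-pc-linux-gnu
007f7000-00804000 rw-p 001f7000 09:01 5636714 /data1/pvh/BOINC/projects/www.worldcommunitygrid.org/wcgrid_ugm1_7.22_x86_64-pc-linux-gnu
00804000-0083d000 rw-p 00000000 00:00 0
01670000-03b57000 rw-p 00000000 00:00 0 [heap]
7fff17227000-7fff17248000 rw-p 00000000 00:00 0 [stack]
"""

# start-end, permissions, offset, device, inode, optional label
LINE_RE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def parse_map(text):
    """Turn memory-map lines into (start, end, perms, label) tuples."""
    regions = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            start, end, perms, label = m.groups()
            regions.append(
                (int(start, 16), int(end, 16), perms, label or "(anonymous)")
            )
    return regions


def locate(addr, regions):
    """Return (perms, label) of the region containing addr, or None."""
    for start, end, perms, label in regions:
        if start <= addr < end:
            return perms, label
    return None


if __name__ == "__main__":
    regions = parse_map(MEMORY_MAP)
    # The pointer from the "free(): invalid next size" message,
    # followed by a few frames from the backtrace.
    for addr in (0x2D1BA00, 0x510D42, 0x411F34, 0x400449):
        print(hex(addr), locate(addr, regions))
```

Turning the frame addresses into function names would additionally need an unstripped build of the application, which volunteers do not have.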
----------------------------------------
[Edit 1 times, last edit by pvh513 at Oct 25, 2014 7:37:09 PM]
[Oct 25, 2014 7:35:56 PM]
pvh513
Re: heap corruption in one of my WUs

Update: one of my wingmen on the same WU had the exact same crash (ugm1_ugm1_00446_0851_2), so this must be a bug in the client.
[Oct 28, 2014 12:42:04 AM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Re: heap corruption in one of my WUs

We've recreated this error locally and are working on a resolution. Interestingly, one of your wingmen was able to run this work unit to completion, so it doesn't seem to be a 100% failure and it seems to be specific to this (and maybe other) work units.

Seippel
[Oct 29, 2014 3:36:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: heap corruption in one of my WUs

Maybe whatever is learned about how code 193 occurs for ugm can be cross-pollinated to explain why it is now the second most frequent cause of failure for fahv on Android. On my tablet it is definitely in second place, after signal 11. Any other failures are rarities so far.
[Oct 29, 2014 3:40:29 PM]
seippel
Re: heap corruption in one of my WUs

Although we were able to recreate this locally, tracking the bug down has been difficult. For the moment, we're adding this to our backlog so we can focus on other priorities, given the extremely low occurrence of this error.

Seippel
[Nov 6, 2014 4:56:05 PM]
pvh513
Re: heap corruption in one of my WUs

Have you run the code through valgrind? It is very good at finding the source of heap corruption.
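To make the suggestion concrete, here is a rough sketch of how such a run could be scripted. The wrapper functions `memcheck_command` and `run_memcheck` and the binary path `./wcgrid_ugm1` are placeholders of mine, not anything taken from the project's build. The Valgrind options shown select Memcheck, record deeper call stacks, and make Valgrind return a distinctive exit code when it reports errors.

```python
import shutil
import subprocess


def memcheck_command(binary, args=()):
    """Build a Valgrind Memcheck command line for the given program."""
    return [
        "valgrind",
        "--tool=memcheck",        # the heap-checking tool (also the default)
        "--num-callers=30",       # deeper stacks than the default
        "--error-exitcode=99",    # exit with 99 if Memcheck reports errors
        binary,
        *args,
    ]


def run_memcheck(binary, args=()):
    """Run the program under Memcheck; return None if valgrind is missing."""
    if shutil.which("valgrind") is None:
        return None
    return subprocess.run(
        memcheck_command(binary, args),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # "./wcgrid_ugm1" is a hypothetical path to a debug build of the app.
    print(" ".join(memcheck_command("./wcgrid_ugm1")))
```

Memcheck typically reports an invalid write at the point where a buffer is overrun, rather than at the later `free()` where glibc notices the damage, which is why it tends to help with this kind of crash. It works best on a build with debug symbols.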
[Nov 8, 2014 11:33:18 PM]