adriverhoef
Master Cruncher · The Netherlands · Joined: Apr 3, 2009 · Post Count: 2346
Al,
It's easy to read a post and forget about the time the author spent before typing it: studying the material, thinking it over and writing up the article. Your article was impressive, not too long, concise enough to fit on one page of reading material, understandable for most people (I hope), and it spoke about the internals of a BOINC server. My compliments. It must have cost you quite some time to delve into (this specific part of) the sources. Thanks for the clarification!

Sgt.Joe wrote: "in my results I show 495 completed SCC units with 103 of them listed as 'error'."

While I understand this is not a contest of any sort, my results (about 65-70 pages) show a mix of 874 S(uccess)(*1), 108 E(rror)(*2) and 27 W(orkunit error)(*3).

Sgt.Joe wrote: "the faulty work units are still being created."

That's a correct observation, Sgt.Joe. They might want to let it blow out while the system is holding up. They are probably also looking for a way to assess the loss of everyone's reliable status while the storm is blowing over.

From the pace at which workunits are being received, we can estimate when this batch (0004176) should - or better, could - all be over. From my observations, the situation seemed to 'stabilize' (FWIW), or is stable enough to be called 'stable', at 00:00 UTC this Sunday (morning). So that time makes a good starting point. Next, we need to know which sequences were being distributed at that time. My records say they were sequences 22087 and 22096 (see post 686913, returned to the server at 2 minutes past 00:00 UTC this Sunday).

While writing this it is almost 15:00 UTC and I am still seeing a slow pace: 24833 at 12:03 UTC, 25207 at 13:23 UTC, 25602 at 14:51 UTC. That's roughly, optimistically, 400 sequences in 80 minutes, or 5 per minute. At 15:00 UTC, looking back at 14:51, this would mean we reach sequence 25602 + (9 minutes left till 15:00 * 5 per minute) = 25602 + 45 = 25647.

Does that match the past 15 hours? Let's see, computing the difference between the sequence numbers at 15:00 and at 00:00: 25647 - 22087 = 3560. Now, 3560 sequences in 15 hours is a pace of 237⅓ per hour (about 4 per minute), or 5696 per day. Expectations are that 99999 will be the last sequence of this batch, so there are still 99999 - 25647 = 74352 sequences to go. That's more than 12 days (12 * 6000 per day = 72000). Let's say the pace increases a bit to 6 per minute (pure speculation, of course): 6/min. * 60 min. * 24 hours = 8640 per day, and then 74352 / 8640 = 8.6 days. So speeding up the distribution a little could(*4) considerably reduce the time it will take to complete this whole faulty batch.

[*1] incl. Pendings
[*2] Server Aborted, User Aborted and Computation Error
[*3] Too Late (thanks to the 'HOT FIX')
[*4] Again, pure speculation

Adri

EDIT: It is Sunday today, not Saturday - so I should have written "this Sunday", not "this Saturday" - I've corrected it now.

[Edit 2 times, last edit by adriverhoef at Jun 4, 2023 7:07:34 PM]
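(For anyone who wants to replay the arithmetic, here is a minimal Python sketch of the same estimate. The helper name eta_days is made up for illustration; the numbers are the observations quoted above, and the 6-per-minute scenario is the same pure speculation as footnote *4.)

```python
# Back-of-the-envelope ETA for the remainder of batch 0004176,
# derived from two observed (hour, sequence) samples.

def eta_days(seq_start, seq_now, hours_elapsed, seq_last=99999):
    """Days until seq_last at the pace observed so far."""
    per_hour = (seq_now - seq_start) / hours_elapsed  # 3560 / 15 = 237.33/hour
    return (seq_last - seq_now) / (per_hour * 24)     # remaining / per-day rate

# Sequence 22087 at 00:00 UTC, extrapolated 25647 at 15:00 UTC:
print(round(eta_days(22087, 25647, 15), 1))   # 13.1 days at ~5696/day

# Speculative faster pace of 6 sequences per minute (8640/day):
print(round((99999 - 25647) / (6 * 60 * 24), 1))   # 8.6 days
```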
Sgt.Joe
Ace Cruncher · USA · Joined: Jul 4, 2006 · Post Count: 7846
Adri:
Interesting analysis. I hadn't really thought about how long it would take for this batch to go through the system, but based on your figures it should be less than 2 weeks. You may have hit on the rationale for the way they are dealing with the problematic batch: just letting all of them blow through the system and then correcting the entire run all at once after the last item crashes out. Potentially, then, we will know the answer in due time. On the bright side, about 80% are doing just fine.

Cheers
Sgt. Joe
*Minnesota Crunchers*
alanb1951
Veteran Cruncher · Joined: Jan 20, 2006 · Post Count: 1317
Adri,
Regarding your recent reply about communication [this is for information, not a complaint!]...

The main point of my post to which you responded was meant to be the technical stuff, not the comment about WCG and information :-) In fact, I cut out quite a large section about the problems inherent in a project team knowing there might be problems, the specific case of WCG's messy forum structure not really having a single obvious place that is specific to reporting failing tasks, and whether there was an easy solution... If I'd retained that section, the post would've failed your "one page of reading material" test by quite a lot :-)

However, your observations about users playing their part in the communication process made up for that (and were probably better phrased than some of mine!) - thanks for that :-)

By the way, my footnote [1] in the post in question was addressing the same point you made at various places in the reply - perhaps it lost something when I culled the bulk of that subject...

Cheers - Al.

P.S. When still employed, I sometimes used in-house stuff where I knew about problems (as an end user) long before the folks in an adjacent office who were responsible for that particular system became aware! Fault detection at the service end isn't always easy :-)

[Edit 1 times, last edit by alanb1951 at Jun 4, 2023 6:13:04 PM]
alanb1951
Veteran Cruncher · Joined: Jan 20, 2006 · Post Count: 1317
Adri,
Interesting analysis of the recent flow of WUs. Nice to know I wasn't hallucinating about the scrambled nature of work-unit ID allocation for SCC1, even if it does suggest that killing a set of WUs for a specific target is a job for the Mission Impossible team :-)

Cheers - Al.
adriverhoef
Master Cruncher · The Netherlands · Joined: Apr 3, 2009 · Post Count: 2346
Al:
Al wrote: "The main point of my post to which you responded was meant to be the technical stuff, not the comment about WCG and information :-)"

Acknowledged, not to worry, all understood.

And now for something completely different(*1), or rather, something almost completely on-topic. We've recently seen batches of type C, numbered 0004175 and 0004176. With this at the back of our minds, I was - by chance - looking at some output from my scripts, just two hours ago, and couldn't help noticing some 'unseen' (read: new) batches, which are now looming on the horizon:

workunit 314811854 SCC1_0004174_MyoD1-C_1315_0 Waiting to be sent...

And while unseen batch 0004174 is only 1 step away from 0004175 (see the start of this thread), just like the current batch 0004176, there is an even bigger distance to another unseen batch, like this one:

workunit 314684240 SCC1_0004165_MyoD1-C_1293_0 Waiting to be sent...

(And perhaps there are more unseen, new batches; I can't tell yet at this moment.) The situation may be that the SCC1 scientists have already uploaded more faulty batches, so that these have already (at least partly) been injected into the 'bloodstream' of BOINC, server-side. We can only wait for them and see what's going to happen with them, or perhaps ask TigerLily if it's possible to let the techs examine this situation - only one task from each new batch is enough: two 'unseen' batches at the moment, so examining two separate tasks is enough.

Anyway, the difference between batches 0004165 and 0004175 is ten. Who knows what lies in between. Or perhaps a better phrasing (c|sh|w)ould be: are these also faulty batches?

Adri

[*1] I've tried to include some video footage, but couldn't quickly find a suitable fragment.

PS I've already adjusted my script to abort any faulty task from any batch of type (MyoD1-)C, should they arrive. The pattern is not "SCC1_0004176_MyoD1-C_.*_.$" anymore; it has changed into "SCC1_000...._MyoD1-C_.*_.$". Just as a precaution.

[Edit 1 times, last edit by adriverhoef at Jun 5, 2023 9:53:38 AM]
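(To show what that widened pattern catches, here is a minimal sketch using Python's re module as a stand-in for whatever does the matching in the actual script; the three task names are examples taken from this thread.)

```python
import re

# Old pattern: abort only batch 0004176; new pattern: any MyoD1-C batch.
old = re.compile(r"SCC1_0004176_MyoD1-C_.*_.$")
new = re.compile(r"SCC1_000...._MyoD1-C_.*_.$")

for task in ("SCC1_0004176_MyoD1-C_28836_0",   # current faulty batch
             "SCC1_0004174_MyoD1-C_1315_0",    # 'unseen' batch
             "SCC1_0004165_MyoD1-C_1293_0"):   # another 'unseen' batch
    print(task, bool(old.match(task)), bool(new.match(task)))
# old matches only the first name; new matches all three.
```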
Hans Sveen
Veteran Cruncher · Norway · Joined: Feb 18, 2008 · Post Count: 984
Hi!
Adri, if you are interested, I just got this WU from batch 4176: SCC1_0004176_MyoD1-C_28836, created May 29, 2023 - 15:08 UTC, so they are still around, ready to error out!!

Hans S.

PS. And because of the error, no new SCC received!!

[Edit 1 times, last edit by Hans Sveen at Jun 5, 2023 7:58:24 AM]
adriverhoef
Master Cruncher · The Netherlands · Joined: Apr 3, 2009 · Post Count: 2346
Hans:
Hans Sveen wrote: "And because of the error, no new SCC received!!"

That's probably caused by a lack of sufficient supply. When I went to bed, on the computer where I'm aborting any faulty task, my queue was still growing, but when I woke up, the size of my SCC1-queue had shrunk by more than 50%. On the computers where I don't abort the faulty tasks there was hardly any decrease in the number of SCC1-tasks (with a 0.7-day queue). The number of SCC1-tasks seems to be stabilizing at the moment. It's a matter of having sufficient MCM1-tasks in my queue(s).

PS In this phase it would be more interesting to notice when a C-type task didn't error out.
KerSamson
Master Cruncher · Switzerland · Joined: Jan 29, 2007 · Post Count: 1684
145 errored WUs on my side; batch: SCC1_0004176_MyoD1-C

Cheers,
Yves
sptrog1
Master Cruncher · Joined: Dec 12, 2017 · Post Count: 1592
4 more batch 0004176 errors received today. Is this because of an error in the program for the batch?
yoro42
Ace Cruncher · United States · Joined: Feb 19, 2011 · Post Count: 8979
ATOM 62 - sounds like an old TV show!

Running Windows 11 Pro, approx 25 GB memory available at time of failure... Result log & properties follow:

Result log:
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
INFO: result number = 0
INFO: No state to restore. Start from the beginning.
[21:29:21] Number of tasks = 1
[21:29:21] Running task 0, CPU time at start of task 0 was 0.000000
[21:29:21] ./cmpd-1130725.pdbqt size = 19 3
../../projects/www.worldcommunitygrid.org/scc1.MyoD1-C.pdbqt size = 1268 0
Parse error on line 190 in file "..\..\projects\www.worldcommunitygrid.org\60fef8d136128d73bc38a1c07d4b6f66.pdbqt": ATOM syntax incorrect: "62 " is not a valid atom number
VINA failed. rc = 1. Exiting
</stderr_txt>
]]>

Properties:
Application: Smash Childhood Cancer 7.18
Name: SCC1_0004176_MyoD1-C_28842
State: Computation error
Received: 6/5/2023 3:23:30 AM
Report deadline: 6/11/2023 3:23:30 AM
Estimated computation size: 36,225 GFLOPs
CPU time: ---
Elapsed time: ---
Executable: wcgrid_scc1_vina_7.18_windows_x86_64

[Edit 1 times, last edit by yoro42 at Jun 6, 2023 6:57:51 AM]
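(For what it's worth, that stderr output points at a malformed ATOM record in one of the batch's .pdbqt input files, not at the Vina executable itself. Below is a rough, hypothetical checker for such files. It assumes PDBQT keeps the PDB convention of a right-justified atom serial number in columns 7-11, so a left-shifted field like the "62 " in the log would be flagged; the strictness is illustrative and not a claim about how Vina actually parses.)

```python
import re

# A well-formed serial is right-justified in columns 7-11, e.g. "   62".
# A left-shifted "62   " (as in the log above) fails this test.
SERIAL = re.compile(r"^ *\d+$")

def check_atom_serials(path):
    """Print ATOM/HETATM lines whose serial-number field looks wrong."""
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            if line.startswith(("ATOM", "HETATM")):
                field = line[6:11]
                if not SERIAL.match(field):
                    print(f'line {lineno}: "{field}" is not a valid atom number')

# The hashed filename is the one reported in the stderr output above.
check_atom_serials("60fef8d136128d73bc38a1c07d4b6f66.pdbqt")
```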