World Community Grid Forums
Thread Status: Active | Total posts in this thread: 101
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
Soon I will be running out of SCC1 tasks, apart from faulty batch 0004176. Faulty? Not entirely; you can fix it yourself! I did that with two tasks and one of them went Valid (see post 686839), just by putting the single missing space back between "ATOM" and "62" in a file.

So what I will do now is repair the remaining erroneous tasks from batch 0004176 in my queue, in the hope that someone else will do as I did, so that the two partnered tasks (wingmen) will match and both go Valid. If you are also running out of SCC1 tasks and are left with defective ones from batch 0004176, just give it a try. There is a tiny chance that you will find a wingman as I did. All you need to do is this as superuser:

# cd ~boinc/projects/www.worldcommunitygrid.org

Adri
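[Editorial sketch] The repair described above boils down to inserting one space. The sketch below is illustrative only: 'flexfile.sample' is a hypothetical stand-in (the real data file's name differs per task and is not given in the post), and the exact text surrounding "ATOM62" is an assumption.

```shell
# Illustration only: flexfile.sample stands in for the real data file,
# and the broken record's layout is an assumption based on the post.
printf 'ATOM62 1H3 MET ...\n' > flexfile.sample   # the broken record
sed -i 's/ATOM62/ATOM 62/' flexfile.sample        # insert the missing space
head -1 flexfile.sample                           # line now begins "ATOM 62"
```

Back up the original file first (e.g. `sed -i.bak`) so the unmodified task data can be restored if the repair does not validate.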
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1317 | Status: Offline
Adri,
I wondered about doing that the first time round, but opted against it because I couldn't be sure there wasn't also something else wrong with the file that didn't show as a syntax error... So I just hope that is the only error in that data file :-) One side-effect of users fixing the data file might be to disguise the problem.

[Edit:] I thought they had suspended SCC1 to do some clean-up, as there didn't seem to be any new SCC1 of any type for quite a long time... However, new SCC1 tasks started turning up late this afternoon, so perhaps it was just an overnight precaution (their time, not UTC...)

I'm more concerned about how a second bad batch got turned into active WUs after they'd had to deal with the first one -- if it had already been delivered by the scientists, could it not have been checked[1] (and either repaired before WU generation or suppressed, as appropriate!); if it was a new delivery, why hadn't the scientists checked the flex file and repaired it before shipping? And if there are still more bad batches already in the pipeline, I hope they get culled or cured in advance :-)

Cheers - Al.

[1] I don't know how automated the process of accepting SCC1 work and making WUs is, so that might not be as easy as it sounds :-(

[Edited in light of the [apparent] resumption of SCC1 supply, including bad batch cases...]
[Edit 1 times, last edit by alanb1951 at Jun 2, 2023 9:09:24 PM]
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 2173 | Status: Offline
Well, half a day later, as we are definitely heading into the weekend, WCG is still pushing out the new faulty SCC1 batch. Just like last time.

And from WCG Towers, still crickets. Makes me wonder if their strategy is to just let the batch run until every task has errored out on the users' side, instead of cancelling it on the server side before it wastes anyone's bandwidth...

Ralf
hchc
Veteran Cruncher | USA | Joined: Aug 15, 2006 | Post Count: 865 | Status: Offline
Clever workaround, adriverhoef, but yeah, it won't address the root cause of what went wrong in the first place, and the odds are that the whole batch will be invalidated and re-issued anyway.
hchc
Veteran Cruncher | USA | Joined: Aug 15, 2006 | Post Count: 865 | Status: Offline
I can't abort these 4176 tasks fast enough. Keep getting sent new ones. Are WCG techs asleep at the wheel*?
* That's a joke. I'll be here all night.
Speedy51
Veteran Cruncher | New Zealand | Joined: Nov 4, 2005 | Post Count: 1326 | Status: Offline
I can't abort these 4176 tasks fast enough. Keep getting sent new ones. Are WCG techs asleep at the wheel*? * That's a joke. I'll be here all night.

To save you being there all night: have you thought about using BoincTasks? It will let you cancel all tasks that are ready to start. I do recommend setting "no new tasks" before cancelling the tasks waiting to start :-)
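[Editorial sketch] For those without BoincTasks, the stock boinccmd tool (shipped with the BOINC client) can do much the same from a shell. This is only a sketch: the project URL and the assumption that every bad task name contains "SCC1_0004176" should be checked against your own client before running it.

```shell
# Sketch: set "no new tasks", then abort queued tasks from the faulty batch.
# The URL and the name pattern are assumptions; adjust for your setup.
URL=https://www.worldcommunitygrid.org
boinccmd --project "$URL" nomorework            # stop fetching new work first
boinccmd --get_tasks \
  | grep -oE 'SCC1_0004176[A-Za-z0-9_.-]*' \
  | while read -r task; do
      boinccmd --task "$URL" "$task" abort      # abort each matching task
    done
```

Note that this aborts every matching task, including any already running; restrict the list first if you only want to drop the ones that have not started.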
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
At the moment - as I see it - if you abort your faulty tasks (from batch 0004176), your wingmen's tasks will be Server Aborted:

<8> * SCC1_0004176_MyoD1-C_6035_0  Fedora Linux  User Aborted  2023-06-02T03:22:07  2023-06-03T00:05:59

So there isn't much point anymore in fixing these faulty tasks and getting them to work: as soon as a repaired (and finished) task is returned, the server will Server Abort all wingmen's tasks (if they're not running yet), so the mended task will be marked Too Late sooner or later:

<15> * SCC1_0004176_MyoD1-C_1087_0  Fedora Linux  Too Late  2023-06-01T21:40:10  2023-06-02T20:50:37

Adri
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1317 | Status: Offline
Adri,
Thanks for posting about those, as I'd been aborting any I spotted but hadn't followed up to see what happened to them!

It looks as if something was done about these bad WUs some time between about 17:00 and 19:00 UTC on 2nd June (WCG afternoon shift?), as any tasks of mine that failed (or that I aborted, if I spotted them) before that period ended up with retries (up to about the same time interval), whereas after that, any tasks that were sent back (or aborted) didn't get retries!

As for those two examples, I think "Too Late" may also appear for returned tasks that are "Don't need" cases, and as retries don't seem to be going out for MyoD1-C tasks any longer, and tasks already out there are being Server Aborted, it looks as if they may have [finally] marked the bad work units as unwanted!

An unwelcome current side-effect of whatever they've done is that the only available SCC1 work now seems to be retries for MyoD1-A/B work-units :-( -- I hope they post something about what is happening regarding the ongoing problems with MyoD1-C batches[1]...

Cheers - Al.

[1] And if that includes the information that the only thing wrong with the flex file was that missing space, it might legitimize your work-around :-) -- not that tampering with data files should ever be acceptable, even in what seems to be a good cause... :-) :-)

[Edit 1 times, last edit by alanb1951 at Jun 3, 2023 5:58:26 AM]
Crystal Pellet
Veteran Cruncher | Joined: May 21, 2008 | Post Count: 1403 | Status: Offline
I gave the HOT FIX a try on a Win10 machine. It's a quorum-1 workunit and still running.

https://www.worldcommunitygrid.org/contribution/workunit/312168628

EDIT: all in vain - Too Late / Quorum 1, Replication 2

[Edit 1 times, last edit by Crystal Pellet at Jun 3, 2023 10:06:20 AM]
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
Al,
An unwelcome current side-effect of whatever they've done is that the only available SCC1 work now seems to be retries for MyoD1-A/B work-units :-(

That may be the result of (type A and B) tasks needing wingmen to resolve the "unreliable" status that you get after processing each and every task from the faulty batch. All the faulty tasks together are creating a hausse (upturn) in tasks (of types A and B) needing verification. Also, type C is still being sent out at a slow pace, because of the resends for types A and B needing verification. Important: the system is holding up and still hasn't collapsed. Also, new workunits for types A and B are being distributed, albeit still sparsely.

[if] the only thing wrong with the flex file was that missing space, it might legitimize your work-around :-) -- not that tampering with data files should ever be acceptable, even in what seems to be a good cause... :-) :-)

Agreed. It seemed like a good idea at first, but in the end it only led to a lot of wasted cycles (and one Valid(*1)). It should probably never be acceptable in any way, other than to point out and document the error.

Adri

[*1] (Output generated by 'wcgstats -frrre* SCC1_0004176_MyoD1-C_0299')
workunit 311931323
SCC1_0004176_MyoD1-C_0299_0  Fedora Linux  Valid  2023-06-01T21:21:18  2023-06-02T09:46:32  0.77/0.78  69.0/69.0

PS Crystal Pellet, nice try!

[Edit 2 times, last edit by adriverhoef at Jun 3, 2023 11:16:18 AM]