World Community Grid Forums
Thread Status: Active | Total posts in this thread: 101
Unixchick
Veteran Cruncher | Joined: Apr 16, 2020 | Post Count: 1296 | Status: Offline
I got a few errors from "C" tasks, and am no longer getting any SCC work. My client keeps asking, but none is being sent.
Am I not getting SCC because of my errors, or are other people not seeing SCC without having had errors? Is it me, or the system?
Mike.Gibson
Ace Cruncher | England | Joined: Aug 23, 2007 | Post Count: 12594 | Status: Offline
Loads of 4174s and one 4165 here have errored multiple times.
Mike
Mike.Gibson
Ace Cruncher | England | Joined: Aug 23, 2007 | Post Count: 12594 | Status: Offline
I notice that when I get a batch of ATOM 62 errors, they error in quick succession, so a number of them get uploaded together. However, my cache is only replenished one task at a time, and spasmodically at that. In between I get the dreaded "Tasks are committed to other platforms" message.
Mike
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
Unixchick, the number of new SCC1 tasks being distributed is dropping fast, because the server (a) needs to find reliable clients - the number of tasks needing verification keeps growing as more and more clients get marked unreliable after tasks from the faulty batch (still with Replication > 0) error out on them immediately - and (b) needs to abort (Server Abort) tasks from that faulty batch that would come back 'Too Late' anyway.
So SCC1 tasks are still being distributed, but the system has difficulty finding reliable clients. This is the same situation as reported in post 686894. The good news is that the server is holding up. Still, in this situation I think it is a good idea to abort (User Abort) the faulty tasks that you receive: you will lose your reliability status if you execute a faulty task, and as long as you have a reliable client your tasks don't need verification, which gives the server more breathing room and a better chance of sending some tasks to you.
Adri
NixChix
Veteran Cruncher | United States | Joined: Apr 29, 2007 | Post Count: 1187 | Status: Offline
I don't understand why this problem is not being addressed by WCG staff.
----------------------------------------
Cheers
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1317 | Status: Offline
Adri,
From your post 687153 from a few hours back:

    Nice to see that somebody else (see task _3 below) also (probably automatically(*1) (see post 686915)) aborts incoming tasks from the 'new' faulty batch 0004174

In this case that would've been me :-) I've written a Python script that scans client_state.xml for tasks that could be from invalid work-units, finds the specific flex file, checks it for the fault and invokes boinccmd to abort the task if appropriate. Here's a sample from its log on one of my machines (times are BST [UTC+1] [*1]):

2023-06-08 19:17:11 - SCC1_0004176_MyoD1-C_50786_0: aborted.

If/when it sees a MyoD1-C task that doesn't have the bad flex file, the script will report "valid file!" and leave the task to run :-)

Your logic for aborting parallels mine, and the effect is obvious... The machine from which that log snippet is taken typically returns about 100 valid SCC1 tasks a day; since I introduced the script I've not had any Errors (as expected), so I still manage to keep my [small] cache topped up despite still seeing "Tasks are committed to other platforms" fairly regularly (for reasons stated frequently in this and other threads...). My other systems that run SCC1 are also getting consistent supplies of work (but they don't handle as many SCC1 tasks a day).

Cheers - Al.

[*1] The script is based on the daemon scripts I've written for various other aspects of watching WCG work flow; they all use Python's logger module for the output and I've never bothered to work out how to get it to use UTC instead of local time (if it even can...)
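Al's script itself isn't posted in the thread, but a minimal sketch of the approach he describes could look roughly like the following. The BOINC data directory, project URL, task-name pattern and especially the looks_faulty() check are placeholder assumptions for illustration, not his actual code:

#!/usr/bin/env python3
# Illustrative sketch only (not Al's script): scan client_state.xml for
# SCC1 MyoD1-C tasks, inspect their input ("flex") files and abort suspect
# tasks via boinccmd.  Paths, URL, name pattern and fault test are assumed.
import re
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path

BOINC_DIR = Path("/var/lib/boinc-client")            # assumed BOINC data directory
PROJECT_URL = "https://www.worldcommunitygrid.org/"  # assumed WCG master URL
PROJECT_DIR = BOINC_DIR / "projects" / "www.worldcommunitygrid.org"
TASK_PATTERN = re.compile(r"^SCC1_\d+_MyoD1-C_")     # tasks from the suspect batches


def looks_faulty(path: Path) -> bool:
    """Placeholder for the real check on the flex file; here a missing or
    empty file is treated as faulty."""
    return (not path.is_file()) or path.stat().st_size == 0


def input_files(root: ET.Element, wu_name: str) -> list:
    """Return the input file names referenced by the named workunit."""
    for wu in root.iter("workunit"):
        if wu.findtext("name") == wu_name:
            return [fr.findtext("file_name", "") for fr in wu.iter("file_ref")]
    return []


def main() -> None:
    root = ET.parse(BOINC_DIR / "client_state.xml").getroot()
    for result in root.iter("result"):
        task = result.findtext("name", "")
        if not TASK_PATTERN.match(task):
            continue
        flex = [f for f in input_files(root, result.findtext("wu_name", ""))
                if "flex" in f]
        if any(looks_faulty(PROJECT_DIR / f) for f in flex):
            print(f"{task}: aborting")
            subprocess.run(["boinccmd", "--task", PROJECT_URL, task, "abort"],
                           check=False)
        else:
            print(f"{task}: valid file!")


if __name__ == "__main__":
    main()

A real version would also have to deal with GUI RPC authentication for boinccmd and with tasks that have already started running, and would log its decisions (as Al's log excerpt shows) rather than just print them.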
Unixchick
Veteran Cruncher | Joined: Apr 16, 2020 | Post Count: 1296 | Status: Offline
Thanks for the replies. I had a short queue and got a couple of error WUs in a row that ran before I could abort them. I'm guessing that I'm now deemed unreliable for SCC. I've added MCM to my mix for the moment.
I too am surprised by the lack of attention to this problem.
Mike.Gibson
Ace Cruncher | England | Joined: Aug 23, 2007 | Post Count: 12594 | Status: Offline
And 4099. But they will be off for the weekend now!
Mike
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 2173 | Status: Offline
    And 4099. But they will be off for the weekend now!
    Mike

Well, yes, WCG Towers always does this right before the weekend. Nothing new here, besides that communication over the last week has been even more abysmal than before...

But I can't confirm that SCC1 batch 4099 is bad per se. I just checked several hosts that have some of those tasks and all of them are at least starting and running fine, though I didn't see any that had already finished. So if there is a problem with that particular batch, then it is different from the subject of this thread, for which I have seen WUs from batches 4165, 4174, 4175 and 4176, and which error out right when they are started.

And I do not agree with Adri that they can't do anything about this; the question is rather whether they KNOW how and where to cancel such jobs and, more importantly, whether they can be actually proactive and prevent the root cause of those faulty batches from being created in the first place. But that's something that only WCG Towers could answer (if they are truthful and don't spread more platitudes), and right now they once again ain't talking...

Ralf
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2346 | Status: Offline
Al, thanks for your response.
You wrote:

    I've written a Python script that scans client_state.xml for tasks that could be from invalid work-units, finds the specific flex file, checks it for the fault and invokes boinccmd to abort the task if appropriate.

Great! "And I wonder, still I wonder, who'll stop the ..."(*1) And I wonder if people are getting inquisitive and interested in your script. Still I wonder: how does that script handle the situation where a task is received that needs to be executed right away because its deadline is only 3 days instead of 6? (Occasionally I get a task that has a deadline of 3 days, so it gets a high priority to run, and this will always lead to that task being in Running state - unless I have enough (MCM1/SCC1) tasks with a 3-day deadline in the queue, which is probably never.)

    If/when it sees a MyoD1-C task that doesn't have the bad flex file, the script will report "valid file!" and leave the task to run :-)

So the task stays in the queue, unharmed. Good. The conceivable situation hasn't happened yet, I guess, but - I'm thinking along with you - what will happen when that script sees the same task again? Will it report "valid file!" again?

    they all use Python's logger module for the output and I've never bothered to work out how to get it to use UTC instead of local time

So it isn't as simple as searching for 'python date utc' on the internet and then finding this:

    >>> from datetime import datetime, timezone
    >>> datetime.now(timezone.utc)

Nevertheless, I think you should keep local time and just be aware of it. Logging, a nice feature of Python.

(*1) faulty tasks/workunits/batches

Adri

PS I don't have a weekend puzzle ready at this time.
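For readers wondering what the logging-side answer could look like: Python's logging Formatter formats timestamps with time.localtime by default, and pointing it at time.gmtime switches %(asctime)s to UTC. A small illustrative sketch (not Al's actual setup; the format string and log message are invented):

import logging
import time

# Make logging.Formatter render %(asctime)s in UTC instead of local time.
# Assigning the class attribute affects every Formatter in the process;
# assign to a single Formatter instance instead to keep the change local.
logging.Formatter.converter = time.gmtime

logging.basicConfig(
    format="%(asctime)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.INFO,
)

logging.info("SCC1_0004176_MyoD1-C_50786_0: aborted.")  # timestamp now printed in UTC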