Thread Status: Active
Total posts in this thread: 23
AgrFan
Re: Application exited with RC = 0x100

I am still seeing the 0x100 error in Job #6. Lost 10+ hours on this unit. Two wingmen have errored out as well.

This project is OK with wasting our CPU time. These errors keep happening and don't get corrected. That's the way it is. We'll just have to deal with it.

E226230_ 927_ S.298.C38H25N3O3.VUZWCEQNTYAYDC-UHFFFAOYSA-N.10_ s1_ 14_ 2--

[00:51:17] Starting job 6,CPU time has been restored to 22252.462689.
[00:51:17] Starting new Job
[00:51:17] Qink name = fldman
[00:51:26] Qink name = gesman
[00:51:27] Qink name = scfman
Application exited with RC = 0x100
[05:00:44] Finished Job #6
[05:00:44] Starting job 7,CPU time has been restored to 36896.925910.
[05:00:44] Skipping Job #7
05:00:49 (7534): called boinc_finish

</stderr_txt>
]]>
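
For anyone who wants to quantify how much time Job #6 consumed in a log like the one above, here is a minimal Python sketch. It assumes the bracketed timestamps are wall-clock times and that the "CPU time has been restored to" values are cumulative CPU seconds; neither is confirmed in this thread. The function name job_summary is made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Log excerpt copied from the post above (leading "[" restored on the first line).
LOG = """\
[00:51:17] Starting job 6,CPU time has been restored to 22252.462689.
[00:51:17] Starting new Job
[00:51:17] Qink name = fldman
[00:51:26] Qink name = gesman
[00:51:27] Qink name = scfman
Application exited with RC = 0x100
[05:00:44] Finished Job #6
[05:00:44] Starting job 7,CPU time has been restored to 36896.925910.
[05:00:44] Skipping Job #7
"""

START = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\] Starting job (\d+),"
    r"CPU time has been restored to (\d+\.\d+)\."
)
FINISH = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Finished Job #(\d+)")


def job_summary(log, job):
    """Return (wall_seconds, cpu_seconds) for one job in a stderr excerpt.

    cpu_seconds is None when the next job's start line is missing.
    ASSUMPTION: the "restored to" figure is cumulative CPU seconds.
    """
    starts, finishes = {}, {}
    for line in log.splitlines():
        m = START.match(line)
        if m:
            when = datetime.strptime(m.group(1), "%H:%M:%S")
            starts[int(m.group(2))] = (when, float(m.group(3)))
            continue
        m = FINISH.match(line)
        if m:
            finishes[int(m.group(2))] = datetime.strptime(m.group(1), "%H:%M:%S")

    began, cpu_before = starts[job]
    wall = finishes[job] - began
    if wall < timedelta(0):
        # The log carries no dates, so assume the job crossed midnight once.
        wall += timedelta(days=1)
    cpu = starts[job + 1][1] - cpu_before if job + 1 in starts else None
    return int(wall.total_seconds()), cpu


wall_s, cpu_s = job_summary(LOG, 6)
print(timedelta(seconds=wall_s), round(cpu_s, 2))
```

Under those assumptions, Job #6 in this excerpt ran for a little over four hours of wall-clock time before the application exited.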

E226230_ 927_ S.298.C38H25N3O3.VUZWCEQNTYAYDC-UHFFFAOYSA-N.10_ s1_ 14_ 2-- 700 Error 11/6/14 23:17:46 11/7/14 09:57:33 10.32 265.4 / 0.0
E226230_ 927_ S.298.C38H25N3O3.VUZWCEQNTYAYDC-UHFFFAOYSA-N.10_ s1_ 14_ 1-- 700 Error 11/6/14 23:11:20 11/7/14 19:19:40 18.00 191.5 / 0.0
E226230_ 927_ S.298.C38H25N3O3.VUZWCEQNTYAYDC-UHFFFAOYSA-N.10_ s1_ 14_ 0-- 700 Error 10/31/14 17:21:12 11/5/14 21:06:51 7.40 273.6 / 0.0
[Nov 8, 2014 2:09:16 PM]
AgrFan

Another 9 hours wasted. The wingman errored out as well.

E226369_ 821_ S.324.C39H26S4.WYWZASOXIHBODY-UHFFFAOYSA-N.8_ s1_ 14_ 0--

[16:26:44] Starting job 6,CPU time has been restored to 22551.200000.
[16:26:44] Starting new Job
[16:26:44] Qink name = fldman
[16:26:52] Qink name = gesman
[16:26:53] Qink name = scfman
Application exited with RC = 0x100
[19:23:14] Finished Job #6
[19:23:14] Starting job 7,CPU time has been restored to 32939.460000.
[19:23:14] Skipping Job #7
19:23:20 (4530): called boinc_finish

</stderr_txt>
]]>

E226369_ 821_ S.324.C39H26S4.WYWZASOXIHBODY-UHFFFAOYSA-N.8_ s1_ 14_ 1-- 700 Error 11/8/14 15:07:15 11/9/14 04:16:42 4.82 217.7 / 0.0
E226369_ 821_ S.324.C39H26S4.WYWZASOXIHBODY-UHFFFAOYSA-N.8_ s1_ 14_ 0-- 700 Error 11/8/14 15:01:22 11/9/14 00:31:09 9.20 235.2 / 0.0
[Nov 9, 2014 2:14:54 PM]
Former Member

Just to clarify, do these show as an Error in the Workunit Status? All current CEP2 units end during Job #6 and I have vague memories of RC = 0x100 being a "success" exit code on Linux (my Windows equivalent showing RC = 0x1). Shame if your examples are indeed Error status.
[Nov 9, 2014 4:14:07 PM]
Former Member

Quote: "Just to clarify, do these show as an Error in the Workunit Status? All current CEP2 units end during Job #6 and I have vague memories of RC = 0x100 being a "success" exit code on Linux (my Windows equivalent showing RC = 0x1). Shame if your examples are indeed Error status."

Yes, mine shows as an error.

E226337_ 551_ S.314.C39H22N6O2.MQFBHXKKBRAWIG-UHFFFAOYSA-N.7_ s1_ 14_ 0-- 700 Error 11/6/14 20:56:47 11/7/14 20:24:12 12.02 463.0 / 0.0
[Nov 9, 2014 8:43:36 PM]
Former Member

Hi all,

I will chat with our friends at IBM about this, but as I said earlier, the internal grid errors are not my area of expertise, so I won't muddy the water! I would like to say that I am sure that if there were a simple fix, it would already be fixed :) This is especially true if the error originates from within the quantum chemistry software, which is very complex.

Your Harvard CEP Team
[Nov 9, 2014 10:17:03 PM]
AgrFan

And yet another 10 hours wasted. The wingman errored out as well.

E226372_ 813_ S.318.C40H20N6O2.QCVJGDUPAQTTBK-UHFFFAOYSA-N.13_ s1_ 14_ 1--

[19:48:53] Starting job 6,CPU time has been restored to 22938.537567.
[19:48:53] Starting new Job
[19:48:54] Qink name = fldman
[19:49:03] Qink name = gesman
[19:49:05] Qink name = scfman
Application exited with RC = 0x100
[23:42:06] Finished Job #6
[23:42:06] Starting job 7,CPU time has been restored to 36617.672459.
[23:42:06] Skipping Job #7
23:42:12 (9504): called boinc_finish

</stderr_txt>
]]>

E226372_ 813_ S.318.C40H20N6O2.QCVJGDUPAQTTBK-UHFFFAOYSA-N.13_ s1_ 14_ 1-- 700 Error 11/8/14 18:04:32 11/9/14 04:39:23 10.24 241.0 / 0.0
E226372_ 813_ S.318.C40H20N6O2.QCVJGDUPAQTTBK-UHFFFAOYSA-N.13_ s1_ 14_ 0-- 700 Error 11/8/14 17:56:38 11/9/14 17:40:41 8.28 262.3 / 0.0
[Nov 10, 2014 3:35:48 AM]
AgrFan

8+ hours wasted. Three wingmen errored out as well.

[10:14:25] Starting job 6,CPU time has been restored to 20040.770000.
[10:14:25] Starting new Job
[10:14:25] Qink name = fldman
[10:14:33] Qink name = gesman
[10:14:34] Qink name = scfman
Application exited with RC = 0x100
[13:14:45] Finished Job #6
[13:14:45] Starting job 7,CPU time has been restored to 30642.560000.
[13:14:45] Skipping Job #7
13:14:50 (8390): called boinc_finish

</stderr_txt>
]]>

E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 4-- - In Progress 11/13/14 18:29:49 11/17/14 06:29:49 0.00 0.0 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 3-- 700 Error 11/13/14 04:32:47 11/13/14 18:23:41 8.56 186.2 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 2-- 700 Error 11/13/14 04:32:25 11/13/14 10:00:52 4.57 198.9 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 1-- 700 Error 11/7/14 18:07:36 11/8/14 04:10:41 5.06 214.5 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 0-- 700 Error 11/7/14 18:03:23 11/13/14 04:32:05 7.80 265.0 / 0.0
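
To total up the time lost across a workunit, the status rows above can be parsed mechanically. The following Python sketch is only an illustration: the column meanings (result name, app version, status, sent time, due or returned time, CPU hours, claimed / granted credit) are my reading of the rows and are not stated in this thread, and the names ROW and summarize are hypothetical.

```python
import re
from collections import defaultdict

# Status rows copied from the post above.
ROWS = """\
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 4-- - In Progress 11/13/14 18:29:49 11/17/14 06:29:49 0.00 0.0 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 3-- 700 Error 11/13/14 04:32:47 11/13/14 18:23:41 8.56 186.2 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 2-- 700 Error 11/13/14 04:32:25 11/13/14 10:00:52 4.57 198.9 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 1-- 700 Error 11/7/14 18:07:36 11/8/14 04:10:41 5.06 214.5 / 0.0
E226353_ 379_ S.314.C34H20N6O4S1.STWPYDFUUJKZTQ-UHFFFAOYSA-N.7_ s1_ 14_ 0-- 700 Error 11/7/14 18:03:23 11/13/14 04:32:05 7.80 265.0 / 0.0
"""

# ASSUMED column layout; the result name is taken to end at the first "--".
ROW = re.compile(
    r"^(?P<name>.+?--)\s+(?P<version>\S+)\s+(?P<status>[A-Za-z ]+?)\s+"
    r"(?P<sent>\d+/\d+/\d+ \d+:\d+:\d+)\s+(?P<due>\d+/\d+/\d+ \d+:\d+:\d+)\s+"
    r"(?P<cpu_hours>[\d.]+)\s+(?P<claimed>[\d.]+)\s*/\s*(?P<granted>[\d.]+)\s*$"
)


def summarize(rows):
    """Map each status to [number of results, total CPU hours]."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in rows.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a status row
        entry = totals[m.group("status")]
        entry[0] += 1
        entry[1] += float(m.group("cpu_hours"))
    return dict(totals)


for status, (count, hours) in summarize(ROWS).items():
    print(f"{status}: {count} result(s), {hours:.2f} CPU hours")
```

If the CPU-hours reading is right, the four Error results for this workunit add up to roughly 26 CPU hours across all participants.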
[Nov 14, 2014 2:52:27 AM]
Former Member

Result Name: E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 3--





<core_client_version>7.4.27</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[02:11:18] Number of jobs = 8
[02:11:18] Starting job 0,CPU time has been restored to 0.000000.
[08:09:49] Finished Job #0
[08:09:49] Starting job 1,CPU time has been restored to 21000.137015.
[08:30:28] Finished Job #1
[08:30:28] Starting job 2,CPU time has been restored to 22231.421708.
[08:51:38] Finished Job #2
[08:51:38] Starting job 3,CPU time has been restored to 23480.568515.
[09:15:11] Finished Job #3
[09:15:11] Starting job 4,CPU time has been restored to 24885.014318.
[09:33:20] Finished Job #4
[09:33:20] Starting job 5,CPU time has been restored to 25966.350850.
[09:50:08] Finished Job #5
[09:50:08] Starting job 6,CPU time has been restored to 26966.379660.
Application exited with RC = 0x1
[13:11:02] Finished Job #6
[13:11:02] Starting job 7,CPU time has been restored to 38841.004179.
[13:11:02] Skipping Job #7
13:11:10 (5372): called boinc_finish

</stderr_txt>
]]>

The result above is marked as PVAL (Pending Validation). On the other hand,

Result Name: E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 1--





<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[20:10:33] Number of jobs = 8
[20:10:33] Starting job 0,CPU time has been restored to 0.000000.
[02:10:58] Finished Job #0
[02:10:58] Starting job 1,CPU time has been restored to 21279.207204.
[02:32:41] Finished Job #1
[02:32:41] Starting job 2,CPU time has been restored to 22573.407100.
[02:53:19] Finished Job #2
[02:53:19] Starting job 3,CPU time has been restored to 23791.119706.
[03:17:42] Finished Job #3
[03:17:42] Starting job 4,CPU time has been restored to 25242.256608.
[03:36:12] Finished Job #4
[03:36:12] Starting job 5,CPU time has been restored to 26339.817244.
[03:52:06] Finished Job #5
[03:52:06] Starting job 6,CPU time has been restored to 27282.812089.
Application exited with RC = 0x1
[07:43:55] Finished Job #6
[07:43:55] Starting job 7,CPU time has been restored to 40976.002665.
[07:43:55] Skipping Job #7
07:44:03 (3916): called boinc_finish

</stderr_txt>
]]>

The result above is marked as Error. What is the difference? I have seen this kind of result many times. As I understand it, RC = 0x1 (Windows) or RC = 0x100 (Linux) on Job #6 is quite usual and should be treated as Valid.
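
One possible explanation for the 0x1 versus 0x100 difference, offered as a sketch and not as a statement about how the CEP2 wrapper actually works: on Linux, the raw status returned by wait() or system() stores a normal exit code in bits 8 to 15, so an exit code of 1 appears as 0x100, while Windows reports the exit code directly. The Python below assumes the logged RC is that raw status on Linux; the function name decode_rc is hypothetical.

```python
def decode_rc(rc, posix_wait_status):
    """Recover a process exit code from a logged RC value.

    ASSUMPTION: on Linux the logged RC is the raw POSIX wait status;
    on Windows it is already the plain exit code.
    """
    if not posix_wait_status:
        return rc
    if rc & 0x7F:
        # Low 7 bits non-zero means the process was ended by a signal,
        # so there is no ordinary exit code to report.
        raise ValueError(f"terminated by signal {rc & 0x7F}")
    # Normal exit: the exit code sits in bits 8-15 of the wait status.
    return (rc >> 8) & 0xFF


print(decode_rc(0x100, posix_wait_status=True))   # Linux log line
print(decode_rc(0x1, posix_wait_status=False))    # Windows log line
```

Under that assumption both platforms report the same underlying exit code of 1. This does not by itself explain why one result is Pending Validation and the other is Error.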

E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 3-- 700 Pending Validation 14/11/14 17:09:26 14/11/15 05:10:32 10.79 349.4 / 0.0
E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 4-- - In Progress 14/11/14 17:08:56 14/11/18 05:08:56 0.00 0.0 / 0.0
E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 2-- 700 Error 14/11/13 12:30:10 14/11/14 17:06:20 10.49 381.5 / 0.0
E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 1-- 700 Error 14/11/03 12:29:43 14/11/04 22:47:51 11.38 429.0 / 0.0
E226281_ 641_ S.320.C42H38N2O2.YDSAJTJFNHKLPF-UHFFFAOYSA-N.18_ s1_ 14_ 0-- - No Reply 14/11/03 12:27:16 14/11/13 12:27:16 0.00 0.0 / 0.0
[Nov 15, 2014 12:04:11 PM]
Former Member

Hi All,

Our monthly phone call with IBM is coming up. I will bring this up with them then!

Your Harvard CEP Team
[Nov 15, 2014 3:00:57 PM]
AgrFan

14+ hours ... the wingman errored out as well.

[14:49:59] Starting job 6,CPU time has been restored to 30570.640000.
[14:49:59] Starting new Job
[14:49:59] Qink name = fldman
[14:50:07] Qink name = gesman
[14:50:08] Qink name = scfman
Application exited with RC = 0x100
[20:50:08] Finished Job #6
[20:50:08] Starting job 7,CPU time has been restored to 51850.100000.
[20:50:08] Skipping Job #7
20:50:14 (9299): called boinc_finish

</stderr_txt>
]]>

E226458_ 129_ S.326.C36H18N4S4.WJTXHZYZGYUENY-UHFFFAOYSA-N.2_ s1_ 14_ 2-- - In Progress 11/16/14 03:32:44 11/19/14 15:32:44 0.00 0.0 / 0.0
E226458_ 129_ S.326.C36H18N4S4.WJTXHZYZGYUENY-UHFFFAOYSA-N.2_ s1_ 14_ 3-- - In Progress 11/16/14 03:32:30 11/19/14 15:32:30 0.00 0.0 / 0.0
E226458_ 129_ S.326.C36H18N4S4.WJTXHZYZGYUENY-UHFFFAOYSA-N.2_ s1_ 14_ 1-- 700 Error 11/14/14 11:15:17 11/16/14 03:25:02 9.26 268.1 / 0.0
E226458_ 129_ S.326.C36H18N4S4.WJTXHZYZGYUENY-UHFFFAOYSA-N.2_ s1_ 14_ 0-- 700 Error 11/14/14 11:10:49 11/15/14 01:59:21 14.47 301.8 / 0.0
[Nov 16, 2014 4:03:18 AM]