Thread Status: Active
Total posts in this thread: 7
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Why is there a third WU sent within 11 minutes of the first two ?



I know why mine was server aborted: the other two finished before mine. But why was the third one sent so quickly? I have seen this before and thought it might be a feeder hiccup or something similar, but now I am seeing it more frequently. Just curious.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Oct 18, 2015 12:19:30 AM]
[Oct 18, 2015 12:18:52 AM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326
Re: Why is there a third WU sent within 11 minutes of the first two ?

Do you recall how long the deadline was? I am wondering if it was a repair job, or perhaps they are wanting to get the tasks through quicker. Another thing I note is that the _2 job took considerably longer than the first. I'm gathering this is because it ran on a slower host?
----------------------------------------

[Oct 18, 2015 12:40:03 AM]
Sgt.Joe
Re: Why is there a third WU sent within 11 minutes of the first two ?

Do you recall how long the deadline was? I am wondering if it was a repair job, or perhaps they are wanting to get the tasks through quicker. Another thing I note is that the _2 job took considerably longer than the first. I'm gathering this is because it ran on a slower host?

No, it was not a repair job. Repair jobs have an ending of 2 or greater, but they should only be issued if one of the first two jobs (the ones ending in "0" and "1") errors out or is not completed for any reason, or if both are in need of further verification, i.e., both have the status "pending verification." None of those conditions exist here.
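The rule as I understand it can be written down as a small check. This is only a sketch of the conditions described above, not the project's actual scheduler logic; the `Status` values and the `repair_job_expected` function are hypothetical names made up for illustration.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical status labels for the _0 and _1 copies of a work unit."""
    IN_PROGRESS = "in progress"
    VALID = "valid"
    PENDING_VERIFICATION = "pending verification"
    ERROR = "error"
    NOT_COMPLETED = "not completed"  # e.g. missed deadline or aborted


def repair_job_expected(status_0: Status, status_1: Status) -> bool:
    """Return True if a copy ending in 2 or greater would be expected.

    Encodes only the rule described in this post:
    - either of the first two copies errored out or was not completed, or
    - both copies are pending verification.
    """
    failed = {Status.ERROR, Status.NOT_COMPLETED}
    if status_0 in failed or status_1 in failed:
        return True
    return (
        status_0 is Status.PENDING_VERIFICATION
        and status_1 is Status.PENDING_VERIFICATION
    )


# Two copies still crunching: no repair job expected under this rule.
print(repair_job_expected(Status.IN_PROGRESS, Status.IN_PROGRESS))
# One copy errored out: a repair job would be expected.
print(repair_job_expected(Status.ERROR, Status.IN_PROGRESS))
```

By this rule, two healthy copies that are still in progress should not trigger a third one, which is why the early _2 copy looks odd.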

Yes, the longer one is most likely on a slower host.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Oct 18, 2015 1:09:14 AM]
Seoulpowergrid
Veteran Cruncher
Joined: Apr 12, 2013
Post Count: 823
Re: Why is there a third WU sent within 11 minutes of the first two ?

The same issue is still going on: link. I'm posting a message here (thanks for the link, Sgt. Joe) in hopes that it'll bump the thread high enough for a tech to see it and reply.
----------------------------------------

[Dec 3, 2015 4:56:20 PM]
Sgt.Joe
Re: Why is there a third WU sent within 11 minutes of the first two ?

This is still happening, and this time with a slightly new twist. Not only was a third WU sent within a short time, but it is also shown as processed instantly, with zero time elapsed. And it is valid. Mine is the aborted WU, which is why I noticed it.


Here is the result log:

Result Log

Result Name: MCM1_0020159_3350_2
<core_client_version>7.5.0</core_client_version>
<![CDATA[
<stderr_txt>
05:45:44 (1432): Can't set up shared mem: -1. Will run in standalone mode.
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.35_x86_64-pc-linux-gnu -SettingsFile MCM1_0020159_3350.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Settings File
DateOfDesign = 08/05/2014
Designer = PMCC_OCI_0.1
WorkOrderID = 0020159_3350
DatasetID = 17_72_SDG_v1
NumberOfGenesInStartingSignature = 70
NumberOfGenesInSignatureMin = 70
NumberOfGenesInSignatureMax = 70
GroupVectorValues = {A}{B}{C}{D}{E}{F}
ExplicitStartingGeneSignatures = A B D F
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
SearchAlgorithmNumberToCreate = 542
SearchAlgorithmSequentialStartPosition = 5
RunPermutationAlgorithm = 0
PermutationGroups = A
PermutationGroupsForReplacement = G
PermutationAlgorithm = replaceFromRandomlyToRandomlyGreedy
PermutationsNumIterations = 0
OptimizationAlgorithmFrequency = 0 0 1
FBeta = 1.5
SimAnnealIMax = 20000
SimAnnealAlpha = 0.9996
FitnessFn = 0
MinFitness = 0.37
NReps = 10
TrainFrac = 0.7
NFolds = 10
VMethod = LOO
ModelType = SVM
SvmArgs = "-v 0 -c 0.1 -t 1 -d 2 -r 0"

SvmLearnLimit = 500000
RSeed = 157903351


[05:45:44] Initializing
[05:45:54] Running
[05:45:54] EvaluateFitnessOfStartingGeneSignatures 542
08:16:31 (9782): Can't set up shared mem: -1. Will run in standalone mode.
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.35_x86_64-pc-linux-gnu -SettingsFile MCM1_0020159_3350.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[08:16:31] Initializing
[08:16:41] Running
[08:16:41] EvaluateFitnessOfStartingGeneSignatures 542
09:48:01 (2931): Can't set up shared mem: -1. Will run in standalone mode.
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.35_x86_64-pc-linux-gnu -SettingsFile MCM1_0020159_3350.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[09:48:01] Initializing
[09:48:14] Running
[09:48:14] EvaluateFitnessOfStartingGeneSignatures 542
[10:13:46] Writing final output
[10:13:46] Closing Output Stream
[10:13:46] Cleaning up
Result.out = 34457.000000
Run complete, CPU time: 1529.835608
10:13:47 (2931): called boinc_finish

</stderr_txt>
]]>

Just weird.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Jan 13, 2016 3:11:29 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Re: Why is there a third WU sent within 11 minutes of the first two ?

The quorum detail shows only the CPU time (zero here), but the log reports 'Run complete, CPU time: 1529.835608', and this mismatch makes it difficult to see how efficiently the wingmen processed their results. The sent/received time interval and the log suggest the result actually took a normal amount of time; this is just one of those long-standing client glitches where one or the other time parameter fails to be transmitted. Client 7.5.0 is of course not a smart choice for production: it was the very first (bug-laden) alpha test build in development towards the 7.6 public release.
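To check this against the posted log, the wall-clock span of the final session can be compared with the CPU time the log itself reports. The sketch below uses only lines copied from the result log above; the function name `summarize_final_session` is made up for illustration, and it assumes the session did not cross midnight, since the log timestamps carry no date. Whether the reported CPU time covers only the final session or also the two earlier restarts is not stated in the log.

```python
import re
from datetime import datetime

# Lines copied from the final session of the result log posted above.
LOG_EXCERPT = """\
09:48:01 (2931): Can't set up shared mem: -1. Will run in standalone mode.
[09:48:01] Initializing
[09:48:14] Running
[09:48:14] EvaluateFitnessOfStartingGeneSignatures 542
[10:13:46] Writing final output
Run complete, CPU time: 1529.835608
10:13:47 (2931): called boinc_finish
"""


def summarize_final_session(log: str) -> tuple[float, float]:
    """Return (wall_seconds, cpu_seconds) for the last session in the log.

    Assumption: start and finish fall on the same day.
    """
    starts = re.findall(
        r"^(\d\d:\d\d:\d\d) \(\d+\): Can't set up shared mem", log, re.M
    )
    finish = re.search(r"^(\d\d:\d\d:\d\d) \(\d+\): called boinc_finish", log, re.M)
    cpu = re.search(r"CPU time: ([\d.]+)", log)
    if not starts or finish is None or cpu is None:
        raise ValueError("log does not contain a complete session")
    fmt = "%H:%M:%S"
    started_at = datetime.strptime(starts[-1], fmt)  # last restart
    finished_at = datetime.strptime(finish.group(1), fmt)
    return (finished_at - started_at).total_seconds(), float(cpu.group(1))


wall, cpu = summarize_final_session(LOG_EXCERPT)
print(f"wall clock: {wall:.0f} s, CPU time: {cpu:.1f} s")
# prints "wall clock: 1546 s, CPU time: 1529.8 s"
```

The two figures are close to each other, which fits the reading that the task ran for a normal length of time and only the time shown in the quorum detail is wrong.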
[Jan 13, 2016 8:59:24 AM]
Sgt.Joe
Re: Why is there a third WU sent within 11 minutes of the first two ?

Client 7.5.0 is of course not a smart choice for production: it was the very first (bug-laden) alpha test build in development towards the 7.6 public release.

You may have put your finger on that part of the problem. I had not thought to look at the version number. I am still wondering why the third WU was sent out, though. Thanks for the input.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Jan 13, 2016 12:21:47 PM]