Thread Status: Locked
Total posts in this thread: 192
Posts: 192   Pages: 20   [ Previous Page | 4 5 6 7 8 9 10 11 12 13 | Next Page ]
This topic has been viewed 106656 times and has 191 replies
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA


Type A require LARGE memory (1.75MB) and have large result files. Thus they have the bandwidth limit set.


Try GB.

Perhaps next time you could use a more reasonable bandwidth. Some of us are being nobbled because we have more than one system. Four hungry quad core systems, but only one 10Mb internet connection.
[Oct 8, 2009 12:31:42 AM]
TimAndHedy
Senior Cruncher
Joined: Jan 27, 2009
Post Count: 267
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

I ended up with one of the big ones succeeding, so it appears they do not all fail.

BETA_erag_a172_ps0000_2
[Oct 8, 2009 12:41:51 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

My one ps0000 WU that survived the error 29 deathtrap crashed with the Access Violation Error after 20.5h, probably 80-90% done.

Fellow crunchers who've "lost" lots of CPU hours on these tests, we haven't really wasted the time, as this had to happen to find the bugs in the new project. Our time has been well spent!
And the lost points? The crunching is much more important than the points, and I expect the techs will eventually give us credit anyway. Right now, I'm sure they have more important things to do though.

I agree that the bandwidth limits should be relaxed, but the underlying difficulty is to set limits that depend on the number of results that will need to be uploaded by each member. That in turn depends on the number and speed of his active devices, and the project mix in each.
There is also the factor of the member's willingness to have WCG use a big fraction of his internet bandwidth.
I suggest that at least there be a manual override box for DDDT2 (Type A) on the project selection page(s).
----------------------------------------
[Edit 1 times, last edit by Rickjb at Oct 8, 2009 4:54:00 AM]
[Oct 8, 2009 4:45:30 AM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

I agree to the relaxed bandwidth limits as long as it's not 56k dial-up. Tighten up the hardware spec though - low DCF should be best. Type A eats a lot of memory, so if the computer is not a dedicated cruncher, trouble awaits.
[Oct 8, 2009 5:43:46 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

rDCF has nothing to do with bandwidth. It's a correction factor between actual expected computation duration and the original estimate of the fpops in the task header. That value on my quad bounces up and down between 3 and 0.7 when running periods of only HCMD2 [the very smallest system resource project] and even as observed for HFCC yesterday, the range in the last 3 days was between 2.25 and 11 hours on a duo dedicated to that project. All valid, all getting credit close to claimed.
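
For anyone unfamiliar with the correction factor described above, here is a minimal sketch of the idea, not BOINC's actual code. It assumes the raw runtime estimate is the task's fpops estimate divided by the host's benchmarked speed, and that the correction factor simply scales that estimate. All function and parameter names are made up for illustration, and the example numbers are arbitrary.

```python
def raw_estimate_hours(fpops_est, host_flops):
    """Uncorrected runtime estimate: work estimate divided by host speed."""
    return fpops_est / host_flops / 3600.0


def corrected_estimate_hours(fpops_est, host_flops, dcf):
    """Runtime estimate after applying the duration correction factor."""
    return raw_estimate_hours(fpops_est, host_flops) * dcf


def implied_correction_factor(actual_hours, fpops_est, host_flops):
    """Factor that would have made the raw estimate match the actual runtime."""
    return actual_hours / raw_estimate_hours(fpops_est, host_flops)


if __name__ == "__main__":
    # Hypothetical task: 3.6e13 estimated operations on a 1 GFLOPS host.
    fpops_est, host_flops = 3.6e13, 1e9
    print(raw_estimate_hours(fpops_est, host_flops))
    # A task that really took 7 hours implies a factor below 1.
    print(implied_correction_factor(7.0, fpops_est, host_flops))
```

Note that nothing in this calculation involves upload or download speed, which is the point being made: the factor only relates estimated to actual compute time.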

The beta of course is a way to find out what actual bandwidths were encountered on a global scale... it should, as commented, not be that the guy next to the server center gets them and no one in other parts of the world because they happen to sit 50 router hops away... the measurement will be as fast as the slowest link. But, we're exploding a topic. Take the ratio of A to B to C work units: 1 A generates 2 B, which generates 5000 C units, in quorum 2. Are the constraints for A only? Can live with that. Facts of life... at launch there will be only A type ;>)
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 8, 2009 6:43:51 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

@X-files 27: I think the Type As don't need so much physical memory (220MB ea) as virtual memory (page/swap file) (1.3GB ea).
And I think the virtual memory is not accessed very often. While my 2nd WU was running, there was no noticeable increase in my disc activity. Total page faults, which I assume is a proxy for accesses to system cache/page file, got to 60000 early on, and when I last looked, at about 50% done, it had crept up to only about 65k. For the first 11min, until my 1st WU got error 29, my quad was running 2 Type As plus 2 other WUs (HCC and/or FAAH) in 2GB physical memory, which is less than the 2 x 1.3GB VM needed by the 2 Type As alone.
I have limited knowledge of current virtual memory/swapping characteristics though. For example, I recently discovered that 32- and 64-bit Windows XP may not properly handle swapping between WCG tasks when their total memory exceeds physical memory but still fits within the page file. Suspended tasks can get killed with exit code 0. BOINC silently restarts these from checkpoints when need be, so the problem is usually hidden.
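
The arithmetic behind the memory figures above can be sketched as follows. This is only an illustration using the per-task numbers quoted in this post (220MB physical and 1.3GB virtual per Type A, on a 2GB machine); the function and dictionary keys are made up, and the two other WUs (HCC and/or FAAH) are left out because their memory use is not given here.

```python
def memory_summary(tasks, physical_gb):
    """Compare total resident and virtual demand against physical RAM."""
    resident = sum(t["resident_gb"] for t in tasks)
    virtual = sum(t["virtual_gb"] for t in tasks)
    return {
        "resident_gb": resident,
        "virtual_gb": virtual,
        # True when the actively used memory fits in RAM.
        "resident_fits": resident <= physical_gb,
        # True when the reserved virtual memory is larger than RAM.
        "virtual_exceeds_physical": virtual > physical_gb,
    }


# Per-task figures as quoted in the post, expressed in GB.
type_a = {"resident_gb": 0.22, "virtual_gb": 1.3}
summary = memory_summary([type_a, type_a], physical_gb=2.0)
print(summary)
```

With these inputs the reserved virtual memory exceeds physical RAM while the resident total fits comfortably, which would be consistent with the low paging activity reported.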
[Oct 8, 2009 6:52:54 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

The Page Faulting that slows down the computations has very little to do with the swap file exchange. It's a CPU/RAM bottleneck issue that, as has been proven, can be coded around if the exact cause is found.

OT, but can you expand on that last paragraph? Are you talking about checkpoint resumes on suspended tasks? They will restart from a checkpoint if LAIM is not on.
[Oct 8, 2009 7:17:36 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

@Sek: (Quote):"OT, but can you expand on that last paragraph" - OT?
Scenario: Quad with only 1GB memory, waiting for return of 2x1GB sticks from RMA. LAIM is ON. Running FAAH + HCC.
I like to keep an even mix of the 2 running, so I micromanage when I'm around, which means suspending & resuming tasks. WTR tasks have absolute priority over RTS, so I grab next-to-run WUs by starting them, running them for a short while, then resuming previous tasks. This goes OK while starting a new WU, and while it allocates itself memory about 2min into the run. (At this point, BOINC Progress moves off 0%). However, when I then suspend the new task to resume a previous one, there is the likelihood that all suspended tasks will be spat out, even though Task Mgr Commit Charge is well below Limit. The tasks will disappear from Task Manager, and BOINC Messages will suggest resetting the project, but they will stay unchanged in BOINC Tasks list. When they resume, they will start from 0 CPU time in Task Manager, but from the checkpointed time (I assume) in BOINC. Only tasks that were suspended very near the start restart from 0% in BOINC.

Workaround: Suspend the next-to-run WUs after only a few seconds, before they start grabbing lots of memory. They stay in their original spot in the Tasks list where I may not notice later that they are WTR, though.
This workaround may not work for DDDT2-Type A, because these go to 170MB at or very close to startup.
----------------------------------------
[Edit 2 times, last edit by Rickjb at Oct 8, 2009 12:15:16 PM]
[Oct 8, 2009 10:08:51 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

Yes, well, what could I say. Don't do that if you know it breaks tasks...

There's been an item on the wishlist for a very, very long time... Allow setting a limit on how many tasks of the same app run concurrently. One reason is temperature (yes, 4 PrimeGrid tasks raise my system temps by 6C); another is memory. Doubt it will ever come.

There's some interesting functionality under the hood in 6.10, so by the time DDDT-2 launches I'll be trying to break that client, playing with the memory options and swap file limitations. At my own risk of course.
[Oct 8, 2009 11:21:05 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Discovering Dengue Drugs - Together Phase 2 BETA

Is there any word on when/if the errored WUs will be resent? I see some of my results have other WUs in the "waiting to be sent" status.
[Oct 8, 2009 1:21:56 PM]