World Community Grid - View Thread - DDD2 Type B work units going out.

World Community Grid Forums

Category: Beta Testing

Forum: Beta Test Support Forum

Thread: DDD2 Type B work units going out.


Thread Status: Active
Total posts in this thread: 369


This topic has been viewed 683832 times and has 368 replies

Mathilde2006
Senior Cruncher
Germany
Joined: Sep 30, 2006
Post Count: 269
Status: Offline


Re: DDD2 Type A work units going out.

Mathilde2006, did you do anything or was this spontaneous?
- A Restart, Snooze, Suspend and Resume task, or an automatic update perhaps?

- edit - I would suggest you keep an eye on it when it gets back to 28%

Yes, I'll do that.
I changed the 'limit memory usage' preference from ~748 MB to 1198 MB and requested an update (the client also requested an automatic update). Then I restarted the client after the task was back to 2.9%.

I don't use the option 'leave application in memory while suspended'.

OS: Suse 11.2 32 bit

----------------------------------------
[Edit 2 times, last edit by Mathilde2006 at Feb 5, 2010 12:14:08 AM]

[Feb 5, 2010 12:10:52 AM]

sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline


Re: DDD2 Type A work units going out.

I think you might have spotted the problem. Hope your fix works out.

[Feb 5, 2010 12:16:05 AM]

Mathilde2006
Senior Cruncher
Germany
Joined: Sep 30, 2006
Post Count: 269
Status: Offline


Re: DDD2 Type A work units going out.

I think you might have spotted the problem. Hope your fix works out.

I hope so too.

Hypernova is offline, and knreed too.
So it looks like there will be no new beta today.

----------------------------------------

[Feb 5, 2010 12:52:54 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: DDD2 Type A work units going out.

I think VM is more of an issue than actual memory usage, as my two have a memory usage of 214/209 MB and a VM usage of 688/682 MB respectively.

Why would you care about VM at all? Does your OS (presumably Windows?) care? Under Linux, VM is totally irrelevant when it includes a massive section of uninitialised global data which apparently never gets referenced. It has absolutely zero impact on anything unless that uninitialised data area gets accessed, in which case the actual memory usage will rise.

Memory usage tends to be around 200MB and VM around 540MB. The uninitialised data (BSS) area of the executable is 357MB, which suggests that around 340MB is unused.

(Of course, if there is some performance impact under Windows, then it would be best if the DDD2 programs went on a diet and didn't declare 300+MB of memory that they don't use.)
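To make the arithmetic above explicit, here is a minimal Python sketch. The function name is hypothetical, and it assumes (as argued above) that the gap between VM size and actual memory usage is mostly untouched BSS; it is an estimate, not a measurement.

```python
def estimate_untouched_mb(virtual_mb, resident_mb):
    """Rough estimate of address space that is reserved but never touched.

    Assumption: everything in the VM figure that is not resident is
    unused, which ignores pages that were swapped out.
    """
    return virtual_mb - resident_mb


# Approximate figures quoted in this post (all in MB).
vm_mb = 540
resident_mb = 200
bss_mb = 357

untouched_mb = estimate_untouched_mb(vm_mb, resident_mb)
bss_fraction = untouched_mb / bss_mb

print(f"about {untouched_mb} MB untouched, {bss_fraction:.0%} of the BSS area")
```

If the untouched share stays this close to the full BSS size over a whole run, that would support the view that the declared arrays are never used.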

[Feb 5, 2010 1:37:42 AM]

JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Status: Offline


Re: DDD2 Type A work units going out.

I don't use the option 'leave application in memory while suspended'.

OS: Suse 11.2 32 bit

Not very safe for productivity, but interesting for the test. :)

With "Leave application in memory while suspended" (LAIM) disabled, any event** that suspends an application will send it back to its last checkpoint.
For applications which checkpoint very often it does not matter much, but when there are long intervals between checkpoints this can waste a lot of time.

And when the application is unloaded, then reloaded and initialized there is an additional overhead to consider too.

With LAIM ON the application is simply paused, then it resumes with the next instruction.
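As a rough illustration of the cost, here is a minimal Python sketch. It assumes checkpoints are written at perfectly regular intervals and ignores the unload/reload overhead; the function name and the example numbers are hypothetical.

```python
def hours_lost_on_suspend(hours_since_start, checkpoint_interval_hours, laim_enabled):
    """Hours of computation that must be redone after one suspend.

    Simplified model: with LAIM on the task is only paused, so nothing
    is lost; with LAIM off it restarts from the last checkpoint.
    """
    if laim_enabled:
        return 0.0
    # Time elapsed since the most recent checkpoint was written.
    return hours_since_start % checkpoint_interval_hours


# Hypothetical example: suspended 10 hours in, one checkpoint every 1.5 hours.
print(hours_lost_on_suspend(10.0, 1.5, laim_enabled=False))
print(hours_lost_on_suspend(10.0, 1.5, laim_enabled=True))
```

In this model a suspend at the 10-hour mark with LAIM off throws away the hour since the 9-hour checkpoint, and that repeats for every suspend event.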

If you want to go on with LAIM OFF (it's up to you), I would recommend activating checkpoint logging in your cc_config.xml file (see the Start Here FAQs) so that you can figure out the impact for each project.

Cheers. Jean.

** That could be a snooze, a benchmark, etc... Have you checked your message log for a clue?
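For reference, a minimal Python sketch that generates such a file. It assumes the relevant BOINC log flag is `checkpoint_debug` inside a `<log_flags>` section, which is my reading of the client configuration options; please check the Start Here FAQs before relying on it.

```python
import xml.etree.ElementTree as ET


def build_cc_config():
    """Return a cc_config.xml body that turns on checkpoint logging.

    Assumption: the flag is named checkpoint_debug and lives under
    <log_flags>; verify against the FAQ for your client version.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    ET.SubElement(log_flags, "checkpoint_debug").text = "1"
    return ET.tostring(root, encoding="unicode")


# Print the XML; save it as cc_config.xml in the BOINC data directory,
# then have the client re-read its config file (or restart the client).
print(build_cc_config())
```

With the flag active, the message log should show a line each time a task writes a checkpoint, which makes the interval per project easy to read off.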

----------------------------------------

Team--> Decrypthon -->Statistics/Join -->Thread

----------------------------------------
[Edit 1 times, last edit by JmBoullier at Feb 5, 2010 6:19:17 AM]

[Feb 5, 2010 6:17:36 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: DDD2 Type A work units going out.

gb009761, point taken ;)

I'm getting 147K and 56K Page Faults on my W7 system (so far) and 60K Page Faults on my Vista system. This is roughly after 1 day.

Commit size is around 680MB for each Beta. A fair bit if you had 4 running.
I would not like to be listening to 8 such tasks grinding the drive on an i7 with an x86 Operating System!

On the page faults: Process Explorer shows 75K (soft) PFs, which is absolutely minute compared to some other sciences, and the delta is always zero, which is the important bit. This is on W7-32, after 10.5 hours of uninterrupted crunching since the last hard boot (there were several, to test whether the task survives), just after the checkpoints, which are spread 1.5 hours apart, were written :D

----------------------------------------

WCG

Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!

----------------------------------------
[Edit 1 times, last edit by Sekerob at Feb 5, 2010 6:57:44 AM]

[Feb 5, 2010 6:57:02 AM]

Mathilde2006
Senior Cruncher
Germany
Joined: Sep 30, 2006
Post Count: 269
Status: Offline


Re: DDD2 Type A work units going out.

I don't use the option 'leave application in memory while suspended'.

OS: Suse 11.2 32 bit

Not very safe for productivity, but interesting for the test. :)

Thanks Jean.

I think the 'problem' was that I resized the memory without putting the WU on hold.
Have the checkpoint rules changed? From December:

Type A: These work units are the very long running work units.
Runtime: 30-100 hours
Identifier: ps

Checkpoints: 50 times within a work unit. Evenly throughout the run, every 2%.

There is nothing suspicious in the message log. The client was last started ~13 hours ago.
I assume that it lost the work that was crunched in this time frame.

I have since tried suspending the WU and the project, and it's working fine.
Now running at 17%.

I know that deactivating LAIM is dangerous. But that's testing. :)

----------------------------------------

[Feb 5, 2010 7:36:31 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: DDD2 Type A work units going out.

Checkpoints changed on the long A types? Well, per my note above they're spread 1.5 hours apart (1:26 hours to be exact), and 28 CPU hours into the job it shows a completion time of about 66 hours. On the back of an envelope that makes about 46 checkpoints, so just a little over 2% per CP. I guess the TTC is not correct, though it's been stable in the 66-67 hour range.
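The back-of-the-envelope sum, as a minimal Python sketch. It assumes the 66-hour figure is the total run time and that checkpoints are evenly spaced; the function name is hypothetical.

```python
def checkpoint_stats(total_hours, interval_hours):
    """Return (number of checkpoints, percent of the run per checkpoint).

    Assumes evenly spaced checkpoints over the whole run.
    """
    count = total_hours / interval_hours
    percent_per_checkpoint = 100.0 / count
    return count, percent_per_checkpoint


# Figures from this post: checkpoints 1 h 26 min apart, about 66 hours in total.
interval_hours = 1 + 26 / 60
count, percent = checkpoint_stats(66, interval_hours)

print(f"about {count:.0f} checkpoints, roughly {percent:.2f}% of the run each")
```

That is close enough to the documented 50 checkpoints at 2% each that the spacing itself looks unchanged; the estimated completion time is the less certain input.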

PS: Changing LAIM back and forth, or the % memory allowance during work/idle, should have zero impact on the execution of the job. Changing the page file size down, however, does have the potential to cause issues. At 720 MB a pop in VM, there could be trouble when several run concurrently. Also, I'm not sure how well BOINC / the science behaves in shifting portions between virtual and physical memory when that's done on the run. Can't say I've seen adverse effects though. I've even been running for weeks without VM at all.

----------------------------------------
[Edit 1 times, last edit by Sekerob at Feb 5, 2010 7:51:09 AM]

[Feb 5, 2010 7:46:20 AM]

Mathilde2006
Senior Cruncher
Germany
Joined: Sep 30, 2006
Post Count: 269
Status: Offline


Re: DDD2 Type A work units going out.

I just returned to the computer and the WU is back to 2.5%.
I activated LAIM beforehand and didn't change anything.
I did change the size up. This is the only WU running.

Should I reset the project?

----------------------------------------

[Feb 5, 2010 8:22:46 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: DDD2 Type A work units going out.

Please no. Suspend the job and wait for a tech to give guidance. The coders will want to learn from the files that are stored in the job slot. If you can't get replacement work, increase the cache/additional buffer size in steps until you do, so at least the client won't idle. It will be another 6 hours before the techs are in the office.

thanks.


[Feb 5, 2010 8:33:08 AM]
