Thread Status: Active
Total posts in this thread: 49
This topic has been viewed 7683 times and has 48 replies
zolople
Cruncher
Spain
Joined: Apr 25, 2020
Post Count: 8
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

You can use a GPU (activate it in the menu Edit / Notebook settings).

008: 02-Aug-2020 13:08:04 [---] OpenCL: NVIDIA GPU 0: Tesla T4 (driver version 418.67, device version OpenCL 1.2 CUDA, 15080MB, 3968MB available, 16282 GFLOPS peak)
007: 02-Aug-2020 13:08:04 [---] CUDA: NVIDIA GPU 0: Tesla T4 (driver version 418.67, CUDA version 10.1, compute capability 7.5, 4096MB, 3968MB available, 16282 GFLOPS peak)
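
If you want to compare the GPUs that different sessions hand out, log lines like the two above can be parsed with a few lines of Python. This is only a rough sketch: the regular expression is an assumption based on the sample lines quoted in this thread, and `parse_gpu_line` is a made-up helper name, not part of BOINC or of the notebook script.

```python
import re

# Pattern for the GPU detection lines quoted in this thread.
# Assumption: the layout matches those samples; other BOINC versions may differ.
GPU_LINE = re.compile(
    r"(?P<api>OpenCL|CUDA): NVIDIA GPU (?P<index>\d+): (?P<model>[^(]+?) \("
    r".*?(?P<available_mb>\d+)MB available, (?P<gflops>\d+) GFLOPS peak\)"
)


def parse_gpu_line(line):
    """Return a dict describing the GPU, or None if the line does not match."""
    match = GPU_LINE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("index", "available_mb", "gflops"):
        info[key] = int(info[key])
    return info


sample = (
    "007: 02-Aug-2020 13:08:04 [---] CUDA: NVIDIA GPU 0: Tesla T4 "
    "(driver version 418.67, CUDA version 10.1, compute capability 7.5, "
    "4096MB, 3968MB available, 16282 GFLOPS peak)"
)
print(parse_gpu_line(sample))
```

Lines that do not look like a GPU detection message simply return `None`, so the helper can be run over a whole log without extra filtering.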
----------------------------------------


[Aug 2, 2020 5:59:29 PM]
Falconet
Master Cruncher
Portugal
Joined: Mar 9, 2009
Post Count: 3295
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

Got it:

008: 02-Aug-2020 19:01:12 [---] OpenCL: NVIDIA GPU 0: Tesla K80 (driver version 418.67, device version OpenCL 1.2 CUDA, 11441MB, 4007MB available, 4111 GFLOPS peak)
007: 02-Aug-2020 19:01:12 [---] CUDA: NVIDIA GPU 0: Tesla K80 (driver version 418.67, CUDA version 10.1, compute capability 3.7, 4096MB, 4007MB available, 4111 GFLOPS peak)

Also got this one:

008: 02-Aug-2020 19:10:51 [---] OpenCL: NVIDIA GPU 0: Tesla T4 (driver version 418.67, device version OpenCL 1.2 CUDA, 15080MB, 3968MB available, 16282 GFLOPS peak)
007: 02-Aug-2020 19:10:51 [---] CUDA: NVIDIA GPU 0: Tesla T4 (driver version 418.67, CUDA version 10.1, compute capability 7.5, 4096MB, 3968MB available, 16282 GFLOPS peak)

Thanks
----------------------------------------


AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz
----------------------------------------
[Edit 1 times, last edit by Mosqueteiro at Aug 2, 2020 7:19:44 PM]
[Aug 2, 2020 7:03:45 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: crunching in Google Cloud or IBM cloud?

Have you tried Google Colab?

I have the following question.

After the VM session times out, you have to restart the session, preferably with all the same BOINC data files that are stored on Google Drive in the Boinc directory, so that tasks that were running don't get lost and can continue from their last checkpoint. How do you restart the session correctly, step by step, so that WCG doesn't create a new device for your VM?
[Aug 4, 2020 12:20:59 PM]
zolople
Cruncher
Spain
Joined: Apr 25, 2020
Post Count: 8
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

We can't force Google to assign the same machine, and in fact there are multiple models of both CPU and GPU (that's why a benchmark runs every time it starts).
When you start again, the data already stored in Google Drive is used, so these tasks are not lost; even those that have already started will continue from their checkpoint.
There are no problems with the tasks already started, and they finish correctly.
The only problem is that many devices appear in our account ... I see 150 Device Installations.
----------------------------------------


[Aug 4, 2020 9:43:13 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: crunching in Google Cloud or IBM cloud?

zolople wrote:
We can't force Google to assign the same machine, and in fact there are multiple models of both CPU and GPU (that's why a benchmark runs every time it starts).

I think I have disabled the GPU, because I'm staying at WCG's side, so to speak.

zolople wrote:
When you start again, the data already stored in Google Drive is used, so these tasks are not lost; even those that have already started will continue from their checkpoint.

This is what I do:
When the VM ends, I click Reconnect, then I click 'Connect to hosted runtime', it says Allocating..., Reconnect..., then there is RAM and Disk available according to the icons that are appearing and hovering the mouse over RAM/Disk says: "Waiting for Python 3 backend to finish its current execution." Then the VM seems to restart, because the 'bottom command line' (0-STOP, ..., 2-Get State, ...) appears, but then it appears that I can't start any bottom commands, and clicking 'Runtime' at the top menu also doesn't work. Right next to RAM/Disk there is this down pointing little triangle, when I click that, I see 'Manage sessions', clicking that doesn't do anything anymore. I'm completely at a loss there. I'm stuck.

OK, let's try this again. I'm opening a new browser window for Colab. Now [devilish] I can go to Runtime in the Colab menu and click 'Interrupt execution'! Then I can click 'Run all' from Runtime. And now the VM seems to have restarted, I can enter a command like e.g. '2' at the bottom command line and I'm seeing tasks that continue from their latest checkpoints and there is this message at the bottom of the Colab screen: "Automatic saving failed. This file was updated remotely or in another tab. Show diff".

I guess I'm making a mess ... So here's my question. What steps do I need to take to continue tasks from their latest checkpoints after the VM ends?
(Do I need to click Reconnect? Do I need to click 'Connect to hosted runtime'?)
So the VM is running, tasks are executing, then I enter '1' (COMMAND LINE) at the bottom command line, just trying out something, and I don't know how to end that (I seem to have landed in the COMMAND LINE). Wait, RAM/Disk in the upper right corner has disappeared, now there is 'Reconnect'. So I click Reconnect and then Run All (from Runtime). And... a new VM starts. And all my previous tasks are gone. [crying]

zolople wrote:
The only problem is that many devices appear in our account ... I see 150 Device Installations.

Yes, I'm seeing that, too. Many new devices with tasks that will end in 'No Reply'. [sad]
----------------------------------------
[Edit 3 times, last edit by adriverhoef at Aug 4, 2020 11:19:21 PM]
[Aug 4, 2020 11:02:00 PM]
zolople
Cruncher
Spain
Joined: Apr 25, 2020
Post Count: 8
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

adriverhoef wrote:
I think I have disabled the GPU, because I'm staying at WCG's side, so to speak.

The GPU only needs to be activated when a project requires it. At WCG, right now, it is NOT necessary. If you want to add other projects that need it (GPUGrid, Einstein ...), activate it in the menu Edit / Notebook settings.

adriverhoef wrote:
This is what I do:
When the VM ends, I click Reconnect, then I click 'Connect to hosted runtime', it says Allocating..., Reconnect..., then there is RAM and Disk available according to the icons that are appearing and hovering the mouse over RAM/Disk says: "Waiting for Python 3 backend to finish its current execution." Then the VM seems to restart, because the 'bottom command line' (0-STOP, ..., 2-Get State, ...) appears, but then it appears that I can't start any bottom commands, and clicking 'Runtime' at the top menu also doesn't work. Right next to RAM/Disk there is this down pointing little triangle, when I click that, I see 'Manage sessions', clicking that doesn't do anything anymore. I'm completely at a loss there. I'm stuck.

OK, let's try this again. I'm opening a new browser window for Colab. Now [devilish] I can go to Runtime in the Colab menu and click 'Interrupt execution'! Then I can click 'Run all' from Runtime. And now the VM seems to have restarted, I can enter a command like e.g. '2' at the bottom command line and I'm seeing tasks that continue from their latest checkpoints and there is this message at the bottom of the Colab screen: "Automatic saving failed. This file was updated remotely or in another tab. Show diff".


Yes, these are communication errors between Colab and the browser.
When the form for entering commands disappears, the solution is to stop the execution of THE CELL. It is enough to click the icon of the cell itself (directly with the left button, or with the right button and the option to interrupt execution) and then start it again. If that works, it should reconnect and the tasks should appear without reinstalling the environment.
Only if everything goes very, very badly (when the runtime stops and gives an error every time I try to start it) do I use the menu option to reset to factory state.
As for the save error ... just hit CTRL-S to save manually and clear the error.

adriverhoef wrote:
I guess I'm making a mess ... So here's my question. What steps do I need to take to continue tasks from their latest checkpoints after the VM ends?
(Do I need to click Reconnect? Do I need to click 'Connect to hosted runtime'?)
So the VM is running, tasks are executing, then I enter '1' (COMMAND LINE) at the bottom command line, just trying out something, and I don't know how to end that (I seem to have landed in the COMMAND LINE). Wait, RAM/Disk in the upper right corner has disappeared, now there is 'Reconnect'. So I click Reconnect and then Run All (from Runtime). And... a new VM starts. And all my previous tasks are gone. [crying]


For a permanent connection with Google Drive, you must activate Drive, before executing, by clicking on this icon:


You must verify that you are correctly connected to Google Drive. If you don't see a "Drive" folder, a "My Drive" folder inside it and a "Boinc" folder inside that, something is not working. In that case:
1) Stop the run
2) Disconnect Drive
3) Wait for it to disconnect completely
4) Reconnect it
5) Wait for it to connect completely
6) Run again
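
The folder check described above can also be done from a notebook cell instead of by eye. The following is a minimal sketch, assuming the folder names given in this post ("Drive", "My Drive", "Boinc"); the actual mount location and capitalization in your runtime may differ, and `boinc_folder_status` is a hypothetical helper, not part of the script discussed here.

```python
import os
import tempfile


def boinc_folder_status(content_root, parts=("Drive", "My Drive", "Boinc")):
    """Walk the expected folder chain and report the first missing level.

    Returns (True, full_path) if every level exists,
    otherwise (False, path_of_first_missing_folder).
    """
    path = content_root
    for name in parts:
        path = os.path.join(path, name)
        if not os.path.isdir(path):
            return False, path
    return True, path


# Demonstration against a throwaway directory rather than a real Colab runtime.
with tempfile.TemporaryDirectory() as root:
    # Nothing is "mounted" yet, so the first level is reported as missing.
    print(boinc_folder_status(root))
    os.makedirs(os.path.join(root, "Drive", "My Drive", "Boinc"))
    # Now the whole chain exists.
    print(boinc_folder_status(root))
```

In a real session you would pass the directory shown at the top of the Files panel as `content_root`. If the result starts with `False`, go through the six steps above before running the script again.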


adriverhoef wrote:
Yes, I'm seeing that, too. Many new devices with tasks that will end in 'No Reply'. [sad]

There is no fix for the extra devices, but ... those lost tasks shouldn't happen! Are you sure Google Drive is properly connected to the Colab notebook?
----------------------------------------


[Aug 5, 2020 7:00:08 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Recently Active
Re: crunching in Google Cloud or IBM cloud?

zolople wrote:
For a permanent connection with Google Drive, you must activate Drive, before executing, by clicking on this icon:

I don't see those three icons:


Uhm, let's try something ... Clicking the 'Files' icon ...
Ha! There's the row of three icons ('Upload to session storage', 'Refresh', 'Mount Drive').

Clicking the 3rd one: "Mounting Google Drive". Succeeded.

Clicking Runtime→Run all ..
... It's executing.

Let's see how this session goes.

Question:
Do you need to keep the browser open and connected to Colab at all times to prevent notebook execution from halting?
[Aug 5, 2020 11:34:27 AM]
zolople
Cruncher
Spain
Joined: Apr 25, 2020
Post Count: 8
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

adriverhoef wrote:
Question: Do you need to keep the browser open and connected to Colab at all times to prevent notebook execution from halting?

YES! You need to keep the browser open! If you close the browser, Google will close the connection within 1 hour at most.

I take this opportunity to clarify a question that remained pending:

The script runs Boinc as a daemon, so:
- The "0-stop" option stops Boinc completely. If you start again, it will reinstall everything and start Boinc.
- The "1-command line" option does NOT stop Boinc. It only gives us access so that we can execute code in other cells. If you start the cell containing the script again, it immediately picks up where it left off, since Boinc was still running as a daemon and the script only has to read the state and show it.

There are quite a few things that can be done by taking advantage of the command line (in ANOTHER cell), for example, resetting WCG on this machine:
! boinccmd --project "http://www.worldcommunitygrid.org" reset
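
The same kind of call can be built from Python in another cell, which makes it harder to mistype the operation. This is only a sketch under assumptions: `project_command` and `run_project_command` are hypothetical helper names, the set of operations is a subset of what `boinccmd --project` accepts, and the command is only run if `boinccmd` is actually found on the PATH.

```python
import shutil
import subprocess

# URL exactly as quoted in the command above.
WCG_URL = "http://www.worldcommunitygrid.org"

# A subset of the operations accepted by `boinccmd --project URL <op>`.
PROJECT_OPS = {"reset", "update", "suspend", "resume", "nomorework", "allowmorework"}


def project_command(op, url=WCG_URL):
    """Build the boinccmd argument list for a project operation."""
    if op not in PROJECT_OPS:
        raise ValueError(f"unsupported project operation: {op}")
    return ["boinccmd", "--project", url, op]


def run_project_command(op, url=WCG_URL):
    """Run the command if boinccmd is installed; return its output, or None if it is missing."""
    if shutil.which("boinccmd") is None:
        return None
    result = subprocess.run(project_command(op, url), capture_output=True, text=True)
    return result.stdout + result.stderr


# Only builds and prints the argument list; nothing is sent to the BOINC client here.
print(project_command("update"))
```

Keep in mind that `reset` discards the tasks in progress for that project, so a gentler operation such as `update` or `nomorework` is probably the safer thing to try first.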
----------------------------------------


[Aug 5, 2020 1:50:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

Any web browser I know of that I keep open just bloats and bloats and bloats some more until it eats gigabytes of memory in the VM. The last one I caught was a Chrome-based browser that took a measly 3.2 GB. Everything went ultra slow, no wonder with all that swapping. I suppose you need one browser that exclusively serves this 'keep Google Drive open' purpose, without extensions or anything, to minimize memory-leak build-up.
[Aug 5, 2020 2:23:21 PM]
zolople
Cruncher
Spain
Joined: Apr 25, 2020
Post Count: 8
Status: Offline
Re: crunching in Google Cloud or IBM cloud?

As far as the browser is concerned, it is not directly a VM; it is a Google session (or two ... or three ... the limit is ten).
I use a portable Firefox (from portableapps) without add-ons and only use it for this. Right now it has been open for about 13 hours with 20 tabs and is using 2846 MB.
----------------------------------------


[Aug 5, 2020 5:49:47 PM]