Like many of you, I have been very excited by Google's Colaboratory project. While it isn't exactly new, its recent public release has generated a lot of renewed interest in the collaborative platform.
For those who don't know, Google Colaboratory is...
[...] a Google research project created to help disseminate machine learning education and research. It's a Jupyter notebook environment that requires no setup to use and runs entirely in the cloud.
Here are a few simple tips for making better use of Colab's capabilities while you play around with it. To be clear, these aren't hidden hacks, but a handy collection of documented (and further clarified) functionality that may be essential.
1. Using a Free GPU Runtime
Select "Runtime," then "Change runtime type," to bring up the runtime settings pop-up.
Ensure "Hardware accelerator" is set to GPU (the default is CPU). Afterward, ensure that you are connected to the runtime (there is a green check next to "connected" in the menu ribbon).
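To check whether you have a visible GPU (i.e., you are currently connected to a GPU instance), run the short TensorFlow check from Google's code samples, which asks tf.test.gpu_device_name() for the name of the device.
If you are connected, the response reports the GPU that was found, typically /device:GPU:0.
Alternatively, supply and demand issues may leave you without a GPU, in which case the check fails with a "GPU device not found" error.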
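The check described above can be sketched as follows. This is a minimal sketch rather than Google's exact sample: it assumes TensorFlow is available (as it is by default on Colab), and the helper name find_gpu is hypothetical. It returns None instead of raising an error so that it also runs on machines without TensorFlow or a GPU.

```python
def find_gpu():
    """Return the GPU device name reported by TensorFlow, or None.

    find_gpu is a hypothetical helper name, not part of any library.
    """
    try:
        import tensorflow as tf  # pre-installed on Colab; may be absent elsewhere
    except ImportError:
        return None
    # gpu_device_name() returns an empty string when no GPU is visible.
    return tf.test.gpu_device_name() or None


device_name = find_gpu()
if device_name is None:
    print("GPU device not found")
else:
    print("Found GPU at: {}".format(device_name))
```

If the message says no GPU was found, double-check the "Hardware accelerator" setting and try reconnecting to the runtime.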
And there you go. This allows you to access a free GPU for up to 12 hours at a time.
2. Installing Libraries
Currently, software installations within Google Colaboratory are not persistent: you must reinstall libraries every time you (re-)connect to an instance. Since Colab has numerous useful common libraries installed by default, this is less of an issue than it may seem, and libraries that are not pre-installed are easily added in one of a few different ways.
You will want to be aware, however, that installing any software which needs to be built from source may take longer than is feasible when connecting/reconnecting to your instance.
Colab supports both the pip and apt package managers. Regardless of which you are using, remember to prepend any bash commands with a !.
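As a rough illustration of the install step, the sketch below installs a library only when it is not already importable, which avoids repeating work each time you reconnect. In a Colab cell the shell forms would be along the lines of !pip install -q <package> or !apt-get -qq install -y <package>. The Python helper ensure_installed and its run parameter are hypothetical names introduced here, not part of Colab.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(module_name, pip_name=None, run=subprocess.check_call):
    """Install a package with pip unless its module is already importable.

    ensure_installed is a hypothetical helper. Returns True if an install
    command was issued and False if the module was already available.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False  # already available, nothing to do
    # Equivalent to the notebook form: !pip install -q <package>
    run([sys.executable, "-m", "pip", "install", "-q", pip_name or module_name])
    return True


# json ships with Python, so no install command is issued here.
print(ensure_installed("json"))
```

Passing a different callable as run makes it possible to inspect the command without actually installing anything.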
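3. Uploading and Using Data Files
To work with your own data, upload files from your local machine with files.upload() from the google.colab module, which opens a file picker in the notebook.
After your files are selected, iterate over the uploaded files to find their key names, which are the file names.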
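One way to sketch this step: inside Colab, files.upload() from google.colab returns a dictionary mapping each file name to its contents as bytes. Because that module exists only on Colab, the example below uses a small stand-in dictionary; describe_uploads and the file name example.csv are hypothetical.

```python
def describe_uploads(uploaded):
    """Return one summary line per uploaded file (name and size in bytes)."""
    return [
        'User uploaded file "{name}" with length {length} bytes'.format(
            name=name, length=len(data)
        )
        for name, data in uploaded.items()
    ]


# On Colab the dictionary would come from:
#   from google.colab import files
#   uploaded = files.upload()
# Stand-in data (assumption) so the sketch runs anywhere:
uploaded = {"example.csv": b"a,b\n1,2\n"}

for line in describe_uploads(uploaded):
    print(line)
```

The keys printed here are the names you will use to look up each file's contents later.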
Now, load the contents of the file into a Pandas DataFrame. For a CSV file, decode the uploaded bytes and pass them to read_csv() through an in-memory buffer.
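A minimal sketch of that loading step, assuming the uploaded file is UTF-8 CSV text. The uploaded dictionary, the file name example.csv, and its column names are stand-ins for whatever files.upload() returned in your session.

```python
import io

import pandas as pd

# Stand-in for the dictionary returned by files.upload() on Colab (assumption).
uploaded = {"example.csv": b"length,width\n5.1,3.5\n4.9,3.0\n"}

# Decode the raw bytes and let pandas parse them from an in-memory text buffer.
csv_text = uploaded["example.csv"].decode("utf-8")
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())
```

If your file has no header row, read_csv() accepts header=None together with a names list to supply column names yourself.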
There you go. There are other ways to upload and use data files, but I find this one the most straightforward and simple.
Google Colab has me excited to try machine learning much as I would with Jupyter notebooks, but with less setup and administration. That's the idea, anyway; we'll see how it plays out.
If you have any helpful Colab tips or tricks, leave them in the comments below.