Appendix B — Introduction to Colab

Author

phonchi

Published

February 17, 2023


B.1 Setup

You can first look up the resources available to your runtime:

import multiprocessing
cores = multiprocessing.cpu_count() # Count the number of cores in a computer
cores
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
import sys
# Is this notebook running on Colab or Kaggle?
IS_COLAB = "google.colab" in sys.modules
IS_KAGGLE = "kaggle_secrets" in sys.modules

gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  if IS_COLAB:
    print("Go to Runtime > Change runtime type and select a GPU hardware accelerator.")
  if IS_KAGGLE:
    print("Go to Settings > Accelerator and select GPU.")
else:
  from tensorflow.python.client import device_lib 
  print(device_lib.list_local_devices())
!nvidia-smi -L

In terms of performance, V100 > P100 > T4 > K80, but on the free tier of Colab you will get a K80 or T4 most of the time.
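The GPU check above relies on IPython's ! syntax, so it only works inside a notebook. As a minimal sketch, the same lookup can be written in plain Python with subprocess; the helper name list_gpus is hypothetical, and the code assumes that nvidia-smi -L prints one line per device starting with "GPU ".

```python
import shutil
import subprocess


def list_gpus():
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none."""
    if shutil.which("nvidia-smi") is None:
        return []  # the NVIDIA tools are not installed on this machine
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return []  # the tool exists but could not talk to a GPU
    # Assumption: device lines look like "GPU 0: <model> (UUID: ...)".
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


gpus = list_gpus()
print(gpus if gpus else "No GPU detected; select a GPU runtime first.")
```

Because the function returns an empty list instead of raising, it can be called safely on a CPU-only runtime.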

B.2 Cells

A notebook is a list of cells. Cells contain either explanatory text or executable code and its output. Click a cell to select it.

B.2.1 Code cells

Below is a code cell. Once the toolbar button indicates CONNECTED, click in the cell to select it and execute the contents in the following ways:

  • Click the Play icon in the left gutter of the cell;
  • Type Cmd/Ctrl+Enter to run the cell in place;
  • Type Shift+Enter to run the cell and move focus to the next cell (adding one if none exists); or
  • Type Alt+Enter to run the cell and insert a new code cell immediately below it.

There are additional options for running some or all cells in the Runtime menu.

a = 10
a

B.2.2 Text cells

This is a text cell. You can double-click to edit this cell. Text cells use Markdown syntax. To learn more, see the Markdown guide.

You can also add math to text cells using LaTeX to be rendered by MathJax. Just place the statement within a pair of $ signs. For example, $\sqrt{3x-1}+(1+x)^2$ becomes \(\sqrt{3x-1}+(1+x)^2\).

Markdown tables, such as those produced by an online table generator, also work here.

B.2.3 Adding and moving cells

You can add new cells by using the + CODE and + TEXT buttons that show when you hover between cells. These buttons are also in the toolbar above the notebook where they can be used to add a cell below the currently selected cell.

You can move a cell by selecting it and clicking Cell Up or Cell Down in the top toolbar.

Consecutive cells can be selected by “lasso selection” by dragging from outside one cell and through the group. Non-adjacent cells can be selected concurrently by clicking one and then holding down Ctrl while clicking another. Similarly, using Shift instead of Ctrl will select all intermediate cells.

B.3 Working with Bash

!pip install colab-xterm -qq
%load_ext colabxterm
%xterm

B.4 Working with Python

Colaboratory is built on top of Jupyter Notebook. Below are some examples of convenience functions provided.

Long-running Python processes can be interrupted. Run the following cell and select Runtime -> Interrupt execution (hotkey: Cmd/Ctrl-M I) to stop execution.

import time
print("Sleeping")
time.sleep(90) # sleep for a while; interrupt me!
print("Done Sleeping")
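Interrupting a cell raises KeyboardInterrupt inside the running code, so a long job can catch it and clean up. The following is a minimal sketch of that pattern; the function name sleep_with_cleanup and its step parameter are hypothetical.

```python
import time


def sleep_with_cleanup(seconds, step=0.1):
    """Sleep in short steps and report whether the wait was interrupted."""
    try:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            time.sleep(step)
        return "done"
    except KeyboardInterrupt:
        # Runtime -> Interrupt execution ends up here; release resources
        # (close files, save partial results) before returning.
        return "interrupted"


status = sleep_with_cleanup(0.3)
print(status)
```

Run with a longer duration and interrupt the cell to see the other branch.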

B.4.1 System aliases

Jupyter includes shortcuts for common operations, such as ls:

%ls /bin

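! calls out to a shell (in a new process), while % affects the process associated with the notebook. As a result, a directory change made with !cd is lost as soon as the command finishes, whereas one made with %cd persists: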

!cd sample_data
%cd sample_data
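The same distinction can be reproduced in plain Python, which may make it easier to see why !cd has no lasting effect. This is only a sketch: a temporary directory stands in for sample_data, subprocess plays the role of !, and os.chdir plays the role of %cd.

```python
import os
import subprocess
import tempfile

start = os.getcwd()
target = tempfile.mkdtemp()  # hypothetical stand-in for sample_data

# Like `!cd`: the cd runs in a child shell and is forgotten when it exits.
subprocess.run(f'cd "{target}"', shell=True, check=True)
after_shell = os.getcwd()

# Like `%cd`: the notebook's own process changes its working directory.
os.chdir(target)
after_chdir = os.getcwd()

print(after_shell == start)                   # the child shell changed nothing
print(os.path.samefile(after_chdir, target))  # the process itself moved

os.chdir(start)  # restore the original working directory
```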

The %ls /bin call above probably generated a large output. You can select the cell and clear the output by either:

  1. Clicking on the clear output button (x) in the toolbar above the cell; or
  2. Right clicking the left gutter of the output area and selecting “Clear output” from the context menu.

Execute any other process using ! with string interpolation from Python variables, and note that the result can be assigned to a variable:

message = 'Colaboratory is great!'
!echo -e '{message}\n'
foo = !echo -e '{message}\n'
foo
!mkdir test
OUT_DIR = './test'
!rm -rf {OUT_DIR}
!apt-get -qq install htop
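Outside a notebook the ! syntax is not available. A rough equivalent of capturing a command's output into a variable, assuming a Unix-like system with an echo executable, is sketched below with subprocess.

```python
import subprocess

message = 'Colaboratory is great!'

# Passing the arguments as a list avoids shell quoting problems
# with the interpolated value.
result = subprocess.run(['echo', message], capture_output=True, text=True, check=True)

# Like the list assigned by `foo = !...`, keep the output as a list of lines.
lines = result.stdout.splitlines()
print(lines)
```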

B.4.2 Magics

Colaboratory shares the notion of magics from Jupyter. Magics are shorthand annotations that change how a cell’s text is executed. To learn more, see Jupyter’s magics page.

%load_ext autoreload
%autoreload 2
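With %autoreload 2, edited modules are reloaded automatically before each cell runs. The sketch below shows the manual step this replaces, using importlib.reload on a throwaway module; the module name demo_autoreload_mod and its greet function are hypothetical.

```python
import importlib
import pathlib
import sys
import tempfile

MODULE_NAME = "demo_autoreload_mod"  # hypothetical module used only for this demo

sys.dont_write_bytecode = True       # avoid picking up stale cached bytecode
sys.modules.pop(MODULE_NAME, None)   # start clean in case the cell is re-run

workdir = pathlib.Path(tempfile.mkdtemp())
module_path = workdir / f"{MODULE_NAME}.py"
module_path.write_text("def greet():\n    return 'v1'\n")

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
module = importlib.import_module(MODULE_NAME)
first = module.greet()

# Edit the file on disk; the already imported module is unchanged until reloaded.
module_path.write_text("def greet():\n    return 'version 2'\n")
stale = module.greet()
importlib.reload(module)
second = module.greet()

print(first, stale, second)
```

Without autoreload, forgetting the reload call is a common reason why edits to a helper file seem to have no effect.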

B.4.3 Automatic completions and exploring code

Colab provides automatic completions to explore attributes of Python objects, as well as to quickly view documentation strings. As an example, first run the following cell to import the numpy module.

import numpy as np
from numpy import arccos

If you now insert your cursor after np and press the period key (.), you will see the list of available completions within the np module.

np

If you type an open parenthesis after any function or class in the module, you will see a pop-up of its documentation string:

np.ndarray()
np.min??
help(np.min)

When you hover over the method name, the Open in tab link will open the documentation in a persistent pane. The View source link will navigate to the source code for the method.

B.5 Integration with Drive

Colaboratory is integrated with Google Drive. It allows you to share, comment, and collaborate on the same document with multiple people:

  • The SHARE button (top-right of the toolbar) allows you to share the notebook and control permissions set on it.

  • File->Make a Copy creates a copy of the notebook in Drive.

  • File->Save saves the file to Drive. File->Save and checkpoint pins the version so it doesn’t get deleted from the revision history.

  • File->Revision history shows the notebook’s revision history.

B.5.1 Uploading files from your local file system

files.upload returns a dictionary of the files which were uploaded. The dictionary is keyed by the file name and values are the data which were uploaded.

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Files are temporarily stored, and will be removed once you end your session.
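Because the values in the returned dictionary are raw bytes, they usually need to be decoded before use. The sketch below parses an uploaded CSV file; the dictionary is a hypothetical stand-in for the result of files.upload(), so the example also runs outside Colab.

```python
import csv
import io

# Hypothetical stand-in for the dictionary returned by files.upload().
uploaded = {'scores.csv': b'name,score\nalice,3\nbob,5\n'}

tables = {}
for filename, data in uploaded.items():
    text = data.decode('utf-8')  # the values are bytes, not str
    tables[filename] = list(csv.DictReader(io.StringIO(text)))

print(tables['scores.csv'][0]['name'])
```

Parsing from memory this way avoids depending on where the upload was written on disk.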

B.5.2 Downloading files to your local file system

files.download will invoke a browser download of the file to your local computer.

from google.colab import files

with open('example.txt', 'w') as f:
  f.write('some content')

files.download('example.txt')

B.5.3 Mounting Google Drive locally

The example below shows how to mount your Google Drive on your runtime using an authorization code, and how to write and read files there. Once executed, you will be able to see the new file (foo.txt) at https://drive.google.com/.

This only supports reading, writing, and moving files; to programmatically modify sharing settings or other metadata, you need a different mechanism, such as the Google Drive API.

Note: When using the ‘Mount Drive’ button in the file browser, no authentication codes are necessary for notebooks that have only been edited by the current user.

from google.colab import drive
drive.mount('/content/drive')
!ls /content/drive
with open('/content/drive/My Drive/foo.txt', 'w') as f:
  f.write('Hello Google Drive!')
!cat /content/drive/My\ Drive/foo.txt
#drive.flush_and_unmount()
#print('All changes made in this colab session should now be visible in Drive.')
!gdown --fuzzy https://drive.google.com/file/d/1KE8dUFWUM389SdDhGj-UJgyGtXpnsqpl/view?usp=sharing
from nsysu import hello
hello()
import sys
sys.path.append('/content/drive/MyDrive/colab_test/')
from nsysu_math import hello_math
# file available at https://drive.google.com/file/d/1KAu1yxGmR_oAcCLk4aEltWWM1eMp3FJj/view?usp=sharing
hello_math()

Remember: DO NOT store input data in your Drive and load it from there, because input/output through the mounted Drive is very slow; copy the data to the runtime's local disk (./) instead. Your output data, on the other hand, should be stored in your Google Drive so that it can be accessed next time.
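One way to follow this advice is to copy each input file from Drive to the local disk once and read the local copy afterwards. The helper below is a minimal sketch; the name stage_locally is hypothetical, and temporary folders stand in for the mounted Drive and the runtime disk so that the example runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path


def stage_locally(source, local_dir='.'):
    """Copy `source` into `local_dir` once and return the local path."""
    source = Path(source)
    local_path = Path(local_dir) / source.name
    if not local_path.exists():  # skip the slow copy on later runs
        shutil.copy2(source, local_path)
    return local_path


# Demo: temporary folders stand in for Drive and for the runtime's disk.
fake_drive = Path(tempfile.mkdtemp())
fake_local = Path(tempfile.mkdtemp())
(fake_drive / 'train.csv').write_text('x,y\n1,2\n')

local_copy = stage_locally(fake_drive / 'train.csv', fake_local)
print(local_copy.read_text())
```

In a real session, source would be a path under /content/drive and local_dir would be the working directory.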

B.5.4 Loading Public Notebooks Directly from GitHub

Colab can load public GitHub notebooks directly, with no required authorization step.

For example, consider the notebook at this address: https://github.com/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb.

The direct Colab link to this notebook is: https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb.

To generate such links in one click, you can use the Open in Colab Chrome extension.
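Comparing the two addresses above, the Colab link is the GitHub link with its host prefix swapped. A small helper can do the conversion; the function name github_to_colab is hypothetical, and the sketch assumes the same pattern holds for other public notebooks.

```python
GITHUB_PREFIX = 'https://github.com/'
COLAB_PREFIX = 'https://colab.research.google.com/github/'


def github_to_colab(url):
    """Turn a GitHub notebook URL into the matching Colab URL."""
    if not url.startswith(GITHUB_PREFIX) or not url.endswith('.ipynb'):
        raise ValueError(f'not a GitHub notebook URL: {url}')
    return COLAB_PREFIX + url[len(GITHUB_PREFIX):]


colab_url = github_to_colab(
    'https://github.com/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb'
)
print(colab_url)
```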

B.6 Run Flask or another web app

!pip install flask -qq
!pip install pyngrok -qq
from pyngrok import ngrok, conf
import getpass
print("Enter your authtoken, which can be copied from https://dashboard.ngrok.com/auth")
conf.get_default().auth_token = getpass.getpass()
# Set up a tunnel to the Flask port 8050
public_url = ngrok.connect(8050)
public_url
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello NSYSU!'

if __name__ == '__main__':
    app.run(port=8050)

B.7 Use a different version of Python/environment or language

Refer to here and condacolab for more information.

import sys
print(sys.version)

Jupyter is named after Julia, Python and R. You can change the kernel to R or Julia.

B.8 Downloading data from Kaggle

The Dogs vs. Cats dataset that we will use isn’t packaged with Keras. It was made available by Kaggle as part of a computer vision competition in late 2013, back when convnets weren’t mainstream. You can download the original dataset from www.kaggle.com/c/dogs-vs-cats/data.

But you can also use the Kaggle API. First, you need to create a Kaggle API key and download it to your local machine. Just navigate to the Kaggle website in a web browser, log in, and go to the My Account page. In your account settings, you’ll find an API section. Clicking the Create New API Token button will generate a kaggle.json key file and will download it to your machine.

# Upload the API’s key JSON file to your Colab
# session by running the following code in a notebook cell:
from google.colab import files
files.upload()

Finally, create a ~/.kaggle folder, and copy the key file to it. As a security best practice, you should also make sure that the file is only readable by the current user, yourself:

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
# You can now download the data we’re about to use:
!kaggle competitions download -c dogs-vs-cats

The first time you try to download the data, you may get a “403 Forbidden” error. That’s because you need to accept the terms associated with the dataset before you download it—you’ll have to go to www.kaggle.com/c/dogs-vs-cats/rules (while logged into your Kaggle account) and click the I Understand and Accept button. You only need to do this once.
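The key-installation step (create the folder, copy the key, restrict its permissions) can also be written in Python instead of shell commands. The sketch below assumes a POSIX file system; the function name install_kaggle_key and its home parameter are hypothetical, and the demo uses a temporary folder and a placeholder key so that it does not touch your real home directory.

```python
import os
import shutil
import tempfile
from pathlib import Path


def install_kaggle_key(key_file, home=None):
    """Copy a kaggle.json key into <home>/.kaggle, readable by the owner only."""
    home = Path(home) if home is not None else Path.home()
    kaggle_dir = home / '.kaggle'
    kaggle_dir.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    target = kaggle_dir / 'kaggle.json'
    shutil.copyfile(key_file, target)              # like cp
    os.chmod(target, 0o600)                        # like chmod 600
    return target


# Demo with a temporary "home" folder and a placeholder key file.
demo_home = Path(tempfile.mkdtemp())
demo_key = demo_home / 'downloaded.json'
demo_key.write_text('{"username": "placeholder", "key": "placeholder"}')

installed = install_kaggle_key(demo_key, home=demo_home)
print(installed)
```

Calling install_kaggle_key('kaggle.json') with no home argument would target the real ~/.kaggle folder.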

More information about Kaggle API