We introduce several useful programming tools for this course.
While you are not required to know Python, we assume that you have sufficient programming experience and are able to quickly pick up the language on your own.
Python is designed with ease of reading and writing in mind, and you should be able to comprehend the code in this section with little effort.
Numbers
x = 2
y = 3.
print(x, type(x))
print(y, type(y))
print(x + y, x - y, x * y, x / y, x//y)
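As a small supplementary sketch (the values are illustrative, not from the notes above), the following contrasts true division with floor division and shows the remainder and power operators.

```python
x = 7
y = 2

true_div = x / y     # true division always returns a float
floor_div = x // y   # floor division rounds down towards negative infinity
remainder = x % y    # remainder after floor division
power = x ** y       # exponentiation

print(true_div, floor_div, remainder, power)
print(-7 // 2)       # floor division of a negative number rounds down, not towards zero
print(type(x / 1))   # dividing two ints with / still gives a float
```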
Booleans
is_wealthy = True
is_happy = True
is_wealthy & is_happy
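Besides `&`, which is a bitwise operator that also happens to work on Booleans, Python has the keyword operators `and`, `or` and `not`. The sketch below, with illustrative values, shows them together with short-circuit evaluation.

```python
is_wealthy = True
is_happy = False

both = is_wealthy and is_happy    # logical AND
either = is_wealthy or is_happy   # logical OR
not_happy = not is_happy          # logical NOT

print(both, either, not_happy)

# `and` short-circuits: the right operand is not evaluated when the left one is False,
# so the division by zero below is never executed.
safe = is_happy and (1 / 0 > 0)
print(safe)
```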
Strings
s1 = "hello"
s2 = "world"
print(s1 + ' ' + s2)
print('number of letters in ' + s1 + ": " + str(len(s1)))
print('number of letters in %s: %d' % (s1, len(s1)))
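The same message can also be built with `str.format` or with an f-string (available from Python 3.6). This is a brief sketch of the two alternatives, not a replacement for the `%` style shown above.

```python
s1 = "hello"

# str.format fills the {} placeholders in order
msg_format = 'number of letters in {}: {}'.format(s1, len(s1))

# f-strings evaluate the expressions inside the braces
msg_fstring = f'number of letters in {s1}: {len(s1)}'

print(msg_format)
print(msg_fstring)

# format specifiers also work inside f-strings
pi_text = f'{3.14159:.2f}'
print(pi_text)
```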
Lists
colors = ['red', 'green', 'blue']
print('We have', len(colors), 'colors:', colors)
colors.reverse()
print('Reversed list:', colors)
print('yellow appears', colors.count('yellow'), 'times in the list')
print('green is at position', colors.index('green'))
del colors[1]
print('List after deleting item 1:', colors)
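A few more list operations come up frequently: appending, slicing, negative indexing, list comprehensions and `enumerate`. The sketch below starts from a fresh list so that it can be run on its own.

```python
colors = ['red', 'green', 'blue']

colors.append('yellow')             # add an item at the end
first_two = colors[:2]              # slicing returns a new list
last = colors[-1]                   # negative indices count from the end
lengths = [len(c) for c in colors]  # list comprehension

print(colors)
print(first_two, last)
print(lengths)

for i, c in enumerate(colors):      # iterate over positions and items together
    print(i, c)
```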
Dictionaries
contacts = {'Taylor Swift': '0711112222', 'George Bush': '0711113333'}
type(contacts)
print('The complete contacts:', contacts)
print('Contact names:', contacts.keys())
print("Taylor Swift's phone number:", contacts['Taylor Swift'])
print("Alex Taylor is a contact:", ('Alex Taylor' in contacts))
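The following sketch shows how to add an entry, look up a key that may be missing, and iterate over a dictionary. The extra contact and its number are made up for illustration.

```python
contacts = {'Taylor Swift': '0711112222', 'George Bush': '0711113333'}

contacts['Alex Taylor'] = '0711114444'      # add (or overwrite) an entry; hypothetical number
missing = contacts.get('Jane Doe')          # .get returns None instead of raising KeyError
fallback = contacts.get('Jane Doe', 'n/a')  # ... or a default value of your choice

for name, number in contacts.items():       # iterate over key-value pairs
    print(name, '->', number)

print(missing, fallback, len(contacts))
```

Indexing with `contacts['Jane Doe']` would raise a KeyError here, which is why `.get` is convenient when a key may be absent.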
Resources
The following are some useful resources that we encourage you to study on your own if you need to brush up on your Python skills.
To start learning and using Colab, open the URL https://colab.research.google.com/ in your browser. The webpage already contains an introduction to Colab, with links to many additional resources. Log in to your Google account to start writing and executing code using Colab.
We provide a quick start guide on how to create your own notebook starting from scratch below.
Click File on the menu bar on the top left corner, then click New Notebook. You should see an empty notebook, which is simply a document that can contain code and its documentation.
To add a code cell, click + Code at the top left corner in your browser. A code cell will be created below your current cursor position. Alternatively, you can move your cursor between two cells; you will then see a horizontal line with a + Code button and a + Text button in the middle. Click the + Code button and a new cell will be created between the two existing cells.
Type print("Hello World!") in a code cell. If you hover your cursor over the cell, a run button appears on the left-hand side of the cell, and you can click it to execute your code. You should see Hello World! printed below the cell. Voila! You can also use CTRL+ENTER or SHIFT+ENTER to run the code: the cursor stays in the same cell for the former, and moves down to the next cell for the latter.
To add a text cell, use the + Text button.
To upload an existing notebook, click File, then click Upload notebook, and then use the popup window to select a file to upload.
For example, you can upload a copy of this notebook to Colab and play with it.
The text cells support rich text formatting, and you can create headers, tables, italicized text etc. You can do this by entering plain text in the text cells. However, when the text cells are executed, certain symbols are interpreted as instructions to render the text in a specific way. The markdown language specifies these special symbols and their effects. See https://colab.research.google.com/notebooks/markdown_guide.ipynb for details.
For those of you who are familiar with Jupyter Notebook, note that Colab's markdown syntax does not support HTML tags, which are supported in Jupyter Notebook.
Colab gives free access to GPUs and TPUs. Compared to CPUs, GPUs and TPUs allow your deep learning programs to run in a massively parallel way, which can sometimes speed up your program by several orders of magnitude.
To use a GPU or a TPU, click Runtime on the top left corner, then click Change runtime type, then change the hardware accelerator to GPU or TPU (default is None).
If you choose GPU and want to confirm that in your code, input the following code in a code cell and execute it.
import torch
torch.cuda.is_available()
The output should be True if a GPU is active.
If you choose TPU and want to check whether that's successful, input the following code in a code cell and execute it.
import os
assert os.environ['COLAB_TPU_ADDR']
If a TPU is active, the assert statement should succeed, and nothing should be printed. Otherwise, you will see a KeyError.
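If you prefer a check that reports the result instead of raising an error, a minimal sketch is given below. It assumes that the runtime exposes the TPU address in the COLAB_TPU_ADDR environment variable, which may differ across Colab versions. The helper name tpu_address is hypothetical.

```python
import os

def tpu_address():
    """Return the TPU address if the environment variable is set, otherwise None."""
    # Assumption: Colab stores the TPU address in COLAB_TPU_ADDR.
    # os.environ.get returns None for a missing key instead of raising KeyError.
    return os.environ.get('COLAB_TPU_ADDR')

addr = tpu_address()
if addr:
    print('TPU found at', addr)
else:
    print('No TPU address found; check the runtime type.')
```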
Now that you have tried out Jupyter Notebook on Colab, we cover a few tips that will make it more convenient for you to generate a nice-looking report containing both math and code.
Follow these brief instructions to install Jupyter Notebook on your own computer: https://jupyter.org/install.html.
Jupyter Notebook supports writing math using LaTeX. For example, typing $\beta$ in a text cell gives you $\beta$.
You can type several aligned equations as shown below.
\begin{align}
E &= m c^{2}, \\
E - m c^{2} &= 0.
\end{align}
This gives you the output below. \begin{align} E &= m c^{2}, \\ E - m c^{2} &= 0. \end{align}
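As a further illustration (not taken from the notes above), fractions, sums, subscripts and accents can be combined inside the same align environment. The example below typesets the sample mean and the sample variance; the symbols $n$, $x_i$, $\bar{x}$ and $s^{2}$ are chosen here for illustration.

```latex
\begin{align}
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i, \\
s^{2} &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^{2}.
\end{align}
```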
If you haven't used LaTeX before, you are encouraged to use it when typesetting math. This cheat sheet will be very handy: http://tug.ctan.org/info/undergradmath/undergradmath.pdf.
You can display the content of a PDF/image/Word document in Jupyter Notebook using the wand package. Unfortunately, the code below can't be run on Colab due to a permission restriction, so you will need to try it on your computer.
To use wand, follow the official instructions to install it first: https://docs.wand-py.org/en/0.6.5/guide/install.html.
Basically, you will need to first install ImageMagick if you don't have it yet, and then install wand.
If you already have ImageMagick installed, you don't need to read the official instructions, but can simply run the cell below to install wand.
import sys
!{sys.executable} -m pip install wand
With wand, we can define the following utility function for embedding a PDF/image/Word document in a Jupyter Notebook.
from IPython.display import display
from wand.image import Image as WImage

def show_file(filename, pages=(0,), scale=1):
    '''
    Display selected pages from a file at a chosen scale.
    '''
    for i in pages:
        # "filename[i]" asks ImageMagick to read only page i (0-based)
        img = WImage(filename="%s[%d]" % (filename, i), resolution=100)
        img.resize(width=int(scale*img.width), height=int(scale*img.height))
        display(img)
Now we can display the first page of the file prac01.pdf located in the same directory as this notebook. Try increasing the scale to make the displayed content larger.
show_file('prac01.pdf', scale=0.1)
We can display the second and third pages using the pages argument.
show_file('prac01.pdf', pages=[1,2], scale=0.1)
We illustrate how to perform linear regression using sklearn in this section. The modules that we use include sklearn.datasets, sklearn.model_selection, sklearn.linear_model and sklearn.metrics.
The following links are useful
Linear regression on a random dataset
We create a random linear regression dataset with 1000 examples and 20 features using the make_regression function in the sklearn.datasets module.
from sklearn.datasets import make_regression
import numpy as np
n_samples, n_features = 1000, 20
rng = np.random.RandomState(0)
X, y = make_regression(n_samples, n_features, noise=0.5, random_state=rng)
X.shape, y.shape
We do a random 70-30 train-test split.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train.shape, X_test.shape
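To make the idea of a random split concrete, here is a pure-Python sketch that shuffles the example indices with a fixed seed and holds out 30% of them. It is not how sklearn implements train_test_split, and the function name simple_train_test_split is hypothetical.

```python
import random

def simple_train_test_split(items, test_size=0.3, seed=42):
    """Shuffle the indices with a fixed seed and hold out the first
    `test_size` fraction of the shuffled indices as the test set."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(items) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [items[i] for i in train_idx], [items[i] for i in test_idx]

data = list(range(10))
train, test = simple_train_test_split(data)
print(len(train), len(test))
```

Fixing the seed plays the same role as random_state: the same split is reproduced on every run.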
Now we train and evaluate an ordinary least squares model.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
reg = LinearRegression().fit(X_train, y_train)
print("R2 (train) = ", reg.score(X_train, y_train))
print("MSE (train) = ", mean_squared_error(y_train, reg.predict(X_train)))
print("MSE (test) = ", mean_squared_error(y_test, reg.predict(X_test)))
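For reference, the score method of LinearRegression returns the coefficient of determination R2 = 1 - SS_res / SS_tot, and the mean squared error is the average of the squared residuals. The sketch below computes both by hand on a tiny made-up example. The function names are hypothetical and the code is meant to illustrate the definitions, not to replace the sklearn functions.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score_manual(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]   # made-up targets
y_pred = [1.0, 2.0, 3.0, 5.0]   # made-up predictions; only the last one is off by 1

mse = mean_squared_error_manual(y_true, y_pred)
r2 = r2_score_manual(y_true, y_pred)
print(mse, r2)
```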
from time import sleep
from tqdm import tqdm

# Show one progress bar per epoch, with the current "loss" displayed as a postfix
for j in range(2):
    progress = tqdm(range(10), desc='Epoch %d' % j)
    for i in progress:
        sleep(0.1)
        progress.set_postfix(loss=i)
Plotting a gallery of images
We will often work with images in this course. Sometimes we want to display a collection of images nicely on a panel, possibly with some labels. The following is a handy utility written using the matplotlib library for this purpose.
import matplotlib.pyplot as plt
# scipy.datasets needs SciPy >= 1.10 and the pooch package; older versions provide scipy.misc.face() instead
import scipy.datasets

def plot_gallery(images, titles=None, xscale=1, yscale=1, nrow=3, ncol=6, output=None):
    plt.figure(figsize=(xscale * ncol, yscale * nrow))
    # images must contain at least nrow * ncol items
    for i in range(nrow * ncol):
        plt.subplot(nrow, ncol, i + 1)
        plt.imshow(images[i])
        if titles is not None:
            # use size and y to adjust font size and position of title
            plt.title(titles[i], size=12, y=-0.2)
        # hide the axis ticks
        plt.xticks(())
        plt.yticks(())
    plt.tight_layout()
    if output is not None:
        plt.savefig(output)
    plt.show()

racoon = scipy.datasets.face()
plot_gallery([racoon, racoon, racoon], titles=['1', '2', '3'], xscale=3, yscale=3, nrow=1, ncol=3)
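Note that plot_gallery reads images[i] for every one of the nrow * ncol cells, so the list must contain at least that many images. A hypothetical helper, full_grid, sketched below picks the largest grid that a given number of images can fill completely. It is an illustration under that assumption, not part of the utility above.

```python
def full_grid(n_images, ncol=6):
    """Return (nrow, ncol) for the largest grid that n_images can fill completely."""
    if n_images < 1:
        raise ValueError('need at least one image')
    ncol = min(ncol, n_images)   # never use more columns than images
    nrow = n_images // ncol      # keep only rows that are completely filled
    return nrow, ncol

print(full_grid(3, ncol=3))
print(full_grid(10))
print(full_grid(12))
```

The result can then be passed on, for example as nrow, ncol = full_grid(len(images)) followed by plot_gallery(images, nrow=nrow, ncol=ncol). Any images beyond nrow * ncol are simply not shown.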