(a) Load the Boston house price dataset from sklearn, and construct an 80-20 train-test split.
Answer. [Write your solution here. Add cells as needed.]
(b) Use NumPy to fit a ridge regression model with $\lambda = 0.1$. Show the model parameters, and calculate the model's mean absolute error (MAE) on the training set and on the test set.
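Hint. The closed-form ridge solution can be sketched as below. This is only an illustration under stated assumptions, not the required solution: it assumes the objective $\|y - Xw - b\|^2 + \lambda \|w\|^2$ with an unpenalized intercept $b$ (your course notes may scale or define the penalty differently), it runs on synthetic stand-in data rather than the Boston dataset, and the names `fit_ridge` and `mae` are hypothetical helpers.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge fit, assuming the intercept is not penalized."""
    # Centering X and y removes the intercept from the penalized problem.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    d = X.shape[1]
    # Solve (Xc^T Xc + lam * I) w = Xc^T yc instead of forming an inverse.
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ yc)
    # Recover the intercept from the means.
    b = y_mean - x_mean @ w
    return w, b

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

# Synthetic stand-in data (hypothetical; not the Boston dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 4.0 + 0.1 * rng.normal(size=200)

# 80-20 split by shuffling the row indices.
perm = rng.permutation(200)
train, test = perm[:160], perm[160:]

w, b = fit_ridge(X[train], y[train], lam=0.1)
train_mae = mae(y[train], X[train] @ w + b)
test_mae = mae(y[test], X[test] @ w + b)
print("w =", w, "b =", b)
print("train MAE =", train_mae, "test MAE =", test_mae)
```

Whether the intercept is penalized is exactly the kind of convention that can make two ridge implementations disagree, so it is worth checking when comparing results in part (c).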
Answer. [Write your solution here. Add cells as needed.]
(c) Read the documentation of sklearn.linear_model.Ridge, and use it to fit the same ridge regression model. Do you obtain the same model parameters?
Answer. [Write your solution here. Add cells as needed.]
We will work with a faces dataset provided in the scikit-learn library, namely, the Olivetti dataset. The API for loading this dataset can be found at https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html#sklearn.datasets.fetch_olivetti_faces. You may want to use the sklearn.decomposition.PCA class to answer these questions. See https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html.
(a) Load the dataset. How many face images are there in the dataset? What is the size of each image?
Answer. [Write your solution here. Add cells as needed.]
(b) Use matplotlib.pyplot.imshow to display the first five images in the dataset.
Answer. [Write your solution here. Add cells as needed.]
(c) Find the top 5 eigenfaces for this dataset, and display them.
Answer. [Write your solution here. Add cells as needed.]
(d) Compute the pairwise dot product between the eigenfaces, and show the results as a 5x5 matrix with the (i, j)-th entry being the dot product between the i-th and j-th eigenfaces.
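Hint. The pairwise dot products form a Gram matrix, which can be computed in one matrix product. The sketch below is an illustration only: it uses synthetic random data in place of the face images and obtains the principal directions with a plain NumPy SVD of the centered data, which is an assumption made to keep the example self-contained.

```python
import numpy as np

# Synthetic stand-in: 50 hypothetical "images", each flattened to 64 pixels.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 64))

# PCA directions are the right singular vectors of the centered data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are sorted by decreasing singular value; keep the top 5.
components = Vt[:5]

# Entry (i, j) is the dot product between the i-th and j-th component.
gram = components @ components.T
print(np.round(gram, 6))
```

Because singular vectors are orthonormal, the matrix in this sketch is the identity up to floating-point error.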
Answer. [Write your solution here. Add cells as needed.]
(e) What are the variances of the projections of all the face images on each of the top 5 eigenfaces?
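Hint. One way to check an answer here is to project the centered data onto each direction and take the sample variance of the resulting coordinates. The sketch below again assumes synthetic data and a plain NumPy SVD in place of the face images, and uses the $n - 1$ denominator for the variance; with that convention the variances equal the squared singular values divided by $n - 1$.

```python
import numpy as np

# Synthetic stand-in data with unequal spread across the 64 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 64)) * np.linspace(3.0, 0.5, 64)

# Center the data and take the top 5 principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:5]

# Coordinates of every sample along each of the 5 directions: shape (50, 5).
projections = Xc @ components.T

# Sample variance (n - 1 denominator) of the projections, per direction.
proj_var = projections.var(axis=0, ddof=1)

# The same quantities computed from the singular values.
from_singular_values = S[:5] ** 2 / (X.shape[0] - 1)
print(proj_var)
print(from_singular_values)
```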
Answer. [Write your solution here. Add cells as needed.]