Lab: Supervised Machine Learning - Regression and Classification

Optional labs - W1

Lab01 - Lab02

Just follow their instructions.

Lab03

There is a Markdown syntax error in the table under the Notation paragraph. To fix it, change the malformed separator row:

```
|: ------------|: ------------------------------------------------------------||
```

to a valid one:

```
|:---:|:---:|:---:|
```

In the Tools paragraph, if you have put deeplearning.mplstyle in the same folder as your lab files but plt.style.use('./deeplearning.mplstyle') still fails, delete the ./ prefix in './deeplearning.mplstyle'. This problem does not occur on Linux.
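For example, assuming deeplearning.mplstyle sits in the notebook's working directory, the call would be:

```python
import matplotlib.pyplot as plt

# if the relative './' prefix fails (common on Windows), drop it:
plt.style.use('deeplearning.mplstyle')
```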

Lab04

In the Tools paragraph, %matplotlib widget may fail to run because ipympl is missing, even if you installed Jupyter through Anaconda; the installed ipympl version may also be incompatible with your Jupyter. To solve this, run the Anaconda Prompt as an administrator and install ipympl from conda-forge:

```sh
conda install -c conda-forge ipympl
```
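To confirm the install before re-running the notebook, a quick check in the same environment (assuming a standard ipympl install) is:

```python
# %matplotlib widget needs ipympl; this import should now succeed
import ipympl
print(ipympl.__version__)
```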

The ./deeplearning.mplstyle problem will occur again; fix it as before. In addition, you should also fix the same path in lab_utils_common.py and lab_utils_uni.py. Keep doing so in the following labs whenever you encounter ./deeplearning.mplstyle.

Lab05

You may run into an integer overflow when running plt_divergence(p_hist, J_hist, x_train, y_train). Solution:

```python
# in file lab_utils_uni.py

# line 300: add np.int64
w_array = np.arange(-70000, 70000, 1000, dtype=np.int64)
# line 301: add np.int64
cost = np.zeros_like(w_array, dtype=np.int64)
# line 319: add np.int64
z = np.zeros_like(tmp_b, dtype=np.int64)
```

Optional labs - W2

Lab01

See NumPy.

Lab02

The same bugs as in Supervised Machine Learning.Optional labs - W1.Lab03.

Lab03

The same bugs as in Supervised Machine Learning.Optional labs - W1.Lab03.

Lab04

Just follow its instructions.

Lab05 - Lab06

See Scikit-learn.

Optional labs - W3

Lab01 - Lab09_Soln

Just follow their instructions.

Lab01_user

Answer:

```python
g = 1 / (1 + np.exp(-z))
```
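A quick sanity check of the formula with toy values (not part of the lab):

```python
import numpy as np

z = np.array([-1.0, 0.0, 1.0])
g = 1 / (1 + np.exp(-z))
print(g)  # sigmoid(0) is 0.5; negative z maps below 0.5, positive z above
```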

Lab02_user

Answer:

```python
x1 = 3 - x0
```

There is also a bug in lab_utils.plot_data:

```python
# add the following two lines after neg = y == 0
pos = pos.reshape(-1,)  # work with 1D or 2D y vectors
neg = neg.reshape(-1,)
```

Lab03_user

Answer:

```python
for i in range(m):
    ### START CODE HERE ###
    g = sigmoid(X[i] @ w + b)
    cost -= y[i] * np.log(g) + (1 - y[i]) * np.log(1 - g)
    ### END CODE HERE ###
```

The @ operator performs matrix multiplication (it is equivalent to np.matmul).
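For example, with a toy 2x3 matrix and a length-3 vector (illustrative shapes only), @ behaves like np.matmul / np.dot:

```python
import numpy as np

X = np.arange(6.0).reshape(2, 3)  # shape (2, 3)
w = np.ones(3)                    # shape (3,)
print(X @ w)         # [ 3. 12.], the matrix-vector product
print(np.dot(X, w))  # equivalent result for these shapes
```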

Lab04_user

Non-vectorized answer:

```python
### START CODE HERE ###
for i in range(m):
    err = sigmoid(X[i] @ w + b) - y[i]
    for j in range(n):
        dJdw[j] += err * X[i][j]
    dJdb += err
### END CODE HERE ###
```

For each example, compute its error and apply it to each $w_j$.
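In equation form (the $\frac{1}{m}$ scaling is presumably applied outside the snippet shown), the loop accumulates:

$$\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)x_j^{(i)}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)$$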

Vectorized answer:

```python
### START CODE HERE ###
z = X @ w + b
g = sigmoid(z)
err = g - y
dJdw = 1 / m * (np.dot(X.T, err))
dJdb = 1 / m * (np.sum(err))
### END CODE HERE ###
```

This computes the errors of all examples simultaneously.
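A shape check on toy data (the arrays below are mine, not the lab's) shows how the vectorized form works:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

m, n = 4, 2
X = np.random.rand(m, n)        # (m, n)
y = np.array([0, 1, 0, 1])      # (m,)
w, b = np.zeros(n), 0.0

err = sigmoid(X @ w + b) - y    # (m,) one error per example
dJdw = X.T @ err / m            # (n,) one partial derivative per feature
dJdb = err.sum() / m            # scalar
print(dJdw.shape, dJdb)
```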

Lab05_user

See Scikit-learn.

Lab06_user

This lab implements multiclass classification using multiple binary classification models, i.e. the One-vs-All algorithm. Its core idea is simple: for an example whose label $y$ can take $n$ possible values, represent the label as a vector of $n$ binary elements, exactly one of which is 1. Then train $n$ binary classification models and take the largest of their predictions as $\widehat{y}$.
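A minimal sketch of the prediction step (the helper below is hypothetical, not the lab's API): each column of W and each entry of b belongs to one binary model, and we pick the class whose model is most confident.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def one_vs_all_predict(X, W, b):
    # W: (n_features, n_classes), b: (n_classes,) -- one binary model per class
    G = sigmoid(X @ W + b)         # (m, n_classes) probabilities
    return np.argmax(G, axis=1)    # index of the most confident model
```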

Answer:

```python
# step 1
### START CODE HERE ###
w_init = np.zeros((2, 1))
b_init = 0.
# call gradient descent
w_final, b_final, _, _ = gradient_descent(X_train, yc, w_init, b_init,
                                          compute_cost_logistic_matrix,
                                          compute_gradient_logistic_matrix,
                                          predict_logistic_matrix, 1e-2, 1000)
### END CODE HERE ###

# make prediction
### START CODE HERE ###
z_wb = X @ W + b
G = sigmoid(z_wb)
pclass = np.argmax(G, axis=1)
### END CODE HERE ###

# Second Test Case
# plot the decision boundary. Pass in our models - the w's and b's associated with each model and predict_mc
plot_mc_decision_boundary(X_train, 3, W_models, b_models, predict_mc)
plt.title("model decision boundary vs original training data")

# add the original data to the decision boundary
plot_mc_data(X_train, y_train, ["blob one", "blob two", "blob three"], legend=True)
plt.show()
```

Lab07_user

Just follow its instructions. However, there is also a bug caused by a scikit-learn update: change penalty='none' to penalty=None, as in the sketch below.
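A minimal sketch on toy data (the arrays are mine, not the lab's; requires a scikit-learn version recent enough to accept penalty=None):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X_toy = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5],
                  [3.0, 0.5], [2.0, 2.0], [1.0, 2.5]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# newer scikit-learn releases reject penalty='none'; pass penalty=None instead
lr_model = LogisticRegression(penalty=None)
lr_model.fit(X_toy, y_toy)
print(lr_model.predict(X_toy))
```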

map_feature is a very interesting function:

```python
def map_feature(X1, X2, degree):
    """
    Feature mapping function to polynomial features
    """
    X1 = np.atleast_1d(X1)
    X2 = np.atleast_1d(X2)

    out = []
    for i in range(1, degree + 1):
        for j in range(i + 1):
            out.append((X1**(i - j) * (X2**j)))

    return np.stack(out, axis=1)
```

It produces all polynomial terms of X1 and X2 up to the given degree. For example, if degree=3, the generated terms are:
$$(x_1+x_2)+(x_1^2+x_1x_2+x_2^2)+(x_1^3+x_1^2x_2+x_1x_2^2+x_2^3)$$
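A quick usage check (assuming the map_feature definition above):

```python
import numpy as np

x1 = np.array([2.0])
x2 = np.array([3.0])
mapped = map_feature(x1, x2, 3)
print(mapped.shape)  # (1, 9): x1, x2, x1^2, x1*x2, x2^2, x1^3, x1^2*x2, x1*x2^2, x2^3
print(mapped[0])     # [ 2.  3.  4.  6.  9.  8. 12. 18. 27.]
```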

Lab08_user

Answer:

```python
### START CODE HERE ###
f_wb = sigmoid(X @ w + b)  # m*1
cost += 1 / m * (np.dot(-y, np.log(f_wb)) - np.dot(1 - y, np.log(1 - f_wb))) + lambda_ / 2 * np.sum((w**2))
### END CODE HERE ###
```

Lab09_user

Answer:

```python
# Looping version
### START CODE HERE ###
for i in range(m):
    err = np.sum(sigmoid(X[i] @ w + b) - y[i])
    for j in range(n):
        dJdw[j] += err * X[i][j]
    dJdb += err
### END CODE HERE ###

# Vectorized version
### START CODE HERE ###
f_wb = sigmoid(X @ w + b)  # m*1
err = f_wb - y  # m*1
dJdw = 1 / m * np.dot(X.T, err) + lambda_ * w
dJdb = np.sum(err) / m
### END CODE HERE ###
```

PracticeLab01

This lab requires us to implement the compute_cost and compute_gradient functions of a linear regression model with a single feature. It is quite simple:

```python
# compute_cost
### START CODE HERE ###
cost = 1 / (2 * m) * (w * x + b - y)**2  # m*1
total_cost = np.sum(cost)
### END CODE HERE ###

# compute_gradient
### START CODE HERE ###
f_wb = w * x + b  # m*1
err = f_wb - y  # m*1
dj_dw = err.T @ x / m
dj_db = np.sum(err) / m
### END CODE HERE ###
```
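For reference, the quantities implemented above are:

$$J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\left(w x^{(i)} + b - y^{(i)}\right)^2$$

$$\frac{\partial J}{\partial w} = \frac{1}{m}\sum_{i=1}^{m}\left(w x^{(i)} + b - y^{(i)}\right)x^{(i)}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(w x^{(i)} + b - y^{(i)}\right)$$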

PracticeLab02

This lab requires us to implement a logistic regression model with regularization. It is a bit more complicated than PracticeLab01, but it's still easy to finish.

```python
# sigmoid
### START CODE HERE ###
g = 1 / (1 + np.exp(-z))
### END SOLUTION ###

# compute_cost
### START CODE HERE ###
f_wb = sigmoid(X @ w + b)  # m*1
total_cost = 1 / m * (np.dot(-y.T, np.log(f_wb)) - np.dot(1 - y.T, np.log(1 - f_wb)))
### END CODE HERE ###

# compute_gradient
### START CODE HERE ###
f_wb = sigmoid(X @ w + b)
err = f_wb - y
dj_dw = 1 / m * (X.T @ err)
dj_db = 1 / m * np.sum(err)
### END CODE HERE ###

# predict
### START CODE HERE ###
y_pred = sigmoid(X @ w + b)
p = 0 + (y_pred >= 0.5)  # convert boolean predictions to 0/1
### END CODE HERE ###

# compute_cost_reg
### START CODE HERE ###
reg_cost = np.sum(w**2)
### END CODE HERE ###

# compute_gradient_reg
### START CODE HERE ###
dj_dw += lambda_ / m * w
### END CODE HERE ###
```
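For reference, assuming the lab scales reg_cost by $\frac{\lambda}{2m}$ outside the shown snippet, the regularized cost and gradient being implemented are:

$$J(\mathbf{w},b) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log f_{\mathbf{w},b}(\mathbf{x}^{(i)}) + \left(1-y^{(i)}\right)\log\left(1-f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}w_j^{2}$$

$$\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)})-y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}w_j, \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)})-y^{(i)}\right)$$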