Lab: Unsupervised Learning, Recommenders, Reinforcement Learning

C3_W1_PracticeLab1

To finish this lab and make the code run efficiently, we need a good grasp of NumPy slicing and broadcasting so that we can write vectorized code.
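As a quick refresher (a toy example, not part of the lab code): broadcasting lets us subtract a single point from a whole matrix of centroids without an explicit loop.

import numpy as np

centroids = np.arange(6).reshape(3, 2)   # shape (3, 2): three centroids in 2-D
x = np.array([1.0, 1.0])                 # shape (2,): a single point

diff = centroids - x                     # x is broadcast across the rows -> shape (3, 2)
dist = np.sum(diff**2, axis=1)           # squared distance to each centroid -> shape (3,)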

# Exercise 1: for each example, find the index of the closest centroid
m = X.shape[0]
# You need to return the following variables correctly
idx = np.zeros(X.shape[0], dtype=int)
### START CODE HERE ###
for i in range(m):
    d = np.sum((centroids - X[i])**2, axis=1)  # squared distance from point i to every centroid
    idx[i] = np.argmin(d)                      # pick the closest centroid
### END CODE HERE ###

# Exercise 2: recompute each centroid as the mean of the points assigned to it
centroids = np.zeros((K, n))
### START CODE HERE ###
for i in range(K):
    p_i = idx == i                        # boolean mask of the points assigned to centroid i
    x_i = X[p_i]                          # slice out those points
    centroids[i] = np.mean(x_i, axis=0)   # column-wise mean
### END CODE HERE ###
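If you want to avoid the Python loop in Exercise 1 entirely, a fully broadcast version is also possible (a sketch, not required by the grader; variable names match the lab):

# Sketch: vectorized closest-centroid assignment via broadcasting
# X has shape (m, n), centroids has shape (K, n)
diff = X[:, None, :] - centroids[None, :, :]   # shape (m, K, n)
dist = np.sum(diff**2, axis=2)                 # shape (m, K): squared distance to every centroid
idx = np.argmin(dist, axis=1)                  # shape (m,): closest centroid per point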

axis is the dimension along which the operation is applied; the other dimensions are kept as they are. Therefore, for a matrix A, np.sum(A, axis=1) computes the sum of each row (the first dimension doesn't change).
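For instance (a toy example, not from the lab):

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # shape (2, 3)
np.sum(A, axis=0)           # collapses axis 0 (sum of each column) -> array([5, 7, 9]), shape (3,)
np.sum(A, axis=1)           # collapses axis 1 (sum of each row)    -> array([ 6, 15]), shape (2,)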

See the NumPy documentation to learn more about it.

C3_W1_PracticeLab2

This lab is quite easy. However, there are a few points to notice:

  • NumPy slicing and broadcasting;
  • The divide-by-zero problem (handled by the if guards in Exercise 2 below).
# Exercise 1: estimate the Gaussian parameters for each feature
### START CODE HERE ###
mu = np.mean(X, axis=0)              # per-feature mean
var = np.mean((X - mu)**2, axis=0)   # per-feature variance
### END CODE HERE ###

# Exercise 2: select the threshold epsilon with the best F1 score
best_epsilon = 0
best_F1 = 0
prec = 0.
rec = 0.
F1 = 0

step_size = (max(p_val) - min(p_val)) / 1000

for epsilon in np.arange(min(p_val), max(p_val), step_size):

    ### START CODE HERE ###
    actual_pos_num = np.sum(y_val) + 0.    # number of actual positives (as a float)
    pred_pos = (p_val < epsilon) + 0       # predict anomaly when p(x) < epsilon; booleans become 0/1
    pred_pos_num = np.sum(pred_pos) + 0.   # number of predicted positives
    tp = np.sum(y_val[pred_pos == 1])      # number of true positives
    if pred_pos_num != 0:                  # guard against division by zero
        prec = tp / pred_pos_num
    if actual_pos_num != 0:
        rec = tp / actual_pos_num
    if prec != 0 and rec != 0:
        F1 = 2 * prec * rec / (prec + rec)
    ### END CODE HERE ###
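An equivalent way to count true/false positives is with boolean masks directly (a sketch using the same variable names, not the lab's reference solution):

# Sketch: precision and recall from boolean masks
predictions = p_val < epsilon
tp = np.sum((predictions == 1) & (y_val == 1))   # true positives
fp = np.sum((predictions == 1) & (y_val == 0))   # false positives
fn = np.sum((predictions == 0) & (y_val == 1))   # false negatives
prec = tp / (tp + fp) if (tp + fp) > 0 else 0.
rec = tp / (tp + fn) if (tp + fn) > 0 else 0.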

C3_W2_PracticeLab1

This lab is about collaborative filtering. We only need to implement the cost function; the other parts are almost the same as linear regression.
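For reference, the regularized cost that the snippet below implements (as presented in the course, summing only over pairs where a rating exists, i.e. r(i,j) = 1) is:

$$J = \frac{1}{2}\sum_{(i,j):\,r(i,j)=1}\left(\mathbf{w}^{(j)}\cdot\mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2}\sum_{j}\left\lVert\mathbf{w}^{(j)}\right\rVert^2 + \frac{\lambda}{2}\sum_{i}\left\lVert\mathbf{x}^{(i)}\right\rVert^2$$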

# Vectorized
### START CODE HERE ###
reg = lambda_ / 2 * (np.sum(W**2) + np.sum(X**2))   # regularization term
err = (X @ W.T + b - Y)**2                          # squared error for every (movie, user) pair
J = np.sum(err[R==1]) / 2 + reg                     # only count pairs that were actually rated
### END CODE HERE ###
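For comparison, here is a non-vectorized version of the same cost (a sketch following the course notation; it assumes b has shape (1, num_users) as in the lab, and is much slower but easier to read):

# Sketch: loop-based collaborative-filtering cost (same result as the vectorized code above)
nm, nu = Y.shape                      # number of movies, number of users
J = 0
for j in range(nu):
    w = W[j, :]
    b_j = b[0, j]
    for i in range(nm):
        x = X[i, :]
        y = Y[i, j]
        r = R[i, j]
        J += r * np.square(np.dot(w, x) + b_j - y)   # only counts rated pairs (r is 0 or 1)
J = J / 2
J += (lambda_ / 2) * (np.sum(np.square(W)) + np.sum(np.square(X)))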

C3_W4_PracticeLab

Install the required libraries (installing Anaconda first is recommended):

pip install gym==0.25.1
pip install pyvirtualdisplay
conda install swig # or pip install swig
conda install -c conda-forge gym-box2d
pip install imageio[ffmpeg]
pip install imageio[pyav]
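To check that everything was installed correctly, a quick smoke test like the one below should run without errors (a minimal sketch; LunarLander-v2 is the environment used in this lab, with 8 state variables and 4 actions, matching the Q-network defined further down):

import gym

env = gym.make("LunarLander-v2")
state = env.reset()
print(env.observation_space.shape, env.action_space.n)   # expected: (8,) 4
env.close()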

Other configurations:

import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'   # work around the duplicate OpenMP runtime error
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'      # silence TensorFlow info/warning logs
import warnings
warnings.filterwarnings("ignore", category=Warning)   # ignore some warnings

If Display(visible=0, size=(840, 480)).start() still raises an error, you can simply comment that line out.
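Alternatively (a small sketch, assuming pyvirtualdisplay provides Display as in the lab's import cell), you can keep the line but make it optional:

# Sketch: start the virtual display only if it works on this machine
try:
    from pyvirtualdisplay import Display
    Display(visible=0, size=(840, 480)).start()
except Exception:
    pass   # the virtual display is only needed for rendering videos; training works without it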

# Exercise 1
# Create the Q-Network
q_network = Sequential([
    ### START CODE HERE ###
    Dense(units=64, activation='relu', input_dim=8),   # 8 state variables as input
    Dense(units=64, activation='relu'),
    Dense(units=4, activation='linear')                # one Q-value per action
    ### END CODE HERE ###
])

# Create the target Q^-Network (same architecture as the Q-Network)
target_q_network = Sequential([
    ### START CODE HERE ###
    Dense(units=64, activation='relu', input_dim=8),
    Dense(units=64, activation='relu'),
    Dense(units=4, activation='linear')
    ### END CODE HERE ###
])

### START CODE HERE ###
optimizer = Adam(learning_rate=ALPHA)
### END CODE HERE ###

# Exercise 2
### START CODE HERE ###
# Bellman target: y = R if the episode terminates, otherwise y = R + gamma * max_a' Q^(s', a')
y_targets = rewards + (1 - done_vals) * gamma * max_qsa
### END CODE HERE ###

### START CODE HERE ###
loss = MSE(q_values, y_targets)   # mean squared error between current Q-values and the targets
### END CODE HERE ###
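The graded lines above rely on max_qsa and q_values, which the lab computes for you inside compute_loss. Roughly, that plumbing looks like the sketch below (my own reconstruction with TensorFlow, not the lab's exact code; states, actions, next_states come from the sampled experience batch):

import tensorflow as tf

# Sketch: pieces surrounding the graded lines in compute_loss
max_qsa = tf.reduce_max(target_q_network(next_states), axis=-1)   # max_a' Q^(s', a') from the target network

q_all = q_network(states)                                         # Q(s, a) for every action, shape (batch, 4)
indices = tf.stack([tf.range(tf.shape(q_all)[0]),
                    tf.cast(actions, tf.int32)], axis=1)          # (row, action) index pairs
q_values = tf.gather_nd(q_all, indices)                           # Q-value of the action actually taken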