Exercise 1 - Co-occurrence Matrix
Write a co-occurrence matrix for the following sentence:
“A bird in the hand is worth two in the bush.”
Count each word only if it appears directly after the reference word. Is the co-occurrence matrix unique?
Solution
The solution is not unique because we can change the order of the reference entries. Here, we follow the order in which the words first appear in the sentence, resulting in the following co-occurrence matrix:
Reference | A | bird | in | the | hand | is | worth | two | bush |
---|---|---|---|---|---|---|---|---|---|
A | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
bird | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
in | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 |
the | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
hand | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
is | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
worth | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
two | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
bush | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
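To double-check the counts (including the "two → in" entry from "two in the bush"), here is a minimal Python sketch; whitespace tokenization, lowercasing, and stripping the final period are assumptions:

```python
from collections import defaultdict

sentence = "A bird in the hand is worth two in the bush."
tokens = sentence.rstrip(".").lower().split()

# Fix the reference ordering to first appearance, matching the table above.
vocab = list(dict.fromkeys(tokens))

# Count word pairs where the second word directly follows the first.
counts = defaultdict(int)
for ref, nxt in zip(tokens, tokens[1:]):
    counts[(ref, nxt)] += 1

for ref in vocab:
    print(f"{ref:>6}", [counts[(ref, col)] for col in vocab])
```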
Exercise 2 - Convolutional Layers
Consider the following \(4\times 4 \times 1\) input \(X\) and a \(2\times 2 \times 1\) convolutional kernel \(K\) with no bias term:
\[ X = \bpmat 1 & 0 & -2 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ -3 & 4 & 0 & 0 \epmat, \qquad K = \bpmat 2 & 1 \\ 0 & 1 \epmat \]
- What is the output of the convolutional layer for the case of stride 1 and no padding?
- What if we have stride 2 and no padding?
- What if we have stride 2 and zero-padding of size 1?
Solution
- Here, we simply apply the convolutional kernel over each \(2\times 2\) patch of the input. There are 9 such patches. The output \(Y\) is then
\[ Y = \bpmat 3 & -1 & -3 \\ 2 & 3 & 3 \\ 5 & 2 & 1 \epmat \]
- Same idea except that we skip every other patch resulting in only 4 patches. The output \(Y\) is then
\[ Y = \bpmat 3 & -3 \\ 5 & 1 \epmat \]
- Now, we have added zeros on each side of the input. The resulting \(6\times 6\) padded input \(X_\mathrm{padded}\) and corresponding output \(Y\) are
\[ X_\mathrm{padded} = \bpmat 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0\\ 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 0\\ 0 & -3 & 4 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \epmat, \qquad Y = \bpmat 1 & -2 & 0 \\ 0 & 3 & 0 \\ -3 & 8 & 0 \epmat \]
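All three cases can be verified with a short NumPy sketch. Note that, following the usual deep-learning convention, this computes a cross-correlation (the kernel is not flipped):

```python
import numpy as np

X = np.array([[ 1, 0, -2, 1],
              [ 0, 1,  1, 0],
              [ 0, 1,  0, 1],
              [-3, 4,  0, 0]])
K = np.array([[2, 1],
              [0, 1]])

def conv2d(X, K, stride=1, padding=0):
    # Zero-pad the input on all sides, then slide the kernel over it.
    if padding:
        X = np.pad(X, padding)
    kh, kw = K.shape
    h = (X.shape[0] - kh) // stride + 1
    w = (X.shape[1] - kw) // stride + 1
    Y = np.zeros((h, w), dtype=X.dtype)
    for i in range(h):
        for j in range(w):
            patch = X[i * stride : i * stride + kh, j * stride : j * stride + kw]
            Y[i, j] = (patch * K).sum()  # elementwise product, no kernel flip
    return Y

print(conv2d(X, K, stride=1))             # [[ 3 -1 -3] [ 2  3  3] [ 5  2  1]]
print(conv2d(X, K, stride=2))             # [[ 3 -3] [ 5  1]]
print(conv2d(X, K, stride=2, padding=1))  # [[ 1 -2  0] [ 0  3  0] [-3  8  0]]
```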
Exercise 3 - Sizes in MLPs Refresher
You are given an MLP with ReLU activations. It has 3 layers consisting of 5, 10, and 5 neurons respectively. The input is a vector of size 10. How many parameters does this network have?
Solution
The number of parameters for each neuron is the number of weights plus one for the bias term. The number of weights corresponds to the number of inputs / activations from the previous layer. So for the first layer, we have 10 inputs and thus 11 parameters per neuron, resulting in 55 parameters for this layer.
A similar computation gives 60 and 55 as the number of parameters for the next two layers. Thus, the network has a total of 170 parameters.
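As a quick sanity check, the per-layer counts follow from neurons × (inputs + 1):

```python
# Input size followed by the three layer widths.
sizes = [10, 5, 10, 5]

# Each layer: out_neurons * (in_neurons + 1), the +1 covering the bias.
per_layer = [n_out * (n_in + 1) for n_in, n_out in zip(sizes, sizes[1:])]
print(per_layer, sum(per_layer))  # [55, 60, 55] 170
```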
Exercise 4 - Sizes in CNNs
You are given a neural network with the following architecture:
Input: 100 x 100 x 3 Image
Layers:
1. Conv(in_channels=3, out_channels=5, kernel_size=3, stride=1, padding=0)
2. MaxPool2d(kernel_size=2, stride=2, padding=0)
3. Conv(in_channels=5, out_channels=10, kernel_size=3, stride=1, padding=0)
4. MaxPool2d(kernel_size=2, stride=2, padding=0)
5. Conv(in_channels=10, out_channels=5, kernel_size=3, stride=1, padding=0)
6. Flatten()
7. MLP(neurons=20)
8. MLP(neurons=10)
What is the dimensionality of the activations after each layer?
How many parameters does this network have?
Solution
- The output dimensions after each layer are:
1. 98 x 98 x 5
2. 49 x 49 x 5
3. 47 x 47 x 10
4. 23 x 23 x 10
5. 21 x 21 x 5
6. 2205 (the flattened 21 x 21 x 5 volume)
7. 20
8. 10
- Using out_channels x (3 x 3 x in_channels + 1) parameters per convolutional layer and neurons x (inputs + 1) per MLP layer, the number of parameters for each layer is:
1. 140
2. 0
3. 460
4. 0
5. 455
6. 0
7. 44120
8. 210
Summing these gives a total of 140 + 460 + 455 + 44120 + 210 = 45385 parameters.
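Both the shapes and the parameter count can be verified in PyTorch, assuming the two MLP layers are plain fully connected (nn.Linear) layers. Note that PyTorch reports activations channels-first, so e.g. 98 x 98 x 5 appears as (5, 98, 98):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3, stride=1, padding=0),
    nn.MaxPool2d(kernel_size=2, stride=2, padding=0),
    nn.Conv2d(in_channels=5, out_channels=10, kernel_size=3, stride=1, padding=0),
    nn.MaxPool2d(kernel_size=2, stride=2, padding=0),
    nn.Conv2d(in_channels=10, out_channels=5, kernel_size=3, stride=1, padding=0),
    nn.Flatten(),
    nn.Linear(21 * 21 * 5, 20),  # MLP(neurons=20), assumed fully connected
    nn.Linear(20, 10),           # MLP(neurons=10), assumed fully connected
)

# Trace a dummy input through the network, printing each activation shape.
x = torch.zeros(1, 3, 100, 100)
for layer in model:
    x = layer(x)
    print(layer.__class__.__name__, tuple(x.shape[1:]))

print(sum(p.numel() for p in model.parameters()))  # 45385
```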