This week's exercises revisit some earlier material so you can check your learning and understanding.
Exercise 1 - Maximum Likelihood Estimator
Assume you are given datapoints \((x_i)_{i=1}^N\) with \(x_i \ge 0\) drawn from an exponential distribution. The probability density function of an exponential distribution with parameter \(\la > 0\) is given by \(f(x) = \la \exp(-\la x)\) for \(x \ge 0\). Derive the maximum likelihood estimator of the parameter \(\la\).
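If you would like to sanity-check your derivation numerically, here is a minimal sketch (assuming NumPy; the true rate and sample size are illustrative choices, not part of the exercise) that maximizes the exponential log-likelihood over a grid of candidate values of \(\la\); the maximizer should be close to your closed-form estimator evaluated on the same sample.

```python
import numpy as np

# Illustrative choices of the true rate and sample size (not from the exercise).
true_lam, N = 2.5, 10_000
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0 / true_lam, size=N)  # NumPy parametrizes by scale = 1/lambda

# Exponential log-likelihood evaluated on a grid of candidate rates:
#   log L(lambda) = N * log(lambda) - lambda * sum_i x_i
grid = np.linspace(0.01, 10.0, 100_000)
loglik = N * np.log(grid) - grid * x.sum()

lam_hat = grid[np.argmax(loglik)]
print(f"numerical MLE on the grid: {lam_hat:.3f}")
```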
Exercise 2 - Convolutional Layers
Consider the following \(4\times 4 \times 1\) input \(X\) and \(2\times 2 \times 1\) convolutional kernel \(K\), with no bias term:
\[ X = \bpmat 1 & 2 & -1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 2 \\ 2 & 1 & 0 & -1 \epmat, \qquad K = \bpmat 1 & 0 \\ 2 & 1 \epmat \]
What is the output of the convolutional layer for the case of stride 1 and no padding?
What if we have stride 2 and no padding?
What if we have stride 2 and zero-padding of size 1?
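To check your hand-computed outputs for all three cases, you can use a sketch along the following lines with torch.nn.functional.conv2d (note that conv2d computes a cross-correlation, i.e. the kernel is not flipped, which is the usual convention for convolutional layers and presumably the one intended here):

```python
import torch
import torch.nn.functional as F

# Input and kernel from the exercise, reshaped to (batch, channels, height, width)
# and (out_channels, in_channels, kH, kW) as expected by conv2d.
X = torch.tensor([[1., 2., -1., 1.],
                  [1., 0.,  1., 0.],
                  [0., 1.,  0., 2.],
                  [2., 1.,  0., -1.]]).reshape(1, 1, 4, 4)
K = torch.tensor([[1., 0.],
                  [2., 1.]]).reshape(1, 1, 2, 2)

print(F.conv2d(X, K, stride=1, padding=0))  # stride 1, no padding
print(F.conv2d(X, K, stride=2, padding=0))  # stride 2, no padding
print(F.conv2d(X, K, stride=2, padding=1))  # stride 2, zero-padding of size 1
```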
Exercise 3 - Computational Parameter Counting
Use PyTorch (via torchvision.models) to load the vgg11 model and automatically compute its number of parameters. Output the number of parameters for each layer and the total number of parameters in the model.
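One possible sketch, assuming torchvision is installed (newer versions accept weights=None to skip downloading pretrained weights; older versions use pretrained=False instead):

```python
from torchvision.models import vgg11

# Load the architecture only; no pretrained weights are needed for counting.
model = vgg11(weights=None)

total = 0
for name, param in model.named_parameters():
    n = param.numel()  # number of scalar entries in this parameter tensor
    total += n
    print(f"{name:45s} {str(tuple(param.shape)):25s} {n:>12,d}")

print(f"\ntotal number of parameters: {total:,d}")
```

Note that named_parameters() reports weights and biases as separate tensors, so you may want to group them by layer when presenting the per-layer counts.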
Exercise 4 - Influence Functions
Let \(\hte\) and \(\hte(\ve)\) be as defined in class. Show that the first-order Taylor expansion of \(\hte(\ve)\) around \(\ve=0\) is given by the equation from class, i.e. \[ \hte(\ve) \approx \hte + \ve \, \frac{d\hte(\ve)}{d\ve} \Bigr|_{\ve=0} . \]
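As a hint (a sketch under the standard assumption that \(\hte(0) = \hte\), which should follow from the definition given in class), recall the generic first-order Taylor expansion of a smooth map \(g\) around \(\ve = 0\), \[ g(\ve) = g(0) + \ve \, \frac{d g(\ve)}{d \ve} \Bigr|_{\ve=0} + O(\ve^2), \] and consider what it becomes when \(g = \hte(\cdot)\) and the \(O(\ve^2)\) term is dropped.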