Report
Topic 1
How effective are early stopping methods at reducing overfitting?

In this topic, I will investigate the effectiveness of early stopping at reducing overfitting.

I implement early stopping as follows. In the train method of the Optimiser class, the validation error is computed at a fixed interval; if it does not decrease, the training process ends. A sketch of this logic is shown below.
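
The following is a minimal sketch of that check, assuming a generic training loop with hypothetical train_epoch and eval_valid_error helpers; the actual Optimiser.train method in the framework differs in detail.

    import numpy as np

    def train_with_early_stopping(train_epoch, eval_valid_error,
                                  num_epochs=100, check_interval=5):
        """Train, stopping once the validation error stops decreasing."""
        best_valid_error = np.inf
        for epoch in range(1, num_epochs + 1):
            train_epoch()  # one pass over the training set
            # Only evaluate the validation error at a fixed interval.
            if epoch % check_interval == 0:
                valid_error = eval_valid_error()
                if valid_error >= best_valid_error:
                    # Validation error did not decrease: stop training here.
                    print('Early stopping at epoch {0}'.format(epoch))
                    break
                best_valid_error = valid_error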

As training time increases, the training error keeps decreasing, but the validation error may go up. A rising validation error may indicate that overfitting is happening. The hypothesis is that ending training when the validation error stops decreasing can avoid overfitting.

To test the hypothesis, I design the experiment as follows. I use a MultipleLayerModel with 5 AffineLayers and perform two runs with identical parameters and only one difference: one uses early stopping and the other does not. A sketch of the model setup follows.
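
For reference, this is roughly how such a model is built in the course framework; the layer widths, nonlinearities, and initialiser values here are my assumptions, not the exact settings used in the runs.

    import numpy as np
    from mlp.layers import AffineLayer, ReluLayer
    from mlp.models import MultipleLayerModel
    from mlp.initialisers import UniformInit, ConstantInit

    rng = np.random.RandomState(12345)
    weights_init = UniformInit(-0.1, 0.1, rng=rng)
    biases_init = ConstantInit(0.)

    # 5 AffineLayers (784-dim MNIST inputs, 10 output classes),
    # with ReLU nonlinearities between them.
    model = MultipleLayerModel([
        AffineLayer(784, 100, weights_init, biases_init), ReluLayer(),
        AffineLayer(100, 100, weights_init, biases_init), ReluLayer(),
        AffineLayer(100, 100, weights_init, biases_init), ReluLayer(),
        AffineLayer(100, 100, weights_init, biases_init), ReluLayer(),
        AffineLayer(100, 10, weights_init, biases_init)
    ])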

Figure 1. Error plot without early stopping.

Figure 2. Error plot with early stopping.

with early stop    final error (train)    final error (valid)
No                 1.81e-03               1.47e-01
Yes                4.85e-02               1.23e-01

Table 1. Final errors.

From the plots, we can see that without early stopping the validation error starts increasing after about 40 epochs, indicating that the model is overfitting. With early stopping, training stops at epoch 30 instead of running for all 100 epochs, because the validation error at epoch 30 increased slightly.

From the final errors, we can see that although the training error with early stopping is more than ten times larger than without it, its final validation error is much smaller. The results show that early stopping indeed reduces overfitting. It is also a simple method to implement, and by stopping early it saves training time.
Topic 2
Data Augmentation
In this topic, I will investigate whether data augmentation can reduce overfitting and improve performance.
The hypothesis is that data augmentation increases the effective amount of training data, reduces overfitting, and helps the model generalize better.

My data augmentation works as follows. For a batch of images, 50% of them are rotated randomly between -30 and 30 degrees, and then 50% of the resulting samples are shifted randomly between -4 and 4 pixels in both the X and Y axes. A sketch of this procedure is shown below.
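
The following is a minimal sketch of that augmentation, assuming MNIST-style inputs flattened to rows of a (batch_size, 784) array; the function name and shapes are my assumptions, not the report's exact code.

    import numpy as np
    from scipy.ndimage import rotate, shift

    def augment_batch(inputs, rng, img_shape=(28, 28)):
        """Randomly rotate 50% of a batch, then randomly shift 50% of the result."""
        images = inputs.reshape((-1,) + img_shape).copy()
        n = images.shape[0]
        # Rotate a random half of the batch by an angle in [-30, 30] degrees.
        for i in rng.choice(n, n // 2, replace=False):
            angle = rng.uniform(-30., 30.)
            images[i] = rotate(images[i], angle, reshape=False, order=1)
        # Shift a random half of the resulting samples by [-4, 4] pixels
        # in both axes (subpixel shifts are interpolated).
        for i in rng.choice(n, n // 2, replace=False):
            dx, dy = rng.uniform(-4., 4., size=2)
            images[i] = shift(images[i], (dy, dx), order=1)
        return images.reshape(inputs.shape[0], -1)
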
To test its effectiveness, I train two models that are identical except that one uses data augmentation and the other does not.

Figure 3. Error plot with (left) and without (right) data augmentation.

Figure 4. Accuracy plot with (left) and without (right) data augmentation.

data augmentation    final error (train)    final error (valid)    final acc (train)    final acc (valid)
Yes                  6.88e-02               6.17e-02               0.9788               0.9808
No                   3.92e-04               1.06e-01               1.0000               0.9803

Table 2. Final errors and accuracies.
In the error plot, the validation error with data augmentation decreases steadily, while the validation error without data augmentation increases after about 30 epochs.
In the accuracy plot, the validation accuracy with data augmentation increases steadily, but the validation accuracy without data augmentation stays roughly constant after 50 epochs.
The final validation error with data augmentation is also much lower than that without it.
All of this shows that the model without data augmentation overfits, and that data augmentation indeed reduces overfitting and improves accuracy.
Topic 3
Models with convolutional layers

In this topic, I will investigate whether adding a convolutional layer can improve the model's performance.

I use the provided skeleton code to implement the ConvolutionalLayer class. In fprop, for each window of the image subx and each filter with weights w, the output is np.sum(subx * w) + b. In bprop, for each element of the output gradients dout and each filter weight w, I add dout * w to the corresponding window of the input gradients. In grads_wrt_params, for each image window subx and output gradient dout, I add dout * subx to dw and dout to db. The implementation passes all the provided tests; a loop-based sketch of the three methods is shown below.
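
The following is a minimal, loop-based sketch of that logic for a single-channel input with stride 1 and no padding; the function names and shapes are my assumptions (the framework's ConvolutionalLayer operates on batched, multi-channel arrays), and a real implementation would vectorize these loops.

    import numpy as np

    def conv_fprop(x, w, b):
        """x: (H, W) image, w: (kH, kW) filter, b: scalar bias."""
        kH, kW = w.shape
        oH, oW = x.shape[0] - kH + 1, x.shape[1] - kW + 1
        out = np.empty((oH, oW))
        for i in range(oH):
            for j in range(oW):
                subx = x[i:i + kH, j:j + kW]      # current image window
                out[i, j] = np.sum(subx * w) + b  # correlate window with filter
        return out

    def conv_bprop(douts, w, x_shape):
        """Accumulate dout * w into the input gradients at each window."""
        kH, kW = w.shape
        dx = np.zeros(x_shape)
        for i in range(douts.shape[0]):
            for j in range(douts.shape[1]):
                dx[i:i + kH, j:j + kW] += douts[i, j] * w
        return dx

    def conv_grads_wrt_params(x, douts, w_shape):
        """Accumulate dout * subx into dw and dout into db."""
        kH, kW = w_shape
        dw, db = np.zeros(w_shape), 0.
        for i in range(douts.shape[0]):
            for j in range(douts.shape[1]):
                subx = x[i:i + kH, j:j + kW]
                dw += douts[i, j] * subx
                db += douts[i, j]
        return dw, db
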
In the experiment, I create a model with one ConvolutionalLayer followed by an AffineLayer to test its performance.
