Assignment #1 from Udacity's Deep Learning course gives you insight that a logistic multi-nomial (linear) regression model may not give you the best accuracy required for the non-MNIST dataset classification problem.
Let us look at the logisitic multi-nomial model as an algorithm and try to calculate it's complexity.
The 2 parameters to consider here are W - Weight Matrix and b - bias matrix with 1 layer.
Imagine, the input image is a 28x28 image and the output is a 10 class vector.
The input image is going to be stretched out into individual pixels feeding into each unit. This makes the input layer dimensions to be 28x28. The dimensions of the parameter W become (28x28)x10 which gets added to a 10x1 bias matrix. The total number of parameters are:
28x28x10+10 = (N+1)*K
Where N is the number of inputs and K is the number of outputs.
Another way to understand this is - Between an input layer with 28x28 nodes and an output layer with 10 nodes, you need a minimum of 28x28x10 weights for a fully connected network with bias on top of that which adds the extra 10 to the above equation.
The argument started with exploring the accuracy of a logistic regression model which is close to 90% for this problem. In reality to achieve higher accuracy we need a lot more parameters to generalize and extend the model for a better solution to the problem. This paves to way to further exploring Deep Neural Networks.
Stay tuned for more.
This comment has been removed by a blog administrator.
ReplyDeleteThis comment has been removed by a blog administrator.
ReplyDelete