Introduction To Machine Learning: An Overview Of Deep Neural Networks
The MNIST dataset is often referred to as the “Hello World” of machine learning programs for computer vision. It is composed of 28x28 pixel images of handwritten digits (0 through 9).
In this article, we'll look at how a computer can use a neural network to determine which digit an image corresponds to.
Since each image is 28 x 28 pixels, we have 784 input values for x (28 x 28 = 784), each corresponding to a single pixel.
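As a concrete sketch, assuming TensorFlow/Keras is available, loading MNIST and flattening each image into its 784 values looks like this:

```python
# A minimal sketch of loading MNIST, assuming TensorFlow/Keras is installed.
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)  # (60000, 28, 28): 60,000 images of 28x28 pixels

# Flatten each 28x28 image into a single vector of 784 input values.
x_flat = x_train.reshape(-1, 784)
print(x_flat.shape)   # (60000, 784)
```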
The circles in the middle of the image are neurons, arranged in layers. A deep neural network is composed of two or more hidden layers, and a layer can contain any number of neurons. Every neuron is connected to every neuron in the next layer.
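For illustration, a fully connected network of this shape could be sketched in Keras as follows; the hidden layer sizes (16 neurons each) and the softmax output layer are illustrative assumptions, not fixed by anything above:

```python
# A sketch of a deep neural network with two fully connected hidden layers.
# The hidden layer size of 16 is an arbitrary illustrative choice.
from tensorflow.keras import Sequential, Input
from tensorflow.keras.layers import Dense

model = Sequential([
    Input(shape=(784,)),              # one input per pixel
    Dense(16, activation="sigmoid"),  # first hidden layer
    Dense(16, activation="sigmoid"),  # second hidden layer
    Dense(10, activation="softmax"),  # one output per digit (0-9)
])
model.summary()
```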
Each connection has its own weight, and each neuron has its own bias. The weights are random at the start. During the feed forward process, the value a neuron computes is given by the following formula: z = w₁x₁ + w₂x₂ + … + wₙxₙ + b. In other words, the neuron multiplies each incoming value by the weight on its connection, sums the results, adds its bias, and then passes the total through an activation function.
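In NumPy, that computation is a one-liner; the inputs, weights, and bias below are made-up illustrative values:

```python
import numpy as np

x = np.array([0.5, 0.1, 0.9])   # values entering the neuron
w = np.array([0.2, 0.8, -0.5])  # one weight per incoming connection
b = 0.1                         # the neuron's bias

# Weighted sum of inputs: z = w1*x1 + w2*x2 + ... + wn*xn + b
z = np.dot(w, x) + b
# z is then passed through an activation function (see below).
```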
An activation function determines whether the neuron will fire or not. There are several types of activation functions, the simplest of which is the step function.
Step Function
If the value coming into the neuron is greater than a specified threshold, the output is 1 (activated); otherwise it is 0.
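A minimal sketch, assuming a threshold of 0:

```python
def step(z, threshold=0.0):
    """Step activation: fire (1) if the input exceeds the threshold, else 0."""
    return 1 if z > threshold else 0

print(step(0.35))  # 1 (activated)
print(step(-0.2))  # 0 (not activated)
```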
In practice, the step function is rarely used: when more than one neuron is fully activated, it becomes difficult to determine which of them contributed to a correct classification, and we need that information to train our model.
In the vast majority of cases, you will end up using a sigmoid function.
Sigmoid Function
The sigmoid, σ(z) = 1 / (1 + e⁻ᶻ), squashes its input into the range (0, 1). Neurons can therefore be partially activated, which makes it easier to identify which ones contributed to a correct classification.
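A minimal sketch of the sigmoid and its partial activations:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))   # 0.5   (partially activated)
print(sigmoid(4.0))   # ~0.98 (strongly activated)
print(sigmoid(-4.0))  # ~0.02 (barely activated)
```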
Upon reaching the output layer, you compare the network's output to the desired output from your sample data. Then you adjust the weights to move the network's output toward the desired output. The process of going backwards through the network and adjusting the weights is known as backpropagation (or backwards propagation).
One full pass of the training data through feed forward followed by backpropagation is called an epoch; training typically runs for many epochs.
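Putting the pieces together, here is a toy sketch of the whole loop for a single sigmoid neuron; the tiny dataset, the learning rate, and the mean squared error loss are all illustrative assumptions:

```python
# A toy sketch of feed forward + backpropagation for a single sigmoid neuron.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # inputs
y = np.array([1.0, 0.0, 1.0])                       # desired outputs

rng = np.random.default_rng(0)
w = rng.normal(size=2)  # weights start out random
b = 0.0
lr = 0.5                # learning rate (illustrative choice)

for epoch in range(1000):  # one epoch = one full pass over the data
    # Feed forward: weighted sum, then activation.
    output = sigmoid(X @ w + b)
    # Compare the output to the desired output.
    error = output - y
    # Backpropagation: nudge weights and bias against the error gradient
    # (mean squared error loss; output*(1-output) is the sigmoid derivative).
    grad = error * output * (1 - output)
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

print(sigmoid(X @ w + b))  # outputs should move toward [1, 0, 1]
```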