Non-linear Activation Functions for Neural Networks Simplified


An activation function is what forms the output of a neuron. It is what adds non-linearity to your prediction and makes a Neural Network based predictor so much better than linear models.

The question we usually ask ourselves is: which activation function should I use?

The answer is that there is no one-size-fits-all choice. It depends.

Let me walk you through the most commonly used activation functions and their pros and cons to help you make a better decision.

We can define our own activation functions to best fit our needs, but the most commonly used ones are:
1. Sigmoid Activation
2. Tan Hyperbolic (tanh) Activation
3. ReLU (Rectified Linear Unit)
4. Leaky ReLU
This is how each of them looks:

Photo Source: DeepLearning.ai Specialization
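
Since the figure itself isn't reproduced here, below is a quick NumPy/Matplotlib sketch of my own (not from the course material) that plots all four curves; the 0.01 slope used for Leaky ReLU is just an example value.

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-5, 5, 200)

sigmoid    = 1 / (1 + np.exp(-z))           # squashes z into (0, 1)
tanh       = np.tanh(z)                     # squashes z into (-1, 1)
relu       = np.maximum(0, z)               # 0 for z < 0, identity for z >= 0
leaky_relu = np.where(z > 0, z, 0.01 * z)   # small slope (0.01 here) for z < 0

for name, y in [("Sigmoid", sigmoid), ("Tanh", tanh),
                ("ReLU", relu), ("Leaky ReLU", leaky_relu)]:
    plt.plot(z, y, label=name)
plt.legend()
plt.title("Commonly used activation functions")
plt.show()
```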

1. Sigmoid Activation

The sigmoid activation, sigmoid(z) = 1 / (1 + e^-z), ranges between 0 and 1. It looks like the common "S-shaped" curve we see in many fields of study.

Pros:
Simple, both in logic and in arithmetic.
Offers good non-linearity.
Natural probability output between 0 and 1 for classification problems.
Cons:
The network stops learning when values are pushed towards the extremes of the sigmoid, where the curve flattens out. This is called the problem of vanishing gradients (see the short sketch below).
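
To make the vanishing-gradient point concrete, here is a small sketch of my own (illustrative inputs only) that evaluates the sigmoid's derivative, sigmoid(z) * (1 - sigmoid(z)), near zero and far out in the tail:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1 - s)   # derivative of the sigmoid

for z in [0.0, 2.0, 10.0]:
    print(f"z = {z:5.1f}  sigmoid = {sigmoid(z):.5f}  gradient = {sigmoid_grad(z):.5f}")
# z =   0.0  sigmoid = 0.50000  gradient = 0.25000  (largest possible gradient)
# z =  10.0  sigmoid = 0.99995  gradient = 0.00005  (almost no learning signal)
```

The gradient is tiny at the extremes, so weight updates that pass through a saturated sigmoid barely move.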

2. Tan Hyperbolic

This is pretty much a shifted and rescaled sigmoid with an extended output range (-1 to 1).

Pros:
It widens the steep non-linear range in the middle of the curve before the slope/gradient flattens out, which helps the network learn faster (the sketch after this list illustrates this).
Cons:
It only reduces the vanishing-gradient problem to a degree; the tails still saturate, and there are better options when we want to learn faster.
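
As a quick check on the "rescaled sigmoid" description, tanh satisfies tanh(z) = 2 * sigmoid(2z) - 1, and its slope at the origin (1) is four times the sigmoid's maximum slope (0.25). A small sketch of my own, with illustrative values:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-3, 3, 7)   # [-3, -2, -1, 0, 1, 2, 3]

# tanh is a shifted and rescaled sigmoid: tanh(z) = 2 * sigmoid(2z) - 1
print(np.allclose(np.tanh(z), 2 * sigmoid(2 * z) - 1))   # True

# Steeper in the middle, but still flat at the tails:
tanh_grad = 1 - np.tanh(z) ** 2   # derivative of tanh
print(tanh_grad.round(4))         # ~0.0099 at |z| = 3, 1.0 at z = 0
```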

3. ReLU

Rectified - MAX(0, value)
Linear - for z > 0 (positive values)

ReLU is a fancy name for a positive-only linear function. For negative inputs the unit has a slope of 0, while for positive activations the network can learn a lot faster thanks to the constant linear slope.

Pros:
Learns faster.
The slope is 1 as long as z is positive.
Cons:
Hard to find many :) The one caveat is that the slope is 0 for negative inputs, so a unit that keeps receiving negative values stops updating; this is what Leaky ReLU addresses next (a minimal sketch follows).
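
Here is a minimal sketch of my own of the ReLU forward pass and its gradient (it uses the common convention of treating the gradient at exactly z = 0 as 0):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)          # MAX(0, value)

def relu_grad(z):
    return (z > 0).astype(float)     # slope 1 for z > 0, else 0

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))        # [0.  0.  0.  0.5 2. ]
print(relu_grad(z))   # [0. 0. 0. 1. 1.]
```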

4. Leaky ReLU

This provides a slight slope for negative inputs instead of a flat 0, so the unit never stops learning entirely. It's an improvement over ReLU.
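
Here is a minimal sketch of my own; the 0.01 negative-side slope is just an example hyperparameter, not a fixed part of the definition:

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    # identity for z > 0, a small slope alpha for z <= 0
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    # the gradient never goes fully to zero, so the unit keeps learning
    return np.where(z > 0, 1.0, alpha)

z = np.array([-3.0, -0.5, 0.5, 3.0])
print(leaky_relu(z))       # [-0.03  -0.005  0.5    3.   ]
print(leaky_relu_grad(z))  # [0.01 0.01 1.   1.  ]
```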


These non-linear activations are what lay the foundation for Neural Networks and let them outperform linear models.

Until next time!




