# Artificial Neuron

machine-learning
Author

Akhilesh

Published

February 8, 2017

An artificial neuron is simply a computational unit that performs a particular computation based on the other units it is connected to. Pre-activation (or input activation): a(x) = b + ∑i wi xi

Neuron (output) activation: h(x) = g(a(x))

where g(·) is the activation function, b is the neuron bias, and w are the connection weights.
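The two formulas above can be sketched in a few lines of NumPy. The weights, bias, and input values below are illustrative, not from the post:

```python
import numpy as np

def neuron(x, w, b, g):
    """Single artificial neuron:
    pre-activation a(x) = b + sum_i w_i * x_i,
    output activation h(x) = g(a(x))."""
    a = b + np.dot(w, x)   # pre-activation
    return g(a)            # output activation

# example with a sigmoid activation (illustrative values)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
b = 0.1
print(neuron(x, w, b, sigmoid))
```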

Activation functions:

• Linear activation function

g(a) = a

Performs no input squashing

• Sigmoid activation function

g(a) = sigm(a) = 1/(1 + exp(-a))

Squashes the neuron’s pre-activation between 0 and 1.

Always positive

Bounded

Strictly increasing

• Hyperbolic tangent (tanh) activation function

g(a) = tanh(a)

Squashes the neuron’s pre-activation between -1 and 1

Can be positive or negative

Bounded

Strictly increasing

• Rectified Linear Activation function

g(a) = reclin(a) = max(0,a)

Bounded below by 0 and always non-negative

Not upper bounded

Monotonically increasing

Tends to give neurons sparse activities. That is, the output is exactly zero for many different inputs.
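The four activation functions above can be written directly from their definitions; the sample pre-activations passed at the end are illustrative:

```python
import numpy as np

def linear(a):
    return a                          # performs no squashing

def sigm(a):
    return 1.0 / (1.0 + np.exp(-a))   # squashes into (0, 1)

def tanh(a):
    return np.tanh(a)                 # squashes into (-1, 1)

def reclin(a):
    return np.maximum(0.0, a)         # zero for negative pre-activations

a = np.array([-2.0, 0.0, 2.0])
print(linear(a), sigm(a), tanh(a), reclin(a))
```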

Capacity of a single neuron

Capacity essentially means the complexity of the computation that a neuron can perform.

• With a sigmoid activation, a single neuron can do binary classification: its output can be interpreted as estimating p(y=1|x), the probability that input ‘x’ belongs to class 1. (Since we are using the sigmoid activation function, the output is bounded between 0 and 1.) This is also known as the logistic regression classifier. If the output is greater than 0.5, predict class 1; otherwise, predict class 0.

• This classifier performs linear classification: if we can draw a hyperplane (a line, in the 2-D case) between the two types of objects described by the input vectors, then a single artificial neuron can separate them for us!

• However, a single neuron cannot model an output that is not linearly separable.
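The binary classification described above can be sketched as a sigmoid neuron with a 0.5 threshold. The weights and bias below are hypothetical, chosen only to define some separating line w·x + b = 0:

```python
import numpy as np

def sigm(a):
    return 1.0 / (1.0 + np.exp(-a))

# hypothetical weights/bias defining the line x1 + x2 = 1
w = np.array([1.0, 1.0])
b = -1.0

def predict(x):
    p = sigm(b + np.dot(w, x))   # estimate of p(y=1|x)
    return 1 if p > 0.5 else 0

print(predict(np.array([2.0, 2.0])))  # above the line: class 1
print(predict(np.array([0.0, 0.0])))  # below the line: class 0
```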

If the input is transformed into a better representation, the problem can become linearly separable.

Hence, complicated computations cannot be modelled by a single neuron. This is the main reason we move to multilayer neural networks.
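XOR is the classic example of this point: no line separates its classes in the raw input space, but a hand-chosen transformation of the input makes it linearly separable. The feature map and the weights below are illustrative assumptions, not from the post:

```python
import numpy as np

def sigm(a):
    return 1.0 / (1.0 + np.exp(-a))

def neuron(x, w, b):
    return 1 if sigm(b + np.dot(w, x)) > 0.5 else 0

# XOR: no line separates {(0,1), (1,0)} from {(0,0), (1,1)},
# so no single neuron on the raw inputs can compute it.
# Hand-chosen transformation into a better representation:
#   phi(x) = (x1 AND NOT x2, x2 AND NOT x1)
def phi(x):
    return np.array([x[0] * (1 - x[1]), x[1] * (1 - x[0])])

# In the transformed space XOR becomes an OR of the two features,
# which a single neuron can do.
w, b = np.array([10.0, 10.0]), -5.0
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, neuron(phi(np.array(x)), w, b))
```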