Binary classification
Toc
- Neural Network
- Logistic Regression Classification
- Binary Classification
- Binary Classification with Linear Hypothesis
- Problem of Binary Classification with Linear Hypothesis
- Activation Function
- Sigmoid Function
- Applying Sigmoid Function
- Identity Function
Neural Network
- Computing systems inspired by the biological neural networks that constitute animal brains. - Wiki
Image 1. Neural network
- The input layer is a set of neurons that receive the inputs and pass them to the hidden layer.
- The hidden layer is a set of neurons that describe the hypothesis.
- The output layer is a set of neurons whose output function transforms the outputs of the hidden layer into a format suitable for the problem.
- The following is a simplified single neural network model used to explain our example.
Image 2. Single neural network
- In this model, there are 3 layers and 4 nodes, and each node represents a neuron.
- The bias neuron is drawn explicitly for clarity.
- This model can describe any single neural network, such as linear regression and binary classification.
- Here, "single" means there is only one hidden layer and the layer includes only one hypothesis neuron, as in the sketch below.
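As a minimal sketch of such a single network (the names and values here are chosen only for illustration, they are not part of the model above):
# A single network in code (assumed names, for illustration only):
# one hidden neuron computes the weighted sum of the input and the bias,
# and one output neuron transforms that value for the problem at hand.
w, b = 0.1, -0.1                          # weight and bias of the hidden neuron
weighted_sum = lambda _x : w * _x + b     # hidden layer: the hypothesis
output = lambda _h : _h                   # output layer: identity here; a threshold for classification
x = 3.0
print(output(weighted_sum(x)))            # forward pass: 0.1 * 3.0 - 0.1 = 0.2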
Logistic Regression Classification
- Logistic Regression = Logistic Classification
- A regression model where the dependent variable is categorical. - Wiki
- Binary (or binomial) classification or multinomial classification
Binary Classification
- The task of classifying the elements of a given set into two groups on the basis of a classification rule. - Wiki
- Its result is true or false. If the hypothesis satisfies a condition, the value is 1; if not, 0.
- The output function checks the condition on the result of the hypothesis and returns the result.
Binary Classification with Linear Hypothesis
import matplotlib.pyplot as plt
# Linear hypothesis: weighted sum with weight 0.1 and bias -0.1
hypothesis = lambda _x : 0.1 * _x - 0.1
# Output function: 1 if the hypothesis is at least the threshold, otherwise 0
threshold = 0.5
output = lambda _x : _x >= threshold
x = [i for i in range(10)]
_hypo = list(map(hypothesis, x))
y = list(map(output, _hypo))
plt.plot(x, y, "o")
plt.plot(x, _hypo)
plt.ylim(-0.5, 1.5)
plt.grid()
plt.title("Binary classification")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 3. Binary classification
- In this example, the hypothesis is \( 0.1x - 0.1 \), and the output layer returns True if the result of the hypothesis is at least \( 0.5 \).
- To get a value of at least 0.5, x must be at least 6, as the inequality below shows.
- Therefore, values smaller than 6 map to 0, and the others map to 1.
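Solving the threshold condition for x makes the boundary explicit (a worked step added for clarity):
$$ 0.1x - 0.1 \ge 0.5 \;\Longleftrightarrow\; 0.1x \ge 0.6 \;\Longleftrightarrow\; x \ge 6 $$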
Problem of Binary Classification with Linear Hypothesis
- If the result of the hypothesis is much larger than 1 or much smaller than 0, the cost function returns very high values. This pulls the weight and bias toward that single input, so the model fits it at the expense of the others.
- A representative example is an input that is far away from the other inputs.
import matplotlib.pyplot as plt
hypothesis = lambda _x : 0.1 * _x - 0.1
threshold = 0.5
output = lambda _x : _x >= threshold
# Correct hypothesis
x = [i for i in range(10)]
x.append(20)
ans_hypo = list(map(hypothesis, x))
ans_y = list(map(output, ans_hypo))
# The result of the hypothesis for input 20 is 1.9, while its target is 1.
# All other hypothesis values stay below 1.
# With MSE, the squared error of the outlier (0.9^2 = 0.81) is far larger
# than the errors of the other points, so it dominates the gradient.
# Updating the weight and bias to fit the outlier makes the weight smaller.
# Trained hypothesis
hypothesis = lambda _x : 0.09 * _x - 0.1
trn_hypo = list(map(hypothesis, x))
trn_y = list(map(output, trn_hypo))
plt.plot(x, ans_y, "ro")
plt.plot(x, ans_hypo,"r")
plt.plot(x, trn_y, "b*")
plt.plot(x, trn_hypo, "b")
plt.ylim(-0.5, 1.5)
plt.grid()
plt.title("Binary classification")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 4. Problem of linear hypothesis
- Because of the new input 20, the weight becomes smaller, and now x = 6 is classified incorrectly by the trained hypothesis.
- Therefore, a simple combination of a weighted sum and an output function is not enough for this problem. The short check below makes the error imbalance explicit.
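To see why the outlier dominates, here is a small check added as an illustrative sketch (not part of the original walkthrough) that prints the squared error of each point under the original hypothesis:
# Squared errors of the original hypothesis under MSE (illustrative check).
hypothesis = lambda _x : 0.1 * _x - 0.1
threshold = 0.5
x = list(range(10)) + [20]
targets = [1 if hypothesis(v) >= threshold else 0 for v in x]   # labels implied by the example
for v, t in zip(x, targets):
    err = (hypothesis(v) - t) ** 2
    print("x={:2d}  prediction={:.2f}  target={}  squared error={:.2f}".format(v, hypothesis(v), t, err))
# The squared error of x=20 (0.81) is several times larger than any other,
# so it dominates the MSE gradient and drags the weight down.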
Activation Function
- Defines the output of that node given an input or set of inputs. - Wiki
- An activation function transforms the result of the weighted sum into a format for the next neuron.
Image 5. Activation function
$$ H(X) = A(S(X)) $$
- These equations are written for matrices, but the scalar case is very similar.
- S() is the weighted sum, and A() is the activation function.
- Now, the hypothesis is the composition of the weighted sum and the activation function, as in the sketch below.
- There are many types of activation functions, such as the sigmoid, step, linear, and Gaussian functions, but the sigmoid function is the representative one.
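A minimal sketch of \( H(X) = A(S(X)) \) for a scalar input (the names and values are assumptions made here for illustration):
import numpy as np
# H(x) = A(S(x)): the hypothesis is the activation applied to the weighted sum.
w, b = 0.5, 0.0
S = lambda _x : w * _x + b                  # weighted sum
A = lambda _s : 1 / (1 + np.exp(-_s))       # activation: sigmoid, the representative choice
H = lambda _x : A(S(_x))                    # hypothesis
print(H(0.0), H(6.0))                       # values always stay between 0 and 1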
Sigmoid Function
- Also known as the logistic function.
- Returns a value between 0 and 1.
$$ g(x) = \frac{1}{1 + e^{-(Wx + b)}} $$
import numpy as np
import matplotlib.pyplot as plt
# Sigmoid of the weighted sum w*x + b (w and b are set below, before the first call)
sigmoid = lambda _x : 1 / (1 + np.exp(-1 * (w * _x + b)))
x = [i * 0.01 for i in range(-700, 700)]
# Effect of the weight: a larger weight makes the curve steeper.
b = 0
for i in range(1, 4):
    w = i * 0.5
    y = list(map(sigmoid, x))
    plt.plot(x, y, label="W={0}, b={1}".format(w, b))
plt.grid()
plt.title("Changing Weight")
plt.xlim(-7, 7)
plt.ylim(-0.2, 1.2)
plt.xlabel("x")
plt.ylabel("y")
plt.legend(loc="lower right")
plt.show()
# Effect of the bias: the bias shifts the curve horizontally.
w = 1
for b in range(-1, 2):
    y = list(map(sigmoid, x))
    plt.plot(x, y, label="W={0}, b={1}".format(w, b))
plt.grid()
plt.title("Changing Bias")
plt.xlim(-7, 7)
plt.ylim(-0.2, 1.2)
plt.xlabel("x")
plt.ylabel("y")
plt.legend(loc="lower right")
plt.show()
Image 6. Activation function changing weight
Image 7. Activation function changing bias
Applying Sigmoid Function
$$ S(X) = XW + b $$
$$ H(X) = A(S(X)) $$
$$ H(X) = \frac{1}{1 + e^{-(W^{T}X + b)}} $$
- Here, the sigmoid function is used as the activation function.
- The transpose is optional; it only makes the matrices multiplicable.
- In the previous example, the input 20 changed the weight dramatically. The sigmoid function prevents this by squashing the result of the weighted sum into the range 0 to 1, so even an extreme input cannot produce a huge error. The sketch below revisits that example with a sigmoid hypothesis.
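As a rough check (an illustrative sketch added here, not from the original), applying the sigmoid to the same linear weighted sum keeps the outlier's error small:
import numpy as np
# The same weighted sum as before, but passed through the sigmoid.
w, b = 0.1, -0.1
sigmoid_hypo = lambda _x : 1 / (1 + np.exp(-(w * _x + b)))
for v in [6, 9, 20]:
    pred = sigmoid_hypo(v)
    err = (pred - 1) ** 2          # the target is 1 for all three inputs
    print("x={:2d}  prediction={:.3f}  squared error={:.3f}".format(v, pred, err))
# The prediction for x=20 stays below 1, so its squared error no longer
# dominates, and it cannot drag the weight down as before.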
import matplotlib.pyplot as plt
# Identity function: returns its input unchanged
identity = lambda _x : _x
x = [i * 0.01 for i in range(-700, 700)]
y = list(map(identity, x))
plt.plot(x, y)
plt.grid()
plt.title("Identity Function")
plt.xlim(-7, 7)
plt.ylim(-7, 7)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 8. Weighted sum & activation function
Identity Function
- The identity function is the activation and output function for linear regression.
- The identity function returns the same value as its input, as sketched below.
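A minimal sketch (with assumed values, mirroring the style of the other examples) of a linear-regression pipeline where both the activation and the output are the identity function:
# Linear regression with the identity function as both activation and output (illustrative sketch).
weight_sum = lambda _x : w * _x + b
identity = lambda _x : _x                        # activation and output function
w, b = 0.1, -0.1
x = [i for i in range(10)]
hypo = [identity(weight_sum(_x)) for _x in x]    # hypothesis = identity(weighted sum)
y = [identity(_h) for _h in hypo]                # the output layer also leaves values unchanged
print(y)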
import numpy as np
import matplotlib.pyplot as plt
# Weighted sum without bias, to keep it simple
weightSum = lambda _x : w * _x
# Activation function
sigmoid = lambda _x : 1 / (1 + np.exp(-_x))
# Output function
output = lambda _x : _x > 0.5
# Input
x = [i * 0.01 for i in range(-800, 800)]
# Weight
w = 1
# Hypothesis: activation applied to the weighted sum
hypo = [sigmoid(weightSum(_x)) for _x in x]
# Output
y = list(map(output, hypo))
# Graph for hypothesis
plt.plot(x, hypo, label="Hypothesis")
# Graph for output
plt.plot(x, y, label="Answer")
plt.ylim(-0.2, 1.2)
plt.legend(loc="lower right")
plt.grid()
plt.show()
Image 9. Identity function