Binary classification
Toc
- Neural Network
- Logistic Regression Classification
- Binary Classification
- Binary Classification with Linear Hypothesis
- Problem of Binary Classification with Linear Hypothesis
- Activation Function
- Sigmoid Function
- Applying Sigmoid Function
- Identity Function
Neural Network
- Computing systems inspired by the biological neural networks that constitute animal brains. - Wiki
Image 1. Neural network
- The input layer is a set of neurons that receive the inputs and pass them to the hidden layer.
- The hidden layer is a set of neurons that describe the hypothesis.
- The output layer is a set of neurons whose output function transforms the outputs of the hidden layer into a format suitable for the problem.
- The following is a simplified single neural network model used to explain our example.
Image 2. Single neural network
- In this model, there are 3 layers and 4 nodes, and each node represents a neuron.
- The bias neuron is drawn explicitly for clarity.
- This model can describe any single neural network, such as linear regression and binary classification.
- Here, "single" means there is only one hidden layer and the layer includes only one hypothesis neuron, as in the sketch below.
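As a minimal sketch of such a single network (the names and values here are chosen only for illustration, they are not part of the model above):
# A single network in code (assumed names, for illustration only):
# one hidden neuron computes the weighted sum of the input and the bias,
# and one output neuron transforms that value for the problem at hand.
w, b = 0.1, -0.1                          # weight and bias of the hidden neuron
weighted_sum = lambda _x : w * _x + b     # hidden layer: the hypothesis
output = lambda _h : _h                   # output layer: identity here; a threshold for classification
x = 3.0
print(output(weighted_sum(x)))            # forward pass: 0.1 * 3.0 - 0.1 = 0.2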
Logistic Regression Classification
- Logistic Regression = Logistic Classification
- A regression model where the dependent variable is categorical. - Wiki
- Binary (or binomial) classification or multinomial classification
Binary Classification
- The task of classifying the elements of a given set into two groups on the basis of a classification rule. - Wiki
- Its result is true or false. If the hypothesis satisfies a condition, the value is 1; if not, 0.
- The output function checks the condition on the result of the hypothesis and returns the result.
Binary Classification with Linear Hypothesis
import matplotlib.pyplot as plt
# Linear hypothesis: weighted sum with weight 0.1 and bias -0.1
hypothesis = lambda _x : 0.1 * _x - 0.1
# Output function: 1 if the hypothesis is at least the threshold, otherwise 0
threshold = 0.5
output = lambda _x : _x >= threshold
x = [i for i in range(10)]
_hypo = list(map(hypothesis, x))
y = list(map(output, _hypo))
plt.plot(x, y, "o")
plt.plot(x, _hypo)
plt.ylim(-0.5, 1.5)
plt.grid()
plt.title("Binary classification")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 3. Binary classification
- In this example, the hypothesis is \( 0.1x - 0.1 \), and the output layer returns True if the result of the hypothesis is at least \( 0.5 \).
- To get a value of at least 0.5, x must be at least 6, as the inequality below shows.
- Therefore, values smaller than 6 map to 0, and the others map to 1.
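Solving the threshold condition for x makes the boundary explicit (a worked step added for clarity):
$$ 0.1x - 0.1 \ge 0.5 \;\Longleftrightarrow\; 0.1x \ge 0.6 \;\Longleftrightarrow\; x \ge 6 $$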
Problem of Binary Classification with Linear Hypothesis
- If the result of the hypothesis is much larger than 1 or much smaller than 0, the cost function returns very high values. This pulls the weight and bias toward that single input, so the model fits it at the expense of the others.
- A representative example is an input that is far away from the other inputs.
import matplotlib.pyplot as plt
hypothesis = lambda _x : 0.1 * _x - 0.1
threshold = 0.5
output = lambda _x : _x >= threshold
# Correct hypothesis
x = [i for i in range(10)]
x.append(20)
ans_hypo = list(map(hypothesis, x))
ans_y = list(map(output, ans_hypo))
# The result of the hypothesis for input 20 is 1.9, while its target is 1.
# All other hypothesis values stay below 1.
# With MSE, the squared error of the outlier (0.9^2 = 0.81) is far larger
# than the errors of the other points, so it dominates the gradient.
# Updating the weight and bias to fit the outlier makes the weight smaller.
# Trained hypothesis
hypothesis = lambda _x : 0.09 * _x - 0.1
trn_hypo = list(map(hypothesis, x))
trn_y = list(map(output, trn_hypo))
plt.plot(x, ans_y, "ro")
plt.plot(x, ans_hypo,"r")
plt.plot(x, trn_y, "b*")
plt.plot(x, trn_hypo, "b")
plt.ylim(-0.5, 1.5)
plt.grid()
plt.title("Binary classification")
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 4. Problem of linear hypothesis
- Because of the new input 20, the weight becomes smaller, and now x = 6 is classified incorrectly by the trained hypothesis.
- Therefore, a simple combination of a weighted sum and an output function is not enough for this problem. The short check below makes the error imbalance explicit.
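To see why the outlier dominates, here is a small check added as an illustrative sketch (not part of the original walkthrough) that prints the squared error of each point under the original hypothesis:
# Squared errors of the original hypothesis under MSE (illustrative check).
hypothesis = lambda _x : 0.1 * _x - 0.1
threshold = 0.5
x = list(range(10)) + [20]
targets = [1 if hypothesis(v) >= threshold else 0 for v in x]   # labels implied by the example
for v, t in zip(x, targets):
    err = (hypothesis(v) - t) ** 2
    print("x={:2d}  prediction={:.2f}  target={}  squared error={:.2f}".format(v, hypothesis(v), t, err))
# The squared error of x=20 (0.81) is several times larger than any other,
# so it dominates the MSE gradient and drags the weight down.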
Activation Function
- Defines the output of that node given an input or set of inputs. - Wiki
- An activation function transforms the result of the weighted sum into a format for the next neuron.
Image 5. Activation function
$$ H(X) = A(S(X)) $$
- These equations are written for matrices, but the scalar case is very similar.
- S() is the weighted sum, and A() is the activation function.
- Now, the hypothesis is the composition of the weighted sum and the activation function, as in the sketch below.
- There are many types of activation functions, such as the sigmoid, step, linear, and Gaussian functions, but the sigmoid function is the representative one.
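A minimal sketch of \( H(X) = A(S(X)) \) for a scalar input (the names and values are assumptions made here for illustration):
import numpy as np
# H(x) = A(S(x)): the hypothesis is the activation applied to the weighted sum.
w, b = 0.5, 0.0
S = lambda _x : w * _x + b                  # weighted sum
A = lambda _s : 1 / (1 + np.exp(-_s))       # activation: sigmoid, the representative choice
H = lambda _x : A(S(_x))                    # hypothesis
print(H(0.0), H(6.0))                       # values always stay between 0 and 1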
Sigmoid Function
- Also known as the logistic function.
- Returns a value between 0 and 1.
$$ g(x) = \frac{1}{1 + e^{-(Wx + b)}} $$
import numpy as np
import matplotlib.pyplot as plt
# Sigmoid of the weighted sum w*x + b (w and b are set below, before the first call)
sigmoid = lambda _x : 1 / (1 + np.exp(-1 * (w * _x + b)))
x = [i * 0.01 for i in range(-700, 700)]
# Effect of the weight: a larger weight makes the curve steeper.
b = 0
for i in range(1, 4):
    w = i * 0.5
    y = list(map(sigmoid, x))
    plt.plot(x, y, label="W={0}, b={1}".format(w, b))
plt.grid()
plt.title("Changing Weight")
plt.xlim(-7, 7)
plt.ylim(-0.2, 1.2)
plt.xlabel("x")
plt.ylabel("y")
plt.legend(loc="lower right")
plt.show()
# Effect of the bias: the bias shifts the curve horizontally.
w = 1
for b in range(-1, 2):
    y = list(map(sigmoid, x))
    plt.plot(x, y, label="W={0}, b={1}".format(w, b))
plt.grid()
plt.title("Changing Bias")
plt.xlim(-7, 7)
plt.ylim(-0.2, 1.2)
plt.xlabel("x")
plt.ylabel("y")
plt.legend(loc="lower right")
plt.show()
Image 6. Activation function changing weight
Image 7. Activation function changing bias
Applying Sigmoid Function
$$ S(X) = XW + b $$
$$ H(X) = A(S(X)) $$
$$ H(X) = \frac{1}{1 + e^{-(W^{T}X + b)}} $$
- Here, the sigmoid function is used as the activation function.
- The transpose is optional; it only makes the matrices multiplicable.
- In the previous example, the input 20 changed the weight dramatically. The sigmoid function prevents this by squashing the result of the weighted sum into the range 0 to 1, so even an extreme input cannot produce a huge error. The sketch below revisits that example with a sigmoid hypothesis.
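As a rough check (an illustrative sketch added here, not from the original), applying the sigmoid to the same linear weighted sum keeps the outlier's error small:
import numpy as np
# The same weighted sum as before, but passed through the sigmoid.
w, b = 0.1, -0.1
sigmoid_hypo = lambda _x : 1 / (1 + np.exp(-(w * _x + b)))
for v in [6, 9, 20]:
    pred = sigmoid_hypo(v)
    err = (pred - 1) ** 2          # the target is 1 for all three inputs
    print("x={:2d}  prediction={:.3f}  squared error={:.3f}".format(v, pred, err))
# The prediction for x=20 stays below 1, so its squared error no longer
# dominates, and it cannot drag the weight down as before.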
import matplotlib.pyplot as plt
# Identity function: returns its input unchanged
identity = lambda _x : _x
x = [i * 0.01 for i in range(-700, 700)]
y = list(map(identity, x))
plt.plot(x, y)
plt.grid()
plt.title("Identity Function")
plt.xlim(-7, 7)
plt.ylim(-7, 7)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Image 8. Weighted sum & activation function
Identity Function
- The identity function is the activation and output function for linear regression.
- The identity function returns the same value as its input, as sketched below.
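A minimal sketch (with assumed values, mirroring the style of the other examples) of a linear-regression pipeline where both the activation and the output are the identity function:
# Linear regression with the identity function as both activation and output (illustrative sketch).
weight_sum = lambda _x : w * _x + b
identity = lambda _x : _x                        # activation and output function
w, b = 0.1, -0.1
x = [i for i in range(10)]
hypo = [identity(weight_sum(_x)) for _x in x]    # hypothesis = identity(weighted sum)
y = [identity(_h) for _h in hypo]                # the output layer also leaves values unchanged
print(y)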
import numpy as np
import matplotlib.pyplot as plt
# Weighted sum without bias, to keep it simple
weightSum = lambda _x : w * _x
# Activation function
sigmoid = lambda _x : 1 / (1 + np.exp(-_x))
# Output function
output = lambda _x : _x > 0.5
# Input
x = [i * 0.01 for i in range(-800, 800)]
# Weight
w = 1
# Hypothesis: activation applied to the weighted sum
hypo = [sigmoid(weightSum(_x)) for _x in x]
# Output
y = list(map(output, hypo))
# Graph for hypothesis
plt.plot(x, hypo, label="Hypothesis")
# Graph for output
plt.plot(x, y, label="Answer")
plt.ylim(-0.2, 1.2)
plt.legend(loc="lower right")
plt.grid()
plt.show()
Image 9. Identity function