Concept, activation function and output function of softmax regression
Multinomial Classification
- Softmax Classification = Softmax Regression
- Determines which category the input belongs to.
import numpy as np
import matplotlib.pyplot as plt

nrDots = 10

def createDots(xMean, yMean, form):
    # Draw a cluster of dots around (xMean, yMean) with the given marker style.
    x = [np.random.normal(xMean, 0.5, 1) for i in range(nrDots + 1)]
    y = [np.random.normal(yMean, 0.5, 1) for i in range(nrDots + 1)]
    plt.plot(x, y, form)

# Dot clusters: (x mean, y mean, marker style)
dotConfigs = [(3, 5, "bo"),
              (1, 1, "ro"),
              (5, 2, "go")]
for conf in dotConfigs:
    createDots(conf[0], conf[1], conf[2])

# Border lines: (slope, intercept, color)
lineConfigs = [(1/6, 3, "b"),
               (-7, 15, "r"),
               (6, -20, "g")]
x = [i for i in range(-1, 7)]
for conf in lineConfigs:
    y = [i * conf[0] + conf[1] for i in x]
    plt.plot(x, y, conf[2])

plt.xlim(-1, 6)
plt.ylim(-1, 6)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Multinomial Classification")
plt.show()
Image 1. Multinomial classification
- In this example, the dots are classified as blue, red, and green, and the lines show the borders between the classes.
- Multinomial classification finds, for each color, the line that divides the True area from the False area. In other words, multinomial classification is a combination of binary classifications.
- However, multinomial classification uses a different activation function and cost function to keep the model simple.
- The affine function is the basis of a neural network, so it is used here as well.
$$ Active(W \cdot X) = Y_p $$ $$ Affine = W \cdot X = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} w_{11}x_1 + w_{12}x_2 + w_{13}x_3 \\ w_{21}x_1 + w_{22}x_2 + w_{23}x_3 \\ w_{31}x_1 + w_{32}x_2 + w_{33}x_3 \end{bmatrix} $$ $$ Active(Affine) = \begin{bmatrix} y_{p1} \\ y_{p2} \\ y_{p3} \end{bmatrix} = Y_p$$
- Therefore, \( Active(Affine) \) is the hypothesis of multinomial classification.
- Until now, \( Y \) has been called the answer; from here on, it is called the label. Likewise, \( Y_p \) is the predicted label.
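To make the hypothesis concrete, here is a minimal sketch that computes \( Active(W \cdot X) \) for one input with three features. The weight and input values are made up for illustration, and softmax (introduced in the next section) is used as the activation function.
import numpy as np

# Made-up weights and input, only for illustration.
W = np.array([[0.2, 0.8, -0.5],
              [0.5, -0.9, 0.3],
              [-0.3, 0.1, 0.7]])
X = np.array([[1.0], [2.0], [3.0]])

affine = np.dot(W, X)            # Affine = W . X, a (3, 1) vector

def softmax(a):                  # Activation; defined properly in the next section
    exp_a = np.exp(a - np.max(a))
    return exp_a / np.sum(exp_a)

Y_p = softmax(affine)            # Predicted label Y_p
print(affine.ravel())            # the three affine values
print(Y_p.ravel())               # three probabilities that add up to 1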
Softmax Function
- A generalization of the logistic function that "squashes" a K-dimensional vector z of arbitrary real values to a K-dimensional vector \( \sigma(z) \) of real values in the range (0, 1) that add up to 1. - Wiki
- Simply speaking, the softmax function returns the probability of each label.
$$ S(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}} $$
import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)  # To prevent overflow:
    # the exponential could take a very large value,
    # so subtracting max_a downscales the original values.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5, 1)
y = softmax(x)

print(" {0:7} {1:7}".format("X", "S(X)"))
for i in range(len(x)):
    print(" {0:7.6f} {1:7.6f}".format(x[i][0], y[i][0]))
print("Sum {0:7.6f} {1:7.6f}".format(np.sum(x), np.sum(y)))
X S(X)
0.717855 0.207212
0.620096 0.187914
0.505516 0.167570
0.541942 0.173787
0.958234 0.263518
Sum 3.343643 1.000000
- In this example, each input x is transformed into its probability by the softmax function.
- Because of this property, the softmax function is used as the activation function of multinomial classification.
- As we learned from binary classification, keeping the output between 0 and 1 makes the neural network robust against unusually large or odd inputs.
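The comment about preventing overflow can be checked directly: softmax does not change when a constant is subtracted from every input, so shifting by the maximum gives the same result while keeping np.exp from overflowing. A minimal check with made-up large scores:
import numpy as np

def softmax(a):
    exp_a = np.exp(a - np.max(a))   # shift by the maximum for numerical stability
    return exp_a / np.sum(exp_a)

a = np.array([1010.0, 1000.0, 990.0])   # made-up large scores

naive = np.exp(a) / np.sum(np.exp(a))   # np.exp(1010) overflows to inf
print(naive)                            # [nan nan nan] plus an overflow warning

print(softmax(a))                       # ~[1.0, 4.5e-05, 2.1e-09], sums to 1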
Output Function
- From the softmax function, we know which label has the highest probability. However, we have not set True or False yet.
- To do this, multinomial classification uses the max function.
- Therefore, the label with the highest probability is True, and the others are False; a short boolean sketch of this follows the example below.
- Also, argmax() is widely used to find the index of the True label.
- The index starts from 0.
import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)  # To prevent overflow:
    # the exponential could take a very large value,
    # so subtracting max_a downscales the original values.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5, 1)
y = softmax(x)
max_i = np.argmax(y)  # Index of the highest probability

print(" {0:7} {1:7}".format("X", "S(X)"))
for i in range(len(x)):
    print(" {0:7.6f} {1:7.6f}".format(x[i][0], y[i][0]))
print("Sum {0:7.6f} {1:7.6f}".format(np.sum(x), np.sum(y)))
print("Biggest index: {0}, value: {1:7.6f}".format(max_i, x[max_i][0]))
X S(X)
0.471060 0.177648
0.557162 0.193621
0.729981 0.230148
0.370831 0.160705
0.763011 0.237877
Sum 2.892045 1.000000
Biggest index: 4, value: 0.763011
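The True/False decision described above can also be written as a boolean vector by comparing each probability with the maximum. This is just one way to express it, reusing the S(X) values printed in the run above:
import numpy as np

y = np.array([[0.177648],
              [0.193621],
              [0.230148],
              [0.160705],
              [0.237877]])        # S(X) column from the run above

decision = (y == np.max(y))       # True only where the probability is highest
print(decision.ravel())           # [False False False False  True]
print(np.argmax(y))               # 4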
One-Hot Encoding
- A group of bits among which the legal combinations of values are only those with a single high (1) bit and all the others low (0). - Wiki
- In such a group of bits, only one bit is 1 and all the others are 0.
- This is the usual output format of multinomial classification.
- In multinomial classification, a 1 means the neural network infers that the input belongs to that label; a sketch of building the answer label in the same format follows the example below.
import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)  # To prevent overflow:
    # the exponential could take a very large value,
    # so subtracting max_a downscales the original values.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5, 1)
y = softmax(x)
max_i = np.argmax(y)

# One-hot prediction: 1 at the index with the highest probability, 0 elsewhere
oh = np.zeros((5, 1))
oh[max_i] = 1

print(" {0:7} {1:7} {2:10}".format("X", "S(X)", "Prediction"))
for i in range(len(x)):
    print(" {0:7.6f} {1:7.6f} {2}".format(x[i][0], y[i][0], oh[i][0]))
print("Sum {0:7.6f} {1:7.6f} {2}".format(np.sum(x), np.sum(y), np.sum(oh)))
X S(X) Prediction
0.332139 0.181766 0.0
0.299237 0.175883 0.0
0.011281 0.131876 0.0
0.531769 0.221928 0.0
0.794273 0.288546 1.0
Sum 1.968699 1.000000 1.0
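For comparison with the prediction, the answer label is usually stored in the same one-hot format. A common way to build it is to take a row of the identity matrix; the answer index below is made up for illustration:
import numpy as np

num_labels = 5
answer_index = 2                           # made-up true label index

Y = np.eye(num_labels)[answer_index].reshape(num_labels, 1)   # one-hot answer label
print(Y.ravel())                           # [0. 0. 1. 0. 0.]

Y_p = np.array([[0.0], [0.0], [0.0], [0.0], [1.0]])   # prediction from the run above
print(np.argmax(Y_p) == np.argmax(Y))      # False (predicted index 4, answer index 2)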