12. Multinomial Classification

Concept, activation function, and output function of softmax regression

Multinomial Classification

  • Softmax Classification = Softmax Regression
  • Finds which category the input belongs to.
import numpy as np
import matplotlib.pyplot as plt

nrDots = 10

def createDots(xMean, yMean, form):
    # Draw nrDots points normally scattered around (xMean, yMean)
    x = np.random.normal(xMean, 0.5, nrDots)
    y = np.random.normal(yMean, 0.5, nrDots)
    plt.plot(x, y, form)

# Three clusters: (xMean, yMean, marker style)
dotConfigs = [(3, 5, "bo"),
              (1, 1, "ro"),
              (5, 2, "go")]
for conf in dotConfigs:
    createDots(*conf)

# Border lines: (slope, intercept, color)
lineConfigs = [(1/6, 3, "b"),
               (-7, 15, "r"),
               (6, -20, "g")]

x = list(range(-1, 7))
for slope, intercept, color in lineConfigs:
    y = [i * slope + intercept for i in x]
    plt.plot(x, y, color)

plt.xlim(-1, 6)
plt.ylim(-1, 6)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Multinomial Classification")
plt.show()
Image 1. Multinomial classification
  • In this example, the dots are classified as blue, red, and green, and the lines show the borders between them.
  • For each color, multinomial classification finds the line that divides the True area from the False area. In this sense, multinomial classification is a combination of binary classifications.
  • However, multinomial classification uses a different activation function and cost function to keep things simple.
    • The affine function is the basis of a neural network, so it is used here as well.

$$ Active(W \cdot X) = Y_p $$

$$ Affine = W \cdot X = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} w_{11}x_1 + w_{12}x_2 + w_{13}x_3 \\ w_{21}x_1 + w_{22}x_2 + w_{23}x_3 \\ w_{31}x_1 + w_{32}x_2 + w_{33}x_3 \end{bmatrix} $$

$$ Active(Affine) = \begin{bmatrix} y_{p1} \\ y_{p2} \\ y_{p3} \end{bmatrix} = Y_p $$

  • Therefore, \( Active(Affine) \) is the hypothesis of multinomial classification.
  • Until now, \( Y \) has been called the answer, but from here on it is called the label. Likewise, \( Y_p \) is the predicted label.
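As a minimal sketch (the weight and input values below are made up for illustration), the affine part of the hypothesis can be computed with numpy; the activation function introduced next is then applied to this vector.

import numpy as np

# Hypothetical 3x3 weights and 3x1 input, matching the shapes above
W = np.array([[0.2, 0.1, 0.4],
              [0.3, 0.5, 0.2],
              [0.1, 0.2, 0.6]])
X = np.array([[1.0],
              [2.0],
              [3.0]])

affine = np.dot(W, X)   # W . X: a 3x1 column vector, one entry per label
print(affine)
# [[1.6]
#  [1.9]
#  [2.3]]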

Softmax Function

  • A generalization of the logistic function that "squashes" a K-dimensional vector \( z \) of arbitrary real values to a K-dimensional vector \( \sigma(z) \) of real values in the range (0, 1) that add up to 1. - Wiki
  • Simply speaking, the softmax function returns the probability of each label.

$$ S(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}} $$

import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)   # To prevent overflow:
                        # the exponential could produce a very large value,
                        # so subtracting max_a downscales the inputs first.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5,1)
y = softmax(x)

print("    {0:7}   {1:7}".format("X", "S(X)"))
for i in range(len(x)):
    print("    {0:7.6f}  {1:7.6f}".format(x[i][0], y[i][0]))
print("Sum {0:7.6f}  {1:7.6f}".format(np.sum(x), np.sum(y)))
    X         S(X)   
    0.717855  0.207212
    0.620096  0.187914
    0.505516  0.167570
    0.541942  0.173787
    0.958234  0.263518
Sum 3.343643  1.000000
  • In this example, each input x is transformed into its probability by the softmax function.
  • Because of this property, the softmax function is used as the activation function of multinomial classification.
    • As we learned from binary classification, keeping the output between 0 and 1 makes the neural network more robust against extreme or odd inputs. The sketch below shows why the max subtraction inside softmax matters numerically.
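The max_a subtraction in the softmax code above is not cosmetic. A minimal sketch (the input values are made up for illustration) shows that the naive formula overflows for large inputs, while the shifted version stays stable. Subtracting a constant from every entry does not change the result, because the factor \( e^{-max} \) cancels in the numerator and denominator.

import numpy as np

a = np.array([1000.0, 1010.0, 990.0])

# Naive softmax: np.exp(1000.0) overflows to inf, so the result is nan
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(a) / np.sum(np.exp(a))

# Shifted softmax: subtracting the max keeps every exponent <= 0
shifted = np.exp(a - np.max(a))
stable = shifted / np.sum(shifted)

print(naive)    # [nan nan nan]
print(stable)   # approximately [4.54e-05 1.00e+00 2.06e-09]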

Output Function

  • From the softmax function, we know which label has the highest probability. However, we still have not assigned True or False.
  • To do this, multinomial classification uses the max function.
  • Therefore, the label with the highest probability becomes True, and all the others become False.
  • Also, argmax() is widely used to get the index of the True label.
    • The index starts from 0.
import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)   # To prevent overflow:
                        # the exponential could produce a very large value,
                        # so subtracting max_a downscales the inputs first.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5,1)
y = softmax(x)
max_i = np.argmax(y)

print("    {0:7}   {1:7}".format("X", "S(X)"))
for i in range(len(x)):
    print("    {0:7.6f}  {1:7.6f}".format(x[i][0], y[i][0]))
print("Sum {0:7.6f}  {1:7.6f}".format(np.sum(x), np.sum(y)))
print("Biggest index: {0}, value: {1:7.6f}".format(max_i, x[max_i][0]))
    X         S(X)   
    0.471060  0.177648
    0.557162  0.193621
    0.729981  0.230148
    0.370831  0.160705
    0.763011  0.237877
Sum 2.892045  1.000000
Biggest index: 4, value: 0.763011

One-Hot Encoding

  • A group of bits among which the legal combinations of values are only those with a single high (1) bit and all the others low (0). - Wiki
  • In other words, only one digit in the bit stream is 1 and all the others are 0.
  • This is the usual output format of multinomial classification.
  • In multinomial classification, the 1 marks the label which the neural network infers the input belongs to.
import numpy as np

# Softmax function
def softmax(a):
    max_a = np.max(a)   # To prevent overflow:
                        # the exponential could produce a very large value,
                        # so subtracting max_a downscales the inputs first.
    exp_a = np.exp(a - max_a)
    sum_exp_a = np.sum(exp_a)
    return exp_a / sum_exp_a

x = np.random.rand(5,1)
y = softmax(x)
max_i = np.argmax(y)
oh = np.zeros((5, 1))
oh[max_i] = 1

print("    {0:7}   {1:7}   {2:10}".format("X", "S(X)", "Prediction"))
for i in range(len(x)):
    print("    {0:7.6f}  {1:7.6f}  {2}".format(x[i][0], y[i][0], oh[i][0]))
print("Sum {0:7.6f}  {1:7.6f}  {2}".format(np.sum(x), np.sum(y), np.sum(oh)))
    X         S(X)      Prediction
    0.332139  0.181766  0.0
    0.299237  0.175883  0.0
    0.011281  0.131876  0.0
    0.531769  0.221928  0.0
    0.794273  0.288546  1.0
Sum 1.968699  1.000000  1.0
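For training, the true labels are usually stored in the same one-hot format. A common numpy idiom for that (a sketch, assuming the labels are given as integer class indices) is indexing into an identity matrix:

import numpy as np

labels = np.array([2, 0, 1])            # hypothetical integer class labels
num_classes = 3

one_hot = np.eye(num_classes)[labels]   # row i is the one-hot vector of labels[i]
print(one_hot)
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]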
