30. RNN Basic


Recurrent Neural Network (RNN)

  • A class of artificial neural network where connections between units form a directed cycle. This allows it to exhibit dynamic temporal behavior. RNNs can use their internal memory to process arbitrary sequences of inputs. - Wiki
  • Data in nature usually comes as a sequence, and sequential data gives us many clues about context.
    • Weather
    • Speech and conversation
    • Object detection in a scene
  • An RNN is a neural network for sequential data. Compared to other neural networks, an RNN has a feedback (recurrence) path. Through this path, previous states affect the next output, so the RNN learns context from sequential data.
Image 1. RNN and FCN
  • As with a CNN, a fully connected network (FCN) follows the RNN. The RNN extracts context information from the sequential data, and the fully connected layers then analyze that context to produce the final result.
Image 2. Representation of RNN
  • The left side of the image above is the basic representation of an RNN, and the right side is the unrolled representation. The unrolled form shows several RNN cells, but the actual RNN layer is just one; the repeated cells represent the flow of time.
  • In the unrolled representation, the RNN has at least 2 inputs and 2 outputs. The inputs are the real input data and the state from the previous step. Likewise, the outputs are the real output and the current state computed from those inputs.
  • It is represented by

$$ h_t = f_W(h_{t-n}, ..., h_{t-2}, h_{t-1}, x_t) $$

  • \(h_t\) is the new state.
  • \(f_W\) is a function with weights \(W\).
  • \(h_{t-n}, ..., h_{t-1}\) are the old states; \(n\) is the number of previous steps considered. The larger \(n\) is, the more previous states the RNN takes into account at once. If \(n\) is 1, the RNN considers only the one previous state; if \(n\) is 3, it considers the three previous states.
  • \(x_t\) is the input data at time \(t\). (A small code sketch of this recurrence follows below.)
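
To make the recurrence concrete, here is a minimal Python sketch. The names `run_rnn`, `f_W`, and `h0` are illustrative, not from any library; the point is that the same function \(f_W\) is reused at every time step, and only the state changes.

```python
# Minimal sketch of the recurrence (illustrative names, not a library API).
# The same function f_W is applied at every time step; only the state changes.
def run_rnn(f_W, x_sequence, h0):
    h = h0                    # initial state (e.g., zeros)
    states = []
    for x_t in x_sequence:    # "unrolling" over time
        h = f_W(h, x_t)       # new state from the old state and current input
        states.append(h)
    return states
```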

Weights of Recurrent Neural Network

  • A vanilla RNN is an RNN whose state consists of a single hidden vector \(h\). In this chapter, the vanilla RNN is used for explanation because it is simple.
  • A vanilla RNN is trained with the input data and only the last state, so \(n\) is 1.

$$ h_t = f_W(h_{t-1}, x_t) $$

  • For RNNs, \(\tanh()\) is widely used as the activation function.

$$ h_t = \tanh(W_{hh} \cdot h_{t-1} + W_{xh} \cdot x_t) $$

  • For the state update, there are two weight matrices: one (\(W_{hh}\)) for the previous state, and the other (\(W_{xh}\)) for the input data.

  • The output of the RNN is

$$ y_t = W_{hy} \cdot h_t $$

  • In addition to the weights for the previous state and the input data, the RNN has a weight matrix \(W_{hy}\) that maps the current state to the output.
  • In a vanilla RNN, there are therefore 3 weight matrices, and the same weights and activation function are reused at every step, as the sketch below shows.
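
The following NumPy sketch implements one vanilla RNN step with the three weight matrices above. The sizes (input size 4, hidden size 3) and the small random initialization are assumptions for illustration, not values from the original post.

```python
import numpy as np

# One vanilla RNN step:
#   h_t = tanh(W_hh . h_{t-1} + W_xh . x_t)
#   y_t = W_hy . h_t
# Sizes are assumptions for illustration: input size 4, hidden size 3.
input_size, hidden_size = 4, 3

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.01   # input  -> hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.01  # hidden -> hidden
W_hy = rng.standard_normal((input_size, hidden_size)) * 0.01   # hidden -> output

def rnn_step(h_prev, x_t):
    """The same three weight matrices are reused at every time step."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)  # new state
    y_t = W_hy @ h_t                           # output (unnormalized scores)
    return h_t, y_t

h = np.zeros(hidden_size)           # first state is 0: there is no previous step
x = np.array([1.0, 0.0, 0.0, 0.0])  # a one-hot input vector
h, y = rnn_step(h, x)
```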

Example: Character-Level Language Model

  • Now, an RNN will be trained on the word "hello". After training, the RNN will suggest the next character for whatever the user types. For instance, when the user types "h", the RNN will suggest "e"; if the user types "e", the RNN will suggest "l".
Image 3. Character level training for "hello" with RNN
  • The numbers of input and output characters are 4, not 5, because there is no need to predict a next character after the last character.
Image 4. Train "hello" word with RNN
  • To train the RNN, the characters should be represented as one-hot encoded vectors. "hello" has 4 unique characters: "h", "e", "l" and "o".
  • The state for the first input is set to 0, because there is no previous step.
  • The RNN is trained on the input data and updates \(W_{hh}\), \(W_{xh}\) and \(W_{hy}\).

$$ h_t = \tanh(W_{hh} \cdot h_{t-1} + W_{xh} \cdot x_t) $$

$$ y_t = W_{hy} \cdot h_t $$

  • \(y_t\) is fed to a softmax layer, and the final decision is the character with the highest probability.
  • In this example, the results of the first 2 steps are wrong. The results should be "e" and "l", but they are "o" and "o". These wrong results are used to calculate the cost and to update the weights and biases. (A forward-pass sketch of this example follows below.)
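
Here is a minimal NumPy sketch of the forward pass for this example. The hidden size and the random initialization are assumptions; with untrained random weights the predictions will generally be wrong, just as described above, and a real run would repeat forward and backward passes to reduce the cost.

```python
import numpy as np

# Character-level "hello": 4 unique characters, one-hot encoded.
chars = ['h', 'e', 'l', 'o']
char_to_ix = {c: i for i, c in enumerate(chars)}
vocab_size, hidden_size = len(chars), 3        # hidden size is an assumption

def one_hot(c):
    v = np.zeros(vocab_size)
    v[char_to_ix[c]] = 1.0
    return v

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((hidden_size, vocab_size)) * 0.01
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.01
W_hy = rng.standard_normal((vocab_size, hidden_size)) * 0.01

inputs, targets = "hell", "ello"   # 4 inputs and 4 targets, not 5
h = np.zeros(hidden_size)          # first state is 0: no previous step
loss = 0.0
for x_c, t_c in zip(inputs, targets):
    h = np.tanh(W_hh @ h + W_xh @ one_hot(x_c))  # state update
    y = W_hy @ h                                 # scores for each character
    p = np.exp(y) / np.sum(np.exp(y))            # softmax probabilities
    loss += -np.log(p[char_to_ix[t_c]])          # cross-entropy cost
    print(x_c, '->', chars[int(np.argmax(p))])   # predicted next character
print('total cost:', loss)
```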

Network Variation of RNN

  • Based on the concept of the RNN, there are many network variations; a small sketch contrasting two of them follows the list below.
Image 5. Network variations (src: karpathy.github.io)
  • One-to-one: ex) Vanilla neural network
  • One-to-many: ex) Image captioning
  • Many-to-one: ex) Sentiment classification
  • Many-to-many (sequence in, sequence out): ex) Machine translation
  • Many-to-many (synced, one output per input): ex) Video classification on frame level
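
As an illustrative sketch (the function names are hypothetical, and `rnn_step` is assumed to behave like the step function above), the variations differ mainly in which outputs are kept:

```python
# Illustrative sketch: the variations differ in which outputs are kept.
# rnn_step is assumed to behave like the step function defined above.
def many_to_one(rnn_step, xs, h):
    y = None
    for x in xs:              # read the whole input sequence...
        h, y = rnn_step(h, x)
    return y                  # ...keep only the last output (e.g., a sentiment)

def many_to_many_synced(rnn_step, xs, h):
    ys = []
    for x in xs:              # one output per input (e.g., per video frame)
        h, y = rnn_step(h, x)
        ys.append(y)
    return ys
```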

Further RNN

  • The RNN is a great approach for training on sequential data, but better approaches have since been proposed:

  • Long Short-term Memory (LSTM) - Wiki

  • Gated Recurrent Unit (GRU) - Wiki
