Architecture of RNN
The image below shows an RNN unfolded into a full network. x_t is the input to the network at time t, and h_t is the hidden state at time t, also referred to as the memory of the network. It is calculated from the previous hidden state and the current input:
h_t = f(U x_t + W h_{t-1} + b_h)

Here U and W are the weights applied to the current input and to the previous hidden state respectively, b_h is the bias of the hidden layer, and f is the non-linearity (for example tanh) applied to the sum to produce the new hidden state.
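The hidden-state update described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names matvec and rnn_step, the toy weight values, the layer sizes, and the choice of tanh for f are all assumptions made for the example.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_step(x_t, h_prev, U, W, b_h, f=math.tanh):
    """One recurrence step: h_t = f(U x_t + W h_prev + b_h), element-wise."""
    ux = matvec(U, x_t)       # contribution of the current input
    wh = matvec(W, h_prev)    # contribution of the previous hidden state
    return [f(a + b + c) for a, b, c in zip(ux, wh, b_h)]

# Hypothetical toy sizes: 2 input features, 3 hidden units.
U = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.0]]
W = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
b_h = [0.0, 0.0, 0.0]

h0 = [0.0, 0.0, 0.0]                       # initial hidden state
h1 = rnn_step([1.0, 2.0], h0, U, W, b_h)   # first time step
h2 = rnn_step([0.0, 0.0], h1, U, W, b_h)   # zero input, but h1 carries over
```

Note that h2 is non-zero even though the second input is all zeros: the only signal comes from h1 through W, which is what "memory" means here.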
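The output at time t is calculated as:
O_t = f(V h_t + b_o)
Here V is the weight matrix from the hidden state to the output (a separate matrix from the recurrent weight W) and b_o is the bias of the output layer.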
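Putting the two formulas together, unfolding the network amounts to looping over the input sequence and reusing the same weights at every step. The sketch below is illustrative only and makes several assumptions: the hidden-to-output weights are named V to keep them distinct from the recurrent W, tanh is used for f in both equations, and the function name rnn_forward and all toy values are made up for the example.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_forward(xs, h0, U, W, V, b_h, b_o, f=math.tanh):
    """Unroll the RNN over sequence xs; return (hidden states, outputs)."""
    h = h0
    hidden_states, outputs = [], []
    for x_t in xs:
        # Hidden state: h_t = f(U x_t + W h_{t-1} + b_h)
        h = [f(a + b + c)
             for a, b, c in zip(matvec(U, x_t), matvec(W, h), b_h)]
        # Output: O_t = f(V h_t + b_o)
        o = [f(a + b) for a, b in zip(matvec(V, h), b_o)]
        hidden_states.append(h)
        outputs.append(o)
    return hidden_states, outputs

# Hypothetical toy sizes: 1 input feature, 2 hidden units, 1 output.
U = [[0.5], [-0.5]]
W = [[0.1, 0.0], [0.0, 0.1]]
V = [[1.0, 1.0]]
b_h = [0.0, 0.0]
b_o = [0.0]

xs = [[1.0], [0.0], [-1.0]]   # a sequence of three time steps
hs, outs = rnn_forward(xs, [0.0, 0.0], U, W, V, b_h, b_o)
```

The same U, W, V, b_h and b_o are applied at every time step; only the hidden state h changes as the loop advances.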
Now that we have covered the maths behind the architecture of an RNN, let's see how to train the network.