Architecture of RNN
The image below shows an RNN unfolded into a full network. Here $x_t$ is the input to the network at time $t$, and $h_t$ is the hidden state at time $t$, also referred to as the memory of the network. It is calculated from the previous hidden state and the current input.
Represented by
$$h_t = f(U x_t + W h_{t-1} + b_h)$$
Here $U$ and $W$ are the weights applied to the input and to the previous hidden state respectively, $b_h$ is the bias of the hidden layer, and $f$ is the non-linearity applied to the sum to produce the new hidden state.
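To make the recurrence concrete, here is a minimal sketch in plain Python (standard library only, so the arithmetic stays visible). It assumes tanh as the non-linearity f, writes the input and hidden state at time t as `x_t` and `h_t`, and uses made-up toy weights; the helper names `matvec`, `rnn_step` and `unroll` are illustrative, not from any library.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_step(x_t, h_prev, U, W, b_h):
    """One recurrence step: h_t = tanh(U x_t + W h_prev + b_h).
    tanh is an assumed choice for the non-linearity f."""
    ux = matvec(U, x_t)
    wh = matvec(W, h_prev)
    return [math.tanh(a + b + c) for a, b, c in zip(ux, wh, b_h)]

def unroll(xs, h0, U, W, b_h):
    """Unfold the network over a sequence: the same U, W and b_h
    are reused at every time step. Returns all hidden states."""
    states = []
    h = h0
    for x_t in xs:
        h = rnn_step(x_t, h, U, W, b_h)
        states.append(h)
    return states

# Toy sizes (assumed): 2 input features, 3 hidden units, 4 time steps.
U = [[0.1, -0.2], [0.0, 0.3], [0.5, 0.1]]
W = [[0.2, 0.0, -0.1], [0.1, 0.1, 0.0], [0.0, -0.3, 0.2]]
b_h = [0.0, 0.1, -0.1]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]

states = unroll(xs, [0.0, 0.0, 0.0], U, W, b_h)
for t, h in enumerate(states):
    print(t, [round(v, 4) for v in h])
```

Each hidden state depends on every earlier input through the previous hidden state, which is why it is described as the memory of the network. In practice an array library would replace the list-based `matvec`.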
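The output at time $t$ is calculated from the hidden state $h_t$ as shown below:
$$o_t = g(V h_t + b_o)$$
Here $V$ is the weight matrix from the hidden state to the output, $b_o$ is the bias for the output layer, and $g$ is the output non-linearity (commonly a softmax when the network predicts class probabilities).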
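The output step can be sketched the same way. This is a hedged illustration rather than a definitive implementation: the hidden-to-output weight matrix `V`, the bias name `b_o`, and the use of softmax as the output non-linearity are assumptions chosen for the example, and the numbers are arbitrary.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_output(h_t, V, b_o):
    """Output at time t: o_t = softmax(V h_t + b_o).
    V and softmax are assumed here for illustration."""
    logits = [sum(v * h for v, h in zip(row, h_t)) + b
              for row, b in zip(V, b_o)]
    return softmax(logits)

# Toy values (assumed): 3 hidden units, 2 output classes.
h_t = [0.2, -0.4, 0.7]
V = [[0.3, -0.1, 0.5], [-0.2, 0.4, 0.1]]
b_o = [0.0, 0.1]

o_t = rnn_output(h_t, V, b_o)
print([round(p, 4) for p in o_t])
```

With a softmax the entries of `o_t` are positive and sum to one, so they can be read as class probabilities at time t. For a regression task the softmax would be dropped and the affine result used directly.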
Now that we have covered the maths behind the architecture of an RNN, let's see how to train the network.