Architecture of RNN

The image below shows an RNN unfolded into a full network. Here $x_t$ is the input to the network at time $t$, and $h_t$ is the hidden state at time $t$, also referred to as the memory of the network. It is calculated from the previous hidden state and the current input.

This is represented by:

$$h_t = f(U x_t + W h_{t-1} + b_h)$$

Here $U$ and $W$ are the weights for the current input and the previous hidden state respectively, $b_h$ is the bias of the hidden layer, and $f$ is the non-linearity (such as tanh) applied to the sum to produce the hidden state.
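To make the unfolding concrete, the recurrence can be written out for the first few time steps. This is a sketch that assumes an initial hidden state $h_0$ (often taken to be a zero vector; the text here does not specify it). The same $U$, $W$ and $b_h$ are reused at every step:

$$
\begin{aligned}
h_1 &= f(U x_1 + W h_0 + b_h) \\
h_2 &= f(U x_2 + W h_1 + b_h) \\
h_3 &= f(U x_3 + W h_2 + b_h)
\end{aligned}
$$

Because $h_3$ depends on $h_2$, which in turn depends on $h_1$, each hidden state carries information from all earlier inputs. This is why it is called the memory of the network.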

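The output at time $t$ is calculated as:

$$O_t = f(V h_t + b_o)$$

where $V$ is the output weight matrix (written separately from the recurrent weights $W$, since the two generally have different shapes) and $b_o$ is the bias for the output layer.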
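The hidden-state and output equations can be sketched as a short forward pass in plain Python. This is a minimal illustration, not a definitive implementation. It makes several assumptions: `f` is tanh, the initial hidden state is a zero vector, the output weight matrix is named `V` to keep it distinct from the recurrent `W`, and the helper names (`matvec`, `rnn_forward`) and toy weight values are made up for the example.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_forward(xs, U, W, V, b_h, b_o, f=math.tanh):
    """Run a simple RNN over a sequence; return (hidden_states, outputs)."""
    h = [0.0] * len(b_h)  # assumed initial hidden state h_0 = 0
    hidden_states, outputs = [], []
    for x in xs:
        # h_t = f(U x_t + W h_{t-1} + b_h)
        pre_h = [u + w + b for u, w, b in zip(matvec(U, x), matvec(W, h), b_h)]
        h = [f(a) for a in pre_h]
        # O_t = f(V h_t + b_o)
        pre_o = [v + b for v, b in zip(matvec(V, h), b_o)]
        o = [f(a) for a in pre_o]
        hidden_states.append(h)
        outputs.append(o)
    return hidden_states, outputs

# Toy sizes (hypothetical values): 2 input features, 3 hidden units, 1 output
U = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]                    # 3 x 2
W = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]      # 3 x 3
V = [[1.0, -1.0, 0.5]]                                       # 1 x 3
b_h = [0.0, 0.0, 0.0]
b_o = [0.0]

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # sequence of 3 time steps
hidden_states, outputs = rnn_forward(xs, U, W, V, b_h, b_o)

for t, (h, o) in enumerate(zip(hidden_states, outputs), start=1):
    print(f"t={t}  h_t={h}  O_t={o}")
```

Note that the same `U`, `W`, `V` and biases are used at every time step. Only the hidden state `h` changes as the loop moves through the sequence.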

Now that we have covered the maths behind the architecture of an RNN, let's see how to train the network.
