
    Recurrent Neural Network

    Recurrent neural networks (RNNs) are used mainly for natural language processing (NLP) tasks. In deep learning overall, CNNs are mainly for images while RNNs are mainly for NLP, though there are other use cases as well. In this section you will learn how a recurrent neural network works and look at applications of RNNs in NLP and in some other domains. Sequence models show up in many real-life use cases; for example, when you type a sentence in Gmail, it auto-completes it. An RNN embedded in the system provides that autocomplete feature.

    – Another use case is translation, where you translate text from one language to another.

    – A third use case is Named Entity Recognition, where the model tags entities such as person, organization, and location names in text.

    – Sentiment Analysis is another use case: given a written paragraph such as a product review, the model predicts its sentiment, for example whether the review is one star, two stars, and so on.

    A recurrent model feeds the outputs of its units back as inputs in the next time step; that is where the word "recurrent" comes from. An RNN is effectively a feed-forward neural network with repeated units: you can unfold the recurrent graph into a full network as a chain of repeated units.
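The unfolding described above can be sketched in a few lines of NumPy. This is a hypothetical minimal cell, not the tutorial's own code; the names `Wxh`, `Whh`, and `Why` are illustrative. The key point is that the same weight matrices are reused at every time step.

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, Why, h0):
    """Unfold a simple RNN over a sequence: the same weights are reused at every step."""
    h = h0
    ys = []
    for x in xs:                        # one iteration per time step
        h = np.tanh(Wxh @ x + Whh @ h)  # new hidden state from input and previous state
        ys.append(Why @ h)              # output computed from the current hidden state
    return ys, h

# Tiny example: 2-dim inputs, 3-dim hidden state, 1-dim output
rng = np.random.default_rng(0)
Wxh = rng.normal(size=(3, 2))
Whh = rng.normal(size=(3, 3))
Why = rng.normal(size=(1, 3))
xs = [rng.normal(size=2) for _ in range(4)]   # a length-4 input sequence
ys, h_last = rnn_forward(xs, Wxh, Whh, Why, np.zeros(3))
print(len(ys), h_last.shape)  # one output per step; the hidden state keeps its size
```

Note that the loop body never changes: the parameters are shared across time, which is what makes the chain of repeated units possible.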

    – RNNs can be thought of as neural networks that share parameters across time. They can handle inputs and outputs of different types and lengths.

    – For example, to translate from one language to another, a model inputs a piece of text and outputs another piece of text, where the lengths of the input and output are not necessarily the same.

    – The input and the output don't both have to be sequences either. A model can input a sequence, such as a blog post, and output a categorical variable, such as one that indicates whether the text carries positive, negative, or neutral sentiment.

    – Similarly, the output can be a sequence while the input is not. A random text generator can input a random seed and output random sentences.


    – It is possible to have many different input and output configurations: an RNN can be one-to-one, one-to-many, many-to-one, or many-to-many.

    – The input and output don't have to be the same length. They can be time-delayed as well, as in this figure.

    – It can even be none-to-many, where a model generates a sequence without an input. This type is essentially the same as one-to-many, since the output depends on some initial seed, even if it is not explicitly defined as an input.
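As an illustration of these configurations, a hypothetical NumPy sketch: the same recurrent cell gives many-to-many or many-to-one behaviour simply depending on which hidden states you keep.

```python
import numpy as np

def step(x, h, Wxh, Whh):
    """One recurrent step: combine the current input with the previous state."""
    return np.tanh(Wxh @ x + Whh @ h)

rng = np.random.default_rng(1)
Wxh, Whh = rng.normal(size=(3, 2)), rng.normal(size=(3, 3))
xs = [rng.normal(size=2) for _ in range(5)]   # a length-5 input sequence

h = np.zeros(3)
states = []
for x in xs:
    h = step(x, h, Wxh, Whh)
    states.append(h)

many_to_many = states      # keep a state for every time step (e.g. translation)
many_to_one = states[-1]   # keep only the final state (e.g. sentiment classification)
print(len(many_to_many), many_to_one.shape)
```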


    – Optimizing a model becomes more difficult as the chain of units gets longer.

    – In an RNN you can easily end up with very long chains of units when you unfold them in time.

    – One problem that may arise is the exploding gradient problem. Long sequences can result in long chains of parameter multiplications.

    – When you multiply so many weights together, the loss becomes highly sensitive to the weights. This sensitivity can produce steep slopes in the loss function. The slope of the cost function at a point might be so large that when you use it to update the weights, they go outside a reasonable range and end up with an unrepresentable value such as NaN. This can also happen over the course of several updates.

    A long chain of large weights leads to large activations, large activations lead to large gradients, and large gradients lead to large weight updates and even larger activations.

    – A quick fix for the problem is to clip the gradient magnitude so it never exceeds some maximum value. This is called gradient clipping.
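Both the explosion and the fix can be demonstrated in a hypothetical NumPy sketch: repeatedly multiplying by a factor larger than one blows up the gradient, and clipping rescales it to a chosen maximum norm (the threshold `max_norm=5.0` here is illustrative).

```python
import numpy as np

# Repeated multiplication by a weight > 1 mimics a long chain of time steps.
w = 1.5
grad = 1.0
for _ in range(100):
    grad *= w
print(grad)  # astronomically large; weight updates would leave any reasonable range

def clip_gradient(g, max_norm=5.0):
    """Gradient clipping: rescale g so its norm never exceeds max_norm."""
    norm = np.linalg.norm(g)
    return g * (max_norm / norm) if norm > max_norm else g

g = np.array([30.0, 40.0])   # norm 50, far above the threshold
print(clip_gradient(g))      # direction preserved, norm reduced to 5
```

Gradients below the threshold pass through unchanged, so clipping only intervenes when an update is about to explode.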


    Another problem to consider is the vanishing gradient problem. When you backpropagate the error in a deep network, the gradient sometimes gets vanishingly small by the time it reaches the early layers.

    – In a feed-forward network this makes the early layers harder to optimize, since they barely get any updates. In the context of an RNN, it results in the model quickly forgetting things it has seen earlier.
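The vanishing case is the mirror image of the exploding one. In this hypothetical sketch, repeatedly multiplying by a factor smaller than one (as happens when activations like tanh saturate and their derivatives shrink) drives the gradient toward zero, so the earliest time steps barely receive any learning signal.

```python
# Chaining many factors < 1 shrinks the gradient exponentially,
# just as chaining factors > 1 makes it explode.
w = 0.5
grad = 1.0
per_step = []
for _ in range(50):
    grad *= w
    per_step.append(grad)

print(per_step[0], per_step[-1])  # the signal reaching the earliest steps is ~0
```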

    There might also be long-term dependencies. For example, if the task is to predict a missing word in a paragraph, the contextual clues needed might not be very close to the word being predicted.

    – Two popular RNN architectures, LSTMs and gated recurrent units (GRUs), both aim to remember long-term dependencies while alleviating the vanishing and exploding gradient problems. These architectures use gated modules to keep what is important in a sequence of data points. The main idea in gated architectures is to have a straight channel that flows through time, with modules connected to it. These modules are regulated by gates, which determine how much each module should contribute to the main channel. The gates are simply sigmoid units that produce a number between zero and one.

    – Zero means nothing passes through the gate, and one means everything is let through. Let’s build an extremely simplified version of a gated unit.
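Following that idea, here is one possible extremely simplified gated unit in NumPy. This is a toy sketch, not the actual LSTM or GRU equations: a sigmoid gate, computed from the input and the carried channel, decides how much of a candidate value is blended into the channel and how much of the old channel is kept.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_step(x, c, wg, wc):
    """One step of a toy gated unit.

    c  -- the "straight channel" carried through time
    wg -- weights of the sigmoid gate
    wc -- weights of the candidate value
    """
    z = np.concatenate([x, c])
    gate = sigmoid(wg @ z)          # 0 means block, 1 means let everything through
    candidate = np.tanh(wc @ z)     # new content proposed for the channel
    return gate * candidate + (1.0 - gate) * c   # blend new content with old channel

rng = np.random.default_rng(2)
wg = rng.normal(size=(3, 5))   # gate weights: input (2) + channel (3) -> channel (3)
wc = rng.normal(size=(3, 5))
c = np.zeros(3)
for x in [rng.normal(size=2) for _ in range(4)]:
    c = gated_step(x, c, wg, wc)
print(c.shape)  # the channel keeps its size as it flows through time
```

Because each step is a gate-weighted blend of the old channel and a bounded candidate, the channel value stays in a stable range instead of exploding, which is the intuition behind the gated channel described above.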

    It's possible to increase the representational capacity of recurrent neural networks by stacking recurrent units on top of each other. Deeper RNNs can learn more complex patterns in sequential data, but the extra depth makes the model harder to optimize.
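Stacking can be sketched by feeding each layer's sequence of hidden states to the next layer as its input sequence. A hypothetical NumPy example:

```python
import numpy as np

def rnn_layer(xs, Wxh, Whh):
    """Run one recurrent layer over a sequence and return all hidden states."""
    h = np.zeros(Whh.shape[0])
    hs = []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h)
        hs.append(h)
    return hs

rng = np.random.default_rng(3)
xs = [rng.normal(size=2) for _ in range(6)]

# Layer 1 reads the raw inputs; layer 2 reads layer 1's hidden states.
layer1 = rnn_layer(xs, rng.normal(size=(4, 2)), rng.normal(size=(4, 4)))
layer2 = rnn_layer(layer1, rng.normal(size=(3, 4)), rng.normal(size=(3, 3)))
print(len(layer2), layer2[-1].shape)  # one state per time step at every depth
```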



    Copyright 1999- Ducat Creative, All rights reserved.
