Using Deep Learning Across Domains

Deep Learning, Machine Learning, and Artificial Intelligence are used across a wide range of difficult problem domains. Thanks to Deep Learning, several problems that were considered impossible to solve only a couple of decades ago can now be solved in practice. Let’s look at some of the areas that use deep learning to solve problems:

Coloring black and white images: Coloring photographs is not a new idea. People have been doing it by hand for a long time, but it is slow and difficult work. A Deep Learning model recognizes the objects in a picture and the context of the photograph, and uses this knowledge to color the image.

Adding sound to silent videos: Deep Learning models have also learned to synthesize sounds for silent videos. One system was trained on a sample of about a thousand videos showing a drumstick striking different surfaces and producing different sounds. The model associated the video frames with pre-recorded sounds in a database so that it could select the sound that best matches the action in the scene. The result was then tested by showing the videos to people, who had to decide whether the sound in each video was real or synthesized. The results were mostly positive, with the model producing sounds close to the real ones. This has the potential to bring sound to old silent movies.
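The selection step described above, choosing the stored sound that best matches what the model sees, can be sketched as a nearest-neighbour lookup. This is only an illustration under stated assumptions: the three-number feature vectors, the sound names, and the use of cosine similarity are all hypothetical, and the original system's features and matching rule are not described in this article.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def pick_sound(predicted_features, sound_bank):
    """Return the name of the stored sound closest to the prediction."""
    return max(
        sound_bank,
        key=lambda name: cosine_similarity(predicted_features, sound_bank[name]),
    )


# Hypothetical database: each pre-recorded sound is summarized by a
# made-up feature vector. A real system would use learned audio features.
SOUND_BANK = {
    "wood_tap": [0.9, 0.1, 0.0],
    "metal_clang": [0.1, 0.9, 0.2],
    "leaf_rustle": [0.0, 0.2, 0.9],
}

# Pretend a model looked at the video frames and predicted these features.
predicted = [0.2, 0.8, 0.1]
print(pick_sound(predicted, SOUND_BANK))  # prints "metal_clang"
```

In this sketch the Deep Learning model's only job is to turn video frames into the predicted feature vector. The lookup itself is ordinary nearest-neighbour search.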

Real-Time Translation: Real-time translation means translating text, including text that appears in images, into another language with so little delay that it seems instantaneous. Computers have been used for automatic translation for a long time, but Deep Learning takes it a step further by translating text as well as images without a separate preprocessing step. Google Translate uses this method: you can translate text in front of you simply by pointing your camera at it. Deep Learning models identify the letters and other characters and translate them. If the text is part of an image, the model also recreates the image with the translated text. This is known as Instant Visual Translation.

Detecting and Classifying Objects in Photographs: With Deep Learning models, the objects in a photograph, as well as their actions, can now be detected and classified. Face recognition is one example of this ability. It is widely seen on social media, where the AI can identify the people in a photograph. This was made possible by large Neural Networks. A much more complicated variation of the task is identifying all the objects in an image and their actions in relation to the context of the image. This also enables the AI to write captions based on the objects, their actions, and the context.

Generating Handwriting: Ever since computers came into use, various fonts have been developed and introduced. Deep Learning took this a step further by generating new styles of handwriting. A model was given a large collection of handwriting samples together with the sequence of coordinates that the writing device traced to produce them. From this data it learned the relationship between the movement of the writing device and the resulting handwriting, and it could then generate new handwriting styles. Deep Learning can also mimic several existing styles of handwriting, which may in turn help improve the accuracy of forensic handwriting analysis.
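To make the "sequence of coordinates" concrete, here is a minimal sketch of one common way to prepare pen data for a sequence model: each point is turned into an offset from the previous point, plus a flag marking where the pen is lifted. The function name `to_offsets` and the sample strokes are hypothetical, and the article does not say which exact encoding the original work used, so treat this format as an assumption.

```python
def to_offsets(strokes):
    """Convert pen strokes into (dx, dy, pen_lifted) steps.

    strokes: list of strokes, each a list of (x, y) pen positions.
    pen_lifted is 1 on the last point of a stroke, otherwise 0.
    """
    # Flatten all strokes, remembering which points end a stroke.
    points = [
        (x, y, i == len(stroke) - 1)
        for stroke in strokes
        for i, (x, y) in enumerate(stroke)
    ]
    steps = []
    prev_x, prev_y = points[0][:2] if points else (0, 0)
    for x, y, is_last in points:
        steps.append((x - prev_x, y - prev_y, 1 if is_last else 0))
        prev_x, prev_y = x, y
    return steps


# Two made-up strokes: a short rising line, then a vertical line.
sample = [[(0, 0), (1, 2), (3, 3)], [(5, 0), (5, 4)]]
print(to_offsets(sample))
# prints [(0, 0, 0), (1, 2, 0), (2, 1, 1), (2, -3, 0), (0, 4, 1)]
```

Working with offsets rather than absolute positions means the same letter looks the same to the model wherever it was written on the page.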

Playing Video Games: A few years back, a team at Google DeepMind set up a Deep Learning model to play video games on the Atari 2600. The model was tested on seven Atari 2600 games; it outperformed all previous approaches on six of them and surpassed a human expert on three. It started out performing poorly, making random and unconnected moves. In Breakout, for example, after only two hours of play the model had learned how the game works and was playing at the level of an expert player. After another two hours, it had figured out a shortcut to win the game, namely digging a tunnel through the wall, and was playing at a superhuman level. The same algorithm was used for every game with no game-specific alterations, which means the model learned how to play and improved without any human assistance. Since then, several teams at various companies have trained Deep Learning models to play other games. In many of these games the models have performed exceptionally well, learning to play and surpassing human players after only a few hours of training. In the game Doom, Deep Learning models have been reported to achieve twice the kill rate of human players. Results like these have also raised the prospect of Deep Learning models being used for military purposes, such as carrying out drone strikes.
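The idea of starting with random moves and improving from reward alone can be illustrated with Q-learning, the family of methods the Atari work built on. The sketch below is a deliberately simplified stand-in, not DeepMind's implementation: it uses a small lookup table instead of a deep neural network, and the five-cell corridor "game", the reward of 1.0, and all parameter values are assumptions made up for this example.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; the only reward is 1.0 for reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from purely random play (Q-learning is off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(10_000):                # safety cap on episode length
            a = rng.randrange(2)               # random, unconnected moves
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for each non-goal cell."""
    return [
        ACTIONS[max(range(2), key=lambda a: q[s][a])]
        for s in range(N_STATES - 1)
    ]


print(greedy_policy(train()))  # prints ['right', 'right', 'right', 'right']
```

Nothing in the code says that moving right is good. The agent works that out from the reward signal, which is the same principle, at toy scale, as a model discovering a winning strategy in a game without being told the rules.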

Thus, we see how companies are using deep learning to tackle many significant problems across various domains.