Wednesday, January 03, 2018

Understanding Artificial Neural Networks

Artificial neural networks are computer programs that learn a subject from examples instead of being told the rules. That makes them a method of machine learning. Most software is created by programmers painstakingly detailing exactly how the program is expected to behave. But in machine learning systems, the programmers create a learning algorithm and feed it sample data, allowing the software to learn to solve a specific problem by itself.

Artificial neural networks were inspired by animal brains. They are a network of interconnected nodes that represent neurons, and the computation is spread throughout the network rather than concentrated in one place.

But information doesn't fly around in all directions in the network. Instead it flows in one direction through multiple layers of nodes, from an input layer to an output layer. Each layer gets inputs from the previous layer and then sends its calculation results to the next layer. In an image classification system, the initial input would be the pixels of the image, and the final output would be a score for each of the possible classes.



The processing in each layer is simple: Each node gets numbers from multiple nodes in the previous layer, and adds them up. If the sum is big enough, it sends a signal to the nodes in the next layer. Otherwise it does nothing. But there is a trick: The connections between the nodes are weighted. So if node A sends a 1 to nodes B and C, it might arrive at B as 0.5, and at C as 3, depending on the weights of the connections.
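To make the weighted sum concrete, here is a minimal sketch in plain Python. The function name `node_output` and the threshold of 1.0 are made up for illustration; only the A, B and C numbers come from the example above.

```python
def node_output(inputs, weights, threshold=1.0):
    """Add up the weighted inputs; fire (1) if the sum reaches the threshold, else stay quiet (0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Node A sends a 1. The connection weights scale it on the way.
signal_from_a = 1
arrives_at_b = signal_from_a * 0.5  # weight on the A-to-B connection
arrives_at_c = signal_from_a * 3    # weight on the A-to-C connection

# A node listening to three nodes in the previous layer (hypothetical weights).
fired = node_output([1, 0, 1], [0.5, 2.0, 0.7])  # weighted sum is 1.2, so it fires
quiet = node_output([1, 0, 0], [0.5, 2.0, 0.7])  # weighted sum is 0.5, so it does not
```

Note that the second input has the biggest weight, but it contributes nothing because the node behind it sent a 0.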

The system learns by adjusting the weights of the connections between the nodes. To stay with visual classification, it gets a picture and guesses which class it belongs to, for example "cat" or "fire truck". If it guesses wrong, the weights are adjusted. This is repeated with many pictures until the system identifies them reliably.
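The guess-and-adjust loop can be sketched with a single node. This uses the classic perceptron update rule, which is a simplification chosen for illustration: real multi-layer networks are usually trained with backpropagation instead. The toy data (a logical OR standing in for two picture classes), the learning rate and the function names are all assumptions.

```python
def predict(weights, bias, features):
    """Guess class 1 if the weighted sum plus bias is positive, else class 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train(samples, epochs=20, learning_rate=0.1):
    """Perceptron rule: nudge the weights only when the guess is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)  # -1, 0 or 1
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no wrong guesses: stop
            break
    return weights, bias

# Toy stand-in for labeled pictures: two features per sample, label 0 or 1.
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(samples)
guesses = [predict(weights, bias, features) for features, _ in samples]
```

After training, `guesses` matches the labels in `samples`. A single node can only learn simple patterns like this one, which is one reason real systems stack many nodes in layers.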

To make all this work, the programmer has to design the network correctly. This is more an art than a science, and in many cases, copying someone else's design and tweaking it is the best bet.

In practice, neural network calculations boil down to lots and lots of matrix math operations, as well as the threshold operation the neurons use to decide whether to fire. It's fairly easy to imagine all this as a bunch of interconnected nodes sending each other signals, but fairly painful to implement in code.
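Here is a rough sketch of why matrices show up. If each row of a matrix holds the incoming weights of one node, then a whole layer is one matrix-times-vector product followed by the threshold. The code is plain Python to stay self-contained; the layer sizes, weights and threshold are invented for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one weighted sum per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def step(values, threshold=1.0):
    """Threshold operation: each node fires (1) or stays quiet (0)."""
    return [1 if v >= threshold else 0 for v in values]

def forward(layers, inputs):
    """Push the signal through every layer in order, input to output."""
    signal = inputs
    for weights in layers:
        signal = step(matvec(weights, signal))
    return signal

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 2 output nodes.
hidden = [[0.5, 0.5, 0.0],
          [0.0, 1.0, 1.0]]
output = [[1.0, 0.0],
          [0.4, 0.4]]

result = forward([hidden, output], [1, 1, 0])
```

With these numbers both hidden nodes fire, and then only the first output node fires, so `result` is `[1, 0]`.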

The reason it is so hard is that there can be many layers that are hard to tell apart, making it easy to get confused about which is doing what. The programmer also has to keep in mind how to orient the matrices the right way to make the math work, and other technical details. 
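One way to catch orientation mistakes early is to check that the layers fit together before running any data through them. The sketch below assumes the convention that each row of a weight matrix belongs to one node and holds one weight per input. The helper `check_shapes` is hypothetical, not part of any library.

```python
def check_shapes(layers, input_size):
    """Verify that each layer accepts what the previous one produces.

    Returns the size of the final output, or raises ValueError
    naming the first layer that does not fit.
    """
    size = input_size
    for index, weights in enumerate(layers):
        for row in weights:
            if len(row) != size:
                raise ValueError(
                    f"layer {index}: node has {len(row)} weights, expected {size}"
                )
        size = len(weights)  # one output per node (row)
    return size

# 3 inputs -> 2 nodes -> 1 node, oriented correctly.
good = [[[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6]],
        [[0.7, 0.8]]]
output_size = check_shapes(good, input_size=3)

# The same first layer stored the wrong way round (3 rows of 2 weights).
transposed = [[[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]], [[0.7, 0.8]]]
try:
    check_shapes(transposed, input_size=3)
    problem = None
except ValueError as err:
    problem = str(err)
```

The error message points at layer 0, which is a lot easier to debug than a wrong answer at the end of the network.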

It is possible to do all this from scratch in a programming language like Python, and that is recommended for beginner systems. But fortunately there is a better way to do advanced systems: In recent years a number of libraries such as TensorFlow have become available that greatly simplify the task. These libraries take a bit of fiddling to understand at first, and learning how to deal with them is key to learning how to create neural networks. But they are a huge improvement over hand-coded systems. Not only do they greatly reduce programming effort, they also provide better performance.
