Thursday, January 04, 2018

Negative energy prices and artificial intelligence

Since renewable energy has started to become popular, an odd problem has appeared in wholesale energy markets: negative prices.

In other words, power plants sometimes pay their customers to take energy off their hands. The plants affected are usually older, less flexible ones that can't shut down without incurring costs.
One solution to this problem is batteries. The idea is to store the energy when it is overabundant, and use it later when it is expensive. This is sometimes called "peak shaving".

Batteries are a great idea, but not the only solution. Another is to simply find an application that is energy hungry and can be run intermittently.

One possible application for soaking up excess energy is desalination. For example, a desert region near an ocean could build solar plants and desalinate water only during the day. The question is whether the energy savings justify building a desalination plant that runs only 12 hours a day.
Another way to make use of energy that might go to waste is using it to power computers that perform analytics. The energy demand of data centers is growing quickly.

One energy-hungry application is Bitcoin. Bitcoin mining consumes huge amounts of energy, making it a natural fit for negative energy prices. In fact there are already a lot of bitcoin miners in Western China, where solar and wind installations have outstripped grid upgrades. In these areas renewable energy is often curtailed because the grid can't keep up, so the energy is basically free to the miners.

Extremely cheap bitcoin mining arguably undermines the whole concept, but here is a more productive idea: training artificial intelligence. For example, have a look at gcp leela, a clone of Google DeepMind's AlphaGo Zero.
The entire source code is free, and it's not a lot of code. But that free code is just the learning model, and it's based on well-known principles. Once trained, it would probably be just as good as AlphaGo Zero -- but the developers figure training it would take them 1700 years, unless of course they could harness other resources.

This is partly because they don't have access to Google's specialized TPU hardware. Whatever the reason, training it is going to burn through a lot of energy.
This would be a great application for negatively priced energy. Game playing is more a stunt than a commercial application, but when someone is paying you to use the energy, why not? And as time passes, more useful AI applications will need training.
So it comes down to whether peak shaving with batteries makes more economic sense than banks of custom chips training neural networks for AI in batches. The advantage of batteries is that the energy can be sold later at a higher price, but storage is not terribly efficient, and using the energy directly is a better idea. Cheap computer hardware and growing demand for AI may fit this niche very well.
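To make that trade-off concrete, here is a back-of-the-envelope sketch in Python. All the figures are made up for illustration: a plant pays $10/MWh to offload 100 MWh, the battery has an assumed 80% round-trip efficiency, and the compute workload is assigned a hypothetical value per MWh consumed.

```python
# Back-of-the-envelope comparison, with purely hypothetical numbers:
# store negatively priced energy to sell later, or use it directly.

def battery_profit(mwh, buy_price, sell_price, round_trip_eff):
    """Profit from buying energy, storing it, and selling it later."""
    cost = mwh * buy_price  # a negative buy price means being paid to take it
    revenue = mwh * round_trip_eff * sell_price  # losses in storage
    return revenue - cost

def direct_use_profit(mwh, buy_price, value_per_mwh):
    """Profit from consuming the energy immediately, e.g. for compute."""
    cost = mwh * buy_price
    return mwh * value_per_mwh - cost

# Hypothetical scenario: paid $10/MWh to take 100 MWh off the grid.
battery = battery_profit(100, buy_price=-10, sell_price=40, round_trip_eff=0.8)
direct = direct_use_profit(100, buy_price=-10, value_per_mwh=35)
print(battery, direct)
```

With these particular numbers, direct use comes out ahead even though the assumed resale price is higher than the compute value, because none of the energy is lost in storage.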
This puts a whole new twist on the idea that big tech companies are investing in renewables. These companies make extensive use of AI, which is trained in batch processes.

Wednesday, January 03, 2018

Understanding Artificial Neural Networks

Artificial neural networks are computer programs that learn a subject matter of their own accord. So an artificial neural network is a method of machine learning. Most software is created by programmers painstakingly detailing exactly how the program is expected to behave. But in machine learning systems, the programmers create a learning algorithm and feed it sample data, allowing the software to learn to solve a specific problem by itself.

Artificial neural networks were inspired by animal brains. They are a network of interconnected nodes that represent neurons, and the thinking is spread throughout the network. 

But information doesn't fly around in all directions in the network. Instead it flows in one direction through multiple layers of nodes from an input layer to an output layer. Each layer gets inputs from the previous layer and then sends calculation results to the next layer. In an image classification system, the initial input would be the pixels of the image, and the final output would be the list of classes.

The processing in each layer is simple: each node gets numbers from multiple nodes in the previous layer and adds them up. If the sum is big enough, it sends a signal to the nodes in the next layer; otherwise it does nothing. But there is a trick: the connections between the nodes are weighted. So if node A sends a 1 to nodes B and C, it might arrive at B as 0.5 and at C as 3, depending on the weights of the connections.
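This sum-and-threshold behavior fits in a few lines of Python. This is a bare sketch of the idea, not how production networks are written (they use smooth activation functions rather than a hard threshold):

```python
def node_output(inputs, weights, threshold=1.0):
    # Each incoming signal is scaled by the weight of its connection.
    total = sum(x * w for x, w in zip(inputs, weights))
    # Fire (send a 1) only if the weighted sum is big enough.
    return 1 if total >= threshold else 0

# Two signals of 1, arriving over connections weighted 0.5 and 3,
# as in the A -> B and A -> C example above.
print(node_output([1, 1], [0.5, 3]))    # sum is 3.5, so the node fires: 1
print(node_output([1, 1], [0.5, 0.2]))  # sum is 0.7, so it stays silent: 0
```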

The system learns by adjusting the weights of the connections between the nodes. To stay with visual classification, it gets a picture and guesses which class it belongs to, for example "cat" or "fire truck". If it guesses wrong, the weights are adjusted. This is repeated until the system can identify the pictures reliably.
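The guess-and-adjust loop can be sketched with the classic perceptron rule on a toy problem. This is a deliberately simplified stand-in for the backpropagation that real multi-layer networks use, but the core idea is the same: a wrong guess nudges the weights.

```python
# Perceptron-style learning: wrong guesses shift the weights.

def predict(inputs, weights):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= 0 else 0

def train(samples, weights, epochs=20):
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(inputs, weights)  # 0 when correct
            # Shift each weight toward what would have given the right answer.
            weights = [w + error * x for w, x in zip(weights, inputs)]
    return weights

# Toy "classification" task: learn the logical AND of two inputs.
# The leading constant 1 acts as a bias input.
samples = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
weights = train(samples, [0, 0, 0])
print([predict(inputs, weights) for inputs, _ in samples])  # [0, 0, 0, 1]
```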

To make all this work, the programmer has to design the network correctly. This is more an art than a science, and in many cases, copying someone else's design and tweaking it is the best bet.

In practice, neural network calculations boil down to lots and lots of matrix math operations, as well as the threshold operation the neurons use to decide whether to fire. It's fairly easy to imagine all this as a bunch of interconnected nodes sending each other signals, but fairly painful to implement in code.

The reason it is so hard is that there can be many similar-looking layers, making it easy to get confused about which one is doing what. The programmer also has to orient the matrices the right way to make the math work, among other technical details.
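To show what "matrix math" means here, the following sketch (assuming NumPy) computes one layer of the node-and-threshold network described above as a single matrix-vector product. The weights and inputs are arbitrary example values.

```python
# One layer as matrix math: the weighted sums for every node in a layer
# are a single matrix-vector product, followed by the threshold.
import numpy as np

inputs = np.array([1.0, 0.5, 2.0])  # outputs of the previous layer

# One row of weights per node in this layer (3 inputs -> 2 nodes).
# Getting this orientation right is one of the details mentioned above.
weights = np.array([[0.5, -1.0, 0.25],
                    [2.0,  0.0, -0.5]])

sums = weights @ inputs              # all weighted sums at once
outputs = (sums >= 1.0).astype(int)  # threshold: fire or stay silent
print(sums, outputs)                 # [0.5 1.] [0 1]
```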

It is possible to do all this from scratch in a programming language like Python, and that is recommended for beginner systems. But fortunately there is a better way to build advanced systems: in recent years a number of libraries such as TensorFlow have become available that greatly simplify the task. These libraries take a bit of fiddling to understand at first, and learning how to work with them is key to learning how to create neural networks. But they are a huge improvement over hand-coded systems: not only do they greatly reduce programming effort, they also provide better performance.