"Artificial neural networks" What are they?
Since the invention of the computer, people have been talking about the things that computers will never be able to do. Whether it was beating a grandmaster at chess or winning on Jeopardy!, these predictions have repeatedly been proven wrong. However, some such nay-saying always had a better grounding in computer science. There were goals that, if you knew how computers worked, you knew would be virtually impossible to achieve. Recognizing human emotions through facial expressions. Reading a wide variety of cursive handwriting. Correctly identifying the words in spoken language. Driving autonomously through busy streets.
Well, computers are now starting to be able to do all of those things, and quite a bit more. Were the nay-sayers really just too cynical about the true capabilities of digital computers? In a way, no. To solve those monumental challenges, scientists were forced to come up with a whole new type of computer, one based on the structure of the brain. These artificial neural networks (ANNs) only ever exist as a simulation running on a regular digital computer, but what goes on inside that simulation is fundamentally very different from classical computing.
Is an artificial neural network an exercise in computing science? Applied biology? Pure mathematics? Experimental philosophy? It’s all of those things, and much more.
What are ANNs?
Most people already know that the neurons that do the computation in our brain are not organized like the transistors in a computer processor, in a linear sequence, attached to the same board, and controlled by one unifying clock cycle. Rather, in the brain each neuron is nominally its own self-contained actor, and it’s wired to many of the neurons around it in highly complex and somewhat unpredictable ways.
What this means is that for a digital computer to achieve an ordered result, it needs one over-arching program to direct it and tell each transistor just what to do to contribute toward the overall goal. A brain, on the other hand, unifies billions of tiny, exceedingly simple units that can each have their own programming and make decisions without the need for an outside authority. Each neuron works and interacts with the neurons around it according to its own simple, pre-defined rules.
An artificial neural network is (supposed to be) the exact same thing, but simulated with software. In other words, we use a digital computer to run a simulation of a bunch of heavily interconnected little mini-programs which stand in for the neurons of our simulated neural network. Data enters the ANN and has some operation performed on it by the first “neuron,” that operation being determined by how the neuron happens to be programmed to react to data with those specific attributes. It’s then passed on to the next neuron, which is chosen in a similar way, so that another operation can be chosen and performed. There are a finite number of “layers” of these computational neurons, and after moving through them all, an output is produced.
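To make that layer-by-layer flow concrete, here is a minimal Python sketch of data moving through a tiny simulated network. Treat it as an illustration under assumptions rather than a description of any real system: the rule each neuron follows (a weighted sum squashed by a sigmoid) is just one common choice, and the names neuron, layer, forward and NETWORK, along with the hand-picked weights, are hypothetical. In this particular design every neuron in a layer looks at all the outputs of the layer before it.

```python
import math

def neuron(inputs, weights, bias):
    """One simulated neuron: a weighted sum of its inputs, squashed to 0..1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all look at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Pass the data through every layer in order and return the final output."""
    for weight_rows, biases in layers:
        inputs = layer(inputs, weight_rows, biases)
    return inputs

# Hypothetical, hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output.
NETWORK = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]

output = forward([1.0, 0.0], NETWORK)
print(output)  # a list holding one number between 0 and 1
```

Nothing in forward knows what the network is for. Change the weights or the input and the same few lines produce a different result, which is the sense in which the output emerges from the individual neurons.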
The overall process of turning input into output is an emergent result of the programming of each individual neuron the data touches, and the starting conditions of the data itself. In the brain, the “starting conditions” are the specific neural signals arriving from the spine, or elsewhere in the brain. In the case of an ANN, they’re whatever we’d like them to be, from the results of a search algorithm to randomly generated numbers to words typed out manually by researchers.
So, to sum up: artificial neural networks are, loosely speaking, simulated brains. But it’s important to note that we can give our software “neurons” any programming we want; we can try to set up their rules so their behavior mirrors that of a human brain, but we can also use them to solve problems we could never consider before.
How do ANNs work?
What we’ve described so far is very interesting, but largely useless for computation. That is to say, it’s very scientifically interesting to be able to simulate the cellular structure of the brain, but if I know how to go in and program every little sub-actor such that my inputs are always processed into my desired outputs, then why do I need an ANN at all? Put differently, the nature of an ANN means that intentionally building one to solve a particular problem requires such a deep working knowledge of that problem and its solutions that the ANN itself becomes a bit redundant.
However, there’s a big advantage to working with many simple actors rather than a single complex one: simple actors can self-correct. There have been attempts at self-editing versions of regular software, but it’s artificial neural networks that have taken the concept of machine learning to new heights.
You’ll hear the word “non-deterministic” used to describe the function of a neural network, and that’s in reference to the fact that our software neurons often have weighted statistical likelihoods associated with different outcomes for data; there’s a 40% chance that an input of type A gets passed to this neuron in the next layer, a 60% chance it gets passed to that one instead. These uncertainties quickly add up as neural networks get larger or more elaborately interconnected, so that the exact same starting conditions might lead to many different outcomes or, more importantly, get to the same outcome by many different paths.
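The 40%/60% hand-off can be sketched in a few lines of Python. This is a toy model only: the ROUTES table, the neuron names and the run_once function are all made up for illustration, and the probabilities beyond the 40/60 split are assumptions. It uses random.choices from the standard library to pick the next neuron according to its weighted chance.

```python
import random

# Hypothetical routing table: for each neuron, the chance of handing the data
# to each possible next neuron. The 40/60 split mirrors the example in the text.
ROUTES = {
    "input": [("hidden_1", 0.4), ("hidden_2", 0.6)],
    "hidden_1": [("output", 1.0)],
    "hidden_2": [("hidden_1", 0.5), ("output", 0.5)],
}

def run_once(rng, start="input", goal="output"):
    """Follow weighted random hand-offs from start to goal; return the path taken."""
    path = [start]
    while path[-1] != goal:
        targets, chances = zip(*ROUTES[path[-1]])
        path.append(rng.choices(targets, weights=chances, k=1)[0])
    return path

rng = random.Random(42)  # fixed seed so the experiment can be repeated
paths = [tuple(run_once(rng)) for _ in range(200)]
for path in sorted(set(paths)):
    print(" -> ".join(path), paths.count(path))
```

Every run starts from the same input and ends at the same output, yet the route in between varies from run to run. That is the “same outcome by many different paths” behavior, in miniature.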
So, we introduce the idea of a “learning algorithm.” A simple example is improving efficiency: send the same input into the network over and over and over, and every time it generates the correct output, record the time it took to do so. Some paths from A to B will be naturally more efficient than others, and the learning algorithm can start to reinforce neuronal behaviors that occurred during those runs that proceeded more quickly.
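Here is one way that efficiency-based reinforcement might look in Python, reduced to the bare minimum. It is a sketch under heavy assumptions, not a real training method: the whole “network” is collapsed into two hypothetical routes, PATH_COST invents how long each one takes, and the reward rule (add more preference after a faster run) is just one simple option.

```python
import random

# Hypothetical setup: two routes from input to output, one three times slower.
PATH_COST = {"short": 1.0, "long": 3.0}  # made-up time taken by each route

def train(runs=500, seed=0, rate=0.1):
    """Repeat the same task and reinforce whichever route was used, in
    proportion to how quickly that run finished."""
    rng = random.Random(seed)
    preference = {"short": 1.0, "long": 1.0}  # start with no bias either way
    names = list(preference)
    for _ in range(runs):
        weights = [preference[name] for name in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        # Faster runs (lower cost) earn a larger boost.
        preference[chosen] += rate / PATH_COST[chosen]
    return preference

preference = train()
print(preference)  # the faster route ends up strongly preferred
```

No one tells the program which route is better. It only measures how long each run took, and the preference for the faster route grows on its own.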
Much more complex ANNs can strive for more complex goals, like correctly identifying the species of animal in a Google image result. The steps in image processing and categorization get adjusted slightly, relying on an evolution-like sifting of random and non-random variation to produce a cat-finding process the ANN’s programmers could never have directly devised.
Non-deterministic ANNs become much more deterministic as they restructure themselves to be better at achieving certain results, as determined by the goals of their learning algorithms. This is called “training” the ANN: you train an ANN with examples of its desired function, so it can self-correct based on how well it did on each of these runs. The more you train an ANN, the better it should become at achieving its goals.
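Training by example can be shown with the smallest possible network: a single simulated neuron. The sketch below uses the classic perceptron learning rule, which is a stand-in chosen for brevity rather than the method behind any system mentioned here. The task (logical AND), the learning rate and the number of passes are all assumptions made for illustration.

```python
# Training examples: each input pair comes with its desired output (logical AND).
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

def predict(weights, bias, inputs):
    """The neuron fires (1) when its weighted sum is positive, else stays quiet (0)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(examples, epochs=20, rate=0.1):
    """Perceptron rule: after each example, nudge the weights toward the right answer."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

weights, bias = train(EXAMPLES)
for inputs, target in EXAMPLES:
    print(inputs, "wanted", target, "got", predict(weights, bias, inputs))
```

The neuron starts out knowing nothing (all weights zero) and gets several examples wrong. Each mistake shifts the weights slightly, and after a handful of passes it answers all four cases correctly without anyone having typed in the right weights.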
There’s also the idea of “unsupervised” or “adaptive” learning, in which you run the algorithm with no desired outputs in mind, but let it start evaluating results and adjusting itself according to its own… whims? As you might imagine, this isn’t well understood just yet, but it’s also the most likely path down which we might find true AI — or just really, really advanced AI. If we’re ever truly going to send robots out into totally unknown environments to figure out totally unforeseen problems, we’re going to need programs that can assign significance to stimuli on their own, in real time.
That’s where the power of ANNs truly lies: since their structure allows them to make iterative changes to their own programming, they have the ability to find answers that their own creators never could have. Whether you’re a hedge fund, an advertising company, or an oil prospector, the sheer potential of combining the speed of a computer with the versatility of a brain is impossible to ignore. That’s why being able to program “machine learning” algorithms is now one of the most sought-after skill sets in the world.
In the coming century we may very well be less concerned with solving problems than with teaching computers to learn to solve problems for us.
OK, but what can ANNs actually do?
The usefulness of ANNs falls into one of two basic categories: as tools for solving problems that are inherently difficult for both people and digital computers, and as experimental and conceptual models of something (classically, brains). Let’s talk about each one separately.
First, the real reason for interest (and, more importantly, investment) in ANNs is that they can solve problems. Google uses an ANN to learn how to better target “watch next” suggestions after YouTube videos. The scientists at the Large Hadron Collider turned to ANNs to sift the results of their collisions and pull the signature of just one particle out of the larger storm. Shipping companies use them to minimize route lengths over a complex scattering of destinations. Credit card companies use them to identify fraudulent transactions. They’re even becoming accessible to smaller teams and individuals: Amazon, MetaMind, and more are offering tailored machine learning services to anyone for a surprisingly modest fee.
Things are just getting started. Google’s been training its photo-analysis algorithms with more and more pictures of animals, and they’re getting pretty good at telling dogs from cats in regular photographs. Both translation and voice synthesis are progressing to the point that we could soon have a babelfish-like device offering natural, real-time conversations between people speaking different languages. And, of course, there are the Big Three ostentatious examples that really wear their machine learning on their sleeves: Siri, Google Now, and Cortana.
The other side of a neural network lies in carefully designing it to mirror the structure of brains. Both our understanding of that structure, and the computational power necessary to simulate it, are nowhere close to what we’d need to do robust brain-science in a computer model. There have been some amazing efforts at simulating certain aspects of certain portions of the brain, but it’s still in the very preliminary stages.
One advantage of this approach is that while you can’t (or… shouldn’t) genetically engineer humans to have an experimental change built into their brains, you absolutely can perform such mad-scientist experiments on simulated brains. ANNs can explore a far wider array of possibilities than medicine could ever practically or ethically consider, and they could someday allow scientists to quickly check on more out-there, “I wonder” hypotheses with potentially unexpected results.
When you ask yourself, “Can an artificial neural network do it?” immediately afterward ask yourself, “Can I do it?” If the answer is yes, then your brain must be capable of doing something that an ANN might one day be able to simulate. On the other hand, there are plenty of things an ANN might one day be able to do that a brain never could.