AI Project, Part 2: Artificial Neural Networks

A little late, but finally here. This is Part 2 in an ongoing series regarding the development of an AI by Ethan Block, and the efforts of Clark Hubbard to prevent the AI from destroying humanity. Part 1 is located here.


Computers truly are amazing things. Give them instructions, and they will carry them out for you like the digital servants they are. However, they are simply mindless machines, transmitting tiny pulses of electricity through their circuit boards as they solve a math problem or download a webpage. They lack one very important aspect of intelligence: the ability to take past experiences and learn from them, then react to new experiences with this knowledge.

This ability comes naturally to almost any animal on Earth. Ivan Pavlov, a Russian physiologist, demonstrated the idea with dogs. He rang a bell just before feeding them, and eventually ringing the bell alone, with no food at all, caused the dogs to salivate in anticipation of their meal. The dogs had been conditioned, or trained, to expect food whenever the bell was rung. This sort of thing happens all the time in all remotely intelligent beings. But how do we emulate it in computer systems?


I bet you’re about to tell me how. Also, ten bucks says Ethan is going to try a Pavlovian experiment with a robot, and instead of salivating, it will kill him.


Alright, Clark, calm down. Anyway, in 1943, Warren McCulloch and Walter Pitts, two very smart people, proposed a mathematical model of the neuron.


Are you telling me that smart people proposed the model of a neuron? I never would have guessed.


I’m gonna ignore that. Over a decade later, in 1957, Frank Rosenblatt, a researcher at the Cornell Aeronautical Laboratory, invented something called a perceptron. It could be trained to recognize patterns in its inputs, and can be likened to a neuron, specifically the McCulloch and Pitts model. Some time later, the multilayer perceptron was developed. It consisted of several layers of these perceptrons, and could learn and recognize patterns that a single perceptron cannot, such as ones that are not linearly separable.
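To make "could be trained" a little more concrete, here is a minimal sketch of a single perceptron learning the logical AND of two inputs, using the classic perceptron learning rule. The AND task, the function names, and the learning rate are all my own choices for illustration, not anything from Rosenblatt's original work.

```python
def predict(weights, bias, inputs):
    """Fire (return 1) if the weighted sum of the inputs is above zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # -1, 0, or 1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training set: the truth table for logical AND.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES]
```

AND works here because it is linearly separable. Swap in XOR and a single perceptron never settles, which is part of why the multilayer version mattered.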

This is a simple diagram of a neural network:

As you can see, each node is connected to every node in the next layer. Each node in the input layer has four connections, one for each of the nodes in the hidden layer. Each node in the hidden layer has two connections, one for each output. Each connection takes the value from the previous node, multiplies it by something called a “weight”, and adds the result to the next node. Each connection has its own weight, which is usually randomized at the beginning. Remember these weights. They are important, and I’ll mention them again shortly.
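The "multiply by a weight and add it to the next node" step can be sketched in a few lines of Python. The four hidden nodes come from the description above, but the three input nodes, the weight range of -1 to 1, and the function names are assumptions of mine for illustration.

```python
import random


def make_weights(n_from, n_to, rng):
    """One random weight per connection: weights[i][j] links node i to node j."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_to)]
            for _ in range(n_from)]


def weighted_sums(values, weights):
    """Each next-layer node adds up (previous value * connection weight)."""
    n_to = len(weights[0])
    return [sum(values[i] * weights[i][j] for i in range(len(values)))
            for j in range(n_to)]


rng = random.Random(0)                     # fixed seed so runs are repeatable
input_to_hidden = make_weights(3, 4, rng)  # assumed 3 inputs, 4 hidden nodes
hidden_sums = weighted_sums([0.5, -1.0, 2.0], input_to_hidden)
```

At this point `hidden_sums` holds one raw number per hidden node, and nothing has been squashed or transformed yet.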


[Image: image01.jpg]

As you can see, Ethan is an idiot.


Once data has been propagated forward from the input nodes to a hidden node, it is run through something known as the activation function. Usually a sigmoid function will work, though it is possible to use other functions, such as the hyperbolic tangent. The transformed data is then sent through the connections to the output nodes, where it is (depending on the neural network’s architecture) either run through the activation function again, or not.
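For the curious, here is roughly what those two activation functions look like in Python. The sigmoid squashes any number into the range 0 to 1, while the hyperbolic tangent squashes it into -1 to 1. The `activate` helper and the sample values are just mine for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def activate(sums, function=sigmoid):
    """Apply an activation function to every node's weighted sum."""
    return [function(s) for s in sums]


raw_sums = [-2.0, 0.0, 2.0]                     # made-up weighted sums
sigmoid_outputs = activate(raw_sums)            # values between 0 and 1
tanh_outputs = activate(raw_sums, math.tanh)    # values between -1 and 1
```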


Ok, slow down, nerd. So data goes from these sort of “brains” on the left side, to the intermediate brains in the middle, which sort the data and send it to the brains on the right, which then transmit the data?


Yes, Clark, that’s right. Now for the interesting part. I won’t go too in-depth here-


I don’t believe you.


-firstly because I don’t want to make this explanation too complex, and also because I barely understand the math of it myself. The next phase is called backpropagation. Usually, when you feed a neural network input data, you also give it target data, which it trains itself to output by modifying the weights the data is multiplied by. That is exactly what backpropagation does: first, it calculates the difference between the target output and the actual output, known as the “error”. Then it works backwards through the network, using this error to adjust the weight of each connection (told you they’d be important), which slightly tweaks the output. Then it runs the input data through the network again, finds the new error, and adjusts the weights again. Over time the error shrinks, smaller and smaller, until it (mostly) reaches zero and the output matches the target data.
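Here is a rough sketch of that loop for a tiny network with two inputs, two hidden nodes, and two outputs. Several things are assumptions on my part rather than the one true way to do it: the squared-error measure, sigmoid activations on both layers, no bias terms, a learning rate of 0.5, and the made-up input and target values.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_output):
    """Propagate inputs through the hidden layer to the output layer."""
    hidden = [sigmoid(sum(x * w_hidden[i][j] for i, x in enumerate(inputs)))
              for j in range(len(w_hidden[0]))]
    outputs = [sigmoid(sum(h * w_output[j][k] for j, h in enumerate(hidden)))
               for k in range(len(w_output[0]))]
    return hidden, outputs


def train_step(inputs, targets, w_hidden, w_output, learning_rate=0.5):
    """One forward pass plus one backward pass; returns the error before the update."""
    hidden, outputs = forward(inputs, w_hidden, w_output)
    # How much each output node's weighted sum contributed to the error.
    output_deltas = [(o - t) * o * (1.0 - o) for o, t in zip(outputs, targets)]
    # Pass the blame backwards through the (not yet updated) output weights.
    hidden_deltas = [h * (1.0 - h) * sum(d * w_output[j][k]
                                         for k, d in enumerate(output_deltas))
                     for j, h in enumerate(hidden)]
    # Adjust every weight a little in the direction that reduces the error.
    for j, h in enumerate(hidden):
        for k, d in enumerate(output_deltas):
            w_output[j][k] -= learning_rate * d * h
    for i, x in enumerate(inputs):
        for j, d in enumerate(hidden_deltas):
            w_hidden[i][j] -= learning_rate * d * x
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))


rng = random.Random(1)
w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
w_output = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]

# Repeat: run the input through, find the error, adjust the weights.
errors = [train_step([0.05, 0.10], [0.01, 0.99], w_hidden, w_output)
          for _ in range(1000)]
```

Comparing `errors[0]` with `errors[-1]` shows the shrinking described above. A real network would cycle through many training pairs rather than hammering on a single one.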


So what you’re saying is that computers don’t get things right, and have to keep feeding data to themselves until they finally get the error small enough to say what they mean?


Yes, because you’re not programming the computer to do something, you’re teaching it, meaning it’s learning from its mistakes.

Once a neural network has been trained with many different sets of training data, it can be given input without target data and will come up with an output that follows the pattern it has been trained in. This can be used to classify data into different groups, predict data (given that the pattern is simple enough), and more.

A while after this original neural network architecture was created, someone had the brilliant idea of making one with multiple hidden layers, able to process problems of an even higher degree of complexity. Networks like these are known as deep neural networks, and are incredibly useful for a wide range of purposes.
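Structurally, "multiple hidden layers" just means repeating the same multiply, add, and activate step once per layer. Here is a small sketch of a forward pass through a network with two hidden layers. The layer sizes, the sigmoid activation on every layer, and the function names are my own assumptions for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_layers(sizes, rng):
    """Build one weight table per pair of neighbouring layers."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n_to)]
             for _ in range(n_from)]
            for n_from, n_to in zip(sizes, sizes[1:])]


def forward_deep(inputs, layers):
    """Feed values through every layer in turn, activating at each step."""
    values = inputs
    for weights in layers:
        values = [sigmoid(sum(v * weights[i][j] for i, v in enumerate(values)))
                  for j in range(len(weights[0]))]
    return values


rng = random.Random(0)
# Assumed shape: 3 inputs, two hidden layers of 4 nodes each, 2 outputs.
layers = random_layers([3, 4, 4, 2], rng)
outputs = forward_deep([0.5, -1.0, 2.0], layers)
```

Adding another hidden layer is just one more number in the `sizes` list.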


We have different opinions on what the words “brilliant idea” mean, apparently. So these new networks could solve more complex problems faster?


More complex problems, yes, though not necessarily faster. And in the world of computing, that would be classified as a “brilliant idea”.

Madison, my artificial intelligence, consists of many neural networks of different kinds, all feeding into one another and learning by multiple methods — supervised backpropagation as I have described here, and many types of unsupervised learning, such as the Generalized Hebbian Algorithm, Oja’s rule, and self-organizing maps. I’ll go over these at some point.


Unsupervised learning? Are you actually out of your mind?


Not to my knowledge, Clark.

There are many different types of neural networks, such as recurrent neural networks, long short-term memory networks, and convolutional neural networks, but I’ll save those for a later post. Goodbye for now.

P.S. For those of you who are mathematically inclined, here is another explanation of how simple artificial neural networks function by Steven Miller, a professional software engineer. And here is a very in-depth explanation of backpropagation, courtesy of Matt Mazur.


P.S. for those of you that are sane, follow me on Twitter @Classic_Clark and we can plot Ethan’s downfall together.

Until next time (if there is a next time),

Ethan and Clark
