AI Blog Post 5 – Machine Learning Methods, Part II

Hello, and welcome back to the series about the ongoing development of my artificial intelligence dubbed “Madison.” I am, as always, joined by Clark Hubbard, fellow AI enthusiast and a prolific writer.

 

Clark

I know it’s sarcastic, but I wish to clarify that I am in no way an AI enthusiast and furthermo-

Ethan

ALL RIGHT, so last post, we took a look at several different methods used in the field of machine learning. These mostly included variants on the typical neural network structure — simulated neurons and synapses — with the exception of the Self-Organizing Map.

Clark

If you don’t know what the heck he’s saying, just go read the last post. I help explain some things.

Ethan

Right. Well today, we’re going to go over several other machine learning methods, including genetic algorithms, convolutional neural networks, and a few others. Let’s get started.

Clark

You know this is gonna be good when “Convoluted” is basically a part of one of the methods.

Ethan

The first machine learning method I’d like to discuss is the genetic algorithm. Regardless of whether you believe he was right, Charles Darwin was undeniably intelligent, and applying his principles of evolution to computer science has proven remarkably successful. A genetic algorithm is the most common, and arguably simplest, way of doing this. First, you set a goal for it to achieve: perhaps to maximize the efficiency of something, or to minimize an error.

Clark

Normal person talk: You give the computer a job: either to do something very quickly, or to do something very well.

Ethan

Right, and then the AI generates a starting population: a certain number of artificial constructs, each with its own genes. Those with the most desirable genes (i.e., closest to the goal) are bred to produce “offspring” (more artificial constructs) that combine the traits of their parents. Very occasionally, mutation occurs, which alters a construct’s genes in an unpredictable (though usually minute) way. Often, these mutations are worthless in the scheme of things, but every so often, one brings even more desirable genes into the pool, so to speak. Eventually, the genetic algorithm reaches a population in which one individual has all of the genes needed to achieve the predefined goal.
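
If you’re curious what that loop looks like in actual code, here’s a bite-sized Python sketch. It’s purely illustrative (definitely not Madison’s code), and the target, population size, and mutation rate are all made up. It evolves a population of bit strings until one matches a predefined target:

import random

# The "predefined goal": evolve a bit string of all 1s. The numbers here
# (population size, mutation rate) are invented for illustration.
TARGET = [1] * 20
POP_SIZE, MUTATION_RATE = 50, 0.01

def fitness(genes):
    # How close a construct's genes are to the goal.
    return sum(g == t for g, t in zip(genes, TARGET))

def breed(mom, dad):
    # Offspring combine the traits of the parents (uniform crossover);
    # very occasionally, mutation flips an individual gene.
    child = [random.choice(pair) for pair in zip(mom, dad)]
    return [1 - g if random.random() < MUTATION_RATE else g for g in child]

# Generate a starting population of random constructs.
population = [[random.randint(0, 1) for _ in TARGET] for _ in range(POP_SIZE)]
generation = 0
while max(fitness(p) for p in population) < len(TARGET):
    # Those with the most desirable genes get to breed.
    parents = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    population = [breed(random.choice(parents), random.choice(parents))
                  for _ in range(POP_SIZE)]
    generation += 1

print(f"goal reached in generation {generation}")

The goal here is trivial, but swap in a different fitness function and the exact same loop will happily optimize antenna shapes or anything else you can score.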

Clark

So the AI creates a group of smaller “computers” that try to do the job very quickly or very well. Eventually, through a sort of technological breeding program, the AI weeds out the weakest “computers,” in a sort of “survival of the fittest” way.

Ethan

Yeah, to review: it mimics evolutionary principles within a computer without the need for billions of years (in fact, it can often evolve an optimal solution in milliseconds). One of the coolest applications of this algorithm I’ve seen is the evolution of an antenna shape used aboard several NASA satellites. The antenna, launched on one of the microsatellites of the Space Technology 5 mission, may look very strange, but it produces a radiation pattern optimized for its particular mission.

Briefly, I’d like to talk about Cartesian Genetic Programming. It applies the same evolutionary ideas as a genetic algorithm, but instead of evolving a flat string of genes, it evolves a program represented as a graph of nodes; one notable application is evolving the structure and weights of a multi-layer neural network (the first type of machine learning algorithm we went over in the last post). By doing this, it can create, on its own, a neural network capable of solving a given problem. While I’m not planning to use this algorithm in Madison’s design, I think it’s cool enough to be worthy of mention.
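
For the curious, here’s a toy sketch of the core idea (my own invention, not taken from the paper linked below): the genome is a list of graph nodes, and a simple mutate-and-select loop evolves the graph until it computes a target function. Real CGP for neural networks would use neuron-like nodes with weights rather than these bare arithmetic ones.

import random

# Each node is (function index, input source 1, input source 2); a
# (1 + 4) evolution loop mutates the graph toward the target x*x + x.
FUNCS = [lambda a, b: a + b,
         lambda a, b: a - b,
         lambda a, b: a * b]
N_NODES, N_INPUTS = 10, 1

def random_node(i):
    # A node may read from the program input (slot 0) or any earlier node.
    max_src = N_INPUTS + i
    return (random.randrange(len(FUNCS)),
            random.randrange(max_src),
            random.randrange(max_src))

def evaluate(genome, x):
    values = [x]                      # slot 0 holds the input
    for f_idx, a, b in genome:
        values.append(FUNCS[f_idx](values[a], values[b]))
    return values[-1]                 # the last node is the output

def fitness(genome):
    # Sum of squared errors against the target over a few sample points.
    return sum((evaluate(genome, x) - (x * x + x)) ** 2 for x in range(-5, 6))

def mutate(genome):
    child = list(genome)
    i = random.randrange(N_NODES)
    child[i] = random_node(i)         # rewire one node at random
    return child

parent = [random_node(i) for i in range(N_NODES)]
for _ in range(5000):
    # Offspring are preferred over the parent on fitness ties, a common
    # CGP convention that lets the search drift across neutral mutations.
    parent = min([mutate(parent) for _ in range(4)] + [parent], key=fitness)
    if fitness(parent) == 0:
        break

print("remaining error:", fitness(parent))   # 0 means the graph computes x*x + x exactly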

Clark

Okay, so a more complicated version of the first method, evolving neural networks instead of genes?

Ethan

Precisely, Clark. The second method of machine learning I’ll talk about here is the convolutional neural network. It is somewhat like other neural networks, although it typically takes in images and learns to classify individual objects in them. For instance, you can give it an image of a cat, and, once trained thoroughly on what cats and dogs look like, it will be able to tell you, “that is a cat, not a dog.” It may not seem that impressive, but the fact that computers can now recognize objects in images in a way similar to us is pretty awesome.

Clark

Interestingly, when Ethan was young, he thought cats and dogs were the different genders of the same animal.

Ethan

Doesn’t every toddler think that? Anyways, a convolutional neural network works like this: first, it takes in an image. It then convolves the image (the word from which its moniker is derived), meaning it slides a filter over it to see edges, curves, and other features more clearly. This is, in practice, what that looks like:

[Animation: a small filter sliding across an image, producing a convolved feature map]

A nonlinearity is then applied, similar to the hidden layer in neural networks, which uses some mathematical wizardry to alter the convolved image. Then, the feature maps are run through a “max-pooling” layer, which keeps only the most interesting parts of each map and saves those to a new, much smaller map. Essentially, it’s reducing the size of the image (which is important; you’ll see why shortly) without losing any of the good stuff, the important features by which it recognizes things.

Finally, each pixel value of each feature map is fed into a fully-connected neural network, which is just a fancy name for the multi-layer neural network architecture I described in the last post. This network recognizes the individual features of the image and uses them to classify what the image depicts. The primary reason reducing the size of the image matters is that a fully-connected network with thousands, perhaps hundreds of thousands, of inputs would have far too many connections to train in any reasonable amount of time. Another reason is that, given too much input, the fully-connected network could start seeing patterns that aren’t there (something called “overfitting”). I could go into further detail about these networks, but there are people who are much more qualified and knowledgeable about them than I am. I’ll leave some good explanations in the links section, as always.
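
If you want to see those three steps without any library magic, here’s a bare-bones NumPy sketch. It’s illustrative only: the kernel and sizes are invented, and real frameworks do all of this far more efficiently.

import numpy as np

# One conv -> nonlinearity -> max-pool stage. (Strictly speaking this
# computes cross-correlation, which is what most CNN libraries call
# "convolution" anyway.)

def convolve2d(image, kernel):
    # Slide the kernel over the image with no padding ("valid" mode).
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def relu(feature_map):
    # The nonlinearity: negative responses are clipped to zero.
    return np.maximum(feature_map, 0)

def max_pool(feature_map, size=2):
    # Keep only the strongest response in each size-by-size window,
    # shrinking the map without losing the interesting features.
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = feature_map[y * size:(y + 1) * size,
                                    x * size:(x + 1) * size].max()
    return out

# An 8x8 "image" run through one stage with a vertical-edge-detecting kernel.
image = np.random.rand(8, 8)
edge_kernel = np.array([[1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0]])
features = max_pool(relu(convolve2d(image, edge_kernel)))
print(features.shape)   # (3, 3): far fewer values to feed the dense network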

Clark

I bet Ethan doesn’t even know what he’s saying anymore.

Ethan

Trust me, Clark, I know what I’m talking about. Before I sign off for this post, I’ll talk about one more type of algorithm: the Support Vector Machine. While I have yet to utilize this particular method of machine learning in Madison’s design, it is still incredibly useful for a wide range of applications. An SVM works by finding the best possible hyperplane (basically a line in two dimensions, a plane in three, and the equivalent in higher dimensions) to separate two different classes of data. Let me show you what that looks like:

[Figure: two classes of data points with three candidate separating lines, labeled H1, H2, and H3]

Clark: Pictured: Nonsense

So, as you can see, the line H1 doesn’t separate the data at all. H2 does, but only barely. H3 is clearly the best separator: it divides the two classes with the widest possible margin on either side. The SVM’s goal is to find this particular hyperplane for any given set of data, so that when you give it new data it’s never seen before, it can correctly classify it based on the examples it was trained on. Make sense? Good.
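
Here’s a quick sketch of that in code, using scikit-learn (my choice of library for illustration; the points are made up): fit a linear SVM on a handful of 2-D examples and read the hyperplane back out.

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters of 2-D points (invented for illustration).
X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)

# In two dimensions the hyperplane is just the line w0*x + w1*y + b = 0.
w, b = clf.coef_[0], clf.intercept_[0]
print(f"hyperplane: {w[0]:.2f}*x + {w[1]:.2f}*y + {b:.2f} = 0")

# New points it has never seen are classified by which side they fall on.
print(clf.predict([[2.0, 2.0], [7.0, 6.0]]))   # -> [0 1]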

Sometimes, however, no straight hyperplane can separate the data at all, especially if the examples are all mixed up among each other. But this is the cool part: the algorithm can utilize something called the “kernel trick,” which maps all of the examples into a higher-dimensional space where a separating hyperplane can be found. That sounds kind of weird when I say it like that, so let me explain it visually:

[Figure: inseparable data on the left is mapped by a kernel function, marked with the Greek letter phi, into a higher-dimensional feature space on the right, where the classes become separable]

On the left, you have all of the mixed-up examples that the SVM just can’t separate. That Greek symbol in the middle (phi) represents the kernel mapping, and on the right, you have the new, higher-dimensional “feature space” where it’s much easier for the SVM to separate the data into different groups. Once it has found a separating plane there, you can give it new data to classify, and it will do so quite well.
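
And here’s the kernel trick in action, again sketched with scikit-learn: make_circles produces exactly the kind of mixed-up data from the left picture (one class in a ring around the other). A plain linear SVM flounders on it, while the RBF kernel separates it almost perfectly.

from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# One class forms a ring around the other: no straight line separates them.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear SVM does barely better than guessing here...
linear = SVC(kernel="linear").fit(X_train, y_train)
print("linear kernel:", linear.score(X_test, y_test))

# ...while the RBF kernel maps the points into a space where a clean
# separating hyperplane exists, and accuracy jumps to nearly perfect.
rbf = SVC(kernel="rbf").fit(X_train, y_train)
print("RBF kernel:", rbf.score(X_test, y_test))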

Clark

Essentially, this method is looking at the best way to split something in half. He just uses big words to appear smarter.

Ethan

While your simplified explanation is fairly accurate, that’s a ridiculously nonsensical accusation, Clark. Regardless of my writing habits, I think I’ve covered the majority of the machine learning methods I’m using in Madison’s design (and even a few I’m not), but who knows? As time goes on, maybe I’ll add some more crazy algorithms. Strong artificial intelligence is difficult to achieve, after all, and any program that achieves it will need a great deal of complexity.

Until next time,

Ethan and Clark

P.S. Check out Clark’s blog at classicclark.com, and his Twitter.

LINKS:

https://arxiv.org/ftp/arxiv/papers/1308/1308.4675.pdf – a very in-depth research paper on genetic algorithms.

http://www.theprojectspot.com/tutorial-post/creating-a-genetic-algorithm-for-beginners/3 – a great tutorial on creating genetic algorithms for those among you who are programmers.

http://thelackthereof.org/docs/library/cs/gp/Miller,%20Julian:%20Cartesian%20Genetic%20Programming.pdf – a good paper on Cartesian genetic programming.

http://cs231n.github.io/ – a Stanford course on convolutional networks. It explains them very well.

https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/ – an explanation of convolutional networks by Ujjwal Karn.

http://cs229.stanford.edu/notes/cs229-notes3.pdf – some more Stanford notes, this time on support vector machines.
