Gun.io
January 9, 2025 · 29 min read

Tight Genes: Intro to Genetic Algorithms with Dave Aronson

Open Spaces is a Gun.io series dedicated to exploring the world of technology through the eyes of our community’s engineers. This week, we’ve invited Dave Aronson, a Codosaurus and tech conference phenom, to give us the technical highlights of one of his most popular presentations, Genetic Algorithms: what they are and how and when to use them.

What are genetic algorithms in the first place?

They are optimization heuristics, which is fancy-talk for shortcuts to find solutions to a problem. Ideally, they’ll find the best solution, but we usually have to settle for something “good enough” due to constraints like time or money. After all, that’s why we use shortcuts in the first place.

There are many kinds of optimization heuristics, but genetic algorithms are (as you may have guessed from the name) unique in that they are inspired by real-world biological evolution, mainly the principles of survival of the fittest, random combination of old sets of genes (no, not old jeans like I’m wearing) into one new set, and random mutation.

A brief history of genetic algorithms

The history goes back to 1950 when Alan Turing (as in the Turing Test, Turing Machine, Turing Completeness, the ACM’s Turing Award, and so on) proposed a “learning machine” in which he thought that the mechanism of learning would be similar to evolution. Nothing much ever came of that, and it took a few decades for genetic algorithms, in general, to get some traction.

The first commercial product based on genetic algorithms, a mainframe toolkit for industrial processes, came out in the late 1980s from General Electric. In 1989, a more general genetic algorithm toolkit called Evolver came out for PCs. These days, MATLAB and similar tools have some built-in genetic algorithm facilities, and many programming languages have libraries available. (Plus, the basic idea is simple enough that any decent programmer can implement it from scratch.) However, the actual uses of genetic algorithms remain mostly hidden and, in my opinion, rather boring to most of us: companies use them in internal industrial processes, logistics, scheduling, and the like.

But once in a while, they do get used for something more interesting and more publicly known. Most famously, in 2005, NASA used a genetic algorithm to design an antenna for the ST5 series of satellites, shown here with a US quarter (25-cent piece) for scale:

 (No, that’s not just a paperclip bent up by someone fidgeting in a boring meeting!)

The NASA Jet Propulsion Laboratory website says: “Its unusual shape is expected because most human antenna designers would never think of such a design.” That is one of the great advantages of this approach. These algorithms are less hampered by expectations of similarity to past solutions. They can come up with things that a human might initially dismiss as unrealistic, but the numbers work out just fine. However, in this post, we’ll only look at simpler problem domains to make understandable demos.

So, how do genetic algorithms work? 

They consist of a simple series of steps:

Step 1: Create an initial population of candidates. 

(In Genetic Algorithm terms, these are called “chromosomes,” but I don’t like that term, since most living beings contain many chromosomes in each cell. I think it leads to confusion. For the sake of this article and understanding, we’ll call them candidates.)

Step 2: Assess the “fitness” of each candidate.

This is rated according to whatever criteria we want to apply. We do this here mainly because it supplies the data usually used in steps 3 and 4.

Step 3: Check: Are we done yet? 

We usually base this on fitness, but it could be based on other criteria (we’ll see some later) or a combination. If we’re not done, we move on to step 4.

Step 4: Select some candidates to breed the next generation. 

This is also usually based on fitness (to simulate survival of the fittest), sometimes with some randomness thrown in.

Step 5: Use those candidates to breed a new population.

Some of the previous population, especially the fittest ones, may be carried over into this new population, but usually not.

Step 6: Mutate those new candidates. 

This is a very important but easily forgotten step. It is necessary to increase diversity in the gene pool.

Step 7: Repeat as needed. 

We go back to step 2, assessing the fitness of these new candidates now that they’re in their final form. This sequence could be represented at a high level with some rather simple code, like so:

(This code is in Ruby, which I chose because it reads pretty closely to plain English.)
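A runnable sketch of that loop, using trivial stand-in helpers (a toy problem where candidates are 7-bit numbers and fitness just counts the 1 bits; the names and details are my own, not necessarily the original code):

```ruby
POPULATION_SIZE = 10

# Toy stand-ins so the skeleton runs; real versions depend on the problem.
def create_candidate
  rand(2**7)                         # a random 7-bit number
end

def fitness(candidate)
  candidate.digits(2).sum            # count of 1 bits
end

def done?(population)
  population.any? { |c| fitness(c) == 7 }
end

def select_breeders(population)
  population.max_by(2) { |c| fitness(c) }
end

def breed(parents)
  point = rand(0..7)                 # single random crossover point
  mask = (1 << point) - 1
  (parents[0] & mask) | (parents[1] & ~mask)
end

def mutate(candidate)
  7.times { |i| candidate ^= (1 << i) if rand(4).zero? }
  candidate
end

# The genetic algorithm itself: the whole loop in a few lines.
population = Array.new(POPULATION_SIZE) { create_candidate }
until done?(population)
  breeders = select_breeders(population)
  population = Array.new(POPULATION_SIZE) { mutate(breed(breeders)) }
end
```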

Going to market: An example, with code!

What is a candidate?

These are different possible solutions to the same problem, usually represented as different instances of the same data structure. They could be any data structure we want, as long as we can evaluate their fitness, combine old ones to make a new one, and mutate them.

The simplest common type of candidate is a simple string of bits, like the candidates in this population:
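For example, ten 7-bit candidates might look like this (values invented purely for illustration):

```ruby
# Ten hypothetical 7-bit candidates, as strings for readability.
population = %w[
  1011001 0110110 1110000 0001011 1010101
  0100110 1101001 0011100 1000111 0111010
]
```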

These will do fine for candidates that consist of a simple series of yes/no decisions. This may sound simplistic, but there is a huge class of problems that boil down to this: knapsack problems, which are a category of “constrained resource allocation” problems.

The canonical example is that you have a knapsack (or rucksack, backpack, or whatever you call it) and many things you want to carry in it, but they won’t all fit, or the total weight is more than you can carry, or some such similar constraint, or combination of constraints. So, you want to find the combination of items that will fit the constraints and have the maximum value. 

How do we create a candidate?

To look at a concrete example, suppose we know a cattle farmer with a smallish truck. He needs to decide what to take to market each week. Among the things he can take are cows, milk, cheese, butter, ice cream, meat, and leather. (My apologies to any vegetarians out there.)

For the sake of simplicity, we won’t differentiate between price and profit, nor dairy versus meat cows. We’ll assume he can only take a set amount of each item and always has that amount on hand. His truck has room to take all the items, but it can only carry so much weight, so that’s our constraint. His choices are as follows:

The main thing to notice here is that it totals 8,600 pounds, but his truck’s suspension can only handle 4,000 pounds.

We could brute-force the problem by running through all the combinations and seeing which of the light-enough ones totals the highest. Or we could use a very common heuristic for knapsack problems: adding the most expensive thing we can, over and over, until we can’t add anything. 

Let’s see what happens if we use a genetic algorithm to determine a “good enough” truckload. First, we need a way to represent each candidate. In code, we could represent them as a class and create one randomly, like so:
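A minimal sketch of such a class (the naming is mine, not necessarily the original code):

```ruby
# One candidate truckload: the contents are a random 7-bit number,
# one yes/no bit per possible item.
class Truckload
  NUM_ITEMS = 7

  attr_reader :contents

  def initialize
    @contents = rand(2**NUM_ITEMS)
  end
end
```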

โ€œWhoa,โ€ you say, โ€œthat looks like we’re just making a random number!โ€ That’s right! We’re making a random number with seven bits, so we have a random 1 or 0 for each of our seven possible items. We could get as complex as we want in this function, like dictating a minimum or maximum number of items, but let’s keep it simple.

To create an initial population, we can just create a bunch of candidates and stuff them into an array, like so:
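A sketch, with a minimal candidate class included so the snippet stands alone:

```ruby
# A minimal candidate so this snippet runs on its own.
class Truckload
  attr_reader :contents
  def initialize
    @contents = rand(2**7)
  end
end

# Deliberately plain, rather than idiomatic, Ruby.
def create_population(size)
  population = []
  size.times do
    population.push(Truckload.new)
  end
  population
end
```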

(This could be done in much more idiomatic Ruby, but I’m keeping this easily understandable for those who don’t know Ruby. So those of you who do, please don’t scold me for that.)

If we create a population of ten* truckloads, we might wind up with a list like this:

*Why ten? Because I usually do this as an in-person conference presentation, and that’s what fits on the screen in a size that is still decently readable from the back row! If I were doing this for real, with a much more complex domain, I might use a hundred, a thousand, a million, or even more, depending on the scenario.

How did we get from random numbers to those combinations? 

Behind the scenes, that translation might look like this:
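A sketch of that translation. The individual weights and values here are my own illustrative guesses, chosen only to be consistent with the totals the article mentions (8,600 pounds of goods, a 4,000-pound truck, and a best load of cheese, meat, and leather worth 26,000):

```ruby
class Truckload
  Item = Struct.new(:name, :weight, :value)

  # Hypothetical numbers; only the 8,600-pound total is from the article.
  ITEMS = [
    Item.new("cow",       2_500, 10_000),
    Item.new("milk",        800,  2_000),
    Item.new("cheese",      500,  8_000),
    Item.new("butter",      600,  3_000),
    Item.new("ice cream", 1_200,  5_000),
    Item.new("meat",      2_000, 12_000),
    Item.new("leather",   1_000,  6_000),
  ].freeze

  def initialize(contents = rand(2**ITEMS.length))
    @contents = contents
  end

  def bit_on?(index)
    @contents[index] == 1   # Integer#[] returns the bit at that index
  end

  def item_names
    ITEMS.each_index.select { |i| bit_on?(i) }.map { |i| ITEMS[i].name }
  end
end
```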

We have a list of items we can take as instances of another class describing them, with name, weight, and value attributes. (In this example, it’s an inner class, but it doesn’t have to be.) To check what’s in our cargo manifest, represented by our truckload’s contents value, we can iterate through the list of possible items, checking whether the corresponding bit is on. But I’ll handwave over those details to keep this code simple.

Assessing fitness with (surprise!) a โ€œFitness Functionโ€

Now that we’re done with Initialization, we assess how “fit” each truckload is. We do this with something called a “fitness function.” Just like how biological creatures might be perfectly fit for one environment but a lousy fit for another, this should reflect how fit a candidate is for some particular purpose. In this case, we already know we want the total value, BUT any load too heavy for the truck is worthless. In Ruby, that would look like this:
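A sketch of that fitness function (here taking the item list as a parameter; `Item`, the names, and the test numbers are mine):

```ruby
Item = Struct.new(:name, :weight, :value)
MAX_WEIGHT = 4_000   # the truck's 4,000-pound limit

# Fitness of a truckload: total value of the chosen items, or zero
# if their total weight exceeds what the truck can carry.
def fitness(contents, items)
  chosen = items.each_index
                .select { |i| contents[i] == 1 }   # bit_on? inlined
                .map { |i| items[i] }
  return 0 if chosen.sum(&:weight) > MAX_WEIGHT
  chosen.sum(&:value)
end
```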

We decode which items we want to take and then sum up their weights. (Writing the bit_on? function is left as an exercise to keep this code simple.) If that exceeds the truck’s capacity, we return zero; otherwise, we sum up their values.

Again, we could get as complex as we want in this function, and NASA’s antenna fitness function certainly must have been! For instance, we could consider the costs of refrigerating items that need it. Or if the truck were even smaller, and we had the volume of each item, we could also total up the volume and make sure it all fits. We could even consider very long or flat shapes so that, for instance, a flagpole or a large sheet of plywood might not fit, even though its weight and total volume may be relatively small.

If we run this fitness function on our population and sort on fitness descending to make it easy to find the best, we get this:

The “Are We Done” step: considerations and criteria

Now that we’ve assessed their fitness, we can check if we’re done. So, what are our criteria? 

This function can be simple, but it can take some thinking to figure out what it should do. With a knapsack problem, a good solution, especially the best, can be made worthless by adding just one more wafer-thin item, thereby exceeding the constraints. So, we will record the best we’ve seen and stop if we see nothing better within 100 generations. (Why 100? Another somewhat random choice. It seems like enough for a good chance for improvement, and since what we’re doing is so simple and our population is so small, using lots of generations isn’t very slow.) In Ruby, that would look like this:
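A sketch of that check (tracking just the best fitness score rather than the whole candidate, for brevity; the names are mine):

```ruby
# Stop once we've gone 100 generations without beating the best seen
# so far. Class variables (the double at-signs) persist between calls.
class Population
  MAX_STAGNANT = 100
  @@best_fitness = 0
  @@generations_since_best = 0

  def self.done?(fitnesses)
    @@generations_since_best += 1
    best_this_generation = fitnesses.max
    if best_this_generation > @@best_fitness
      @@best_fitness = best_this_generation
      @@generations_since_best = 0
      return false
    end
    @@generations_since_best >= MAX_STAGNANT
  end
end
```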

When this code is initially run, to define the function, we set the initial best combo as empty and the count of generations since we last saw an improvement to zero, both as class variables. (That’s what the double at-signs mean.) When the function is called, we increment the generation counter, look at the fitness of the current candidates, and select the ones with better fitness than our benchmark.

If there are any better candidates, we make the fittest one our new benchmark, reset the generation counter, and return false. 

If it’s been 100 generations since the best one, we return true; otherwise, we return false. 

Again, we can get as complex as we want. Beyond checking the maximum fitness, we could look at other stopping criteria, like the average or minimum fitness, achieving some specific level of maximum fitness, hitting some maximum number of generations or amount of time (whether clock time or CPU time or whatever), letting the user click a STOP button, or many other ways, or a combination of ways.

The next generation candidate selection

Since we’re still going (rarely done on the first loop), the next step is to select candidates to breed the next generation. The obvious way is to take the top two most fit, like so:
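A sketch (with a stand-in fitness function so the snippet runs on its own):

```ruby
# Stand-in fitness for demonstration: count of 1 bits.
def fitness(candidate)
  candidate.digits(2).sum
end

# Take the two fittest candidates as breeders.
def select_breeders(population)
  population.sort_by { |candidate| -fitness(candidate) }.first(2)
end
```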

We take the population, sort them by fitness in descending order, and take the first two. Out of our current population, that would choose these two:

As usual, we could also get more complicated, and some common, more complex alternatives exist. For instance, with the “Roulette Wheel” selection, we would select the candidates randomly, but each has a chance to be selected proportional to their fitness. 

We could also take more than two, whether to breed all pairs in that set or to combine more than two at once. Or we could combine strategies, such as using all trios from a randomly chosen five, with the fitter ones having a better chance of being chosen.

The crossover – breeding the next population

Now that we’ve chosen our breeders, we breed them. The usual way is called crossover. This consists of taking the data points, or in Genetic Algorithm terms, the “genes” from one parent, up to some randomly chosen crossover point, then switching to the other parent. This can be extended with multiple crossover points, but we’re just going to use one, like so:
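A sketch of single-point crossover on 7-bit candidates (the names are mine):

```ruby
NUM_ITEMS = 7

# Genes before the randomly chosen point come from the first parent,
# the rest from the second.
def crossover(parent1, parent2)
  point = rand(0..NUM_ITEMS)
  child = 0
  NUM_ITEMS.times do |i|
    source = i < point ? parent1 : parent2
    child |= (1 << i) if source[i] == 1   # Integer#[] reads bit i
  end
  child
end
```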

We establish the crossover point for each new candidate as a random number between zero and the number of items, inclusive. Then, we iterate through the list of items by index number. If we haven’t yet reached the crossover point, we get the decision for that item from the first parent; otherwise, we get it from the other parent. This means a child could be copied entirely from one parent or the other, or switched at some point in between. (As usual, we could get as complex as we want, like making some crossover points more likely or forbidden.) If we do this once, with a crossover point of 3, so we take three values from the first parent and the rest from the second, that would get us a result like this:

But this is just one of ten results because we’re making a whole new population, like so:
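A sketch, with a compact crossover stand-in included so the snippet runs on its own:

```ruby
POPULATION_SIZE = 10
NUM_ITEMS = 7

# Minimal single-point crossover stand-in.
def crossover(parent1, parent2)
  point = rand(0..NUM_ITEMS)
  mask = (1 << point) - 1
  (parent1 & mask) | (parent2 & ~mask)
end

# Same plain style as creating the initial population, but each new
# candidate is bred from the chosen breeders instead of made randomly.
def breed_population(breeders)
  population = []
  POPULATION_SIZE.times do
    population.push(crossover(breeders[0], breeders[1]))
  end
  population
end
```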

This is just like how we created the initial population, except that instead of each candidate being made from scratch, they’re the product of breeding our chosen breeders. The whole list might look like this:

Don’t forget: mutate and diversify again

Lots of family resemblance there, eh? None of these loads include a cow, butter, or leather, and they all include cheese. That’s because both of our breeders were like that. If we were to continue breeding the fittest of each generation, we wouldn’t ever see any loads that include a cow, butter, leather, or no cheese, but we fix that in the next step: mutate them. Again, I will keep it simple and give each gene a 1 in 4 chance of flipping. In code, that looks like this:
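A sketch of that mutation:

```ruby
NUM_ITEMS = 7

# Each gene (bit) gets a 1-in-4 chance of flipping.
def mutate(contents)
  NUM_ITEMS.times do |i|
    contents ^= (1 << i) if rand(4).zero?
  end
  contents
end
```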

We iterate through the item numbers, and for each one, if a random number from zero to three is a zero, we flip that bit. Again, we could get as complex as we want, like having some genes more likely to mutate than others, having some minimum or maximum number of mutations per candidate, or a multitude of other options. If we run this mutation function on these new candidates, we might wind up with something like this:

We now DO have some truckloads that include a cow, butter, leather, or no cheese. Now we go back to assessing the fitness of these newly finalized candidates, and we get this:

Assess fitness again [step 2]

Oh, noes! Our maximum fitness went down! As you may recall, our previous best score was 20,000. But don’t worry. In our “Are we done yet” function, we hang onto the best one and try to outdo it so we haven’t lost it.

The next generation might look like this:

A slight improvement over our prior best! So, we set that top one as our benchmark and reset the counter of generations since we saw it. If we let this run to completion, we might wind up with something like this, with our best truckload scoring 26,000, made up of cheese, meat, and leather:

So that’s one complete run of a genetic algorithm. If we wanted to check whether that was the best that this algorithm could produce, we could just run it again and again and again, as many times as we like (within reason), since it’s so much faster than brute force. Okay, maybe writing all this code is slower when we only have seven items and such simple criteria. However, creating a genetic algorithm might be worthwhile if we had to choose among many more items, with more complex criteria, for many truckloads a day.

A Note from the Author: There are better approaches than this. Remember how I said that sometimes the fittest members of a population might be carried over into the next one? If I had carried over the fittest one, the maximum fitness would never go down, so the “are we done” function could have been much simpler, and we’d be done much faster. But I only thought of it once I had already made all the slides for the talk from which this blog post is extracted, and I didn’t want to redo them. Also, I think this version is still worth exploring to make the point that the fitness can go down and then come back up. So let’s keep going. 

Another example? Sure, with Dungeons and Dragons character stats

Let’s evolve something completely different, perhaps a bit less practical. Suppose we want to “evolve” a good set of stats for a Dungeons and Dragons (D&D) fighter character, so our candidates are tuples of numbers rather than strings of bits. D&D character stats are Strength, Intelligence, Dexterity, Constitution, Wisdom, and Charisma, each determined by rolling three six-sided dice (or 3d6 for short). For simplicity’s sake, I will gloss over how a D&D character can sometimes have extra strength. In Ruby, that might look like this (defining the roll function is left as an exercise to keep this code simple):
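A sketch (with `roll` defined here, rather than left as an exercise, so the snippet runs on its own; the naming is mine):

```ruby
# A character whose six stats are each rolled as 3d6.
class Character
  STATS = %i[str int dex con wis cha]

  attr_reader :stats

  def initialize
    @stats = STATS.to_h { |stat| [stat, roll(3, 6)] }
  end

  # Roll the given number of dice with the given number of sides.
  def roll(dice, sides)
    (1..dice).sum { rand(1..sides) }
  end
end
```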

Creating candidates and assessing fitness

If we create an initial population of ten characters, it might look like this:

We then move to assess how “fit” each one of these characters is. We’re trying to evolve a good set of Fighter stats, so it should be based mainly on strength, constitution, and dexterity. Intelligence, wisdom, and charisma, not so much; but we don’t want them too low, for the sake of occasional saving throws. I tried several different things, such as totaling up double the strength, the constitution, and half the dexterity:
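Something like this (the stat keys are my own naming):

```ruby
# Double the strength, plus the constitution, plus half the dexterity.
def fitness(stats)
  2 * stats[:str] + stats[:con] + stats[:dex] / 2
end
```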

but the other stats tended to get too low, and even the dexterity did.

So, I tried prioritizing them linearly, adding up six times the strength, five times the constitution, and so on down to one times the charisma:
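Something like this (the order of the middle stats is my assumption, matching the order used for the final version):

```ruby
# Linear priorities: 6x strength down to 1x charisma.
def fitness(stats)
  6 * stats[:str] + 5 * stats[:con] + 4 * stats[:dex] +
    3 * stats[:int] + 2 * stats[:wis] + 1 * stats[:cha]
end
```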

but then the other stats got too high, and the characters seemed too generalized.

Finally, I settled on prioritizing the stats again but much more strongly, totaling up 32 times the strength, 16 times the constitution, and so on, down to one times the charisma. Here, we see that even though the fitness function can be very simple, it can be difficult to figure out one that will yield good results, whatever that means in the situation.
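A sketch of that final weighting:

```ruby
# Powers-of-two priorities: 32x strength down to 1x charisma.
FIGHTER_WEIGHTS = { str: 32, con: 16, dex: 8, int: 4, wis: 2, cha: 1 }.freeze

def fitness(stats)
  FIGHTER_WEIGHTS.sum { |stat, weight| stats[stat] * weight }
end
```

With all six stats at their maximum of 18, this gives 18 × 63 = 1,134, matching the maximum mentioned below.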

If we run this on our population and sort them, we get this:

“Are We Done?”

Now that we’ve assessed their fitness, we can ask: are we done? What are our criteria? Let’s call it done if any candidates get 90% of the way to the maximum score of our fitness function. (I’ll spare you most of the math, but the maximum is 1,134, so 90% would round to 1,021.) In code, checking that would look like this:
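A sketch:

```ruby
MAX_FITNESS = 1_134                   # all six stats at 18
TARGET = (MAX_FITNESS * 0.9).round    # 90% of the maximum: 1,021

# Done once any candidate reaches the target score.
def done?(fitnesses)
  fitnesses.max >= TARGET
end
```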

None of our current candidates score anywhere near 1,021, so we select some candidates to breed the next generation. Taking the top two scorers again, we get these two*:

*Author’s Note: The abstraction gets a bit more leaky now because we’re ignoring sexes; we have no guarantee that these characters will be a male and a female, as we’re not even making that part of the data! (We had the same leak last time with the truckloads, but then it was irrelevant, so I let it slide.) Now that they’re living beings (humans or elves or whatever), we could add such complications and many other factors. That would complicate our breeder selection enormously, maybe even make it impossible to find a viable pair in such a small population.


Breeding the next population

Next, we breed our chosen pair. This time, we’ll use another common strategy of essentially flipping a coin for each gene. We go through the stats one by one, flip a coin (or “roll a d2”), and if it comes up 1, we get that stat from the first parent. Else, we get it from the other parent. 
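A sketch of that coin-flip (uniform) crossover, with stats represented as a hash:

```ruby
STATS = %i[str int dex con wis cha]

# For each stat, a coin flip picks which parent it comes from.
def breed(parent1, parent2)
  STATS.to_h do |stat|
    source = rand(2).zero? ? parent1 : parent2
    [stat, source[stat]]
  end
end
```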

That could get us a result like this:

But again, this is just one of ten results because we’re making a whole new population, which might look something like this:

There are two main things to notice here.

  • First, the number of stats (or, in Genetic Algorithm terms, “alleles”) inherited from each parent is not always the same, neither in a single candidate nor the whole population. It’s a series of random coin flips, so on average, there will be three and three, but just like with the truckloads, it could be anything up to six and zero either way.
  • Second, notice the family resemblance again! Even though we’re no longer using yes/no decisions, there are only two possible values for each stat, or in Genetic Algorithm terms, alleles for each gene, for 64 possible combinations. If any stats were the same between the two parents, there would be only one possible value and, therefore, half as many possible combinations.

At a glance, these look, on average, much more suitable as fighters than the previous generation. (We’ll figure out their actual fitness scores later.) Just like before, if we were to continue breeding the fittest of each generation, we wouldn’t see any change, let alone improvement, in the possible values of each stat. For instance, the Wisdom would never be anything other than 6 or 11. We fix that in the next step.

Mutate the new candidates

Keeping it simple, I give each stat a 1/3 chance of staying the same, going up a point, or going down a point, within the valid range. For each stat, we add a random number from 0 to 2 and subtract one, which is like adding a random number from -1 to 1, but we clamp it to the range of 3 to 18. We could get as complex as we want, like giving it a higher chance of going up or down, maybe by multiple points, if it’s very low or very high, to simulate the real-world phenomenon of regression to the mean or many other options.
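A sketch:

```ruby
# Each stat goes down one, stays, or goes up one with equal chance,
# clamped to the valid 3..18 range.
def mutate(stats)
  stats.transform_values do |value|
    (value + rand(3) - 1).clamp(3, 18)
  end
end
```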

If we run this on our new population, we wind up with something like this:

Looking at the values in each column, you can see they’re much more diverse: some stayed the same, but some went up, and some went down.

Repeat until weโ€™re done

We go back to Step 2 and assess the fitness of these new candidates. This generation has improved much from the previous one. The old ranged from 476 to 760, and the new ranged from 635 to 851.

It’s still far from our stopping criterion of 1,021, so fast-forwarding through six more rounds, we finally get one suitable character, with 18 Strength and Constitution, acceptable though sub-par Dexterity, and surprisingly high Intelligence:

That’s not a problem, just a bit of a waste. If we wanted the character more specialized, we could complicate the fitness function further, explicitly demanding well-above-average scores in the class’s primary stats and forbidding, or at least penalizing, very high scores in the others.

Now that we’ve evolved a set of Fighter stats, let’s suppose we don’t need any more Fighters in the party, but now we need a Wizard! All we need to do is tweak our fitness function to prioritize intelligence first, then wisdom, dexterity, and so on, down to strength. (We could also make it take parameters and pass in different ones this time, but I will just hardcode it here for simplicity.)

An initial population would look roughly the same, since we haven’t changed how it’s generated, so I’ll spare you those steps. But after 11 generations, I got three candidates who are 90% fit to be wizards:

They’re mostly pretty good in the other stats, but not so much as to be better suited for some other class, except that the top one looks like a bard to me.

Author’s Note: Remember, though, that this is all very random. It may converge on a suitable solution faster or more slowly, and the fitness function may be good or poor at getting just the right mix of alleles.


Conclusion

There are many other ways we can use genetic algorithms. They can create images, music, and even code. I’m now working on a system to use them to schedule conference talks. Mey Beisaron has a talk on using one to schedule her college classes. So, think about it, and you might be able to use them for something.

To recap what you’ve learned here:

  • Genetic Algorithms are optimization heuristics (fancy talk for shortcuts to finding good enough solutions).
  • They’re simpler than you probably thought. All you have to do is create an initial population and cycle through five simple steps: assessing their fitness, checking if you’re done, picking breeders, breeding them, and mutating the new candidates, over and over until you’re done.
  • They can use very simple functions, but it can be tricky to figure out exactly what those functions should do for best results (whatever that means in your situation), especially for evaluating fitness and checking if you’re done, so that the solutions converge quickly enough without ignoring too many other very good solutions.
  • This technique applies to a wide variety of problems, including ones so complex that a semi-random algorithm can generate excellent solutions that we humans would never have thought of.

More about Open Spaces: We believe that the best insights come from those who are deeply engaged in the field, which is why we invite our talented engineers to share their knowledge, experiences, and passions.

In each installment, our contributors (all Gun.io engineers) delve into a wide range of technical topics, from emerging technologies and innovative practices to personal projects and industry trends. They aim to inspire, educate, and foster a deeper understanding of what interests us. 

If you’re a Gun.io community member interested in writing, email Victoria Stahr ([email protected]). Join us as we celebrate the voices of our Gun.io community and spark conversations that drive innovation forward!
