Faster sorting with AlphaDev

2023 has been an exciting year in the world of AI. It’s not just the talk of our industry, but the talk of the world. With the emergence of large language models like GPT, you can’t navigate any form of media without being bombarded with things being created using AI.

Prior to the dawn of OpenAI and the remarkable effect its LLMs have had, I looked to Google Deepmind as the industry trailblazer. Which is why, the other day I decided to see what they were up to, and I wasn’t disappointed.


In computer science — as you probably know — sorting is quite fundamental. Given an unsorted number of items, order them. An example might be to sort pile of books into alphabetical order, or arrange some numbers from smallest to biggest.

One of the first algorithms you might learn as a budding computer scientist is how to sort. There’re a number of different ways to do so, with their own strengths and weaknesses. Sorting algorithms have been studied by many and thought to have been solved problem.

Asymptotically optimal algorithms have been known since the mid-20th century

Which is why what Deepmind have done is completely fascinating.


Google Deepmind released a paperFaster sorting algorithms discovered using deep reinforcement learning“. Where they took the fundamentals learnt from AlphaZero used in projects like AlphaGo — of which there is an amazing documentary I recommend everyone to watch — and applied it to programming to create novel sorting algorithms. They created AlphaDev.

AlphaDev produced a faster sorting algorithm than the collective intelligence of the human race has been able to produce to date:

AlphaDev uncovered new sorting algorithms that led to improvements in the LLVM libc++ sorting library that were up to 70% faster for shorter sequences and about 1.7% faster for sequences exceeding 250,000 elements.

How did AlphaDev do it?

Google Deepmind approached this task be treating the problem as though it was a game, like Chess or GO. AlphaDev is the deep reinforcement learning (DRL) agent that played the game. The game composed of AlphaDev generating assembly instructions each turn. A series of input sequences were provided to the program generated by AlphaDev and compared to the expected output. If AlphaDev got the correct output it would be rewarded, as well as being rewarded for writing faster solutions, where both actual latency and number of instructions were taken into consideration.

The table below shows the length of the algorithms for different sort functions. Sort 3/4/5 are sorting algorithms where the length of the input is known, VarSort3/4/5 are where the length is not known, and can be between 0-3, 0-4, 0-5 respectively. Their focus was primarily on these shorter algorithms as they are called multiple times by sorting functions.

It’s important to understand here that in the Sort X examples there is no branching meaning they have been highly optimised for the usecase. VarSortX on the other hand does require branching which makes the problem much more difficult to solve because subroutines need to be created and evaluated as well.

Given the static and complete nature of Sort X it is amazing that AlphaDev has managed to improve on the length of both Sort 3 and Sort 5.

AlphaDev found two really interesting sequences of instructions: The AlphaDev swap move and the AlphaDev copy move:


Another breakthrough they had which they briefly mentioned in their blog is to do with hashing. This was glossed over in the paper as it wasn’t the main focus. Hashing is also another fundamental algorithm in computer science:

hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output.[1] The values returned by a hash function are called hash valueshash codesdigests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

They applied AlphaDev to the 9-16 bytes range of the hashing function and found that it was 30% faster. This is truly incredible.


AlphaDev’s output on such a fundamental part of computing is nothing short of amazing. If there are optimisations to be had in algorithms that have been painstakingly reviewed and updated by numerous computer scientists, what does this hold for more complicated algorithms?

What will we see next from AlphaDev? Could it find a novel search algorithm? Could it find a more efficient way to traverse a graph? What about compression? There is unlimited potential! Much like AlphaGo, I can visualise AlphaDev rewriting computer science text books with its novel approaches.

AlphaDev’s assembly code has since been reverse-engineered into C++ and is now apart of the libc++ sorting library. With sort said to be used trillions of times per day, the world just got a bit faster.

Leave a Reply