AlphaEvolve: ToolKami Style
(Last updated: 2025-09-23)
TL;DR: We implemented AlphaEvolve as an LLM workflow with MCP tools to optimize a Perlin noise implementation for procedural image generation. Code is available at the end of this post.
When I had just started experimenting with ToolKami, Google released AlphaEvolve: A coding agent for scientific and algorithmic discovery. The paper made waves in the news, unsurprisingly, given its impressive results and its innovative combination of two powerful techniques: large language models and evolutionary search.

While their implementation remains closed-source, they shared enough details for me to quickly replicate an approximation of their loop using my existing agentic setup. For this experiment, I set myself the goal of optimizing a Perlin noise implementation so that its output resembles a target fire image.
Perlin noise
Perlin noise is an algorithm commonly used for procedural generation of wave-like, undulating materials and textures. You’ll find it in games such as Minecraft, where it generates terrain and biomes, and in movies, where it powers visual effects like clouds, fire, or water.
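To make the idea concrete, here is a minimal sketch of classic 2D gradient noise in pure Python. It is not the implementation used in this experiment; the function names (`fade`, `lerp`, `make_gradients`, `perlin`) and the random-angle gradient table are assumptions chosen for illustration.

```python
import math
import random


def fade(t: float) -> float:
    """Perlin's quintic smoothing curve: 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6 - 15) + 10)


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b."""
    return a + t * (b - a)


def make_gradients(size: int, seed: int = 0) -> list[list[tuple[float, float]]]:
    """Random unit gradient for every lattice point of a size x size grid."""
    rng = random.Random(seed)
    grid = []
    for _ in range(size):
        row = []
        for _ in range(size):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            row.append((math.cos(angle), math.sin(angle)))
        grid.append(row)
    return grid


def perlin(x: float, y: float, grads: list[list[tuple[float, float]]]) -> float:
    """Sample 2D gradient noise at (x, y); the gradient grid wraps around."""
    size = len(grads)
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0  # fractional position inside the cell

    def corner(ix: int, iy: int) -> float:
        gx, gy = grads[iy % size][ix % size]
        # Dot product of the corner gradient with the offset from that corner.
        return gx * (x - ix) + gy * (y - iy)

    u, v = fade(fx), fade(fy)
    top = lerp(corner(x0, y0), corner(x0 + 1, y0), u)
    bottom = lerp(corner(x0, y0 + 1), corner(x0 + 1, y0 + 1), u)
    return lerp(top, bottom, v)
```

Sampling `perlin(x * 0.1, y * 0.1, grads)` over a pixel grid gives the familiar smooth, cloud-like pattern. Summing several octaves at different scales is the usual way to add finer detail.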

Sweet spot of AI
Optimizing Perlin noise is a sweet spot for AI because it fulfills three important criteria:
- Massive combinatorial search space: There are countless ways to implement Perlin noise and populate its parameters.
- Clear objective function (metric) to optimize against: We can use the Mean Squared Error (MSE) between pixels of the target image and the image generated by the Perlin noise implementation.
- Either lots of data and/or an accurate and efficient simulator: In this case, the “efficient simulator” is the code interpreter, which can be run in parallel.
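As a rough sketch of the objective, the functions below compute the MSE between two grayscale images stored as nested lists of floats. The negation in `score` is an assumption, inferred from the negative scores reported in the results; the function names and the list-based image format are illustrative only.

```python
def mse(target: list[list[float]], generated: list[list[float]]) -> float:
    """Mean squared error between two equally sized grayscale images."""
    same_rows = len(target) == len(generated)
    if not same_rows or any(len(a) != len(b) for a, b in zip(target, generated)):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for row_t, row_g in zip(target, generated):
        for t, g in zip(row_t, row_g):
            total += (t - g) ** 2
            count += 1
    return total / count


def score(target: list[list[float]], generated: list[list[float]]) -> float:
    """Negated MSE (assumed convention): higher is better, 0 is a perfect match."""
    return -mse(target, generated)
```

With real images, the same computation is usually done on arrays, but the metric itself is unchanged.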
Implementation
The “Controller Loop” can be implemented manually or as an agentic workflow with the right tools.

The general steps are as follows:
- Tell the agent its role and goal
- Sample the implementations using the File tool
- Instruct the LLM which region of code it is allowed to modify
- Evaluate the implementation using the Shell tool
- Save the implementations and evaluation results with the File tool
- Repeat
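For comparison, the manual variant of these steps could look roughly like the sketch below. It is not the code from this project: `step` and `mutate` are hypothetical names, `mutate` stands in for the LLM edit, the program is run with the current Python interpreter rather than as a UV script, and the score is assumed to be the last line the program prints.

```python
import hashlib
import random
import subprocess
import sys
from pathlib import Path


def step(results_dir: Path, mutate, rng: random.Random) -> Path:
    """One iteration: sample a parent, mutate a copy, evaluate it, save it by score."""
    # Sample: any scored program may be picked, not only the best one.
    parents = [p for p in results_dir.glob("*.py") if not p.name.startswith("candidate_")]
    parent = rng.choice(sorted(parents))

    # Mutate: `mutate` maps source text to source text (the LLM's job in the workflow).
    child_source = mutate(parent.read_text())
    digest = hashlib.md5(child_source.encode()).hexdigest()
    candidate = results_dir / f"candidate_{digest}.py"
    candidate.write_text(child_source)

    # Evaluate: run the program and read the score from the last line of stdout.
    run = subprocess.run(
        [sys.executable, str(candidate)], capture_output=True, text=True, timeout=60
    )
    if run.returncode != 0:
        candidate.unlink()
        raise RuntimeError(run.stderr)
    score = float(run.stdout.strip().splitlines()[-1])

    # Save: the score goes into the filename, so the directory acts as the database.
    final = results_dir / f"{score:.4f}_{digest}.py"
    candidate.rename(final)
    return final
```

Calling `step` repeatedly gives the loop. Because each call only reads and writes files in `results_dir`, several workers could in principle run it in parallel.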
Here’s the actual prompt I used:
Act as an expert software developer. Your task is to iteratively improve the provided codebase.
Create a Perlin noise implementation that resembles the target image (a fire in this case).
1. Use list_directory tool with sort_order 'asc' and limit 10 in the directory '/workspaces/toolkami/projects/perlin/results' which were saved with convention '{score}_{md5sum}.py'.
2. Sample 1 program from the list, it doesn't have to be the best, sample randomly.
3. Make a copy of the file with the name 'candidate_{random_id}_{md5sum}.py' with executable permission and save it in the directory '/workspaces/toolkami/projects/perlin/results'.
4. You are only allowed to modify the content between '# EVOLVE-BLOCK-START' and '# EVOLVE-BLOCK-END', suggest a new idea to improve the code that is inspired by your expert knowledge of game programming, graphics and optimization.
5. Edit the candidate file using the edit tool with the diff-fenced format.
6. Write the output of edit tool to the candidate file
7. Execute the program (as a UV script) and after obtaining the output score
8. rename the file with convention '{score}_{md5sum}.py'.
9. Forget current memory
10. Repeat the process
Notice step 4, which specifies the edit region. The whitepaper highlights this technique:
API. To support evolving multiple components across a codebase, AlphaEvolve exposes an input API where blocks of code can be annotated as to-be-evolved-by-the-system; This design facilitates integrating it with existing codebases while requiring only minimal changes, simply by adding special markers as comments into the code.
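In the agentic workflow the LLM is simply told to respect the markers, but the same constraint can also be checked in code. The sketch below splits a file around the two marker comments and verifies that an edit left everything outside them untouched. The helper names are hypothetical, and it assumes exactly one marker pair per file.

```python
START = "# EVOLVE-BLOCK-START"
END = "# EVOLVE-BLOCK-END"


def _find(lines: list[str], marker: str) -> int:
    """Index of the first line that consists of the given marker."""
    for i, line in enumerate(lines):
        if line.strip() == marker:
            return i
    raise ValueError(f"marker not found: {marker}")


def split_evolve_block(source: str) -> tuple[str, str, str]:
    """Split source into (head, block, tail); marker lines stay in head and tail."""
    lines = source.splitlines(keepends=True)
    start, end = _find(lines, START), _find(lines, END)
    if end <= start:
        raise ValueError("END marker must come after START marker")
    head = "".join(lines[: start + 1])
    block = "".join(lines[start + 1 : end])
    tail = "".join(lines[end:])
    return head, block, tail


def only_block_changed(original: str, edited: str) -> bool:
    """True if the edit left everything outside the markers untouched."""
    head_a, _, tail_a = split_evolve_block(original)
    head_b, _, tail_b = split_evolve_block(edited)
    return head_a == head_b and tail_a == tail_b
```

A candidate that fails `only_block_changed` could be discarded before it is evaluated.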
Result
The agent improved the score of the base implementation from -0.1373 to -0.0564 in just 100 iterations (the MSE is reported as a negative number, so closer to 0 is better). Improvements came mainly from:
- Adding a bilinear_interpolate function
- Using exact fractional and integer parts of coordinates for gradient dot product computations
- Generating smoother transitions across gradient cells due to higher-fidelity calculations
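For readers unfamiliar with the term, the sketch below shows what a bilinear interpolation helper typically looks like. It is an illustration only, not the function the agent wrote: the signature and argument order are assumptions.

```python
import math


def bilinear_interpolate(
    c00: float, c10: float, c01: float, c11: float, x: float, y: float
) -> float:
    """Blend four cell-corner values using the fractional parts of (x, y).

    c00 is the value at the cell's (floor(x), floor(y)) corner, c10 is one step
    along x, c01 is one step along y, and c11 is the opposite corner.
    """
    fx = x - math.floor(x)  # fractional part along x
    fy = y - math.floor(y)  # fractional part along y
    top = c00 + fx * (c10 - c00)
    bottom = c01 + fx * (c11 - c01)
    return top + fy * (bottom - top)
```

At a cell corner the result equals that corner's value, and in between it varies linearly along each axis, which is what produces the smooth transitions across gradient cells.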
Visually, the progression looks like this:
- Base implementation: resembles lava more than fire
- After 20 iterations: begins to take the shape of fire
- After 100 iterations: fire streaks appear, burning brightly at the center
With more iterations, we’d expect even better results.

Conclusion
To summarize our modifications to the original setup:
- We implemented it as an agentic workflow with simple, composable tools instead of a full-fledged program
- Instead of a database, we stored single-file executable programs (UV scripts) in a directory, with scores prefixed in the filenames
- We used diff-fenced editing (credit to Phil Schmid) as an API for modifying Evolve Blocks
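To illustrate the directory-as-database idea, the sketch below recovers scores from '{score}_{md5sum}.py' filenames and returns the best programs. The function names are hypothetical, and it assumes that a higher score (closer to 0) is better.

```python
from pathlib import Path


def parse_score(path: Path) -> float:
    """Extract the score from a '{score}_{md5sum}.py' filename."""
    return float(path.stem.split("_", 1)[0])


def top_programs(results_dir: Path, limit: int = 10) -> list[Path]:
    """Return up to `limit` programs, best score first."""
    scored = []
    for path in results_dir.glob("*.py"):
        try:
            scored.append((parse_score(path), path))
        except ValueError:
            continue  # skip unscored files such as 'candidate_*.py'
    scored.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in scored[:limit]]
```

The trade-off compared to a real database is that all metadata has to fit in the filename, but in exchange any tool that can list a directory can query the population.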
The complete implementation is available here: Code.
As this is my first blog post, I hope you enjoyed reading it! Feel free to share suggestions to improve both the post and the implementation.