ToolKami

OpenAI Codex: Tooling Deep-Dive

With the release of GPT-5-Codex (a version of GPT‑5 further optimized for agentic coding in Codex), Codex has become my coding agent of choice:

codex \
  --model 'gpt-5-codex' \
  --full-auto \
  -c model_reasoning_summary_format=experimental \
  --search "$@"

I noticed that web search (--search) is disabled by default and must be explicitly enabled when starting the agent. That prompted me to dig deeper into the tooling setup.

Codex Core Tools

From their thankfully open repository, I learnt that only two tools are enabled by default: shell and view_image. The latter is straightforward: view_image lets the LLM “see” uploaded images in the conversation. The shell tool, however, has some additional and very interesting capabilities.

Shell

The basic mode executes a command and immediately returns the result as JSON:

Codex Basic Shell
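As a rough illustration, the basic mode can be mimicked in a few lines of Python. The JSON field names here (command, exit_code, output) are my own assumptions for the sketch, not the actual Codex schema:

```python
import json
import subprocess

def run_shell(command: str, timeout: int = 10) -> str:
    """Run a command once and package the result as a JSON string."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    # Illustrative response shape; the real tool's schema may differ.
    return json.dumps({
        "command": command,
        "exit_code": result.returncode,
        "output": result.stdout + result.stderr,
    })
```

Each call is fire-and-forget: no state survives between invocations, which is exactly the limitation the interactive modes below address.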

Beyond this, there are two interactive modes that allow for a persistent, streamable shell session keyed by a session_id. The first is unified_exec.

Codex Interactive Shell 1

unified_exec is useful for sessions that operate at the line or command level. For lower-level control, the API splits into exec_command and write_stdin:

Codex Interactive Shell 2

The key distinction is that write_stdin can transmit raw control characters (e.g. \u0003) and streams direct text output instead of JSON, enabling more fine-grained, low-level integration with the shell.
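To make that split concrete, here is a hedged Python sketch of a session store with the same shape: exec_command spawns a long-lived process and hands back a session_id, and write_stdin pushes raw characters (control codes included) into it. This only mimics the behavior described above; it is not the actual Codex implementation, and all names are illustrative:

```python
import subprocess
import uuid

# Session registry: session_id -> live process.
_sessions: dict = {}

def exec_command(cmd: str) -> str:
    """Start a long-lived process and return a session_id for it."""
    proc = subprocess.Popen(
        cmd, shell=True, text=True,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    session_id = str(uuid.uuid4())
    _sessions[session_id] = proc
    return session_id

def write_stdin(session_id: str, data: str) -> None:
    """Send raw characters (e.g. "\x03" for Ctrl-C) to the session."""
    proc = _sessions[session_id]
    proc.stdin.write(data)
    proc.stdin.flush()

def close_session(session_id: str) -> str:
    """Close stdin and return the session's remaining output as raw text."""
    proc = _sessions.pop(session_id)
    out, _ = proc.communicate()
    return out
```

For instance, exec_command("cat") followed by write_stdin(sid, "hello\n") streams the same text straight back when the session is closed, with no JSON envelope in between.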

Interestingly, the code editing tool apply_patch is also just a wrapper around shell.

Apply Patch

Getting LLMs to perform multiple file edits with precision is tricky, as each model favors a different patch specification.

%%bash
apply_patch <<"EOF"
*** Begin Patch
[YOUR_PATCH]
*** End Patch
EOF
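For reference, a filled-in envelope might look like the following. This uses the update-file syntax from OpenAI's published apply_patch format (context hunks marked with @@, lines prefixed with +, -, or a space); treat the exact details as approximate:

```
*** Begin Patch
*** Update File: hello.py
@@
-print("helo")
+print("hello")
*** End Patch
```

Notably, hunks are located by surrounding context rather than by line numbers, which is friendlier to LLMs that cannot reliably count lines.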

Beyond syntax, we also need to consider the implementation.

These choices directly affect whether edits apply reliably across files.

This is OpenAI’s apply_patch implementation.

Before diving into web_search, though, I want to zoom out and reflect on coding agents more broadly.

Coding Agents

It feels like coding agents have been around forever, but in reality, it’s been less than six months since I started experimenting with them—and even tried to build my own.

Back then, I created ToolKami because I wanted a Claude Code–like agent that worked with Gemini 2.5 Pro. I had fun (and some success), and eventually released it as an open-source project. Ironically, in the same month OpenAI announced Codex and Google announced Jules.

ToolKami Timeline

I had to pause ToolKami for about three months due to other priorities. Now, with Codex as my go-to agent, I find myself questioning whether ToolKami should still exist.

There are, however, strong arguments in its favor:

ToolKami Premise

This design also unlocked some neat possibilities.

One area, however, consistently underwhelmed me: web_search.

While web_search is invaluable for pulling up-to-date documentation, a true browse tool goes further. It allows the agent to:

ToolKami VLM

This isn’t just nice-to-have. It’s foundational, and too important not to be a default tool for coding agents.
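As a sketch of one half of such a browse tool, here is a minimal HTML-to-text extractor in Python. All names are my own; a real browse tool (ToolKami's included) would also capture screenshots for a VLM, and this covers only the text path:

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def page_text(html: str) -> str:
    """Reduce fetched HTML to plain text an agent can read."""
    parser = _TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

Feeding fetched pages through something like this is what lets an agent read live documentation rather than just search-result snippets.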

Conclusion

With all this in mind, yes: I'll be resuming work on ToolKami. Every star (or even issue) you leave on GitHub motivates me to keep going.

I’m also planning another post on update_plan, which will cover some context engineering.

Thanks for reading—I hope you found some takeaways, especially if you’re experimenting with Codex yourself.


