Entries for December 2025

  1. Onur Solmaz · Post · /2025/12/26

    Depth on Demand

    I gave Codex the task of porting an OpenCV tracking algorithm (CSRT) from C++ to Rust, so that I can use it directly in my project without having to cross-compile.

    It one-shotted the task perfectly in an hour, and even developed a GUI on top of it. All I did was provide the original source and the algorithm paper.

    I’ve spent years specializing in writing numerical code (computational mechanics, FEM), and now AI can automate 95% of the low-level grunt work.

    Acquiring these skills involved highly difficult, excruciating intellectual labor spanning many years, very similar to ML research: doing tensor math, writing out the solver code, wondering why your solution is not converging, finally figuring out it was a sign typo after two days.

    Kids these days have it both easy and hard. They can fast-forward through large chunks of the work, but they will never understand things as deeply as someone who wrote the whole thing by hand.

    I guess the more valuable skill now is being able to zoom in and out of abstraction levels quickly when needed: using AI, but recognizing quickly when it fails, learning what needs to be done, fixing it, zooming back out, and repeating. Adaptive learning, a sort of “depth-on-demand”. The quicker you can pick up new skills and knowledge, the more successful you will be.

  2. Onur Solmaz · Post · /2025/12/22

    How to stop AI agents from littering your codebase with Markdown files

    A simple documentation workflow for AI agents.

    For setup instructions, skip to the How to set up SimpleDoc in your repo section.

    If you have used AI agents such as Anthropic’s Claude Code, OpenAI’s Codex, etc., you might have noticed their tendency to create Markdown files at the repository root:

    ...
    ├── API_SPEC.md
    ├── ARCHITECTURE.md
    ├── BACKLOG.md
    ├── CLAUDE.md
    ├── CODE_REVIEW.md
    ├── DECISIONS.md
    ├── ENDPOINTS.md
    ├── IMPLEMENTATION_PLAN.md
    ├── NOTES.md
    ├── QA_CHECKLIST.md
    ├── SECURITY_PLAN.md
    ├── TEST_COVERAGE.md
    ├── TEST_REPORTS.md
    ├── TEST_RESULTS.md
    ├── src/
    │   └── ...
    ...
    

    As of this writing in December 2025, the default behavior of these models is to create capitalized Markdown files at the repository root. This is of course very annoying when you accidentally commit them and they accumulate over time.

    The good news is that this problem is 100% solvable with a simple instruction in your AGENTS.md file:

    **Attention agent!** Before creating ANY documentation, read the docs/HOW_TO_DOC.md file first. It contains guidelines on how to create documentation in this repository.
    

    But what should be in the docs/HOW_TO_DOC.md file, and why is it a separate file? In my opinion, the instructions for solving this problem are too specific to be included in AGENTS.md. It’s generally a good idea not to inject them into every context.

    To solve this problem, I developed a lightweight standard over time for organizing documentation in a codebase. It is framework-agnostic, unopinionated, and designed to be readable and writable by humans as well as agents. I was surprised not to be able to find something similar enough online, crystallized the way I wanted it. So I created a specification myself, called SimpleDoc.

    Basically, it tells the agent to:

    1. Create documentation files in the docs/ folder, with YYYY-MM-DD prefixes and lowercase filenames, like 2025-12-22-an-awesome-doc.md, so that they are chronologically sorted by default.
    2. Always include YAML frontmatter with an author field, so that teammates can see who created a doc without checking git history.
    3. Reserve capitalized filenames for timeless, general files like README.md, INSTALL.md, AGENTS.md, etc. These are much rarer, so the previous rules apply most of the time.
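    The naming and frontmatter rules above can be sketched in a few lines of Python. This is only an illustration of the conventions as described, not the SimpleDoc reference implementation; the function names are mine:

```python
import datetime
import re


def simpledoc_filename(title: str, date: datetime.date) -> str:
    """Build a SimpleDoc-style filename: YYYY-MM-DD prefix, lowercase slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{date.isoformat()}-{slug}.md"


def simpledoc_frontmatter(author: str) -> str:
    """YAML frontmatter identifying the author without consulting git history."""
    return f"---\nauthor: {author}\n---\n"


name = simpledoc_filename("An Awesome Doc", datetime.date(2025, 12, 22))
# name == "2025-12-22-an-awesome-doc.md"
```

    Because the date prefix is ISO 8601, a plain lexicographic directory listing already shows the docs in chronological order.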

    Here is your call to action to check the spec itself: SimpleDoc.

    How to set up SimpleDoc in your repo

    Run the following command from your repo root:

    npx -y @simpledoc/simpledoc migrate
    

    This starts an interactive wizard that will:

    1. Migrate existing Markdown docs to SimpleDoc conventions (move root docs into docs/, rename to YYYY-MM-DD-… using git history, and optionally insert missing YAML frontmatter with per-file authors).
    2. Ensure AGENTS.md contains the reminder line and that docs/HOW_TO_DOC.md exists (created from the bundled SimpleDoc template).
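    Step 1 amounts to something like the following dry-run planner. This is a simplified sketch under my own assumptions, not the wizard’s actual code; in particular, the real wizard reads creation dates from git history, which I stand in for here with the file’s modification time:

```python
import datetime
import pathlib

# Timeless, general files that stay capitalized at the repo root.
KEEP_AT_ROOT = {"README.md", "INSTALL.md", "AGENTS.md", "LICENSE.md"}


def plan_migration(repo: pathlib.Path) -> list[tuple[pathlib.Path, pathlib.Path]]:
    """Return (source, destination) pairs; nothing is moved (a dry run)."""
    moves = []
    docs = repo / "docs"
    for md in sorted(repo.glob("*.md")):
        if md.name in KEEP_AT_ROOT:
            continue
        # Stand-in for the creation date the wizard takes from git history.
        created = datetime.date.fromtimestamp(md.stat().st_mtime)
        slug = md.stem.lower().replace("_", "-")
        moves.append((md, docs / f"{created.isoformat()}-{slug}.md"))
    return moves
```

    Running this against a repo root would, for example, map NOTES.md to something like docs/2025-12-22-notes.md while leaving README.md in place.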

    If you just want to preview what it would change:

    npx -y @simpledoc/simpledoc migrate --dry-run
    

    If you run into issues with the workflow or have suggestions for improvement, you can email me at onur@solmaz.io.

    Happy documenting!

  3. Onur Solmaz · Post · /2025/12/06

    Agentic coding tools should give more control over message queueing

    Below: Why agentic coding tools like Cursor, Claude Code, OpenAI Codex, etc. should implement more ways of letting users queue messages.

    See Peter Steinberger’s tweet where he queues continue 100 times to nudge the GPT-5-Codex model to not stop while working on a predictable, boring and long-running refactor task:

    Tweet embed disabled to avoid requests to X.
    

    This is necessary while working with a model like GPT-5-Codex. The reason is that the model has a tendency to stop generating at certain checkpoints, due to the way it has been trained, even when you instruct it to FINISH IT UNTIL COMPLETION!!1!. So the only way to get it to finish something is to use the message queue.1

    But this isn’t the only use case for queued messages. For example, you can use the model to retrieve files into its context, before starting off a related task. Say you want to find the root cause of a <bug in component X>. Then you can queue

    1. Explain how <component X> works in plain language. Do not omit any details.
    2. Find the root cause of <bug> in <component X>.

    This will generally help the model find the root cause more easily, or make more accurate predictions about it, since it already has context about the component.

    Another example: After exploring a design in a dialogue, you can queue the next steps to implement it.

    <Prior conversation exploring how to design a new feature>

    1. Create an implementation plan for that in the docs/ folder. Include all the details we discussed.
    2. Commit and push the doc.
    3. Implement the feature according to the plan.
    4. Continue implementing the feature until it is done. Ignore this if the task is already completed.
    5. Continue implementing the feature until it is done. Ignore this if the task is already completed.

    … you get the idea.

    I generally queue like this when the feature is specified enough in the conversation already. If it’s underspecified, then the model will make up stuff.

    When I first moved from Claude Code to Codex, the way it implemented queued messages was annoying (more on the difference below). But as I grew accustomed to it, it started to feel a lot like something I saw elsewhere before: chess premoves.

    Chess???

    A premove is a relatively recent invention in chess, made possible by online chess platforms. When the feature is turned on, you don’t need to wait for your opponent to finish their move; instead, you can queue your next move. It is then executed automatically if it is still valid after your opponent’s move.

    If you are fast enough, this lets you move without using up your time in bullet chess, and even lets you queue up entire mate-in-N sequences, resulting in highly entertaining games.
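    The premove mechanics can be modeled in a few lines. This is a toy sketch; the move strings and per-turn legality sets are made up for illustration:

```python
from collections import deque


def resolve_premoves(queued: list[str], legal_each_turn: list[set[str]]) -> list[str]:
    """Play queued premoves turn by turn; a premove executes only if it is
    still legal when the turn arrives, otherwise the rest are discarded."""
    played = []
    moves = deque(queued)
    for legal in legal_each_turn:
        if not moves:
            break
        move = moves.popleft()
        if move not in legal:  # the opponent's reply invalidated the premove
            break              # the whole remaining queue is cancelled
        played.append(move)
    return played


resolve_premoves(["Qh5", "Qxf7#"], [{"Qh5", "Nf3"}, {"Qxf7#"}])
# → ["Qh5", "Qxf7#"]
```

    The key property, mirrored below for message queueing, is that the queue only pays off when each queued action stays valid regardless of what happens in between.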

    I tend to think of message queueing as the same thing: when applied effectively, it saves you a lot of time when you can already predict the next move.

    In other words, you should queue (or premove) when your next choice is decision-insensitive to the information you will receive in the next turn—so waiting wouldn’t change what you do, it would only delay doing it.

    With this perspective, some obvious candidates for queuing in agentic coding are rote tasks that come before and after “serious work”, e.g.:

    • making the agent explain the codebase,
    • creating implementation plans,
    • fixing linting errors,
    • updating documentation during work before starting off a subsequent step,
    • committing and pushing,
    • and so on.

    Different ways CLI agents implement queued messages

    As I mentioned above, Claude Code implements queued messages differently from OpenAI Codex. In fact, I can think of three main approaches in this design space, based on when a user’s new input takes effect:

    1. Post-turn queuing (FIFO2): User messages wait until the current action finishes completely before they’re handled. Example: OpenAI Codex CLI.

    2. Boundary-aware queuing (Soft Interrupt): New messages are inserted at natural breakpoints, like after finishing a tool call, assistant reply or a task in the TODO list. This changes the model’s course of action smoothly, without stopping ongoing generation. Example: Claude Code, Cursor.

    3. Immediate queuing (Hard Interrupt): New user messages immediately stop the current action/generation, discarding ongoing work and restarting the assistant’s generation from scratch. I have not seen any tool that implements this yet, but it could be an option for the impatient.
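    The three approaches differ only in when a mid-turn message takes effect, which a small dispatcher sketch makes concrete. The names here are hypothetical and do not correspond to any real tool’s API:

```python
from enum import Enum, auto


class Mode(Enum):
    POST_TURN = auto()       # FIFO: handle after the whole turn finishes
    BOUNDARY_AWARE = auto()  # inject at the next natural breakpoint
    IMMEDIATE = auto()       # hard interrupt: abort the current action


def handle_user_message(mode: Mode, agent_state: str) -> str:
    """Decide what to do with a message that arrives mid-turn.
    `agent_state` is 'generating' or 'between_tool_calls'."""
    if mode is Mode.IMMEDIATE:
        return "abort_and_restart"
    if mode is Mode.BOUNDARY_AWARE and agent_state == "between_tool_calls":
        return "inject_now"
    return "enqueue"
```

    In this framing, post-turn queuing always enqueues, boundary-aware queuing injects as soon as the agent reaches a breakpoint, and a hard interrupt never waits at all.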

    Why not implement all of them?

    And here is the argument behind the title: When I move away from Claude Code, I miss boundary-aware queueing. When I move away from OpenAI Codex, I miss FIFO queueing.

    I don’t see a reason why we could not implement all of them in all agentic tools. It could be controlled by a key combo like Ctrl+Enter, a submenu, or a button, depending on whether you are in the terminal or not.

    Having the option would definitely make a difference in agentic workflows where you are running 3-4 agents in parallel.

    So if you are reading this and are implementing an agentic coding tool, I would be happy if you took all this into consideration!

    1. Pro tip: Don’t just queue continue by itself, because the model might get loose from its leash and start to make up and execute random tasks, especially after context compaction. Always specify what you want it to continue on, e.g. Continue handling the linting errors until none remain. Ignore this if the task is already completed. 

    2. First-in, first-out.