<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>HAHWUL</title>
    <link>https://www.hahwul.com</link>
    <description>Offensive Security Engineer, Developer and H4cker.</description>
    <atom:link href="https://www.hahwul.com/rss.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Traveling with Hermes in Japan</title>
      <link>https://www.hahwul.com/posts/2026/traveling-with-hermes-in-japan/</link>
      <guid>https://www.hahwul.com/posts/2026/traveling-with-hermes-in-japan/</guid>
      <description>&lt;p&gt;I spent the past week traveling in Japan. It&apos;s a country I visit every year, so it&apos;s familiar territory, but this year I prepared a little experiment. I set up the Hermes Agent on the Mac Studio at home, and kept directing it through Discord to work on projects while I was away. The results turned out better than I expected, and I wanted to share that experience.&lt;/p&gt;
&lt;p&gt;I sometimes hand the orchestrator role to an AI in my workflows, but for important projects I prefer to stay hands-on. This time, driving Hermes remotely let me make good use of travel and downtime. Of course, I brought my MacBook too, so I still did some work directly late at night or early in the morning.&lt;/p&gt;
&lt;h2 id=&quot;hermes-agent&quot;&gt;Hermes Agent&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://hermes-agent.nousresearch.com&quot;&gt;Hermes&lt;/a&gt; is an open-source AI Agent from Nous Research, and it&apos;s been getting attention as an alternative to OpenClaw. Its core feature is self-improvement. After the Agent finishes a task, it reviews the result itself, remembers patterns, or turns them into new skills it can reuse. The more you use it, the smarter it gets.&lt;/p&gt;
&lt;p&gt;I&apos;ve personally been a fan of the self-learning concept, so I&apos;d been testing it locally since March. This time I finally put it to work on an actual open-source project.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;images/hermes.webp&quot; alt=&quot;Hermes Agent&quot; /&gt;&lt;/p&gt;
&lt;h3 id=&quot;setup&quot;&gt;Setup&lt;/h3&gt;
&lt;p&gt;Hermes supports a variety of providers, so you can configure it however you like. Since I use Claude, Codex, and Gemini subscriptions heavily, I hooked up GitHub Copilot, which had more room to spare, for lighter tasks. (I&apos;d been struggling to burn through Copilot&apos;s monthly quota anyway, so it was a perfect fit.)&lt;/p&gt;
&lt;p&gt;Since the work handled through this provider is lightweight, I mostly used the Sonnet 4.6 model. If you want to save even more on cost, Grok Code Fast would also be a solid choice.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;images/copilot.webp&quot; alt=&quot;Output of hermes model&quot; /&gt;&lt;/p&gt;
&lt;p&gt;For the messaging gateway, I went with &lt;a href=&quot;https://hermes-agent.nousresearch.com/docs/user-guide/messaging/discord&quot;&gt;Discord&lt;/a&gt;. You create a Discord bot and register its token with Hermes.&lt;/p&gt;
&lt;p&gt;Personally, I think the messaging channel is critical from a security perspective. Discord lets you control bot permissions in fine detail, and Hermes&apos; settings let you restrict access to specific user IDs via &lt;code&gt;DISCORD_ALLOWED_USERS&lt;/code&gt;. With this setup, the bot only carries the permissions it actually needs, and random users can&apos;t just invoke it at will.&lt;/p&gt;
&lt;p&gt;If you look at &lt;code&gt;~/.hermes/.env&lt;/code&gt;, it&apos;s defined like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-toml hljs&quot;&gt;DISCORD_BOT_TOKEN=********
DISCORD_ALLOWED_USERS=********
DISCORD_HOME_CHANNEL=********
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&quot;workflow&quot;&gt;Workflow&lt;/h3&gt;
&lt;p&gt;Coding requires precision, and Hermes (Copilot) alone isn&apos;t quite enough in terms of quality, so most of the real work got delegated to Claude Code, Codex, and Gemini. Early on I had to give multiple directions, but once the conversation built up some context, Hermes started routing to the appropriate model (Claude, Codex, Gemini) on its own depending on the purpose, which I found especially appealing. Copilot was mostly used for the chat channel and light tasks.&lt;/p&gt;
&lt;div class=&quot;mermaid&quot;&gt;flowchart LR
    A[Me] --&amp;gt;|Send Message| B(Hermes Agent with GitHub Copilot)
    B --&amp;gt; C{Thinking}
    C --&amp;gt;|Write Code| D[Claude Code]
    C --&amp;gt;|Code Refactoring| E[Codex]
    C --&amp;gt;|Design Task| F[Gemini]
&lt;/div&gt;
&lt;h3 id=&quot;block-macos-sleep&quot;&gt;Block macOS sleep&lt;/h3&gt;
&lt;p&gt;Leaving the Mac Studio idle for long periods puts it to sleep automatically, so I needed to prevent that. I handled it with a small app I&apos;ve been developing myself called &lt;a href=&quot;https://apps.apple.com/kr/app/nodecaf/id6762029386?l=en-GB&amp;amp;mt=12&quot;&gt;NoDecaf&lt;/a&gt;, which was nice because it doubled as a real-world test.&lt;/p&gt;
&lt;h2 id=&quot;in-japan&quot;&gt;In Japan&lt;/h2&gt;
&lt;p&gt;Before leaving, I finished plenty of testing and dropped the main work instructions into Discord, then went off to enjoy the trip. Feedback requests and permission approval notifications came in occasionally, but nothing overwhelming. I could check and handle them pretty casually.&lt;/p&gt;
&lt;div class=&quot;images-full-width&quot;&gt;


&lt;div class=&quot;images-grid&quot;&gt;
    
    &lt;div class=&quot;images-grid-item&quot;&gt;
        &lt;img src=&quot;images/1.webp&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;
    &lt;/div&gt;
    
    &lt;div class=&quot;images-grid-item&quot;&gt;
        &lt;img src=&quot;images/2.webp&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;
    &lt;/div&gt;
    
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Once the conversation had built up some history, I could keep my instructions brief and Hermes would still handle things well, which was satisfying. One moment that honestly moved me a little was watching it notice a usage limit and queue up a pending task on its own :D&lt;/p&gt;
&lt;h2 id=&quot;generated-skill&quot;&gt;Generated Skill&lt;/h2&gt;
&lt;p&gt;I could see the skills Hermes had automatically generated from its workflow. The &lt;code&gt;hwaro-examples-batch&lt;/code&gt; skill visible in the image above is a good example. It studied the patterns of tasks I frequently assign around &lt;a href=&quot;https://github.com/hahwul/hwaro-examples&quot;&gt;hahwul/hwaro-examples&lt;/a&gt; and turned them into a skill on its own.&lt;/p&gt;
&lt;p&gt;Skills are stored under &lt;code&gt;~/.hermes/skills&lt;/code&gt;, with built-in skills and user-generated ones managed together. &lt;code&gt;hwaro-examples-batch&lt;/code&gt; lived at &lt;code&gt;~/.hermes/skills/github/hwaro-examples-batch&lt;/code&gt;, and looking at its SKILL.md, it documented the trigger phrases I usually use, the local clone path (interestingly, even though I told it a specific path was fine to use, it set up a separate one — probably to avoid conflicts with the user&apos;s own workspace), and the scripts needed to carry out the task.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;images/hwaro-example.webp&quot; alt=&quot;Screenshot of SKILL&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Since skills are built from work that was actually performed, generating them automatically from the Agent&apos;s workflow is starting to feel like a better choice than writing skills by hand. I&apos;ll have to keep running Hermes and squeeze out more skills.&lt;/p&gt;
&lt;h2 id=&quot;jules&quot;&gt;Jules&lt;/h2&gt;
&lt;p&gt;Separately from Hermes, &lt;a href=&quot;https://jules.google&quot;&gt;Jules&lt;/a&gt; was also running continuously on autopilot during the trip. I keep Jules on lighter tasks, and because it&apos;s purely cloud-based, I can run it comfortably from home or on the road. A few scheduled jobs are set up on the hwaro-examples side, periodically identifying pages with problems, fixing them, and sending PRs. The PRs it opens then get handled by Hermes using Codex.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;images/jules.webp&quot; alt=&quot;Jules&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Once you set up the working environment in Jules&apos; Environment settings, it behaves much like running on an actual local PC.&lt;/p&gt;
&lt;h2 id=&quot;areas-for-improvement&quot;&gt;Areas for Improvement&lt;/h2&gt;
&lt;p&gt;Convenience comes with security risks. One thing I felt going through this flow is that permission separation matters. In particular, for high-privilege accounts like GitHub, I think it&apos;s worth creating a separate dedicated account just for the Agent.&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Since I usually do most of my work sitting right at my Mac, my biggest question was whether I could actually use AI effectively from just my phone while traveling. It turned out to be more comfortable than I expected, and the setup seems to save a lot of time during transit, so once vacation is over I might try applying this flow to my commute as well. Serious work is still better when I&apos;m hands-on (easier for me, easier for the AI, better results), but anything that can be verified through skills (e.g., development with tight test coverage, tasks that don&apos;t need broad permissions) seems better handled through Hermes in the gaps.&lt;/p&gt;
&lt;p&gt;Being able to fully enjoy my own time while small tasks keep ticking along is genuinely appealing. Raising an Agent with self-improvement like Hermes also feels like a pretty meaningful experience. If it sounds interesting, I&apos;d recommend giving it a try.&lt;/p&gt;
</description>
      <pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Building AI-Friendly CLIs</title>
      <link>https://www.hahwul.com/posts/2026/building-ai-friendly-clis/</link>
      <guid>https://www.hahwul.com/posts/2026/building-ai-friendly-clis/</guid>
      <description>&lt;p&gt;These days, AI agents are writing code, calling tools, and even handling deployments. With that shift, the CLI is getting attention again. GUIs and web dashboards are great for humans, but from an AI agent&apos;s perspective, CLIs are a much easier interface to work with.&lt;/p&gt;
&lt;p&gt;The thing is, most existing CLIs were designed for humans. Pretty table outputs, color codes, shorthand flags. All nice for human eyes, but painful for agents to parse. Output formats subtly change between versions, and figuring out the exact usage often means reading separate documentation.&lt;/p&gt;
&lt;p&gt;This week, I did a major overhaul of our team&apos;s internal CLI at work, rebuilding it around JSON I/O and schema commands. The agent&apos;s task success rate jumped noticeably. Based on that experience, I want to share some thoughts on how to build AI-friendly CLIs.&lt;/p&gt;
&lt;h2 id=&quot;why-json-first-input-output-matters-for-ai-agents&quot;&gt;Why JSON-First Input / Output Matters for AI Agents&lt;/h2&gt;
&lt;h3 id=&quot;human-vs-ai-cli-usage-patterns&quot;&gt;Human vs AI CLI Usage Patterns&lt;/h3&gt;
&lt;p&gt;When humans use a CLI, they check &lt;code&gt;--help&lt;/code&gt;, read man pages, eyeball error messages, and iterate through trial and error. Even if the output changes a bit, they adapt by reading the context. AI agents, on the other hand, take the output as a raw string and process it literally. They have to parse table-formatted output with regex, and the moment column order shifts or line breaks change, things break immediately.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash hljs&quot;&gt;# This is convenient for humans, but...
$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
my-app-7d4b8c6f5-x2k9z  1/1     Running   0          3d

# This is what agents need
$ kubectl get pods -o json
{
  &amp;quot;items&amp;quot;: [{
    &amp;quot;metadata&amp;quot;: {&amp;quot;name&amp;quot;: &amp;quot;my-app-7d4b8c6f5-x2k9z&amp;quot;},
    &amp;quot;status&amp;quot;: {&amp;quot;phase&amp;quot;: &amp;quot;Running&amp;quot;, &amp;quot;containerStatuses&amp;quot;: [{&amp;quot;ready&amp;quot;: true}]}
  }]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&quot;the-pain-of-unstructured-text-output&quot;&gt;The Pain of Unstructured Text Output&lt;/h3&gt;
&lt;p&gt;Unstructured text output causes more problems for agents than expected. Here are some patterns I actually ran into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Inconsistent parsing: sometimes the output has headers, sometimes it doesn&apos;t&lt;/li&gt;
&lt;li&gt;Locale dependency: date/number formats change based on system locale&lt;/li&gt;
&lt;li&gt;Color code pollution: ANSI escape codes sneak in and break string comparisons&lt;/li&gt;
&lt;li&gt;Progress bar collisions: stderr and stdout get mixed up, garbling the output&lt;/li&gt;
&lt;li&gt;Silent truncation: long values get clipped to &lt;code&gt;...&lt;/code&gt; with no way to detect it&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When you start handling these edge cases one by one, your agent code ends up buried in CLI parsing logic instead of actual work.&lt;/p&gt;
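&lt;p&gt;To make the color-code item from the list above concrete, here is a minimal Python sketch. The sample string and the &lt;code&gt;strip_ansi&lt;/code&gt; helper are illustrative assumptions, not output or code from a real tool.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import re

# Matches SGR color sequences such as ESC[32m and ESC[0m.
ANSI_SGR = re.compile(r'\x1b\[[0-9;]*m')

def strip_ansi(text):
    # Remove color escape sequences so comparisons see plain text.
    return ANSI_SGR.sub('', text)

# Hypothetical status value as a colorized CLI might print it.
colored = '\x1b[32mRunning\x1b[0m'

print(colored == 'Running')              # False: escape codes are part of the string
print(strip_ansi(colored) == 'Running')  # True once the codes are stripped
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every one of the patterns above needs a workaround of this kind on the agent side, which is how the parsing logic piles up.&lt;/p&gt;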
&lt;h3 id=&quot;advantages-of-json&quot;&gt;Advantages of JSON&lt;/h3&gt;
&lt;p&gt;JSON I/O solves most of these.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Type safety: numbers are numbers, strings are strings. You can distinguish &lt;code&gt;&amp;quot;3&amp;quot;&lt;/code&gt; from &lt;code&gt;3&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Schema-based validation: define and validate input/output shapes upfront with JSON Schema&lt;/li&gt;
&lt;li&gt;Easy chaining: instantly parseable by &lt;code&gt;jq&lt;/code&gt;, pipelines, and any programming language&lt;/li&gt;
&lt;li&gt;Consistency: identical output regardless of locale or terminal settings&lt;/li&gt;
&lt;li&gt;Structured errors: return errors as JSON so agents can identify error types and respond appropriately&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-json hljs&quot;&gt;{
  &amp;quot;error&amp;quot;: {
    &amp;quot;code&amp;quot;: &amp;quot;RESOURCE_NOT_FOUND&amp;quot;,
    &amp;quot;message&amp;quot;: &amp;quot;Pod &apos;my-app&apos; not found in namespace &apos;default&apos;&amp;quot;,
    &amp;quot;suggestions&amp;quot;: [&amp;quot;Check namespace with --namespace flag&amp;quot;]
  }
}
&lt;/code&gt;&lt;/pre&gt;
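&lt;p&gt;As a rough sketch of how an agent might consume an error shaped like the one above: it can branch on the &lt;code&gt;code&lt;/code&gt; field instead of matching message text. The action names returned by &lt;code&gt;next_action&lt;/code&gt; are hypothetical and only illustrate the branching.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import json

# Hypothetical error payload in the shape shown above.
raw = json.dumps({
    'error': {
        'code': 'RESOURCE_NOT_FOUND',
        'message': 'Pod my-app not found in namespace default',
        'suggestions': ['Check namespace with --namespace flag'],
    }
})

def next_action(response_text):
    # Decide what to do from the error code, not from the message wording.
    body = json.loads(response_text)
    error = body.get('error')
    if error is None:
        return 'continue'
    if error['code'] == 'RESOURCE_NOT_FOUND':
        return 'retry_with_namespace'
    return 'give_up'

print(next_action(raw))  # retry_with_namespace
&lt;/code&gt;&lt;/pre&gt;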
&lt;h3 id=&quot;real-world-test-results-from-our-project&quot;&gt;Real-World Test Results from Our Project&lt;/h3&gt;
&lt;p&gt;Here&apos;s a quick summary of what changed after adding a &lt;code&gt;--json&lt;/code&gt; flag to the CLI and having agents perform the same tasks:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Text Output&lt;/th&gt;
&lt;th&gt;JSON Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Task success rate&lt;/td&gt;
&lt;td&gt;Around 60%&lt;/td&gt;
&lt;td&gt;Around 90%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Average retries&lt;/td&gt;
&lt;td&gt;2.3 times&lt;/td&gt;
&lt;td&gt;0.4 times&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parsing-related errors&lt;/td&gt;
&lt;td&gt;41% of all errors&lt;/td&gt;
&lt;td&gt;Nearly 0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;These numbers are from specific tasks I tested, so your mileage may vary. Still, I was surprised that switching to JSON alone made this much difference.&lt;/p&gt;
&lt;h2 id=&quot;the-schema-command-letting-ai-learn-and-adapt-at-runtime&quot;&gt;The Schema Command – Letting AI Learn and Adapt at Runtime&lt;/h2&gt;
&lt;p&gt;JSON I/O alone is a huge improvement, but there&apos;s a way to take it one step further: the schema command.&lt;/p&gt;
&lt;h3 id=&quot;inspiration-from-google-workspace-cli-gws&quot;&gt;Inspiration from Google Workspace CLI (gws)&lt;/h3&gt;
&lt;p&gt;This idea was inspired by &lt;a href=&quot;https://github.com/googleworkspace/cli&quot;&gt;Google Workspace CLI (gws)&lt;/a&gt;. gws has a structure that lets you query schema information per resource at runtime. Looking at that, I thought: &amp;quot;Why not let the agent ask the CLI directly instead of reading documentation?&amp;quot;&lt;/p&gt;
&lt;h3 id=&quot;how-the-schema-subcommand-works&quot;&gt;How the Schema Subcommand Works&lt;/h3&gt;
&lt;p&gt;The concept is simple. Add a &lt;code&gt;schema&lt;/code&gt; subcommand to your CLI — specify a resource and action, and it returns the JSON Schema for that command&apos;s input and output.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash hljs&quot;&gt;$ mytool schema user.create
{
  &amp;quot;$schema&amp;quot;: &amp;quot;http://json-schema.org/draft-07/schema#&amp;quot;,
  &amp;quot;description&amp;quot;: &amp;quot;Create a new user&amp;quot;,
  &amp;quot;input&amp;quot;: {
    &amp;quot;type&amp;quot;: &amp;quot;object&amp;quot;,
    &amp;quot;required&amp;quot;: [&amp;quot;email&amp;quot;, &amp;quot;role&amp;quot;],
    &amp;quot;properties&amp;quot;: {
      &amp;quot;email&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;, &amp;quot;format&amp;quot;: &amp;quot;email&amp;quot;},
      &amp;quot;role&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;, &amp;quot;enum&amp;quot;: [&amp;quot;admin&amp;quot;, &amp;quot;member&amp;quot;, &amp;quot;viewer&amp;quot;]},
      &amp;quot;name&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;, &amp;quot;maxLength&amp;quot;: 100}
    }
  },
  &amp;quot;output&amp;quot;: {
    &amp;quot;type&amp;quot;: &amp;quot;object&amp;quot;,
    &amp;quot;properties&amp;quot;: {
      &amp;quot;id&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;, &amp;quot;format&amp;quot;: &amp;quot;uuid&amp;quot;},
      &amp;quot;email&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;},
      &amp;quot;role&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;},
      &amp;quot;created_at&amp;quot;: {&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;, &amp;quot;format&amp;quot;: &amp;quot;date-time&amp;quot;}
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Even better if you can also query the list of available resources:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash hljs&quot;&gt;$ mytool schema --list
[&amp;quot;user.create&amp;quot;, &amp;quot;user.delete&amp;quot;, &amp;quot;user.get&amp;quot;, &amp;quot;user.list&amp;quot;, &amp;quot;project.create&amp;quot;, ...]
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&quot;benefits-for-agents&quot;&gt;Benefits for Agents&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;No external docs needed: agents can query the CLI directly and construct accurate inputs&lt;/li&gt;
&lt;li&gt;Auto-adapts to API changes: when the CLI updates, the schema updates with it, so agents always work against the latest spec&lt;/li&gt;
&lt;li&gt;Pairs with dry-run: build input from the schema, validate with &lt;code&gt;--dry-run&lt;/code&gt;, then execute&lt;/li&gt;
&lt;li&gt;Self-describing: the CLI can describe itself without needing a separate AGENTS.md or tool description&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&quot;our-implementation-overview&quot;&gt;Our Implementation Overview&lt;/h3&gt;
&lt;p&gt;In our project, we implemented it with the following structure:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Define input/output schemas on each command handler (based on Pydantic models)&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;schema&lt;/code&gt; subcommand serializes these into JSON Schema and returns them&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;--list&lt;/code&gt; option allows browsing the full resource/action tree&lt;/li&gt;
&lt;li&gt;Include an &lt;code&gt;examples&lt;/code&gt; field in schema responses so agents have something to reference&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The implementation itself wasn&apos;t particularly difficult. Since we were already defining input/output models with Pydantic, most of it was solved just by calling &lt;code&gt;.model_json_schema()&lt;/code&gt;.&lt;/p&gt;
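&lt;p&gt;The structure above can be sketched roughly as follows. This is not our actual code: to keep it runnable with only the standard library, it uses dataclasses and a tiny hand-written &lt;code&gt;to_schema&lt;/code&gt; function as a stand-in for Pydantic models and &lt;code&gt;.model_json_schema()&lt;/code&gt;. The names &lt;code&gt;UserCreateInput&lt;/code&gt;, &lt;code&gt;REGISTRY&lt;/code&gt;, and &lt;code&gt;schema_command&lt;/code&gt; are assumptions made for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import json
from dataclasses import MISSING, dataclass, fields

# Maps Python annotations to JSON Schema type names (deliberately tiny).
JSON_TYPES = {str: 'string', int: 'integer', bool: 'boolean', float: 'number'}

@dataclass
class UserCreateInput:
    email: str
    role: str
    name: str = ''

def to_schema(model):
    # Fields without a default are treated as required.
    props = {f.name: {'type': JSON_TYPES[f.type]} for f in fields(model)}
    required = [f.name for f in fields(model) if f.default is MISSING]
    return {'type': 'object', 'required': required, 'properties': props}

# Registry that a schema subcommand could serve from.
REGISTRY = {'user.create': UserCreateInput}

def schema_command(target):
    # Returns the JSON text a schema subcommand would print.
    if target == '--list':
        return json.dumps(sorted(REGISTRY))
    return json.dumps({'input': to_schema(REGISTRY[target])})

print(schema_command('--list'))
print(schema_command('user.create'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point of the registry is that the handler and the schema output read from the same model, so they cannot drift apart.&lt;/p&gt;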
&lt;h3 id=&quot;example-agent-workflow-using-schema&quot;&gt;Example Agent Workflow Using Schema&lt;/h3&gt;
&lt;p&gt;Here&apos;s what the actual flow looks like when an agent uses the schema:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;1. Agent: mytool schema --list
   → Check available commands

2. Agent: mytool schema user.create
   → Check input schema (required fields: email, role)

3. Agent: mytool user create --json &apos;{&amp;quot;email&amp;quot;:&amp;quot;new@example.com&amp;quot;,&amp;quot;role&amp;quot;:&amp;quot;member&amp;quot;}&apos; --dry-run
   → Validate before execution

4. Agent: mytool user create --json &apos;{&amp;quot;email&amp;quot;:&amp;quot;new@example.com&amp;quot;,&amp;quot;role&amp;quot;:&amp;quot;member&amp;quot;}&apos;
   → Execute, receive JSON response

5. Agent: Use the id field from the response for the next task
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Throughout this entire flow, the agent never references documentation once. The CLI itself serves as the documentation.&lt;/p&gt;
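&lt;p&gt;Between steps 2 and 3, an agent can also check its own payload against the schema before calling the CLI at all. A minimal sketch, assuming a trimmed copy of the &lt;code&gt;user.create&lt;/code&gt; input schema; the &lt;code&gt;preflight&lt;/code&gt; helper and the &lt;code&gt;nickname&lt;/code&gt; field are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;# Trimmed copy of the user.create input schema shown earlier.
schema = {
    'required': ['email', 'role'],
    'properties': {
        'email': {'type': 'string'},
        'role': {'type': 'string'},
        'name': {'type': 'string'},
    },
}

def preflight(payload, schema):
    # Collect every problem instead of stopping at the first one.
    problems = []
    for field in schema['required']:
        if field not in payload:
            problems.append('missing required field: ' + field)
    for field in payload:
        if field not in schema['properties']:
            problems.append('unknown field: ' + field)
    return problems

print(preflight({'email': 'new@example.com', 'role': 'member'}, schema))  # []
print(preflight({'role': 'member', 'nickname': 'neo'}, schema))
# ['missing required field: email', 'unknown field: nickname']
&lt;/code&gt;&lt;/pre&gt;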
&lt;h2 id=&quot;practical-design-patterns&quot;&gt;Practical Design Patterns&lt;/h2&gt;
&lt;p&gt;Some patterns worth considering when applying JSON I/O and Schema.&lt;/p&gt;
&lt;h3 id=&quot;input-design-choices&quot;&gt;Input Design Choices&lt;/h3&gt;
&lt;p&gt;There are roughly three input approaches; pick one based on the situation:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;stdin JSON&lt;/td&gt;
&lt;td&gt;&lt;code&gt;echo &amp;#39;{&amp;quot;key&amp;quot;:&amp;quot;val&amp;quot;}&amp;#39; | mytool create&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Large payloads, pipeline chaining&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;argument JSON&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mytool create --json &amp;#39;{&amp;quot;key&amp;quot;:&amp;quot;val&amp;quot;}&amp;#39;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Single command execution, keeping it in shell history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mytool create --name foo --json &amp;#39;{&amp;quot;extra&amp;quot;:&amp;quot;opts&amp;quot;}&amp;#39;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Frequently used options as flags, the rest as JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Personally, I&apos;d recommend argument JSON as the default with stdin support as well. From an agent&apos;s perspective, a self-contained single command is the easiest to work with.&lt;/p&gt;
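&lt;p&gt;One way to support both forms is to prefer the argument and fall back to stdin. The sketch below uses Python&apos;s &lt;code&gt;argparse&lt;/code&gt;; the &lt;code&gt;read_payload&lt;/code&gt; function and the sample payload are assumptions for illustration, not our tool&apos;s real code.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import argparse
import io
import json
import sys

def read_payload(argv, stdin=sys.stdin):
    parser = argparse.ArgumentParser(prog='mytool create')
    parser.add_argument('--json', dest='json_arg', default=None)
    args = parser.parse_args(argv)
    # Prefer the self-contained argument form, fall back to stdin.
    text = args.json_arg if args.json_arg is not None else stdin.read()
    return json.loads(text)

body = json.dumps({'key': 'val'})

# Argument JSON: everything the command needs is in one line.
print(read_payload(['--json', body]))
# stdin JSON: simulated here with an in-memory stream.
print(read_payload([], stdin=io.StringIO(body)))
&lt;/code&gt;&lt;/pre&gt;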
&lt;h3 id=&quot;output-design-best-practices&quot;&gt;Output Design Best Practices&lt;/h3&gt;
&lt;p&gt;A few important principles for output design:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--json&lt;/code&gt; flag: keep the default human-readable, but return structured output when &lt;code&gt;--json&lt;/code&gt; is passed&lt;/li&gt;
&lt;li&gt;NDJSON support: for streaming scenarios (logs, events, etc.), support line-delimited JSON&lt;/li&gt;
&lt;li&gt;Errors in JSON too: in &lt;code&gt;--json&lt;/code&gt; mode, errors should also be returned as JSON, alongside exit codes&lt;/li&gt;
&lt;li&gt;Include metadata: pagination info, request IDs, timestamps should be part of the response&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash hljs&quot;&gt;# Normal mode
$ mytool user list
EMAIL              ROLE     CREATED
alice@example.com  admin    2026-01-15
bob@example.com    member   2026-02-20

# JSON mode
$ mytool user list --json
{
  &amp;quot;data&amp;quot;: [
    {&amp;quot;email&amp;quot;: &amp;quot;alice@example.com&amp;quot;, &amp;quot;role&amp;quot;: &amp;quot;admin&amp;quot;, &amp;quot;created_at&amp;quot;: &amp;quot;2026-01-15T00:00:00Z&amp;quot;},
    {&amp;quot;email&amp;quot;: &amp;quot;bob@example.com&amp;quot;, &amp;quot;role&amp;quot;: &amp;quot;member&amp;quot;, &amp;quot;created_at&amp;quot;: &amp;quot;2026-02-20T00:00:00Z&amp;quot;}
  ],
  &amp;quot;meta&amp;quot;: {&amp;quot;total&amp;quot;: 2, &amp;quot;page&amp;quot;: 1, &amp;quot;per_page&amp;quot;: 50}
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the end, I actually went with JSON output as the default and added a &lt;code&gt;--no-json&lt;/code&gt; flag instead. For a tool that isn&apos;t meant for human use, unifying all I/O as JSON gave us the best agent success rate.&lt;/p&gt;
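&lt;p&gt;For the NDJSON item in the list above, the idea is one complete JSON object per line, so a consumer can parse each line as it arrives. A minimal sketch with hypothetical event records and helper names:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import json

def to_ndjson(events):
    # One compact JSON object per line; json.dumps escapes embedded newlines.
    return ''.join(json.dumps(event) + '\n' for event in events)

def read_ndjson(text):
    # Each non-empty line parses on its own, with no need to buffer the whole stream.
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical log events.
events = [
    {'level': 'info', 'msg': 'started'},
    {'level': 'error', 'msg': 'timeout'},
]

stream = to_ndjson(events)
print(stream, end='')
print(read_ndjson(stream) == events)  # True
&lt;/code&gt;&lt;/pre&gt;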
&lt;h3 id=&quot;versioning-amp-backward-compatibility&quot;&gt;Versioning &amp;amp; Backward Compatibility&lt;/h3&gt;
&lt;p&gt;Versioning JSON output needs some care. A few strategies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Adding fields is safe, removing/changing is not: adding new fields doesn&apos;t break backward compatibility, but removing fields or changing types can break agents&lt;/li&gt;
&lt;li&gt;Include a version field: putting something like &lt;code&gt;&amp;quot;api_version&amp;quot;: &amp;quot;v1&amp;quot;&lt;/code&gt; in the response lets agents branch based on version&lt;/li&gt;
&lt;li&gt;Deprecation warnings: fields slated for removal should be flagged in a separate warnings array&lt;/li&gt;
&lt;/ul&gt;
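&lt;p&gt;On the agent side, these strategies might be consumed like this. The sketch assumes a response that carries &lt;code&gt;api_version&lt;/code&gt; and a &lt;code&gt;warnings&lt;/code&gt; array; the &lt;code&gt;unpack&lt;/code&gt; helper, the field names other than &lt;code&gt;api_version&lt;/code&gt;, and the warning text are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import json

# Versions this agent knows how to read.
SUPPORTED_VERSIONS = {'v1'}

def unpack(text):
    body = json.loads(text)
    version = body.get('api_version')
    if version not in SUPPORTED_VERSIONS:
        # Fail loudly instead of guessing at an unknown response shape.
        raise ValueError('unsupported api_version: ' + str(version))
    # Deprecation notices travel in their own array, apart from the data.
    return body.get('data'), body.get('warnings', [])

response = json.dumps({
    'api_version': 'v1',
    'data': {'email': 'alice@example.com'},
    'warnings': ['field created is deprecated, use created_at'],
})

data, notices = unpack(response)
print(data)
print(notices)
&lt;/code&gt;&lt;/pre&gt;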
&lt;h3 id=&quot;using-pydantic-zod-json-schema-for-validation&quot;&gt;Using Pydantic / Zod / JSON Schema for Validation&lt;/h3&gt;
&lt;p&gt;Some tools for schema definition:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Python&lt;/strong&gt;: Pydantic is the most convenient. Model definition → automatic JSON Schema generation → input validation, all in one&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TypeScript/Node&lt;/strong&gt;: Define schemas with Zod and convert using &lt;code&gt;zod-to-json-schema&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Go/Rust etc.&lt;/strong&gt;: Write JSON Schema files directly or use code-generation libraries&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The key point is that the type definitions used in your code and the schema returned by the schema command must come from the same source. If these are managed separately, they will inevitably drift out of sync.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Truth be told, we didn&apos;t pay much attention to this in our project. And it led to quite a few failures.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id=&quot;ai-friendly-helper-flags&quot;&gt;AI-Friendly Helper Flags&lt;/h3&gt;
&lt;p&gt;Beyond schema and JSON, a few more agent-friendly flags worth adding:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--dry-run&lt;/code&gt;: preview results without actually executing. Lets agents safely test things out&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--explain&lt;/code&gt;: describe what the command will do in natural language. Helps with agent planning&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--output-format&lt;/code&gt;: choose between json, yaml, csv, etc.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--quiet&lt;/code&gt;: strip unnecessary banners and warnings, return only essential output&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--no-color&lt;/code&gt;: remove ANSI escape codes (honestly, every CLI should have this)&lt;/li&gt;
&lt;/ul&gt;
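&lt;p&gt;As an example of the first flag, here is a minimal &lt;code&gt;--dry-run&lt;/code&gt; sketch using &lt;code&gt;argparse&lt;/code&gt;. The &lt;code&gt;run&lt;/code&gt; function and the response keys (&lt;code&gt;would_create&lt;/code&gt;, &lt;code&gt;created&lt;/code&gt;) are assumptions for illustration; a real handler would perform the creation where the comment indicates.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import argparse
import json

def run(argv):
    parser = argparse.ArgumentParser(prog='mytool user create')
    parser.add_argument('--json', dest='payload', required=True)
    parser.add_argument('--dry-run', action='store_true')
    args = parser.parse_args(argv)
    payload = json.loads(args.payload)
    if args.dry_run:
        # Report the planned action without performing it.
        return {'dry_run': True, 'would_create': payload}
    # A real handler would create the user here.
    return {'dry_run': False, 'created': payload}

body = json.dumps({'email': 'new@example.com', 'role': 'member'})
print(run(['--json', body, '--dry-run']))
print(run(['--json', body]))
&lt;/code&gt;&lt;/pre&gt;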
&lt;h2 id=&quot;results-lessons-and-caveats-after-adoption&quot;&gt;Results, Lessons, and Caveats After Adoption&lt;/h2&gt;
&lt;h3 id=&quot;quantitative-amp-qualitative-outcomes&quot;&gt;Quantitative &amp;amp; Qualitative Outcomes&lt;/h3&gt;
&lt;p&gt;I shared the JSON transition results earlier. Here&apos;s what changed after adding the schema command:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;JSON Only&lt;/th&gt;
&lt;th&gt;JSON + Schema&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Task success rate&lt;/td&gt;
&lt;td&gt;Around 90%&lt;/td&gt;
&lt;td&gt;~97% (almost all succeeded)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent first-try accuracy&lt;/td&gt;
&lt;td&gt;~70%&lt;/td&gt;
&lt;td&gt;~90% (honestly, this was the biggest improvement)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doc references needed&lt;/td&gt;
&lt;td&gt;Avg 1.2 per task&lt;/td&gt;
&lt;td&gt;Nearly 0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The sample size wasn&apos;t huge, so take the numbers with a grain of salt. But more importantly, the agent code got much simpler. Parsing logic disappeared and we could focus on business logic.&lt;/p&gt;
&lt;h3 id=&quot;common-failure-patterns-we-observed&quot;&gt;Common Failure Patterns We Observed&lt;/h3&gt;
&lt;p&gt;It&apos;s not perfect though. Common failure patterns and their fixes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Oversized JSON responses: when a list API returns thousands of items, it blows the agent&apos;s context window. Pagination and filtering are essential&lt;/li&gt;
&lt;li&gt;Deeply nested structures: JSON nested 5+ levels deep is hard for agents to navigate accurately. Keep it flat when possible&lt;/li&gt;
&lt;li&gt;Enum value errors: even with enums defined in the schema, agents sometimes insert similar but incorrect values. Input validation + clear error messages help&lt;/li&gt;
&lt;li&gt;Optional vs. required confusion: agents sometimes skip required fields. Mark required fields clearly in the schema and tell them which fields are missing in error messages&lt;/li&gt;
&lt;/ul&gt;
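&lt;p&gt;For the enum case, a clear error message can include the allowed values and a near-match hint. A minimal sketch using the standard library&apos;s &lt;code&gt;difflib.get_close_matches&lt;/code&gt;; the &lt;code&gt;INVALID_ENUM&lt;/code&gt; code and the &lt;code&gt;validate_role&lt;/code&gt; helper are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import difflib

ROLES = ['admin', 'member', 'viewer']

def validate_role(value):
    # Returns None when valid, otherwise a structured error with a hint.
    if value in ROLES:
        return None
    error = {'code': 'INVALID_ENUM', 'field': 'role', 'allowed': ROLES}
    close = difflib.get_close_matches(value, ROLES, n=1)
    if close:
        error['suggestions'] = ['Did you mean ' + close[0] + '?']
    return {'error': error}

print(validate_role('member'))   # None
print(validate_role('members'))  # error that suggests member
&lt;/code&gt;&lt;/pre&gt;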
&lt;h3 id=&quot;remaining-challenges&quot;&gt;Remaining Challenges&lt;/h3&gt;
&lt;p&gt;Some unsolved problems remain:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Complex pagination: getting agents to handle cursor-based pagination smoothly remains tricky&lt;/li&gt;
&lt;li&gt;Binary data: file uploads/downloads and other binary data are hard to express cleanly in JSON&lt;/li&gt;
&lt;li&gt;Long-running operations: tracking status and handling timeouts for tasks that take several minutes is still awkward&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I still haven&apos;t found great answers for these. The long-running operations problem is particularly interesting. Most agents end up polling with their own sleep loops, which is pretty inefficient. Ideally, agents should be able to receive callbacks, but that&apos;s not easy to pull off.&lt;/p&gt;
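&lt;p&gt;For reference, the polling pattern looks roughly like the sketch below. It does not solve the callback problem; growing the delay between checks only reduces the number of wasted calls. The status names and the &lt;code&gt;wait_for&lt;/code&gt; helper are hypothetical, and the demo swaps in a fake status source and a fake sleep so it finishes instantly.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python hljs&quot;&gt;import time

def wait_for(get_status, delays=(1, 2, 4, 8), sleep=time.sleep):
    # Poll with growing delays instead of a fixed tight loop.
    for delay in delays:
        status = get_status()
        if status in ('succeeded', 'failed'):
            return status
        sleep(delay)
    # Give up once the delay budget is spent.
    return 'timeout'

# Fake status source and fake sleep, for demonstration only.
fake = iter(['pending', 'running', 'succeeded'])
waited = []

print(wait_for(lambda: next(fake), sleep=waited.append))  # succeeded
print(waited)  # [1, 2]
&lt;/code&gt;&lt;/pre&gt;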
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Two essentials for building AI-friendly CLIs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;JSON-First I/O: structure your inputs and outputs so agents can parse and use them reliably&lt;/li&gt;
&lt;li&gt;Schema Command: let the CLI describe its own interface, eliminating the dependency on external docs&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Just these two things made a noticeable difference in agent performance. If you want to start right away, the easiest entry point is adding a single &lt;code&gt;--json&lt;/code&gt; flag to your existing CLI. That alone makes agent integration much smoother.&lt;/p&gt;
&lt;p&gt;CLIs are back.&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;cli.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
</description>
      <pubDate>Sun, 22 Mar 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>10 Years of Reflection and a New Beginning</title>
      <link>https://www.hahwul.com/posts/2026/10years/</link>
      <guid>https://www.hahwul.com/posts/2026/10years/</guid>
      <description>&lt;p&gt;Hello everyone! This is my first post of 2026. I wanted to publish this back in January, but various things kept piling up and the work took longer than expected, so I&apos;m finally getting it out now.&lt;/p&gt;
&lt;p&gt;2026 also marks a milestone: it&apos;s been exactly 10 years since I started using this domain and going by the name hahwul. It feels like such a long time when I think about it, but it passed by in the blink of an eye. Today, along with looking back on these 10 years, I also have some fairly big changes to share regarding the blog&apos;s operation, content, and overall direction.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Domain Name: hahwul.com
Registry Domain ID: 1992489747_DOMAIN_COM-VRSN
...
Updated Date: 2026-01-05T20:29:07Z
Creation Date: 2016-01-07T23:55:03Z
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&quot;10-years-ago&quot;&gt;10 Years Ago&lt;/h2&gt;
&lt;p&gt;Truth be told, even before using the name hahwul, I ran another domain for about two years, and before that, I had a different blog during my kiddie days. So all in all, I&apos;ve been writing online for well over 15 years now. Of course, those early blogs were just basic development notes where I organized stuff. The thing that really brought me to where I am today is the name hahwul and this blog.&lt;/p&gt;
&lt;p&gt;I made up the name hahwul based on my own name, and over these past 10 years, I&apos;ve written a huge number of posts with it. After a couple of major cleanups here and there, I&apos;ve still managed around 1,400 articles. Most of them were about security and development topics, along with my personal struggles, thoughts, and reflections. The habit of putting things into writing has been a great way to truly absorb knowledge, and I feel like those years were incredibly meaningful for learning and growing.&lt;/p&gt;
&lt;p&gt;I couldn&apos;t have kept writing all this time through sheer consistency alone — it&apos;s been possible thanks to all of you who kept coming back to read my posts. Thank you from the bottom of my heart!&lt;/p&gt;
&lt;div class=&quot;images-full-width&quot;&gt;


&lt;div class=&quot;images-grid&quot;&gt;
    
    &lt;div class=&quot;images-grid-item&quot;&gt;
        &lt;img src=&quot;images/1.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;
    &lt;/div&gt;
    
    &lt;div class=&quot;images-grid-item&quot;&gt;
        &lt;img src=&quot;images/2.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;
    &lt;/div&gt;
    
    &lt;div class=&quot;images-grid-item&quot;&gt;
        &lt;img src=&quot;images/3.png&quot; alt=&quot;&quot; loading=&quot;lazy&quot;&gt;
    &lt;/div&gt;
    
&lt;/div&gt;
&lt;/div&gt;
&lt;h2 id=&quot;announcement&quot;&gt;Announcement&lt;/h2&gt;
&lt;p&gt;I have one announcement regarding the direction of my content going forward. While the blog has been more of a general resource covering all sorts of information until now, I want to shift toward sharing more of my own stories and personal thoughts from here on out.&lt;/p&gt;
&lt;p&gt;Of course, that doesn&apos;t mean the existing style of posts will vanish. They&apos;ll just be reorganized. Posts that are too outdated or too light will be removed, and I&apos;ll focus on creating more solid, substantial content.&lt;/p&gt;
&lt;p&gt;Under &lt;strong&gt;Posts&lt;/strong&gt;, I&apos;ll occasionally share my thoughts and opinions on technology. Under &lt;strong&gt;Notes&lt;/strong&gt;, I&apos;ll keep more standardized and well-polished articles. And if I have enough time and energy, I&apos;d also like to add content in series format (e.g., a ZAP Guide series, etc.).&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;Anyway, to sum it up, I want to say thank you once again. I&apos;ll keep writing steadily into the future, so please stick around and enjoy the ride!&lt;/p&gt;
&lt;p&gt;&lt;img loading=&quot;lazy&quot; src=&quot;images/2026.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
</description>
      <pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
