Practical notes on DevOps and infrastructure. What we're working on, what we're testing, and what we think is worth sharing.
We publish when we have something worth saying, not on a schedule. Everything here comes from real work and genuine curiosity: tools we've tested and experiments we've run. No sponsored content, no AI-generated filler. If it's here, one of us wrote it and stands behind it.
Standard APM shows latency and token counts. Purpose-built LLM observability adds prompt and response content. Neither showed that one run was truncated. What it took to find that out.
Four models, two experiments, and a cost spread that didn't predict what mattered. What the data showed about cheap models, omitted warnings, and why the most interesting divergence had nothing to do with price.
Three incidents, four models, and a scoring rubric that missed what mattered. What routing experiments revealed about cheap models, silent failures, and what to optimize for at 2am.
I used Claude Code to build a free, private, browser-based PDF signing tool. Two hours to a working prototype, six more on the design. Here's how it went.
I ran 1,000 identical API calls across five time windows and timed a provider migration twice: with an abstraction layer and without. The latency finding was expected. The behaviour change wasn't.
I tested four configurations (three MCP skills plus bare Claude) against five real infrastructure tasks. The token math is real, but it's not the most interesting finding.
We're always happy to have a straightforward conversation about infrastructure, tooling, or where your DevOps practice is headed.