I get a feeling that working with multiple AI agents is something that comes VERY natural to most senior+ engineers or tech lead who worked at a large company
You already got used to overseeing parallel work (the goto code reviewer!) + making progress with small chunks of work... because your day has been a series of nonstop interactions, so you had to figure out how to do deep work in small chunks that could have been interrupted
TIL: Testing different Python versions with uv with-editable and uv-test
While tinkering with upgrading various projects to handle Python 3.14 I finally figured out a universal uv recipe for running the tests for the current project in any specified version of Python:
uv run...
This isn't necessarily surprising, but it's worth noting anyway. Claude Sonnet 4.5 is capable of building a full Datasette plugin now.
I've seen models complete aspects of this in the past, but today is the first time I've shipped a new plugin where every line of code and te...
Miguel Grinberg uses some basic benchmarks (like fib(40)) to test the new Python 3.14 on Linux and macOS and finds some substantial speedups over Python 3.13 - around 27% faster.
The optional JIT didn't make a meaningful difference to his benchmarks. On a threaded benchmark he got 3.09x speedup with 4 threads using the free threading build - for Python 3.13 the free threading build only provided a 2.2x improvement.
The cognitive debt of LLM-laden coding extends beyond disengagement of our craft. We’ve all heard the stories. Hyped up, vibed up, slop-jockeys with attention spans shorter than the framework-hopping JavaScript devs of the early 2010s, sling their sludge in pull requests and...
Why NetNewsWire Is Not a Web App
In the wake of Apple removing ICEBlock from the App Store, Brent Simmons talks about why he still thinks his veteran (and actively maintained) NetNewsWire feed reader app should remain a native application.
Part of the reason is cost - NetNew...
Python 3.14
This year's major Python version, Python 3.14, just made its first stable release!
As usual the what's new in Python 3.14 document is the best place to get familiar with the new release:
The biggest changes include template string literals, deferred ev...
I retracted this article after confirming that, while Gemini 2.5 Computer Use did try to solve Google's own CAPTCHA, it was actually Browserbase that successfully performed the solve. I'll update this with more details shortly.
For quite some I wanted to write a small static image gallery so I can share my pictures with friends and family. Of course there are a gazillion tools like this, but, well, sometimes I just want to roll my own. [...]
I used the old, well tested technique I call brain coding, where you start with an empty vim buffer and type some code (Perl, HTML, CSS) until you're happy with the result. It helps to think a bit (aka use your brain) during this process.
I feel like vibe coding is pretty well established now as covering the fast, loose and irresponsible way of building software with AI - entirely prompt-driven, and with no attention paid to how the code actually works. This leaves us with a terminology gap: what should we ca...
Deloitte to pay money back to Albanese government after using AI in $440,000 report
Ouch:
Deloitte will provide a partial refund to the federal government over a $440,000 report that contained several errors, after admitting it used generative artificial intelligence to hel...
gpt-image-1-mini
OpenAI released a new image model today: gpt-image-1-mini, which they describe as "A smaller image generation model that’s 80% less expensive than the large model."
They released it very quietly - I didn't hear about this in the DevDay keynote but I later sp...
GPT-5 pro
Here's OpenAI's model documentation for their GPT-5 pro model, released to their API today at their DevDay event.
It has similar base characteristics to GPT-5: both share a September 30, 2024 knowledge cutoff and 400,000 context limit.
GPT-5 pro has maximum output ...
I'm at OpenAI DevDay in Fort Mason, San Francisco today. As I did last year, I'm going to be live blogging the announcements from the kenote. Unlike last year, this year there's a livestream.
I believed that giving users such a simple way to navigate the internet would unlock creativity and collaboration on a global scale. If you could put anything on it, then after a while, it would have everything on it.
But for the web to have everything on it, everyone had to be able to use it, and want to do so. This was already asking a lot. I couldn’t also ask that they pay for each search or upload they made. In order to succeed, therefore, it would have to be free. That’s why, in 1993, I convinced my Cern managers to donate the intellectual property of the world wide web, putting it into the public domain. We gave the web away to everyone.
— Tim Berners-Lee, Why I gave the world wide web away for free
Two of my public Datasette instances - for my TILs and my blog's backup mirror - were getting hammered with misbehaving bot traffic today. Scaling them up to more Fly instances got them running again but I'd rather not pay extra just so bots can crawl me harder.
The log file...
For a while now I've been hearing from engineers who run multiple coding agents at once - firing up several Claude Code or Codex CLI instances at the same time, sometimes in the same repo, sometimes against multiple checkouts or git worktrees.
I was pretty skeptical about th...
Let the LLM Write the Prompts: An Intro to DSPy in Compound Al Pipelines
I've had trouble getting my head around DSPy in the past. This half hour talk by Drew Breunig at the recent Databricks Data + AI Summit is the clearest explanation I've seen yet of the kinds of problems...
Litestream v0.5.0 is Here
I've been running Litestream to backup SQLite databases in production for a couple of years now without incident. The new version has been a long time coming - Ben Johnson took a detour into the FUSE-based LiteFS before deciding that the single bina...
It turns out Sora 2 is vulnerable to prompt injection!
When you onboard to Sora you get the option to create your own "cameo" - a virtual video recreation of yourself. Here's mine singing opera at the Royal Albert Hall.
You can use your cameo in your own generated videos, an...
Daniel Stenberg's note on AI assisted curl bug reports
Curl maintainer Daniel Stenberg on Mastodon:
Joshua Rogers sent us a massive list of potential issues in #curl that he found using his set of AI assisted tools. Code analyzer style nits all over. Mostly smaller bugs, bu...
When attention is being appropriated, producers need to weigh the costs and benefits of the transaction. To assess whether the appropriation of attention is net-positive, it’s useful to distinguish between extractive and non-extractive contributions. Extractive contributions are those where the marginal cost of reviewing and merging that contribution is greater than the marginal benefit to the project’s producers. In the case of a code contribution, it might be a pull request that’s too complex or unwieldy to review, given the potential upside
— Nadia Eghbal, Working in Public, via the draft LLVM AI tools policy
Two new models from Chinese AI labs in the past few days. I tried them both out using llm-openrouter:
DeepSeek-V3.2-Exp from DeepSeek. Announcement, Tech Report, Hugging Face (690GB, MIT license).
As an intermediate step toward our next-generation architecture, V3.2-Exp bui...
aavetis/PRarena
Albert Avetisian runs this repository on GitHub which uses the Github Search API to track the number of PRs that can be credited to a collection of different coding agents. The repo runs this collect_data.py script every three hours using GitHub Actions to co...
Having watched this morning's Sora 2 introduction video, the most notable feature (aside from audio generation - original Sora was silent, Google's Veo 3 supported audio in May 2025) looks to be what OpenAI are calling "cameos" - the ability to easily capture a video version...
Coding agents like Anthropic's Claude Code and OpenAI's Codex CLI represent a genuine step change in how useful LLMs can be for producing working code. These agents can now directly exercise the code they are writing, correct errors, dig through existing implementation detai...
Anthropic released Claude Sonnet 4.5 today, with a very bold set of claims:
Claude Sonnet 4.5 is the best coding model in the world. It's the strongest model for building complex agents. It’s the best model at using computers. And it shows substantial gains in reasoning and...
Armin Ronacher: 90%
The idea of AI writing "90% of the code" to-date has mostly been expressed by people who sell AI tooling.
Over the last few months, I've increasingly seen the same idea come coming much more credible sources.
Armin is the creator of a bewildering array of...