Simon Willison's Weblog

Author
Simon Willison

Simon Willison's Weblog Supports Webmention

First run the tests

Agentic Engineering Patterns > Automated tests are no longer optional when working with coding agents. The old excuses for not writing them - that they're time consuming and expensive to constantly rewrite while a codebase is rapidly evolving - no longer hold when an ...

Ladybird adopts Rust, with help from AI

Really interesting case-study from Andreas Kling on advanced, sophisticated use of coding agents for ambitious coding projects with critical code. After a few years hoping Swift's platform support outside of the Apple ecosystem would m...

Writing about Agentic Engineering Patterns

I've started a new project to collect and document Agentic Engineering Patterns - coding practices and patterns to help get the best results out of this new era of coding agent development we find ourselves entering. I'm using Agentic Engineering to refer to building softwar...

Writing code is cheap now

Agentic Engineering Patterns > The biggest challenge in adopting agentic engineering practices is getting comfortable with the consequences of the fact that writing code is cheap now. Code has always been expensive. Producing a few hundred lines of clean, tested code ...

Quoting Paul Ford

The paper asked me to explain vibe coding, and I did so, because I think something big is coming there, and I'm deep in, and I worry that normal people are not able to see it and I want them to be prepared. But people can't just read something and hate you quietly; they can't see that you have provided them with a utility or a warning; they need their screech. You are distributed to millions of people, and become the local proxy for the emotions of maybe dozens of people, who disagree and demand your attention, and because you are the one in the paper you need to welcome them with a pastor's smile and deep empathy, and if you speak a word in your own defense they'll screech even louder.

Paul Ford, on writing about vibe coding for the New York Times

Tags: vibe-coding, new-york-times, paul-ford

Reply guy

The latest scourge of Twitter is AI bots that reply to your tweets with generic, banal commentary slop, often accompanied by a question to "drive engagement" and waste as much of your time as possible.

I just found out that the category name for this genre of software is reply guy tools. Amazing.

Tags: ai-ethics, twitter, slop, generative-ai, definitions, ai, llms

Quoting Summer Yue

Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

I said “Check this inbox too and suggest what you would archive or delete, don’t action until I tell you to.” This has been working well for my toy inbox, but my real inbox was too huge and triggered compaction. During the compaction, it lost my original instruction 🤦‍♀️

Summer Yue

Tags: ai-ethics, generative-ai, ai-agents, openclaw, ai, llms

Red/green TDD

Agentic Engineering Patterns > "Use red/green TDD" is a pleasingly succinct way to get better results out of a coding agent. TDD stands for Test Driven Development. It's a programming style where you ensure every piece of code you write is accompanied by automated tes...

The Claude C Compiler: What It Reveals About the Future of Software

On February 5th Anthropic's Nicholas Carlini wrote about a project to use parallel Claudes to build a C compiler on top of the brand new Opus 4.6. Chris Lattner (Swift, LLVM, Clang, Mojo) knows more about C c...

London Stock Exchange: Raspberry Pi Holdings plc

Striking graph illustrating stock in the UK Raspberry Pi holding company spiking on Tuesday: The Telegraph credited excitement around OpenClaw: Raspberry Pi's stock price has surged 30pc in two days, amid chatter on social ...

How I think about Codex

Gabriel Chua (Developer Experience Engineer for APAC at OpenAI) provides his take on the confusing terminology behind the term "Codex", which can refer to a bunch of different things within the OpenAI ecosystem: In plain terms, Codex is OpenAI’s s...

Quoting Thibault Sottiaux

We’ve made GPT-5.3-Codex-Spark about 30% faster. It is now serving at over 1200 tokens per second.

Thibault Sottiaux, OpenAI

Tags: openai, llms, ai, generative-ai

Andrej Karpathy talks about "Claws"

Andrej Karpathy tweeted a mini-essay about buying a Mac Mini ("The apple store person told me they are selling like hotcakes and everyone is confused") to tinker with Claws: I'm definitely a bit sus'd to run OpenClaw specifically [...] Bu...

Adding TILs, releases, museums, tools and research to my blog

I've been wanting to add indications of my various other online activities to my blog for a while now. I just turned on a new feature I'm calling "beats" (after story beats, naming this was hard!) which adds five new types of content to my site, all corresponding to activity...

Taalas serves Llama 3.1 8B at 17,000 tokens/second

This new Canadian hardware startup just announced their first product - a custom hardware implementation of the Llama 3.1 8B model (from July 2024) that can run at a staggering 17,000 tokens/second.

I was going to include a video of their demo but it's so fast it would look more like a screenshot. You can try it out at chatjimmy.ai.

They describe their Silicon Llama as “aggressively quantized, combining 3-bit and 6-bit parameters.” Their next generation will use 4-bit - presumably they have quite a long lead time for baking out new models!

Via Hacker News

Tags: ai, generative-ai, llama, llms

Recovering lost code

Reached the stage of parallel agent psychosis where I've lost a whole feature - I know I had it yesterday, but I can't seem to find the branch or worktree or cloud instance or checkout with it in.

... found it! Turns out I'd been hacking on a random prototype in /tmp and then my computer crashed and rebooted and I lost the code... but it's all still there in ~/.claude/projects/ session logs and Claude Code can extract it out and spin up the missing feature again.
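One way to narrow down which session logs are worth handing back to the agent is a plain text search for a string you remember from the lost code. This is a minimal sketch, not how Claude Code itself does the recovery: the function name `find_sessions_mentioning` is hypothetical, and it makes no assumption about the log file format beyond the files being readable as text.

```python
from pathlib import Path


def find_sessions_mentioning(needle: str, root: Path) -> list[Path]:
    """Return every file under root whose text contains needle."""
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            # Ignore undecodable bytes so one odd file cannot abort the search.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it and keep searching
        if needle in text:
            matches.append(path)
    return matches


# Example call (the search string is a made-up placeholder):
# find_sessions_mentioning("def my_lost_function", Path.home() / ".claude" / "projects")
```

The result is only a list of candidate files; extracting the code from them is still a job for the agent or for manual inspection.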

Tags: parallel-agents, coding-agents, claude-code, generative-ai, ai, llms

ggml.ai joins Hugging Face to ensure the long-term progress of Local AI

I don't normally cover acquisition news like this, but I have some thoughts. It's hard to overstate the impact Georgi Gerganov has had on the local model space. Back in March 2023 his release of llama.cp...

Quoting Thariq Shihipar

Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...]

At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.

Thariq Shihipar

Tags: prompt-engineering, anthropic, claude-code, ai-agents, generative-ai, ai, llms

Gemini 3.1 Pro

The first in the Gemini 3.1 series, priced the same as Gemini 3 Pro ($2/million input, $12/million output under 200,000 tokens, $4/$18 for 200,000 to 1,000,000). That's less than half the price of Claude Opus 4.6 with very similar benchmark scores to that mode...
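The tiered prices quoted here can be turned into a rough per-request cost estimate. This is a sketch under a stated assumption: the excerpt does not say how the 200,000-token boundary is applied, so the code assumes the tier is selected by prompt size and then applies to the whole request. The function name `gemini_31_pro_cost` is made up for this example.

```python
def gemini_31_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from the quoted prices."""
    # Assumption: prompts under 200,000 tokens use the lower tier,
    # and the chosen tier applies to both input and output tokens.
    if input_tokens < 200_000:
        input_rate, output_rate = 2.00, 12.00  # USD per million tokens
    else:
        input_rate, output_rate = 4.00, 18.00  # USD per million tokens
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# A 100,000-token prompt with a 10,000-token reply:
# 100,000 * $2/M + 10,000 * $12/M = $0.20 + $0.12 = $0.32
print(f"${gemini_31_pro_cost(100_000, 10_000):.2f}")
```

Under the same assumption, a 500,000-token prompt with a 10,000-token reply works out to $2.00 + $0.18 = $2.18.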

Experimenting with sponsorship for my blog and newsletter

I've long been resistant to the idea of accepting sponsorship for my blog. I value my credibility as an independent voice, and I don't want to risk compromising that reputation. Then I learned about Troy Hunt's approach to sponsorship, which he first wrote about in 2016. Tro...

SWE-bench February 2025 leaderboard update

SWE-bench is one of the benchmarks that the labs love to list in their model releases. The official leaderboard is infrequently updated but they just did a full run of it against the current generation of models, which is notable be...

LadybirdBrowser/ladybird: Abandon Swift adoption

Back in August 2024 the Ladybird browser project announced an intention to adopt Swift as their memory-safe language of choice.

As of this commit it looks like they've changed their mind:

Everywhere: Abandon Swift adoption

After making no progress on this for a very long time, let's acknowledge it's not going anywhere and remove it from the codebase.

Via Hacker News

Tags: ladybird, swift

Typing without having to type

25+ years into my career as a programmer I think I may finally be coming around to preferring type hints or even strong typing. I resisted those in the past because they slowed down the rate at which I could iterate on code, especially in the REPL environments that were key to my productivity. But if a coding agent is doing all that typing for me, the benefits of explicitly defining all of those types are suddenly much more attractive.
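As a small illustration of what explicitly defined types look like in practice, here is a sketch using Python type hints. The `Post` class and `posts_with_tag` function are hypothetical names invented for this example, not code from the blog. With annotations like these, a static checker such as mypy can flag a call that passes the wrong kind of value before the code ever runs.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    title: str
    tags: list[str] = field(default_factory=list)


def posts_with_tag(posts: list[Post], tag: str) -> list[Post]:
    """Return the posts carrying the given tag, preserving their order."""
    return [post for post in posts if tag in post.tags]


posts = [
    Post("Typing without having to type", ["programming"]),
    Post("Reply guy", ["slop"]),
]
print([post.title for post in posts_with_tag(posts, "programming")])
```

The annotations cost extra keystrokes, which is the trade-off the paragraph describes: less painful when an agent is doing the typing.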

Tags: ai-assisted-programming, programming, programming-languages

The A.I. Disruption We’ve Been Waiting for Has Arrived

New opinion piece from Paul Ford in the New York Times. Unsurprisingly for a piece by Paul it's packed with quoteworthy snippets, but a few stood out for me in particular. Paul describes the November moment that so many ...

Quoting Martin Fowler

LLMs are eating specialty skills. There will be less use of specialist front-end and back-end developers as the LLM-driving skills become more important than the details of platform usage. Will this lead to a greater recognition of the role of Expert Generalists? Or will the ability of LLMs to write lots of code mean they code around the silos rather than eliminating them?

Martin Fowler, tidbits from the Thoughtworks Future of Software Development Retreat, via HN

Tags: martin-fowler, careers, generative-ai, ai, llms, ai-assisted-programming

Introducing Claude Sonnet 4.6

Sonnet 4.6 is out today, and Anthropic claim it offers similar performance to November's Opus 4.5 while maintaining the Sonnet pricing of $3/million input and $15/million output tokens (the Opus models are $5/$25). Here's the system card PDF. I ...

Rodney v0.4.0

My Rodney CLI tool for browser automation attracted quite the flurry of PRs since I announced it last week. Here are the release notes for the just-released v0.4.0: Errors now use exit code 2, which means exit code 1 is just for check failures. #15 New r...

Quoting ROUGH DRAFT 8/2/66

This is the story of the United Space Ship Enterprise. Assigned a five year patrol of our galaxy, the giant starship visits Earth colonies, regulates commerce, and explores strange new worlds and civilizations. These are its voyages... and its adventures.

ROUGH DRAFT 8/2/66, before the Star Trek opening narration reached its final form

Tags: screen-writing, science-fiction

First kākāpō chick in four years hatches on Valentine's Day

First chick of the 2026 breeding season! Kākāpō Yasmine hatched an egg fostered from kākāpō Tīwhiri on Valentine's Day, bringing the total number of kākāpō to 237 – though it won’t be officially added to the popula...

Quoting Dimitris Papailiopoulos

But the intellectually interesting part for me is something else. I now have something close to a magic box where I throw in a question and a first answer comes back basically for free, in terms of human effort. Before this, the way I'd explore a new idea is to either clums...