Simon Willison's Weblog

Author: Simon Willison

Simon Willison's Weblog Supports Webmention

Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult

Anthropic released Claude Opus 4.5 this morning, which they call "the best model in the world for coding, agents, and computer use". This is their attempt to retake the crown for best coding model after significant challenges from OpenAI's GPT-5.1-Codex-Max and Google's Gemini 3...

sqlite-utils 3.39

I got a report of a bug in sqlite-utils concerning plugin installation: if you installed the package using uv tool install, further attempts to install plugins with sqlite-utils install X would fail, because uv doesn't bundle pip by default. I had the same ...

sqlite-utils 4.0a1 has several (minor) backwards incompatible changes

I released a new alpha version of sqlite-utils last night - the 128th release of that package since I started building it back in 2018. sqlite-utils is two things in one package: a Python library for conveniently creating and manipulating SQLite databases and a CLI tool for ...

"Good engineering management" is a fad

Will Larson argues that the technology industry's idea of what makes a good engineering manager changes over time based on industry realities. ZIRP hypergrowth has given way to a more cautious approach today, and expectations of m...

Agent design is still hard

Armin Ronacher presents a cornucopia of lessons learned from building agents over the past few months. There are several agent abstraction libraries available now (my own LLM library is edging into that territory with its tools feature), but Armin h...

Olmo 3 is a fully open LLM

Olmo is the LLM series from Ai2, the Allen Institute for AI. Unlike most open weight models, these are notable for including the full training data, training process and checkpoints along with those releases. The new Olmo 3 claims to be "the best fully open 32B-scale thinkin...

We should all be using dependency cooldowns

William Woodruff gives a name to a sensible strategy for managing dependencies while reducing the chances of a surprise supply chain attack: dependency cooldowns. Supply chain attacks happen when an attacker compromises a widely us...
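
The cooldown idea lends itself to a very small filter. Here's a minimal sketch; the 14-day window, the function name and the version-to-upload-date mapping are illustrative assumptions, not from Woodruff's post:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: ignore any release younger than 14 days.
COOLDOWN = timedelta(days=14)

def eligible_versions(releases, now):
    """Return only versions that have survived the cooldown period.

    releases maps version string -> upload datetime; you might build
    this from a package index API, but that wiring is out of scope here.
    """
    return {v: t for v, t in releases.items() if now - t >= COOLDOWN}

now = datetime(2025, 11, 24, tzinfo=timezone.utc)
releases = {
    "1.0.0": datetime(2025, 10, 1, tzinfo=timezone.utc),   # 54 days old
    "1.0.1": datetime(2025, 11, 20, tzinfo=timezone.utc),  # 4 days old
}
print(sorted(eligible_versions(releases, now)))  # ['1.0.0']
```

A freshly compromised release would spend its most dangerous first days invisible to installs that apply a filter like this.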

Nano Banana Pro aka gemini-3-pro-image-preview is the best available image generation model

Hot on the heels of Tuesday's Gemini 3 Pro release, today it's Nano Banana Pro, also known as Gemini 3 Pro Image. I've had a few days of preview access and this is an astonishingly capable image generation model. As is often the case, the most useful low-level details can be...

Quoting Nicholas Carlini

Previously, when malware developers wanted to go and monetize their exploits, they would do exactly one thing: encrypt every file on a person's computer and request a ransom to decrypt the files. In the future I think this will change. LLMs allow attackers to instead proce...

Building more with GPT-5.1-Codex-Max

Hot on the heels of yesterday's Gemini 3 Pro release comes a new model from OpenAI called GPT-5.1-Codex-Max. (Remember when GPT-5 was meant to bring in a new era of less confusing model names? That didn't last!) It's currently only availa...

How I automate my Substack newsletter with content from my blog

I sent out my weekly-ish Substack newsletter this morning and took the opportunity to record a YouTube video demonstrating my process and describing the different components that make it work. There's a lot of digital duct tape involved, taking the content from Django+Heroku...

Quoting Matthew Prince

Cloudflare's network began experiencing significant failures to deliver core network traffic [...] triggered by a change to one of our database systems' permissions which caused the database to output multiple entries into a “feature file” used by our Bot Management system. That feature file, in turn, doubled in size. The larger-than-expected feature file was then propagated to all the machines that make up our network. [...] The software had a limit on the size of the feature file that was below its doubled size. That caused the software to fail.

Matthew Prince, Cloudflare outage on November 18, 2025

Tags: scaling, postmortem, cloudflare
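
The failure mode Prince describes is easy to reproduce in miniature. This is an illustrative sketch only; the limit, the data shapes and the names are invented, not Cloudflare's actual code. A consumer with a hard-coded size limit treats an oversized input as fatal, so a file that doubles takes the process down:

```python
# Invented cap standing in for the Bot Management software's limit.
MAX_FEATURES = 200

def load_feature_file(entries):
    # The consumer assumed the file could never exceed the cap, so an
    # oversized file is treated as a fatal error rather than truncated.
    if len(entries) > MAX_FEATURES:
        raise RuntimeError(
            f"feature file has {len(entries)} entries, limit is {MAX_FEATURES}"
        )
    return entries

normal = [f"feature-{i}" for i in range(150)]
doubled = normal + normal  # duplicate rows from the permissions change

load_feature_file(normal)  # fine: 150 <= 200
try:
    load_feature_file(doubled)  # 300 > 200: the software fails
except RuntimeError as exc:
    print(exc)
```

The sketch also shows why propagating the file made things worse: every machine running the same check fails the same way on the same input.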

llm-gemini 0.27

New release of my LLM plugin for Google's Gemini models:
  • Support for nested schemas in Pydantic, thanks Bill Pugh. #107
  • Now tests against Python 3.14.
  • Support for YouTube URLs as attachments and the media_resolution option. Thanks, Duane Milne. #112
  • New model: gemini-3-pro-preview. #113
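
To illustrate what the nested-schema support covers, here's a minimal sketch of a Pydantic model that contains another model; the model names are invented for this example, and passing the class to the plugin follows llm's usual schema support:

```python
from pydantic import BaseModel

class Dimensions(BaseModel):
    width: int
    height: int

class Image(BaseModel):
    title: str
    dimensions: Dimensions  # a nested model, the case #107 addresses

# Pydantic emits the nested definition under $defs in the JSON schema
# that gets handed on to the Gemini API.
schema = Image.model_json_schema()
print(sorted(schema["$defs"]))  # ['Dimensions']
```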

The YouTube URL feature is particularly neat, taking advantage of this API feature. I used it against the Google Antigravity launch video:

llm -m gemini-3-pro-preview \
 -a 'https://www.youtube.com/watch?v=nTOVIGsqCuY' \
 'Summary, with detailed notes about what this thing is and how it differs from regular VS Code, then a complete detailed transcript with timestamps'

Here's the result. A spot-check of the timestamps against points in the video shows them to be exactly right.

Tags: projects, youtube, ai, generative-ai, llms, llm, gemini

MacWhisper has Automatic Speaker Recognition now

Inspired by this conversation on Hacker News I decided to upgrade MacWhisper to try out NVIDIA Parakeet and the new Automatic Speaker Recognition feature. It appears to work really well! Here's the result against this 39.7MB m4a file from my Gemini 3 Pro write-up this morning.

You can export the transcript with both timestamps and speaker names using the Share -> Segments -> .json menu item. Here's the resulting JSON.

Tags: whisper, nvidia, ai, speech-to-text, macwhisper

Google Antigravity

Google's other major release today, accompanying Gemini 3 Pro. At first glance Antigravity is yet another VS Code fork / Cursor clone: it's a desktop application you install that then signs in to your Google account and provides an IDE for agentic coding aga...

Quoting Ethan Mollick

Three years ago, we were impressed that a machine could write a poem about otters. Less than 1,000 days later, I am debating statistical methodology with an agent that built its own research environment. The era of the chatbot is turning into the era of the digital coworker. To be very clear, Gemini 3 isn’t perfect, and it still needs a manager who can guide and check it. But it suggests that “human in the loop” is evolving from “human who fixes AI mistakes” to “human who directs AI work.” And that may be the biggest change since the release of ChatGPT.

Ethan Mollick, Three Years from GPT-3 to Gemini 3

Tags: gemini, ethan-mollick, generative-ai, chatgpt, ai, llms, ai-agents

Trying out Gemini 3 Pro with audio transcription and a new pelican benchmark

Google released Gemini 3 Pro today. Here's the announcement from Sundar Pichai, Demis Hassabis, and Koray Kavukcuoglu, their developer blog announcement from Logan Kilpatrick, the Gemini 3 Pro Model Card, and their collection of 11 more articles. It's a big release! I had a ...

The fate of “small” open source

Nolan Lawson asks if LLM assistance means that the category of tiny open source libraries like his own blob-util is destined to fade away. Why take on additional supply chain risk by adding another dependency when an LLM can likely kick out the ...

Quoting Andrej Karpathy

With AI now, we are able to write new programs that we could never hope to write by hand before. We do it by specifying objectives (e.g. classification accuracy, reward functions), and we search the program space via gradient descent to find neural networks that work well a...

llm-anthropic 0.22

New release of my llm-anthropic plugin:

The plugin previously powered LLM schemas using this tool-call based workaround. That code is still used for Anthropic's older models.

I also figured out uv recipes for running the plugin's test suite in an isolated environment, which are now baked into the new Justfile.

Tags: projects, python, ai, generative-ai, llms, llm, anthropic, claude, uv

parakeet-mlx

Neat MLX project by Senstella bringing NVIDIA's Parakeet ASR (Automatic Speech Recognition, like Whisper) model to Apple's MLX framework.

It's packaged as a Python CLI tool, so you can run it like this:

uvx parakeet-mlx default_tc.mp3

The first time I ran this it downloaded a 2.5GB model file.

Once that was fetched it took 53 seconds to transcribe a 65MB 1hr 1m 28s podcast episode (this one) and produced this default_tc.srt file with a timestamped transcript of the audio I fed into it. The quality appears to be very high.
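
For scale, that transcription speed works out to roughly 70x faster than real time:

```python
audio_seconds = 1 * 3600 + 1 * 60 + 28   # the 1h 1m 28s episode
transcribe_seconds = 53
print(round(audio_seconds / transcribe_seconds, 1))  # 69.6
```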

Tags: python, ai, nvidia, uv, mlx, speech-to-text

GPT-5.1 Instant and GPT-5.1 Thinking System Card Addendum

I was confused about whether the new "adaptive thinking" feature of GPT-5.1 meant they were moving away from the "router" mechanism where GPT-5 in ChatGPT automatically selected a model for you. This page addresses th...

Introducing GPT-5.1 for developers

OpenAI announced GPT-5.1 yesterday, calling it "A smarter, more conversational ChatGPT". Today they've added it to their API. We actually got four new models today: gpt-5.1, gpt-5.1-chat-latest, gpt-5.1-codex and gpt-5.1-codex-mini. There are a lo...

Datasette 1.0a22

New Datasette 1.0 alpha, adding some small features we needed to properly integrate the new permissions system with Datasette Cloud:

Plus a developer experience improvement for plugin authors:

Tags: projects, datasette, datasette-cloud, annotated-release-notes

Nano Banana can be prompt engineered for extremely nuanced AI image generation

Max Woolf provides an exceptional deep dive into Google's Nano Banana, aka the Gemini 2.5 Flash Image model, still the best available image manipulation LLM tool three months after its initial release...

Quoting Nov 12th letter from OpenAI to Judge Ona T. Wang

On Monday, this Court entered an order requiring OpenAI to hand over to the New York Times and its co-plaintiffs 20 million ChatGPT user conversations [...] OpenAI is unaware of any court ordering wholesale production of personal information at this scale. This sets a dange...

What happens if AI labs train for pelicans riding bicycles?

Almost every time I share a new example of an SVG of a pelican riding a bicycle a variant of this question pops up: how do you know the labs aren't training for your benchmark? The strongest argument is that they would get caught. If a model finally comes out that produces a...

Quoting Steve Krouse

The fact that MCP is a different surface from your normal API allows you to ship MUCH faster to MCP. This has been unlocked by inference at runtime.

Normal APIs are promises to developers, because developers commit code that relies on those APIs, and then walk away. If you break the API, you break the promise, and you break that code. This means a developer gets woken up at 2am to fix the code.

But MCP servers are called by LLMs which dynamically read the spec every time, which allows us to constantly change the MCP server. It doesn't matter! We haven't made any promises. The LLM can figure it out afresh every time.

Steve Krouse

Tags: model-context-protocol, generative-ai, steve-krouse, apis, ai, llms

Fun-reliable side-channels for cross-container communication

Here's a very clever hack for communicating between different processes running in different containers on the same machine. It's based on clever abuse of POSIX advisory locks, which allow a process to create and de...
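
To get a feel for the primitive involved, here's a minimal single-machine sketch in Python (not the article's cross-container setup, which additionally depends on both containers seeing the same underlying file): one process signals a 1 bit by holding an exclusive POSIX advisory lock, and another reads it by probing with a non-blocking lock attempt.

```python
import fcntl
import multiprocessing
import os
import tempfile

def hold_lock(path, acquired, release):
    # Sender: signal a 1 bit by holding an exclusive advisory lock.
    with open(path, "wb") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)
        acquired.set()   # tell the receiver the bit is now "set"
        release.wait()   # keep holding the lock until told to stop

def probe_bit(path):
    # Receiver: a failed non-blocking lock attempt means some other
    # process holds the lock, i.e. the bit is 1.
    with open(path, "wb") as f:
        try:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return 0  # lock released with the file on close
        except BlockingIOError:
            return 1

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    acquired = multiprocessing.Event()
    release = multiprocessing.Event()
    sender = multiprocessing.Process(
        target=hold_lock, args=(path, acquired, release)
    )
    sender.start()
    acquired.wait()
    bits = [probe_bit(path)]       # 1 while the sender holds the lock
    release.set()
    sender.join()
    bits.append(probe_bit(path))   # 0 once the lock is released
    os.unlink(path)
    return bits

if __name__ == "__main__":
    print(demo())
```

Toggling the lock over time turns this one readable bit into a (slow, fun-reliable) communication channel.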

Agentic Pelican on a Bicycle

Robert Glaser took my pelican riding a bicycle benchmark and applied an agentic loop to it, seeing if vision models could draw a better pelican if they got the chance to render their SVG to an image and then try again until they were happy with t...