Simon Willison's Weblog

Author
Simon Willison

Is Claude Code going to cost $100/month? Probably not - it's all very confusing

Anthropic today quietly (as in silently, no announcement anywhere at all) updated their claude.com/pricing page (but not their Choosing a Claude plan page, which shows up first for me on Google) to add this tiny but significant detail (arrow is mine, and it's already reverte...

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

OpenAI released ChatGPT Images 2.0 today, their latest image generation model. On the livestream Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was equivalent to jumping from GPT-3 to GPT-5. Here's how I put it to the test. My prompt: Do a where's Waldo style...

Quoting Andreas Påhlsson-Notini

AI agents are already too human. Not in the romantic sense, not because they love or fear or dream, but in the more banal and frustrating one. The current implementations keep showing their human origin again and again: lack of stringency, lack of patience, lack of focus. Faced with an awkward task, they drift towards the familiar. Faced with hard constraints, they start negotiating with reality.

Andreas Påhlsson-Notini, Less human AI agents, please.

Tags: ai-agents, coding-agents, ai

scosman/pelicans_riding_bicycles

I firmly approve of Steve Cosman's efforts to pollute the training set of pelicans riding bicycles.

The heading says "Pelican Riding a Bicycle #1" - the image is a bear on a snowboard

(To be fair, most of the examples I've published count as poisoning too.)

Via Hacker News comment

Tags: ai, generative-ai, llms, training-data, pelican-riding-a-bicycle

llm-openrouter 0.6

Release: llm-openrouter 0.6

  • llm openrouter refresh command for refreshing the list of available models without waiting for the cache to expire.

I added this feature so I could try Kimi 2.6 on OpenRouter as soon as it became available there.

Here's its pelican - this time as an HTML page because Kimi chose to include an HTML and JavaScript UI to control the animation. Transcript here.

The bicycle is about right. The pelican is OK. It is pedaling furiously and flapping its wings a bit. Controls below the animation provide a pause button and sliders for controlling the speed and the wing flap.

Tags: openrouter, llm, llm-release, pelican-riding-a-bicycle, kimi, ai-in-china, llms, ai, generative-ai

SQL functions in Google Sheets to fetch data from Datasette

TIL: SQL functions in Google Sheets to fetch data from Datasette

I put together some notes on patterns for fetching data from a Datasette instance directly into Google Sheets - using the importdata() function, a "named function" that wraps it or a Google Apps Script if you need to send an API token in an HTTP header (not supported by importdata().)

Here's an example sheet demonstrating all three methods.
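
As a rough sketch of the first pattern, the snippet below builds a Datasette CSV URL for a SQL query and wraps it in an importdata() formula. The instance URL, database name and query are hypothetical placeholders, and the helper names are mine rather than anything from the TIL. The `/database.csv?sql=` URL shape and the `Authorization: Bearer` header are how Datasette generally exposes CSV and token access, but treat both as assumptions to verify against your own instance.

```python
from urllib.parse import urlencode


def datasette_csv_url(base_url: str, database: str, sql: str) -> str:
    """Build a URL that returns the results of a SQL query as CSV."""
    query = urlencode({"sql": sql})
    return f"{base_url.rstrip('/')}/{database}.csv?{query}"


def importdata_formula(url: str) -> str:
    """Wrap a URL in a Google Sheets IMPORTDATA() formula."""
    # Double quotes inside a Sheets string literal are escaped by doubling them.
    escaped = url.replace('"', '""')
    return f'=IMPORTDATA("{escaped}")'


def token_headers(token: str) -> dict[str, str]:
    """Headers for an authenticated request (assumed Bearer-token scheme).

    importdata() cannot send these, which is why the third method
    falls back to a Google Apps Script.
    """
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Hypothetical instance and database names, for illustration only.
    url = datasette_csv_url("https://example.com/", "data", "select 1")
    print(importdata_formula(url))
```

The formula printed by the script can be pasted into a cell as-is. The header helper only shows what an authenticated client would need to send.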

Tags: spreadsheets, datasette, google

Claude Token Counter, now with model comparisons

I upgraded my Claude Token Counter tool to add the ability to run the same count against different models in order to compare them. As far as I can tell Claude Opus 4.7 is the first model to change the tokenizer, so it's only ...

Headless everything for personal AI

Matt Webb thinks headless services are about to become much more common:

Why? Because using personal AIs is a better experience for users than using services directly (honestly); and headless services are quicker and more dependable for t...

Changes in the system prompt between Claude Opus 4.6 and 4.7

Anthropic are the only major AI lab to publish the system prompts for their user-facing chat systems. Their system prompt archive now dates all the way back to Claude 3 in July 2024 and it's always interesting to see how the system prompt evolves as they publish new models. ...

Claude system prompts as a git timeline

Research: Claude system prompts as a git timeline

Anthropic publish the system prompts for Claude chat and make that page available as Markdown. I had Claude Code turn that page into separate files for each model and model family with fake git commit dates to enable browsing the changes via the GitHub commit view.
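
A minimal sketch of how that kind of conversion could work, assuming the Markdown page uses one `##` heading per model (an assumption about the page layout, not something the post confirms). The function names are hypothetical. `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` are the standard environment variables git reads when creating a commit.

```python
import re


def split_by_heading(markdown: str, level: int = 2) -> dict[str, str]:
    """Split a Markdown document into {heading: body} at one heading level."""
    marker = "#" * level + " "
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith(marker):
            # A new section starts; later lines belong to it.
            current = line[len(marker):].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def slugify(title: str) -> str:
    """Turn a heading into a filename-friendly slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def backdated_commit_env(iso_date: str) -> dict[str, str]:
    """Environment variables that make git record a chosen commit date."""
    return {"GIT_AUTHOR_DATE": iso_date, "GIT_COMMITTER_DATE": iso_date}
```

Each section could then be written to `slugify(title) + ".md"` and committed with `subprocess.run(["git", "commit", ...], env={**os.environ, **backdated_commit_env(date)})`, so that the GitHub commit view shows the prompts in publication order.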

Tags: system-prompts, anthropic, claude, generative-ai, ai, llms

Adding a new content type to my blog-to-newsletter tool

Here's an example of a deceptively short prompt that got a lot of work done in a single shot. First, some background. I send out a free Substack newsletter around once a week containing content copied-and-pasted from my blog. I'm effecti...

Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year

This year's PyCon US is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th to Sunday 17th and tutorial and sprint days either side. It's in Long Beach, California this year, the first time PyCon has come to the US West Coast since...

datasette 1.0a28

Release: datasette 1.0a28

I was upgrading Datasette Cloud to 1.0a27 and discovered a nasty collection of accidental breakages caused by changes in that alpha. This new alpha addresses those directly:

  • Fixed a compatibility bug introduced in 1.0a27 where execute_wri...

llm-anthropic 0.25

Release: llm-anthropic 0.25

  • New model: claude-opus-4.7, which supports thinking_effort: xhigh. #66
  • New thinking_display and thinking_adaptive boolean options. thinking_display summarized output is currently only available in JSON output or JSON logs.
  • Increased default max_tokens to the maximum allowed for each model.
  • No longer uses obsolete structured-outputs-2025-11-13 beta header for older models.

Tags: llm, anthropic, claude

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

For anyone who has been taking my pelican riding a bicycle benchmark seriously as a robust way to test models, here are pelicans from this morning's two big model releases - Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic. Here's the Qwen 3.6 pelican, generat...

datasette.io news preview

Tool: datasette.io news preview

The datasette.io website has a news section built from this news.yaml file in the underlying GitHub repository. The YAML format looks like this:

- date: 2026-04-15
  body: |-
    [Datasette 1.0a27](https://docs.datasette.io/en/latest/chang...

datasette-export-database 0.3a1

Release: datasette-export-database 0.3a1

This plugin was using the ds_csrftoken cookie as part of a custom signed URL, which needed upgrading now that Datasette 1.0a27 no longer sets that cookie.

Tags: datasette

datasette 1.0a27

Release: datasette 1.0a27

Two major changes in this new Datasette alpha. I covered the first of those in detail yesterday - Datasette no longer uses Django-style CSRF form tokens, instead using modern browser headers as described by Filippo Valsorda. The second big chang...

Quoting John Gruber

The real goldmine isn’t that Apple gets a cut of every App Store transaction. It’s that Apple’s platforms have the best apps, and users who are drawn to the best apps are thus drawn to the iPhone, Mac, and iPad. That edge is waning. Not because software on other platforms is getting better, but because third-party software on iPhone, Mac, and iPad is regressing to the mean, to some extent, because fewer developers feel motivated — artistically, financially, or both — to create well-crafted idiomatic native apps exclusively for Apple’s platforms.

John Gruber

Tags: apple, john-gruber

Gemini 3.1 Flash TTS

Google released Gemini 3.1 Flash TTS today, a new text-to-speech model that can be directed using prompts. It's presented via the standard Gemini API using gemini-3.1-flash-tts-preview as the model ID, but can only output audio files. The prompting guide...

Gemini 3.1 Flash TTS

Tool: Gemini 3.1 Flash TTS

See my notes on Google's new Gemini 3.1 Flash TTS text-to-speech model.

Tags: gemini, google

datasette-ports 0.3

Release: datasette-ports 0.3

A small update for my tool for helping me figure out what all of the Datasette instances on my laptop are up to.

  • Show working directory derived from each PID
  • Show the full path to each database file

Output now looks like this:

http://127.0.0.1:8007/ - v1.0a26
  Directory: /Users/simon/dev/blog
  Databases:
    simonwillisonblog: /Users/simon/dev/blog/simonwillisonblog.db
  Plugins:
    datasette-llm
    datasette-secrets
http://127.0.0.1:8001/ - v1.0a26
  Directory: /Users/simon/dev/creatures
  Databases:
    creatures: /tmp/creatures.db

Tags: datasette

Quoting Kyle Kingsbury

I think we will see some people employed (though perhaps not explicitly) as meat shields: people who are accountable for ML systems under their supervision. The accountability may be purely internal, as when Meta hires human beings to review the decisions of automated moderation systems. It may be external, as when lawyers are penalized for submitting LLM lies to the court. It may involve formalized responsibility, like a Data Protection Officer. It may be convenient for a company to have third-party subcontractors, like Buscaglia, who can be thrown under the bus when the system as a whole misbehaves.

Kyle Kingsbury, The Future of Everything is Lies, I Guess: New Jobs

Tags: ai-ethics, careers, ai, kyle-kingsbury

Zig 0.16.0 release notes: "Juicy Main"

Zig has really good release notes - comprehensive, detailed, and with relevant usage examples for each of the new features. Of particular note in the newly released Zig 0.16.0 is what they are calling "Juicy Main" - a dependency injecti...

datasette PR #2689: Replace token-based CSRF with Sec-Fetch-Site header protection

Datasette has long protected against CSRF attacks using CSRF tokens, implemented using my asgi-csrf Python library. These are something of a pain to work with - you need to scatter forms in te...
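
For readers unfamiliar with the header-based approach, here is a simplified sketch of what a Sec-Fetch-Site check can look like. It is not Datasette's actual implementation. The function name, the set of safe methods and the Origin fallback are my own assumptions about one reasonable policy.

```python
from urllib.parse import urlsplit

# Methods that should never change state, so they skip the check.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_request_allowed(method: str, headers: dict[str, str]) -> bool:
    """Return True if a request passes a Sec-Fetch-Site based CSRF check."""
    headers = {name.lower(): value for name, value in headers.items()}
    if method.upper() in SAFE_METHODS:
        return True

    site = headers.get("sec-fetch-site")
    if site is not None:
        # "same-origin": sent from our own pages.
        # "none": user-initiated navigation such as a typed URL or bookmark.
        return site in ("same-origin", "none")

    origin = headers.get("origin")
    if origin is None:
        # No browser-supplied headers at all: most likely a non-browser
        # client such as curl, which is not a CSRF vector.
        return True

    # Older browsers without Sec-Fetch-Site: compare Origin with Host.
    return urlsplit(origin).netloc == headers.get("host")
```

The appeal of this style of check is that templates no longer need hidden token fields: the browser itself reports whether a request crossed a site boundary.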

Trusted access for the next era of cyber defense

OpenAI's answer to Claude Mythos appears to be a new model called GPT-5.4-Cyber:

In preparation for increasingly more capable models from OpenAI over the next few months, we are fine-tuning our models specifically to enable d...

Cybersecurity Looks Like Proof of Work Now

The UK's AI Safety Institute recently published "Our evaluation of Claude Mythos Preview’s cyber capabilities", their own independent analysis of Claude Mythos which backs up Anthropic's claims that it is exceptionally effective at id...

Quoting Steve Yegge

The TL;DR is that Google engineering appears to have the same AI adoption footprint as John Deere, the tractor company. Most of the industry has the same internal adoption curve: 20% agentic power users, 20% outright refusers, 60% still using Cursor or equivalent chat tool. It turns out Google has this curve too. [...]

There has been an industry-wide hiring freeze for 18+ months, during which time nobody has been moving jobs. So there are no clued-in people coming in from the outside to tell Google how far behind they are, how utterly mediocre they have become as an eng org.

Steve Yegge

Tags: steve-yegge, google, generative-ai, agentic-engineering, ai, llms

Exploring the new `servo` crate

Research: Exploring the new `servo` crate

In "Servo is now available on crates.io" the Servo team announced the initial release of the servo crate, which packages their browser engine as an embeddable library. I set Claude Code for web the task of figuring out what it can ...

Quoting Bryan Cantrill

The problem is that LLMs inherently lack the virtue of laziness. Work costs nothing to an LLM. LLMs do not feel a need to optimize for their own (or anyone's) future time, and will happily dump more and more onto a layercake of garbage. Left unchecked, LLMs will make systems larger, not better — appealing to perverse vanity metrics, perhaps, but at the cost of everything that matters.

As such, LLMs highlight how essential our human laziness is: our finite time forces us to develop crisp abstractions in part because we don't want to waste our (human!) time on the consequences of clunky ones.

Bryan Cantrill, The peril of laziness lost

Tags: bryan-cantrill, ai, llms, ai-assisted-programming, generative-ai