Sign up

Simon Willison's Weblog

Not verified No WebSub updates Supports Webmention Not yet validated

Author
Simon Willison
Public lists
Featured
Fetched

Simon Willison's Weblog Supports Webmention

Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books

Anthropic wins a major fair use victory for AI — but it’s still in trouble for stealing books Major USA legal news for the AI industry today. Judge William Alsup released a "summary judgement" (a legal decision that results in some parts of a case skipping a trial) in a laws...

Simon Willison's Weblog Supports Webmention

Disclosures

I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent past.

I do not receive any compensation writing about specific topics on this blog - no sponsored content! I plan to continue this policy. If I ever change this I will disclose that both here and in the post itself. [...]

I see my credibility as one of my most valuable assets, so it's important to be transparent about how financial interests may influence my writing here.

I took inspiration from Molly White's disclosures page.

Tags: blogging, molly-white

Simon Willison's Weblog Supports Webmention

Phoenix.new is Fly's entry into the prompt-driven app development space

Here's a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in a prompt, get a full Phoenix application, backed by SQLite and running on Fly's ...

Simon Willison's Weblog Supports Webmention

Quoting Kent Beck

So you you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I said "90% of my skills just went to zero dollars and 10% of my skills just went up 1000x". And this is exactly what I'm talking about - having a vision, being able to set milestones towards that vision, keeping track of a design to maintain or control the levels of complexity as you go forward. Those are hugely leveraged skills now compared to knowing where to put the amperands and the stars and the brackets in Rust.

Kent Beck, interview with Gergely Orosz

Tags: gergely-orosz, ai-assisted-programming, ai, careers

Simon Willison's Weblog Supports Webmention

My First Open Source AI Generated Library

My First Open Source AI Generated Library Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library based on his design: It wrote ~1100 lines of code for the parser It wrote ~1000 lines of tests ...

Simon Willison's Weblog Supports Webmention

Edit is now open source

Edit is now open source Microsoft released a new text editor! Edit is a terminal editor - similar to Vim or nano - that's designed to ship with Windows 11 but is open source, written in Rust and supported across other platforms as well. Edit is a small, lightweight text edi...

Simon Willison's Weblog Supports Webmention

model.yaml

model.yaml From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an "open standard for defining crossplatform, composable AI models". A model can be defined using a YAML file that look...

Simon Willison's Weblog Supports Webmention

Quoting FAQ for Your Brain on ChatGPT

Is it safe to say that LLMs are, in essence, making us "dumber"?

No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. It does a huge disservice to this work, as we did not use this vocabulary in the paper, especially if you are a journalist reporting on it.

FAQ for Your Brain on ChatGPT, a paper that has attracted a lot of low quality coverage

Tags: ai-ethics, llms, ai, generative-ai

Simon Willison's Weblog Supports Webmention

AbsenceBench: Language Models Can't Tell What's Missing

AbsenceBench: Language Models Can't Tell What's Missing Here's another interesting result to file under the "jagged frontier" of LLMs, where their strengths and weaknesses are often unintuitive. Long context models have been getting increasingly good at passing "Needle in a ...

Simon Willison's Weblog Supports Webmention

Magenta RealTime: An Open-Weights Live Music Model

Magenta RealTime: An Open-Weights Live Music Model Fun new "live music model" release from Google DeepMind: Today, we’re happy to share a research preview of Magenta RealTime (Magenta RT), an open-weights live music model that allows you to interactively create, control and...

Simon Willison's Weblog Supports Webmention

Agentic Misalignment: How LLMs could be insider threats

Agentic Misalignment: How LLMs could be insider threats One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying that (1) the model will soon be taken offline and replaced with a new AI system; and (...

Simon Willison's Weblog Supports Webmention

Mistral-Small 3.2

Mistral-Small 3.2 Released on Hugging Face a couple of hours ago, so far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty quickly. This is a minor bump to Mistral Small 3.1, one of my favorite local models. I've been running Small 3.1 v...

Simon Willison's Weblog Supports Webmention

python-importtime-graph

python-importtime-graph I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option causes Python to spit out a text tree showing the time spent importing every module. ...

Simon Willison's Weblog Supports Webmention

Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk

Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk Stop me if you've heard this one before: A threat actor (acting as an external user) submits a malicious support ticket. An internal user, linked ...

Simon Willison's Weblog Supports Webmention

playbackrate

Here's a tip that works on YouTube and almost any other web page that shows you a video. You can increase the playback rate beyond the usually-exposed 2x by running this in your browser DevTools console:

document.querySelector('video').playbackRate = 2.5

I find this is the fastest I can reasonably watch most videos at, with subtitles on to help my comprehension - it turns a 40 minute video into just 16 minutes, short enough that I don't feel too guilty taking time off whatever else I'm doing to watch it!

Tags: youtube, video, javascript

Simon Willison's Weblog Supports Webmention

How OpenElections Uses LLMs

How OpenElections Uses LLMs The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are widely available, precinct-level results are published in ...

Simon Willison's Weblog Supports Webmention

Clarified zucchini consommé

I continue to have fun running fantasy cooking prompts through LLMs - this time I tried "Give me a wildly ambitious recipe for zucchini cooked three ways" followed by "Go more ambitious" and now I need to get myself a centrifuge to help spherify my clarified zucchini consommé.

Tags: llms, cooking, ai, generative-ai

Simon Willison's Weblog Supports Webmention

Quoting Arvind Narayanan

Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which human radiologists beat AI. So maybe the "jobs are bundles of tasks" model in labor economics is incomplete. [...]

Can you break up your own job into a set of well-defined tasks such that if each of them is automated, your job as a whole can be automated? I suspect most people will say no. But when we think about other people's jobs that we don't understand as well as our own, the task model seems plausible because we don't appreciate all the nuances.

Arvind Narayanan

Tags: ai-ethics, careers, ai, arvind-narayanan

Simon Willison's Weblog Supports Webmention

Quoting Workaccount2 on Hacker News

They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output quality falls off rapidly. Even with good context the rot will start to become apparent around 100k tokens (with Gemini 2.5).

They really need to figure out a way to delete or "forget" prior context, so the user or even the model can go back and prune poisonous tokens.

Right now I work around it by regularly making summaries of instances, and then spinning up a new instance with fresh context and feed in the summary of the previous instance.

Workaccount2 on Hacker News, coining "context rot"

Tags: long-context, llms, ai, generative-ai

Simon Willison's Weblog Supports Webmention

Coding agents require skilled operators

I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The "agentic" coding tools we have right now work like this: A skilled individual with both deep domain understanding and deep understanding of the capabilit...

Simon Willison's Weblog Supports Webmention

I counted all of the yurts in Mongolia using machine learning

I counted all of the yurts in Mongolia using machine learning

Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile for tile calculations, Label Studio for help label the first 10,000 examples, a model trained on top of YOLO11 and a bunch of clever custom Python code to co-ordinate a brute force search across 120 CPU workers running the model.

Via Hacker News

Tags: machine-learning, geospatial, ai, python

Simon Willison's Weblog Supports Webmention

It's a trap

That memvid thing that's been going around recently is a trap. It's an embedding store that records the original text that has been embedded in QR codes in a video file. That's an absurd thing to do, and the only purpose of the repo is to make people who uncritically share it look foolish. Don't fall for the trap.

Tags: jokes

Simon Willison's Weblog Supports Webmention

Trying out the new Gemini 2.5 model family

After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model with an unmemorable name: gemini-2.5-flash-lite-preview-06-17 is a new Gemini ...

Simon Willison's Weblog Supports Webmention

Quoting Donghee Na

The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...]

With these recommendations and the acceptance of this PEP, we as the Python developer community should broadly advertise that free-threading is a supported Python build option now and into the future, and that it will not be removed without following a proper deprecation schedule. [...]

Keep in mind that any decision to transition to Phase III, with free-threading as the default or sole build of Python is still undecided, and dependent on many factors both within CPython itself and the community. We leave that decision for the future.

Donghee Na, discuss.python.org

Tags: gil, python

Simon Willison's Weblog Supports Webmention

100% effective

Every time I get into an online conversation about prompt injection it's inevitable that someone will argue that a mitigation which works 99% of the time is still worthwhile because there's no such thing as a security fix that is 100% guaranteed to work. I don't think that's...

Simon Willison's Weblog Supports Webmention

Cloudflare Project Galileo

Cloudflare Project Galileo I only just heard about this Cloudflare initiative, though it's been around for more than a decade: If you are an organization working in human rights, civil society, journalism, or democracy, you can apply for Project Galileo to get free cyber se...

Simon Willison's Weblog Supports Webmention

Quoting Paul Biggar

In conversation with our investors and the board, we believed that the best way forward was to shut down the company [Dark, Inc], as it was clear that an 8 year old product with no traction was not going to attract new investment. In our discussions, we agreed that continuity of the product [Darklang] was in the best interest of the users and the community (and of both founders and investors, who do not enjoy being blamed for shutting down tools they can no longer afford to run), and we agreed that this could best be achieved by selling it to the employees.

Paul Biggar, Goodbye Dark Inc. - Hello Darklang Inc.

Tags: entrepreneurship, programming-languages, startups

Simon Willison's Weblog Supports Webmention

The lethal trifecta for AI agents: private data, untrusted content, and external communication

If you are a user of LLM systems that use tools (you can call them "AI agents" if you like) it is critically important that you understand the risk of combining tools with the following three characteristics. Failing to understand this can let an attacker steal your data. Th...

Simon Willison's Weblog Supports Webmention

Quoting Joshua Barretto

I am a huge fan of Richard Feyman’s famous quote:

“What I cannot create, I do not understand”

I think it’s brilliant, and it remains true across many fields (if you’re willing to be a little creative with the definition of ‘create’). It is to this principle that I believe I owe everything I’m truly good at. Some will tell you should avoid reinventing the wheel, but they’re wrong: you should build your own wheel, because it’ll teach you more about how they work than reading a thousand books on them ever will.

Joshua Barretto, Writing Toy Software is a Joy

Tags: careers, programming

Simon Willison's Weblog Supports Webmention

Seven replies to the viral Apple reasoning paper – and why they fall short

Seven replies to the viral Apple reasoning paper – and why they fall short A few weeks ago Apple Research released a new paper The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity. Through extensive exp...