Simon Willison's Weblog
- Author
- Simon Willison
Disclosures
I've added a Disclosures section to my about page, listing my various sources of income and the companies that directly sponsor my work or have supported it in the recent past.
I do not receive any compensation for writing about specific topics on this blog - no sponsored content! I plan to continue this policy. If I ever change this I will disclose that both here and in the post itself. [...]
I see my credibility as one of my most valuable assets, so it's important to be transparent about how financial interests may influence my writing here.
I took inspiration from Molly White's disclosures page.
Tags: blogging, molly-white
Phoenix.new is Fly's entry into the prompt-driven app development space
Quoting Kent Beck
So you can think really big thoughts and the leverage of having those big thoughts has just suddenly expanded enormously. I had this tweet two years ago where I said "90% of my skills just went to zero dollars and 10% of my skills just went up 1000x". And this is exactly what I'm talking about - having a vision, being able to set milestones towards that vision, keeping track of a design to maintain or control the levels of complexity as you go forward. Those are hugely leveraged skills now compared to knowing where to put the ampersands and the stars and the brackets in Rust.
— Kent Beck, interview with Gergely Orosz
Tags: gergely-orosz, ai-assisted-programming, ai, careers
My First Open Source AI Generated Library
Edit is now open source
model.yaml
Quoting FAQ for Your Brain on ChatGPT
Is it safe to say that LLMs are, in essence, making us "dumber"?
No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on. It does a huge disservice to this work, as we did not use this vocabulary in the paper, especially if you are a journalist reporting on it.
— FAQ for Your Brain on ChatGPT, a paper that has attracted a lot of low quality coverage
Tags: ai-ethics, llms, ai, generative-ai
AbsenceBench: Language Models Can't Tell What's Missing
Magenta RealTime: An Open-Weights Live Music Model
Agentic Misalignment: How LLMs could be insider threats
Mistral-Small 3.2
python-importtime-graph
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk
playbackrate
Here's a tip that works on YouTube and almost any other web page that shows you a video. You can increase the playback rate beyond the usually-exposed 2x by running this in your browser DevTools console:
document.querySelector('video').playbackRate = 2.5
I find this is the fastest I can reasonably watch most videos at, with subtitles on to help my comprehension - it turns a 40 minute video into just 16 minutes, short enough that I don't feel too guilty taking time off whatever else I'm doing to watch it!
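Here's a minimal sketch of the same trick wrapped in a reusable function, plus the arithmetic behind that 40 minute figure. The names `setPlaybackRate` and `watchTimeMinutes` are my own, purely for illustration. It assumes the page exposes its videos as regular `<video>` elements, which won't be true of every player.

```javascript
// Hypothetical helper: set the playback rate on every <video> element.
// `doc` is anything exposing querySelectorAll, such as the browser's `document`.
function setPlaybackRate(doc, rate) {
  const videos = Array.from(doc.querySelectorAll('video'));
  for (const video of videos) {
    video.playbackRate = rate;
  }
  return videos.length; // number of videos that were updated
}

// Wall-clock minutes needed to watch `minutes` of video at `rate` speed.
function watchTimeMinutes(minutes, rate) {
  return minutes / rate;
}

// In the DevTools console you would call:
//   setPlaybackRate(document, 2.5);
console.log(watchTimeMinutes(40, 2.5)); // 16
```

Passing the document in as an argument means the function can be tried outside a browser with a stand-in object.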
Tags: youtube, video, javascript
How OpenElections Uses LLMs
Clarified zucchini consommé
I continue to have fun running fantasy cooking prompts through LLMs - this time I tried "Give me a wildly ambitious recipe for zucchini cooked three ways" followed by "Go more ambitious" and now I need to get myself a centrifuge to help spherify my clarified zucchini consommé.
Tags: llms, cooking, ai, generative-ai
Quoting Arvind Narayanan
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which human radiologists beat AI. So maybe the "jobs are bundles of tasks" model in labor economics is incomplete. [...]
Can you break up your own job into a set of well-defined tasks such that if each of them is automated, your job as a whole can be automated? I suspect most people will say no. But when we think about other people's jobs that we don't understand as well as our own, the task model seems plausible because we don't appreciate all the nuances.
Tags: ai-ethics, careers, ai, arvind-narayanan
Quoting Workaccount2 on Hacker News
They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output quality falls off rapidly. Even with good context the rot will start to become apparent around 100k tokens (with Gemini 2.5).
They really need to figure out a way to delete or "forget" prior context, so the user or even the model can go back and prune poisonous tokens.
Right now I work around it by regularly making summaries of instances, and then spinning up a new instance with fresh context and feed in the summary of the previous instance.
— Workaccount2 on Hacker News, coining "context rot"
Tags: long-context, llms, ai, generative-ai
Coding agents require skilled operators
I counted all of the yurts in Mongolia using machine learning
Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in Mongolia using Google Maps satellite view. The resulting project incorporates mercantile for tile calculations, Label Studio to help label the first 10,000 examples, a model trained on top of YOLO11 and a bunch of clever custom Python code to co-ordinate a brute force search across 120 CPU workers running the model. Via Hacker News
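To give a feel for what "tile calculations" involve, here's a rough sketch of the standard Web Mercator tile math in JavaScript. This is not mercantile's API and not code from Monroe's project. The function names, the bounding box for Mongolia and the zoom level are all assumptions I picked for illustration.

```javascript
// Standard Web Mercator ("slippy map") tile math, as an illustrative sketch.
// Converts a longitude/latitude in degrees to tile indices at a zoom level.
function lonLatToTile(lon, lat, zoom) {
  const n = 2 ** zoom; // number of tiles along each axis at this zoom
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x, y, zoom };
}

// Enumerate every tile that covers a bounding box, the kind of list a
// brute force search could hand out to workers.
function tilesInBounds(west, south, east, north, zoom) {
  const topLeft = lonLatToTile(west, north, zoom);
  const bottomRight = lonLatToTile(east, south, zoom);
  const tiles = [];
  for (let x = topLeft.x; x <= bottomRight.x; x++) {
    for (let y = topLeft.y; y <= bottomRight.y; y++) {
      tiles.push({ x, y, zoom });
    }
  }
  return tiles;
}

// Rough, assumed bounding box for Mongolia (degrees) at an arbitrary zoom.
const mongoliaTiles = tilesInBounds(87, 41, 120, 52, 6);
console.log(mongoliaTiles.length);
```

Each extra zoom level quadruples the number of tiles, which is why a search at satellite-imagery zoom levels needs a lot of workers.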
Tags: machine-learning, geospatial, ai, python
It's a trap
That memvid thing that's been going around recently is a trap. It's an embedding store that records the original text that has been embedded in QR codes in a video file. That's an absurd thing to do, and the only purpose of the repo is to make people who uncritically share it look foolish. Don't fall for the trap.
Tags: jokes
Trying out the new Gemini 2.5 model family
Quoting Donghee Na
The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...]
With these recommendations and the acceptance of this PEP, we as the Python developer community should broadly advertise that free-threading is a supported Python build option now and into the future, and that it will not be removed without following a proper deprecation schedule. [...]
Keep in mind that any decision to transition to Phase III, with free-threading as the default or sole build of Python is still undecided, and dependent on many factors both within CPython itself and the community. We leave that decision for the future.
— Donghee Na, discuss.python.org
100% effective
Cloudflare Project Galileo
Quoting Paul Biggar
In conversation with our investors and the board, we believed that the best way forward was to shut down the company [Dark, Inc], as it was clear that an 8 year old product with no traction was not going to attract new investment. In our discussions, we agreed that continuity of the product [Darklang] was in the best interest of the users and the community (and of both founders and investors, who do not enjoy being blamed for shutting down tools they can no longer afford to run), and we agreed that this could best be achieved by selling it to the employees.
— Paul Biggar, Goodbye Dark Inc. - Hello Darklang Inc.
The lethal trifecta for AI agents: private data, untrusted content, and external communication
Quoting Joshua Barretto
I am a huge fan of Richard Feynman’s famous quote:
“What I cannot create, I do not understand”
I think it’s brilliant, and it remains true across many fields (if you’re willing to be a little creative with the definition of ‘create’). It is to this principle that I believe I owe everything I’m truly good at. Some will tell you should avoid reinventing the wheel, but they’re wrong: you should build your own wheel, because it’ll teach you more about how they work than reading a thousand books on them ever will.
— Joshua Barretto, Writing Toy Software is a Joy
Tags: careers, programming