FeedCity: Simon Willison's Weblog

An agent is an LLM wrecking its environment in a loop

Solomon Hykes just presented the best definition of an AI agent I've seen yet, on stage at the AI Engineer World's Fair:

An AI agent is an LLM wrecking its environment in a loop.

I collect AI agent definitions and I really like how this one combines the currently popular "tools in a loop" one (see Anthropic) with the classic academic definition that I think dates back to at least the 90s:

An agent is something that acts in an environment; it does something. Agents include worms, dogs, thermostats, airplanes, robots, humans, companies, and countries.
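
To make the "tools in a loop" half of that concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `fake_model` is a hard-coded stand-in for an LLM call, and the thermostat-style environment and the two tool names are invented purely for illustration. It is not how any particular agent framework works.

```python
from typing import Callable

def read_temperature(env: dict) -> str:
    """Hypothetical tool: observe the environment."""
    return str(env["temperature"])

def heat(env: dict) -> str:
    """Hypothetical tool: act on the environment, then report the new state."""
    env["temperature"] += 1
    return str(env["temperature"])

TOOLS: dict[str, Callable[[dict], str]] = {
    "read_temperature": read_temperature,
    "heat": heat,
}

def fake_model(observations: list[str]) -> dict:
    """Stand-in for an LLM: choose the next tool call from what it has seen so far."""
    if not observations:
        return {"tool": "read_temperature"}
    if int(observations[-1]) < 20:
        return {"tool": "heat"}
    return {"tool": "done"}

def run_agent(env: dict, model=fake_model, max_steps: int = 10) -> list[str]:
    """The loop: ask the model for an action, run the tool, feed the result back."""
    observations: list[str] = []
    for _ in range(max_steps):
        action = model(observations)
        if action["tool"] == "done":
            break
        observations.append(TOOLS[action["tool"]](env))
    return observations

env = {"temperature": 17}
trace = run_agent(env)
print(trace)  # ['17', '18', '19', '20']
print(env)    # {'temperature': 20}
```

The `max_steps` cap is the one design choice worth noting: the loop changes the environment on every pass, so it needs a bound in case the model never decides it is done.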

Tags: ai-agents, llms, ai, generative-ai

Simon Willison's Weblog
05 Jun 14:27

OpenAI slams court order to save all ChatGPT logs, including deleted chats

This is very worrying. The New York Times v OpenAI lawsuit, now in its 17th month, includes accusations that OpenAI's models can output verbatim copies of New York Times content - both from training d...

Simon Willison's Weblog
05 Jun 10:57

Cracking The Dave & Buster’s Anomaly

Guilherme Rambo reports on a weird iOS Messages bug:

The bug is that, if you try to send an audio message using the Messages app to someone who’s also using the Messages app, and that message happens to include the name “Dave and Buster’s”, the message will never be received.

Guilherme captured the logs from an affected device and spotted an XHTMLParseFailure error.

It turned out the iOS automatic transcription mechanism was recognizing the brand name and converting it to the official restaurant chain's preferred spelling "Dave & Buster’s"... which was then incorrectly escaped and triggered a parse error!
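
As a rough illustration of this class of failure (not the actual iOS code path, which isn't shown here), the sketch below uses Python's standard library to show that a bare `&` inside XML text content is rejected by a parser, while the escaped form `&amp;` is accepted. The `<p>` wrapper and the `parses` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Transcribed text containing a bare ampersand.
transcript = "Dave & Buster's"

def parses(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False on a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Interpolating the raw text leaves a bare "&", which is not well-formed XML.
unescaped = f"<p>{transcript}</p>"

# escape() turns "&" into "&amp;" (and also handles "<" and ">").
escaped = f"<p>{escape(transcript)}</p>"

print(parses(unescaped))  # False
print(parses(escaped))    # True
```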

Tags: xml, ios, xhtml

Simon Willison's Weblog
04 Jun 00:24

PR #537: Fix Markdown in og descriptions

Since OpenAI Codex is now available to us ChatGPT Plus subscribers I decided to try it out against my blog. It's a very nice implementation of the GitHub-connected coding "agent" pattern, as also seen in Google's Jules and Microsoft's...

Simon Willison's Weblog
03 Jun 21:18

Codex agent internet access

Sam Altman, just now: codex gets access to the internet today! it is off by default and there are complex tradeoffs; people should read about the risks carefully and use when it makes sense. This is the Codex "cloud-based software engineering ag...

Simon Willison's Weblog
03 Jun 20:06

Datasette Public Office Hours: Tools in LLM

We're hosting the sixth in our series of Datasette Public Office Hours livestream sessions this Friday, 6th of June at 2pm PST (here's that time in your location).

The topic is going to be tool support in LLM, as introduced here.

I'll be walking through the new features, and we're also inviting five-minute lightning demos from community members who are doing fun things with the new capabilities. If you'd like to present one of those, please get in touch via this form.

Here's a link to add it to Google Calendar.

Tags: datasette-public-office-hours, llm, datasette, generative-ai, llm-tool-use, ai, llms

Simon Willison's Weblog
03 Jun 19:09

Tips on prompting ChatGPT for UK technology secretary Peter Kyle

Back in March New Scientist reported on a successful Freedom of Information request they had filed requesting UK Secretary of State for Science, Innovation and Technology Peter Kyle's ChatGPT logs: New Scientist has obtained records of Kyle’s ChatGPT use under the Freedom o...

Simon Willison's Weblog
03 Jun 18:21

Run Your Own AI

Anthony Lewis published this neat, concise tutorial on using my LLM tool to run local models on your own machine, using llm-mlx.

An under-appreciated way to contribute to open source projects is to publish unofficial guides like this one. Always brightens my day when something like this shows up.

Via @anthonyllewis.bsky.social

Tags: open-source, llm, generative-ai, mlx, ai, llms

Simon Willison's Weblog
03 Jun 05:12

Quoting Benjamin Breen

By making effort an optional factor in higher education rather than the whole point of it, LLMs risk producing a generation of students who have simply never experienced the feeling of focused intellectual work. Students who have never faced writer's block are also students who have never experienced the blissful flow state that comes when you break through writer's block. Students who have never searched fruitlessly in a library for hours are also students who, in a fundamental and distressing way, simply don't know what a library is even for.

— Benjamin Breen, AI makes the humanities more important, but also a lot weirder

Tags: ai-ethics, generative-ai, benjamin-breen, education, ai, llms

Simon Willison's Weblog
03 Jun 05:00

Shisa V2 405B: Japan’s Highest Performing LLM

Leonard Lin and Adam Lensenmayer have been working on Shisa for a while. They describe their latest release as "Japan's Highest Performing LLM". Shisa V2 405B is the highest-performing LLM ever developed in Japan, and surpasses ...

Simon Willison's Weblog
03 Jun 00:12

My AI Skeptic Friends Are All Nuts

Thomas Ptacek's frustrated tone throughout this piece perfectly captures how it feels sometimes to be an experienced programmer trying to argue that "LLMs are actually really useful" in many corners of the internet. Some of the smartest pe...

Simon Willison's Weblog
02 Jun 19:00

Directive prologues and JavaScript dark matter

Tom MacWright does some archaeology and describes the three different magic comment formats that can affect how JavaScript/TypeScript files are processed:

"a directive"; is a directive prologue, most commonly seen with "use strict";.

/** @aPragma */ is a pragma for a transpiler, often used for /** @jsx h */.

//# aMagicComment is usually used for source maps - //# sourceMappingURL=<url> - but also just got used by v8 for their new explicit compile hints feature.

Via Jim Nielsen

Tags: typescript, tom-macwright, javascript, v8, programming-languages

Simon Willison's Weblog
02 Jun 19:00

Quoting Kenton Varda

It took me a few days to build the library [cloudflare/workers-oauth-provider] with AI. I estimate it would have taken a few weeks, maybe months to write by hand. That said, this is a pretty ideal use case: implementing a well-known standard on a well-known platform with a ...

Simon Willison's Weblog
02 Jun 18:12

claude-trace

I've been thinking for a while it would be interesting to run some kind of HTTP proxy against the Claude Code CLI app and take a peek at how it works. Mario Zechner just published a really nice version of that. It works by monkey-patching global.fetch and the No...

Simon Willison's Weblog
02 Jun 04:51

Quoting u/xfnk24001

My constant struggle is how to convince them that getting an education in the humanities is not about regurgitating ideas/knowledge that already exist. It’s about generating new knowledge, striving for creative insights, and having thoughts that haven’t been had before. I don’t want you to learn facts. I want you to think. To notice. To question. To reconsider. To challenge. Students don’t yet get that ChatGPT only rearranges preexisting ideas, whether they are accurate or not.

And even if the information was guaranteed to be accurate, they’re not learning anything by plugging a prompt in and turning in the resulting paper. They’ve bypassed the entire process of learning.

— u/xfnk24001

Tags: generative-ai, chatgpt, education, ai, llms, ai-ethics

Simon Willison's Weblog
01 Jun 05:48

May 2025 on GitHub

OK, May was a busy month for coding on GitHub. I blame tool support!

Tags: github, llm

Simon Willison's Weblog
01 Jun 05:06

Progressive JSON

This post by Dan Abramov is a trap! It proposes a fascinating way of streaming JSON objects to a client in a way that provides the shape of the JSON before the stream has completed, then fills in the gaps as more data arrives... and then turns out to be a sn...

Simon Willison's Weblog
31 May 22:03

How often do LLMs snitch? Recreating Theo's SnitchBench with LLM

A fun new benchmark just dropped! Inspired by the Claude 4 system card - which showed that Claude 4 might just rat you out to the authorities if you told it to "take initiative" in enforcing its moral values while exposing it to evidence of malfeasance - Theo Browne built a...

Simon Willison's Weblog
31 May 21:33

deepseek-ai/DeepSeek-R1-0528

Sadly the trend for terrible naming of models has infested the Chinese AI labs as well. DeepSeek-R1-0528 is a brand new and much improved open weights reasoning model from DeepSeek, a major step up from the DeepSeek R1 they released back in Janua...

Simon Willison's Weblog
31 May 14:36

No build frontend is so much more fun

If you've found web development frustrating over the past 5-10 years, here's something that has worked great for me: give yourself permission to avoid any form of frontend build system (so no npm / React / TypeScript / JSX / Babel / Vite / Tailwind etc) and code in HTML and JavaScript like it's 2009.

The joy came flooding back to me! It turns out browser APIs are really good now.

You don't even need jQuery to paper over the gaps any more - use document.querySelectorAll() and fetch() directly and see how much value you can build with a few dozen lines of code.

Tags: css, javascript, web-development, frontend, html

Simon Willison's Weblog
31 May 14:36

Quoting Steve Krouse

There's a new kind of coding I call "hype coding" where you fully give into the hype, and what's coming right around the corner, that you lose sight of what's possible today. Everything is changing so fast that nobody has time to learn any tool, but we should aim to use as m...

Simon Willison's Weblog
31 May 03:57

Using voice mode on Claude Mobile Apps

Anthropic are rolling out voice mode for the Claude apps at the moment. Sadly I don't have access yet - I'm looking forward to this a lot, I frequently use ChatGPT's voice mode when walking the dog and it's a great way to satisfy my cur...

Simon Willison's Weblog
30 May 14:27

Talking AI and jobs with Natasha Zouves for News Nation

I was interviewed by News Nation's Natasha Zouves about the very complicated topic of how we should think about AI in terms of threatening our jobs and careers. I previously talked with Natasha two years ago about Microsoft Bing. I'll be honest: I was nervous about this one...

Simon Willison's Weblog
29 May 21:30

Saying Bye to Glitch

Pirijan, co-creator of Glitch - who stopped working on it six years ago, so has the benefit of distance: Here lies Glitch, a place on the web you could go to write up a website or a node.js server that would be hosted and updated as you type. 🥀 RIP 2015...

Simon Willison's Weblog
29 May 05:00

llm-github-models 0.15

Anthony Shaw's llm-github-models plugin just got an upgrade: it now supports LLM 0.26 tool use for a subset of the models hosted on the GitHub Models API, contributed by Caleb Brose. The neat thing about this GitHub Models plugin is that it picks up an...

Simon Willison's Weblog
29 May 05:00

First monthly sponsor newsletter tomorrow

I'll be sending out my first curated monthly highlights newsletter tomorrow, only to $10/month and up sponsors. Sign up now if you want to pay me to send you less!

My weekly-ish newsletter remains free; in fact, I just sent out the latest edition.

Tags: blogging

Simon Willison's Weblog
29 May 04:21

llm-mistral 0.14

I added tool support to my plugin for accessing the Mistral API from LLM today, plus support for Mistral's new Codestral Embed embedding model. An interesting challenge here is that I'm not using an official client library for llm-mistral - I rolled my own c...

Simon Willison's Weblog
29 May 04:21

llm-tools-exa

When I shipped LLM 0.26 yesterday one of the things I was most excited about was seeing what new tool plugins people would build for it. Dan Turkel's llm-tools-exa is one of the first. It adds web search to LLM using Exa (previously), a relatively new search en...

Simon Willison's Weblog
28 May 21:51

AI-assisted development needs automated tests

I wonder if one of the reasons I'm finding LLMs so much more useful for coding than a lot of people that I see in online discussions is that effectively all of the code I work on has automated tests. I've been trying to stay true to the idea of a Perfect Commit - one that bu...

Simon Willison's Weblog
28 May 17:45

Codestral Embed

Brand new embedding model from Mistral, specifically trained for code. Mistral claim that: Codestral Embed significantly outperforms leading code embedders in the market today: Voyage Code 3, Cohere Embed v4.0 and OpenAI’s large embedding model. The model i...