FeedCity: Simon Willison's Weblog

Codex CLI 0.128.0 adds /goal

The latest version of OpenAI's Codex CLI coding agent adds their own version of the Ralph loop: you can now set a /goal and Codex will keep on looping until it evaluates that the goal has been completed... or the configured token budget has been exhausted.

It looks like the feature is mainly implemented through the goals/continuation.md and goals/budget_limit.md prompts, which are automatically injected at the end of a turn.
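To make the loop described above concrete, here is a minimal Python sketch of a "keep going until the goal is met or the budget runs out" loop. This is not Codex's actual implementation. `TurnResult`, `run_turn` and `run_goal_loop` are hypothetical names invented for illustration, and the sketch assumes each turn reports its own token usage and a completion verdict.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TurnResult:
    """Outcome of one agent turn (hypothetical shape, for illustration only)."""
    tokens_used: int
    goal_complete: bool


def run_goal_loop(
    run_turn: Callable[[str], TurnResult],
    goal: str,
    token_budget: int,
) -> str:
    """Run turns until the goal is judged complete or the token budget is spent."""
    spent = 0
    while spent < token_budget:
        # Assumption: each turn re-sends the goal, standing in for a
        # continuation prompt injected at the end of the previous turn.
        result = run_turn(goal)
        spent += result.tokens_used
        if result.goal_complete:
            return "complete"
    # Assumption: hitting the budget ends the loop instead of raising an error.
    return "budget_exhausted"
```

In this sketch the budget is only checked between turns, so a single long turn can overshoot it. A real agent would presumably need a finer-grained check.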

Via @fcoury

Tags: ai, openai, prompt-engineering, generative-ai, llms, coding-agents, system-prompts, codex-cli, agentic-engineering

Simon Willison's Weblog
30 Apr 23:27

Our evaluation of OpenAI's GPT-5.5 cyber capabilities

The UK's AI Security Institute previously evaluated Claude Mythos: now they've evaluated GPT-5.5 for finding security vulnerabilities and found it to be comparable to Mythos, but unlike Mythos it's generally available right now.

Tags: ai, openai, generative-ai, llms, anthropic, claude, ai-security-research, gpt

Simon Willison's Weblog
21:42

Quoting Andrew Kelley

It's a common misconception that we can't tell who is using LLM and who is not. I'm sure we didn't catch 100% of LLM-assisted PRs over the past few months, but the kind of mistakes humans make are fundamentally different than LLM hallucinations, making them easy to spot. Furthermore, people who come from the world of agentic coding have a certain digital smell that is not obvious to them but is obvious to those who abstain. It's like when a smoker walks into the room, everybody who doesn't smoke instantly knows it.

I'm not telling you not to smoke, but I am telling you not to smoke in my house.

— Andrew Kelley, Creator of Zig

Tags: zig, llms, ai, generative-ai

Simon Willison's Weblog
19:36

We need RSS for sharing abundant vibe-coded apps

Matt Webb:

I would love an RSS web feed for all those various tools and apps pages, each item with an “Install” button. (But install to where?)

The lesson here is that when vibe-coding accelerates app development, apps become more personal, more situated, and more frequent. Shipping a tool or a micro-app is less like launching a website and more like posting on a blog.

This inspired me to have Claude add an Atom feed (and icon) to my /elsewhere/tools/ page, which itself is populated by content from my tools.simonwillison.net site.
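As a rough illustration of what a feed of tools could look like, here is a minimal Python sketch that builds an Atom document from a list of tools using only the standard library. It is not the code behind the /elsewhere/tools/ feed. The `build_atom_feed` function, its parameters and the dictionary keys (`title`, `url`, `updated`) are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def _atom(tag: str) -> str:
    """Return a tag name qualified with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def build_atom_feed(title: str, feed_url: str, author: str, tools: list) -> str:
    """Serialize tools (hypothetical dicts with title, url, updated) as Atom XML.

    Assumes at least one tool, and that "updated" values are RFC 3339 UTC
    strings such as "2026-04-24T23:12:00Z", so that string order matches
    chronological order.
    """
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    feed = ET.Element(_atom("feed"))
    ET.SubElement(feed, _atom("title")).text = title
    ET.SubElement(feed, _atom("id")).text = feed_url
    ET.SubElement(feed, _atom("updated")).text = max(t["updated"] for t in tools)
    author_el = ET.SubElement(feed, _atom("author"))
    ET.SubElement(author_el, _atom("name")).text = author
    for tool in tools:
        entry = ET.SubElement(feed, _atom("entry"))
        ET.SubElement(entry, _atom("title")).text = tool["title"]
        ET.SubElement(entry, _atom("id")).text = tool["url"]
        ET.SubElement(entry, _atom("updated")).text = tool["updated"]
        ET.SubElement(entry, _atom("link"), href=tool["url"])
    return ET.tostring(feed, encoding="unicode")
```

Each entry's link is the closest thing here to the "Install" button from the quote. Where that link should install to is left open, as in the original.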

Tags: atom, matt-webb, rss, ai, vibe-coding

Simon Willison's Weblog
02:06

The Zig project's rationale for their firm anti-AI contribution policy

Zig has one of the most stringent anti-LLM policies of any major open source project: No LLMs for issues. No LLMs for pull requests. No LLMs for comments on the bug tracker, including translation. English is encouraged, but not required. You are welcome to post in your nati...

Simon Willison's Weblog
01:06

llm 0.32a1

Release: llm 0.32a1

Fixed a bug in 0.32a0 where tool-calling conversations were not correctly reinflated from SQLite. #1426

Tags: llm

Simon Willison's Weblog
01:06

llm 0.32a0

Release: llm 0.32a0

See the annotated release notes.

Tags: llm

Simon Willison's Weblog
29 Apr 19:18

LLM 0.32a0 is a major backwards-compatible refactor

I just released LLM 0.32a0, an alpha release of my LLM Python library and CLI tool for accessing LLMs, with some consequential changes that I've been working towards for quite a while. Previous versions of LLM modeled the world in terms of prompts and responses. Send the mod...

Simon Willison's Weblog
22:33

Quoting OpenAI Codex base_instructions

Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query.

— OpenAI Codex base_instructions, for GPT-5.5

Tags: openai, ai, llms, system-prompts, prompt-engineering, codex-cli, generative-ai, gpt

Simon Willison's Weblog
13:30

Quoting Matthew Yglesias

Five months in, I think I've decided that I don't want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money.

— Matthew Yglesias

Tags: agentic-engineering, vibe-coding, ai-assisted-programming, ai

Simon Willison's Weblog
05:51

What's new in pip 26.1 - lockfiles and dependency cooldowns!

Richard Si describes an excellent set of upgrades to Python's default pip tool for installing dependencies. This version drops support for Python 3.9 - fair enough, since it's been EOL since October. macOS still sh...

Simon Willison's Weblog
03:51

Introducing talkie: a 13B vintage language model from 1930

New project from Nick Levine, David Duvenaud, and Alec Radford (of GPT, GPT-2, Whisper fame). talkie-1930-13b-base (53.1 GB) is a "13B language model trained on 260B tokens of historical pre-1931 English text". talk...

Simon Willison's Weblog
00:42

microsoft/VibeVoice

VibeVoice is Microsoft's Whisper-style audio model for speech-to-text, MIT licensed and with speaker diarization built into the model. Microsoft released it on January 21st, 2026 but I hadn't tried it until today. Here's a one-liner to run it on a Mac wit...

Simon Willison's Weblog
19:15

Tracking the history of the now-deceased OpenAI Microsoft AGI clause

For many years, Microsoft and OpenAI's relationship has included a weird clause saying that, should AGI be achieved, Microsoft's commercial IP rights to OpenAI's technology would be null and void. That clause appeared to end today. I decided to try and track its expression o...

Simon Willison's Weblog
18:15

Speech translation in Google Meet is now rolling out to mobile devices

I just encountered this feature via a "try this out now" prompt in a Google Meet meeting. It kind-of worked!

This is Google's implementation of the ultimate sci-fi translation app, where two people can talk to each other in two separate languages and Meet translates from one to the other and - with a short delay - repeats the text in your preferred language, with a rough imitation of the original speaker's voice.

It can only handle English, Spanish, French, German, Portuguese, and Italian at the moment. It's also still very alpha - I ran it successfully between two laptops running web browsers, but then when I tried between an iPhone and an iPad it didn't seem to work.

Tags: google, translation

Simon Willison's Weblog
17:12

WHY ARE YOU LIKE THIS

@scottjla on Twitter in reply to my pelican riding a bicycle benchmark:

I feel like we need to stack these tests now

I checked to confirm that the model (ChatGPT Images 2.0) added the "WHY ARE YOU LIKE THIS" sign of its own accord and it did - the prompt Scott used was:

Create an image of a horse riding an astronaut, where the astronaut is riding a pelican that is riding a bicycle. It looks very chaotic but they all just manage to balance on top of each other

Tags: text-to-image, pelican-riding-a-bicycle, ai, generative-ai, slop, chatgpt

Simon Willison's Weblog
12:21

Quoting Romain Huet

Since GPT-5.4, we’ve unified Codex and the main model into a single system, so there’s no separate coding line anymore.

GPT-5.5 takes this further, with strong gains in agentic coding, computer use, and any task on a computer.

— Romain Huet, confirming OpenAI won't release a GPT-5.5-Codex model

Tags: generative-ai, gpt, openai, ai, llms

Simon Willison's Weblog
04:39

GPT-5.5 prompting guide

Now that GPT-5.5 is available in the API, OpenAI have released a wealth of useful tips on how best to prompt the new model. Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible...

Simon Willison's Weblog
00:12

llm 0.31

Release: llm 0.31

New GPT-5.5 OpenAI model: llm -m gpt-5.5. #1418

New option to set the text verbosity level for GPT-5+ OpenAI models: -o verbosity low. Values are low, medium, high.

New option for setting the image detail level used for image attachments to OpenAI models: -o image_detail low - values are low, high and auto, and GPT-5.4 and 5.5 also accept original.

Models listed in extra-openai-models.yaml are now also registered as asynchronous. #1395

Tags: gpt, openai, llm

Simon Willison's Weblog
24 Apr 23:12

The people do not yearn for automation

This written and video essay by Nilay Patel explores why AI is unpopular with the general public even as usage numbers for ChatGPT continue to skyrocket. It’s a superb piece of commentary, and something I expect I’ll be thinking about f...

Simon Willison's Weblog
24 Apr 23:12

Millisecond Converter

Tool: Millisecond Converter

LLM reports prompt durations in milliseconds and I got fed up of having to think about how to convert those to seconds and minutes.
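The arithmetic behind a converter like this is small enough to sketch in Python. This is not the tool's actual source. `convert_milliseconds` and the shape of its return value are made up for this example, and rounding to tenths of a second is an arbitrary choice.

```python
def convert_milliseconds(ms: float) -> dict:
    """Convert a duration in milliseconds to seconds, minutes and a short label."""
    seconds = ms / 1000
    minutes = seconds / 60
    # Round to tenths of a second first, so the label never reads "59.96s" as "60.0s".
    tenths = round(ms / 100)
    whole_minutes, remaining_tenths = divmod(tenths, 600)
    return {
        "seconds": seconds,
        "minutes": minutes,
        "display": f"{whole_minutes}m {remaining_tenths / 10:.1f}s",
    }
```

For example, `convert_milliseconds(90500)["display"]` gives `"1m 30.5s"`.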

Tags: tools

Simon Willison's Weblog
24 Apr 23:12

llm-openai-via-codex 0.1a0

Release: llm-openai-via-codex 0.1a0

Hijacks your Codex CLI credentials to make API calls with LLM, as described in my post about GPT-5.5.

Tags: openai, llm, codex-cli

Simon Willison's Weblog
06:12

DeepSeek V4 - almost on the frontier, a fraction of the price

Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both models are 1 million token context Mixtur...

Simon Willison's Weblog
04:54

It's a big one

This week's edition of my email newsletter (aka content from this blog delivered to your inbox) features 4 pelicans riding bicycles, 1 possum on an e-scooter, up to 5 raccoons with ham radios hiding in crowds, 5 blog posts, 8 links, 3 quotes and a new chapter of my Agentic Engineering Patterns guide.

Tags: newsletter

Simon Willison's Weblog
01:57

russellromney/honker

"Postgres NOTIFY/LISTEN semantics" for SQLite, implemented as a Rust SQLite extension and various language bindings to help make use of it. The design of this looks very solid. It lets you write Python code for queues that looks like this: import honker ...

Simon Willison's Weblog
01:57

An update on recent Claude Code quality reports

It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems. The models themselves were not to blame, but three separate issues in the C...

Simon Willison's Weblog
01:57

Serving the For You feed

One of Bluesky's most interesting features is that anyone can run their own custom "feed" implementation and make it available to other users - effectively enabling custom algorithms that can use any mechanism they like to recommend posts. spacecowbo...

Simon Willison's Weblog
23 Apr 22:03

Extract PDF text in your browser with LiteParse for the web

LlamaIndex have a most excellent open source project called LiteParse, which provides a Node.js CLI tool for extracting text from PDFs. I got a version of LiteParse working entirely in the browser, using most of the same libraries that LiteParse uses to run in Node.js. Spati...

Simon Willison's Weblog
23 Apr 20:42

A pelican for GPT-5.5 via the semi-official Codex backdoor API

GPT-5.5 is out. It's available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I've had some preview access and found it to be a fast, effective and highly capable model. As is usually the case these days, it's hard to put into words what's good about it - I ...

Simon Willison's Weblog
23 Apr 14:00

Quoting Maggie Appleton

[...] if you ever needed another reason to learn in public by digital gardening or podcasting or streaming or whathaveyou, add on that people will assume you’re more competent than you are. This will get you invites to very cool exclusive events filled with high-achieving, interesting people, even though you have no right to be there. A+ side benefit.

— Maggie Appleton, Gathering Structures (via)

Tags: blogging, maggie-appleton