Simon Willison's Weblog
- Author
- Simon Willison
Quoting Phil Gyford
Since getting a modem at the start of the month, and hooking up to the Internet, I’ve spent about an hour every evening actually online (which I guess is costing me about £1 a night), and much of the days and early evenings fiddling about with things. It’s so complicated. All the hype never mentioned that. I guess journalists just have it all set up for them so they don’t have to worry too much about that side of things. It’s been a nightmare, but an enjoyable one, and in the end, satisfying.
— Phil Gyford, Diary entry, Friday February 17th 1995 1.50 am
Tags: phil-gyford, computer-history
Claude Code for web - a new asynchronous coding agent from Anthropic
Getting DeepSeek-OCR working on an NVIDIA Spark via brute force using Claude Code
TIL: Exploring OpenAI's deep research API model o4-mini-deep-research
I landed a PR by Manuel Solorzano adding pricing information to llm-prices.com for OpenAI's o4-mini-deep-research and o3-deep-research models, which they released in June and document here.
I realized I'd never tried these before, so I put o4-mini-deep-research through its paces researching locations of surviving orchestrions for me (I really like orchestrions).
The API cost me $1.10 and triggered a small flurry of extra vibe-coded tools, including this new tool for visualizing Responses API traces from deep research models and this mocked up page listing the 19 orchestrions it found (only one of which I have fact-checked myself).
Tags: ai, openai, generative-ai, llms, deep-research, vibe-coding
The AI water issue is fake
Andrej Karpathy — AGI is still a decade away
Quoting Alexander Fridriksson and Jay Miller
Using UUIDv7 is generally discouraged for security when the primary key is exposed to end users in external-facing applications or APIs. The main issue is that UUIDv7 incorporates a 48-bit Unix timestamp as its most significant part, meaning the identifier itself leaks the record's creation time.
This leakage is primarily a privacy concern. Attackers can use the timing data as metadata for de-anonymization or account correlation, potentially revealing activity patterns or growth rates within an organization.
— Alexander Fridriksson and Jay Miller, Exploring PostgreSQL 18's new UUIDv7 support
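To make the leak described in that quote concrete, here is a minimal Python sketch that reads the creation time back out of a UUIDv7 using only the standard library. The function name `uuid7_timestamp` is my own invention for illustration, and the sample value is the UUIDv7 test vector from RFC 9562 rather than anything from the quoted article.

```python
import datetime
import uuid


def uuid7_timestamp(value: str) -> datetime.datetime:
    """Return the creation time embedded in a UUIDv7 string (hypothetical helper)."""
    u = uuid.UUID(value)
    if u.version != 7:
        raise ValueError("not a UUIDv7")
    # The most significant 48 bits of the 128-bit value hold Unix time in milliseconds.
    millis = u.int >> 80
    return datetime.datetime.fromtimestamp(millis / 1000, tz=datetime.timezone.utc)


# Sample value: the UUIDv7 test vector from RFC 9562.
print(uuid7_timestamp("017f22e2-79b0-7cc3-98c4-dc0c0c07398f"))
# → 2022-02-22 19:22:22+00:00
```

Anyone who can see the identifier can run the same few lines, which is why the timestamp counts as exposed metadata rather than something an attacker has to work for.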
Tags: uuid, postgresql, privacy, security
Should form labels be wrapped or separate?
James Edwards notes that wrapping a form input in a label element like this has a significant downside:
<label>Name <input type="text"></label>
It turns out both Dragon NaturallySpeaking for Windows and Voice Control for macOS and iOS fail to understand this relationship!
You need to use the explicit <label for="element_id"> syntax to ensure that voice control software correctly understands the relationship between label and form field. You can still nest the input inside the label if you like:
<label for="idField">Name
<input id="idField" type="text">
</label>
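If you want to audit existing markup for this, a rough check can be sketched with Python's standard `html.parser` module. This is an illustration under my own assumptions, not a tool from the linked article: `LabelChecker` and `check_labels` are hypothetical names, and the check only looks at `input`, `select` and `textarea` elements.

```python
from html.parser import HTMLParser


class LabelChecker(HTMLParser):
    """Collect label for="..." targets and form-control ids (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.label_targets = []       # values of for="..." found on labels
        self.labels_without_for = 0   # labels relying on wrapping alone
        self.control_ids = set()      # ids seen on form controls

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            if attrs.get("for"):
                self.label_targets.append(attrs["for"])
            else:
                self.labels_without_for += 1
        elif tag in ("input", "select", "textarea") and attrs.get("id"):
            self.control_ids.add(attrs["id"])


def check_labels(html: str) -> list[str]:
    """Return a list of human-readable problems found in the HTML fragment."""
    checker = LabelChecker()
    checker.feed(html)
    checker.close()
    problems = []
    if checker.labels_without_for:
        problems.append(
            f"{checker.labels_without_for} label(s) without a for attribute"
        )
    for target in checker.label_targets:
        if target not in checker.control_ids:
            problems.append(f"label for={target!r} has no matching control id")
    return problems


print(check_labels('<label>Name <input type="text"></label>'))
# → ['1 label(s) without a for attribute']
print(check_labels('<label for="idField">Name <input id="idField" type="text"></label>'))
# → []
```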
Via Chris Ferdinandi
Tags: accessibility, html, screen-readers
Quoting Barry Zhang
Skills actually came out of a prototype I built demonstrating that Claude Code is a general-purpose agent :-)
It was a natural conclusion once we realized that bash + filesystem were all we needed
— Barry Zhang, Anthropic
Tags: skills, claude-code, ai-agents, generative-ai, ai, llms
Claude Skills are awesome, maybe a bigger deal than MCP
NVIDIA DGX Spark + Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0
Quoting Riana Pfefferkorn
Pro se litigants [people representing themselves in court without a lawyer] account for the majority of the cases in the United States where a party submitted a court filing containing AI hallucinations. In a country where legal representation is unaffordable for most people, it is no wonder that pro se litigants are depending on free or low-cost AI tools. But it is a scandal that so many have been betrayed by them, to the detriment of the cases they are litigating all on their own.
— Riana Pfefferkorn, analyzing the AI Hallucination Cases database for CIS at Stanford Law
Tags: ai-ethics, generative-ai, law, hallucinations, ai, llms
Coding without typing the code
Last year the most useful exercise for getting a feel for how good LLMs were at writing code was vibe coding (before that name had even been coined) - seeing if you could create a useful small application through prompting alone.
Today I think there's a new, more ambitious and significantly more intimidating exercise: spend a day working on real production code through prompting alone, making no manual edits yourself.
This doesn't mean you can't control exactly what goes into each file - you can even tell the model "update line 15 to use this instead" if you have to - but it's a great way to get more of a feel for how well the latest coding agents can wield their edit tools.
Tags: coding-agents, ai-assisted-programming, generative-ai, ai, llms
Quoting Catherine Wu
While Sonnet 4.5 remains the default [in Claude Code], Haiku 4.5 now powers the Explore subagent which can rapidly gather context on your codebase to build apps even faster.
You can select Haiku 4.5 to be your default model in /model. When selected, you’ll automatically use Sonnet 4.5 in Plan mode and Haiku 4.5 for execution for smarter plans and faster results.
— Catherine Wu, Claude Code PM, Anthropic
Tags: coding-agents, anthropic, claude-code, generative-ai, ai, llms, sub-agents
Quoting Claude Haiku 4.5 System Card
Previous system cards have reported results on an expanded version of our earlier agentic misalignment evaluation suite: three families of exotic scenarios meant to elicit the model to commit blackmail, attempt a murder, and frame someone for financial crimes. We choose not to report full results here because, similarly to Claude Sonnet 4.5, Claude Haiku 4.5 showed many clear examples of verbalized evaluation awareness on all three of the scenarios tested in this suite. Since the suite only consisted of many similar variants of three core scenarios, we expect that the model maintained high unverbalized awareness across the board, and we do not trust it to be representative of behavior in the real extreme situations the suite is meant to emulate.
Introducing Claude Haiku 4.5
A modern approach to preventing CSRF in Go
NVIDIA DGX Spark: great hardware, early days for the ecosystem
Just Talk To It - the no-bs Way of Agentic Engineering
nanochat
Quoting Slashdot
Slashdot: What's the reason OneDrive tells users this setting can only be turned off 3 times a year? (And are those any three times — or does that mean three specific days, like Christmas, New Year's Day, etc.)
[Microsoft's publicist chose not to answer this question.]
— Slashdot, asking the obvious question
Claude Code sub-agents
Vibing a Non-Trivial Ghostty Feature
Note on 11th October 2025
I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to closely review every line of code they produce. This feels deeply uncomfortable!
Tags: vibe-coding, coding-agents, ai-assisted-programming, generative-ai, ai, llms
An MVCC-like columnar table on S3 with constant-time deletes
simonw/claude-skills
Superpowers: How I'm using coding agents in October 2025
A Retrospective Survey of 2024/2025 Open Source Supply Chain Compromises
Video of GPT-OSS 20B running on a phone
GPT-OSS 20B is a very good model. At launch OpenAI claimed:
"The gpt-oss-20b model delivers similar results to OpenAI o3‑mini on common benchmarks and can run on edge devices with just 16 GB of memory"
Nexa AI just posted a video on Twitter demonstrating exactly that: the full GPT-OSS 20B running on a Snapdragon Gen 5 phone in their Nexa Studio Android app. It requires at least 16GB of RAM, and benefits from Snapdragon using a similar trick to Apple Silicon where the system RAM is available to both the CPU and the GPU.
The latest iPhone 17 Pro Max is still stuck at 12GB of RAM, presumably not enough to run this same model.
Tags: android, ai, openai, generative-ai, local-llms, llms, gpt-oss
