Flux
Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

OpenAI released ChatGPT Images 2.0 today, their latest image generation model. On the livestream Sam Altman said that the leap from gpt-image-1 to gpt-image-2 was equivalent to jumping from GPT-3 to GPT-5. Here's how I put it to the test. My prompt: Do a where's Waldo style image but it's where is the raccoon holding a ham radio gpt-image-1 First as a baseline here's what I got from the older gpt-image-1 using ChatGPT directly: I wasn't able to spot the raccoon - I quickly realized that testing…

Simon Willison's Weblog
Quoting Andreas Påhlsson-Notini

Quoting Andreas Påhlsson-Notini

AI agents are already too human. Not in the romantic sense, not because they love or fear or dream, but in the more banal and frustrating one. The current implementations keep showing their human origin again and again: lack of stringency, lack of patience, lack of focus. Faced with an awkward task, they drift towards the familiar. Faced with hard constraints, they start negotiating with reality. — Andreas Påhlsson-Notini, Less human AI agents, please. Tags: ai-agents, coding-agents, ai

Simon Willison's Weblog
Dark Factories: Rise of the Trycycle

Dark Factories: Rise of the Trycycle

The following article originally appeared on “Dan Shapiro’s blog” and is being reposted here with the author’s permission. Companies are now producing dark factories—engines that turn specs into shipping software. The implementations can be complex and sometimes involve Mad Max metaphors. But they don’t have to be like that. If you want a five-minute factory, […]

O'Reilly Radar — AI/ML
llm-openrouter 0.6

llm-openrouter 0.6

Release: llm-openrouter 0.6 llm openrouter refresh command for refreshing the list of available models without waiting for the cache to expire. I added this feature so I could try Kimi 2.6 on OpenRouter as soon as it became available there. Here's its pelican - this time as an HTML page because Kimi chose to include an HTML and JavaScript UI to control the animation. Transcript here. Tags: openrouter, llm, llm-release, pelican-riding-a-bicycle, kimi, ai-in-china, llms, ai, generative-ai

Simon Willison's Weblog
Gradient-based Planning for World Models at Longer Horizons

Gradient-based Planning for World Models at Longer Horizons

.grasp-results-table table { font-size: 0.875rem; line-height: 1.35; width: 100%; } .grasp-results-table th, .grasp-results-table td { padding: 0.35rem 0.5rem; } /* Consistent whitespace between major sections (this post is long and hr-heavy) */ article.post-content h2 { margin-top: 2.75rem; margin-bottom: 0.75rem; } article.post-content h2:first-of-type { margin-top: 2.25rem; } article.post-content h3 { margin-top: 1.65rem; margin-bottom: 0.5rem; } article.post-content hr { margin-top: 2.5rem;…

BAIR Blog
SQL functions in Google Sheets to fetch data from Datasette

SQL functions in Google Sheets to fetch data from Datasette

TIL: SQL functions in Google Sheets to fetch data from Datasette I put together some notes on patterns for fetching data from a Datasette instance directly into Google Sheets - using the importdata() function, a "named function" that wraps it or a Google Apps Script if you need to send an API token in an HTTP header (not supported by importdata().) Here's an example sheet demonstrating all three methods. Tags: spreadsheets, datasette, google

Simon Willison's Weblog
Claude Token Counter, now with model comparisons

Claude Token Counter, now with model comparisons

Claude Token Counter, now with model comparisons I upgraded my Claude Token Counter tool to add the ability to run the same count against different models in order to compare them. As far as I can tell Claude Opus 4.7 is the first model to change the tokenizer, so it's only worth running comparisons between 4.7 and 4.6. The Claude token counting API accepts any Claude model ID though so I've included options for all four of the notable current models (Opus 4.7 and 4.6, Sonnet 4.6, and Haiku…

Simon Willison's Weblog