Flux
Quantization from the ground up

Quantization from the ground up

Quantization from the ground up Sam Rose continues his streak of publishing spectacularly informative interactive essays, this time explaining how quantization of Large Language Models works (which he says might be "the best post I've ever made".) Also included is the best visual explanation I've ever seen of how floating point numbers are represented using binary digits. I hadn't heard about outlier values in quantization - rare float values that exist outside of the normal tiny-value…

Simon Willison's Weblog
Thoughts on slowing the fuck down

Thoughts on slowing the fuck down

Thoughts on slowing the fuck down Mario Zechner created the Pi agent framework used by OpenClaw, giving considerable credibility to his opinions on current trends in agentic engineering. He's not impressed: We have basically given up all discipline and agency for a sort of addiction, where your highest goal is to produce the largest amount of code in the shortest amount of time. Consequences be damned. Agents and humans both make mistakes, but agent mistakes accumulate much faster: A human is a…

Simon Willison's Weblog
datasette-llm 0.1a1

datasette-llm 0.1a1

Release: datasette-llm 0.1a1 New release of the base plugin that makes models from LLM available for use by other Datasette plugins such as datasette-enrichments-llm. New register_llm_purposes() plugin hook and get_purposes() function for retrieving registered purpose strings. #1 One of the responsibilities of this plugin is to configure which models are used for which purposes, so you can say in one place "data enrichment uses GPT-5.4-nano but SQL query assistance happens using Sonnet 4.6",…

Simon Willison's Weblog
LiteLLM Hack: Were You One of the 47,000?

LiteLLM Hack: Were You One of the 47,000?

LiteLLM Hack: Were You One of the 47,000? Daniel Hnyk used the BigQuery PyPI dataset to determine how many downloads there were of the exploited LiteLLM packages during the 46 minute period they were live on PyPI. The answer was 46,996 across the two compromised release versions (1.82.7 and 1.82.8). They also identified 2,337 packages that depended on LiteLLM - 88% of which did not pin versions in a way that would have avoided the exploited version. Via @hnykda Tags: packaging, pypi, python,…

Simon Willison's Weblog
Spotting and Avoiding ROT in Your Agentic AI

Spotting and Avoiding ROT in Your Agentic AI

The following article originally appeared on Q McCallum’s blog and is being republished here with the author’s permission. Generative AI agents and rogue traders pose similar insider threats to their employers. Specifically, we can expect companies to deploy agentic AI with broad reach and insufficient oversight. That creates the conditions for a particular flavor of […]

O'Reilly Radar — AI/ML
Auto mode for Claude Code

Auto mode for Claude Code

Auto mode for Claude Code Really interesting new development in Claude Code today as an alternative to --dangerously-skip-permissions: Today, we're introducing auto mode, a new permissions mode in Claude Code where Claude makes permission decisions on your behalf, with safeguards monitoring actions before they run. Those safeguards appear to be implemented using Claude Sonnet 4.6, as described in the documentation: Before each action runs, a separate classifier model reviews the conversation…

Simon Willison's Weblog
Package Managers Need to Cool Down

Package Managers Need to Cool Down

Package Managers Need to Cool Down Today's LiteLLM supply chain attack inspired me to revisit the idea of dependency cooldowns, the practice of only installing updated dependencies once they've been out in the wild for a few days to give the community a chance to spot if they've been subverted in some way. This recent piece (March 4th) piece by Andrew Nesbitt reviews the current state of dependency cooldown mechanisms across different packaging tools. It's surprisingly well supported! There's…

Simon Willison's Weblog