AI Articles — Page 46

DiffusionGemma

Last May Google briefly released an experimental Gemini Diffusion model. I tried the preview at the time and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it. That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, google/diffusiongemma-26B-A4B-it. NVIDIA are currently hosting the model for free on their NIM cloud API. I used that API to generate this pelican,…

10 June 2026

Quoting Jeremy Howard

Easy solution to slow down recursive AI self improvement: The lab with the top-ranked model must agree THEY must not use it for working on frontier AI. But everyone else should have access to it. By definition, this means the frontier doesn't advance. It also has the critical benefit of avoiding a dangerous power imbalance. Anthropic has chosen the opposite of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll…

10 June 2026

The PM’s Playbook for Shipping AI Features That Actually Work in Production

AI

The demo-to-production Death Valley. If you’ve worked on an AI feature, you know the feeling. You start building something that you are excited about and set launch timelines. The model spits out a perfect response, the prototype works magically, and everybody in the room is mentally calculating how big this product will be when […]

10 June 2026

O'Reilly Radar — AI/ML

If Claude Fable stops helping you, you'll never know

Jonathon Ready highlights one of the more eyebrow-raising details from the 319-page system card for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine: In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training…

10 June 2026

Initial impressions of Claude Fable 5

I didn't have early access to today's Claude Fable 5 release, but I've spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It's slow, expensive and has been quite happily churning through everything I've thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can't do. First, let's review the key characteristics. Anthropic claim that Claude Fable 5 offers the same…

9 June 2026

Daily Dose of Data Science

AI

Loop Engineering: Design the System That Prompts Agents

...explained visually.

9 June 2026

llm 0.32a3

Release: llm 0.32a3. Almost entirely written by the new Claude Fable 5; see my write-up for more details. Tags: projects, ai, generative-ai, llms, llm, claude-mythos

9 June 2026

Setting a custom price for a model in AgentsView

TIL: Setting a custom price for a model in AgentsView. I've been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop. Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices. Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different…

9 June 2026

Quoting Andrej Karpathy

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). — Andrej…

9 June 2026

The Subsidy Ended: What Tool-Using Agents Actually Cost

AI

On June 1, GitHub Copilot’s usage-based billing became active for all Copilot plans, and developers reacted quickly and loudly. A Pro plan still costs $10, but it now comes with a monthly pool of AI credits. Those credits are priced at a penny each, and they’re consumed according to the model used and the tokens […]

9 June 2026

O'Reilly Radar — AI/ML

Siri AI at WWDC 2026

Given how badly anyone who took Apple's WWDC 2024 Apple Intelligence announcements at face value got burned, I'm holding to a strict "I'll believe it when I see it" policy for everything they announced today. The new Siri AI features do at least look feasible with today's technology, especially since Apple are licensing a custom Gemini-derived model that they can run on their own Private Cloud Compute. It sounds like they'll be taking advantage of vision LLMs to extract information from the…

8 June 2026

Daily Dose of Data Science

AI

Your Agent Harness Should Repair Itself

...covered with an open-source solution.

8 June 2026

AI

Announcing major new donations, and recapping the 2025 fundraiser

This past December, we ran our first fundraiser in six years, setting an ambitious goal of $6M. We ended up receiving a total of $1.8M from small donors and $1.6M in matching from the Survival and Flourishing Fund (SFF) for a total of $3.4M. We’re incredibly grateful for all this support! In the rest of […] The post Announcing major new donations, and recapping the 2025 fundraiser appeared first on Machine Intelligence Research Institute.

8 June 2026

MIRI Blog

AI

Long-Running Agents

The following article originally appeared on Addy Osmani’s blog and is being reposted here with the author’s permission. A long-running AI agent can keep making progress over hours, days, or weeks. It can do this across many context windows and sandboxes, recover from failure, leave structured artifacts behind, and resume where it left off. For […]

8 June 2026

O'Reilly Radar — AI/ML