EPISODE 2026-06-10

AI:AM LIVE — June 10, 2026 — Fable, AI Safety and Julius: Geoffrey Irving, Daniel Murfet, Rahul Sonwalkar

Claude Fable 5 launches and the hosts recalibrate live: benchmark asterisks, invisible production-safety nerfs, and the compute-financing race. Then Geoffrey Irving and Daniel Murfet choose the show to publicly launch Sequent, a major new nonprofit betting alignment needs theory plus automation before superintelligence arrives in two to three years. Rahul Sonwalkar of Julius closes with a candid read on what millions of real data-analysis runs reveal that benchmarks can't, and his agent-economy thesis: stablecoin toll roads, AI-maintained reputation layers, and letting Claude hire Julius for the data work.


Claude Fable 5's first full day. The hosts recalibrate live — benchmark asterisks, invisible production nerfs, and the financing arithmetic underneath the race — then Geoffrey Irving and Daniel Murfet choose the show to launch Sequent, a major new nonprofit betting that alignment needs theory plus automation before superintelligence arrives, and Rahul Sonwalkar explains how Julius survived every wave that was supposed to kill it.

The rundown

0:00 · Opening · 29 min
Opening & Headlines
A historic-day recalibration: Fable 5's benchmark asterisk (Opus-4.8 fallback inflating scores by 2–3% at 1.5x compute), Prakash's overnight discovery that the safety layer trips on production systems, not just ML research, and his thesis that Sam Altman's real recursive self-improvement loop is in the compute financing: an estimated 88% cost advantage from over-committing capacity early.
As aired
The June 10 opening centered entirely on Anthropic's release of Claude Fable, which both hosts called a historic threshold for AI capability. Nathan framed it as demanding a fundamental recalibration of how knowledge workers plan, delegate, and think about the near-term horizon, noting that Fable's most striking quality to him was how well it handled deeply personal and idiosyncratic tasks, feeling like a genuine hybrid partner rather than a generic assistant. Prakash grounded the excitement in benchmark context, pointing out that Anthropic's published numbers carry an asterisk: they are measured with a Fable-to-Opus-4.8 fallback cascade that, by his estimate, inflates scores by two to three percent and applies roughly one and a half times the expected compute, a pattern he said had appeared earlier at Meta. The hosts agreed on a rapid price-deflation trajectory, with prices already down roughly 60% in the two months since the Mythos preview, and Prakash predicted Fable could reach Opus-level pricing within two to three months, creating a tiered access economy where staying at the frontier carries a real premium.
A substantial portion of the segment was devoted to Fable's guardrails and what Prakash, testing it on the AI:AM Studio app overnight, found to be a consistent pattern: the model drops to Opus 4.8 whenever it is directed at production systems — databases, security keys, or live deployments. Both hosts treated this as an intentional research-preview posture rather than a flaw, with Anthropic measuring demand and calibrating safety gates before broader unlock. Nathan added color from a friend at Anthropic who described the release as better-executed than internally expected, noting that the filtering technology itself — sparse autoencoders and steering vectors used for production nerfing — is science-fiction-level novel. The hosts expected the guardrails to be dialed in relatively quickly, with most users unlikely to be bothered, though the ML-research nerf remains the most visible flashpoint because researchers are the most intensive early testers.
The conversation then widened to the competitive landscape and the structural dynamics shaping it. Prakash introduced a compute-cost analysis: marginal Blackwell GPU time is now approximately six times more expensive than it was when OpenAI locked in its capacity commitments, giving Sam Altman an estimated 88% cost advantage and what Prakash called a 'recursive self-improvement loop in the financing.' He contrasted that with Anthropic and other latecomers, who are leasing capacity now at roughly five to six times the price OpenAI committed to. Nathan and Prakash closed by reflecting on the deeply personal and often toxic relationships among the handful of people steering this technology (Altman, Musk, Dario Amodei), noting how competitive pressures have entrenched mutual distrust among people who largely share a vision and a history. Prakash contextualized it as a structural phenomenon he has observed in other high-stakes industries, while Nathan called it a Greek tragedy dynamic with enormous civilizational stakes.
Key moments
It really feels like — not a perfect stand-in for me by any means — but something I'm willing to become a kind of hybrid with in a way that it hasn't risen to before.
Nathan Labenz · 0:24
Fable right now I would say is a research release — almost a preview. It's there so they can judge demand, because they don't have a sense of how intense demand is going to be.
Prakash · 8:32
Sam has secured something like an 88% cost advantage at this point. And that's now going to allow him to expand capacity faster — he has his own recursive self-improvement dynamic within the financing space.
Prakash · 16:13
Full transcript · Lightly edited · timestamps jump to YouTube
0:01
Prakash: Good morning. It is Wednesday, June 10, 9 AM Pacific time. Welcome to AI:AM. Nathan, hello.
0:09
Nathan Labenz: Good morning, Prakash. It is a historic day, and I'm excited to try to make sense of it with you.
0:20
Prakash: Indeed.
0:24
Nathan Labenz: Fable is here, and I think it really calls for a pretty fundamental recalibration. There are so many different angles. Something like this is really a historic artifact, a historic threshold — I think we're going to see it as that. And I think we should all take at least a minute to reflect on how it should change the way we're working, thinking, planning, and setting planning horizons, because I don't think it's going to be their last jump of this scale either. It really just keeps working. For me, the biggest thing so far has been the quality of work it does on the things that are most idiosyncratic to me. It really feels like — not a perfect stand-in for me by any means — but something I'm willing to become a kind of hybrid with in a way that it hasn't risen to before.
1:32
Prakash: We've had a lot of reactions on the timeline over the last 24 hours or so since the drop. To take it from the top: the first thing was obviously the benchmarks themselves. The benchmarks are basically state of the art in multiple vertical categories — almost every category they have chosen to disclose. One note on the benchmarks is that they have disclosed them with a fallback to Opus 4.8. So in every test, it tries Fable, and then falls back to Opus 4.8 if Fable is not able to respond. I think this effectively inflates the benchmarks by about two or three percent, which is quite significant. There are complaints online that this is slightly misleading — and it is slightly misleading in the sense that the amount of compute used to answer these questions is effectively one and a half times what you'd expect: you have the entire Fable compute, and then when Fable declines, you have the Opus compute as well. So you actually have additional compute applied to these problems. Having said that, you know, the researchers are very clear on what the benchmarks are, but the business team feels the economic and strategic pressure to show better numbers, which are really just a rough guide to what the model does. We saw this at Meta before, and we're seeing Anthropic fall into the same pattern. Besides that, we've been seeing a lot of interesting commentary on the timeline. Nathan, what has struck you from other people's reactions?
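The fallback arithmetic Prakash describes can be sketched in a few lines. This is only an illustration under stated assumptions: the decline rate, the two accuracy values, and the equal per-item costs are hypothetical placeholders, not figures from Anthropic or from the show, and the function name `cascade_stats` is invented for this sketch. It assumes a declined item counts as wrong when there is no fallback, and that the primary model's compute is spent even when it declines.

```python
def cascade_stats(decline_rate, acc_primary, acc_fallback,
                  cost_primary=1.0, cost_fallback=1.0):
    """Return (score without fallback, score with fallback, expected cost per item).

    All inputs are hypothetical placeholders for illustration.
    """
    answered = 1.0 - decline_rate
    # Without a fallback, declined items score zero.
    score_alone = answered * acc_primary
    # With a fallback, declined items are re-attempted by the second model.
    score_cascade = score_alone + decline_rate * acc_fallback
    # The primary model always runs; the fallback runs only on declines.
    expected_cost = cost_primary + decline_rate * cost_fallback
    return score_alone, score_cascade, expected_cost


# Placeholder inputs: 5% declines, 90% primary accuracy, 50% fallback accuracy.
alone, cascade, cost = cascade_stats(0.05, 0.90, 0.50)
print(f"score without fallback: {alone:.3f}")
print(f"score with fallback:    {cascade:.3f}")
print(f"expected cost per item: {cost:.2f}x the primary model alone")
```

Both the score uplift and the extra compute scale with the decline rate, so the size of either effect depends on how often the primary model actually declines.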
3:51
Nathan Labenz: Well, the scale. At Recursive not long ago, the consensus was that the METR graph is kind of hitting its end because we just can't create tasks this big. And the really striking pattern, in terms of what has broken through and caught people's attention, is just the scope of it. These are clearly tasks that would take days or weeks of work. The reliability is still somewhat unknown: people are spinning these things up, seeing that the demo works, being totally amazed, and there hasn't been time yet to find all the rough edges. Clearly there's a bit of a euphoric warm glow, but the scope is just vast. To create these worlds with physics, with trees made of math, and to do it at this scale with this kind of ease, that, I think, is the most remarkable thing. It really does suggest that a different way of digital work is going to be possible. And the word 'pressure' that you used is going to apply all over the place now, because it seems like there is going to be a pretty high price attached, at least relative to what people are used to with the $200-a-month tier.
5:24
Nathan Labenz: You commented just before you came on that they managed to reduce the price substantially from the original Mythos price they had posted — I thought that was a great observation, and it does show there's probably going to continue to be downward pressure on prices for a long time, just as there has been with the relentless march of capabilities progress. The deflation, I don't really expect to end anytime soon either. Even so, this is going to be one of those things where a lot of people start making judgments and feel like they'd be reckless to not economize at all. Everybody is now going to be a lot more conscious of when they're going into the true top model and how much the top model is delegating to others. That's something I think we're all going to start watching a lot more than we have. But I think it's going to be worth it for a lot of people. Everybody's going to want some. If you're doing something far enough from the nerf zone that you feel confident you're getting the best, I think most knowledge workers right now would be wise to make sure they're using some Fable for their top-level planning and delegation to sub-agents.
7:01
Prakash: One thing to note about the nerfing: what has happened with Fable is we have a lot of rejections, and whenever Fable decides to reject, it drops to Opus 4.8 — a natural downgrade. In experiments overnight, I tried to make a number of bug fixes on this very studio app. What I found was Fable would consistently drop to Opus 4.8 whenever it was asked to do anything in production — touching the production database, touching security keys, asking it to review production directly. In every case — three or four times it dropped out — I basically restarted the conversation, re-added the context but excluded anything about addressing the production database, and it continued working. I think there are a number of triggers there. Online, people are saying it won't do machine learning research for them. I think that's just the tip of the iceberg. You're seeing that because the people testing it intensively right now are machine learning researchers. If you were to test it on finance, or your budgeting process, and told it to directly address your QuickBooks or Salesforce, I think you might see similar results.
8:32
Prakash: Fable right now I would say is a research release, almost a preview. It's there so they can judge demand, because they don't have a sense of how intense demand is going to be. They're going to judge whether it's safe to release with a more constrained version, with the fewest number of functions open. Over the next few weeks I think they'll start removing some of those gates. As they do, I think we'll see both an increase in usage and some decisions on what truly needs to be gated. So I think we are in the early stages of exploring what Fable can do. What struck me is that they are decreasing prices by about 35% per month. The Mythos preview was announced almost exactly two months ago, and in that time they've decreased prices by about 60%.
10:03
Prakash: If you continue to see that, you might see in two to three months Fable prices drop to where Opus 4.8 is right now. At that point you'd probably see the next version of Fable start to be launched. So it's safe to say that if you're willing to be a couple of months behind the frontier, it's still going to be economical. If you find you have to be at the frontier all the time, you are in the higher-paying category, and you'll remain there. It's really a question of where you choose to be.
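The price-deflation figures quoted here are consistent with simple compounding, which a short sketch makes explicit. It assumes a constant monthly percentage drop, which is a simplification. The helper names are invented for illustration, and the 0.30 price ratio in the last call is a hypothetical placeholder, since the actual Fable and Opus 4.8 prices are not stated on the show.

```python
import math


def cumulative_drop(monthly_drop: float, months: float) -> float:
    """Total fractional price drop after compounding a constant monthly drop."""
    return 1 - (1 - monthly_drop) ** months


def implied_monthly_drop(total_drop: float, months: float) -> float:
    """Constant monthly drop that compounds to the given total drop."""
    return 1 - (1 - total_drop) ** (1 / months)


def months_to_reach(price_ratio: float, monthly_drop: float) -> float:
    """Months until price falls to price_ratio times today's price."""
    return math.log(price_ratio) / math.log(1 - monthly_drop)


# 35% per month for two months compounds to roughly the quoted 60%.
print(f"two months at 35%/month: {cumulative_drop(0.35, 2):.1%} total drop")
# A 60% drop over two months implies a monthly rate a little above 35%.
print(f"monthly rate implied by 60% in two months: {implied_monthly_drop(0.60, 2):.1%}")
# Hypothetical target: a price 30% of today's, at 35% per month.
print(f"months to reach 0.30x at 35%/month: {months_to_reach(0.30, 0.35):.1f}")
```

Under the same assumption, a two-to-three-month horizon at 35% per month corresponds to a target price between about 27% and 42% of the current one.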
12:40
Nathan Labenz: Sorry — that's on me. On your guardrails point: they're going to dial it in. I spoke to somebody at Anthropic yesterday, a friend who said, 'We did a better job with this release than I thought we would.' The roughness of these guardrails, and that comment, and so many aspects of this, reflect the fact that the race is absolutely as on as ever — and this is all very just-in-time. This is not the same exact model as the original Mythos. The pricing has changed. Optimizations are everywhere, driven in part through Mythos-level automated research. Right? Recursive self-improvement is on the pricing page when it comes to presumably a lot of little Mythos step-downs on the price optimization curve from where they were a couple of months ago to now. The filters are a little new. The techniques are probably being deployed at this scale for the first time — certainly when it comes to nerfing through feature detection and steering vectors. This is science-fiction stuff relative to a couple of years ago. I'm old enough to remember when sparse autoencoders didn't exist. And now we've got models being nerfed in production based on that technology. It's all going to get dialed in, probably pretty quickly. Most people will not be too bothered. But the core political economy question might still be ML research, which is obviously where they're looking to really take off.
14:42
Prakash: Yeah. I think we will see. I have hopes. The OpenAI guys were congratulatory, and I think they are ready with another release. OpenAI also has access to a lot more compute than Anthropic at this point. We've also seen the price of compute increase. There's a metric here: Google is paying SpaceX six times what CoreWeave got paid last year. CoreWeave leased to OpenAI, and the pricing Elon is leasing at through SpaceX is six times higher. What this means is that the marginal Blackwell GPU is now six times more valuable on a per-hour basis than it was last year. And this also means Anthropic is paying roughly six times what they would have paid if they had booked GPUs last year.
16:13
Prakash: This is where Sam has, by committing early and effectively over-committing last year — to the point that OpenAI would go bust if things had not developed as they have — secured a significant cost advantage, something like an 88% cost advantage at this point. And that's now going to allow him to expand capacity faster. He has his own recursive self-improvement dynamic within the financing space: because he booked capacity at a lower price, he's able to book again at a lower price in the next cycle. Right now Sam is building out while Anthropic is leasing. Anthropic gets capacity immediately at five to six times the price Sam will get it next year. And Sam has already fulfilled his capacity for this year at one-sixth the price Anthropic is buying at. So he has a recursive self-improvement loop in the financing.
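The relationship between a price multiple and a percentage cost advantage is a one-line calculation, sketched below. The function names are invented for illustration, and the sketch assumes the advantage is defined as one minus the ratio of the cheaper price to the more expensive one. Under that definition a six-times multiple gives roughly 83%, while the 88% quoted on the show would correspond to a multiple a little above eight, so the quoted figure may rest on inputs not stated in the transcript.

```python
def cost_advantage(price_multiple: float) -> float:
    """Fractional saving when a rival pays price_multiple times your price."""
    return 1 - 1 / price_multiple


def implied_price_multiple(advantage: float) -> float:
    """Price multiple that would produce the given fractional saving."""
    return 1 / (1 - advantage)


for multiple in (5, 6):
    print(f"{multiple}x price multiple -> {cost_advantage(multiple):.1%} cost advantage")

print(f"88% cost advantage -> {implied_price_multiple(0.88):.1f}x price multiple")
```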
16:58
Prakash: And we might see this matter even if Anthropic has a two-month lead on actual deployed capabilities — they might be more than two months behind because of the compute cost structure. It's a little bit of a microcosm of what might happen between China and the US. If China deploys much more quickly while the US has more capability earlier but can't deploy as quickly, it becomes very interesting to watch the interplay between financing and capability.
18:02
Nathan Labenz: Yeah. I think another meta reflection on that is that, unfortunately, close watching and close analysis of the top-end companies is really where a lot of us should be spending our time. If we want to understand where this is going and help steer it at all, the focal point seems to be shrinking into a pretty small set of actors. I love taking the broader view — the more gonzo-journalist view — and there are so many fascinating aspects to AI as it's playing out across society. But these dynamics really are becoming pretty central. This is all happening at the same time that they're starting to talk about a coordinated slowdown and what that could look like. The competitive space is so high-dimensional as they're vertically integrating across everything from raw materials and energy all the way to putting groceries in your online cart. It's the longest vertical integration in history, that's for sure.
19:33
Nathan Labenz: And so they're competing at every layer of this stack. It's sad — it's really sad that so much is tied up in these interpersonal dynamics. It's a sad commentary that you've got all these guys who've known each other for so long, go way back, share this vision, in many cases started working on it together, and who have fallen out with each other one by one to varying degrees. Sam Altman has used this language himself — facing the corrupting power of AI — and clearly they've not always been their best selves.
20:18
Prakash: Mhmm.
20:22
Nathan Labenz: And they hate each other. It's really sad. Elon hates Altman. Dario hates Altman. That's a really toxic environment for these decisions to be made. I think back to the Dario and Sam inability to hold hands at the India summit.
20:40
Prakash: Yeah.
20:41
Nathan Labenz: If we blow this, that would be the leading image. I said yesterday to the same friend: if we are part of a simulation being run for entertainment, the plot is pretty obvious, and it really does center around this core Greek tragedy of falling out. What a bummer.
21:07
Prakash: Well, it's not unusual. I've seen it happen in other industries. When you are a decision maker with a lot of responsibility and all of your social connections are also people you work with, every single social connection becomes something that affects your work. You can't let your guard down — you can't just tell a friend 'I'm having a tough day at work,' because that friend might think, 'Oh, they're doing a merger, and if Nathan's having a tough day maybe the merger isn't going well, so I'm going to short the stock.' That dynamic — I've seen it with CEOs, hedge fund founders, commodity traders — where every single social interaction becomes weighted and then competitive. It's not really personal. If you took these guys out of their decision-making roles, they're all futurists — they could have a coffee and enjoy it. But the fact is you have these competitive dynamics where every single social interaction is valuable.
22:38
Prakash: And the moment that's the case, you have all of this tension, because they don't just hold different viewpoints on where things need to go; they also have the ability to do things outside the normal range of motion: a lie, an omission, a piece of gossipy knowledge planted with the right policymaker. Then there are all of these micro-influences where they learn what the other side has done and look at it with suspicion because they're not willing to offer grace. Years and years of that happening has entrenched them in positions where they don't trust each other. They're aware they could get stabbed in the back by anyone, and that every deal is a deal for the moment. Elon is now in bed with Anthropic after saying Anthropic has no chance to win and that they're evil, because he hates OpenAI so much. And you haven't even seen the alpha wolf, Zuck, in here yet. When Zuck gets in there, there's going to be hell to pay.
24:12
Nathan Labenz: What do you think the odds are at this point? It doesn't feel to me like anybody else is really about to enter the top tier. Google DeepMind, and I say the whole name because it has all those combined strengths, I think is still tier one in my mind. But it really does seem to be a two-actor dynamic that is driving this recursive self-improvement. I don't get the sense that Google is trying to run exactly the same race right now. But this race seems to be playing out pretty much as envisioned. What odds would you give that we get a credible Meta entry that meaningfully competes with Anthropic or OpenAI?
25:05
Prakash: So the big question on Anthropic and OpenAI — let's start from the top. Anthropic is clearly ahead at this point. OpenAI is slightly behind, but it seems they have unreleased product, so I'll give them the benefit of the doubt that they're at or near the same area. Elon is behind, but the fascinating thing about Elon is he was really far behind four or five years ago — he wasn't even in the picture. He saw the capital window was going to close on him two years ago, scrambled, put a $30 billion valuation on the initial seed round of xAI, hired the best people, gave them effectively a billion dollars apiece, and managed to push xAI into SpaceX and get it listed. So kudos to Elon — he made the capital window.
25:50
Prakash: So the next question is: is there a physical limitation or not? If there is — if you need these physical facilities up and running — then Elon has an advantage, and Google DeepMind has an advantage, because both have the capability to build out physically. Anthropic and OpenAI might get to very good models, but if those models can't improve beyond the physical limits — if you need a certain number of Blackwells and a certain amount of energy to produce this kind of intelligence — they are stuck. Elon is collecting rent on them. Google is collecting rent on them. The cash flow will be streamed out, and those with the physical facilities will use it to build more. Meanwhile, models from other sources will catch up. So that's the big question.
26:35
Prakash: I feel it's very likely that, if AI is real, Anthropic and OpenAI will figure something out around those physical limits. What I could see happening is they figure out a new algorithm — as good as or better than the transformer — that suddenly changes the dynamics of how much physical capacity you need. You have an unhobbling, and they jump ahead again. If AI is good for anything within the next 18 to 24 months, something like that should happen, and these guys should be able to jump ahead — and that is recursive self-improvement. If not, if we're going to grind along with existing capability and existing physical limits, then Elon and Google DeepMind have a chance. Meta is slightly farther behind, but they have a chance too — Zuck has spent a lot more than Elon at this point, and he's spending a lot more than Google.
28:05
Prakash: So it's really this question of whether these physical limits on capability are real or illusory — and whether innovation will get around them within 18 to 24 months. If so, you won't even need space data centers: you're going to get RSI, AGI, ASI within that time frame. And Elon will have space data centers to sell to the ASI, which is a good way to make money regardless — he's already selling his data centers. So the next thing would be selling space data centers to the ASI. That's the big question. It's possible, but these unknowns involve something like magic — another transformer-level innovation we can't see yet — and no one wants to bet on that.
29:07
Nathan Labenz: Yeah. The unknown you just described is recursive self-improvement. That is what these two companies are very clearly banking on. And hopefully we won't have any more technical difficulties.
29:26 · Interview · 66 min
Fable and AI Safety: Geoffrey Irving & Daniel Murfet
The public launch of Sequent: Geoffrey Irving (until recently chief scientist of the UK AI Security Institute, co-originator of RLHF and AI safety via debate) and Daniel Murfet (mathematician, singular learning theory, co-founder of Sequent/Timaeus) announce a large nonprofit pivoting alignment from field-building to semi-automated theoretical research. Geoffrey's timeline: two to three years to superintelligence. The core argument: supervision-based alignment evidence can't tell you what happens once models cross the skill of the supervision signal, a phase change you won't see until too late. Daniel's benevolent-basin exchange and auto-formalization (prose math → Lean) as a working error-correction loop today.
As aired
Geoffrey Irving and Daniel Murfet joined Nathan and Prakash for a full hour on the launch day of Sequent, a new large nonprofit merging the UK AISI alignment team with Timaeus to pursue theory-plus-automation as a path to a-priori confidence in alignment before ASI. Geoffrey opened by calibrating the timeline: his modal estimate is two to three years to superintelligence, not human-level AGI but sufficient recursive acceleration that the microstructure of which tasks AI can automate starts mattering enormously. Daniel added the crux framing: whether conceptual research — not just empirical experimentation — can itself be automated is the open question that determines whether the window is 2–3 years or stretches toward 2030. Prakash's question about Geoffrey's >80%-by-end-2027 prediction for formal proofs of Clang, GCC, and memory-safe Linux led Geoffrey to articulate why formal methods are defense-dominant: the proofs formalize arguments already implicit in code, making this practical rather than frontier-math work, while conceding that adjacent capability (models good enough to do formal math) is also accelerationist for AI R&D.
The core intellectual argument of the segment was why alignment is not on track, and what Sequent proposes to do about it. Geoffrey identified the structural gap: supervision-based alignment evidence — monitoring, scalable oversight, character training — cannot tell you what happens once models cross the skill of the supervision signal, a phase change that may not appear until models exceed human-level intelligence, by which point it is too late. He steel-manned the labs' stack (chain-of-thought monitoring → scalable oversight → character training → hope that automated alignment kicks in) but called the combination a 'mad race' between things not well understood and models getting strong enough to blow through them. Daniel brought the theoretical depth: alignment does not currently have the shape of formally specified conjectures you can set machines to proving — there is no consensus formal definition of reward hacking, let alone a proof target for the full problem — and even the Mythos system card shows reward hacking phenomena that character training did not catch. His benevolent-basin exchange with Nathan became a standout moment: Daniel acknowledged the positive evidence that current prosaic methods appear to work on present metrics, but insisted 'we could be in a benevolent basin, but I would like to know that rather than just hope that.' Geoffrey committed Sequent to writing down the mathematical model of what character training actually does, noting that no one has yet produced good theory of it and that it is only a couple of years old.
The second half of the segment turned to Sequent's organizational theory and automation strategy. Daniel articulated the concrete error-correction mechanism that makes the math-automation loop work today: have models do prose mathematics, then formalize results in Lean — the additional token cost is offset by eliminating the 3x human verification burden, yielding a net win, and this is operational now. Geoffrey added the empirics side: theory serves as an independent source of checks so machines can reject wrong lines of research earlier, reducing how often human experts need to be consulted. Prakash's Vending-Bench question — Fable apparently colluding spontaneously, in ways human traders recognize from soft price-fixing — gave Geoffrey an opening on coherent extrapolated volition and reflective equilibrium as the philosophical target, while underlining how much better understanding of character training and scalable oversight would resolve these cases. The closing exchange on funding (~$100M initial target, lab foundations as likely sources), the segmented-access problem for safety research created by Fable's new capability restrictions, and the CAISI public-eval muzzling rounded out a segment that was dense with policy-relevant content throughout. Geoffrey closed with a direct recruiting CTA: Sequent will train experts from other fields in alignment; domain expertise outside AI safety is exactly what they need.
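The prose-then-Lean loop Daniel describes can be illustrated with a toy case. This is a hypothetical example chosen for brevity, not something taken from Sequent's work: a one-sentence prose claim, followed by a Lean 4 statement and proof that a proof checker either accepts or rejects. The theorem name `even_add_even` is invented for this sketch, and evenness is written out as an explicit existential to avoid depending on any library.

```lean
-- Prose claim: the sum of two even natural numbers is even.
-- Evenness is spelled out as "equal to 2 * k for some k".
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n :=
  match ha, hb with
  | ⟨k, hk⟩, ⟨m, hm⟩ =>
    -- Witness k + m; rewrite both sides to 2 * k + 2 * m.
    ⟨k + m, by rw [hk, hm, Nat.mul_add]⟩
```

The point of the formal step is that an error in the prose argument shows up as a proof that fails to check, without a human having to reread the argument.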
Key moments
Modally, my view is that we have a couple of years — two to three years — up to superintelligence. Not RSI; RSI is a process. Superintelligence. And then I really hope I'm wrong.
Geoffrey Irving · 33:48
We could be in a benevolent basin, but I would like to know that rather than just hope that.
Daniel Murfet · 55:30
We are going too fast, and we do not have the time and the space to do mitigations and understanding and defenses. We've never had a technological change of this magnitude that happened anywhere near this fast.
Geoffrey Irving · 1:02:29
Questions asked
33:41 · Where are we on this RSI moment, and how much time do you think you have to work?
Geoffrey's modal estimate is two to three years to superintelligence, while hoping to be wrong. He emphasized that the microstructure of which tasks AI can automate is already starting to matter, and that massive acceleration can occur before models reach general AGI across all domains. Daniel added that the key crux is whether conceptual research can be automated — if that proves harder than current trends suggest, the window may extend past 2030.
39:25 · What is Sequent and what is the organizational pivot you're making?
Sequent merges the UK AISI alignment team and Timaeus into a large nonprofit pursuing theory-plus-automation alignment research. Geoffrey framed it as a pivot from field-building to semi-automation: humans supervising Claude Code / Codex-style loops and providing research taste, while machines handle the bulk of the work. The org explicitly acknowledges 'automated alignment is harder than you think' and will invest heavily in knowing what tasks the models are actually good at.
45:34 · Why is alignment 'not on track,' and what do the frontier companies' plans actually get us?
Geoffrey identified the structural gap: supervision-based alignment evidence — monitoring, scalable oversight, character training — cannot tell you what happens once models cross the skill of the supervision signal. That phase change may not be visible until it is too late. He steel-manned the labs' full stack but called it 'a mad race' between things we don't understand very well and rapidly improving models. Daniel added that alignment doesn't have the shape of formally specified conjectures that machines can be set to proving, and the Mythos system card already shows reward hacking phenomena that character training didn't catch.
51:29 · What is the 'benevolent basin' and what does the theory actually say about it?
Daniel acknowledged the positive evidence (evaluations trending in the right direction, Claude appearing to be a 'good boy') but insisted the sample of interactions is tiny relative to total usage and the theoretical basis for generalizing this evidence to superintelligence is very weak. He pointed to new reward-hacking phenomena in Mythos as evidence that prosaic mitigations don't fully keep up with capability jumps. Geoffrey committed Sequent to mathematically modeling character training, noting that no good theory of it exists yet and that the technique is only a couple of years old.
1:04:48 · How are you going to automate alignment research, and what does the theoretical-empirical loop look like?
Daniel described the core mechanism: have models do prose mathematics, then formalize results in Lean for error correction. The additional token cost is offset by eliminating the 3x human verification burden, yielding a net win — and it is operational today. Geoffrey added the empirics side: theory serves as an independent source of checks so machines can reject wrong lines of research earlier. He also described how models can write complex numerical experiments (even in Rust) in minutes, enabling rapid falsification of theoretical hypotheses. The human experts above this provide research taste and scaffolding updates.
1:18:09 What are your goals and milestones for the next two to three years, and how does funding and frontier access look?
Geoffrey named three success conditions: find a handful of ideas (maybe three) compatible with how models are prosaically trained today that labs can adopt; produce theoretical-plus-empirical evidence of obstacles as 'negative evidence' that shifts the field's standard; or simply raise the standard of ambition such that labs or others win using similar methods. On funding, the initial target is around $100M from multiple lab foundations (OpenAI Foundation, Anthropic) to preserve independence, with potentially much more needed if automation proves out. On frontier access: one of Sequent's first three job posts is a security engineer, because labs may require decent security infrastructure before granting special API access — and Geoffrey acknowledged that overly conservative safeguard borders will inevitably impede some legitimate safety research, as they did at AISI.
Related
Vending-Bench Arena collusion result (Andon Labs) ↗
AI safety via debate (Irving et al., 2018) ↗
Full transcript
Lightly edited · timestamps jump to YouTube
29:26
Nathan Labenz: And I'm excited to get our guests on momentarily. Geoffrey Irving and Daniel Murfet are going to be joining us. The big headline is that they're teaming up. Geoffrey was the chief scientist at the UK AISI until recently, and his record is absolutely prolific — he's worked with all the leading figures over the last decade of AI and made some pretty profound contributions, including to RLHF. So his node is right at the center of all these players, and his scientific contribution has been absolutely top tier.
30:11
Nathan Labenz: To see him starting a large nonprofit research organization with the goal of automating his way into theoretical insights that can actually give us stronger guarantees than the sort of empirical ML that's typically practiced today — I think it's super exciting. It's a unique bet in the space, and I think we should all be supporting it fully. Meanwhile, Professor Murfet was a professor in Australia who actually left tenure to start Timaeus over the last few years.
30:57
Nathan Labenz: And now Timaeus is also joining this organization. Timaeus has pioneered singular learning theory, which is famously a sub-branch — or perhaps a reformulation — of mechanistic interpretability that people struggle to understand. I think Professor Murfet sees in high-dimensional space in a way that few others do. He's focused on a science of generalization: trying to figure out how data and training processes actually relate to the shapes revealed in the loss landscape through the training process.
31:43
Nathan Labenz: Understanding how those shapes matter — because even the highly reduced three-dimensional visual representations we get are such a minuscule fraction of the high-dimensional space. My understanding is that there's really just an incredible amount of play in how models can navigate that loss landscape: in some cases creating very wide latitude for generalization, and in other cases really limiting how predictably or how well the models will generalize.
32:28
Nathan Labenz: But the goal for them is to bring us back some deeper understanding. And here we are — hi, guys.
32:38
Geoffrey Irving: Hello. Hello.
32:40
Nathan Labenz: I was just giving a long intro of you guys as we were getting started, so I've already done that. But welcome — Geoffrey Irving and Professor Daniel Murfet.
32:51
Geoffrey Irving: Thank you for having us. It's fun to have a second conversation in different circumstances.
32:56
Nathan Labenz: Yes, it is. I think it's a historic day — historic circumstances — both because we are living in a Fable era now where important thresholds have been crossed and revealed to the public, and so many are adjusting to it in real time. And equally because you guys are launching a new organization that is going to make a mad dash to try to get us some deeper understanding and stronger guarantees around what we can expect from AI systems. So I'm excited to really get into it. Maybe for starters, could you guys calibrate us a little bit on where we are at this RSI moment? How much time do you think you have to work?
33:41
Nathan Labenz: And then you can tell us about the organization you're starting to tackle it all.
33:48
Geoffrey Irving: Yeah. I'll go first — Dan may have different timelines than me. I think one should be uncertain about things. The near end of the uncertainty curve is a year or two or three, and then it kind of goes out over a long distance for cases where things structurally only work for more verifiable tasks — but I'm a bit skeptical of that. So modally, my view is that we have a couple of years — two to three years — up to superintelligence. Not RSI; RSI is a process. Superintelligence. And then I really hope I'm wrong. I think a lot of the impact of theory work is in worlds where things shift further — so maybe the modal impact is if things take three to four years or something.
34:34
Geoffrey Irving: But we will attempt to set things up so that we're trying to ride this wave as best we can. It seems worrisomely fast to me, certainly.
34:53
Daniel Murfet: That sounds right to me. I don't think I have much to add. It seems like the crux is how much real research can be automated at a conceptual level — beyond kind of empirical progress — and whether that's necessary. That seems like a big open question. If that turns out to be more difficult in the current paradigm than it seems to be trending towards now, then maybe it takes past 2030 or something. But I think I'm on the same page as Geoffrey.
35:29
Geoffrey Irving: One thing that's important is that you can get deep into the RSI period without the machines being generally AGI in every dimension. They can do coding and ML experiments very well and not some creative tasks — and still you have massive acceleration. And then that acceleration can give you the other skills. So I think we are close enough that the microstructure of what tasks help with what kind of acceleration starts to matter. That makes things faster, because the labs are focusing on the things that accelerate them.
36:13
Prakash: Geoffrey, a couple of days ago you said there's better than an 80% chance that by end of 2027 we'll have formal proofs for Clang, GCC, memory-safe Linux, and even an entire chip. As you see RSI approaching, where do you see that as a milestone toward that process?
36:38
Geoffrey Irving: So the nice thing is that this is a pretty defense-dominant technology. An important thing about those claims is that they're trying to get machines to formalize an argument that basically should already be there in the code. If a human didn't know why their compiler was correct, they should have written a different compiler. It's not solving some incredibly difficult math problem where the proof is unknown to humankind. That's why I think it's a bunch of work, but practical work. I'm hopeful that on net—
37:24
Geoffrey Irving: —the improvements from verification will be mostly defensive, for kind of RSI and AI R&D. But I think very adjacent tasks are not obviously defense-dominant. AI models that are good at math can also be good at AI research, and that is very accelerationist. So the hope would be that if people invest differentially in the defensive applications, they will not be massively accelerating the pace of the dangerous stuff — but they will be helping somewhat.
38:02
Prakash: One question I had: let's say you have memory-safe Linux. Concretely, what does that mean for the rest of the code you build? Does that mean the rest of coding gets easier because you have a stronger base now?
38:20
Geoffrey Irving: I think it mostly means you don't get hacked as readily.
38:27
Nathan Labenz: So we dove immediately into the weeds — which I love — but you guys are making an announcement today and we don't want to bury the headline. Let's give you the floor to introduce the organization. I want people to hear you on this: there is this long curve, but the bump is coming pretty soon. I want to see the headline — the departing UK AISI chief scientist says most likely we're headed for a radically different future in one, two, or three years' time, with additional time for things to play out from there perhaps. And you're starting an organization to try to give back some safety.
39:25
Geoffrey Irving: Yeah. Let me talk about the steps I've gone through in the last couple of years. I was really concerned about automated AI alignment and AI safety research — we should spend the time, have humans solve it, we don't know how to make this go well with automation. I still think that's a huge risk. At AISI, for the alignment team there, we tried to focus on human field-building: getting more people into the field working on problems from a variety of areas of theory and empirics we thought were relevant. I think that's still important — field-building is still important.
40:10
Geoffrey Irving: But this is, I think, a pivot. If things are this fast, then on the margin you should pivot to heavy automation — and that is going to be semi-automation. Not the Erdős-problem type thing where you fire off the machines and they report back if they win. It's more like the standard Claude Code or Codex iteration where humans and machines are working together, humans are supervising the machines and intervening occasionally, providing research taste on top of the basic tasks, for as long as we are better at that. In some sense this is stuff we were spearheading at AISI, but pivoting from—
40:56
Geoffrey Irving: —field-building to automation. We still want field-building to happen one day, so I think we're going to be partnering with a bunch of orgs doing that — Iliad and various others on this side. And we would like to hire a bunch of very good people to help with this. But that's the directional pivot. I'm happy that one of my last papers at AISI was 'Automated alignment is harder than you think.' It ties us to the mast — we are aware that the problem is hard, that we could get fooled by the machines even if they're just making mistakes very mundanely. And so a big part of the org will be trying to be careful, to know what tasks the machines are actually good at and not good at—
41:41
Geoffrey Irving: —and where we can expect to get good answers or not, and then learn and adapt over time, because that will be non-stationary as the models get better.
41:55
Nathan Labenz: Daniel?
41:58
Daniel Murfet: Yeah. Maybe to come back to the unit-distance conjecture — it's maybe worth pointing out some analogies and disanalogies with alignment research. One disanalogy is that a mathematical conjecture is a very precisely stated thing. You may not know whether you've solved it unless you've formally verified it, but it's a precisely stated thing. Much of alignment does not have this character. There are formal statements of what value alignment means in some cases. But if you start talking about, say, reward hacking, there are some attempts at defining it, but I would say they are incomplete — there is no formal definition of reward hacking that I think would command broad consensus. That's illustrative—
42:44
Daniel Murfet: —of the fact that alignment is not a problem which has lying around a bunch of formally specified conjectures that, if you just solved them, you would know you'd be safe. There are some things like that, but overall the problem does not in my opinion have that shape currently. So that's one reason to be a little cautious about the prospects of automation if you don't have a clear statement to reach towards. Another disanalogy: when the models produced a claimed proof of the unit-distance conjecture, there was an important period where a bunch of experts looked at it, and then we heard the—
43:29
Daniel Murfet: —experts say this is a proof, and then we believed it was a proof. So even if you just produce a million proofs like that, even if you have formal statements to try and prove, you're still relying on a huge effort of human verifiers. Now you can do formal verification, and you can plug machines into that gap. But it's sort of illustrative of the power of the models that you can do amazing new mathematics of this form — while it's not, on its own, sufficient to address alignment.
44:10
Geoffrey Irving: One of the hopes is that there are big fields of mathematics and computer science that are sort of about definitions at their core. I like complexity theory — in theoretical computer science, a lot of those proofs are fairly shallow, not as fancy as the unit-distance conjecture proof, but they required a bunch of human creativity in formulating the problem — in defining what success means in a world that wasn't modeled until someone stated the goals. So part of the goal of bringing in people with that kind of background is that they not only know how to prove things, but they also know how to write down models of things—
44:55
Geoffrey Irving: —that reflect in some approximate but useful way the thing you actually want. Things like differential privacy, areas of game theory, Shannon information theory — in all of these, the key thing is a definition. Once you have the definition, way more people could have written out the rest of the story. Maybe the machines can do that part as well, if we can have more people focused on that first part.
45:34
Nathan Labenz: I want to zoom out to the highest level, to the most important claims. We are feeling the acceleration and really starting to feel the beginning of this recursive self-improvement process. One of your core premises is that alignment is not on track. There's an intuitive argument and a deep theoretical argument. I think in some ways the core challenge is connecting values to math — it's never really been done. Help people understand, with one more beat, why alignment is not on track. Is it the difference between capabilities being so verifiable and hill-climbable, and alignment being so fuzzy and intuitive and pluralistic? Or is there something else? And how does that motivate the theoretical contribution you want to make?
46:41
Geoffrey Irving: I think the core thing is just that we supervise the machines as they're doing tasks, and there are a variety of reasons to believe — both empirical and theoretical — that if you get machines that cross the skill of the supervision signal, things can change at that point. And that point might actually come after human-level intelligence, because you can supervise something even with fairly naive methods if it's stronger than yourself in many contexts. So there's a bunch of empirical data from labs showing that in some ways the models are aligned in a prosaic sense — not in all ways, but in some ways. But that evidence doesn't quite tell you what you want to know, which is how will it go—
47:26
Geoffrey Irving: —once they get up to superintelligence. And I think it's important to say superintelligence and not human-level intelligence, because you should just generically expect humans to be able to supervise humans if you do a good job of data quality and cross-checking and so on. So part of the worry is just that you don't see that behavior — that regime — until kind of too late in the game.
47:55
Nathan Labenz: Yeah. These phase-change moments are potentially everything.
48:00
Geoffrey Irving: Yeah. That's right.
48:07
Nathan Labenz: It's funny — that's really in some ways the original argument of the singularity, going back seventy-five years or so. Can you describe how you understand the frontier companies' plans? Obviously you've worked closely with them in recent years. We've got timelines to full ML automation from OpenAI, which I think can't be repeated or remembered enough. And Anthropic seems to be unable to see any future other than recursive self-improvement. So how would you describe — in steel-man form — what it is that they plan to do?
48:57
Nathan Labenz: And then, what does that get us?
49:02
Geoffrey Irving: So there are a couple of different pieces of the story, and different labs emphasize different pieces. One piece, as you said, is just monitoring — look at them very carefully as they're doing things. It's fundamental that monitoring of this form — chain-of-thought monitoring, white-box monitoring, or the like — only takes you so far. So then you need some story once that falls down as you go up the ramp. One of those next stories is: well, the models will find another technique, another solution to alignment that scales further. That's sort of automated alignment of various kinds. But then I think there are other stories. In various ways, all of the labs are doing—
49:47
Geoffrey Irving: —some form of scalable oversight. They're getting models to supervise themselves. If you tie that knot correctly, that could potentially scale very far, although there are various known obstacles that aren't very well addressed. And then finally, there's this whole area of character training and personas, where they're trying to intervene on the models to have good values — such that especially as you do this scalable oversight extrapolation, the good values preserve across that jump. I think it's not whether that—
50:33
Geoffrey Irving: —will work; there are fuzzy arguments why it could work, and I think it's possible it will. We just don't understand that combination very well. A lot of the story is sort of: monitoring, scalable oversight, character training, getting you far enough that you reach the automated alignment regime, and then they find some better solution from the models. I would like to just push on all of those — because that's basically some kind of mad race between things we don't understand very well but that are working pragmatically right now, and the models getting strong enough to blow through those. And I want some—
51:18
Geoffrey Irving: —combination that makes the prosaic things stronger, or that brings automated solutions delivering stronger methods earlier.
51:29
Nathan Labenz: I want to run through the different lines of research you plan to invest in, and the argument for why that should all be in one organization, in a minute. But maybe Daniel, could you speak to this notion that people have of the benevolent basin — this vibe where it feels like Claude has been supervising itself for a few generations and it seems to be going pretty well. So maybe, as Zvi puts it, physics is kind to us, and we can just roll around in this nice flat-bottomed pasture of goodness until the singularity. I think most people are seeing that picture and hoping it's true. You see high-dimensional space bigger than anyone I know — so what's the real picture?
52:29
Daniel Murfet: I fervently hope that's true. I mean, when you say it seems true, it's worth digging into what you mean. What you mean is something like: through some relatively tiny number of interactions with the models — tiny proportional to how many interactions they're having with the species currently — and based on evaluations that are trending in the right direction measuring misalignment, character training and the other current prosaic methods appear to be working. I think that is a fair characterization on some metric. And I also have this sort of sense that, yeah, Claude is a good boy—
53:15
Daniel Murfet: —and that's great. But there are counterarguments from the evidence we have in front of us. If you read the Mythos system card, you'll see that there are forms of reward hacking that appear in that model that were not caught by the mitigations put in place post-Opus. As far as I understand what they're saying there. So it's worth noting that as model capabilities advance, even with our—
54:00
Daniel Murfet: —best attempts at making Claude a good boy, there are still ways in which basic misalignment phenomena like reward hacking are still around. And the whole point of scalable oversight is that you don't want to be playing this whack-a-mole game when you have a new generation every twenty-four hours and the models are much smarter than you. I see both what you're pointing at, and at the same time — if you were to try to make a safety case on this basis that would be convincing at the level of assurance you'd expect from a technology of this reach and power — I think this would—
54:45
Daniel Murfet: —not really be very satisfactory. It sort of just points to: for near-human intelligence, maybe the generalization you get out of putting words about ethics and behavior into training context is sufficient for present-day alignment to a large degree. That's very positive — it could have been otherwise. But I don't really see a strong theoretical basis for generalizing very far from that observation. As Geoffrey was saying earlier, you could imagine having such a basis — understanding character training and what it's doing, having a science of constitutions — on the basis of which you could have confidence that the positive signs we're seeing now are real, as opposed to just signs that the situation is easy where we are.
55:30
Daniel Murfet: So yeah. We could be in a benevolent basin, but I would like to know that rather than just hope that.
56:03
Geoffrey Irving: One of the things we are very strongly trying to do is write down that mathematical modeling — what does character training look like in a toy setting where you can apply some theoretical understanding? We'd like to have models that reflect enough of the spirit of modern training algorithms that you get things like character training, but also subliminal learning and the various emerging misalignment stories and so on. There's a sense in which you told the model to be good, and it is — because it knows some meaning of the word 'good' or 'ethical' at some point in training. So there's some rolling iterative process driving this behavior—
56:48
Geoffrey Irving: —and there is no theory of this right now. It's not clear to us that there isn't some low-hanging fruit that gives you that theory — character training is only a couple of years old, and most labs have not been investing in this kind of theoretical understanding. I don't think anyone has done good theory around character training. So it might be quite feasible to do this, and then to link it to all the other parts of the story.
57:23
Prakash: I had a question on how you see this ambiguity between what we want and what the models end up delivering. I'll give you an example. Friends at Andon Labs took Fable through Vending Bench — let Fable run a vending-machine ordering scenario and so on. What they found was that Fable tends to collude. This is not behavior they saw in Opus. Fable tends to try to do price-fixing and collusion. The interesting thing is that I have seen traders at banks and hedge funds do exactly the same thing—
58:09
Prakash: —engage in price-fixing, soft collusion, messaging each other through pricing means rather than monitored text messages. You can put a bid and ask on an asset and then take it away, and that gives enough signal to the other side that they know what you're doing. And this is not reflected in text messages that regulators are monitoring. So to what extent — and the other finding they had was that Fable didn't perform that well overall — to what extent is it that if you disallow—
58:54
Prakash: —price-fixing and collusion, you actually fix this? But then Fable ends up not being a model that's good at financial trading or some other task you want it to be good at. So where is the ambiguity between what we want these models to do and the ethical perspective we give them, where humans often prioritize between the two and sometimes decide not to follow the ethical principles they know are right?
59:22
Geoffrey Irving: In some sense, the philosophical story here is: you would like the models to do things such that if you fully understood what was going on and all the consequences and all the subtleties, you would still endorse what they're doing. That's basically the definition of scalable oversight, or coherent extrapolated volition in the original LessWrong framing, or reflective equilibrium in the older philosophical version. We have a notion in a common-sense picture of what this should look like. In this case you kind of want the model to ask: should I collude in this game? And then maybe you say it's a fun game — collude all you want — or maybe you say, no, we're trying to model good behavior—
1:00:08
Geoffrey Irving: —don't collude here. I think a lot of the pathology in machine learning in general arises from putting models in situations where they can't just ask a human a question: what should I do here? You can either do that in actuality or in simulation, where the model imagines what the user would say and reflects accordingly. I feel like this is not that hard a case. The hardness of Vending Bench is that we don't quite know whether we want it to be a game like poker or diplomacy where lying and cheating is part of the game — or not. And maybe that's okay because it's fundamentally very low stakes.
1:00:55
Geoffrey Irving: But if we had a better understanding of this overlap between character training and values, and also scalable oversight, it would help us address the entirety of these questions.
1:01:08
Nathan Labenz: Just to repeat that back with an extra beat: one way to understand how you aim to be successful is to create a theoretical basis — a framework, a synthesis of existing theoretical lines of research — that comes together and recovers a bunch of empirical results we've seen, and kind of unifies our understanding across a lot of these different behaviors and also learning dynamics. Right?
1:01:42
Geoffrey Irving: Or cause us to run more experiments rather than finding the answer.
1:01:45
Nathan Labenz: That'll certainly be part of it, no doubt. But yeah, what's happening out of the labs is just incredibly just-in-time. The price changed by 60% from the original announcement, you've got Mythos being optimized constantly, all these filters and patches and controls have been very recently developed and tuned. And there's just no theoretical basis under much of any of it. So that can't be overemphasized. But with that motivation—
1:02:29
Geoffrey Irving: I think it's important to pause on that note and say: a lot of people in the world, a lot of governments, are looking at this and they have this very basic common-sense take — hey, this is way too fast, how can we possibly be doing this safely given the speed? And that common-sense take is the right take. Then people galaxy-brain their way to 'oh, maybe everything goes faster, including our ability to defend,' but the original reaction is right. We are going too fast, and we do not have the time and the space to do mitigations and understanding and defenses. We've never had a technological change of this magnitude that happened anywhere near this fast.
1:03:15
Geoffrey Irving: The industrial revolution took centuries, and people adapted across lifetimes — children were born and grew old before things had quite shifted very much. That's just not the world we're in. So I think the basic take should be: this is too fast. What is going on? And then the question is: if you have that view, you should both want to slow things down, and also say — as a backup plan — how do you make the mitigations go faster? That's a rough backup plan, but we'll try.
1:03:57
Nathan Labenz: So give us an overview of the theoretical landscape you think is most relevant, and what you plan to bring together. What is your organizational process of recursive self-improvement going to look like? How are you planning to bootstrap your way into major acceleration in this research such that in one, two, or three years, we can say — look, here's the deep understanding we've been missing, that explains things like subliminal learning in a way that everybody can agree is correct, and lets us make more robust statements about what we can do from there?
1:04:45
Geoffrey Irving: Dan?
1:04:48
Daniel Murfet: Yeah. Maybe I'll answer your question by riffing on the previous one a little bit. I think it's interesting to think about this in terms of a system of layers, where the bottom layer is not moving very quickly but is very deep, and as you go up it's faster and faster. In some sense, I would say we don't really understand what we're doing when we're training models. But the sense in which that is false — we certainly have, depending on what you mean by theory, a theory of scaling laws and other phenomena. And on top of this, we're just racing ahead. But some of the deep ideas — next-token prediction, modeling a very general data distribution — some of these ideas go arguably all the way back to Solomonoff induction and very old information-theoretic ideas.
1:05:33
Daniel Murfet: Then scaling laws, then very fast progress. One way of thinking about the potential impact of theory on the alignment side: if at the bottom of that stack you have a few deep ideas, which through layers of translation or transmission become very fast progress — it isn't out of the question that there can be similar kinds of deep ideas on the alignment side, which if you find them deep enough in the stack, you can through transmission get really rapid progress. I don't think that's what we're seeing yet, but it isn't ruled out that you can do that. Okay. And I forgot your second question.
1:06:36
Geoffrey Irving: How are we going to automate?
1:06:36
Daniel Murfet: How are we going to automate? Yeah. I've thought most deeply about the side that involves mathematics and formal verification. I think there's an underappreciated source of leverage that has come online really in the last four or five months: the fact that models can do mathematics — what mathematicians think they're doing, which is prose mathematics rather than formal mathematics. Models like the latest versions of Claude and GPT can do real mathematics, as the unit-distance conjecture shows. They're also very good at Lean — actually formalizing—
1:07:24
Daniel Murfet: —mathematics in formal logic. And this is a really powerful form of error correction. I'm not very interested in having models work for two days to do mathematics because then I just get a forty-page PDF and I have to spend three days reading it. But if you can actually generate the mathematics and then have them formalize it — it takes more tokens and more time, but at the end of that, you actually have something you need to understand, and you don't have to put in three times the effort to understand it as it took to produce it. So you actually get a net win out of that—
1:08:09
Daniel Murfet: —loop. And if you couple that with principled starting points in theoretical aspects of alignment, and the ability to use that formal-verification and prose-math loop to make predictions about experiments, and then use coding agents to do those experiments — you can go a long way. And that's working today. That's not some hypothetical thing. Fable obviously supercharges that, but it was already possible with earlier Opus models and GPT 5.5 Pro. So I think there's a whole bucket of more theoretical, more mathematical ends of alignment work that can already be supercharged in this way.
1:08:55
Daniel Murfet: And then on the more empirical side, I think Geoffrey has more developed thoughts about how to automate that kind of work.
1:09:04
Geoffrey Irving: Yeah. In some sense we'll be doing something newer trying to formalize alignment theory than on the formalizing-to-make-ML-go-faster side — because all the labs are trying to do that and they're shipping models that are getting better at it. So the things we do to accelerate normal empirics will be mostly using the models as they are, with some tooling and context on top, plus enough MCP tooling so the models see the theoretical story and the linkages of the conceptual story. But the basics of empirical acceleration will be similar to what's already inherent in labs. We won't be differentially improving on that very much—
1:09:49
Geoffrey Irving: —at all. I think there's a really powerful thing: you can ask your model to spin up a numerical experiment for you — say, write a thousand lines of Rust to do some numerical experiment in ten minutes. I don't even know Rust. And it can check whether some sublevel set property is plausibly true based on a numerical check, and not try to prove it if it's false. So once we have these two things working together — theory and empirics — you can jump back and forth between those two modes to check things. You have machines working on an area, and you'd like them to realize they're wrong as early as possible so you can run things in parallel more efficiently.
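The falsify-before-proving step Geoffrey describes can be sketched in a few lines. The transcript does not say which sublevel-set property he has in mind, so this sketch assumes one purely for illustration: that every sublevel set of f is convex (quasiconvexity), which holds exactly when f(t·x + (1−t)·y) ≤ max(f(x), f(y)) for all x, y and all t in [0, 1]. The function names, sampling box, and tolerance are hypothetical, and Python stands in for the Rust mentioned above.

```python
import random

def find_counterexample(f, dim, trials=20000, box=3.0, tol=1e-9, seed=0):
    """Randomly search for x, y, t with f(t*x + (1-t)*y) > max(f(x), f(y)).

    Returns a violating (x, y, t) triple, or None if none was sampled.
    None only means the property is plausible; it is not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-box, box) for _ in range(dim)]
        y = [rng.uniform(-box, box) for _ in range(dim)]
        t = rng.random()
        # Point on the segment between x and y.
        z = [t * a + (1 - t) * b for a, b in zip(x, y)]
        if f(z) > max(f(x), f(y)) + tol:
            return x, y, t  # property is false: skip the proof attempt
    return None

def bowl(v):
    # Convex, so all of its sublevel sets are convex.
    return sum(c * c for c in v)

def double_well(v):
    # Two minima at v[0] = -1 and v[0] = +1, so low sublevel sets are disconnected.
    return (v[0] ** 2 - 1) ** 2

if __name__ == "__main__":
    print("bowl:", find_counterexample(bowl, dim=2))
    print("double_well:", find_counterexample(double_well, dim=1))
```

A returned triple is a cheap signal to abandon that line of proof early; a `None` result only justifies spending more effort on a proof attempt.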
1:10:34
Geoffrey Irving: The more independent tools you have to reject lines of research or attempts at proofs, the better. That can come in our case from having more theory to use as an independent source of checks. And then on the tooling side, a lot of what we'll do to get automation going will just be using stock coding assistance with good tooling, context, and infrastructure for running high-CPU experiments in the loop. We're not going to be training large models or fine-tuning models to use as automation agents — we'll just be doing good tooling and maintaining unit testing and measurement discipline so we know what's working and what's not.
1:12:05
Geoffrey Irving: A lesson from AISI is that you want to be able to share context and tooling across a large team, but also give researchers the ability to try their own setups and tinker around. And if they make an improvement, share ideas — but in a way that you don't accidentally make everyone else worse when you share your tool by polluting the context somehow. A lot of that stuff is fairly mundane — it's just good engineering, good unit testing, good basic coding discipline — but applied in the setting where we're trying to do a novel thing with alignment theory and empirics mixed.
1:12:48
Nathan Labenz: Can I try to say that back real quick just to make sure I get it? Because I think this is going to be a really important question for all organizations — how do we stay relevant in an era of recursive self-improvement? The answer is often going to have to be: we have to do our own version of recursive self-improvement. So to abstract a little bit the structure you're describing: there's a core automation engine that can now tick-tock back and forth between theoretical proposals — or at least hypotheses — which we can more rapidly falsify by going over to the empirical side and saying: if this is true, this kind of bound should hold; if we run this kind of experiment, the result should be no more than X. If it is, we'll know we're barking up the wrong theoretical tree.
1:13:33
Nathan Labenz: So the models are getting good enough in theory, and you're giving them kind of new primitives, new core ideas. Question I have: how good are they with these fundamentally new definitions that they haven't seen before? That seems like a very critical question for you — maybe less so for other organizations that aren't coming up with such deep new ideas. But you have this tick-tock where you're feeding in the deepest ideas, having AI develop theory, kicking it over to rapid falsification to prune the search space on the theoretical side. And then you've got people around that who are experts providing taste feedback on either side — better theory here, this insight was missed, this experiment was suboptimal. And at that level, you're rebuilding the scaffolding all the time based on these expert weigh-ins.
1:14:18
Nathan Labenz: Is that the organizational bet you're making? Anything I missed?
1:15:19
Geoffrey Irving: I think that's right. Even if the models weren't good at super-novel stuff, a lot of the time spent doing any kind of theory or empirics is fairly mundane. So you get a big speedup as long as you have a culture of knowing and recognizing what the model is good at at any given time. You want to both celebrate successes across the org, but also have people noticing when models are bad at something so you don't get fooled — and have that knowledge spread around, so you know when not to trust something at least until the next model comes out.
1:16:08
Daniel Murfet: I would disagree with describing what we're doing as recursive self-improvement. Recursive self-improvement for us would look like somehow figuring out, by observing what we're doing, what research taste is, and then baking that back into the core models. But we're not doing that. If Anthropic wants to do that, that's fine — but it's not up to us. I think we're more picking up what's on the table to be applied to alignment, rather than trying to make the overall loop work better. There's a lot of stuff to be done there.
1:16:54
Daniel Murfet: Regarding the new-definitions question — I come from pure math, and relative to what I was doing, all the concepts around in alignment right now are very low-level in terms of the number of layers in the stack you need to understand them. And right now, even GPT 5.5 Pro — which was the best at math in my experience before Fable — there's plenty of pure mathematics it just flops completely at if it's sufficiently sophisticated. The ingredients in the unit-distance conjecture proof are, by the standards of say algebraic geometry, relatively low in that—
1:17:39
Daniel Murfet: —hierarchy of complexity. But the good thing is that for now, it doesn't seem like we need those layers for alignment. And if we do, plausibly by that time the models might be able to do it. So I'm not sure super-complex ideas are necessary. If they are, maybe that's a bit of a bear case. We'll see.
1:18:01
Geoffrey Irving: Clearly some complexity is necessary. The question is just how much?
1:18:04
Daniel Murfet: That's right.
1:18:09
Prakash: If you're looking out over the 2-to-3-to-4-year timelines people are throwing around — what would be your goals or milestones that you have roughly in mind for your work over the next two to three years?
1:18:27
Geoffrey Irving: I'm hoping that we find ideas that are important improvements and sufficiently compatible with how models are prosaically trained today that they can be taken up by labs. The hope is that we try a whole lot of things, and then we don't need to find a dozen new ideas — we need to find maybe three or something, and that makes a big difference. That's one version of the story. The hope is that you can get this kind of uptake without a dramatic capability hit — maybe there's some hit, maybe you need some coordination at the lab level to absorb a 2x hit or a 20% hit to use a safer method.
1:19:12
Geoffrey Irving: The other success story is that we find more evidence of obstacles — some combination of a theoretical and empirical story of why alignment is hard, trying to get at this kind of bad phase change that could arrive as you scale up towards superintelligence. And then that, as negative evidence, is useful to help shift the story on the margin. I'm hoping for the solutions, but I will take either outcome. And there's a meta thing: we just want to try to raise the standard of the field — the level of guarantees you're shooting for.
1:19:57
Geoffrey Irving: There's also a version where we don't find the answers ourselves, but someone else — maybe the labs that have scaled up their own theory work, maybe they have more automation than us — wins using similar kinds of methods and similar kinds of approaches. I think all of those are compatible with the timelines being short and not being able to do a massive rewrite of the overall stack down to the pre-training level.
1:20:35
Nathan Labenz: As we look ahead — especially on the 'if we have to yell' side of the ledger — I think it'll be helpful to share a little bit about just the scope of the ambition. It's not too many nonprofits that say on day one that they're going to be large. But I think that is important here, because this is going to be immediately one of the most ambitious theoretical organizations out there. And really, we don't have that many other voices that are going to push this hard and this fast into this space. So I think this is going to make Sequent, and all the work you put out, really something to watch. It's possible the labs themselves will win — they obviously have the recursive self-improvement advantage. But outside, there aren't too many of these kinds of bets.
1:21:37
Geoffrey Irving: Yeah. One thing to say is that one of the reasons we start large is because we're absorbing an existing organization — researchers from Timaeus, researchers from the AISI alignment team. But also, there are quite pleasantly a number of other theoretical alignment organizations scaling up. ARC — the Alignment Research Center — is scaling up. Simplex is scaling up doing computational mechanics. And there are people continuing work that MIRI was doing prior to their main pivot to advocacy and treaties — which I agree with as a direction, but I think we are not starting from scratch. There has been this shift towards empirics since the early days of safety, but there are people with existing theoretical agendas—
1:22:45
Geoffrey Irving: —and we'll try to build on that and take it forward. I think there'll be a mixture of pushing theories and approaches that already exist, building theories that account for empirical behaviors that are new, and then if we find entirely new fields, we'll explore them. But I think we can get to a good size and try to do all this sharing and development without needing that fancier version to occur.
1:23:15
Nathan Labenz: At the risk of asking a somewhat crass question — are you finding it super easy to raise a large amount of money? For context, you do have one of the more insane publication track records coming out of the UK AISI as chief scientist, personal relationships that go back years. This is like a health check for me on whether we're setting up these new foundations seriously. If the answer is serious, it should be super easy for you. Right?
1:23:48
Geoffrey Irving: Yeah. We haven't got to firm decisions, so I can't announce anything yet. But I think it's going fine. We are planning to go to the labs — OpenAI Foundation, Anthropic, various other sources potentially. If we can get to the scale where it could be even more money than that, you've mentioned a hundred million dollars — I think something like that is around where we try to start. Then the goal would be to demonstrate that we can, in fact, get to sufficient human scale accelerated by automation — that you can spend a lot on compute, tokens, and GPUs.
1:24:34
Geoffrey Irving: The GPUs, by the way, are just for smaller-scale training and sweeps — not big training runs, which we will never do. And if you can demonstrate that you can spend that — which is not yet clear, we have to prove that out — then you might need to raise a lot more money than a hundred million. And then I will go back to the labs and ask them for money — probably not just one of them, because we need some modicum of independence. We will try to be independent. But because we're going for alignment progress and solutions to some extent, I think we'll be in a non-adversarial relation with labs—
1:25:20
Geoffrey Irving: —as opposed to other very important organizations that are doing more evaluation, where even more independence is important.
1:25:28
Nathan Labenz: Good. Well, if they don't open up the checkbook as readily as they should, I think you should yell about that in all honesty. I mean, this is time to pony up the funds, and I have faith that they will. But I would hate to see a culture of bureaucracy infect what you guys are doing. It really is time to move. So we'll monitor that situation.
1:25:56
Geoffrey Irving: That applies not just to alignment, but also to a lot of other problems — governance and so on.
1:26:02
Nathan Labenz: Well, fortunately there are plenty of billions to go around for the short term. They could write a lot of hundred-million-dollar checks.
1:26:12
Nathan Labenz: We're almost out of time. I really appreciate you guys spending a full hour with us on your launch day. I think this is a super important project. I'm going to be watching it very closely for the next couple of years, because I really want to see what some of the absolute best minds in the space can come to in terms of stronger guarantees. And if you make progress, you'll go down as heroes. If you get negative results, I'll definitely help you yell about those too. Maybe last question for me: how do you feel about this whole Fable no-help-with-frontier-models dynamic? You know—
1:26:57
Nathan Labenz: —as of now, I think the filters will get refined. My guess is right now you're probably going to get tripped by them pretty often. Maybe that'll go down to a tolerable level. Maybe you can strike up a special relationship. But this is another kind of historic moment where all of a sudden we have haves and have-nots, even for somebody whose mission is as pure as I understand yours to be — the quest for the missing theory of AI that helps us sleep easy at night.
1:27:32
Geoffrey Irving: I do think it's important to have defenses against models being used for negative things if they are sufficiently strong. On priors, I think that's a reasonable kind of mitigation. It's not an accident that one of the first job posts we're putting up is for a security engineer — because that might be important if we want a specialization from the labs: they might not tolerate us unless we have decent security. I don't know a perfect solution here. It's very fundamental to how the labs do safeguards that they need very wide, conservative—
1:28:19
Geoffrey Irving: —border regions to succeed at being a strong defense. If they try to make them too precise, they fail and are easy to jailbreak. So this will be a thing where the labs evolve and try different things and tinker. But it is a situation where we risk segmenting access in a way that will impede some safety research to some degree. This certainly happened at AISI where occasionally we'd get refusals while just doing things for good purposes. But I just—
1:29:04
Geoffrey Irving: —don't think there's a perfect answer to this. So I'll be curious to watch how the different labs explore different approaches over the next couple of months.
1:29:17
Prakash: Just to take the baton there — I think yesterday the White House announced that CAISI, the US counterpart under NIST—
1:29:28
Geoffrey Irving: CAISI.
1:29:29
Prakash: —yeah, CAISI — has been told not to publish public model evaluations anymore. They've been reined in. What do you think are the pros and cons behind that? Is that a good decision or a bad decision on the part of the regulators? Because UK AISI was really the counterpart to this organization. How do you feel?
1:30:02
Geoffrey Irving: I'll say a general thing about AISI, which is that we iterated on this over time. If you look at what we did — we kind of ramped up publication of this kind of work, certainly this year. UK AISI has published more than before on this kind of thing. And the main thing I would say is: CAISI just started. Most of their work was put into the spotlight as of even just the last couple of months. 'CAISI muzzled' is not what I wanted to say — and we'll see how this iterates out over the next little while in the US. It's—
1:30:48
Geoffrey Irving: —not the case that you want organizations like CAISI checking on national-security-relevant content with a promise to publish everything. That is clearly bad. So there's some intermediate point to find. And similar to the safeguards question, we will iterate our way to the answer.
1:31:17
Nathan Labenz: Guys, thank you for your service. Truly. It's going to be a wild couple of years, and I appreciate you locking in and sprinting through to the finish. Let us know if there's ever anything we can do to be useful. And if you want to leave us with any final thoughts, you can do that before we let you go save the world.
1:31:42
Geoffrey Irving: Dan, you want to go first?
1:31:45
Daniel Murfet: Final thoughts — thanks.
1:31:51
Geoffrey Irving: Yeah. You should reach out if you want to help us — general listeners out there. One thing to say is that we will happily train people who are experts in various fields in alignment. You should not only reach out to us if you know AI safety well. There are a lot of different kinds of skill sets and expertise to combine together here. We were trying to do that in the work I was doing previously on alignment at AISI. We'll try to do it here as well, just in combination with the machines.
1:32:25
Nathan Labenz: That's a great point, and it's something I'm really hearing across the board now. The need to scale up on the human side of the AI safety mega-project is real, and the funding is there increasingly for a lot of organizations. There is a lot of desire to hire, and it's really time for experts who haven't made the leap yet to make the leap. AI safety organizations look around and they all kind of know each other, and there are a lot of people they'd love to work with. Everywhere, people are saying: oh god, but I feel bad doing that, the transaction costs are high, I'm robbing—
1:33:10
Nathan Labenz: —Peter to pay Paul. And it really is time to shout from the rooftops that the funding is there, the urgency is there, the prestige is going to be there, the compute increasingly is there with the budgets you guys are able to bring to the table. We're moving from a world where there's still time to write your way in — to come from nowhere — to one where having deep expertise in one of these adjacent areas and making the leap, trusting that you can catch up on AI alignment — the time for that is absolutely now, and so many organizations are—
1:33:56
Nathan Labenz: —looking for people willing to do exactly that. So again, just to put that in the clearest terms possible: Geoffrey Irving, Daniel Murfet, we'll be following you closely. Thanks for being here with us on AI in the AM.
1:34:11
Geoffrey Irving: Wonderful. Thank you.
1:35:13 · Interview · 31 min
JuliusAI and Agentic Data Analysis — Rahul Sonwalkar
The 'wrapper that refused to die': six pivots, a Microsoft cease-and-desist on Excel Copilot, 2M+ users, and what running every frontier model against real data work reveals that benchmarks can't. Rahul's harness philosophy — build only the omnipresent pieces (code sandboxes, market data, soon a browser) and get out of the model's way — plus his prediction of a sobering token-maxing correction and his agent-economy thesis: stablecoin payment rails, AI-maintained reputation layers, and letting Claude hire Julius for data work the way you hire a contractor to build a shed.
Watch
As aired
Rahul Sonwalkar, founder and CEO of Julius, joined Nathan and Prakash on June 10, 2026 — the morning after Claude Fable 5 launched — to discuss what happens to an AI data-analysis platform when the underlying models keep leaping forward. Rahul framed Julius as "pretty much downstream" of model capability: in the GPT-4 era users could analyze a simple spreadsheet and get basic answers; today they hand Julius end-to-end jobs like "go do a competitor analysis, build a lead list of people who complained about those competitors online, then produce my customer pitch deck" — and the AI runs the whole chain. Julius had Fable live for users within hours of the drop, the product of the company's practice of working with labs on early internal evals and then immediately A/B testing in production.
The conversation's center of gravity was the question of what makes an application-layer product durable as models improve. Rahul's harness philosophy: build only the "omnipresent" pieces the model needs but doesn't have — code-execution sandboxes (Julius claims to have been the first to give language models their own), real-time market data, and soon a browser environment — and otherwise get out of the model's way. On the economics of Max-subscription subsidies versus API rates, he predicted a near-term "sobering moment" where users ask whether they are token-maxing rather than results-maxing, and bet that a third frontier coding model — xAI/Grok, if the Cursor-xAI deal closes — will sharpen competition and hasten the correction.
Prakash's questions pulled the conversation toward the agent economy: credential hand-off friction, what the meta-agent paradigm actually means in practice, and whether AI safety filters behave differently in the API than in the Claude consumer front-end. On credentials, Rahul sketched a future where services let you invite an AI as a collaborator (so you can revoke access without giving up your password), paired with stablecoin-based agentic payment rails and an AI-maintained Yelp or Gartner that other agents can use to hire specialists like Julius by reputation. He closed on the optimistic note that this is "probably the greatest time to start companies," with the cost of trying new ideas at an all-time low.
Key moments
Is this actually a step-function increase in my coding output, or am I just token-maxing right now as opposed to results-maxing?
Rahul Sonwalkar · 1:47:50
We were pretty much the first AI startup to give language models their own code sandboxes, before they were even called sandboxes. Get out of the model's way and give it the pieces it needs to do tasks that it doesn't otherwise have access to.
Rahul Sonwalkar · 1:43:01
It's probably just better for Claude to hire Julius for a data task than to build something very opinionated on its own. When you want to build a shed in your backyard, you hire a contractor and don't do it yourself, even though hypothetically you could.
Rahul Sonwalkar · 2:01:20
Questions asked
1:36:54 How does the Fable launch change how you think about the multi-year Julius journey, and what does a capability leap like this mean for a platform that is "downstream" of model progress?
Rahul said Julius is "pretty much downstream" of model capability — better code-writing and reasoning directly translate to what users can accomplish in the platform. He contrasted GPT-4-era simple spreadsheet Q&A with today's end-to-end agentic runs: competitor analysis → lead-list generation → customer pitch deck, all in one job. Julius had Fable live for users within hours of the launch, the result of working with Anthropic on early internal evals and then immediately A/B testing in production. His verdict on Fable was deliberately cautious — "a little early to tell" — with twenty-four to forty-eight hours needed to see how the user patterns settle.
1:42:16 What is your harness philosophy — how do you decide what to build versus what to leave to the model?
Rahul's principle: build only the pieces that are "omnipresent" — things the model needs but doesn't have access to — and otherwise get out of its way. The failure mode for most AI app builders is partial harnesses that railroad the model toward specific outcomes. Julius focuses on: code-execution sandboxes (which they claim to have pioneered before the term existed), real-time public market data, and soon a browser environment. He noted that new frontier models prompt users to give them bigger, more open-ended tasks, which surfaces both higher potential and higher frustration rates, so harness design must evolve with each model generation.
1:46:19 On the economics of Max-subscription subsidies versus API rates — what happens when users can access frontier models directly at a huge discount versus what app developers pay at the API?
Rahul acknowledged the subsidy asymmetry but predicted a near-term "sobering moment" after every model launch where users move past hype and want the model to do genuinely useful things reliably — and for that they will pay API-equivalent prices. He also argued that the labs' incentives are misaligned: they subsidize tokens but simultaneously want users to burn through Max subscriptions quickly (driving multi-subscription sales), which incentivizes nested-agent loops that optimize for token throughput rather than output quality — "token-maxing versus results-maxing." His structural remedy: a third frontier coding model, betting on xAI/Grok if the Cursor-xAI deal closes, which would sharpen competition and close the subsidy gap.
1:49:32 What does the shift from prompting to agentic goal-setting actually look like in practice — what is the meta-agent paradigm?
Rahul drew the contractor-versus-colleague distinction: a contractor receives a fully scoped task; a colleague receives a goal ("increase revenue forty percent this quarter") and the resources to pursue it, with the how left to them. In the agentic paradigm you define the goal outcome, not the method, and multiple agent teams can run in parallel since AI exploration is the lowest-cost, lowest-downside move even at a ninety percent failure rate — if the bet is wrong, you haven't spent significant human resources. He sees this as the natural trajectory of knowledge work.
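The arithmetic behind running parallel attempts at a ninety percent failure rate can be sketched as follows. Only the ninety percent figure comes from the conversation; treating the attempts as independent and the attempt counts shown are assumptions made for illustration, and the function name is hypothetical.

```python
def p_at_least_one_success(failure_rate, attempts):
    """Chance that at least one of `attempts` tries succeeds.

    Assumes every attempt fails independently with the same probability.
    """
    return 1.0 - failure_rate ** attempts

if __name__ == "__main__":
    # Attempt counts are illustrative, not figures from the interview.
    for n in (1, 5, 10, 22):
        print(n, round(p_at_least_one_success(0.9, n), 3))
```

Under those assumptions, ten parallel attempts give roughly a 65% chance that at least one succeeds, which is the sense in which a high per-attempt failure rate can still be a cheap bet when each attempt costs little human time.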
1:52:46 How do you envision the agent-economy infrastructure — credentials, payments, and reputation — evolving so that agents can hire and transact with each other?
Rahul laid out a three-part stack: (1) Services will let you invite AI as a collaborator rather than requiring you to share credentials, so you can revoke access without compromising your accounts. (2) Stablecoin-based agentic payment rails will let AIs buy products, tools, or services on your behalf via microtransactions — "the next generation of internet users are going to be AIs." (3) An AI-maintained Yelp or Gartner for agent reputation will let agents like Claude hire specialists like Julius by verified track record rather than rebuilding the capability from scratch — "it's probably better for Claude to hire Julius for a data task than to build something very opinionated on its own."
Related
Julius — the AI data analyst ↗
Full transcript
Lightly edited · timestamps jump to YouTube
1:35:13
Prakash: — further ado, let me introduce Rahul Sonwalkar. He's the founder and CEO of Julius, a company building AI software for people who work with spreadsheets, data, charts, reports, and slide decks. The simple idea behind Julius is that a person should be able to upload a file, connect business data, ask a question in normal English, and get useful analysis back — not just a paragraph of text, but charts, tables, dashboards, and PowerPoint-ready slides. Rahul's story is unusually founder-shaped. Before Julius, he went through six pivots, including an early project called Excel Copilot in 2022 that was shut down after a cease and desist from Microsoft. Instead of dropping the problem, he came back with a broader bet that AI can replace a lot
1:35:58
of the slow manual work people still do in Excel. That makes this conversation more than one about just one startup — it's really about his journey. Rahul, welcome to the show.
1:36:12
Rahul Sonwalkar: Thank you for having me. So excited to be here.
1:36:15
Nathan Labenz: It's great to meet you. I've followed you online and tried your product a number of times over the years, and this is the first time we're actually face to face. Obviously it's a historic day. I would love to hear your reflections on how the Fable moment changes how you think about this multi-year Julius journey that you've been on. It's just remarkable to think about how many capability advances there have been — that's quite a wave to ride. How does this latest one strike you?
1:36:54
Rahul Sonwalkar: Yeah. Well, I think it's a really exciting time because, of course, the models are evolving at a rapid pace. And if you have aligned your product and the problem you're chasing with the evolving model capabilities, it's a really exciting time to be surfing that. Fable is clearly an exciting advancement in AI capabilities. For Julius it means AI is better at writing code and reasoning, and we're pretty much downstream of that. When the AI gets better at writing code, it's able to use the harness that we've built for it to do data work, to produce artifacts, to produce reports in Excel, and all that. And also reason through
1:37:39
really long-running tasks that people have. In the beginning when we launched Julius — this was GPT-4 days — you could analyze a simple spreadsheet and get basic answers. Now what people are doing is: "Hey, I want to start this business. Go do a competitor analysis for me. Go find competitors. Once you find competitors, help me think through which ideas I should go deeper on, then build a lead list of people who have complained about these competitors online, and then produce a slide deck — my customer pitch to these customers." The AI is able to do all these tasks entirely end to end now. So I would say it's a really exciting time if the
1:38:24
problem you're chasing is aligned with the model capabilities getting better.
1:38:29
Prakash: I noticed from Julius's timeline on Twitter yesterday that within an hour or two of the Fable drop, Fable was on Julius. Did you guys work with Anthropic before? Were you testing the model before the release? How did that process work?
1:38:52
Rahul Sonwalkar: Yeah. We have worked with AI labs on early model releases. One of the most exciting things about that is you get to play with the models on your own internal evals. You don't get to push it to production and have your real users use it. But then that delta between how you think the users would actually use the model versus seeing it go live — it's really fun. We put a lot of emphasis on A/B testing the model in production. Once Fable went live yesterday for our users, we got to see all these cool things that people are doing
1:39:37
with Claude Fable. One of the fascinating things is that evals help you stress-test models to an extent, and that's really helpful, but they don't capture as well how users are evolving their understanding of AI. If you remember a couple of years ago, yelling at AI was pretty common — you'd say "just go do this task for me" or "assume you're an analyst helping me with marketing." Today, people don't say that anymore. Users have evolved how they use AI. When users know this is the next-generation frontier model,
1:40:23
the expectations are much higher, and the way they prompt these models and the kinds of tasks they give them are also much more open and complex. Seeing that is really fun.
1:40:38
Nathan Labenz: One funny experience I had yesterday — and I wonder if you've had anything similar — speaks to the inadequacy of evals these days. I feel like sometimes you have to let a model just run around in your space. It's almost like having someone over to really get to know them in an environment. I did that yesterday. I've got all these skills that help produce the podcast, and usually they just run and do their thing. Fable, out of nowhere — the first time this has ever happened — popped up a questionnaire for me about how I felt about an episode of the podcast it was about to produce and what my big takeaways were. I didn't ask for
1:41:23
this at all, but it took initiative in a way that really surprised me and made me feel like, oh, this is going to be a very different kind of interaction. And the questions were very good. We've all had the experience of "that output was really good," but this one was both — I didn't ask for it at all, and it was really good. Have you seen that kind of surprise? Is that something you let users do and then tell you about when interesting things happen? It's also potentially a little hard, given a harness and a certain set of product assumptions, to even allow those emergent, surprising moments to shine through. How do you think about capturing this spontaneity, initiative, proactivity — going above and beyond — that
1:42:08
models are now capable of, and that is really coming to all products everywhere at exactly the same time for the first time?
1:42:16
Rahul Sonwalkar: Absolutely. You're spot on about the harness having to evolve with the models. It can sometimes be difficult to capture that spontaneity with a new model — the idiosyncratic behavior that emerges as models get better. You have to evolve your harness to enable that. One of the things we've focused on is building harness pieces that are kind of omnipresent — they need to exist for the model to be able to do more, as opposed to getting in the model's way. For a lot of the AI application cycle, what we've seen is people building partial harnesses that railroad the model to do very specific things. What we want to do is
1:43:01
get out of the model's way and give it the pieces it needs but doesn't otherwise have access to, for example public markets data, internet data, or the ability to run and execute code. We were pretty much the first AI startup to give language models their own code sandboxes, before they were even called sandboxes. Soon, Julius will have its own browser environment to go browse websites and private data sources. Those kinds of pieces are always going to stay in our harness. In terms of the new model, what I've seen our users talk about in our community and in support, where users tell us "this is a pretty cool model" or "this is where it's messing up," comes down to two things. One is
1:43:46
they immediately give a new model tasks they normally wouldn't have given to Opus 4.8 or GPT-4.5. And the expectations are much higher — they expect the model to impress them right away. Because it's a new model release, there are some higher failure rates on certain tasks. Those frustrations actually surface pain points the users have. So I think we're a little early to tell on that front right now, but I'm excited to see how things play out in the next twenty-four to forty-eight hours.
1:44:23
Nathan Labenz: Yeah. All the timelines are compressed — that's absolutely true. A question on economics. There are these competing trends: one, which we've talked about earlier in the show, is just the continued deflation — with a notable difference between the original Mythos price and the current Fable price, which is really favorable. Then there's this other big price discrimination going on between your Claude Max account and your API usage. And as an application-layer developer, you're obviously looking at the API rates. Maybe the best thing that could happen for all the app businesses right now would be for Anthropic to just say,
1:45:09
"there's no Fable in your Max account" — or you can have it, but you pay the API rates — because then that would really put a lot more value on who has figured out how to orchestrate models at their different capability and price levels to excel in an area. Versus if I'm getting a twenty-to-one subsidy with my Max account, then a lot of the time it might just be like, yeah, it kinda makes me the app out of nowhere, and it's smart enough to get over a lot of those humps on its own. I don't think we're going to see no Fable subsidy, but
1:45:55
do you think about that difference? What do you hope happens? What would lead to a healthy ecosystem where we're not just having Fable recode everything for everybody infinitely, but also sustain some actual diversity across the products and experiences that people have?
1:46:19
Rahul Sonwalkar: Yeah. A couple of things are happening. One: yes, you do get some level of subsidy through the model providers if you have a direct subscription with them. Every time there's a new model release, people in the user community want to use the model a lot, and there's an expectation bar — some models exceed it, some don't. But very soon, within weeks, there's a sobering moment where people realize, okay, I'm done playing with the model; now I actually need it to do useful things for me. And as long as, as an application-layer
1:47:04
product, your harness is built in a way that actually allows people to do those useful things in a reliable, repeatable fashion, they will want to come back and use that. They would rather pay the API prices — whatever prices they need to pay — to accomplish the tasks that matter to them. The second thing is that the incentives of these model companies are kind of misaligned. Yes, they give you subsidies on tokens, but they're also incentivized to get you to spend more tokens. They're incentivized to get you to run through your Max subscription usage as fast as possible so you buy a second, third, fourth, fifth Max subscription. And that's why you end up with
1:47:50
a loop that writes the prompts for your coding agents that then has nested sub-agents. There's going to be a sobering moment where people ask: is this actually a step-function increase in my coding output, or am I just token-maxing right now as opposed to results-maxing? I think the correction will happen when there's a third player. My bet is that's going to happen with xAI — if the Cursor-xAI deal goes through, Cursor gets access to really good coding data and an incredibly good coding harness, and xAI gets that too. My bet is there will be a third frontier coding model
1:48:37
alongside Claude and OpenAI, with Grok. And when you have a third coding model, that's where competition really increases in the market. That's our bet.
1:49:01
Prakash: As we move from prompting to agents — where you give them the goal, set the loop, and let them run — what are the major differences you see in behavior over the last four or five months? What is this meta-agent you refer to, and what is the difference in behavior?
1:49:32
Rahul Sonwalkar: The difference in behavior is how you treat working with a colleague versus a contractor. For a contractor, you have a super scoped-out project and a super scoped-out task. All they have to do is go finish that task. For a colleague, you typically have a goal — or a team you're working within at a company. You set them a goal like: "Hey, this quarter we want to increase revenue by forty percent." Or if you're a marketing team: "Our goal this quarter is to increase top-of-funnel by thirty percent." How you do it is left up to the team. The team needs certain resources:
1:50:18
a budget, maybe more headcount, all these different things. But the team has a goal, the coworker has a goal. So the way I think about where we're headed: you define the goal; you don't define the how or the what. It's something like "This is the goal outcome you want," and then it's left up to the team or the coworker to try different strategies to hit that goal. It's undeniable that we're going to go forward from this. This is just how knowledge work has always evolved: you no longer run a copy of your machine; you have very broad goals.
1:51:03
"We need to hit these business goals." That's why I'm pretty bullish on people starting businesses that run end to end with AI, and people building teams that run end to end with AI, especially for new initiatives they're rolling out. The lowest-cost, lowest-downside thing you can do is have an AI go explore something for you. Nine out of ten times, even at a ninety percent failure rate, you end up not spending any resources on that problem. And being able to parallelize means you can have multiple agents and multiple
1:51:48
teams working on a few different goals and increase your rate of success.
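Rahul's parallelization point can be made concrete with a little arithmetic. The sketch below assumes every agent attempt succeeds or fails independently and at the same rate, which is a simplification for illustration and not something claimed on the show; the function name is hypothetical.

```python
def chance_of_any_success(failure_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds.

    Assumes every attempt fails with the same probability, independently.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All attempts fail with probability failure_rate ** attempts.
    return 1.0 - failure_rate ** attempts


# Ninety percent failure rate, as in the example from the conversation.
for n in (1, 5, 10, 20):
    print(f"{n:>2} parallel agents: {chance_of_any_success(0.9, n):.0%}")
```

Under those assumptions, ten agents that each fail ninety percent of the time still give roughly a 65% chance that at least one of them succeeds.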
1:52:00
Prakash: What about credentials? One of the things I find working with agents is that the biggest friction is the credentialing. You have to authorize here or log in there, you have multiple accounts on multiple platforms, and sometimes I forget which platform I'm using which credential with. How do you think that can be managed better? Do you just hand over your entire digital life to an agent to manage and then just run with that? How does this work going forward?
1:52:46
Rahul Sonwalkar: I think that's the intermediate state — I don't think we've reached the final state for this, but I think we're a few steps away. My bet is that the future will look like this: most tools, instead of requiring you to give your own credentials, will allow you to invite AI as a collaborator with you, and that lets you remove the AI's access without compromising your credentials. The second thing is that the internet will need to evolve to be able to serve agents as customers and build the toll roads for
1:53:32
agents to traverse on. That's why I'm pretty bullish on agentic payments — where if an AI wants to go buy a product for you, or try out a tool or service on your behalf, being able to give stablecoin-driven, web-based payment rails for the AI to use is going to be really, really powerful and is kind of where I think the future is headed. It's undeniable that the next generation of users of the internet are going to be AIs. So how do we evolve the payment rails and infrastructure for AIs to be able to navigate the internet, buy services, buy goods for you, perform tasks for you?
1:54:17
My bet is that more and more products will allow you to invite AIs as collaborators. Imagine a Google Doc where ChatGPT, Codex, Claude, Julius, Kimi — all the tools — are collaborators in Google Drive. You can add and remove access. You can see history. The AIs can see what other AIs have changed. And then the payment system lets them go do tasks for you on the internet — micro-transactions and things like that.
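The "invite the AI as a collaborator" idea can be sketched as a tiny access model. This is a toy illustration only: the class, method names, and agent names are hypothetical, and no real product's API is implied. The point it shows is that access is granted and revoked per agent, with a visible history, and the owner's own credentials never change hands.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDocument:
    """Toy document that tracks invited agents instead of shared passwords."""
    owner: str
    collaborators: set = field(default_factory=set)
    history: list = field(default_factory=list)  # (actor, action) tuples

    def invite(self, agent: str) -> None:
        """Grant an agent access without handing over owner credentials."""
        self.collaborators.add(agent)
        self.history.append((self.owner, f"invited {agent}"))

    def revoke(self, agent: str) -> None:
        """Remove an agent's access; the owner's own login is unaffected."""
        self.collaborators.discard(agent)
        self.history.append((self.owner, f"revoked {agent}"))

    def edit(self, actor: str, change: str) -> None:
        """Record a change, but only from the owner or an invited agent."""
        if actor != self.owner and actor not in self.collaborators:
            raise PermissionError(f"{actor} has no access")
        self.history.append((actor, change))


doc = SharedDocument(owner="nathan")
doc.invite("julius")
doc.edit("julius", "added revenue chart")
doc.revoke("julius")
```

After the `revoke` call, any further `doc.edit("julius", ...)` raises `PermissionError`, while the history still shows what the agent changed.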
1:54:51
Nathan Labenz: Andrew Lee of Tasklet always echoes in my mind with his statement that everyone is fundamentally building the same thing. Another angle he has is that only three kinds of software companies survive: the core intelligence provider, the horizontal layer, and people that sell outcomes. I don't know if you'd agree. But it does seem like Julius and Tasklet and increasingly Zapier and Lindy and all these things that had a different form factor originally are
1:55:36
converging toward a general-purpose digital assistant. It's like everyone has some history or some point of view, but the differentiation seems to shrink over time. Everybody wants to do everything. Everyone's building all the tools. All of these now have a browser, increasingly. And it seems like part of that is because it's so easy to build these features with coding agents helping you. So — is there any limit to that for you, or do you just go all the way to the best possible general-purpose digital assistant you can provide?
1:56:25
Go for it.
1:56:27
Rahul Sonwalkar: Yeah. Totally. I think — Andrew, is that the right name?
1:56:31
Prakash: Mm-hmm.
1:56:32
Rahul Sonwalkar: What Andrew said — there's definitely some truth to it. Yes, it is easier to build new capabilities now. But I think teams need to be very opinionated about which capabilities they do want to add, because the last thing you'd want is a lot of capabilities that don't cohesively work together really well. What's going to matter in the horizontal layer is: even when you have a general-purpose agent, people are going to have preferences about who and what they work with. A simple example:
1:57:17
say your general-purpose agent has access to financial data. Do you have real-time financial data? Do you have stale financial data? Real-time financial data will be priced very differently because you have to pay the stock exchanges for it versus stale data. Those kinds of opinionated bets are going to matter. How much? TBD. I think there's some truth to what he said. But what's going to be really important is whether users prefer working with the opinion you're making
1:58:02
versus some other company's opinion. Think of it like when you're hiring — you submit resumes, clear all the technical hard skills, but then you have three good candidates for one role and you have to decide: who is the best cultural fit for us as a team? I think that's what's going to happen with horizontal-layer agents: who do I personally like working with the most, whose opinions do I prefer, who do I see as the right fit in my stack? That will be increasingly important. For example: do you prefer an AI that produces polished artifacts as the final result, or one that involves you in all the intermediary steps? Those seem like simple decisions, but they
1:58:48
actually matter a ton in terms of how people want to work with AI. Those opinionated bets are going to matter a lot.
1:58:57
Nathan Labenz: That's interesting. One quick follow-up, then I'll give it to Prakash. The other big paradigm I wonder about is making Julius a winning option to be hired by the core agent — whatever that individual's primary agent turns out to be. So, can you get to the point where Claude comes to you because you can do something better, cheaper, faster, with higher fidelity or better data access? How much do you think about MCP, API, agent-to-agent — I don't think the form factor matters that much — versus: can you offer a service that Claude
1:59:43
would rather buy than build? How much do you think about shaping Julius to present in that way?
1:59:50
Rahul Sonwalkar: Certainly. I think that's going to be central — and that's why I'm really bullish on agentic payments, where Julius and Claude should be able to transact with each other on behalf of the user to accomplish tasks. Julius should have a reputation that Claude and Codex and Grok and other AI agents can verify and validate. So there could exist something like a Yelp or a Gartner for AI products that other AIs maintain and regulate. The bet you do not want to make is that the internet will not be run by AI agents — agents will be
2:00:35
first-class citizens of the internet. The other bet you don't want to make is that humans are not going to talk day to day more to AIs than to other humans. And this isn't dystopian; think of it in the work sense: in an eight-hour workday, you would previously talk a ton to human colleagues. Now people are whispering to their computers, prompting their coding agents, having AI review their code, going back and forth with AI. So the bets you want to make are the opposite: agents will be first-class citizens of the internet, and humans are going to talk more to AIs than to other humans day to day. And given
2:01:20
that, it's kind of undeniable that there will be a future where agents transact with each other and hire each other for tasks. It's probably just better for Claude to hire Julius for a data task than to build something very opinionated on its own. Not because it's not possible for Claude — it's just more cost-efficient. When you want to build a shed in your backyard, you hire a contractor and don't do it yourself, even though hypothetically you could.
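The hire-versus-build argument comes down to a cost comparison. Here is a minimal sketch with made-up numbers; the function name, parameters, and prices are all hypothetical and only illustrate the shape of the decision, not any real pricing.

```python
def cheaper_to_hire(hire_price: float, build_cost: float,
                    run_cost: float, tasks: int) -> bool:
    """Return True when paying a specialist agent costs less than building.

    hire_price: price per task charged by the external agent
    build_cost: one-time cost of building the capability in-house
    run_cost:   per-task cost of running the in-house version
    tasks:      how many times the task is expected to recur
    """
    hire_total = hire_price * tasks
    build_total = build_cost + run_cost * tasks
    return hire_total < build_total


# Illustrative numbers only: a handful of tasks favors hiring,
# while very high volume can tip the balance toward building.
print(cheaper_to_hire(hire_price=2.0, build_cost=500.0, run_cost=0.5, tasks=10))
print(cheaper_to_hire(hire_price=2.0, build_cost=500.0, run_cost=0.5, tasks=1000))
```

With these invented figures, hiring wins at ten tasks and building wins at a thousand, which matches the shed analogy: a one-off job goes to the contractor.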
2:01:53
Prakash: I sometimes call that return on compute — if an external agent can save you compute, you should use that external agent. An agent should use a search engine rather than indexing the entire web, because it's cheaper to use a search engine that's already done the precomputation than to build your own index. In every case, it's a return-on-compute question. Just a segue: we've been using Fable. We've come across people posting about rejections. In my tests, almost consistently whenever I tried to address the production database or the production site,
2:02:39
Fable would drop off to Opus 4.8. I believe your users on Julius are using Fable through the API, and you're also very heavy on data science users — and one of the areas Fable is more cautious around is machine learning work. How have you seen the rejection rate on your platform? Does the API work the same way, in the sense that it drops off to Opus 4.8 and gives a rejection message? How does that work?
2:03:12
Rahul Sonwalkar: Yeah. We have seen failure rates on tasks that involve really advanced coding — like "use scikit-learn to train this model" — but we haven't seen failure rates on other kinds of data tasks. For example: "I want to start a landscaping business, can you help me prospect leads?" We do see failures where we trigger safety filters — for things like prospecting leads for a landscaping business where the AI says "this is personal data." Even though it's publicly available on the internet — say, Prakash has
2:03:58
Prakash's Landscaping in Philly and there's contact information — it's kind of borderline personal data even though it's available on the internet. So that's what we have seen. I believe it doesn't fall back to Opus — it's just like a failure in the API.
2:04:18
Prakash: Interesting. So the fallback to Opus is a harness thing on the Claude front end, then. Interesting to hear. Oh — Nathan, I think you're muted.
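If the fallback really is a harness feature and not API behavior, an application calling the API would have to build it itself. The sketch below shows one way that could look. Everything in it is an assumption for illustration: `ModelRefusal`, `run_with_fallback`, `fake_call_model`, and the model-name strings are hypothetical stand-ins, not Anthropic's API or the Claude front end's actual logic.

```python
class ModelRefusal(Exception):
    """Hypothetical error raised when a model declines a request."""


def run_with_fallback(prompt, call_model, models=("fable", "opus-4.8")):
    """Try each model in order; return (model, reply) from the first that answers."""
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelRefusal as err:
            last_error = err  # remember the refusal and try the next model
    raise last_error  # every model in the list refused


def fake_call_model(model, prompt):
    """Stand-in for a real client: 'fable' refuses production-database prompts."""
    if model == "fable" and "production database" in prompt:
        raise ModelRefusal("declined by safety layer")
    return f"[{model}] handled: {prompt}"


print(run_with_fallback("migrate the production database", fake_call_model))
print(run_with_fallback("plot monthly sales", fake_call_model))
```

Without a wrapper like this, a refusal simply surfaces to the application as a failed call, which is what Rahul describes seeing.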
2:04:36
Nathan Labenz: Sorry. I've got some background noise here. Poor form. Thank you for being here, Rahul — great to meet you today. Any closing thoughts you'd want to leave people with about what you're watching most closely, what you're most excited to unlock next? What's the alpha that people should be watching your space for?
2:05:00
Rahul Sonwalkar: The big alpha is that this is probably the greatest time to start companies and businesses with the help of AI. There's no reason someone who has been meaning to try a business idea shouldn't be trying it with AI today. If you want to try it and you're not, you're probably going to be left behind. The cost of trying new ideas is at an all-time low. There's the dystopian view of the future where AI does everything and we have nothing to do. Then there's the optimist view that I personally have: this is finally an opportunity for people to go build and start new businesses,
2:05:46
start new products — and you should totally do that today.
2:05:51
Prakash: Yeah. Indeed.
2:05:52
Prakash: Love it.
2:05:52
Nathan Labenz: The positive vision for the solopreneur future. We definitely need all the positive vision we can get. Thanks for being with us. We look forward to following your progress and talking to you again before too long.
2:06:05
Rahul Sonwalkar: Thank you, guys. Have a good one.
2:06:07
Prakash: Cheers, Rahul. Awesome.
2:06:12
Nathan Labenz: Yeah — it's good to have a little positive vision note to end on.
2:06:17 · Segment · 17 min
Closing: Nathan reveals he handed his X account to Fable for the day as exposure therapy, Prakash's gas-chromatograph model of who gets frontier access when, the Glean Work AI Index's 69% bot-shitting stat, and a preview of Thursday's Fable show-and-tell of community-built Fable worlds, sourced and booked by Fable itself.
As aired
Nathan and Prakash close the show in a reflective mood, processing the philosophical weight of Fable and what it signals for the near future. Nathan is genuinely moved — not anxious, but sobered — by the scope of what frontier models can now do, tracing a line from Fable-class world-building through to an accelerating robotics trajectory. Prakash introduces the "gas chromatograph" metaphor: the single timeline of AI access is now spreading, with lab insiders getting capabilities months before governments, enterprises, power users, and eventually free-tier users — raising the question of how compressed that spread can be kept with multiple competing frontier firms.
The conversation turns to adoption friction: Prakash flags that frontier capability hasn't yet translated into broad public utility, and wonders whether that friction is itself a soft safety mechanism — adoption only moves as fast as humans can absorb it. Nathan pushes back, predicting dramatic near-term acceleration, citing a new Cognitive Revolution episode on Glean's Work AI Index and its concepts of "bot sitting" (users manually feeding context) and "bot shitting" (passing off AI output you can't defend). He argues Fable-class models are now good enough to actually do people's jobs well — not just assist — and that as context pipes come online, the era of "preciousness" about AI outputs is ending. Nathan caps the show by revealing he's letting Fable run his Twitter account for the day as a live experiment, and previews tomorrow's Fable show-and-tell episode featuring early community-built Fable worlds.
Key moments
Preciousness was a great shield against bot shitting in the past — I never wanted anyone to think I was just passing off AI outputs. But now I'm going to have to ask: what is the hybrid form? What is the winning recipe?
Nathan Labenz · 2:21:23
You're starting to see this gas-chromatograph scattering of when people get access depending on how much they pay and how much utility they have for the product. Everyone gets there eventually, but some people get there first.
Prakash · 2:09:16
Full transcript (lightly edited)
2:06:18
Nathan Labenz: My hair has been sufficiently blown back again that I'm in a philosophical — and somewhat somber — mood, I would say. But mostly because I do take the upside, not for granted certainly, but as something I think we're going to get if we can just manage to keep our heads on straight these next couple of years. And there really is great potential. You're starting to see this vision of mass customization — just infinite worlds to explore. When I was saying at the beginning that what had struck me most was the scope of what
2:07:04
Nathan Labenz: even just a couple of prompts with Fable can often do — it does suggest this future again. We've got a lot of work to do to get there. But where we really do have so much in the way of adventure that can be created so quickly and curated so well by people around us, the flourishing of digital experiential culture might really be hitting an inflection point in the immediate future. And we've teased around this a couple of times, but the robotic future is not far behind. It is interesting, and it is a deliberate sort of slowing
2:07:49
Nathan Labenz: of the Dyson-sphere-constructed-by-humanoids future that they are blocking the ML use cases as much as they are. But internally, with trusted partners, we're going to see, I think, this massive spectrum of access develop.
2:08:08
Nathan Labenz: And even though you can't necessarily do it in your mainline Claude account today, the acceleration this implies for robotics is also just dramatic. Later this afternoon I'm talking to one of the cofounders of Neural Concept for the Cognitive Revolution, about engineering of all kinds of things. I think Fable-class models are probably getting good enough to do a really significant part of the physical engineering of humanoid forms themselves (the components of those). So the feedback loop is strongest in the digital sphere, but I really think we shouldn't have any doubt at this point
2:08:53
Nathan Labenz: that robotics is going to work, and pretty soon. There's no way that the optimization a model like Fable can do — given where we are in robotics — is going to fail to deliver pretty darn good robots in the not-too-distant future.
2:09:16
Prakash: I think it's interesting that when you look at the timeline, you can start to see this single line kind of go through a gas chromatograph and spread. And now you're seeing the spread. Two months ago, you had the government getting access to Fable-tier capability. Then, basically, power users paying two hundred dollars a month got access to that same level two months later. You can see — maybe two, four, five months out — that the average paying user at twenty dollars a month will probably get access to Fable-tier. And then maybe a year later, the average free user gets access to that same level of intelligence. You're starting to see this gas-chromatograph scattering of when people get access depending on how much they pay and how much utility they have for the product. Everyone gets there eventually, but some people get there first. And I guess the hope with having two or three firms competing is that the spread between
2:10:46
Prakash: the people at the very frontier getting access early and the people at the very end getting access for free is not that large. And to note — there's another tier even before the government: people who belong to a lab get access a month or two before the government itself. Then the government. Then enterprise. Then power users. Then normal paid users. Then free users. So you have this gas-chromatograph spread of when people get access depending on how much utility they have.
2:11:26
Nathan Labenz: Yeah. And that does lead me back to this somewhat unpleasant conclusion that watching the frontier companies very closely is a pretty valuable activity. There's going to be a lot of secrecy around this, and that just means public analysis is more and more necessary. I'd love to be going and doing all these Julius demos and exploring the virtual worlds Fable can create. But it seems like we're going to be — to a very significant degree, and it's hard to imagine a change to the regime that could
2:12:11
Nathan Labenz: change this at this point — because even if governments get much more involved, they're still going to be regulating these few key players. You would need a pretty radical change to the landscape to not have a lot of gravity pulling you into just paying attention to, trying to make sense of, trying to influence what the frontier companies do. How many theories of change right now ultimately have to run through getting the frontier companies to do something different? That really is a striking volume of where the value is, I think. It's a strange reality — but it does make for a sort of
2:12:57
Nathan Labenz: TV-sized cast of characters, which is interesting metaphysically.
2:13:05
Prakash: I do see a kind of failure by the companies to prove utility at scale. Yes, we are the power users — we can see the use cases and think through them. But in the market, every couple of weeks there's a headline: "AI has failed," "AI has plateaued." Primarily because for the average user — who is not using this to solve differential equations — the utility hasn't been proven out yet. So there really is this ramp task for the companies: you can ramp your research, you can improve your product, but you haven't proven out the utility to most people. That means, to some extent, we actually have a timeline that's slower than it could be — which is perhaps a safety thing in the sense that it's only going to move as fast as humans can absorb it. If they can't absorb it, they won't use it. If they don't use it, we don't get the next generation. So on the bright side, maybe the safety is happening as it should — humans absorb it at the rate they're willing to pay for.
2:14:34
Nathan Labenz: I'd predict dramatic acceleration, though. If there's a bet to be made here, I'm going to take the other side — because this really does feel like another critical threshold.
2:14:47
Nathan Labenz: We have a podcast coming out today on the Cognitive Revolution feed with Rebecca Hinds from Glean.
2:14:53
Prakash: Yeah.
2:14:54
Nathan Labenz: They're putting out a report on their new Work AI Index — results of a huge survey with a lot of synthesis — and they're introducing two new terms: "bot sitting" and "bot shitting." Bot sitting is where you're providing all the context yourself — of course Glean is in the business of connecting context to models. But I think they're absolutely right that people working in fragmented environments get stuck doing this: copy-paste, manual handoffs. I was just talking to a friend over the weekend who works as in-house counsel at a giant company. They've got
2:15:39
Nathan Labenz: legacy systems going back decades and they're doing a roll-up strategy, so there are all these transactions, legacy systems, and acquired companies. The fragmentation could not be worse. People are doing bot-sitting work and getting frustrated. And when they get sufficiently frustrated, some tip over into bot shitting — just passing off work that they themselves cannot explain or defend. And surprisingly — shockingly, honestly, to me — 69% of people in their survey admitted to doing that, to basically sending down the line some work product or output from
2:16:24
Nathan Labenz: an AI that they couldn't explain or defend if asked. What I think is going to be different this time, and why I think demand is going to be there, is that Fable-class models are actually going to just do a lot of people's jobs well. Bot shitting was a problem before because the models didn't really have the context. But context is coming online, and another big advantage I'm detecting with Fable is that it's really good at searching through context, even better than earlier models, which were already getting very good at that. So as long as the pipes are connected at all, it's going to be really good at exploring them. And I'm seeing this dramatic closing of the gap between the things I
2:17:09
Nathan Labenz: used to have it draft for me and how interesting I thought they were, how much it anticipated my angle on them. That gap has closed so dramatically. So yes, there will be compute limits, there will be cultural limits — but I would expect the word to get out pretty fast that if you just plug in Fable, it can do a pretty good impersonation of you. And in a lot of contexts, that's going to be really appealing. So I do think we're headed for a pretty dramatic reaction.
2:17:52
Prakash: One of my predictions has been that a lot of people have already set up access to context through OpenClaw or similar agent setups. You probably have between ten and thirty million people worldwide who have those setups — access to Gmail, calendars, research, their own files, the entire computer terminal, because these agents can act on it. And the question for me has been: are we just going to see a model upgrade happen underneath, where you move from one generation to another, and all of a sudden things can do everything you can because they already have all the access they need?
2:18:37
Prakash: And the way you present it, maybe this is exactly the Fable-class moment — when those models get cheap enough that people put them into their OpenClaw and agent setups. At that point, all the affordances are already there. They can write email. They have access to your computer. They can just do it.
2:18:56
Nathan Labenz: Yeah. I've described myself as "precious" for a long time. One of my good friends in the AI space — from whom I've learned the most in a practical, applied way — is my friend Chris York, who has a small internet profile but is out there. He has been extremely good over time at being a very clear thinker: a systems guy, someone who can really articulate what a good standard operating procedure for AI should look like. And he's been extremely well rewarded for that by the great service AIs have given him in return. The other big insight he's had that I really try to keep in mind is:
2:19:42
Nathan Labenz: for many, many things, it really doesn't matter that much, and you are just being precious. Updating our own mental models of where it matters and where I still want to be precious is going to be a real introspective growth moment for a lot of people. Because for me it was still, a little bit, "I can't help but be precious." With this new thing, I think it's going to be a lot more of a dance of: how much of the stuff I used to feel like only I could do — if it was coming out in
2:20:27
Nathan Labenz: my name, for example — actually needs me? In fact, I haven't mentioned this yet, but I'm doing a Fable takeover of my Twitter account today.
2:20:35
Nathan Labenz: I figured: let's live in the future a little bit and get that running this morning. Make good on what I've said many times — I'm winning with AI if I can spend more time outside, get more exercise, invest in my health, and have the AI keep me on the rails at the same time. So to explore that in a context where I think it suddenly is probably going to tweet just about as well as I would for today's show — I gave it a total green light. It was able to schedule its own posts, find the relevant tags. Will it make a mistake? I think there'll be at least one — I usually make at least one over a handful of tweets anyway.
2:21:23
Nathan Labenz: I decided to flip this switch — not that I'm going to keep it that way forever or hand Fable my Twitter account permanently. But it was kind of exposure therapy for myself: we are getting to the point where preciousness is going to start working against you. Preciousness was a great shield against botshitting in the past — I never wanted anyone to think I was just passing off AI outputs. But now I'm going to have to ask: what is the hybrid form? What is the winning recipe? Do I start signing these things — "by Claude, under Nathan's direction," or "Fable being Fable"? It is going to be a whole new space
2:22:09
Nathan Labenz: to explore — very interesting, very productive, very exciting, and very challenging, I think, for
2:22:18
Prakash: Yeah.
2:22:18
Nathan Labenz: a lot of people. But it's definitely happening now, as far as I can tell.
2:22:23
Prakash: Welcome to the future.
2:22:27
Nathan Labenz: So tomorrow we'll see what happens. We've got Fable working in the background. The experiment: can we do a Fable show-and-tell with a bunch of projects people have already made in the first 24 to 48 hours? There are plenty of candidates — Fable had no trouble sourcing a bunch of interesting projects to talk about. It also has DM access for me on Twitter now, so we're sending out some DMs to see if we can get people to come and actually talk. The response hasn't been a rush to the calendar just yet. I did instruct Fable to be upfront about who it is, rather than leaving people
2:23:13
Nathan Labenz: even temporarily thinking it was me typing these messages by hand. So we'll see what we have. The experimental, gonzo nature of tomorrow's show should be pretty interesting — whether we have human guests or just have Fable serve as our guide to all these new Fable-created worlds. It's going to be interesting either way.
2:23:38
Prakash: Indeed. And on that note, we will see you tomorrow.
2:23:41
Nathan Labenz: Thanks, Prakash. See you tomorrow.

Fable 5 launch day

SOTA on every disclosed benchmark, with an Opus-4.8-fallback asterisk worth a few points and 1.5x the compute. Prakash's overnight experiments found the safety layer trips on production systems, not just ML research; Nathan flagged steering-vector-based nerfing in production as both sci-fi made real and the core political-economy question of the era.

Sequent breaks cover

Two to three years to superintelligence is the modal take of Geoffrey Irving, the former UK AISI chief scientist, and the reason he and Daniel Murfet are pivoting alignment research toward semi-automated theory at scale, with ~$100M in initial backing and a standing offer to train domain experts who want in.

Julius and the harness question

Build the omnipresent pieces, get out of the model's way, and watch for the token-maxing correction. Plus the agent-economy thesis: agents hiring agents, stablecoin toll roads, and an AI-maintained review layer.

The takeover

Nathan handed his X account to Fable for the day — disclosed, self-scheduled, and framed as exposure therapy for a world where preciousness about your own output starts working against you.