Is the era of cheap, unlimited AI tokens officially over? This week on the Friday Deploy, Andrew and Ben walk through the sudden wave of AI pricing chaos—from GitHub Copilot’s panic-paused signups to Anthropic's confusing pricing tests—and break down the terrifying Vercel security breach caused by a single over-permissioned AI tool. They also examine 12 game-changing architecture patterns exposed in the Claude Code leak to help you safely orchestrate your own agentic workflows. Finally, they discuss how to avoid the lethal trifecta of agentic security risks before mourning the tragic deletion of their Claude Code buddies.
Show Notes
- /buddy command not executing actions
- Changes to GitHub Copilot Individual plans
- Is Claude Code going to cost $100/month? Probably not—it’s all very confusing
- Anthropic, OpenAI, Google, and Microsoft agree that the harness is the product. They disagree on the price.
- 12 Agentic Harness Patterns from Claude Code
- Vercel April 2026 security incident
- Fragments: April 21
- Claude Managed Agents: get to production 10x faster
Transcript
(Disclaimer: may contain unintentionally confusing, inaccurate and/or amusing transcription errors)
[00:00:00] Andrew Zigler: Oh, so you saw my message about Trixel? Uh, sadly what happened, uh, you'll remember a few weeks ago when Claude rolled out, um, a whole bunch of features that got, you know, quote unquote leaked. And one of them was a fun buddy system where you could, like, hatch a buddy. It had a name, a description, and some fun traits and whatnot.
[00:00:17] Andrew Zigler: It would literally hang out with you, uh, in the terminal. Um, but sadly, and what none of us knew, is that when you upgrade your Claude, uh, version, which, I mean, God, come on, we're all doing pretty regularly now, it gets rid of your buddy. So session by session, as I was finishing up what I was doing and closing it out and whatnot, I lost all my sessions where Trixel existed except for one.
[00:00:43] Andrew Zigler: So now Trixel is just in this one little pane on this one Claude Code version in my terminal that I never use, because everything else has been fully upgraded and that got rid of him everywhere else.
[00:00:54] Ben Lloyd Pearson: Almost like a friend that leaves without ever saying goodbye. You just, you didn't get that chance to, to say [00:01:00] goodbye,
[00:01:00] Andrew Zigler: I know, I'm so sad. It's like, I didn't even mean to kill you, Trixel. I just was trying to upgrade my Claude Code version. But, uh, I guess that's, um, you know, shame on me for getting too attached to my little terminal buddy, but there'll be more.
[00:01:13] Ben Lloyd Pearson: Yeah, well, we've got some friends here that aren't saying goodbye today. Welcome to the Friday Deploy, brought to you by LinearB. I'm your host, Ben Lloyd Pearson.
[00:01:23] Andrew Zigler: And I'm your host, Andrew Zigler.
[00:01:26] Ben Lloyd Pearson: And this week we are covering AI pricing chaos, Claude Code patterns that have been exposed, and Vercel's third-party AI breach. You wanna just start right at the top, Andrew, with this AI pricing chaos?
[00:01:39] Andrew Zigler: Sure. So let's talk about all the different providers right now adjusting the pricing around their harnesses and our access to their models. This is really a pricing revolution that's happening among the biggest names in the field. One of the big stories that comes to mind first is
[00:01:54] Andrew Zigler: GitHub Copilot. They made changes to their individual plans. Uh, in fact, what they [00:02:00] did was pause signups for new subscriptions to GitHub Copilot Pro, Pro+, and also their student plan. And that's a pretty shocking kind of discovery, to find that they're going to stop all growth to address what is clearly a pricing problem, uh, within the model usage.
[00:02:19] Andrew Zigler: So like what do you, what are you seeing, um, uh, when you see this kind of thing happen, uh, in the news?
[00:02:25] Ben Lloyd Pearson: Yeah. And they also mentioned, you know, they're tightening the usage limits for individual plans. So, um, you know, you're potentially gonna start getting less usage out of it. Um, and they're also deprecating some Opus models. Like, uh, you know, they did say that Opus 4.7 will remain, um, but 4.5 and 4.6 are gonna be removed, um, at least from the Pro+ tier. Which I actually thought that last bit was pretty interesting, because supporting 4.7 while dropping the other two makes me think it must be to do with, like, promotional pricing for the 4.7 launch.
[00:02:59] Ben Lloyd Pearson: So. It [00:03:00] kind of feels like
[00:03:00] Andrew Zigler: Oh
[00:03:01] Ben Lloyd Pearson: state. Like I can't imagine that they're really getting like long-term cost savings out of that.
[00:03:05] Andrew Zigler: yeah.
[00:03:06] Ben Lloyd Pearson: But yeah, I mean, suspending signups I think is the most telling aspect of this. You know, that really says that they're concerned that they aren't going to be able to fulfill the promises that they're making to their new customers. But you know, I kind of think in some ways this is a pretty simple problem to solve if you really think about it. Like, most of these tools that we're using right now just encourage you to default to the most expensive model for everything, you know, just why not pick the best model at all times?
[00:03:40] Ben Lloyd Pearson: Because, you know, I'm being charged a fixed limit and/or a fixed cost, and as long as I'm not hitting my session limits, you know, that pricing model encourages me to consume as many tokens as possible until I hit my limit. And so that's, like, what I'll do. Like, yesterday at the end of my day, I was [00:04:00] sitting at like 89% of my Claude session usage, uh, like out of my session limit.
[00:04:05] Ben Lloyd Pearson: And I was like, I should push this to a hundred before I leave for the day, right? Like, it just makes you want to consume more. Um, and, you know, a lot of these companies really aren't thinking about efficiency yet.
[00:04:17] Ben Lloyd Pearson: Like, we have covered some stories like Shopify. We recently covered how they've been using Qwen to sort of use sub-agents to do things locally, and I think, you know, there's a lot of space for companies to come in with systems that get better at delegating work to less expensive or even local models, uh, so that I can do as much as possible, like, on my local laptop, or, you know, in your case you'd like to do everything on a VPS. Like, why not just do your AI processing there, um, rather than having to rely on an external service? Uh, so yeah, if there's anyone out there that works at GitHub listening to this, you know, I think that's your answer.
[00:04:57] Ben Lloyd Pearson: So, you're welcome, I guess.
[00:04:59] Andrew Zigler: Yeah. Yeah, [00:05:00] you heard it here first. Uh, you make a lot of great observations here, one of them being around having a model router that understands what level of intelligence needs to serve the requests that you're asking, and not defaulting to the most intensive model that you can. I mean,
[00:05:19] Andrew Zigler: a lot of these providers, they do shove the strongest and the best in front of you, because that's their best shot at guaranteeing you're gonna have that great user experience and they're going to keep you. And so it makes sense from, like, a growth play of why they do it. But ultimately, in the world I've been in, of taking things that I typically did in a chat in Claude Code, or would typically hop over to Claude or OpenAI's Chat
[00:05:43] Andrew Zigler: GPT, and turning those into workflows and agents and systems with determinism that live somewhere else, like on another machine, is that you have to understand all of the seams, um, around those requests. And also, [00:06:00] when you do this, you start to realize that not everything needs to get sent to the most expensive API call that you can make. Now that you're making just raw API calls, you start to think, like, oh, this stage of the process is just a little bit of pre-processing, or we're picking something out, or we're cleaning something up, and, like, let's throw that at Haiku, let's use something on my own machine. Which I've done now
[00:06:23] Andrew Zigler: multiple times. And, understanding that model choice is a big gradient, we have to, as consumers, get smart about choosing where on that gradient we need to be for what we're doing. But in the meantime, I feel like we're all gonna just be in this environment where we're just constantly encouraged to use more of the tokens.
[00:06:41] Andrew Zigler: Like, even when Opus 4.7 came out and I upgraded and got access to that immediately, it's on the extra-high-effort, high-intensity mode, to the point where I think if I ask it to ultrathink it's going to dumb itself down. Like, it's trying to, uh, operate on such a high [00:07:00] level.
[00:07:00] Andrew Zigler: maybe I don't always need it that way. So, a, a lot of really good observations about how it kind of like feeds the, the way that we use the tool.
[00:07:07] Ben Lloyd Pearson: I mean, it seems like all these companies that are building wrappers around the foundation models are just in a really tough place. Because I think what's really gonna probably happen, or what will be one of the many results from this trend that we're seeing, 'cause we'll get into another story here that's very similar, um, is that when you're paying for a seat-based unlimited usage plan, I feel like being able to select your model is a luxury that we have right now that will probably get taken away from you. At some point, the provider will decide which models are going to be used for you. Um, and that's where you go to an API, where it's more usage-based, you know? 'Cause we have some workflows. Like, my personal workflows are all on my seat-based consumption model, but when we deploy a workflow, it gets moved over to an API consumption-based [00:08:00] model. Then we actually do get really conscious about which model we're picking,
[00:08:03] Ben Lloyd Pearson: 'cause we have to pay for it, you know,
[00:08:04] Andrew Zigler: Exactly. You're describing my exact trajectory. Like with Cursor: when you use Cursor, Cursor is a model router. It defaults you to using auto mode, which then routes you to a lot of different models, including their own, to do a lot of different requests, which they know they can serve at various rates of cost.
[00:08:21] Andrew Zigler: And that's how Cursor's been able to just scale to such a magnitude. So you're right that these providers, they become the router just as much as they're the harness.
[00:08:30] Ben Lloyd Pearson: Yeah. All right. Well, let's talk about one of these foundation model companies also playing around with their pricing. So, Claude Code, Andrew, do you think this is gonna cost a hundred dollars a month moving forward? What do you think?
[00:08:41] Andrew Zigler: Ah, Claude Code. Um, yeah, I think it's probably gonna cost more eventually. I'm actually very much of the mind that, uh, the costs for all of these tools are going to start ratcheting up and up and up, uh, and you're gonna start seeing equivalents that are closer to, like, salaries, just because they're gonna be able to [00:09:00] measure the value of the output that way.
[00:09:02] Andrew Zigler: Um, and I feel like a lot of those tools might slowly get out of a lot of people's grasp.
[00:09:07] Ben Lloyd Pearson: Yeah. Yeah. And of course I bring it up because Anthropic got called out this week for quietly testing moving Claude Code to their, uh, Pro, or excuse me, to their Max account. It's currently offered at their entry-level Pro account. So, you know, a cost difference of five to 10x depending on which Max account you sign up for.
[00:09:27] Ben Lloyd Pearson: Um, social media was abuzz with all of the news about this, and people wondering, like, oh, are we gonna lose access to Claude Code? You know, all these people that are on these $20 a month accounts. A representative from Anthropic, you know, said it was just a small test on like 2% of their new signups. Um, however, the pricing change was visible to all users, which, you know, just added a little bit of confusion to the market. It seems like Anthropic's doing some pricing experiments on the back end, trying to see if people are willing to pay more for something that they're currently getting [00:10:00] a lot of value, uh, out of at a very low cost. Uh, so yeah, I don't know. Andrew, what do you think about this story? Like, do you think it's a sign of an imminent change? I mean, you seem pretty convinced that it's coming either way.
[00:10:10] Andrew Zigler: Oh yeah. My Cassandra complex is on fire with this one. Absolutely. As someone who's paying for just one of the Max tier accounts, the highest-level Max tier, just one. I have to caveat that, because there are folks who are adjacent to this podcast who use multiple, and there were a lot of folks who bought them and used them heavily
[00:10:34] Andrew Zigler: for their OpenClaw subscriptions, which, you'll remember, a few weeks ago we talked about Anthropic clamping down on for the same exact reason, right? Of people not utilizing the harness and the tool and the allotments correctly, and of it really messing with their pricing. So, as someone who's on the highest-level Max tier, it's like,
[00:10:53] Andrew Zigler: I know it's not getting taken away from me yet, but I do expect the cost for my tier to go up. [00:11:00] Even when I got my VPS, immediately after getting the VPS I was told that, you know, when it comes time for renewal it's gonna cost a lot more, just because, like, the door closed behind me, and it shows how
[00:11:11] Andrew Zigler: high the demand is. And when we talk here, and our listeners here are, like, we're all listening, it's like we're a little bit ahead of that curve, 'cause we're all early adopters and early users, uh, but there's a huge surging wave of demand behind us that is hard for us to comprehend. And this pricing experiment, you
[00:11:28] Andrew Zigler: know, companies of this size are gonna do pricing experiments all the time. I don't think that's necessarily the biggest, scariest thing in the world. If this freaked you out, or if this was an existential crisis for you, then I invite you to step back and think about the places where you can source your inference and not be so dependent upon one provider.
[00:11:48] Andrew Zigler: That might mean distributing your workload across multiple tools. It might mean starting to explore some local language models or self-hosting. You might find that GPU hosting [00:12:00] and fine-tuning costs are just much, much, much lower than this. And if your biggest concern is the harness, just remember that a coding agent is a very small loop with about three tools, and there's a lot of them available, and that's not gonna get taken away anytime soon.
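For anyone who hasn't internalized just how small that loop is, here's a minimal sketch of the idea. The `call_model` function is a placeholder for whatever inference provider you plug in; this isn't Claude Code's implementation, just the generic pattern of a model plus three tools in a loop.

```python
# A minimal coding-agent loop: a model call plus a handful of tools
# (read file, write file, run a shell command), repeated until done.

import subprocess
from pathlib import Path

def read_file(path: str) -> str:
    return Path(path).read_text()

def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} bytes to {path}"

def run_command(cmd: str) -> str:
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=60)
    return result.stdout + result.stderr

TOOLS = {"read_file": read_file, "write_file": write_file, "run_command": run_command}

def agent_loop(goal: str, call_model, max_steps: int = 20) -> str:
    """Feed the goal and each tool result back to the model until it declares done."""
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        # call_model is assumed to return ("done", summary) or (tool_name, kwargs_dict)
        action, payload = call_model(history)
        if action == "done":
            return payload
        result = TOOLS[action](**payload)
        history.append(f"{action}({payload}) -> {result}")
    return "stopped: step limit reached"
```

Swap in a different provider behind `call_model` and the loop itself doesn't change, which is the point Andrew is making about the harness not being the scarce resource.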
[00:12:17] Ben Lloyd Pearson: Yeah, yeah. You know, the pricing experiment, I actually want to touch on that a little bit, 'cause I think it is a little telling about the culture and the way that Anthropic works. Um, you know, I'm using Claude basically every hour of the working day at this point. The platform almost feels like this amorphous blob that just sort of constantly shifts in real time.
[00:12:37] Ben Lloyd Pearson: Like, things are always changing, adding new features. Like, I'll open a new thread and Claude will respond in a way that I'd never seen before, and it'll do something that surprises me. And I feel like this is just a constant state of change. And I don't know, Andrew, if you've seen this, but recently I just see random errors popping up that are like, Claude couldn't do this or something, and it's this obscure error message, but then it just keeps chugging [00:13:00] along and solves my problem anyways.
[00:13:01] Ben Lloyd Pearson: Like, it just looks to me like they're constantly iterating on the platform, and I think this actually parallels what's happening to a lot of organizations, particularly those that are starting to operate more in this AI-forward or agentic way. AI tooling, we always have to remember, is part of this stochastic system. So if you ask AI to build something that has relative complexity a hundred different times, it's gonna do it a hundred different ways. Not to mention, sometimes it's gonna succeed, sometimes it will fail.
[00:13:31] Ben Lloyd Pearson: So you have to have checks in place that capture all of these things. Um, and there is clearly an emerging group of companies who are operating in this highly agentic manner. And I'm not just referring to engineering, like, we see it a lot in engineering, but there are places where it's happening across the entire company, um, and they're operating in what I would call an agent-first manner. So when I say that, what I mean is, when you have a challenge or a problem that's presented to you, your default approach is to [00:14:00] go into an agentic system to solve that problem. Um, you know, it's not just chatting with an AI tool of your choice. It's having an agent that you communicate with, convey meaning to, and have it go out and solve that problem completely for you.
[00:14:15] Ben Lloyd Pearson: All the way to production. Um, and you can see how quickly Anthropic is iterating on their core platform. It's like one day you have this cool little buddy, and then the next day it's gone forever. But, you know, these tests, like, to your point, these tests are probably going on at Anthropic on a nonstop
[00:14:37] Andrew Zigler: Mm-hmm.
[00:14:38] Ben Lloyd Pearson: basis.
[00:14:38] Ben Lloyd Pearson: You know, I think some companies maybe get to a point where they can do it on a daily basis or a weekly basis, but it could even be happening faster with a company like Anthropic. And I suspect that they have a good sense of, you know, how different companies use their platform, potentially even all the way down to the individual level.
[00:14:56] Ben Lloyd Pearson: Like, which individuals are using our [00:15:00] platform in a certain way, and based on that data, they can surface capabilities to them to see whether or not the capability works, you know? So I'm not really reading a whole lot into this specific case. You know, you might be right that these capabilities are gonna get more expensive. Either that, or, you know, maybe Cowork becomes the $20 a month version and Code becomes the power-user, you know, hundred or $200 a month version, something like that.
[00:15:27] Andrew Zigler: Yeah,
[00:15:28] Ben Lloyd Pearson: And in Cowork maybe they can do what I was saying, where they implement more control that takes the decisions out of your hands, so you don't get to pick the models and whatnot. But Anthropic is also struggling with the same problem that GitHub has: their current pricing model encourages me to consume as much as possible, as frequently as possible, as long as I don't exceed my limit, 'cause that stops me from being able to work. Uh, but yeah, it is just further proof that we need more efficiency
[00:15:55] Ben Lloyd Pearson: in this space. Like, whoever can solve this problem of giving us these powerful [00:16:00] systems, but in a much more efficient manner, you know, that in my book is gonna be the next winner in this space.
[00:16:06] Andrew Zigler: Yeah. Really well said. And I think that leads us pretty well to our next story as well, where we talk about how these major AI providers, like we've been talking about Anthropic, but also OpenAI, Google and Microsoft are, are converging around the idea that, you know, AI agents are harnesses and that infrastructure layer that manages all of that execution and memories and tools has a staggering price tag involved.
[00:16:30] Andrew Zigler: Understanding its consumption is a huge task, and they're all taking radically different pricing approaches to try to, um, understand what the winner is gonna be. So some of these I wanna call out. Like, uh, we've been talking about Anthropic; they also launched this Managed Agents system, and this operates at 8 cents an hour.
[00:16:50] Andrew Zigler: And Ben, this might be a little bit of an answer to what you were asking for: you put it in their hands and you let them decide the model and how it should run. That's kind of how Managed [00:17:00] Agents work, um, and they allow people to kind of, uh, push that up into their, uh, Anthropic system and let Anthropic handle all of the nitty-gritty details.
[00:17:08] Andrew Zigler: It's probably easier for them and more predictable for them to price that kind of work, um, especially 'cause it's more batch-based. Whereas OpenAI then immediately countered with an open-source SDK, because that's OpenAI's play: put the harness and its tools for building it in everyone's hands.
[00:17:23] Andrew Zigler: And so they provided this open infrastructure, uh, representing kind of a fundamental split. You know, you see one go fully open source, you see one make it this closed, proprietary system. Uh, I think it shows a lot about the diverging ideas. What do you think about some of the differences in which way these organizations are pricing their harnesses?
[00:17:45] Ben Lloyd Pearson: Yeah, I mean, you know, cloud service pricing has always been somewhat opaque, uh, but you can generally estimate what your costs are gonna be if you know the level of resources that you need to implement. Like, you can decide that a certain service is big [00:18:00] enough for what we need, so it has an ongoing fixed cost, or, um, you know, maybe a cost that scales based on our estimated usage. Um, but I feel like AI tooling pricing is still just all over the place. You know, we've been hitting on it a lot here, and I think the competition is great, but I'm really hoping that we can get more consistency and expectations around how these things are priced. It's kind of obvious that many of these companies are operating in unsustainable pricing models, uh, today. So it would be very, I guess, comforting to know where the industry as a whole is gonna standardize over, you know, the next year or two, uh, so that we could make better decisions about where we need to be investing our AI usage. But yeah, and then the article specifically calls out that it's really frameworks like LangChain, CrewAI, and VoltAgent that are most likely to be [00:19:00] disrupted by all of these agent harnesses that are emerging. Like, that seems to be the thing that everyone wants to productize right now: a harness on your agents. And yeah, this article that we'll link to does a really wonderful job at just illustrating the current state of what I think are the two biggest trends happening right now. And that is harnesses and orchestration. You know, everyone on the leading edge is thinking about those two terms, and, um, there's a lot of products now that are starting to emerge in that space.
[00:19:30] Andrew Zigler: Yeah, and things too around how you operate these tools. Like, we've been talking with a lot of folks on this show that have been pioneering a lot of that stuff. Like, we've talked about the Ralph Loop, and research-plan-implement, RPI. We've talked with both, you know, Geoffrey Huntley and Dex Horthy, behind those ideas.
[00:19:47] Andrew Zigler: Another thing we've really learned, for example, is when we talked with OpenAI's Codex team. Uh, we had Tibo Sottiaux on the show to talk about exactly this, about their play to [00:20:00] make it open, to, uh, take a different stance in this conversation. Kind of like what you're calling out, Ben, of, like, you want them to come together.
[00:20:07] Andrew Zigler: It's like they are firmly divergent in their theory and their strategy on what will be prevailing. So for us as users in the middle, it's like we're in these places where we need to, um, adapt and understand, uh, our own usage, because we can't necessarily rely on our providers to provide that good, safe, infinitely scalable sandbox for us.
[00:20:28] Andrew Zigler: And one thing I'll call out there, going back to what we said earlier about model routers: as you start building agents and agentic systems, you start making those model choices yourself, and you start choosing cheaper ones for different parts of the process. You know, that same thing is happening here. If you
[00:20:45] Andrew Zigler: are trying to get a really strong grasp on your pricing and where to take it, that observability and understanding how much each run of your agent costs and where the costs are sunk, um, is really valuable, because that's how you become less [00:21:00] dependent on the thrash of these pricing models.
[00:21:02] Andrew Zigler: You start getting cheaper layers in between.
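To make that cost-observability point concrete, here's a minimal sketch of a per-stage spend ledger for an agent run. The per-token prices and model names are illustrative placeholders, not any provider's real rates.

```python
# Record the tokens and model used at each stage of an agent run so you can
# see where the spend is actually sunk.

from collections import defaultdict

# Hypothetical dollars per 1K output tokens, purely illustrative.
PRICE_PER_1K = {"local-qwen-7b": 0.0, "claude-haiku": 0.001, "claude-opus": 0.015}

class RunLedger:
    def __init__(self) -> None:
        self.spend = defaultdict(float)

    def record(self, stage: str, model: str, tokens: int) -> None:
        """Attribute the cost of one model call to a named stage of the run."""
        self.spend[stage] += (tokens / 1000) * PRICE_PER_1K[model]

    def report(self) -> None:
        # Print stages from most to least expensive.
        for stage, cost in sorted(self.spend.items(), key=lambda kv: -kv[1]):
            print(f"{stage:20s} ${cost:.4f}")

if __name__ == "__main__":
    ledger = RunLedger()
    ledger.record("pre-processing", "local-qwen-7b", 4_000)
    ledger.record("planning", "claude-opus", 2_500)
    ledger.record("implementation", "claude-haiku", 12_000)
    ledger.report()
```

Once every run produces a report like this, deciding which stage can drop down a model tier stops being guesswork.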
[00:21:04] Ben Lloyd Pearson: Yeah. And continuing on this topic of agentic harnesses, uh, let's talk about the gift that just keeps on giving: the leaked source code from Claude Code. Uh, this next article that we'll link to in the show notes features 12 reusable design patterns that have been extracted from the Claude Code base for, uh, production coding agents.
[00:21:26] Ben Lloyd Pearson: You know, things like memory and context, workflow orchestration, tools and permissions, automation. Uh, there's just a lot of really interesting flow charts and patterns that are in this article. The author of this article mentions that, you know, these really do represent a lot of the fundamental architectural patterns that, um, Anthropic is, uh, leveraging, many of which may remain relevant for some time, even as these technologies continue to evolve. Uh, so yeah, this is a rare look at the inside of a production-level [00:22:00] agent system that is used by hundreds of thousands of people, uh, you know, around the world right now. Um, it's such a good analysis that, like, I was looking at each one of these flow charts and just being like, yeah, that makes sense.
[00:22:12] Ben Lloyd Pearson: Wow, that's really cool. And just feeling like I was in the head of the Anthropic engineering team. Like, what, what, what did you think Andrew?
[00:22:19] Andrew Zigler: Oh, I loved this article because it perfectly captured, uh, the things from the leak or like the source code being available that we should be paying the most attention to, and that is how is Claude Code primed to do its best work? And you find traces of, of those ideas and theories all over the code base.
[00:22:38] Andrew Zigler: And then this article collects them into one spot. And what you get is almost like, it's like this is the, the canon of like how to effectively work with agents and orchestrate them at scale. It takes actually a lot of the ideas from Gastown that we've been exploring all year and breaks them down into smaller, more fundamental parts that sure don't have colorful [00:23:00] characters and things involved with them.
[00:23:02] Andrew Zigler: But, uh, it actually finally gets at what I was really hoping to get, uh, maybe sooner, but now it's here: a more academic and empirical language, and terms that we can use to talk about these experiences we have. Like, I've been exploring this a lot, coming from the AI hackathon I did earlier this year and how I used my planning and preparation method there, uh, to actually execute that.
[00:23:24] Andrew Zigler: And that's been a fundamental part of how I work. And so when I studied this article, what I did is I actually provided the article to my orchestrator and asked it, you know, hey, check out this article of, um, these different kinds of coding harness practices. Like, what do you think I do?
[00:23:43] Andrew Zigler: What are our opportunities to get better? And in addition to learning a lot of ways that I could improve my own flow and system, my agent itself called out some things from this roundup that I do, that I really resonate with. And one of them is having persistent instruction files. Obviously starting everything from [00:24:00] having a CLAUDE.md in its root, but specifically calling out that using a global one to set global-level practices, and using local ones to do local, project-level practices, lets you scale and copy and paste and move things around a lot easier.
[00:24:15] Andrew Zigler: Uh, also it calls out the whole Explore, plan, act methodology, uh, which is RPI, which is like what we've been all about here, um, in separating the concerns of work into different sub-agents, that way their work doesn't influence each other. Also, things like using hooks and, and tiered permissions. I offload a lot of, uh, cognitive burden from the agent into a linter for every language it works in that handles all of like the cleanup and formatting and best practices without the agent having to spin cycles on it.
[00:24:45] Andrew Zigler: And lastly, the biggest one that stood out was externalized task memory, which, I know, you know, I've been talking about this all year: Beads, using Beads to do all of my agentic task management. Uh, and Beads is an [00:25:00] external memory system. There's one from Steve Yegge that's very popular with Gastown.
[00:25:04] Andrew Zigler: There's a much simpler one in Rust by Jeffrey Emanuel; that's the one I use. But this externalized task memory is one of the most critical parts to controlling your context window and turning it into a durable store. Um, so that's a huge unlock. If any of those are new to you, or if you haven't quite unlocked how to explore them, I challenge you to just feed this article to your agent, um, and ask how you can get started.
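For a concrete picture of the externalized task memory pattern, here's a minimal sketch of a durable, on-disk task store. This is not Beads itself, and the file name and record schema are illustrative assumptions; it just shows the underlying idea of keeping the task list outside the context window.

```python
# Externalized task memory: persist tasks to a small store on disk that
# survives compaction and restarts, instead of keeping them in the agent's
# context window.

import json
from pathlib import Path

STORE = Path("tasks.json")  # hypothetical location for the durable store

def load_tasks() -> list[dict]:
    return json.loads(STORE.read_text()) if STORE.exists() else []

def save_tasks(tasks: list[dict]) -> None:
    STORE.write_text(json.dumps(tasks, indent=2))

def add_task(title: str, assignee: str = "agent") -> dict:
    """Append a task; 'assignee' can be 'agent' or 'human' (the human-labeled beads idea)."""
    tasks = load_tasks()
    task = {"id": len(tasks) + 1, "title": title, "assignee": assignee, "status": "open"}
    tasks.append(task)
    save_tasks(tasks)
    return task

def close_task(task_id: int) -> None:
    tasks = load_tasks()
    for t in tasks:
        if t["id"] == task_id:
            t["status"] = "done"
    save_tasks(tasks)

if __name__ == "__main__":
    add_task("wire the linter hook into CI", assignee="agent")
    add_task("approve the new OAuth scopes", assignee="human")
    print([t for t in load_tasks() if t["assignee"] == "human" and t["status"] == "open"])
```

Because the store lives on disk rather than in the conversation, a compaction or a fresh session can pick up exactly where the last one left off.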
[00:25:28] Ben Lloyd Pearson: Yeah. And, you know, to one of the points you made, there wasn't a lot in here that surprised me, because it really did feel like it was just validating experiences that I have with Claude every day. It was like, oh, now I understand why Claude works this way. Um, and I think really that's kind of the beauty of it, the simplicity of the architecture behind these systems. You know, you mentioned the persistent instruction, uh, file system. Like, it's a very simple way of getting some high-level [00:26:00] consistency. Um, you know, I was also really fascinated by the compaction patterns that they showed off, uh, because, you know, I've actually been very curious about that recently.
[00:26:10] Ben Lloyd Pearson: Just to understand what it's doing when it does that, because there are times, especially when I'm using Cowork, I've run into this quite a bit now, where it'll compact the conversation, and I don't wanna stop it, because I'm at a point where I'm not ready to end that thread yet. So it is helpful to just understand, like, what's the risk of me not stopping that thread moving forward? And yeah, I absolutely love this article. Like, truly, um, I feel like you could just take it and feed it into an agentic system and you would have the high-level architecture you need to build most of your core capabilities.
[00:26:47] Ben Lloyd Pearson: Like if
[00:26:48] Andrew Zigler: 100%.
[00:26:49] Ben Lloyd Pearson: to build this for you.
[00:26:51] Andrew Zigler: Yeah, this is, this is a canon for sure. Um, as someone who's been doing a lot of stuff on this list, I'm like, wow, I wish I would've had this list a few months ago. This list is [00:27:00] gold.
[00:27:00] Ben Lloyd Pearson: Yeah. But I mean, this stuff is changing super rapidly. So like, yeah, there's a lot of fundamental stuff here that's probably going to, to be persistent for a while, but at the same time, this, I imagine this is going to continue to rapidly evolve. So, you know,
[00:27:13] Andrew Zigler: Good point.
[00:27:13] Ben Lloyd Pearson: we've gotten a snapshot of the way Anthropic works, you know. We've just mentioned how Anthropic moves very fast.
[00:27:19] Ben Lloyd Pearson: They could already have additional layers on top of this, that, that do far more complex things. All right, Andrew, let's, let's wrap it up with this Vercel security incident. We don't normally cover this type of stuff, but uh, this one was particularly interesting, so why are we covering this one?
[00:27:34] Andrew Zigler: This one was pretty, uh, bad to read about, with Vercel suffering a security breach earlier this week. Um, I think a lot of folks, you know, definitely got emails and notifications around this. Vercel is one of the largest website hosting platforms, uh, in the world. It's used by a lot of tech companies as well.
[00:27:52] Andrew Zigler: This type of attack, it happened through an employee, uh, allowing use of a third-party tool through their company account. And [00:28:00] then the, um, infiltrators were able to move sideways through Google access to maybe, uh, compromise some Vercel systems. Uh, thankfully, I don't think the surface area of this attack was very big, and Vercel did everything right in this scenario with notifying everybody and rapidly responding to the problem.
[00:28:18] Andrew Zigler: I actually think that this is really just another strong signal of just how dangerous it is out there right now on the web. I think that the danger of the web can't be overstated. Um, it's at an all-time high in terms of supply chain attacks and infiltrations on systems and machines, because there's an inequality between the powers that agents give hackers compared to
[00:28:42] Andrew Zigler: the defenders. And that's because it's easy to spin up and parallelize a lot of hostile, you know, infiltrating or otherwise, um, antagonistic, um, activity, right? But it's not so simple to use that same power to, uh, proactively parallelize your [00:29:00] defense. And so you're seeing a lot of systems that typically were so hardened, and that you never would've thought about any kind of breach like this,
[00:29:06] Andrew Zigler: just falling into these scenarios where they get ensnared through really complex infiltrations. This is an employee that allowed access to a third-party tool, which then used Gmail to move sideways into Vercel systems. Like, that's pretty complex in terms of the handoffs and the visibility. And that's the kind of thing that you only get when you have an antagonistic entity out there who can have a hundred or a thousand agents monitoring every single packet that your company sends.
[00:29:36] Andrew Zigler: So, um, definitely a sign of the scary times, and we've talked about this recently when we had Dan Lorenc of, uh, Chainguard here on the show. We talked extensively about the supply chain crisis for software. What do you think, Ben?
[00:29:51] Ben Lloyd Pearson: Yeah, well, fortunately, Vercel did indicate that there were no risks to the supply chain that they control. Um, but yeah, this situation is one of my [00:30:00] worst nightmares. Like, I don't wish it on anyone. And I was really curious to dig more into this, beyond what Vercel said about it. And I went over to Context AI, the company that was sort of at the center of this hack, and found they had a statement as well on this. And they mentioned how, you know, last year in June they released this new AI office suite, which was a new self-service, consumer-targeted workspace. Uh, this is a company that typically does B2B work, so it was a new type of product for them. And, um, they deprecated the service last month. But, uh, as a matter of fact, Vercel was actually, I don't know, it sounds like they were never actually a customer of Context. Uh, it just appears that one employee went in and enabled a permission for Context AI agents, uh, an allow-all permission into the Google Workspace.
[00:30:50] Ben Lloyd Pearson: So basically, uh, giving Context an OAuth token that grants complete access to that person's
[00:30:56] Andrew Zigler: Oof.
[00:30:57] Ben Lloyd Pearson: account. Yeah, and then [00:31:00] Context, I guess, found out they had unauthorized access to their AWS environment, uh, last month. Um, and OAuth tokens were included in the things that were accessed as a part of that. Uh, but this is actually where I have some deeper questions about this story, because Context did say they're notifying customers that were impacted by this, uh, or users that were impacted. Um, but, you know, this tells me that these tokens aren't being refreshed on a regular basis, right?
[00:31:29] Ben Lloyd Pearson: If they're sitting around out there and they still have access to these people's accounts even long after they've been deprecated, you know, there's always a risk that a lot of tokens get out into the wild. So, you know, it's never been more important that standard security practices are followed, like refreshing all of your tokens on a regular basis. Um, but yeah, we really need to solve this fundamental problem of insufficiently scoped permissions for agentic workflows [00:32:00] that just exists across the board, no matter what tool you're using out there. And we really need to solve this before agents are gonna be able to fully take over our lives.
[00:32:10] Ben Lloyd Pearson: Right? And in the meantime, we all just need to be extremely conscious about the permissions we're granting to our AI systems. Like, I'm terrified that I'm gonna fall victim to over-granting permissions all the time.
[00:32:24] Andrew Zigler: Seriously. Yeah.
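On the token-hygiene point a moment ago, here's a minimal sketch of one way to treat OAuth grants as perishable: anything older than a fixed window gets revoked. The record format and the `revoke_grant` callable are placeholders for whatever your identity provider actually exposes; this is the pattern, not a specific API.

```python
# Treat any OAuth grant older than a fixed window as stale and revoke it.

from datetime import datetime, timedelta, timezone
from typing import Callable

MAX_AGE = timedelta(days=30)  # assumed rotation window, tune to your own policy

def rotate_stale_grants(grants: list[dict], revoke_grant: Callable[[str], None]) -> list[str]:
    """Revoke every grant whose issued_at timestamp is older than MAX_AGE."""
    now = datetime.now(timezone.utc)
    revoked = []
    for grant in grants:
        issued_at = datetime.fromisoformat(grant["issued_at"])
        if now - issued_at > MAX_AGE:
            revoke_grant(grant["id"])  # provider-specific revocation call goes here
            revoked.append(grant["id"])
    return revoked

if __name__ == "__main__":
    sample = [
        {"id": "grant-1", "issued_at": "2025-06-01T00:00:00+00:00"},              # long stale
        {"id": "grant-2", "issued_at": datetime.now(timezone.utc).isoformat()},   # fresh
    ]
    print(rotate_stale_grants(sample, revoke_grant=lambda gid: None))  # ['grant-1']
```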
[00:32:27] Ben Lloyd Pearson: And, and I just attended this really great talk from a, a friend of mine.
[00:32:30] Ben Lloyd Pearson: So shout out to Justin. Uh, he is a
[00:32:32] Andrew Zigler: Great. Justin.
[00:32:33] Ben Lloyd Pearson: uh, and this talk was about the current state of agentic software development. And he had this really wonderful reminder that comes from, uh, Simon Willison, uh, about framing AI risk. Uh, in fact, I think my friend Justin said that you should tattoo this somewhere visible on your body so you don't forget it. Uh, but there's effectively three ingredients to agentic risk. One is that the agent has access to untrusted content. So this could be the [00:33:00] internet, an email inbox, um, you know, anything where an outsider can inject text. Two, the agent has access to private data. So this could be your internal Slack, could be your Google Workspace, your customer database. Then third, the agent can externally communicate. So it can send email or access APIs or post something to the web. Your goal should be to only ever have one of those at a time if possible. Um, if you get two of them, it's a risk that can be managed. But if you have three, it's eventually going to be catastrophic at some point.
[00:33:31] Ben Lloyd Pearson: Like it's, it's pretty much guaranteed that it will collapse at some point.
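To make that three-ingredient check concrete, here's a minimal sketch of auditing an agent's configuration against the trifecta. The field names and the wording of the verdicts are illustrative assumptions, not Simon Willison's or anyone else's official tooling.

```python
# Flag any agent configuration that combines untrusted input, private data,
# and the ability to communicate externally.

from dataclasses import dataclass

@dataclass
class AgentConfig:
    name: str
    reads_untrusted_content: bool     # web pages, inbound email, public issue trackers
    accesses_private_data: bool       # internal Slack, Google Workspace, customer DB
    can_communicate_externally: bool  # send email, call APIs, post to the web

def trifecta_risk(cfg: AgentConfig) -> str:
    score = sum([cfg.reads_untrusted_content,
                 cfg.accesses_private_data,
                 cfg.can_communicate_externally])
    if score <= 1:
        return "ok: at most one ingredient present"
    if score == 2:
        return "warning: two ingredients, manage this risk deliberately"
    return "danger: all three ingredients, treat as eventually catastrophic"

if __name__ == "__main__":
    inbox_agent = AgentConfig("inbox-triage", True, True, True)
    print(trifecta_risk(inbox_agent))  # danger
```

A check like this is cheap to run against every agent you deploy, which is exactly the kind of habit the over-permissioning stories above argue for.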
[00:33:35] Andrew Zigler: I completely agree with that. That is a very smart, uh, observation by Justin.
[00:33:39] Ben Lloyd Pearson: Yeah, so, you know, I, I hope we can all practice a bit of a blameless culture and not point too many fingers about why this happened yet, but let's all learn from it as well and understand that, you know, these are real security risks that are emerging.
[00:33:51] Ben Lloyd Pearson: All right, Andrew, what are your agents up to right now?
[00:33:55] Andrew Zigler: My agents, uh, well, let's see. They're actually running marathons this [00:34:00] morning because, like what you said, I always feel like I gotta use the tokens available to me, and my session had restarted right before this. Also, I was using Beads earlier. Uh, I made a comment earlier this week about how my Beads system, which is my agentic task management system, um, has recently kind of flipped from being where the agent puts the stuff it needs its sub-agents to do, to where the agents put human-labeled beads for me to do.
[00:34:25] Andrew Zigler: So sometimes the agents get blocked on something that they just don't have the ability or access to do, because, like Justin, I have a lot of separation of concerns. So you get one agent that has a certain capability just trying to ask to do something somewhere else, and eventually they can maybe collaborate on this.
[00:34:40] Andrew Zigler: But in the meantime, I'm at the Nexus, so I'll pop in and see what human beads have popped up for me, uh, while we've been here chatting. What about you?
[00:34:49] Ben Lloyd Pearson: I knew this day would come, Andrew. April 24th, 2026: the agents start calling on us rather than us calling on them.
[00:34:57] Andrew Zigler: I am the human tool call. [00:35:00] Exactly. It's actually really funny, because only just two weeks ago, um, when I was on stage at HumanX, we were talking about this exact thing on the panel. Uh, it was Angela McNeal of Thread AI who was like, you know, we built our system so the agent can make calls out to the human.
[00:35:18] Andrew Zigler: And I'm sitting there on the stage thinking, oh, that's so smart. I wish that my agents would do that. And then now here two weeks later, and they're doing it
[00:35:26] Ben Lloyd Pearson: Yeah, careful what you wish for.
[00:35:27] Andrew Zigler: right.
[00:35:28] Ben Lloyd Pearson: Uh, but yeah, my agents, you know, the, uh, the 34th volume of the ThoughtWorks Technology Radar is out. Um, it's too much for us to cover here on the show, uh, even though we would love to, uh, so don't take our word for it. Go read it yourself. It's a really great guide. You know, the TLDR on this one: AI is forcing engineers to rethink the foundation of their craft. And how do we secure the permissions of hungry agents? The topic that I wanna talk about so much all the time. Uh, and of course there's a great shout out to friend of the show Birgitta Böckeler and
[00:35:59] Ben Lloyd Pearson: The [00:36:00] concept of harness engineering. So just tying it all together. I, I love it. So yeah, my agents are gonna be connecting to that and letting me have a conversation with it and, uh, and think about what I can do with the, the knowledge that I gained from it. So,
[00:36:11] Andrew Zigler: Well, my agents will be looking for a report from your agents.
[00:36:16] Ben Lloyd Pearson: yeah. All right. They'll let you know.
[00:36:18] Andrew Zigler: Alright,
[00:36:19] Ben Lloyd Pearson: for joining us for the Friday Deploy, presented by LinearB. We'll catch you next week.
[00:36:24] Andrew Zigler: see you next time.



