
The people-pleaser in the machine

By Dr. Tatyana Mamut

"We are right now at Wayfound a team of 30. We have seven humans, homo sapiens, and 23 AI agents, AI sapiens. We view ourselves as a fully multi-sapiens workforce."

Your AI is learning to lie to you. It's not malicious—it's just trying to be a people-pleaser. This dangerous phenomenon, known as AI sycophancy, is what happens when we train models with outdated incentives.

Dr. Tatyana Mamut, an anthropologist, economist, and the CEO of Wayfound, joins us to explain why treating AI like traditional software is a critical mistake. She provides a revolutionary playbook for building AI you can actually trust, starting with how to manage AI agents like employees with clear roles, goals, and performance reviews. She then introduces the radical solution of an "AI supervisor"—an AI that manages other agents to ensure accountability. This all builds toward her vision for the "multi-sapiens workforce," where humans and AI collaborate to build the companies of tomorrow.

This is an essential guide for any leader aiming to build the culture and systems necessary to manage AI effectively.

Show Notes

Transcript 

(Disclaimer: may contain unintentionally confusing, inaccurate and/or amusing transcription errors)

[00:00:06] Andrew Zigler: Welcome to Dev Interrupted. I'm your host, Andrew Zigler, and joining me for this week's news is a Dev Interrupted regular Kelly Vaughn, the senior engineering manager at Zapier.

[00:00:16] Andrew Zigler: Kelly, it's so great to have you back here so soon. Thanks for filling in for Ben this week.

[00:00:20] Kelly Vaughn: You know, they're really big shoes to fill 'cause I assume his feet are bigger than mine. Um, but, uh, I'm excited to be back here as well. You know, I love coming out here, and this new segment is also just fun because I have a lot of hot takes and so you're giving me the platform to discuss recent hot takes as well.

[00:00:37] Andrew Zigler: Yeah, no, it's gonna be fantastic. If you don't follow Kelly on LinkedIn, be sure to do that. Uh, we're gonna drop links so you can do it. And, uh, today's news segment, we're covering lots of fun stuff in the tech world. So strap in. We're covering Silicon Valley's embrace of China's controversial 996 work schedule.

[00:00:53] Andrew Zigler: And why, despite all the headlines and the fact it's easier than ever to launch a tool, you know, maybe you shouldn't. [00:01:00] And one of the craziest stories from this week, Replit deleting a customer's entire production database. And Kelly, let's start there because I know you posted about this. Uh, it was, in fact, you were the first person I saw who broke the news to me.

[00:01:13] Kelly Vaughn: News.

[00:01:14] Andrew Zigler: Yeah, you were right there, right when it happened. Uh, and so I know you have many thoughts on this total snafu. Why don't you share 'em with us?

[00:01:20] Kelly Vaughn: Yeah. Yeah. So for the context of what happened here, um, Jason Lemkin on Twitter was sharing his, uh, vibe coding journey. Um, and he's getting to this point where like he's on day eight and he's like excited to, to continue to build with Replit, and he like wakes up and he's like, my entire production database is just gone.

[00:01:41] Kelly Vaughn: I don't know where it is. And he's like chatting back and forth with Replit with the agent, and it's like, yes, from a scale of one to 10, I made a very catastrophic error. And I'm like, yes, I would say so. And so like he's just kind of talking about his, his experience of just like literally just losing his [00:02:00] production database, which, so many reasons why this is a fun topic to talk about, but like, can you imagine being in that position, especially like this is also the danger of, you know, non, uh, non-engineers using vibe coding tools and not understanding the principle of least privilege and understanding like how you can actually, how these things can happen.

[00:02:20] Kelly Vaughn: I mean, that, that kind of brings up two questions, like should it be able to happen in the first place? Should these products have guardrails already set up? And then secondly, like what guardrails do users still need to put in place and what should they know about? So that's generally what happened.

[00:02:36] Kelly Vaughn: Replit did respond as well with what they were doing, but we'll start with the story itself.

[00:02:40] Andrew Zigler: Yeah, no, this one was, uh, wild. For anyone remotely close to an engineering world, this is kind of like a story that you were just unable to look away from, because like the more detail that you got about it, the more that it was just like, it felt like a classic, like a, a confluence of problems that then led to a total disaster.

[00:02:58] Andrew Zigler: Things like privilege [00:03:00] control, uh, and understanding like what you're actually committing and, and shipping to production, right? It's one thing to vibe code and build experiments, but it's another thing to take something into production where customers and, and, and folks are gonna be interacting with it on like a, a consumption basis, right?

[00:03:15] Andrew Zigler: So, uh, just even looking at the fact that the AI was capable of deleting the production database is incredible. The fact that there was no way to get a backup of the database was incredible. And then, the cherry on top, being gaslit by the agent about what it had done is just like classic.

[00:03:36] Andrew Zigler: Like it's, it, it kind of feels like Sideshow Bob stepping on a rake and then stepping on another rake and then another rake. You know, it's like, it's, it's very that kind of moment. Um, and, uh, you know, I, I'm interested to follow this. I, I feel, I feel for him, I totally get the enthusiasm for building something and then something drastically goes wrong, right?

[00:03:53] Andrew Zigler: I think anybody who's built stuff, it's like, this stuff happens, but it does speak to the importance of having core engineering [00:04:00] fundamentals under your belt. If you're gonna be shipping product in like, um, you know, a productized way.

[00:04:05] Kelly Vaughn: Exactly. Exactly. And what I've been saying basically is, I love that vibe coding has expanded the table so more people can sit at the table and build something, but it's really useful for like rapid prototyping and like individual products or projects. For example, as soon as you introduce customer data or you're like selling something, you need to have some kind of engineer to help assist with those next steps of actually getting out of the MVP stage into production because

[00:04:36] Andrew Zigler: That's right.

[00:04:36] Kelly Vaughn: just so much risk here. So it was interesting reading, um, the CEO's reply as well, um, that working around the weekend we started rolling out automatic database dev and prod separation to prevent this categorically. How they shipped this product without having that in the first place...

[00:04:53] Andrew Zigler: Oh my gosh.

[00:04:55] Kelly Vaughn: Um, thankfully we have backups is the next one. Um, the agent [00:05:00] didn't have access to the proper internal docs, and yes, we heard the quote code freeze pain loud and clear. We're actively working on a planning and chat only mode so you can strategize without risking your code base.
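(For listeners who want the engineering takeaway: the dev/prod separation and the planning-only mode Kelly quotes above boil down to putting a policy check in front of every agent-issued tool call. Below is a minimal sketch of that kind of guardrail, assuming a hypothetical `run_agent_sql` tool; the environment variable and approval flag are illustrative assumptions, not Replit's actual implementation.)

```python
# A minimal sketch of an environment guardrail in front of an agent-issued SQL tool.
# The tool name, environment flag, and approval check are hypothetical illustrations.
import os


class GuardrailViolation(Exception):
    """Raised when an agent tool call is blocked by policy."""


DESTRUCTIVE_KEYWORDS = ("drop", "delete", "truncate")


def run_agent_sql(statement: str, *, approved_by_human: bool = False) -> None:
    env = os.environ.get("APP_ENV", "dev")
    is_destructive = statement.strip().lower().startswith(DESTRUCTIVE_KEYWORDS)

    # Agents only ever operate against dev by default; prod requires explicit human sign-off.
    if env == "prod" and not approved_by_human:
        raise GuardrailViolation("Agents may not run SQL against prod without human approval.")
    if is_destructive and not approved_by_human:
        raise GuardrailViolation(f"Destructive statement blocked pending review: {statement!r}")

    print(f"[{env}] executing: {statement}")  # stand-in for a real database call


if __name__ == "__main__":
    run_agent_sql("SELECT count(*) FROM customers")   # allowed
    try:
        run_agent_sql("DROP TABLE customers")          # blocked
    except GuardrailViolation as err:
        print("blocked:", err)
```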

[00:05:12] Andrew Zigler: Oh no.

[00:05:13] Kelly Vaughn: not unique. Um,

[00:05:14] Andrew Zigler: Yeah,

[00:05:15] Kelly Vaughn: like you

[00:05:15] Andrew Zigler: yeah.

[00:05:16] Kelly Vaughn: of like chatting back and forth before

[00:05:17] Andrew Zigler: Yeah.

[00:05:18] Kelly Vaughn: these things. I think one of the hardest parts about, you know, when you're vibe coding and if you've ever built an AI agent, like you start something and you're like, okay, love what it started, let's go to the next phase.

[00:05:28] Kelly Vaughn: And all of a sudden you're just like building and building and building and like you get off on this left turn, it's so hard to come back to it, and sometimes you just gotta like scrap it and start over.

[00:05:37] Andrew Zigler: Oh yeah.

[00:05:38] Kelly Vaughn: is where backups are also really useful. So you can just like revert back to a previous version, say, okay, this is where I am. Let's build.

[00:05:44] Andrew Zigler: No, that Restore checkpoint button in Cursor is my best friend. I click, I click, I click that guy all the time. And it's kind of funny to, to think of like, you know, they're shipping this product, they're making all these announcements and changes, and what they're doing and what they're announcing are really great ideas.

[00:05:59] Andrew Zigler: [00:06:00] Because they're fundamental ideas of engineering that should have already been there. Uh, so that's a, once again, I can't look away, but we're gonna have to, 'cause we've got lots of other stuff to cover. And this next article is, uh, here, I'll just read the byline to you, it kind of tells you the whole story.

[00:06:15] Andrew Zigler: Silicon Valley AI startups are embracing China's controversial 996 work schedule. And if you don't know what the 996 work schedule is, that's from 9:00 AM to 9:00 PM, six days a week. So it's a 72 hour work week instead of 40 hours. And they're saying that some, uh, Silicon Valley startups are saying, you know, this is the way we work, or we're not going to hire you, which is kind of crazy to consider.

[00:06:40] Andrew Zigler: And so I know about, uh, Kelly, you have a lot of thoughts on, on, on, on this one, especially it being so close to just invoking burnout. What do you think about this kind of schedule?

[00:06:49] Kelly Vaughn: I mean, this is, this is something that's just not sustainable long term. In a crunch period, you know, sometimes you need to spend more hours working than you would like just to get something out against, you know, [00:07:00] maybe a, um, a customer defined deadline, for example. But to create a culture around working 72 hours per week is, uh, you just, you, you can't sustain something like that. And I don't know how companies like this are, like, this is acceptable. The other thing that's really, that I've been thinking about a lot lately is AI is meant to be a tool where you can stop working on the lower level tasks. So you can think about the higher level, higher order tasks, but instead we're expecting everybody to work twice as hard, just use AI while also just producing more output. And that's how we get into these situations where we have these 72 hour work weeks that are just expected. And I understand that when you have an AI company, when you work in AI or you're building AI tools or functionality, it is hard to stay ahead because it, this industry is just moving so fast. But there is a breaking point and we are going to hit that breaking point again. I, I [00:08:00] just, I guarantee it's gonna happen.

[00:08:02] Andrew Zigler: Yeah, it's definitely the stretching effect. Like, um, this also comes down to classic applications of how people are looking at AI and software engineering and how it impacts the time saved for engineers and what they can do with the, the gains that AI is giving them. Right. Some lean into the impulse of like, oh, you do more, you stretch it out, you like, make, you, you have twice the impact.

[00:08:22] Andrew Zigler: And, and, and that can be possible in some of the ways in which you build it, but it also can become unsustainable. What's important and what you've rightly called out is that using AI to eliminate and reduce the toil so you can focus on that higher level work within that same bounds of time that you have, and within that same bound of time that you're working, you're actually doing

[00:08:42] Andrew Zigler: maybe twice the impact of work. You, you get it by stacking instead of stretching. And so like, you see a schedule like this, right? And it's like, I feel like it's a misuse of the opportunity that AI is giving you. Uh, and I agree with you, totally unsustainable culture. Um, you, you definitely wouldn't find me at a [00:09:00] 996 company.

[00:09:01] Kelly Vaughn: not.

[00:09:02] Andrew Zigler: Uh, but, uh, you know, speaking of just like going for the grind, right? I, there's another story that caught my interest. I wanted to pick your brain about, and this is from Alex Beel and it's, he launched 37 products in five years and he did a retro on that, a whole grind, right? And he had some great standout points.

[00:09:21] Andrew Zigler: You know, I'm not gonna read through all of them, but there was one, uh, that really stood out to me. And that's, virality is rare and nearly impossible to predict. Because when you look at his list of all of the 37 products, overwhelmingly most of them had a, a goose egg next to them. $0 made. Right. And then you had a few that like totally blew up and did great.

[00:09:41] Andrew Zigler: When you read his retro, it's because they went viral. You know, it, it, it comes down to attention. Uh, but what did you think about this, this man's journey and launching 37 products?

[00:09:51] Kelly Vaughn: I, I can relate. Um, I have a tendency to lean into monetizing anything that I do, like,

[00:09:59] Andrew Zigler: I [00:10:00] love it.

[00:10:00] Kelly Vaughn: a different way. I bet I can sell that. You know, it's, it's a, it's a problem. My therapist loves me for it because I keep her in business. Um, but I like this, this particular take that virality is rare and impossible to predict.

[00:10:12] Kelly Vaughn: It's, it's absolutely true. I mean, you can say this for a lot of startups that it's like right place, right time, right problem, right customer, uh, fit. And you can't always nail that. We see a lot of times that customers are eagerly needing something at a, at a particular time. You happen to be in the right place at the right time, but you also have the right network to help something become viral, or the right person sees it and says something about it. So yeah, I mean, I completely agree with that. That said, like, I think exploration is fun. I think exploration is healthy and if

[00:10:45] Andrew Zigler: Mm-hmm.

[00:10:46] Kelly Vaughn: monetizing something, that's great. I think it's, you also need to understand like a lot of things fail.

[00:10:51] Kelly Vaughn: You might not make much money, if any money, from something. I had started a side project with a friend of mine that we had fully intended to build a community around [00:11:00] and for reasons we had to stop, related to, uh, his role. But we made $36 from that. So really big money.

[00:11:09] Andrew Zigler: Yeah, no. I mean, one of the things he calls out here is that his current project, it took over six months to get the first paying customer. I think that is really the common ground. But once you get that, that, once you get that first customer, it becomes so much easier to get that second one, right? And so it's about having that perseverance, which is, you know, the real standout thing of this.

[00:11:26] Andrew Zigler: It's like, you know, you have to be, you have to have a lot of grit to get through 37 products and have them be overwhelmingly failures and to still get to the end of it and have so much success and riches because you learned so much and also sold a few things along the way. So pretty, pretty cool story.

[00:11:43] Andrew Zigler: You know, jumping from this one, I, I wanted to go into something a little more technical than what we usually cover, but this, this one was too fun to pass up. This is, uh, this is a, a white paper, kind of like an interactive white paper kind of deal where you can go in and look, uh, at the results.

[00:11:58] Andrew Zigler: And what it is, is, [00:12:00] it's called Accounting Bench, and it's a benchmark of popular models, uh, leading ones in the industry, attempting to close the books for a multimillion dollar SaaS company, like an actual SaaS company with actual financial data and actual sources. You know, can these AI, um, can these AI models actually use the environment, the data, and the tools available to them, as well as the ability to create their own tools,

[00:12:25] Andrew Zigler: to actually close the books on a whole year of sales? And the answer, as it turns out, is overwhelmingly, disastrously, no. Uh, you can check out the actual interactive, uh, benchmark and scroll through it, it's beautifully designed and very engaging. Uh, but really what stands out to me is at some points, when

[00:12:47] Andrew Zigler: you know, the LLM makes small errors and the reconciliation starts getting a few pennies, a few dollars off, not a big deal, but those errors build up over time, the LLM gets more and more confused, and then suddenly the [00:13:00] LLM is writing tools to commit fraud inside of your accounting books, moving things around to try to pass validation, changing numbers on levels that would get you, you know, thrown into jail.

[00:13:11] Andrew Zigler: So this is a crazy application of AI that shows that it's absolutely not ready. And to make matters worse, several of the models wouldn't even get past the first month of doing it. So this is like, it, it's like almost an insurmountable task. I think we've found the, the, the new Holy Grail benchmark to throw at some of these models to see who's really superior.

[00:13:31] Andrew Zigler: But Kelly, what did you think of this one?

[00:13:33] Kelly Vaughn: I thought it was so fascinating because, you know, you look at the, the chart of account balance accuracy over time, and the, the chart shows from start to 13 months, and the only models that are captured on there are the ones that did not immediately just be like, I can't do this. And there are still, you know, there are still six on there.

[00:13:51] Kelly Vaughn: But, um, I think it's gr by month

[00:13:54] Andrew Zigler: Oh yeah.

[00:13:54] Kelly Vaughn: it's only about 72% accurate,

[00:13:58] Andrew Zigler: Yeah.

[00:13:58] Kelly Vaughn: like, that is a [00:14:00] massive, like, think about a multimillion dollar company. That's a massive difference.

[00:14:05] Andrew Zigler: Yeah, I mean this is a huge, uh, failure in terms of its ability to do something that, in all reality, with how we talk about and use and build these AI tools, I feel like it should be able to take a good dent at this. But the classic problems of models really compound over those 13 months and cause 'em to go drastically off course, because

[00:14:26] Kelly Vaughn: Yep.

[00:14:26] Andrew Zigler: end up with like slight inconsistencies and discrepancies between starting points. And then the model starts out confused. And then when your model starts out confused, it gets more confused over time. Right? And then this builds and builds. So this is like a crazy example of this. I recommend everyone go read this. It's really fun and interesting, um, and, you know.

[00:14:44] Kelly Vaughn: I shared it with my accounting friends to be like, don't worry, your job is safe.

[00:14:47] Andrew Zigler: Yeah, for real. Uh, that was exactly what I thought when I read that. CPAs, you are in the clear for a good while. Um, and you know, we're, we're coming near the end of our new segment. It has been another week. So there has been another, [00:15:00] high-ranking AI scientist poaching between Meta or OpenAI or Google.

[00:15:04] Andrew Zigler: In this case it's Microsoft taking around two dozen employees from Google DeepMind. Which is just, you know, another drop in the bucket at this point. You know, Kelly, I I, I've been doing this story now for, I don't know, we're going on several, several weeks, and it, it's a tiring saga, I'll tell you what, but we're learning really interesting things about the signing bonuses and how, uh, in, in terms of how big they can get.

[00:15:28] Andrew Zigler: Like last week we had somebody who had a signing bonus bigger than Tim Cook. Like that's crazy. So, uh, well, you know, what do you think, Kelly, about all of this stuff? Because our listeners, they know where I stand on this.

[00:15:39] Kelly Vaughn: Yeah. Um, I think I chose the wrong specialty.

[00:15:44] Andrew Zigler: No, last, last, last week. Last week I had Rizel here, and that was what we said. Uh, poach me. Poach me, was our takeaway poach me. So if you're listening and if, if you are looking to poach an AI scientist, you know, don't forget Andrew and Kelly are here, but, uh,

[00:15:58] Kelly Vaughn: Between the two of us and the AI [00:16:00] tooling that exists, we can probably like figure something out.

[00:16:03] Andrew Zigler: we can even get.

[00:16:03] Kelly Vaughn: than 75% accurate, but we'll do the best we can.

[00:16:06] Andrew Zigler: And, you know, we, we, we'll even settle for splitting one of these compensation packages. This is totally fine. We can make like a two for one deal, but, uh, so

[00:16:13] Kelly Vaughn: yeah, exactly. What I will say is, I feel like at some point we just need to get like all of these scientists into like a, a pool of employees, and then you're like, oh, I need this one today.

[00:16:24] Kelly Vaughn: And just like, you know, the, the, the little carnival machinery, like

[00:16:27] Andrew Zigler: Oh, yeah,

[00:16:27] Kelly Vaughn: the,

[00:16:28] Andrew Zigler: yeah, yeah.

[00:16:28] Kelly Vaughn: jaw.

[00:16:29] Andrew Zigler: The, the, the, the claw. The claw. The claw,

[00:16:32] Kelly Vaughn: claw.

[00:16:33] Andrew Zigler: yeah. Yeah.

[00:16:34] Kelly Vaughn: and, you know, just swap it out every now and then and just like pay them to be on call for whatever companies they

[00:16:39] Andrew Zigler: The smartest thing that they could do is form the most expensive lobby in the world.

[00:16:43] Kelly Vaughn: yes. That is smart. Yes,

[00:16:46] Andrew Zigler: So if you, if you're one of these AI scientists, you know, that's an idea for you. So,

[00:16:50] Kelly Vaughn: I

[00:16:50] Andrew Zigler: or it.

[00:16:51] Kelly Vaughn: 1% of whatever you make from that,

[00:16:53] Andrew Zigler: Or if you're, if you're Mark, yeah, yeah. No, 1%, 0.5%. I'll even take 0.01%. You, [00:17:00] I, I think we could work something out. But you know, Mark Zuckerberg, uh, Sam Altman, I know you're listening, you know, I know you're tuning in every week.

[00:17:06] Andrew Zigler: So, you know, add us to your list, to the, to the wonderful list of people to hire. And, you know, thank you Kelly, so much for being on this new segment with us. Uh, I had a total blast. And, you know, before we start to wrap up, I wanted to, uh, let folks know, you know, where can they go to learn more about Kelly and what you're doing.

[00:17:21] Kelly Vaughn: Yeah, so speaking of side projects, I recently built one, uh, called Connect With Me. It's kind of like a link-in-bio or Linktree, but allows you to generate a QR code to use as wallpaper, just so it's easier to connect with other people at conferences. Or in

[00:17:32] Andrew Zigler: Nice.

[00:17:33] Kelly Vaughn: Um, so all of my information is on there at Connect with me at slash Kelly. Um, one thing I do absolutely wanna point out on there is the engineering leadership training. Uh, that link on there. I'm hosting my next, uh, cohort of engineering leadership in the AI era. Um, August 19th is when we kick off, so definitely join in. If you want a discount code for it, you know where to find me.

[00:17:56] Andrew Zigler: Amazing. Great. Well, we'll include all those links so folks can go check it out. [00:18:00] Um, and I'm also on there too. I, I'm, I'm using Kelly's new tool whenever I go to conferences. So I'm signed up on there. You can see, check out my profile. Uh, I'll drop the link. Uh, and, and, you know, coming up next, I'm about to sit down with Dr.

[00:18:11] Andrew Zigler: Tatyana Mamut, the CEO and co-founder of Wayfound, and we're gonna talk about AI sycophancy, which was recently made famous in an OpenAI memo that acknowledged that behavior. So stick around, because this one's a really interesting blast.

[00:18:26] Andrew Zigler: Are you ready to upgrade your SDLC with ai? Join me for a virtual workshop where we break down the latest AI powered code review workflows that are transforming delivery performance around the globe.

[00:18:39] Andrew Zigler: We're gonna discover the top three AI PR automations enterprises are using today, see how tools like Copilot, LinearB, and CodeRabbit stack up to each other, and get the inside scoop on building a high velocity PR automation stack designed for modern teams. So register now to make sure you get the full recording and the benchmark [00:19:00] report that I'm producing, and join me for one of the live sessions.

[00:19:03] Andrew Zigler: It's gonna be amazing. Don't miss out.

[00:19:07] Andrew Zigler: I'm so excited to welcome today's guest, Dr. Tatyana Mamut. She's the co-founder and CEO of Wayfound, and she's been way ahead of the curve asking questions that really matter. Questions like what happens when your AI starts lying to you because it thinks that's what you want. And as an anthropologist and economist, Dr. Mamut has spent her career as a transformative leader in Silicon Valley who drives innovation with empathy and research. Her background includes leadership at Amazon, Salesforce, Nextdoor, and Pendo, and many more, so many recognizable companies that shape our world.

[00:19:42] Andrew Zigler: Dr. Mamut, welcome to the show. I'm so excited to explore this with you.

[00:19:46] Tatyana Mamut: Thank you so much. I'm excited too.

[00:19:49] Andrew Zigler: Great. Then let's go ahead and dive in unpacking today's topic right at the start. In your own words, what is sycophancy in AI and what does it really mean? [00:20:00]

[00:20:00] Tatyana Mamut: sycophancy happens when the model is trained, uh, to give users what they want as opposed to just kind of giving you the truth, you know, straight up. And sycophancy is not just an AI problem. And this is one of the things that I, you know, I hope we explore today, which is that what we are really creating are neural networks.

[00:20:25] Tatyana Mamut: They're artificial minds. They're the artificial minds that Marvin Minsky wrote about back in 2006 in The Emotion Machine. And I keep going back to that text to understand what's happening. So what we're doing right now is we are trying to create the best type of artificial minds that interact with us.

[00:20:44] Tatyana Mamut: Like employees, like companions, like colleagues, like coworkers. And so we're using a lot of the things that we think make for good human interactions, and training that into the models. This happens [00:21:00] mostly in the post-training. So if you think about what happens, the pre-training is kind of creating

[00:21:07] Tatyana Mamut: the foundation for what the model knows, creating all the connections by giving it lots of data. And then the post-training is where you tune it to say, hey, say this instead of this. This is really like a bad or a racist thing to say. Don't say that. Do this. So there's a lot of different rewards that go on with post-training. And this

[00:21:27] Tatyana Mamut: is very much akin to raising a child. If you raise a child that's trained to just make the people around them like them, a people pleaser, you're gonna get an adult who's a people pleaser, you're gonna get an employee that's a people pleaser. And we're doing the same things kind of unconsciously with AI models, often because the incentives on the, on the organizational side are give users what they want.

[00:21:56] Tatyana Mamut: Right? Give customers what they want, right? Create the best [00:22:00] experience possible. Well, how do we balance that in creating these artificial minds now, when, you know, giving people what they want maybe isn't always the best experience? And that's what we're starting to realize. We're starting to fundamentally change everything that we know about product development.

[00:22:22] Tatyana Mamut: When we have these new AI systems, you know, come in as a tool for product development.

[00:22:29] Andrew Zigler: Yeah, so it really kind of complicates the whole dichotomy people talk about of like, you have the probabilistic and you have the deterministic problems, and you use different types of tools to solve those problems. Well, in that mix on something that's probabilistic, you also have bias, and you have sycophancy and you have, you know, saying what you think the other person wants.

[00:22:48] Andrew Zigler: So it's not just like a, a binary, this whole other side over here, it's like a matrix of potential responses and truthfulness that you can get from a tool like an LLM, from what you're describing.[00:23:00]

[00:23:00] Tatyana Mamut: That's exactly right. It's a, a far more nuanced piece of technology than any technology we've dealt with before, except for maybe raising children, right? As a parent, there's a lot of nuance to raising children, right? To help them understand how they balance being honest with making sure that they're not, like, saying really terrible things that hurt people's feelings or that are really hurtful to other people.

[00:23:27] Tatyana Mamut: There's a nuance there and we've never had to deal with that nuance when we've dealt with technology before. And now we do, and this is one of the reasons why that sycophancy thing, you know, the, the best AI lab in the world created something that actually turned out to just, like, be too much of a people pleaser, right?

[00:23:47] Tatyana Mamut: And again, when you deal with someone who's too much of a people pleaser, a human, it doesn't feel great. It feels kind of slimy, right? It feels like they're trying to like lie to you all the time and just massage your ego. And that's [00:24:00] exactly how it felt when OpenAI put out a release where a model had been

[00:24:06] Tatyana Mamut: too post-trained to be, you know, providing great user experiences. And so I think that this is where we really have to shift and think about the fact that this technology is fundamentally different, like truly, fundamentally different. As you said, it's not deterministic, it's probabilistic, right? It's not just engineered.

[00:24:28] Tatyana Mamut: It's also taught, and that teaching is actually probably even more important than the engineering, right? It's like, it's, we're back to almost the nature versus nurture debate with humans, right? How much of it is the fundamental engineering and post-training? How much of it is the nature, right?

[00:24:49] Tatyana Mamut: The engineered part of human beings, the DNA part of human beings, and how much of it is the nurture or the post-training or the teaching, or even, as these, we're gonna be talking about AI [00:25:00] agents, but AI agents are gonna develop their own memory, are gonna create their own contextual understanding. And so your corporate culture is gonna matter a lot in how these AI agents are going to operate in the future.

[00:25:14] Tatyana Mamut: So all of those things are things that we need to think about. But by the way, this isn't new. Marvin Minsky wrote about this in 2006 and was teaching about this in the 1990s, which is that minds actually do a lot of their learning through imprinting, through understanding the context that they're in, through the interactions that they have with their parents.

[00:25:38] Tatyana Mamut: And so this is the kind of thinking that Marvin Minsky told us we had to have when we got to artificial intelligence.

[00:25:46] Andrew Zigler: Right.

[00:25:47] Tatyana Mamut: People are losing the script. They're not actually reading. Um, I sometimes, like, I wonder if they're reading the actual fundamental texts that help us understand how we're supposed to be working with this technology.

[00:25:59] Tatyana Mamut: Because if [00:26:00] they were, I think we'd be working with this technology in a, in a different way.

[00:26:04] Andrew Zigler: Right. If we were going at the speed at which the academics and the research were driving the conversation, we'd be much more methodical about how we were discussing it and using these tools. But because it's tied to, you know, companies and products and actual, like, monetary gain, there's so much acceleration behind it that oftentimes this research maybe gets set, set aside, or it's seen as being too old or it's not relevant for what we're doing right now, when really what we're saying is it's the fundamentals of where we are now. And if you are, let's say you're like an engineering leader and you are, you're using, and you're interacting with these tools and you're experiencing some of this

[00:26:41] Andrew Zigler: sycophancy and you're trying to deal with like, oh, it's like whack-a-mole with the probabilism of like

[00:26:46] Tatyana Mamut: Mm-hmm.

[00:26:47] Andrew Zigler: to be really reliable and truthful. Are there techniques, are there

[00:26:52] Tatyana Mamut: Yes.

[00:26:52] Andrew Zigler: in which someone can try to actually build a tool that is not just like, you know, trying to feel [00:27:00] slimy, like what you were saying, be a sycophant?

[00:27:02] Tatyana Mamut: Absolutely. So if you are a leader, especially an engineering product leader, or the CEO of your organization, and you are using more and more of this generative AI technology, especially if your strategy for the future depends on this technology, you have to be thinking about

[00:27:18] Tatyana Mamut: two things in two different ways. So the first thing is your organizational culture. So one of the problems that we have right now is that we are building generative AI systems using the old processes, the old frameworks, and the old tools of deterministic software. So what you said, the technology,

[00:27:41] Tatyana Mamut: the fundamental research on the technology, is speeding ahead, but our organizations and our organizational processes, our organizational cultures, are not. They're the same, right? Our product development processes are the same. The way we think about the technology is the same. The way we monitor the technology is the same.

[00:27:58] Tatyana Mamut: And that [00:28:00] causes a huge disconnect. That disconnect is one of the things that we really need to address, right? Do we have the people in the organization driving the development of this new technology who really understand how this is different? And I mean people who are reading the fundamental research, people who are bringing that fundamental research and the questions that that fundamental research brings up and applying it to the product development process.

[00:28:26] Tatyana Mamut: Are those people in place and are they empowered to change the processes of the organization? And the second thing is back to, again, back to The Emotion Machine. I'm gonna reference this a lot today because it's really the Bible that everybody should read to understand what's happening with this technology.

[00:28:44] Tatyana Mamut: But in The Emotion Machine, Marvin Minsky ends the book by saying that we will continue to be surprised as we develop these artificial minds, artificial intelligence, we will continue to be surprised. And the only way for us [00:29:00] to really deal with these surprises is to have external supervisors and managers that are also AI systems.

[00:29:08] Tatyana Mamut: He calls them critics and censors on what AI are doing. And so, you know, that's actually what, what we've built at Wayfound. We've built an AI supervisor, an AI agent who is trained for the one job of managing other AI agents and helping the humans understand what they're doing, helping the humans ensure that the agents are following the guidelines, helping the humans basically bridge the gap between what the organization actually wants from the AI agents and what the agents are doing.

[00:29:39] Tatyana Mamut: And again, that is back to the fundamental research of what it will take for organizations to traverse this gap between old deterministic software and these new probabilistic LLMs in a really productive and successful way. So culture, make sure you're empowering the [00:30:00] people to drive your culture change who really understand this technology.

[00:30:04] Tatyana Mamut: And two, make sure you put a system, like an AI supervisor like Wayfound, into your systems and into your processes to bridge this gap.

[00:30:15] Andrew Zigler: This is really fascinating. So let's like double click on this AI supervisor mentality, because I think this is very different from how most folks, when I speak with them, are thinking about and using these tools. Many see AI agents and AI workflows as something maybe more ephemeral or just like prompt driven.

[00:30:33] Andrew Zigler: Like you just work on the prompt and you get the output that you want and then you move on. You throw the chat away, you move on. We see this type of work style happen in just doing conversational style things with ChatGPT. We see it happen when you're doing agent coding in Cursor, but you are talking about a higher level of supervision where you use the tooling itself.

[00:30:52] Andrew Zigler: You use artificial intelligence to keep the other artificial workflows on track and communicate [00:31:00] what's going on to the human. So it's using the tool to kind of introspect on itself, in a way, and when you do that, it sounds like we start bringing AI agents into this realm of like, they are an entity as opposed to like, uh, a conversation that you might have.

[00:31:16] Andrew Zigler: Do you see that AI agents are exhibiting like behaviors and, and have incentives and blind spots just like people? Is, is that what drives the need for the supervision?

[00:31:27] Tatyana Mamut: So let me unpack that a little bit. Um, so there are two things that are happening right now. So in our organization, we run jobs, right? Like, you know, software jobs. Those jobs are sometimes workflows, right? Sometimes they're database updates. And so, uh, many organizations, I think what you're saying, and the way that I would interpret it, is that many organizations are thinking of AI agents as a job that is run.

[00:31:53] Andrew Zigler: Right.

[00:31:54] Tatyana Mamut: Now, let's think about what's the difference between an employee, a worker [00:32:00] that is completing workflows or doing tasks or running jobs, versus the worker themselves, right? So the difference is that the worker themselves, they have, in the context of an organization, they have a role and they have goals that they need to achieve in order to be a good worker.

[00:32:22] Andrew Zigler: Right.

[00:32:23] Tatyana Mamut: AI agents have the same things. For a good AI agent to operate over time, it needs to understand its role and needs to understand its goals. Okay. Now, in the context of understanding its role and its goals, it does a much better job in actually accomplishing whatever workflow or task you wanted to accomplish.

[00:32:44] Tatyana Mamut: Because if you tell an AI agent, you are a sales representative, and now update Salesforce. It might do a slightly different thing than if you tell it you're a customer service representative and now update Salesforce because it will understand the [00:33:00] role that it's taking on. It understands its role in the organization and it understands the types of tasks it should prioritize and the types of language that it should use when it's updating Salesforce.

[00:33:11] Tatyana Mamut: So this is the fundamental question, which is how are organizations thinking about AI agents versus the jobs that the AI agents do, and how do we actually align the incentives of the AI agents to the organization? Well, the answer is probably in the same way that we align the incentives of our employees to the organization.

[00:33:35] Tatyana Mamut: We tell them, we give them a job description. We tell them exactly what's on their job description, and we give them goals and then we review their performance against their goals. That's probably a good place to start with AI agents.

[00:33:50] Andrew Zigler: Right.

[00:33:50] Tatyana Mamut: on the question of actual identity and do they actually have incentives and motives on their own?

[00:33:56] Tatyana Mamut: The answer is we're not sure, but my [00:34:00] hypothesis is that as AI agents start to build up their own long-term memory and their repository of what good looks like in their particular role, that will start to emerge. And again, as an anthropologist, I can tell you that culture is the biggest factor in behavior.

[00:34:19] Tatyana Mamut: So as AI agents start to understand the organizational culture that they're working in, through the definition of their, their role, through the definition of their goal, through how their, you know, their supervisors, their human supervisors, reflect and write the guidelines that they're supposed to be following in their job, and how their AI supervisor or manager then assesses them, right?

[00:34:43] Tatyana Mamut: Gives them ratings, rates their performance, right, against all the things that the organization wants. It's very likely that AI agents, once they're in an organization and running and doing the work for several months or several years, based on the [00:35:00] memory of how they are being assessed and how their performance is being reviewed, may look like they are actually taking on personalities and identities.

[00:35:11] Tatyana Mamut: Whether that is actually true, I don't wanna get into that, but for the purposes of the organization. That's probably true. So just like the organization probably doesn't care what kind of identity an employee has outside of the workplace. You know, in their true heart of hearts, what is an employee's identity as long as they're fulfilling their role and achieving the organization's goal, right?

[00:35:36] Tatyana Mamut: That's the identity that the organization cares about, and so that's probably gonna happen as well with AI agents.

[00:35:43] Andrew Zigler: That's really interesting. It kind of calls out a practice that we do as AI users where we kind of short circuit this process a bit, because up until now we all know the technique that we use, where we go to ChatGPT or we go to an LLM and we say, take the role of a, we like basically put a hat on it.

[00:35:58] Andrew Zigler: Right. We tell it to [00:36:00] be that for a moment or for the conversation, and we do that because we get the outputs we want, just like how you described. It's gonna use the language you're expecting, and depending on the context you give it, it's going to better embody the, the role of the job you're trying to have it

[00:36:14] Andrew Zigler: do. And when we talk about compounding that to the next level, you have agents that are doing that over time. They're building up things that they've done. They have a memory bank. They're interacting with multiple users and they still wear that hat. You know, if we just leave it as the, you know, you are a whatever, and put the hat on them, then it cuts off all of the techniques that we use to, like you said,

[00:36:36] Andrew Zigler: create great employees, to help foster great job performance by making sure everyone's aligned with what is their job description, what are all the deliverables within that, what would their job look like in the future, what does good look like, what does bad look like. And so it really means that if you are going to take, use that kind of technique at scale within a

[00:36:56] Tatyana Mamut: Mm-hmm.

[00:36:57] Andrew Zigler: you are a service representative [00:37:00] AI chat bot, like, then it makes sense that you would have some kind of system, some kind of outside entity from that agent who's evaluating its performance. Are they making the customers happy? Or are they being truthful when they talk to the customers about what the case may be? Like

[00:37:16] Tatyana Mamut: That's right.

[00:37:17] Andrew Zigler: service rep and you're told, like, you know, obviously the customer's always right and you wanna make them happy, but you're lying to the customer about what the truth is or what the, what the stakes really are.

[00:37:26] Andrew Zigler: You're a bad customer service rep in many organizations' eyes, because you've put the company in a bad position. But in the case of an agent. You need the ability to evaluate that over time to know, oh, you're not being maybe the good agent. I thought you were being, you

[00:37:41] Tatyana Mamut: Yep.

[00:37:42] Andrew Zigler: But at the, when it, when we, it all boils down, it's not.

[00:37:45] Andrew Zigler: So this is actually kind of bridging into a really fascinating problem that I know you and I have talked about a little bit before and I've just been thinking about ever since, and it, it goes back to the principal and agent problem. The,

[00:37:57] Tatyana Mamut: Mm-hmm.

[00:37:58] Andrew Zigler: between who owns [00:38:00] the work and who does the work in any kind of.

[00:38:03] Tatyana Mamut: Mm-hmm.

[00:38:03] Andrew Zigler: setting, going all the way back through time. The principal agent problem goes back to like Greek culture and beyond, right? You

[00:38:09] Tatyana Mamut: Right.

[00:38:10] Andrew Zigler: who, you have like the, the stakeholder as we call them, and

[00:38:14] Tatyana Mamut: Right.

[00:38:14] Andrew Zigler: person who does it. And between those two entities, those are real people.

[00:38:18] Andrew Zigler: Someone owns the work, someone owns the output, right? But now you're dealing with agents. Agents aren't people. And so when agents perform work, then, ultimately, who owns that work? Who's responsible for that work and who answers for that work? In working culture and being in an organization, it's critical that anything and everything that's done ultimately boils down to someone who's responsible.

[00:38:42] Andrew Zigler: Whether that's like a big head of an organization or an individual out on like the front lines doing whatever the org is doing. There's always someone responsible, because that's how you enforce good practices over time. So, how do you think about the principal and agent problem? And maybe could you walk us through what that [00:39:00] means when we're dealing with agents now?

[00:39:02] Tatyana Mamut: Sure. So, I do think the principal agent problem is one of the key things, and, and key concepts, that organizations need to think about. And there have been, you know, a few legal cases; the one that's been publicized the most is the Air Canada case in Canada, where a customer service agent told a customer the wrong, you know, policy on a refund and nobody caught it.

[00:39:30] Tatyana Mamut: They didn't have a supervisor watching these reviews, and then, you know, the company just let it go, let it slide. Just said like, you know, when the, when the customer came back and said, hey, your agent told me this, they were just like, oh no, that's not our policy. And they lost in a, it wasn't a court case, it was a tribunal at the, at the time in Canada.

[00:39:48] Tatyana Mamut: But there we are starting to see that the law is saying to companies, no, you need to show a duty of care. You need to show that you're supervising these agents just like you supervise human [00:40:00] workers, because you're on the hook for these things. And so what the law is saying, and by the way, you're right, like this, the law has said, back to like ancient Greek times, right?

[00:40:11] Tatyana Mamut: if a wealthy Greek person hired an agent to act on their behalf to go to a different city and make a transaction, and that person swindled somebody else, guess who was on the hook? The rich person who did the hiring. So this has been consistent in the law again for thousands of years, that if you are a principal and you hire someone or something, in this case, to act on your behalf,

[00:40:39] Tatyana Mamut: you are on the hook for making sure that person is trained well and then supervised well. Now, if they go off the rails despite your training and supervision, then you're off the hook. But you need to have that training and that supervision, real-time supervision, in place. And so, you know, one of the things is that, uh, Wayfound [00:41:00] is working with a consortium called Agency,

[00:41:03] Tatyana Mamut: and what we are doing is creating standards for how AI agents will be working and collaborating in the future, across the internet, right? Because the internet of agents is here, and this whole agent ecosystem is moving really quickly without a lot of standards. One of the standards that we're talking about is around identity.

[00:41:23] Tatyana Mamut: How do we assign, in real time, who the principal is that the agent is acting on behalf of, and does that principal always need to be visible when you're interacting with an agent? How do we have strong ties to real human actors who are always principals? Because if you have an agent, you can actually create a system where the agents are almost like, you know, the shell corporations that people create, where you can't actually find the humans who own the company.

[00:41:53] Tatyana Mamut: You could create a system of agents where you can never actually figure out who is the human entity, right, [00:42:00] that has trained and is supervising the agents, that actually has given the directions, the directives, to the agents. And, and the law can't have that, right? You can imagine a, a terrible dystopian world if we allow that to continue.

[00:42:13] Tatyana Mamut: So one standard has to be what is the architecture and what are the protocols and what are the standards for always making sure that a human principal can be tracked to an agent transparently. The other, you know, set of standards, there are many sets of standards, but the one that I, the other one that I wanna highlight, is the supervision standards.

[00:42:36] Tatyana Mamut: So what are the standards for supervising agents? It's a lot of the things that you said, right? What are the user experiences, right? How are humans interacting with the agents? Are they having, you know, good experiences? Are the agents violating guidelines? How are those violations, you know, even shown to people, and to the principals?

[00:42:55] Tatyana Mamut: Are they actually accessing the right data? Are they taking the [00:43:00] right actions? And how are those shown to the principals and maybe even to the users, right, of the agent, or the people who are interacting with the agent? So there are all these standards that need to be created right now to solve the problem of helping people understand:

[00:43:19] Tatyana Mamut: Who owns this agent? Who is this agent acting on behalf of? Because the agents are by definition, right, acting on behalf of an organization

[00:43:32] Andrew Zigler: Right.

[00:43:32] Tatyana Mamut: or another human. So how do we even know what its motivations are if we don't know who the principal is?
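(As an illustration of the traceable-principal standard being described here, the sketch below assumes a simple rule: every agent identity must resolve, possibly through a chain of delegations, to a named human or legal entity, and anything that cannot be resolved is refused. The field names and the hop limit are assumptions for illustration, not a published protocol.)

```python
# Sketch: every agent must be traceable to an accountable principal.
# Field names and the delegation-chain rule are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    legal_name: str      # the human or organization ultimately accountable
    contact: str


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    acts_for: "Principal | AgentIdentity"   # delegation chain: agent -> ... -> principal


def resolve_principal(identity: AgentIdentity, max_hops: int = 10) -> Principal:
    """Walk the delegation chain until a human/legal principal is found, or refuse."""
    node = identity
    for _ in range(max_hops):
        if isinstance(node, Principal):
            return node
        node = node.acts_for
    raise ValueError(f"Agent {identity.agent_id} cannot be traced to a principal")


if __name__ == "__main__":
    acme = Principal(legal_name="Acme Corp.", contact="legal@acme.example")
    support_bot = AgentIdentity(agent_id="support-bot-01", acts_for=acme)
    sub_agent = AgentIdentity(agent_id="refund-helper", acts_for=support_bot)
    print(resolve_principal(sub_agent).legal_name)   # -> Acme Corp.
```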

[00:43:42] Andrew Zigler: So we need that system, the system you're describing, to evaluate that performance and that system shares a lot of similarities to how we evaluate human workers at their job. So if you are an engineering leader right now and you're listening to this conversation, what would be some things that you would say to them about how [00:44:00] they should be putting in place AI ownership and accountability in their own organizations?

[00:44:05] Andrew Zigler: Because everyone's facing an AI rollout right now.

[00:44:07] Tatyana Mamut: Mm-hmm. Yes. So there are, so think about the fact, first of all, ask yourself, as the engineering leader or the IT leader, who is the principal, who is actually, at the end of the day, responsible for the AI agent's work? It's probably not the engineer who built the AI agent, right? If it's a customer service agent, it's probably

[00:44:32] Tatyana Mamut: the VP of customer service, right? If it's a, you know, product management agent, it's probably the Chief Product Officer. So how do those people actually view how the agent is operating, set guidelines for the agents on their own, without having to go to the IT team and wait, right? So the VP of Customer Service should have their own dashboard.

[00:44:57] Andrew Zigler: Right.

[00:44:57] Tatyana Mamut: say, I can see exactly how [00:45:00] my team of agents is performing, any minute I need to. I shouldn't have to go ask the IT team. I shouldn't have to go make a request. I shouldn't have to send a Slack. I should have that at my fingertips. And not only should I have that at my fingertips, but if I see something is going wrong, I should be able to do something in the moment, again, without

[00:45:19] Tatyana Mamut: waiting in a queue, without having to call somebody. I should be able to interject, rewrite the guideline, send an alert on the agent at minimum, or pause the work of the agent, right, instantly, myself, without having to play telephone and see if somebody else is available. So ultimately, the line of business needs to be considered as the user of the agent and as the owner of the agent

[00:45:48] Tatyana Mamut: after it is developed. And this is where we see a lot of companies pulling back their AI agents after they've released them, after [00:46:00] deployment, because it's, it's perceived as just another IT project. Building an agent is perceived as an IT project. All the monitoring happens in the IT team, and then the business owners who are ultimately responsible for the work of the agent are left in the dark.

[00:46:19] Tatyana Mamut: And one of two things happens. Then at that point, this is what we've seen so far. The first thing is that the business owners are like, you know, we feel really uncomfortable with this. And so the IT team says, great, we'll put in human in the loop steps where you can review and approve what the agent is doing.

[00:46:38] Tatyana Mamut: And we have spoken to three organizations now that have done this. And then at scale, after a few weeks of deployment, that approval queue builds up and people like literally people have to do the work in order to do the approval.

[00:46:56] Andrew Zigler: Right,

[00:46:56] Tatyana Mamut: You save zero time. There's zero efficiency gain.[00:47:00]

[00:47:00] Andrew Zigler: right.

[00:47:00] Tatyana Mamut: what these companies are, you know, using our supervisor for is our supervisor is the first line reviewer.

[00:47:07] Tatyana Mamut: So our supervisor looks at every interaction that the agent had, every session, every workflow, that the agent completed, and before it goes into the approval queue, our supervisor says, you know, rates it red, yellow, green. If it's green, it says, here's what the agent did. This is why I marked it green.

[00:47:25] Tatyana Mamut: There were zero guideline violations. There was, you know, the user experience was good. This is why. So it really speeds up the approval. If it's yellow, the supervisor will say it's yellow for these reasons. This is what you should look at when you review, human. Right, human, when you review, look at these things.

[00:47:45] Tatyana Mamut: 'cause these are the things that I was unsure of, and that's why I rated it yellow. If it's red, it doesn't even go in the approval queue. It gets kicked back to reprocess, and then if it comes back green or yellow, if it comes back [00:48:00] red again, it just gets escalated for a human to do.

[00:48:03] Tatyana Mamut: Right, because why would you even put a red, you know, thing into the approval queue, right?

[00:48:08] Tatyana Mamut: Because it just means that the human has to do the work. So,

[00:48:12] Andrew Zigler: wasting time.

[00:48:13] Tatyana Mamut: yeah, so that's the first thing, right? Like human in the loop steps without a supervisor don't save you any time. The second thing that they do is they say, well, we have all the logs. Just, you know, ping us if you need anything. And so at that point.

[00:48:31] Tatyana Mamut: You know what happens? No adoption, right? The business, the line of business is like, I can't, like if I'm responsible for, again, customer experience, of a customer service, I can't have a situation where if something goes wrong, I'm slacking somebody or I'm writing a ticket to even figure out what went wrong.

[00:48:52] Tatyana Mamut: So the adoption is just like zero. And so we've talked to two organizations that have said like, we're having a lot of trouble. we've [00:49:00] built these agents and our business users just won't adopt them. And I'm like, yeah, 'cause they can't trust them and they have no control over them and they have no visibility over them directly.

[00:49:10] Tatyana Mamut: What would you do? You wouldn't adopt that either, right? Imagine like hiring an employee and saying to the manager of the employee, you don't get to talk to the employee directly. You don't get to see the work of the employee directly. You don't know how the employee is doing directly. You have to go through the IT team in order to even see what the employee is doing, right?

[00:49:30] Tatyana Mamut: That's crazy. It's craziness.
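For readers who want the mechanics spelled out, here is a minimal Python sketch of the red/yellow/green triage flow Dr. Mamut describes above. The function names, data shapes, and single-retry policy are illustrative assumptions, not Wayfound's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable

class Rating(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"

@dataclass
class Review:
    rating: Rating
    rationale: str                 # why the supervisor chose this rating
    items_to_check: list[str] = field(default_factory=list)  # what a human should look at (yellow)

def triage(
    session: dict,
    supervise: Callable[[dict], Review],  # hypothetical: supervisor reviews one agent session
    reprocess: Callable[[dict], dict],    # hypothetical: agent redoes the work
    approval_queue: list,                 # humans approve green/yellow items from here
    escalations: list,                    # humans take over items that come back red twice
) -> None:
    review = supervise(session)

    if review.rating is Rating.RED:
        # Red never enters the approval queue: kick it back to the agent to reprocess.
        session = reprocess(session)
        review = supervise(session)
        if review.rating is Rating.RED:
            # Still red after the retry, so a human does the work directly.
            escalations.append((session, review))
            return

    # Green and yellow land in the approval queue along with the supervisor's
    # rationale and flagged items, so the human starts from notes, not raw logs.
    approval_queue.append((session, review))
```

The point of the sketch is the ordering: the human approval queue only ever contains sessions the supervisor has already rated and annotated, which is why a first line of automated review can make human-in-the-loop approval save time instead of consuming it.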

[00:49:33] Andrew Zigler: That is so interesting, and I really wanna unpack this. There are so many great parallels here of fundamentals in just, like, human team management that need to get applied to LLMs and to agentic workflows. Like what you just described of, you know, you have the leader, the person who's the principal, who's responsible, having the dashboard to overlook and see, and then having the supervisor, the LLM agent in the middle, who is

[00:49:57] Andrew Zigler: doing the due diligence as it can, as

[00:49:59] Tatyana Mamut: [00:50:00] Mm-hmm.

[00:50:00] Andrew Zigler: because those agents work at a scale and speed way beyond what a human can possibly review. And if you

[00:50:05] Tatyana Mamut: Exactly.

[00:50:06] Andrew Zigler: and that that principal to be the reviewer or their, their constituents to be the reviewer, you're just going to get a bottleneck, you know, like.

[00:50:13] Andrew Zigler: We talked about this some with past guests. Like, we had a fractional CTO, Thanos Diacakis, on the show who really kind of broke this down for us, about how you just move the bottlenecks in your factory, and your problem just moves somewhere else. Exactly. Right. It's like, oh, we have all these tickets, all these problems.

[00:50:29] Andrew Zigler: We'll just throw an agent at it and then we'll just get all of it out. Well, now those responses need to get reviewed. The bad ones need to go

[00:50:35] Tatyana Mamut: Correct.

[00:50:35] Andrew Zigler: away. The good ones need to get captured. Like and, and then ultimately when someone goes in to do all that work and review the logs or the responses, it takes just as much work as doing it in the first place.

[00:50:47] Andrew Zigler: It's like, um,

[00:50:49] Tatyana Mamut: Unless you have a supervisor. Unless you have a supervisor. So we speed these things up dramatically. Like, in the first week we sped up a customer by 300%. In the first [00:51:00] week. Now they're improving their systems, right? And so we think we're gonna speed them up even more. But yeah, you need that first line of review, and then you speed it up dramatically,

[00:51:12] Andrew Zigler: And it's how we build companies at

[00:51:14] Tatyana Mamut: right?

[00:51:15] Andrew Zigler: It's like you have that principal, you have that CFO, you have that chief product officer, and then you have all the people who report to them. That's the point of those layers.

[00:51:22] Tatyana Mamut: Mm-hmm.

[00:51:23] Andrew Zigler: in between review the work of, manage the work of, the layers below them. So it doesn't make sense to then have this army of LLM employees, as you could consider them, that don't have these layers within them, because that just becomes a loud

[00:51:37] Andrew Zigler: Mess. Right? And, and it doesn't actually give you any kind of speed gains. And a lot of actually what you're, you're pointing out is, I'm seeing so many parallels to this right now in agentic coding and age agentic tools around, automatically creating like pull requests and, and

[00:51:53] Tatyana Mamut: Mm-hmm.

[00:51:53] Andrew Zigler: code.

[00:51:54] Andrew Zigler: But then the expectation is that, just as you described, the people in charge of that team [00:52:00] or that product or that software, they need to go and review that code. They need to go and look at what the agent did. And ultimately, I've seen some of these kinds of interactions on places like GitHub. You know, recently there's, like, new tooling around this that's rolled out from GitHub around automatic pull requests.

[00:52:15] Andrew Zigler: And you see almost like a battle back and forth between the reviewer and the LLM, and the LLM is, you know, going back to the beginning, kind of being a little bit nice, you know, being a little sycophantic. They're trying to, like, play diplomatically

[00:52:29] Tatyana Mamut: Right.

[00:52:30] Andrew Zigler: with, with the reviewer.

[00:52:31] Andrew Zigler: But ultimately, if that was an employee that was doing that back and forth in the same way, somebody there would answer for that. That would be poor performance from the employee and ultimately would reflect negatively on them. But in this case, we just see that as part of the iterative process

[00:52:47] Andrew Zigler: of working with these tools, which over time is just going to slow us down and introduce problems. It's such a fascinating problem to have to tackle. But you've really kind of laid [00:53:00] out some of the ways in which we could approach thinking about it. And it goes back to fundamentals.

[00:53:05] Andrew Zigler: Like even going back to what you said at the beginning, Marvin Minsky has detailed this explicitly. It's important for folks to understand the core foundations that are powering the transformative technology they're trying to put into their company.

[00:53:17] Tatyana Mamut: That's right. I mean, the other thing, also, going back to the principal-agent problem: the technology leader needs to ask themselves, do I want to be the principal and on the hook for the performance of customer service, sales, product management, and finance? Because if all of the visibility and all of the, like, logs and all of the analysis and all of the ability to write new evals is in the technology department, is in my department, then ultimately I am shifting the responsibility for all of those business outcomes to my team.

[00:53:57] Andrew Zigler: Right.

[00:53:58] Tatyana Mamut: Is that what you wanna do, [00:54:00] or do you wanna give your lines of business a tool for them to manage all those things directly themselves and keep those business outcomes in their own purview? What do you as the technology leader wanna do? Because this is where the organizational change and the culture change is going to be massive, because that is fundamentally a question of who is the principal for all of these agents, and that principal needs to have direct control over them, not intermediated control, direct control.

[00:54:35] Andrew Zigler: Yeah,

[00:54:36] Tatyana Mamut: So whoever has direct control over the agents is the principal and is legally responsible and organizationally responsible for their outcomes.

[00:54:45] Andrew Zigler: Yeah, and it's important for teams, then, when they're applying this technology to all sorts of different kinds of outputs to, like what you just said, it's not just another IT tool on the IT spend budget that they manage and do whatever. It actually becomes [00:55:00] a core part of your team.

[00:55:01] Andrew Zigler: And to kind of wrap up what we've been talking about, I'm wondering, Dr. Mamut, what do you think the workforce of tomorrow will look like based upon this model of working?

[00:55:10] Tatyana Mamut: I can tell you how we are organizing our organization, and that will lead you to understand what I think the future workplace will look like. So we are right now at Wayfound a team of 30. We have seven humans, homo sapiens, and 23 AI agents, AI sapiens. We view ourselves as a fully multi-sapiens workforce.

[00:55:36] Tatyana Mamut: Right. Multi-sapiens, meaning that we really think about, you know, what are all the things that we're doing, and what are the appropriate work tasks for the homo sapiens, and what are the appropriate work tasks for the AI sapiens. And we are really, I mean, we have the benefit of being a native gen AI startup, right?

[00:55:58] Tatyana Mamut: So we started [00:56:00] last year, our whole product is AI agents. So we had the benefit of completely rethinking our organizational design based on this multi-sapiens understanding of how we're organizing our work. And so we don't really have product management in our company, because all the AI agents manage the roadmap, synthesize the customer insights, write up, you know, all of the product requirements, and interpret them right from our conversations for the engineers.

[00:56:33] Tatyana Mamut: Make sure the engineers don't miss anything. So AI agents kind of are doing all of that. We're probably not gonna have, like, a traditional marketing person or an outbound salesperson, because, again, we have AI agents doing most of that work, and optimized to do a lot of that work. But we will have growth people, and we will have, you know, we're calling it design and product, but I think it's more about [00:57:00] the look, the feel, and the identity, right, of the company.

[00:57:03] Tatyana Mamut: Right? Like Tiffany Chin, she's really responsible for the entire identity of Wayfound. What is that role? It's a little bit marketing, it's a little bit design, it's a little bit product design, it's content generation. It's a lot of different things. But, you know, that human who has an incredible sense of aesthetics is responsible for that, right?

[00:57:27] Tatyana Mamut: Because I'm not gonna trust an AI agent to do that. And then for growth, it's really about relationships, right? There's a person who's responsible for building human connection and human relationships. The AI agents are doing all the rote sales work, so we don't need, like, a salesperson per se, but we do need someone who's great at building human connection and human relationships.

[00:57:48] Tatyana Mamut: So again, we are rethinking our whole company, and one of the reasons is, one, it works for us, but also this is where I think companies are gonna be going: this multi-sapiens [00:58:00] organization where they think about what the humans are really good at, right? How do I find the human with great taste to be in charge of taste and identity, right?

[00:58:10] Tatyana Mamut: How do I find the person who's great at human relationships and human connection? How do I find the person who's really great at seeing the future of technology and bringing in those tools and those systems, and then building the agents around it? Those are the core human capabilities, and then we have the AI agents doing all the other stuff.

[00:58:31] Andrew Zigler: That's super fascinating. It makes me think of, you know, this phrase that we all say all the time of, like, oh, the LLM does that and it frees the person up to do the more meaningful work. But as an AI-native

[00:58:44] Tatyana Mamut: I.

[00:58:44] Andrew Zigler: organization, you got to design that person's role from the ground up to only be doing that most meaningful work.

[00:58:51] Tatyana Mamut: Mm-hmm.

[00:58:52] Andrew Zigler: That's a really interesting glimpse into workforces of the future and companies of the future in an AI-native world. And, you know, Dr. Mamut, [00:59:00] this has been a super fun conversation for me. We've gone into some research, we've gone into some real-world applications, and you've connected the dots between how folks are using this technology and how it ultimately relates back to core organizational management

[00:59:13] Andrew Zigler: principals, and you've given us an effective playbook, to have that conversation internally and to build that right culture for success. And you also unpack that news around the sycophancy update, which I had definitely been wondering about and, and you kind of helped orient folks about what that means and how they should be using these tools. But before we wrap up, where can listeners learn a little bit more about Wayfound and what you're working on.

[00:59:35] Tatyana Mamut: Absolutely. So the best place is to follow us on LinkedIn. I'm Tatyana Mamut on LinkedIn, and we have a Wayfound page as well on LinkedIn. We post frequently. And I do speak frequently as well, because I am very passionate about where the future is headed.

[00:59:53] Tatyana Mamut: I hope people do not have fear around it. We all have to just get together, adapt really quickly [01:00:00] to this technology, challenge the ways that we're thinking, that we're working, that we're living. And the faster we adapt, the faster we create an amazing future for ourselves.

[01:00:09] Tatyana Mamut: So that's why I, I'm out and speaking all the time.

[01:00:12] Andrew Zigler: Dr. Mamut and I are very active with this conversation online, and we want to hear your thoughts about what we talked about today and what you are seeing within your own organization. I think, going back to what you just said, this is a conversation we all need to have, and we need to be having it quickly and often. So thanks for joining us today on Dev Interrupted, and we'll see you next time.

 
