"So we went through and we deleted every single dashboard, every single product measurement tool we had, and we said the only way we are going to learn from customers is we are going to get on the phone with them... And that culturally has set the stage for us to be incredibly customer centric..."
Still stuck in a reactive loop with incident response, only fixing problems after they happen?
JJ Tang, Co-founder and CEO of Rootly, joins host Andrew Zigler to reveal how to shift beyond reactive, leveraging powerful AI and an often-underestimated skill in engineering: genuine customer empathy. Discover how these elements are crucial for navigating the complexities of modern infrastructure and shaping the future of incident management.
JJ explores the forefront of incident response automation, discussing how to integrate shiny new tech like AI safely and why deep customer understanding is key to building trust and reliability. Learn about the common pitfalls leaders face, the cultural shifts needed for proactive reliability, and how teams can make our digital world safer.
Show Notes
Check out:
- Beyond Copilot: What’s Next for AI in Software Development
- Survey: Discover Your AI Collaboration Style
Follow the hosts:
Follow today's guest(s):
- LinkedIn: JJ Tang
- Website: rootly.com
Referenced in today's show:
- OpenAI agrees to buy Windsurf for about $3 billion, Bloomberg News reports
- Amazon launches first Kuiper internet satellites, taking on Starlink
- Avoiding Skill Atrophy in the Age of AI
- Why developers and their bosses disagree over generative AI
Support the show:
- Subscribe to our Substack
- Leave us a review
- Subscribe on YouTube
- Follow us on Twitter or LinkedIn
Transcript
Andrew Zigler: 0:06
Welcome to Dev Interrupted. I'm your host, Andrew Zigler.
Ben Lloyd Pearson: 0:09
And I'm your host, Ben Lloyd Pearson.
Andrew Zigler: 0:12
This week we're talking about OpenAI's acquisition of Windsurf and a new competitor to Starlink's satellite internet. We're also discussing the growing divide between contributors and managers during AI adoption and how folks like you and me can avoid skill atrophy. But Ben, we gotta start by talking about Windsurf, right?
Ben Lloyd Pearson: 0:31
Yeah, I mean, as much as I identify with that skill atrophy in the age of AI, we can hold off on that one for a moment and talk about this big news with Windsurf.
Andrew Zigler: 0:39
Yeah. So, um, it's been all over the news about OpenAI buying Windsurf for about $3 billion. Windsurf is a coding assistant tool with an IDE that allows a user to do agentic workflows to build software. It's commonly used in a lot of different environments right now by folks experimenting with AI. You may recognize it as a leading competitor to Cursor. The deal, which hasn't closed yet, would be OpenAI's largest acquisition to date. And as you know, OpenAI is not shy about its acquisitions. In the past it bought search and database analytics startup Rockset, and it's also tried to buy the leading agentic coding IDE, Cursor. So OpenAI making this move now, acquiring Windsurf, getting a stake in the game for agentic AI and coding, is kind of an exciting development in the space. Ben, what do you think about it?
Ben Lloyd Pearson: 1:34
You know, I'm happy to see competition in any tooling in general. I think one of the things that really sets American tech companies apart is just how competitive it is. This I really think is a simple one. OpenAI, they're just trying to build a moat, and they have a lot of money. In fact, I think this is their attempt to buy a moat rather than build it, specifically in the software development space. So you mentioned Cursor; it's obvious they couldn't get Cursor to sell to them, so they went to the next one. Cursor being, like, the fastest-growing company in history based on revenue, it's gonna be some stiff competition, I think, in both directions. You know, neither one of these companies really has a strong moat yet, even though I think we are getting to the stage now where the leaders are starting to sort of establish the norms of this space. You know, when I personally think about agentic coding, I think about Cursor rules and a lot of the features that they've
Andrew Zigler: 2:32
Mm-hmm.
Ben Lloyd Pearson: 2:33
added to their platform. And, you know, everyone's gonna have their take on how that stuff should work, but right now, Cursor's the one that's out there, sort of leading the definitions around all of this. But with that said, I've heard a lot of positive things about Windsurf. I'm gonna have to check it out at some point. I know some people that have tried it and have really liked it, even people who know Cursor. So, you know, it's just something interesting to check out, and I like that we have a lot of competition in this space.
Andrew Zigler: 2:58
I totally agree that the competition has only created a better ecosystem of tools. Um, I've actually tried out Windsurf. I had a lot of luck with it. I think it's a really nice tool. So I think that there's space for, you know, multiple players in this game to create a really great new future of development workspaces. So I'm excited to see what happens.
Ben Lloyd Pearson: 3:15
Yeah. And from surfing the wind here on earth to surfing the solar winds up in space, let's talk about what Amazon has done this week.
Andrew Zigler: 3:24
Wow, what a transition. So let me launch this story into orbit for you. Amazon has launched its first Kuiper internet satellites, taking on Starlink in the satellite internet space. Kuiper, you know, you might read it as K-U-I-P-E-R; it's a Dutch word for a cooper, a barrel maker, and it's the name that Amazon has picked to describe its new satellite internet network. In this space, you know, we all think of Starlink as the incumbent, and it has the predominant mindshare when people think about satellite internet. But Amazon is now taking on that space and that industry head-on with their own offering. And Ben, have you, uh, had any experiences with satellite internet before?
Ben Lloyd Pearson: 4:05
Yeah. Well, first of all, I'm very familiar with the word Kuiper because of my 4-year-old who's obsessed with the solar system. I do actually like the name; you know, for anyone who doesn't know, the Kuiper Belt is where all the cool dwarf planets hang out. But you know, personally, on one hand I'm just hopeful that we're not headed towards the world in WALL-E, the animated movie, where our atmosphere is just littered with space junk all over the place. I found myself wondering today, when did we decide to just, like, let corporations fill up low Earth orbit with all of these space boats? I don't remember that decision ever being made. Uh, but with that said, more data connectivity is a good thing over the long term, and rural areas really stand to benefit the most from this, areas that have been left behind by a lot of the benefits of the internet age, whether that's in the developed world or even in the underdeveloped world. You know, I've already mentioned my opinions on competition. Like, I think competition is good for services like this, within some reason; like, there should probably be some regulations over the use of low Earth orbit. But I've also found myself wondering, with all the recent antitrust things that have been coming down the pipeline, specifically in the US federal courts, how long until Amazon comes under the microscope for the seemingly vertical monopoly that they're beginning to create?
Andrew Zigler: 5:25
It's definitely an interesting competition space, the idea of the world being enmeshed in all of these internet services. Internet reaching the far-flung parts of the earth, and people in rural areas having access to really fast speeds like that, can be really transformative for a lot of people in a lot of situations. It could even go as far as to save lives or educate people for the future. So, on the whole, I think it's going to be a great development for people, but I agree with you that as we launch more things into orbit, we should think about how we're going to actually maintain this long term, and how things are gonna go into orbit. So if you are listening to this and you have any insights on things entering and coming outta low Earth orbit, and, uh, you want to connect us with somebody who's maybe even launching things into that space, we'd love to learn more about those decisions and how things get sent up into the sky, because that's a pretty cool application of engineering and there's a lot of technology behind it.
Ben Lloyd Pearson: 6:16
Yeah, we've long held the belief on this show, many of us here, that we would love to have somebody who has either been to space or sent things to space on our show. We've talked to people who have been adjacent to launching things into space, but if you are out there as somebody who is doing this, we would love to talk to you. Let's get onto the story about skill atrophy in the age of AI, as somebody who often feels like I am maybe starting to become dumbed down because of my GPT usage.
Andrew Zigler: 6:43
This article was full of advice and actions you can take to avoid skill atrophy in the age of AI. It dives into the reality of critical-thinking decay in frequent users of AI tools, and folks who are using this on a day-to-day basis are probably familiar with this, uh, kind of mentality. So I really encourage you to check out this article on Substack by Addy Osmani. We're going to include it in our news roundup. Some things that really stood out to me were about making sure you throw yourself into the pit of cognitive struggle on a regular basis and you fight through that struggle. Um, it gave a lot of great ways to reinject yourself into these situations that you might be guarding yourself from with AI, which can prevent you from developing skills. And Ben, what kind of stood out to you as you read through this article?
Ben Lloyd Pearson: 7:27
Yeah, so, just a personal anecdote. Last week I did, like, the most artisanal content crafting that I've done in a while, you know, as somebody who has definitely adopted AI for a lot of my workflows. And that was to prepare a conference talk. Real quick, I'm gonna plug an event that I'm at this week. Um, I'm in Miami at the Code Remix Summit, so if you're there, come hit me up and let's meet and chat about AI or anything else under the sun. I've tried to be more intentional about this, like, artisanal work, so to speak, at least on some sort of regular basis, like, you know, once a week, if I can do something myself without making AI the driving force behind it. Another example is just reading articles all the way through. Like, I feel like this has been a trend really since the internet era. Like, our attention spans have gotten so short that it's hard to stay focused when you have, like, a long, in-depth article and actually read it to the end. So I read this article all the way to the end and have been more intentional about making sure that I'm doing that. And there were two signs of overreliance in this article that really stood out to me because I felt them. The first is, like, debugging despair. Like, if you're the kind of person that is just skipping reading error logs now and just automatically pasting them straight into AI, it could be a sign that you're over-relying on these tools. And trust me, I'm super guilty of this. Like, I've done it plenty of times, but I've also recognized how limited its ability to do a root cause analysis is. So I never let it just be the only analysis. I sometimes let it be the first analysis, or if I get stuck, help me get unstuck. But, you know, the sad part is there's actually a tool I've just been trying out in the last week, a brand new AI-powered tool. And I really like it. But the thing that kind of stinks about it is it's, like, entirely built around this concept of, like, one-shot copilot. Like, you
Andrew Zigler: 9:20
Mm.
Ben Lloyd Pearson: 9:20
tell it what to do, and if it doesn't do it, you just click the button that says fix the error until it goes away.
Andrew Zigler: 9:27
Yeah, no. I don't wanna be in a situation where I'm unable to get my hands on the terminal and do the debugging with it, or all I can do is just roll the dice on it hopefully generating the right command. That sounds torturous.
Ben Lloyd Pearson: 9:39
Yeah. I mean, the irony is, if you click that button enough, it often gets there, you know, it often
Andrew Zigler: 9:45
It can, but oftentimes, and I'm sure folks who've maybe been experimenting with this know, sometimes it creates really strange workarounds. I've had situations where my application was giving errors, and in trying to troubleshoot it, it would just change arbitrary parameters or things about the environment to make the error go away or seem less critical than it was. So it's always a good call-out to understand the patterns that your AI is using to solve problems and to make sure you understand what's going on.
Ben Lloyd Pearson: 10:14
And I mean, even Cursor can have the problem that you describe, like, quite frequently. The other thing that really stuck out to me as a sign of overreliance is losing your ability to do architecture and big-picture thinking. So that's, like, if you're accepting code without really understanding the long-term sustainability and quality risks of that code. And, you know, personally, I've been kind of operating on this theory that, you know, a lot of the frameworks that have been built for software development were designed to reduce human complexity; they were designed to help humans navigate the difficulties of creating software. But we're actually entering an era where AI can just abstract away all of that complexity. So how much do we actually need all of these JavaScript frameworks, for example? So I've gone way back to the basics. I'm, like, working back in HTML, CSS, vanilla JavaScript, and just seeing what I can do to orchestrate and control these agentic AI services. So what I'm kind of getting at is, if you can feed it strong architecture, like, help it make the right decisions, it can be a lot more effective at whatever code you need, without the use of all these dependencies and extra frameworks. But we'll see how this experiment goes. I actually don't know if it will be successful in the long term.
Andrew Zigler: 11:28
Well, one of the best ways to get to a good, solid architecture, or something that is a high-quality document that you and a tool can work off of, is really understanding the fundamentals. One of the standout pieces for me from this article was not using AI to achieve those fundamentals, because struggle is good. And this was a really good reminder to me, and I think to folks who will read it, that frustration is a key part of learning. Frustration is what enables you to remember, make connections that are important, but also think of solutions and problems that maybe you're not anticipating yet. And humans are still way better at this than agents. So you have to take that role when you partner with a tool to do your work in this way. But speaking of frustration, another story we're covering is the growing divide between developers and their bosses over how generative AI is adopted within their organizations. We're really loving this piece from Jen Riggins. It was on LeadDev, uh, in the past week, about the frustrating divide between developers and leaders during AI adoption at organizations. And it arrives at this by taking a close look at a survey of over 2,000 IT managers and developers from last year. In the survey, leaders listed AI as the most important technical factor in improving developer productivity and satisfaction, and that's a huge surge in mindshare for that leadership audience. Meanwhile, only a third of developers reported seeing any significant AI productivity gains. And so what this survey really highlights is the divide that still exists between leaders and doers on how AI is actually impacting their work every day. This is a cumulative piece that has quotes from folks all across the developer experience and productivity space who speak about this commonly. It also has a quote from yours truly in the article. I really recommend you check it out because there's so much cool advice in there from smart people that are talking about this. And so, Ben, what stood out for you?
Ben Lloyd Pearson: 13:28
First of all, I think there are a few quotes from you, and it's really great to see that you're getting the opportunity to take what we're learning from our community and share it with audiences like over at LeadDev, because, you know, I'm a huge fan of them; they do great work over there, so we love being able to share knowledge over there. But I think Ori Keren said it best when he came on Dev Interrupted, when he said that developer productivity is going to decline this year, and I think upper management is just going to have to accept that. Like, there's a lot of disruption that's happening just across all of knowledge work, not just software development. And there's a lot of adaptation that is taking place right now to respond to these disruptions, and adaptation, you know, rethinking how you accomplish your work, that all detracts from what would conventionally be viewed as, like, productive work, for lack of a better phrase. You know, you're not able to focus on shipping new features and moving faster when you're just legitimately trying to redesign how you complete your work. But I think there are a couple of traits that I'm seeing from organizations who are leading the charge with AI adoption. The first is that they provide really clear guardrails around AI usage. That's training on how to properly use it. That's guidance and resources that help provide context to models, that make them do the right thing rather than hallucinating security vulnerabilities. So that's the first thing. The second thing is they're giving space for early adopters to experiment and share learnings with the rest of the company. So at just about any company out there, you've probably got a small percent of your developers who like to use new technology. They like to experiment. They like to find new ways to do things. They need to have the tools to let them do that, and then the space to experiment and the channels to share those learnings back with the rest of the organization. Developers wanna reap the benefits of AI, but the technology just isn't there for all use cases yet. So you wanna let the people who can benefit from it benefit today, while teaching the people who are gonna benefit tomorrow what to expect. So, you know, expect productivity losses from this disruption is the point that I'm trying to make, and just be ready to reap the productivity gains that are around the corner. Like, they're starting to show up today, um, but they really just haven't fully manifested yet. So get your organization ready for that future. Yeah. So Andrew, who do we have on the show today?
Andrew Zigler: 16:01
Yes, in just a moment, we're bringing JJ Tang of Rootly on the pod to talk about creating an engineering culture built on customer empathy. Stick around.
Ben Lloyd Pearson: 16:13
Beyond Copilot: What's Next for AI in Software Development is happening on June 4th and 5th. Join us for a live 35-minute panel featuring past podcast guests Adnan Ijaz from Amazon Q and Birgitta Bockeler from ThoughtWorks, alongside experts from LinearB. We'll explore how leading teams are going beyond Copilot to experiment with agentic AI, measure real impact, and drive meaningful DevEx gains. Registrants get the full recording plus early access to the DevEx Guide to AI-Driven Software Development, packed with tools, prompts, and insights from the 2025 AI developer survey. Reserve your spot today and stay ahead of the AI curve.
Andrew Zigler: 16:55
Hey everyone, for those of you just joining us, today we're joined by JJ Tang on the podcast. He's the co-founder and CEO of Rootly, and today we're tackling some really cool stuff, 'cause JJ is at the forefront of incident response automation, helping teams turn chaos into reliability. More importantly, the work that he does at Rootly helps make our world a safer place by automating incident response for things like mission-critical infrastructure. And today we're gonna dive into how leaders can build with shiny new things like AI safely, why customer empathy is kind of an underrated engineering skill, and how teams can go beyond just reactive reliability. So JJ, welcome to the show.
JJ Tang: 17:39
Thank you so much for having me. After seeing all the amazing guests you've had, I feel very special to be here.
Andrew Zigler: 17:45
Oh no, totally, the honors are on our end. It's really great to have your perspective here. Like I said, I'm real excited about the topic, so let's just go ahead and jump right on in. Right now, a lot of companies find themselves rushing to integrate new things, and maybe they're not stopping to fully consider things like reliability, trust, or usability at scale. And JJ, I know you're always thinking about building safely, and not just fast. That goes for all things, not just shiny new things with AI in them. What are some of the common pitfalls engineering leaders face or fall into when integrating new stuff? And how do you think they can avoid them?
JJ Tang: 18:23
That's a great question. We serve some of the most mission-critical businesses. We work with companies such as Dropbox, all the way to Nvidia, but we also have customers, for example, in the Nordics, like 911 call centers. We work with nuclear energy plants and gas stations, and we cover the entire gamut. So for us to be reliable and secure as a platform becomes incredibly important. And one of the principles that we have from an engineering standpoint is we understand we are in the business of doing the boring things right. When we started building our on-call product, the first thing that we said was, this has to be immensely reliable. So we developed this multi-cloud architecture that no one else in the industry had, in what felt like overkill. You know, we never really had to use it for the purpose that it was built for, but it inspired a lot of trust. So I think there's this aspect of functional reliability, and also perceived reliability, that becomes quite important. When we thought about building with AI, we said privacy and safety become the most important thing, I think, in this world in particular with AI, as companies are adopting it. If you are a vendor that is building AI into your tooling, you have to understand that a lot of the customers that you're selling to and you're serving and that you're caring for are also venturing down this path at the same time. And if both of you view AI as this mysterious black box, neither of you are gonna trust each other. And so I think the onus is on the vendor to be incredibly smart with it. The recommendation, and the thing that we found to be helpful for us, is that integrating and building AI into your product is the easy part. I think truly understanding the power and also the dangers and pitfalls of AI is the other piece we spend a lot of time on. Us as a business, we experiment constantly. We've developed our own Rootly AI Lab specifically to tinker right on the bleeding edge of what these technologies can do and what they can't do. From there we've been able to formulate our own guardrails and truly understand what the capabilities are. Then conveying that in a truthful and trusted way back to our customer, I think, inspires a lot of trust, and also for us, then suddenly safety and privacy become a core differentiator that we love. And you know, your mileage may vary, and I think that's how you go from being this bolt-on AI product (maybe it's a wrapper for some organizations, and there's nothing wrong with that either) to being this AI-native company that truly understands what it's capable of.
Andrew Zigler: 21:14
Totally. It's more about, like, having the transparency of understanding its impact and what it's supposed to do, what it can do, what goes into it, what comes out of it. And, you know, during that process there's a lot of alignment that has to happen. How do you think that engineering teams can ensure that the solutions and, you know, shiny new stuff that they're using actually align with their business objectives and have a measurable impact? Is there a pattern that you see successful teams kind of follow to achieve this?
JJ Tang: 21:46
I think for individual teams it has to be sustained effort. I think when you view AI as just a singular feature of your product, then you go back to doing what you were doing before
Andrew Zigler: 21:58
Yeah.
JJ Tang: 21:58
The peaks and valleys become either very high or very deep, and can be quite jarring sometimes. What we've done to ensure that we're constantly thinking with this AI-first mindset, and about how we build sustainably over time and in intelligent ways through our product, is that for every leader of every function inside of our business, we have a biweekly meeting where everyone presents new AI tools that they've been using. I would say 80% of them fall short. We tried some automated PR review tools, some QA tools. We tried some on the rev ops side that unfortunately didn't work out, but had a lot of
Andrew Zigler: 22:37
You just have your proof of concept, just the prototype, fleshing out the idea.
JJ Tang: 22:42
Exactly. And what that did for us culturally was really important. It forced leaders to think in this mindset. It also forced us to explore what the latest and greatest was out there, and that naturally propagated itself through the rest of the organization. So when people talk about AI now, it doesn't feel like a novel concept as it did two years ago; it just feels like a natural extension of our day-to-day workflow. I don't think we've gone as far as, you know, maybe Shopify. I know they came out with a memo recently saying, hey, you're not allowed to request headcount unless you can prove that AI agents can't do your job.
Andrew Zigler: 23:19
Mm-hmm.
JJ Tang: 23:19
I don't think we are there yet as a business, but maybe one day.
Andrew Zigler: 23:23
Yeah, we talked about that memo on a recent episode, actually. And the picture you're painting here is really helpful, because you're showing how creating that culture of learning, creating that space for innovation in a company, helps normalize using it as a tool. And building understanding of it even goes back to your first point about having the transparency and the reliability in what you're doing. Part of that is making sure everyone's equipped with knowledge and you're giving them the space to experiment. In that world where they have the space to experiment and they're trying new things, they're growing as a company, it kind of then starts to butt up against maybe traditional ways that people ship products. And I wanted to ask about how traditional incident response has, you know, typically always been a reactive process for teams. You know, they wait for something bad to happen, and then they get out the fire extinguisher and they go find the fire and put it out. But Rootly changes that model. I want us to understand a little more about how, from your perspective, you know, what are the key differences between a proactive and a reactive reliability practice, and what does it mean?
JJ Tang: 24:27
Yeah, I think that's a perfect way of characterizing it. Either it's proactive or reactive. And I think it's actually with the unlock of LLMs that we can truly be more proactive now. You know, we built a very successful on-call and incident response business that is primarily geared towards helping humans become faster and more consistent at resolving incidents. And, you know, for the maturity curve where most customers are, that is what they need, but what the future holds with LLMs becomes really interesting. Because if you think about your smartest SRE, your hardest working, you know, the SRE that's making 700k a year, that has 20 years of domain knowledge, their ability to recognize patterns, their ability to sift through large amounts of data that are relevant for your business, is incredibly powerful. And that tribal knowledge is sometimes unlocked by no one else. And this hero mentality starts developing, and they start getting burned out. And with LLMs, what we can do now is harness everything that they can do, then replicate it across hundreds of agents that can simultaneously do the work for you. So the way that we're solving for this right now is: imagine when an alert fires, the first things that you will do is go check, you know, maybe your logs and telemetry and traces and synthetics, and check Datadog. You check Sentry, you check, you know, your GitHub. You maybe try to mentally correlate what some of the past incidents were. And what we're able to do with our agents is, well, we can sift through large amounts of data instantaneously, before you even have time to flip open your laptop, and tell you what the probable root causes are. And where this starts becoming powerful is that the complexity of incidents is only going to increase over time. I am sure you guys see it all the time. The adoption of coding assistants has been through the roof. You know, Cursor
Andrew Zigler: 26:32
Yep.
JJ Tang: 26:32
is one of the fastest companies to reach a hundred million. Everyone uses Copilot. I was recently in London with a private equity firm, and they have strict board-level mandates to adopt coding assistants, 'cause they view it as
Andrew Zigler: 26:45
Yep.
JJ Tang: 26:45
if we can get 5% more developer productivity, then why wouldn't we do it; you know, we have a million developers, that's a huge money saver for them. But with these coding assistants, and, you know, people vibe coding into production right now, there are new system dependencies that introduce problems and errors and incidents that we did not necessarily anticipate. And even your smartest SRE, that L7 SRE that we were talking about before, does not know those changes are occurring. So on the other hand, you need a new wave of machines to help debug what other machines have done. So you're fighting the trend on both sides, effectively. And that's where it not only necessitates the need, but where proactivity then becomes important. What systems like ours will be able to do is actually determine whether or not a human needs to be involved, and the future of that becomes really exciting. On one hand, you're identifying what the issue is; on the other hand, you're able to identify what the right fix is, then deploying the right fix, then, all the way shifting left to the point of pushing your code into production, you can emulate what the impact could be before you even make that change. I might be able to say to you, Andrew, hey, maybe I wouldn't do that, that seems really dangerous because of this and this reason.
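To make that alert-triage workflow a bit more concrete, here's a minimal sketch of the general pattern JJ is describing: gather the signals a responder would check by hand, hand them to an LLM, and ask for ranked probable root causes plus a paging decision. Everything here (function names, data sources, the llm_complete callable) is an illustrative placeholder under assumed integrations, not Rootly's actual implementation.

```python
import json
from dataclasses import dataclass
from typing import Callable

@dataclass
class Alert:
    service: str
    summary: str
    fired_at: str

# Placeholder collectors: a real system would query Datadog, Sentry, GitHub, an
# incident history store, etc. Hard-coded sample data keeps the sketch runnable.
def fetch_recent_logs(service: str) -> list[str]:
    return [f"{service}: connection pool exhausted", f"{service}: 502 from upstream"]

def fetch_recent_deploys(service: str) -> list[str]:
    return [f"{service} deploy abc123 (35 min ago): bump payments client"]

def search_past_incidents(summary: str) -> list[str]:
    return ["INC-214: similar latency spike traced to a bad config push"]

def triage(alert: Alert, llm_complete: Callable[[str], str]) -> dict:
    """Correlate the gathered signals and return the model's ranked hypotheses."""
    signals = {
        "error_logs": fetch_recent_logs(alert.service),
        "recent_deploys": fetch_recent_deploys(alert.service),
        "similar_incidents": search_past_incidents(alert.summary),
    }
    prompt = (
        f"Alert: {alert.summary} on {alert.service} at {alert.fired_at}\n"
        "Given these signals, respond with JSON containing 'probable_causes' "
        "(ranked, each with its evidence) and 'page_human' (true or false):\n"
        + json.dumps(signals, indent=2)
    )
    # llm_complete is whatever model client you already use; it must return JSON text.
    return json.loads(llm_complete(prompt))
```

In practice the collectors would call your real observability and source-control APIs, and the model's output could be posted to the incident channel or used to decide whether anyone gets paged at all.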
Andrew Zigler: 28:09
Yeah, it's like predictive impact, in a way. And what you're saying is really interesting, and it actually threads through something I've been hearing from recent guests on the podcast. It's actually quite interesting how this narrative compounds itself. So to connect them: you know, we had a recent guest, Tanya Janca, she's a cybersecurity expert, and she really expressed how cyber attacks and incidents are only gonna get worse and worse. They get exponentially worse every year. With the introduction of AI and agents and this kind of thing at scale, that risk is even higher, and, you know, these fundamentals become even more important. And we spoke with another recent guest, Sagar Batchu, who talked about API consumption in the future and about how it's gonna be 10x as much as it was the day before, and how, like, overnight, LLM consumers of services and tools are going to break infrastructure and are going to have profound impacts on how we've built these things. And now here we are at the culmination of that, in that you have to fight that big fire, like those thousand fires that are happening, with something that can predict where those thousand fires are going to happen and how you can minimize the impact of that. So if you're taking advantage of all of the benefits of an LLM and AI, you also have to take advantage of all of the protections that you need in that environment. Does that kind of thread through on how y'all are thinking about it too?
JJ Tang: 29:39
Yeah, totally. I think it opens up, you know, selfishly for us, a whole new market of productivity that can occur, 'cause a whole new class of incidents will start existing, and I think, you know, the TAM that LLMs will open up for other software companies will also be quite enormous. But with the good comes the bad, to a certain extent, but I think the
Andrew Zigler: 30:01
Yep.
JJ Tang: 30:01
positive is still there for most companies and for the space and category as a whole.
Andrew Zigler: 30:07
Yeah. And let's talk about the steps of getting to this proactive model. I think a lot of it comes down to cross-team collaboration. I'm wondering, you know, what's your perspective on how you see successful teams interact between engineering, product, leadership? How do they work best?
JJ Tang: 30:26
I can probably share this best, or characterize this best, through a few examples that we found particularly helpful. I think over time the definition of roles and functions will change. There are a few things that we've done. One introduction was the concept of an AI engineer, a generalist AI engineer that we hired. And this person's job is to find, effectively, areas of opportunity where agents can do the work of a team, and also be the cultural influence for how other teams can think about adopting and using AI as well. So one of the things that we did was, we had a 15-person team of BDRs that we had hired in Denver, and one of the things that we were able to do with an AI engineer was entirely automate how we prospect, by using tools like Clay and Claude and merging them together with a few other things that we're doing. This person did not have a traditional, you know, computer science background, and now we're going function by function and finding the opportunities of where this automation can now exist. We're finding a lot in our own reconciliation processes on the rev ops side of the house. We are finding it a lot on the support side of the house as well. And so I think this concept of, where traditionally when we started as a company, engineers felt they were, maybe they had a role definition of just, hey, my job is to execute on these tickets, and then I go home; those lines are becoming more blurred, and we're very much
Andrew Zigler: 32:10
Mm.
JJ Tang: 32:11
We're letting the role of everyone here be more blurred. Um, Adam, who is our head of marketing, is running a lot of our self-serve business to ensure that's in a good spot, because he has a ton of experience there. And that might not be the role of any other head of marketing, but we're totally fine with that. And I think adopting that mentality becomes incredibly important. And it has changed how we interview people as well. We don't want to see them as only a specialist in a particular area. We wanna see that you can be smart and be a generalist in many other areas if you had to flex towards it. Then also, culturally, we've done things to allow everyone in the organization, not just customer success and not just sales and not just support, to be much closer to the customer. The tangible example I can give is every single customer of ours has a shared Slack channel or Teams channel with us, and all of our engineers can access it. Sometimes they're working on a feature request for a particular customer. They're in there directly with the CTO of that customer, going back and forth on, hey, does this look good? What would you change? Here's how we're thinking about the next iteration of it, and getting feedback. And that takes a lot of time to deliberately codify, I would say. One of the things in the early days of our business was there's this tendency to just start creating dashboards for the sake of creating dashboards, and I never really understood that. I remember we went through this process of creating these product metric dashboards. If you became a Rootly customer in the early days, you would get a health score depending on your activity in the platform and a whole bunch of other factors. You got summarized into this number, and this number felt weird to me for the longest time, because it ultimately distilled the uniqueness of you as a customer, the challenges and pains that you felt as a human, into this number. Okay, you're a seven outta 10. What does that really mean? So we went through and we deleted every single dashboard, every single product measurement tool we had, and we said the only way we are going to learn from customers is we are going to get on the phone with them. And we don't care if you're a $2,000 customer or a $2 million customer; for us, we're gonna get on the phone with you and we're gonna uniquely understand what is working and what is not working. And that, culturally (now we have a bunch of dashboards, for what it's worth), has set the stage for us to be incredibly customer-centric, to be close to customers, engineers interacting with them. It has changed how we hire, how we evaluate, all of that. So I think, to answer your question, a lot of it has to come top-down. A lot of it you have to think through systematically and codify, and you have to find these pockets of where people can really carry the culture and the behaviors that you want as a business, and amplify that the best that you can. And oftentimes we get it wrong too; we've done many, many things where we said, this is the right move, then two weeks later we said that was the wrong move.
Andrew Zigler: 35:32
It's an iterative process, and I like this culture that you're building that focuses on a customer obsession that really gets close to understanding problems. And I wanna talk about how that impacts engineering, you know, for Rootly, and how customer empathy plays a role there, 'cause, you know, I think it's very typical for engineers, engineering leaders, to focus specifically on the technical problems and forget about that connection to an end customer. So what does customer empathy in that kind of engineering environment look like to you?
JJ Tang: 36:04
Yeah. The thing to say is, there are downsides of being too customer-centric. There are really great engineers that have passed on us because of it. They don't want to be on the phone with the customer. They don't want to be interacting with the customer every day. They prefer to do hardcore engineering and process and system design stuff. And I think we've made that trade-off, because, you know, when you're not a Salesforce-sized company or an Instacart-sized company, you don't have the luxury of a really good decision maker at every turn. We never A/B test; we always rely on our intuition and instincts to make the right call in a product. We'd much rather delete what we wrote and be wrong than run a long, drawn-out experiment. I think that, for us, culturally has remained true, and because of all of those things that we do, being incredibly customer-centric is important, because we do not trust someone's opinion unless they can advocate and be somewhat the voice for the customer. Because then you don't have someone in that room making that decision who does have the customer empathy. We'll override your decision most of the time because of it, and that becomes incredibly important to us as well, because we want all engineers to be not just engineers. We want them to make product decisions.
Andrew Zigler: 37:38
Right. You want them to be product engineers, right? And it's serving an end goal and an end customer.
JJ Tang: 37:44
Exactly, and so that's something that we found works, at least for us. We'll see if it works in the next five years, but for now, we're continuing down the track. If you ever sign up to be a Rootly customer, a lot of what you interface with is actually engineers on our team.
Andrew Zigler: 38:03
That's cool. And I'm wondering, for engineering leaders who may be listening to this and thinking, maybe I wanna try this practice: what are some habits, or some key things, that they could put in place for their own team to build that customer obsession, that customer empathy?
JJ Tang: 38:18
I think, as an engineering leader, the first thing that you want to do, and where we really started, was consistently sharing the context of what they're building. Oftentimes, engineers will get locked into a particular work stream or a particular feature without understanding the larger picture, because oftentimes I think we rely on, oh, well, that's product's job, to, you know, have the overall holistic picture. Does it work for this startup? Does it also work for this enterprise? Is it important? Is it revenue-driving? Is it a nice-to-have? Is it, you know, closing a competitive gap? All of that nuance often gets lost. It doesn't translate to what engineers do. I would ensure that context exists. It's not as noisy as you think it is. In fact, it's incredibly motivating, in good ways, and also sometimes not in good ways. You know, you might work on something in particular for a customer and you don't end up winning that customer, and then, you know, it doesn't feel too, too great. But, you know, when you do win, and you can see your work map directly to these outcomes, it triggers something in your mind where, hey, I really like that feeling. I wanna do more of it. We have a weekly sales meeting, and all of our engineering leaders are part of that, so they can distill that to their team. We post it publicly, we share it in our all-hands. There's not a single engineer that works at Rootly that doesn't understand where we are revenue- and customer-wise, just as much as an AE here as well.
Andrew Zigler: 39:53
In everything you've guided us through today, there are so many interesting new things to consider in the world of incident response. And when we think about incident response, it's table stakes everywhere, right? Especially large infrastructure, critical infrastructure, things that when they go down, it's not just like, oh, you can't order pizza. There are services where life and limb can be on the line, that provide key services in key scenarios. So, something I've learned in having these conversations is that founders, they sit on these very unique precipices, and they have a very far vantage into how things are moving and evolving. I wanted to ask you, in our chat, what do you think is the future of incident management? From where you are seeing it now, in charge of Rootly, where do you see it all going? Are there interesting, or even scary, things that keep you up at night? Kind of curious to know.
JJ Tang: 40:49
Yeah, the lucky position we get to be in as a business is, no matter how good a company gets at their incident management, they always still have incidents. So we have this fortunate problem where we never run outta supply, necessarily, and it's a challenge that affects all verticals and businesses. And I often joke with our customers; they get on the phone with me and then tell me how bad their incidents are and how many they've been having. And I tell 'em, I say, well, that's great, because that means I can find a way to help you make that a little bit better, at the very least. And I think it ties into a lot of what you called out previously: the world is just going to get more complex, and more sophisticated tools will naturally evolve. I think there'll be a new class of tools, a new way of working with agents. There'll be agents that go rogue and do the wrong thing, and those need to be detected and created as incidents. And do humans resolve those? Do agents resolve those? I think it's all to be defined still. I think the complexity in this landscape will change, and I think, for us, the work of smart humans will always exist. We will always be the copilot to them in many ways, and their work will shift to become this new category of impactful things. They're not gonna be in this category of, like, updating Jira tickets and writing status page updates anymore. Their brain is best harnessed elsewhere, and maybe that's using other agents to help agents, so the world will get more proactively reliable as well because of it.
Andrew Zigler: 42:30
Fascinating. Well, you've given me a lot to really think about here, about where it's all going. But before we wrap up, I wanted to ask, you know, where can folks go to learn more about you and what you're doing at Rootly?
JJ Tang: 42:41
Yeah, I post a lot on LinkedIn. Uh, my mom likes every single one of those posts. I think they're somewhat interesting. I talk a lot about AI and how we build the company, and a lot of the things that we're doing on the incident response side. I try to share quite a few photos of my dog as much as I can.
Andrew Zigler: 42:58
Oh, I love that. I love dog pics on LinkedIn. I'll definitely go check that out. You know, follow Dev Interrupted; we post a lot on LinkedIn too, so we'll have to continue the chat there. And if you are listening to this and have some thoughts about what we talked about today, please, you know, jump in, let us know. You're gonna be seeing this all over LinkedIn if you follow us. And, you know, based on your predictions of what's gonna happen in the future, you know, JJ, maybe we can have you back for, like, an update on the status quo of incident management, of incident response, so you can see what's evolving, 'cause I think it's quickly moving.
JJ Tang: 43:29
I would love that. Thank you so much for having me.
Andrew Zigler: 43:31
Of course. If you made it this far, you know, our loyal listener, first off, thank you. Second off, you clearly really liked it, so be sure to subscribe to the podcast. If you're only listening to this, be sure to check out our Substack as well, and what we post on LinkedIn. And that's it for this week's Dev Interrupted. We will see you next time.