
AI Won’t Fix a Company That Can’t Ship

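Summary

Senior data consultant Nikhil Suresh argues that many companies treat AI as a cure-all, when their core problem is that they lack the basic ability to turn code into a usable product. Large language models (LLMs) can quickly produce a demo proof of concept (PoC), but this often gives managers with no delivery experience the illusion that the product is nearly finished. He argues that the AI bubble is full of "magic" packaging aimed at promotions or fundraising, and that real delivery depends on slicing work vertically and on effective feedback loops, not on AI itself.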

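Why read this

Learn to recognize the "dopamine trap" in which LLMs make PoCs easy to produce but hard to turn into products, and avoid mistaking a demo for engineering progress. Using an "AI" project that turned out to be 98% web development and an LLM's factual error in AWS architecture advice, the conversation offers a sober way to evaluate the return on AI investment: AI cannot fix the systemic problem of a missing delivery capability.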

Original text

AI Won’t Fix a Company That Can’t Ship with Nikhil Suresh


NN - Intro VO: This is No-Nonsense Agile Leadership, where we explore better ways to develop digital products and services.

Join world-class experts for honest insights and practical advice to help you lead digital teams clearly and confidently. Subscribe now to learn the best ideas in the field.

Murray: Welcome to No Nonsense Agile Leadership. I’m Murray Robinson.

Donna: I am Donna Spencer.

Nikhil: And I’m Nikhil Suresh.

Murray: Hi Nikhil. Thanks for coming on today.

Nikhil: Thanks for having me on.

Murray: So we wanna talk about your blog post, “I Will Fucking Piledrive You If You Mention AI Again.” But before we start, can you tell us a bit about your background and experience?

Nikhil: Yeah, absolutely. So I think somewhat uniquely in the IT space, I studied psychology for four years at Monash University. Worked for a little bit in Southeast Asia, moved back to Melbourne, did my Master’s in Data Science. So that was finished up around 2019, when AI roles were very popular. But we didn’t have GPT yet.

Even in 2019, a lot of the AI movement was fundamentally bullshit. Every job I got hired for, they said they wanted machine learning, but what they actually had was spreadsheets and not much executive interest in moving past the spreadsheets.

Everyone I know in data science has now moved to data engineering because we saw the infrastructure was so weak, we needed to move on to more process-oriented, less hyped-up stuff. The data science market was starting to die off until ChatGPT came out. It has now picked up, and I don’t think the amount of bullshit has changed.

Since then, I’ve started a data consultancy here in Melbourne and we have gone all in on not doing AI stuff, even though I would say 50% of our team have postgrad degrees in something data-science adjacent. I just think it’s a huge bubble, and it’s not a smart place to invest our time if we’re looking at a longer time horizon, because we can’t sell the business to someone for billions of dollars. The value is the people in it. So it doesn’t make sense. I’m not trying to fundraise or anything, so we don’t do AI stuff.

Murray: So what sort of projects or work have you done in the past?

Nikhil: So I’ve spoken about a couple in the blog. I was hired to do machine learning work for most of the corporate entities I was at before striking out on my own. Most of those projects never ended up having a huge machine learning component. And frequently, after enough time, it would turn out the problem was somewhere else, usually in the DevOps, stakeholder management, web development space. So I did a lot of that even though I didn’t have formal training in it.

I worked at a hospital for a while that was analyzing all the ambulance records in Australia. So for all of Australia, paramedics would fill out these reports and they would get sent to this place. I think they had 20 to 40 staff just literally reading paramedic records and labeling them: cocaine involved, heroin involved, alcohol involved. And they had this idea, which was, maybe we can build an AI platform to automate that and cut headcount. And that project ended up being 98% web development. Because even in a case like that, when you train a neural network or a random forest or whatever model, it just degenerates into a keyword search for either the word alcohol or not alcohol. That was the pattern that occurred over and over in corporate contexts: we would either need five lines of Python that someone could just look up on Google and get the thing going pretty easily, or you would need seven PhDs working on it for five years, in which case the business doesn’t take the project on.
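As a rough illustration of the "keyword search" outcome Nikhil describes, the labeling task can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the hospital's actual system: the `SUBSTANCE_KEYWORDS` table, the `label_record` function, and the sample report text are all hypothetical.

```python
import re

# Hypothetical keyword table for illustration only; a real system would
# need a clinically reviewed vocabulary.
SUBSTANCE_KEYWORDS = {
    "alcohol": ["alcohol", "drunk"],
    "cocaine": ["cocaine"],
    "heroin": ["heroin"],
}


def label_record(text: str) -> set[str]:
    """Return every substance label whose keyword appears as a whole word."""
    lowered = text.lower()
    labels = set()
    for label, keywords in SUBSTANCE_KEYWORDS.items():
        # \b anchors the match to word boundaries so "alcoholic" is not
        # counted as a hit for "alcohol".
        if any(re.search(rf"\b{re.escape(word)}\b", lowered) for word in keywords):
            labels.add(label)
    return labels


sample = "Patient found on footpath, smelled of alcohol. Heroin kit nearby."
sample_labels = label_record(sample)
```

A matcher like this will also tag a report that says "no alcohol involved", which is the kind of edge case that either stays ignored or turns into the multi-year research project mentioned above.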

So I ended up doing a lot of those two things, either just an intensely difficult problem that no one wanted to invest in, or the AI component was finished in the first week. And what was interesting was it was the only part we sold internally. We would do a bajillion lines of incredible code and we’d only fixate on the five AI ones because even then, 2019, it was the only way for executives to get promoted.

Murray: What led you to write this article?

Nikhil: I thought what I was saying was so obvious and tedious that everyone already feels this way. So I’m just gonna get it out ‘cause I’m annoyed. At the time I was working in a typical corporate environment and every day I’d be told something insane about LLMs. I quit that company, and I got a call from a junior engineer yesterday complaining that every problem is still being addressed with, can we put an LLM in this? It’s very frustrating and I can’t believe it hasn’t stopped yet.

Murray: I was talking to a digital agency CEO recently, who said he’d been into big corporates and they’ve got these big consultancies in, and they’re telling them that AI’s gonna automate their entire workflow so that they won’t need as many people.

Nikhil: My favorite sales technique, lying. Automation is possible, but no one’s been stopping people from automating this stuff for the last 30 years. Automation’s always been available. I think non-technical companies underutilize it to a ridiculous degree. And I think the promise with AI is that it’s almost synonymous with magic in these contexts.

Donna: I just want to substitute the word magic most of the time when people say it because the sentence means exactly the same.

Nikhil: Right. If you went to a company and you said, I’ve identified this thing you could do more efficiently and we can automate it with Selenium, to me that’s a much more concrete offer. But the interest level would be near zero, even though that’s targeting something done in a browser that my team spends tons of time on. But if you go AI, the possibilities are infinite, because AI theoretically can solve any problem in any system. It obviously can’t, but that’s the promise, and it’s very hard to compete against that.

Murray: But won’t we realize untold efficiencies with machine learning?

Nikhil: Where are they? Where are the untold efficiencies? Every couple of weeks we just have some company coming out with, oh, we’re going AI first, we’re going AI first. That doesn’t mean anything. I get emails every day from people being laid off. And even in those discussions, they’re going, I’m basically sure they’re gonna ask me to rejoin in six months. At no point are they like, oh, my job’s actually gone. I’m cooked. It’s over. They’ve all just been like, my boss has been tricked. I have to wait until he sees the light of reason again.

Murray: I have seen a couple of cases of people saying that my executives have got very excited about AI and they’ve fired all our contract developers. So I had 10 people in my team, now I’ve got five, and we have to do the work of the 10 because now we have AI.

Nikhil: Yeah, I think if you fire a bunch of people, the team is smaller, so maybe some things are actually faster, but not for the reasons people think. For the most part, it’s a terrible idea. I don’t object to AI outta some puritanical stance. My consultancy’s interested in making money. And if I thought I could plug an LLM in and halve my prices or double my delivery speed, I would, but I don’t because it doesn’t.

We even had a case about a month ago, trying to assist a government contractor. They’re putting their first data analytics environment together. Should they have one AWS account with dev, test, and prod stuff in there? Or should they have three separate accounts? There are pros and cons, but someone punched it into an LLM and they were like, can you think of any considerations we didn’t think of? And the first thing it says is, oh, then you can just have all your S3 buckets with the same name. You can have a data bucket in test and a data bucket in dev and prod, and you don’t have to rename everything or call it test bucket or whatever. That’s not even true, ‘cause S3 bucket names have to be globally unique. So it’s in this weird case where every time I use it, it says something like that. Unless you’re already an expert, that looks entirely reasonable, because every other AWS service does work that way. What am I meant to use it for? I don’t understand. It’s wrong on so many things.
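Because S3 bucket names share one global namespace, splitting dev, test, and prod into separate accounts does not let all three own a bucket called `data`. One common workaround is to bake the environment into the name. The sketch below is illustrative only: `bucket_name`, the `acme-analytics` prefix, and the `ENVIRONMENTS` tuple are hypothetical, and the check covers only the basic naming rules (3 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit).

```python
import re

# Basic S3 naming rules checked here: 3-63 characters, lowercase letters,
# digits and hyphens, starting and ending with a letter or digit.
# S3 also permits dots; this sketch leaves them out for simplicity.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")

# Hypothetical environment list for the three-account layout discussed above.
ENVIRONMENTS = ("dev", "test", "prod")


def bucket_name(org_prefix: str, purpose: str, env: str) -> str:
    """Build an environment-specific name such as 'acme-analytics-data-dev'."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    name = f"{org_prefix}-{purpose}-{env}".lower()
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


# One distinct name per environment, regardless of which account owns it.
names = [bucket_name("acme-analytics", "data", env) for env in ENVIRONMENTS]
```

A helper like this only guarantees the three names differ from each other. Whether a name is actually free can only be learned by trying to create the bucket, since another AWS customer may already hold it.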

A lot of these problems are just artifacts of the way that most companies are dysfunctional. It’s not like they weren’t wasting the money before AI stuff. It just wasn’t all put under one label. They would buy data governance software that doesn’t work, and then they would buy a firewall that isn’t quite working.

And the spend was distributed across various bad ideas and now it’s all just one bad idea. And that makes it a single target. But I think it is the same social dynamic. It’s a lack of understanding of what people are buying. And the only salient difference is when I talk to non-technical executives, it’s very hard to convince them that they’re not going to get the magic systems improvement they’re hoping for. For reasonable social reasons, right? If you have hundreds of people around you saying, this thing is magical and it’s going to work, anyone who says it isn’t going to work is untrustworthy.

It’s this fundamental issue of, if you’re not an expert, how do you decide who to trust? But I now think a lot of people are willfully deluding themselves and not really being skeptical enough.

When I was in Fiji, I spoke to a couple of people who run veterinary hospitals and veterinary clinics and a couple of politicians there. And AI kept coming up. And the vets didn’t understand why it was coming up, but people at veterinary hospitals were saying, yes, I’ve been sold AI for this and this and this. And the politicians were saying, yes, I’ve been told that AI is gonna revolutionize finance in Fiji. Blah, blah, blah, blah, blah.

Murray: Let’s talk about these so-called AI thought leaders a bit more. You said before that lying is a sales strategy. Do you think that they know that they’re lying? Do you think that they’ve deluded themselves or that they’re just ignorant? What’s going on there?

Nikhil: I have met people who have admitted they’re lying. I have a couple of emails from people who started AI companies and then realized the product doesn’t work. They’ve already got the funding, so they’re doing their best. Some of them felt pretty bad about it. They were very switched on, competent engineers. They just got swept away with the hype.

One of my co-founders went on a date with someone here in Melbourne, and when he asks what they do for work, she reveals that she has received $6 million in funding for an AI startup that doesn’t work. And she’s planning to fly back to India at some point. You could just rack up the money and nothing works and you leave. So there’s a lot of that going on. And those are the people who know what they’re doing doesn’t work.

Even prior to ChatGPT kicking off, people were still obsessed with machine learning, data science, AI for everything. A lot of the people then were really technically unskilled. I worked with quite a few directors who have never really written code in their life, can’t really do math, didn’t really seem to have much in the way of business intelligence credentials. And they really thought they were good. They weren’t lying or anything. They just thought it was normal to not ship. That’s a classic thing in IT.

There’s a lot of career managers who have never actually run a successful IT project and they just think that’s normal because all their friends also never ship projects.

I think we’re seeing a lot of that. And part of why so much of this AI stuff is so popular is this mechanism: previously, if you did a machine learning project, you would need to process a lot of data. You would need to do some fairly complicated mathematics and train-test splits and running through algorithms and hyperparameter tuning and all sorts of complexity, and updating the model if the data set changes and the distribution shifts underneath it.

So those executives and managers had gotten used to never shipping. They would literally take a job, fail at a project for two to three years and then go on to a new job. Not maliciously, they just didn’t know how to do a better job. With LLMs they could just put in a plain English prompt and get something that looks like a proof of concept. Instead of going, we need to train XGBoost to do sentiment analysis on Twitter, you just put the tweet in and then say, is this positive or negative?

That’s not actually a product, but it is a proof of concept. And for people who’ve never shipped their whole careers, I think the exhilaration of being able to churn out presentations week after week hooked the dopamine machine straight up to their brains. And the fact that there’s no pathway to the actual product, is very much beside the point. They just got really excited.

I think for many of them a straight proof of concept must mean the product is almost done. Which is something you can only believe if you’ve never shipped before.

Murray: So what does this say about the competence of technical managers?

Nikhil: It says it’s pretty low, but that’s not a surprise. You had Bryan Finster on a few episodes ago. I think that’s a pretty conventional piece of knowledge amongst a lot of leaders in his area. My consultancy’s an extreme programming shop, and in the extreme programming community, we just assume most places are not terribly good at shipping. If you just randomly pick a company, I will just assume they don’t actually know how to get code out the door.

And that’s not a terrible slam on the people. I think it’s a systems problem. I used to really attribute that failure to people, and in some sense it’s people, but it’s the system they work in. And it’s also the meta systems. So it’s not just the company they’re in, it’s every company they’ve been in and their families and their previous schooling. We don’t systematically produce people who can ship. So it shouldn’t be surprising that most people can’t ship. They love standups and PowerPoints, but not actually writing code that works.

Murray: What are those systemic issues that are preventing people from shipping?

Nikhil: I think the number one issue that stops people from shipping generally is people don’t seem to be plugged into the feedback loop that becomes concerned that they haven’t shipped. At our consultancy, we have a clause in our contract, which is people can terminate the engagement on one week of notice and a full refund on any remaining time.

And there are no terms attached because I know if we ever have a week where the client’s looking at what we’re shipping, and the answer is they ship nothing, something’s probably gone wrong there. The idea I hear very often, the phrase I’ve heard, is the thinnest vertical slice. So, you know, what is the fastest way to a thing a customer can actually use?

Most stuff doesn’t seem to be built that way. It’s just a series of complete horizontal slices. How do we spec out the whole database and then the whole application layer and then the whole DevOps pipeline? And there aren’t clear feedback mechanisms on when that stuff is done. There’s no point where you go, we spent too long in the database. This project is super late. You can just noodle through everything for years at a time thinking you’re just one more week from the next stage. And being aware of that doesn’t fix your problem. It doesn’t tell you how to get to shipping, but that is your point.

If you’re two weeks in and nothing’s on the table for a person to use, you at least have the ability to go, we’ve screwed everything up. It is much worse if you have a two year feedback cycle and then the app doesn’t ship, and then you have one data point saying you’ve missed your target.

We’re going into negotiations next week for probably what will be our biggest contract since we started off, and the consultancy we’re displacing has spent two years on a project and they didn’t ship. And my original counter offer was gonna be, I will ship this in four weeks for half the price, because it was so easy. I don’t know how they didn’t get this thing shipped. And I spoke to someone and they said, I actually needed to add a couple of weeks to that because the person I’m talking to isn’t going to believe it’s that simple. And the thing they’re asking for is a data warehouse with some data model.

We’re talking about three weeks of writing SQL and then we swipe a credit card and buy Snowflake. I don’t understand how the other team didn’t do it. But that is broadly the state of affairs. People are spending hundreds of thousands of dollars and they have no idea. This consultancy, they didn’t even use Git. The whole thing’s not under version control. But they’re big. They’ve got 300 people here in Melbourne on staff. They’re not just five guys in a basement screwing around.

Donna: I’ve been working in software for 25 years and have worked on an awful lot of stuff that didn’t ship. But as a designer, most of the problems that I would see were because people could not figure out and agree on what the problem was. And actually, what you said before about thin slices is a really good way of figuring that out. And the ones that did ship, it was a problem you could comprehend. The team agreed on it well enough.

Nikhil: Yeah. Now that you mention it, it makes the thin slice thing make sense as well, because everything feeds into everything else. Also, if you do a thin slice, you don’t need to get people to agree on the whole problem. You could just go, would this button help anyone anywhere? Can we just agree the button existing is better than the button not existing? And at least you can do something. And if you hate it, you’ve only spent a week. So it’s not a huge problem.

Donna: You could make a design prototype or a skinny working prototype and get people to disagree on it.

Nikhil: It’s at least a start. But what’s been really hard in sales conversations is, a lot of people I talk to are so used to things never shipping that they become very sensitive about it. So they want guarantees in the contract and they want everything specified to a super high level because what they’re thinking is, that will protect me. I will have legal recourse if this thing doesn’t turn out. And it’s very hard to explain that the act of specifying is one of the things that makes the product never get delivered.

We turned a contract away recently even though we probably could have used the money because they said, we have another vendor and we want dollar values and a list of everything you’re gonna deliver in a fixed timeline and then we’re gonna pick the cheaper vendor. And I just went well, you’re just incentivizing me to write a bunch of stuff down that’s gonna be really shoddy and then put a price on that none of us is gonna be happy with. I won’t be happy ‘cause the price will be low and you won’t be happy ‘cause the work’s gonna be bad. So I’m just not gonna participate.

And the client looked really hurt and offended, but I’m just gonna give this work up to the other group, and if they don’t ship in one year, you can call me back. And I’m expecting to be called back in one year.

And it is understandable, because if someone’s been scammed or had a bad deal, it is really hard to hear that we need trust in this relationship to make the thing work. It’s like the last thing people wanna hear.

Murray: Yeah. I was consulting with a big pharmaceutical distributor a while ago. And I remember talking to the general manager of operations about doing a project by building up their internal capability. And he said to me, software development and technology is not our core capability. It’s not something that we want to invest in. We want to give it to somebody else. But I said to him, you’ve got a $6 billion per year revenue business. All of that money goes through your e-commerce system and your CRM and your ordering system that your clients use to order their product every day, is completely custom, and it’s your key competitive advantage. So actually it’s the core of your business and you’re saying no, that, that’s not the core of our business. The core of our business is warehouses and negotiating with suppliers.

So they had underinvested for a long time and their solution was to outsource everything to big consulting companies. And those consulting companies were getting all of the work done offshore in India by other companies.

I see this a lot, where a lot of big companies have just said, no, software isn’t our competency, so we’ll outsource it to whoever’s cheapest. And the results have never been good from what I’ve seen.

Nikhil: I’ve also heard that idea of IT is not a core competency, so we’re not gonna invest in it. And we actually see this in The Phoenix Project, the Gene Kim book. In there, the main threat to the business function is the CEO saying, you had better get this under control, or I’m gonna decide this is not a core competency and we’re offshoring all of this.

And if you’re a technician, you’re looking at that and going that doesn’t even make sense, it’s still gonna cost a bunch of money. You’re basically committing to being really bad at this. Your competence around everything in the business is actually a spectrum. Law is not the core competency of my consultancy. That doesn’t mean I go to the cheapest lawyer at every point. And it also means I don’t keep a bunch of lawyers on staff full time. I try to buy quality in the places where it has high leverage. So we get all our legal advice from one guy, Ian McLaren here in Melbourne. Very expensive because he’s special counsel. So I only call him when I really need help and I only need legal help when we’re doing something really important, getting our master services agreement done. I don’t want that done by a cheap grad. That’s not a smart thing to do.

And that seems the equivalent of just offshoring to India. Just because it’s not the most important thing doesn’t mean you throw it out the window.

Murray: Have you had any experience working with offshore development teams like that?

Nikhil: One of my co-founders spent some time in India running one of those teams, and all I know is they usually don’t recommend anyone goes with an offshore team.

Murray: I did wanna ask you about tabletop role-playing games, and I say that because you, me, and Donna all GM tabletop role-playing games. This is Dungeons and Dragons, RuneQuest, Cthulhu, and other things like that.

We’ve all been playing them for a long time. You said 16 years, so you must have been pretty young.

Nikhil: Yeah, that’s right. I started when I was 10 or 12.

Donna: I was well over 45 before I started, ‘cause I couldn’t find a group until then.

Murray: There’s quite a lot of people like us who have played a lot of tabletop role-playing games, and I’m just wondering, what is it that you get out of that that helps you with your work?

Nikhil: It really helps your reading of social dynamics in rooms. When you’re sitting down with your friends and you’re running a session it might run anywhere between two and four hours. So you’re basically trying to keep a room full of people entertained for four straight hours and that’s a really tough challenge, especially if like phones are on the table. So being good at it really makes you sensitive to how people are feeling. Are players snapping at each other? Is everyone still clued in?

If you have a meeting and you lose someone for 30 seconds, they’re missing context, which means they will not process the remaining five minutes, and after those five minutes they’re gonna be lost for the next hour. I think just really being aware of the moment-to-moment dynamics, is everyone tuned to this conversation, is very helpful.

Everyone I know that is a great DM is really good at running meetings and they’re also very sensitive to the broader concept of fun. Something I’ve really enjoyed since starting my own consultancy and the reason I did it was I was just very bored in corporate contexts. And I think there was this sort of implicit assumption that if people were having fun, then they weren’t doing hard work. And for the most part, we steered the other way, which is if we’re not having fun with something, that’s our time to check and make sure this thing should be done at all.

And when you look at how things like Scrum are typically conducted, there is a lot of high-level detail on meeting structures and ceremonies, but there’s not really anything in there about what to do if someone looks like they’re just blankly staring at a wall.

In extreme programming, you go, we’re gonna take a break because we’re pairing together and you’re not saying anything, and you know clearly something’s gone wrong. The two places I know to learn that are tabletop RPGs and extreme programming. And then we just don’t teach people that anywhere else, except maybe the theater.

Murray: The other thing I find is that people who’ve played tabletop role-playing games are very familiar with autonomous, self-organizing, self-managing teams. Because you are a player, you’ve got a problem, you’ve gotta solve it. It could be a bank heist or something, but you’ve gotta plan and act and use all your resources and cooperate at a high level. I think it teaches you entrepreneurialism and self-organization in a way that is not expected and actively fought against in a corporation. If you are the sort of person who thinks for themselves and who says, oh, I see a problem, we should do this to solve it, you could easily get smashed down real hard in a corporate for saying that.

Nikhil: Yeah, I think that is a huge component to it. I think it’s really helpful to develop that skill. And I do want to comment at a little more length about what the median corporation does to people. It’s not good for them psychologically in the long term.

I thought because I was very popular with engineers, ‘cause I write an engineer-oriented blog, I would get people to reach out to me and they would describe failing projects. And we’d go in and that would be our way into the company. And we would go in and clean stuff up. That’s worked to some degree. But it was much less successful than I thought, because a lot of people would reach out and they would say, this project’s not working for various reasons. A consultancy’s come in, done a really bad job. My team’s flagging, my manager’s kind of clueless. And I would say, all right, this sounds tractable. We can sit down and talk to someone. Then the request would come every single time: can you do this in a way that doesn’t suggest I see any problems with the company or that I approached you? Can you raise a problem with my manager without saying I think there’s a problem? And there’s a strange circularity to it, which is the reason there is a problem is the organization is full of people that won’t admit there’s a problem. And there’s reason in that. There’s survival value to it.

Everyone that I respect has either quit a job into unemployment or been fired. Almost everyone I know at some point has had some principle they wouldn’t violate. And they were like, we’re fixing this project, or you’re getting rid of me, or I’m quitting. I’m in that category. A lot of my friends are in that category. Most employees don’t do that.

It’s not even good for your career in the long term. It just traps you in some local maximum, where you’re just here and then you wait four years and you go to senior engineer and then you wait five years and you go to staff engineer or whatever. That’s where a lot of people are. And the appetite for risk taking is way too low, I think.

Murray: I wanna come back to your blog post because it was quite inflammatory, and so are a few of your other posts. How on earth can you say such inflammatory stuff and get people to hire you for consulting gigs?

Nikhil: It’s the opposite. If I didn’t have interesting opinions, why would anyone talk to me? Imagine I wrote a different blog post and it was just the most generic thing possible. I think we’re in the middle of a hype cycle, but there’s clearly something there. Some people are gonna use AI and some people are gonna throw away the parts that don’t work, and we’re early in the thing. Every technology has gone through this cycle. That’s not an unreasonable thing to write. But no one reading that has gained any sort of information. And I’m not talking about writing inflammatory things to get passed along. I just mean a lot of stuff in the corporate world is very opinion devoid.

I meet a lot of engineers who’ve done a lot of corporate work who end up in this category where they’re broadly technically knowledgeable, but they don’t even have things they object to. I meet a lot of engineers who just won’t bad mouth any piece of technology in any context, and they’re all extremely ineffective because they bring nothing to the table. Their whole personality profile has been constructed as if you’re trying to avoid upsetting any vendors ever and they’re not getting money from vendors. So I don’t know why they behave that way.

Donna: Maybe people have been taught to be professional in air quotes. I’ve had my whole career being told that I was too opinionated and blunt but I have thoughts.

Murray: We’re not paying you for your thoughts. We’re paying you to do work.

Donna: Yeah, I got told that once literally by a client and okay, here’s the work I’ve finished. I’ll hand my pass in at the desk.

Nikhil: It gets back to that offshoring thing. So I’m a Malaysian citizen. And a huge issue we have in Southeast Asia is that a lot of our graduates and staff are really strongly encouraged to accept their source of values from other parties. So we have a lot of engineers in Southeast Asia who work very hard, but they won’t express their own opinions.

That’s very dangerous for a business to get into, because if something is not your core competency, you actually really need someone to give you strong opinions. I don’t want my lawyer to just do things I tell him to do. I don’t know what the hell I’m doing. I need to sit down, describe problems to him, and then he tells me what I need to do. If he just turned up every day and said all these contracts are approximately equal, and I don’t wanna bad mouth anyone, and I don’t wanna say there’s a wrong course of action, he would be incredibly useless to me. Instead, it’s the opposite. He just tells me what to do and I trust him.

Murray: I have worked for quite a few vendors. The vendors will train you to always agree with the client, to ask them questions, to find out what it is that they want to buy, and then tell them that’s what you can offer them. Whether you can or not, that’s the delivery people’s problem. Your job is to tell the client whatever they want to hear to get the deal across the line. And disagreeing with a customer can lead you to lose customers. I know because I’ve disagreed with customers when I thought it was appropriate and I’ve lost some deals. But some deals are just bad deals that are gonna be fucked up if they keep going the way they’re going.

Nikhil: Yeah, you’re absolutely right. And that’s the tricky thing, for somewhere between 90 and 98% of the market, the most effective way is to just agree with whatever preconceptions they have. If I wanted to max out on my sales, I would go to everyone and I would go, everything you’ve done is basically genius. And if you hire us, we could go to super genius.

The clients who want to hear that usually approach us and go, could you do AI for us? what happens almost every time is I do my ethical duty and I go, Hey, this isn’t gonna work, dude. It’s not gonna work. And then they politely thank me and then we never speak again.

Donna: Do they know what they’re asking AI to solve? Or is this just actually the hype cycle of we’ve heard about this thing. We’ll get somebody else to tell us how the thing can help us.

Nikhil: The only product company that we have had as a client that knew what they were doing with AI didn’t even use the word AI on their website. There was no way of knowing it was actually an AI product. They had computer vision experts on the back end doing really clever stuff. And they wouldn’t debase themselves by going AI, AI, AI. Everyone else that has asked us for AI has had no idea what the fuck they were talking about.

It’s so difficult to talk them down from it ’cause it’s a major hit to your ego if someone comes in and says, Hey, you’re misled on all this AI stuff. At that point, they’ve probably said a lot of things to a lot of people about how they’re about to get invested in AI. So you need a huge amount of humility to dial it back and admit, Hey, I was wrong on this one. And I think that humility is very anti-correlated with what gets you into an executive position in the first place.

One of the reasons our company is so small is, and we’re not even a big company, we have six people, I think if I had 50 people, I might be forced to start selling AI stuff. I don’t know that I could afford to be picky with our clients anymore. You just need a lot of places with a lot of very reasonable leaders who want to hear sane things, and I actually don’t think there’s a market for it right now. You have to stay small and work with two or three sensible people. It just takes so long to find good clients.

Murray: Let’s talk about the future. What is the realistic trajectory for AI development in the next five years?

Nikhil: We’ll start with the kind of disaster, which would be they don’t make substantial progress on the hallucination problem. As you probably know, LLMs currently just say silly stuff sometimes. They say it enough that it’s a major barrier to adoption. If there was some way to consistently align an LLM so that it would say “I don’t know,” that would immediately make them hyper valuable. One of the reasons AI wasn’t doing so well in 2019 is you couldn’t get a hundred percent accuracy on almost any system. It’s why a lot of this stuff isn’t used in legal practice right now, because you can’t afford to have your contracts be wrong even 1% of the time.

That would be disastrous. If they don’t solve the hallucination problem or make immense progress, this was a gigantic bubble and most of this money’s wasted. That is probably the most likely outcome.

The next one is you have incremental improvements and we get to a point where those problems are mostly resolved but not completely solved. That doesn’t seem like it’s going to happen. The idea that this architecture can result in truth telling doesn’t seem plausible to me. It would be a genuine innovation for that problem to be solved. Half the planet’s working on it right now, so maybe we’ll get there.

Murray: Maybe it’s as good at truth telling as most people are.

Nikhil: Yes, that is quite possible. One of the reasons it’s an issue though, is that if your lawyer makes a mistake, your lawyer is accountable and these systems can’t be accountable. It’s okay for a doctor to misdiagnose. You just take the doctor to court and they have malpractice insurance and so on. I don’t really know what it means for an AI system to misdiagnose someone or who is responsible or who’d be willing to accept the consequences of that. Or as a consumer, if the company won’t accept the consequences, are you okay with having no legal recourse if it misses your cancer?

Murray: But we have developed human solutions assuming people are sometimes wrong, and they involve other people with different points of view looking at things.

Nikhil: That’s true. And I think this is part of the fixation on agents, which is just LLMs piping output into other LLMs. It’s a crude way of going, can we get a human check on this? And the answer is no. It’s still a thing that gives you answers based on whatever it thinks the most likely next token is.

I had an interesting case where I was trying to find a way to use one of these things, and I wanted to explain to a non-technical client why they shouldn’t be using a really old piece of software. And we thought, maybe we can get ChatGPT or Claude to write an explanation for a non-technician and we can pick some ideas out of it. I wrote some prompt that was: here’s what this other consultancy’s doing. I have extreme doubts. Could you write an explanation for a non-technical audience?

One of my co-founders didn’t use the words “extreme doubts” and we got completely different answers. Mine came out with: fire these people immediately, they don’t know what they’re doing. And my co-founder got: I think you’re overreacting, this is a great piece of software that’s battle tested. So that’s definitely an issue with just using LLMs to verify LLMs.

Earlier on you asked what makes teams deliver more slowly, and there are discrete factors, but there are also cycles between those factors, like interactions between principles. One of the problems with the written format is it’s linear. It’s not a graph. And I wonder how much LLMs struggle with systems thinking purely because they do everything in the written format, which means they’re producing linear content. And that must make them exceptionally bad at actually representing a graph of connections.

I’m expecting improvements. The problem is I’m expecting big improvements in things that aren’t the main barrier to value.

It needs to stop lying or we’re not gonna consistently get value out of it. There’s other stuff around context windows and the relative intelligence and whatnot. The key thing is that if they don’t stop hallucinating, then all the investment is in a fair bit of trouble.

Murray: Is data engineering the answer?

Nikhil: The moment you’ve put away AI stuff, the problem space is no longer how do we solve some subclass of problems susceptible to AI? We’re now talking about how do you solve problems because computers can solve all sorts of problems.

Data engineering is essentially the set of practices and patterns we use to integrate systems for analytics purposes.

A traditional example would be you have something like an HR system and a warehouse system. And let’s say you have OH&S incidents recorded in the warehouse system. So in data engineering, you’d integrate those systems by pulling all the shift details and all the literal warehouse stuff into a data warehouse, which is a big database.

And then you could do things like, see, having this staff member on at certain times causes these incidents to go up. So a lot of businesses have trouble integrating systems and they do weird stuff with their data warehouse where they pull the data in and they push it up and it’s usually a bad idea.
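The HR and warehouse example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular client system works: the table names, column names, and the `incidents_per_shift` helper are all hypothetical, an in-memory SQLite database stands in for the data warehouse, and it assumes each staff member works at most one shift per date. It also only shows co-occurrence; working out whether a person actually causes incidents takes more than a join.

```python
import sqlite3


def incidents_per_shift(shifts, incidents):
    """Load HR shifts and warehouse incidents into one database and
    return (staff, shifts_worked, incident_count, incidents_per_shift) rows.

    shifts:    iterable of (staff, shift_date) from the hypothetical HR system
    incidents: iterable of (incident_id, incident_date) from the warehouse system
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    conn.execute("CREATE TABLE shifts (staff TEXT, shift_date TEXT)")
    conn.execute("CREATE TABLE incidents (incident_id INTEGER, incident_date TEXT)")
    conn.executemany("INSERT INTO shifts VALUES (?, ?)", shifts)
    conn.executemany("INSERT INTO incidents VALUES (?, ?)", incidents)

    # Join the two sources on date: which incidents happened on days
    # when each staff member was rostered on?
    rows = conn.execute(
        """
        SELECT s.staff,
               COUNT(DISTINCT s.shift_date) AS shifts_worked,
               COUNT(i.incident_id)         AS incident_count
        FROM shifts AS s
        LEFT JOIN incidents AS i ON i.incident_date = s.shift_date
        GROUP BY s.staff
        ORDER BY s.staff
        """
    ).fetchall()
    conn.close()
    return [(staff, n, k, k / n) for staff, n, k in rows]


if __name__ == "__main__":
    sample_shifts = [
        ("alice", "2024-05-01"), ("alice", "2024-05-02"), ("alice", "2024-05-03"),
        ("bob", "2024-05-02"), ("bob", "2024-05-04"),
    ]
    sample_incidents = [(1, "2024-05-02"), (2, "2024-05-02"), (3, "2024-05-04")]
    for row in incidents_per_shift(sample_shifts, sample_incidents):
        print(row)
```

Once both sources sit in the same database, the question becomes a single query, which is the point of pulling everything into one place.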

Data engineering is basically the act of cleaning up your data to a high level and figuring out how to integrate it in a way that generally supports either introspection about the business or enhancing the day-to-day operations.

It is necessary to do AI work if you’re serious because any algorithm you run needs to have an accurate idea of what is happening in the business and what the state of things is at a given moment. Even the state of your documentation, I would consider to be in the scope of data engineering. Especially if you’re talking about LLMs, you do need a source of truth for the thing to reason about your business. And that’s the sort of stuff we specialize in. We go to businesses that frequently have a good idea of how to get their systems to work together, and we just do really boring cleanup. And once you’re not talking about AI, what I just described to you probably just sounds like good software engineering.

Murray: One thing I wanted to ask you about was what the media response has been like to your article.

Nikhil: I did a lot of interviews with journalists at the Financial Times and stuff, and almost all of them have just said they a hundred percent agree with me. They haven’t seen much sign that what I’ve said is not accurate, but they don’t feel comfortable putting that opinion out there because they’re not technical experts themselves. There’s a lot of bits where they turn the microphone off and then agree a bunch and then turn the microphone back on.

I think the most concerning thing has been the number of non-technical experts who have reached out and basically stated they feel threatened by the tech industry. Because what a lot of these grifty people are doing is going to artists and terrorizing them with stuff that is probably not going to eventuate. Some stuff has; I imagine a lot of translation and drawing work has dried up, but a lot of the people who reached out were thanking me for calming them down.

If it is a bubble, then they’re all accountable for a huge amount of human suffering, and I will not let them forget it.

Murray: Well, also the enormous amount of waste that’s going on with this bubble.

Nikhil: I know executives at hospitals and universities who stated that they are unable to fund a huge number of ventures because everything has gone into AI. So it is killing people in a second-order effect. And a lot of actually lifesaving initiatives are only being undertaken now if you AI-wash them first. I had at least one nonprofit reach out and say, we need to deliver a bunch of urgent medical care to this place. Will you help us figure out what is the smallest amount of AI we can do so that we can actually get this money to do the thing?

Murray: Donna, do you wanna give us a summary?

Donna: Some of the things that we chatted through were the hype cycle. Lying as a sales strategy. People believing that AI is magic. We talked a bit about why companies don’t tend to ship product. Of course, it’s always organizational, not individual; it’s usually organizational structures. We had a tangent into how playing role-playing games helps us all in the jobs that we do. A lot of the value we get from it is being able to really read social dynamics in a room, being sensitive to the idea of fun. You made the point that if we don’t progress the hallucination problem, this is just a gigantic bubble of money wasted. Another option for the future might be incremental improvement, losing jobs and discovering that we actually need them still.

Murray: Yeah. That’s a good summary.

All right. Let’s talk about your company.

Nikhil: Hermit tech. So hermit-tech.com. We work with places in the US and Australia for the most part. You always work directly with me or one of my co-founders. We only have directors of the business. We don’t outsource anything. And they can go to the Ludic blog; it is L-U-D-I-C.

Murray: Okay, well good to talk to you.

Donna: Thank you.

Murray: Thanks for coming on.

Nikhil: Thanks

NN - Outro VO: That was No Nonsense Agile Leadership from Murray Robinson and Donna Spencer. If you’d like to explore better ways to develop software products and services, contact Donna on LinkedIn and Murray at evolv.co. That’s evolve with a zero. Thanks for listening
