How Data Hygiene Impacts AI Success

Insightly_Ep23
===

[00:00:00]

Alyssa McGinn: Okay, I want to start by telling you a story. And everyone, welcome to the podcast. You know we talk about data. And I don't want to bore you with all the recap because

Jordan Walker: Listen to a previous episode

Alyssa McGinn: Listen to one of our episodes, you'll know what we talk about. Data stuff. Okay, I want to tell you a story

Jordan Walker: Okay, tell me

Alyssa McGinn: So we talked a lot about the middle market. So this is about a mid sized company, specifically an auto parts manufacturer. They do about 50 million in revenue. So sizable company. They decided, you know, [00:01:00] AI would really enhance their productivity specifically on predictive maintenance.

Jordan Walker: Oh,

Alyssa McGinn: So when do cars typically need servicing?

What types of servicing? Like starting to slice and dice by model, by year, all kinds of cool stuff. They invested $200,000 into this AI system that allows them to predict maintenance needs. I think it's for cars, but then also for their own internal equipment. And they thought, you know, this is going to be amazing, you know, like we're going to know exactly, like, if we stay on top of the maintenance, we'll probably not have to hire.

Maybe they have someone that I think runs a maintenance schedule and does all that stuff. So, and, you know, to reduce the major downtime that their equipment has.

Jordan Walker: Yeah.

Alyssa McGinn: So then what happened was six months after implementing this system, which, you know, $200,000 is a decent chunk of change, they started realizing that the AI was making worse predictions than what they [00:02:00] had been seeing previously, when it was just humans doing it. This is what they found out when they kind of dug into it. So, each time there was a shift change, they realized that each shift was logging the maintenance issues differently. The critical sensor data was being recorded in three different units of measurement. Forty percent of maintenance logs were missing key information. And equipment IDs weren't standardized.
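
The four problems in this story (inconsistent logging across shifts, mixed units, missing fields, non-standard equipment IDs) are the kind of thing a small audit script can surface before any AI work starts. What follows is only a minimal sketch in Python, not the company's actual process. The field names, the sample rows, and the ID normalization rule are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical maintenance-log rows. Field names and values are made up for illustration.
LOGS = [
    {"shift": "A", "equipment_id": "PRESS-01", "temp": 180.0, "unit": "F", "issue": "belt wear"},
    {"shift": "B", "equipment_id": "press 1", "temp": 82.0, "unit": "C", "issue": ""},
    {"shift": "C", "equipment_id": "Press_01", "temp": 355.0, "unit": "K", "issue": "belt wear"},
    {"shift": "A", "equipment_id": "LATHE-02", "temp": None, "unit": "F", "issue": "noise"},
]

# Assumed definition of "key information": every one of these must be filled in.
REQUIRED = ("equipment_id", "temp", "unit", "issue")


def to_celsius(value, unit):
    """Convert a temperature reading to Celsius so readings from all shifts are comparable."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unknown unit: {unit!r}")


def normalize_id(raw):
    """Map spellings like 'press 1' and 'Press_01' to one form, 'PRESS-1' (an assumed rule)."""
    match = re.fullmatch(r"\s*([A-Za-z]+)[\s_-]*(\d+)\s*", raw)
    if match is None:
        # Anything that does not fit the name-plus-number pattern is only trimmed and uppercased.
        return raw.strip().upper()
    name, number = match.groups()
    return f"{name.upper()}-{int(number)}"


def audit(logs, required):
    """Summarize missing fields, units in use, and equipment IDs with more than one spelling."""
    incomplete = sum(
        1 for row in logs if any(row.get(field) in (None, "") for field in required)
    )
    units_seen = Counter(row["unit"] for row in logs if row.get("unit"))
    spellings = defaultdict(set)
    for row in logs:
        if row.get("equipment_id"):
            spellings[normalize_id(row["equipment_id"])].add(row["equipment_id"])
    return {
        "missing_rate": incomplete / len(logs),
        "units_seen": units_seen,
        "id_variants": {key: raw for key, raw in spellings.items() if len(raw) > 1},
    }


report = audit(LOGS, REQUIRED)
```

On these sample rows the report shows half the logs missing a required field, three different units in use, and one machine logged under three spellings. The rules themselves still have to come from people who know the floor: which fields count as key, which unit is canonical, and which spellings really are the same machine.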

Jordan Walker: Okay, can we stop for a hot second because I'm over here trying not to giggle.

Alyssa McGinn: Let it out.

Jordan Walker: This is exactly like the same thing that we hear when it's like I want to use my CRM system, but the labels aren't being used the same way. Half the profiles aren't filled out correctly. So what you're telling me is this exists everywhere

Alyssa McGinn: Yeah, 100%. So this is just not a CRM, this is just a different kind of a system. So, when I [00:03:00] read that, I just say, okay, the data input

Jordan Walker: Data cleanliness,

Alyssa McGinn: yeah, the input is bad, then the units of measurement, like there's no standard reporting on that.

Jordan Walker: which that one actually kind of boggles my mind just a little bit.

Alyssa McGinn: Which how are they doing that

Jordan Walker: Right. Yeah. What. Okay. There's obviously more questions from an operational standpoint to ask

Alyssa McGinn: And that's not our realm, but let's just say there's issues there. So basically, incomplete data, messy data and just inconsistencies and lack of standardization. So, they ended up pausing the AI system or, the further implementation of it. And they had to spend an entire year fixing some of the foundational pieces. But what they found was actually the super interesting part is that they cleaned the data and they standardized across, different shifts and systems.

And that, reduced their downtime by 15%.

Jordan Walker: Because they [00:04:00] standardized it?

Alyssa McGinn: Uh huh. Because they standardized it and cleaned it up.

Jordan Walker: Yeah.

Alyssa McGinn: And so now, you know, they could see patterns and start to pull out trends of the maintenance. But they weren't using AI when they saw that 15 percent reduction in downtime. So

Jordan Walker: So the practice of going through and really, like, getting closer to the foundational data alone helped them reduce 15 percent

Alyssa McGinn: But they thought that, you know, implementing the AI tools would, you know, far surpass that. So what do we learn from this, which is going to be the whole topic of this episode, is that AI is nothing without data. And AI is definitely not good without clean, good, useful data. And it's, in the world where this, every new shiny object is something AI, this AI.

[00:05:00] I mean, I know we both use it. We're not, I'm not against it. I'm for it. Because it's,

Jordan Walker: you should be a healthy skeptic of it right now.

Alyssa McGinn: Just the insane money that's being thrown towards it. It's not going to work if your data is not good. And what do you call it now if you tweeted? What do you, what

Jordan Walker: You we're still saying Twitter, like,

Alyssa McGinn: I exed it. I tweeted. We're having the wrong conversation. Because middle market companies are still trying to pull 20 different spreadsheets together for one monthly report that's being done manually. Like, we need to be having a different conversation.

Jordan Walker: Well, it's the same thing that happens when any new bright tool hits the market and becomes more readily available to people and companies, right? Like, we all want to hop on it quickly. If the talking points have been, Oh, this can actually make things a lot more efficient for you.

Biz bam, boom, just connect these things and you're going to be good to go. That is never the actual reality of any tool that ever hits the market like [00:06:00] this, right? Something that, like, within this example, we know that, the data foundations were completely skipped.

So obviously, if you're going through this, doing a good data hygiene audit is a good first step for any company of any size looking to explore data and AI as a tool within their company. Data hygiene is important,

but the thing that stuck out to me when you were telling me the story is it kind of goes back to why the human element, regardless of AI, is still important, because without the context around the data, a tool isn't gonna know to look for specific differences in data. Like even if you wanted AI to help you skim a bunch of data to help clean it up, it still needs to know what to look for and what conditions to review before it changes it, right? You would do this [00:07:00] as a human too. You would look, if you had to go through and comb through all of these spreadsheets or systems or software, whatever.

You would be looking for the inconsistencies and the conditions surrounding it before you made a recommendation on whether it should be this data label versus that one, right?

The context and that connection to the data is so important in these processes. And, I mean, we probably aren't gonna like this as much, but this is why you also can't just say, "Hey, external team, here's my package, go for it." Because even an external team is not going to know all of those conditions. You have to be close to it. You have to understand, like, okay, well, when does that maintenance trigger typically happen? And how does that user experience or that journey to make that decision take place?

And then how do you update it from there?

Alyssa McGinn: Yeah. I think of the world in the future where we talk about training [00:08:00] models or like training the ai. There's a world where that's possible. Like a human from the company can train AI on like a local

Jordan Walker: Model. That's what you would want that person to do too, right? When AI and, like, ChatGPT first came out, there was the whole conversation around, like, "Oh, it's going to take our jobs, blah, blah, blah." But in a very simple example, I would much rather have someone who was skilled in web development train AI to develop code than me.

You know what I mean? Like,

Alyssa McGinn: Yeah, the specific knowledge around a certain topic. Where you're trying to make AI better is still, you know, held in the humans right now

Jordan Walker: Mm hmm.

Alyssa McGinn: And I heard a quote that's like AI is not gonna take your job but people that know how to use AI are gonna take the jobs of people that don't or are not willing to adopt it because that is, this isn't a hype cycle.

This isn't going away. It's a shiny object right now, where people are, in the sense of, like, let's just get on the [00:09:00] bandwagon and do it, do it, do it before they're ready. But it's not going anywhere. It's not something that's gonna be a fad.

Jordan Walker: Yeah, no, I agree. Like, we, we are here with it now, but we're definitely still in that Wild West territory.

Alyssa McGinn: It's the Wild freaking West out there

Jordan Walker: Yeah, where you have the early adopters, right, that are really hopping on it for good or for worse, whatever. I'm glad that people are testing it out. I'm glad that people are asking questions about, like, how should we really be using this?

Like, what are the ethics? Like, I'm glad for all of those conversations that are happening. We need them. But it's not going to go away, and also Newsflash, AI has actually been here for a while. Like, it's just now accessible.

Alyssa McGinn: It's democratized now cause of an individual person. Yes. And I was even thinking, like, from a service standpoint, so, anyone listening, tell me if this is a bad idea. Like, there needs to be almost, like, a service or consultants that do an AI readiness audit, because [00:10:00] there's a lot of components. Like, we had Tariq on the show and it was, like, talking about cyber, like, that's a component, data is a component, IT systems, you know, that's a component. I mean, software, tech stack, you know, there's a lot of different components.

So it's almost like it would be more worth your 200 grand to go and have someone thoroughly audit everything and say, if you really want to implement AI at this level, here are the things that you need to do and fix. Spend the next year. And you might feel like, I think people are just scared to fall behind and are scared of competition. But I can guarantee that, you know, I mean, the Googles and the Microsofts of the world are different, but in the middle market, I can guarantee that if you spend that time and that money to do it right, you're gonna be ahead of the curve.

Jordan Walker: I would agree. And I think it's good to have a healthy, risk mindset when you're going into any kind of innovation like this within your company, you have to look at, okay, what [00:11:00] are the outcomes that we're hoping to achieve? But instead of just launching into, well, let's just, develop this as we go.

This is a scenario where, Numbers are numbers until they're put into some sort of context, and if you're not giving the right direction or the strategy in it, you could be getting what looks like great insights back from a tool that have very little basis to what your real data is actually telling you, just because of the way that it was, like, manufactured to spit back to you.

And so I would agree that, like, you know, spending that 200K, give yourself a good, like, six months to fully, like, audit all of your data to understand, like, what cleanliness needs to occur. Align that with the goals that you have on how you're wanting to use this, like, not only within the next, like, year or so, but what's the future state?

Like, what is your hope and dream? If you could wave a magic wand, because, like, now's a really good time to be, like, if [00:12:00] you're auditing, that's a really good time to be exploring what that looks like. So then you can phase approach it. You can actually develop a road map that allows you to see, Okay, we've audited.

Now we need to spend the next three months cleaning all of this stuff up. Cool, we've already, reduced downtime by 15%. This initial goal that we had now needs to be adjusted. So now you can be iterating as you go along instead of, like, This is never going to be a stand it up and forget it and walk away.

Do it right to stand it up. That way you have the opportunity to innovate further instead of working on Band-Aid fixes after it's launched.

Alyssa McGinn: Cause, to your point, it's evolving so quickly. Like, what's happening right now, some of the things weren't even possible just in Q4 of 2024. It's changing so fast. So to even say that you're implementing AI, using AI, you know, within your company, like, that's going to look [00:13:00] different in three months, six months, you know, a year. And so, I mean, you all, everyone knows where we stand, but it's just so worth, especially in this day and age, like, it's so worth investing in the foundation.

Because then that foundation allows you to build whatever house you want. This is an example that just came into my mind. So like, let's say on a house you build like a really solid foundation, and you build a beautiful house. And then, you know, two years later, the trim that you chose is like, radically out of style.

And you want to, like, change the trim. Or the, you know, the fixings on the house. You can easily just do that because your house can withstand, the foundation can withstand that, and you can put the pretty new exterior on, as opposed to having to literally knock the house down just to fix, you know, the shiny pretty parts of it. That's kind of how I [00:14:00] imagine it to be, is that there's gonna be, yes, there's gonna be so many pretty things that will come. It's gonna be hard to pedal backwards to, like you said, all the Band-Aids, take all the Band-Aids off to fix it. And at that point, it could be a very severe setback if in five years you have to knock the house down.

Jordan Walker: Well, and something that comes to mind is I think we get hooked on looking for the software and the tools first. And that is how we end up, like in this case. If it was, okay, we're going to invest $200,000 in an AI system, was the system the thing that we were making a decision against, meaning are we having demos on all these different systems?

Are we like interviewing teams on like what their process is? Blah, blah, blah, blah, blah. So then our focus is around, we are going to hire this group or whatever and their system to then come in and do all of this. But we've focused so [00:15:00] much on this, on the system that we didn't actually like focus on the whole reason why we were set to do this to begin with.

Like, I don't know what the answer is for this particular story, but just historically, like thinking about any new tool that comes to market, a lot of times it's okay, well, we've heard about it in some networking association, we've participated in webinars to learn more about it. We're hearing more of our peers become interested in it, like within our competitive spaces.

So now we're becoming a lot more serious about it. We've got some visionaries in our leadership team that are like, yep, that's the direction we're going in. And they probably have a lot of really good ideas about it, but now it's been delegated down to now you go figure out how we're going to make this happen.

And historically speaking, whenever this occurs, we start looking at just the tool and the system that we're going to onboard. So it's like back in the day when everybody was talking about CRMs, you know, it's like, I'm going to look at HubSpot. I'm going to look at Salesforce. And [00:16:00] those are probably the big two at the time.

Right. So I'm going to, or incision or Oracle. Okay. So I've got these three, but am I asking why I'm going to use a CRM? Am I going to talk about how I need to use it for my sales, you know, development teams and whatnot? Once those kind of things got implemented, it took companies years to kind of like reverse course a little bit after they fully stood up these CRM systems to realize that incomplete data is still going into it.

So they can't do all the things that they want to do anyway. That's why I was giggling when you were telling this story, because it's like, you know, we're, we are focused on the system and the tool, and we're so excited about it. Maybe it's passion that is driving our instant gratification. But every single time this happens, I feel like we always learn that we should have taken a breath in the beginning.

Alyssa McGinn: That's exactly our conversation with Ted, remember? Like, he's all about especially tech stacks for non-profits. But I think his [00:17:00] overall opinion still vibes with that, is like you get so entrenched in the feature set.

Jordan Walker: Yeah.

Alyssa McGinn: And you get so like, amazed at what it could do for you and all the different bells and whistles but forget that like the people on your team actually still have to use the tool for it to do the things And like HubSpot for example, like super powerful

Jordan Walker: Super powerful if you buy the whole thing

Alyssa McGinn: and actually use

Jordan Walker: you use

Alyssa McGinn: set it up and have all the pieces, but it only works as good as the humans using it. And at this point within the AI revolution, if you will, to your point, like, we still need humans to use it, you know, we still have data input and processes where data is being captured or not being captured, and that's not going away until there's some sort of, you know, telepathy to read your, which I'm sure is coming. Yeah. To read your salesperson's brain and they just hook on a little thing [00:18:00] and then it

Jordan Walker: That'd actually be magical. Can someone please,

Alyssa McGinn: come up with that. That would make your job

Jordan Walker: that?

Alyssa McGinn: much easier. 'cause you'd have all the knowledge that they have.

But that's not here. And we still need humans to use the tools.

Jordan Walker: So I feel like I don't want to discourage anyone from going this path. I want you to keep exploring. I want you to keep going there. But I think what I would probably recommend, if I were in the shoes of someone who was looking to stand up an AI system in their company: while you're exploring systems, like, I think it goes back to you need to either have somebody internally or hire an expert to audit your data hygiene, like, just a look at what are the data sets that we're hoping to use for this and what does that look like cleanliness wise. I'd also look at the processes. So if we have an issue of incomplete data, which as we've already noted, this is not an uncommon challenge just within this example, [00:19:00] I see this in literally every company that I've ever worked with.

So incomplete data happens. I think outside of the data hygiene, it's also really outlining What is our process and what is our expectation of the individuals who are inputting the data?

Because if there are gaps in that, now would be the time to start firming that up. And you can even put it in the case of, this is our vision. This is what we're moving toward, but we can't do it without you. And so these are like, this is what we're now being held accountable toward. Because by the time you at least, like, find a system, you want to have all that stuff already taken care of. And if you're exploring systems and a part of their onboarding process isn't to help you with those items automatically, I would probably throw a red flag on that immediately. So I think the foundations are important. Looking at your team processes is the second thing before you ever even choose a platform.

Because I [00:20:00] would think that if I were that platform, I would want to know that information so that I could figure out the best way to help you.

Alyssa McGinn: Mm hmm. I also, this wasn't very clear in the story, but I'm assuming, before AI gets productized as much, I'm assuming that a lot of this internal spend on AI solutions is custom.

And so I feel like to your point, like maybe you're not necessarily at this point, evaluating platforms, maybe you are, but you're also maybe evaluating groups or consultants or developers to build something for you internally. You know, that's specific and proprietary to you or to your company. That feels, to me, like an even bigger investment of time to, like, home grow your own system or your own tool set or, you know, whatever it is. But yeah, this is definitely not a discouragement. Like this is a we're here for it. We think it, I [00:21:00] think it's super powerful. It can be used in so many unprecedented ways, like how it can maximize efficiency and optimize insights and results and just all the potential.

Like, I think it's there and it's not going away. So I, yeah, definitely don't feel deterred. It's just take a, take small steps and be thoughtful and careful as you move through it. Because it is something that is ever changing. But you want to be like foundationally prepared for, and that's going to take investment too.

So, I don't know why that got really serious, but I, I mean, it's like,

Jordan Walker: well, I think we're gonna see a lot of these things happen a lot more where companies are going to take a leap and I mean, again, I think of this like any other tool or system or You know, even something as simple as like, we've got to completely like overhaul our server room or something like that.

I mean, you would [00:22:00] never embark on a challenge, like today, you would never embark on a challenge like that without really taking a look at, okay, who are the stakeholders that are involved in using this every single day? What are the, what's the data and the information that, you know, is being housed here?

What are the requirements of the information in order to be housed here? What are the security requirements that we have? Like, you would do all of these checks and balances for a project like that, so why not do that same level of checks and balances for an AI system? Like, to me, it's kind of, it, it's a big misstep by not covering those just to leap into it.

Now, I'm not opposed to, you know, quick startup solutions if you're wanting to have more of that mindset within your company just to, like, do small trials of it. But something as big as this, I think needed a little bit more time to, like, think through because obviously, like, what they found on the back end was something that probably could have been [00:23:00] remedied pretty quickly.

Alyssa McGinn: Yeah, 100%. I just wonder about the server rooms. Like again, that's physical, so I wonder if it's like you see the money in a room, but with AI it's a little, you know, everything is in the cloud for the most part now.

And so it's almost, like, hard to physically or visually attach that value, which is weird. So you're just like, you can't really, our brains, it's still probably hard to quantify or, like, I don't know. You don't just go and look in a room and see all the servers and you're like, this is hundreds of thousands of dollars. We better be careful as we step through this.

Jordan Walker: I'd love to talk to a business owner who has either like started to implement an AI system like this at whatever scale. It doesn't have to be as major as, you know, a company with 50 million in revenue, but something that I wonder though is typically things that you cannot see tangibly [00:24:00] require a lot more decision making time to make an investment like this.

Whereas, like, servers, you know, like, okay, those are tangible, but I understand the cost of them. I understand the value that they bring to our company. Like, literally, we can't go down without, like, we cannot go down, right? But if you, like, I mean, in the world of brand and marketing and advertising, it's always like, but what can engagement give to us?

You know, like, what will this generate lead wise? Because you can't see it. But how is this such an easy conversation to have? How is this such an easy, you know, yes to make? I would love to know that. Like I would love to know, like from a business owner's perspective, like what is going through your mind when you're like assessing the risk of a $200,000 investment like this and that versus the outcomes that you have?

Like are you comfortable with knowing that you're probably going to be spending more money through the phased approach if [00:25:00] you go about it this way? Are you already like projecting that you're going to be making your money back? Like what makes this tangible for your company versus maybe some other more elusive things that could really make or break a business

And I'm not saying like compared against marketing, that's just like something that like is top of mind. There's always that argument of, I'm not going to spend that money on, you know, an ad on a what-if. But this is literally a what-if too.

Alyssa McGinn: Is that an acronym, what's my CAC?

Jordan Walker: Yeah.

Alyssa McGinn: Did I get that one right?

Jordan Walker: Oh man.

Alyssa McGinn: the story.

That's a wrap on this episode.

Jordan Walker: I liked the story time.

Story time.

Alyssa McGinn: I like understanding things with stories and I feel like hopefully that's more relatable than us just talking about data and AI and actually bringing that down to like real world scenarios. So I'm sure there will be way more conversation to come about this topic. I don't want to be an AI podcast, but [00:26:00] we started as a data podcast and data is AI. You'll hear about it.

But that's not all we're going to talk about in 2025. So, if you like this, please subscribe. That's the best way that we can, you know, grow this podcast and share with other people and comment. I think we're on YouTube and all the other places you can get podcasts. So, love to have your engagement there and we'll catch you in the next episode.