DataGeeks

A BMI Blog

How old legacy data can ruin your legacy


There's a certain kind of person out there who reveres a moldy old dishwashing sponge as if it were the Holy Grail. It's the same kind of person whose walls are covered in a thrift store trove of velvet clown paintings. Whatever the object, what starts as a minor addiction and a few bits of manageable clutter soon becomes a mountain of every "Life" magazine ever printed, blocking the way from the living room to the dining room. Pretty soon the dirty dishes pile up and the rat infestation begins. You've probably seen some of these horrific collections on an episode of A&E's "Hoarders".

For those not familiar with the show, "Hoarders" is an inside look at virtual shut-ins who hold on to every scrap of anything they've ever accumulated until their living spaces are condemned. Part of the appeal of the show is how foreign it seems. Sure, we all have a pile of dirty laundry that needs tending or a batch of unopened mail in the corner, but we can handle it. We can still walk through our houses. So we feel good about ourselves and wonder how those people ever got in so deep or fell so far.

Congratulations, your house is relatively clean. But that's the stuff you can see. What about the seemingly small stuff, the bits and bytes, or terabytes, of information pulsing through your server closet at work?

Based on our experience working with companies that maintain huge product databases, we know that if you're a CIO, data manager, or systems specialist, you may have a burgeoning hoard on your hands even though you think you're in the clear. It's easy to ignore. The data for hundreds of thousands, if not millions, of products fits on tiny hard drives. Out of sight. Out of mind.

But old, unsupported, and unused information is far worse than any month-old pile of pizza boxes. Lose your house and you're out a few hundred thousand dollars. Let information on a few thousand products that are no longer supported by a manufacturer accumulate year after year, and soon you have millions of useless data points in your system that could cost your company millions of dollars.

How does a glut of bad legacy data happen?

With the growth of the internet, there's been an arms race, especially among e-retailing and distribution companies, to sell the most stuff out there. Many managers boast about their product or SKU counts the way Ivy League-bound seniors trade SAT scores.

"Hey, we have 500,000 products."

"Oh yeah, we sell 525,633."

So what if ten percent of those items, along with their supporting data and merchandising information, represent discontinued or old items? No harm. No foul. Sure, that's what pro cyclists thought a few years ago as they injected themselves with EPO and growth hormone.

Despite the harm, we sort of get this behavior. It's perceived as a competitive advantage. What we don't understand, and this is often the rule, not the exception, is the fear upper-level managers have of deleting any product information, no matter how useless. These managers are scared they might get rid of useful data while purging old information. Some reason that the products they delete might one day be reinstated, and that keeping the old data around will save the time and effort of reentering it. It's possible. Electronica and synthesizer music are all the rage these days, and you see a pair of bell bottoms from time to time. But information on deleted or unsupported products is a lot more like disco or super wide ties. It probably ain't comin' back. Ever.

What's the harm? 

So you're loyal to your data, bad and good. You stick with it like Jay Leno and The Tonight Show.  So what?

If you don't segregate or remove old or bad data from the good data, sooner or later you won't be able to tell the two apart. Once they're mixed, they're tough to separate. If you're a catalog supply company, some of this bad data might make it into your printed materials. If you're an e-commerce operation, some of this information might make its way back onto the website. Customers might then be able to order products that no longer exist. Once you discover the mistake, you'll have to contact each customer and tell them you no longer have the item. Your customer service team may have to process a return, issue a credit, or spend precious time dealing with a frustrated customer. In some rare cases, a customer might be able to get hold of some discontinued stock you have on hand. This is fine if the item wasn't discontinued because of a product recall. Send out a defective product that hurts someone, though, and you might have a lawsuit on your hands.

Internally, if you can't separate the good data from the old and bad data, many of your data quality processes will take longer than expected, because your teams will have to slog through all the data, not just the good data. They might spend lots of time chasing suppliers for information on unsupported products that no longer exists.

Externally, if you hire a company like Bytemanagers to optimize your data, you could end up paying a lot of money to have us clean up information you no longer support.  Or, even worse, you might just pay us a lot of money to cleanse these items once and for all.  If you'd been vigilant along the way, you could have saved some dough.

Don't get us wrong. We love your business. But it's a lot more rewarding for us to focus on improving the information that makes you money and rewards your customers.

What to do? 

If you've got separation anxiety and just can't bear to part with data, despite everything we've outlined, at the very least put a process in place that clearly marks data that has been removed from active status on your websites, catalogs, and databases. Being able to separate the good from the bad is the first step.
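For the code-minded, here is a minimal sketch of what that kind of marking could look like. It assumes each product record carries an explicit status field; the `Status` values, the `Product` fields, and the `split_by_status` helper are made up for illustration and are not taken from any particular system.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical status values; a real catalog may need more states.
    ACTIVE = "active"
    DISCONTINUED = "discontinued"


@dataclass
class Product:
    sku: str
    name: str
    status: Status = Status.ACTIVE


def split_by_status(products):
    """Return (active, inactive) lists so the two never mix downstream."""
    active = [p for p in products if p.status is Status.ACTIVE]
    inactive = [p for p in products if p.status is not Status.ACTIVE]
    return active, inactive


catalog = [
    Product("A-100", "Velvet clown painting"),
    Product("B-200", "Super wide tie", Status.DISCONTINUED),
]
active, inactive = split_by_status(catalog)
```

The point of the sketch is only that websites, catalogs, and cleanup jobs would read from `active`, while `inactive` stays parked where nobody can order from it by accident.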

If you're certain that the information will never be useful, delete it right away and remove it from your active databases. If you're conservative, set up a process where deleted or discontinued information gets removed automatically after a one-, two-, or five-year waiting period.
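A waiting-period purge can be sketched in a few lines. This is an illustration under assumptions, not a prescription: the two-year `RETENTION_DAYS`, the `discontinued_on` field, and the `purge_expired` function are all hypothetical names chosen for the example.

```python
from datetime import date, timedelta

# Assumed two-year waiting period; pick whatever your policy calls for.
RETENTION_DAYS = 2 * 365


def purge_expired(records, today, retention_days=RETENTION_DAYS):
    """Keep active records and discontinued ones still inside the waiting period."""
    cutoff = today - timedelta(days=retention_days)
    return [
        r for r in records
        if r["discontinued_on"] is None or r["discontinued_on"] > cutoff
    ]


records = [
    {"sku": "A-100", "discontinued_on": None},               # still active
    {"sku": "B-200", "discontinued_on": date(2009, 6, 1)},   # past the waiting period
    {"sku": "C-300", "discontinued_on": date(2011, 3, 15)},  # still waiting
]
kept = purge_expired(records, today=date(2011, 12, 1))
```

Run on a schedule, something like this would let discontinued items age out on their own instead of waiting for someone to work up the nerve to hit delete.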

If you maintain a vigilant effort against data creep, you'll save your company precious dollars and time. At the very least, if you decide to start selling a line of cool velvet clown paintings, industrial trash bags, or insect repellent to the hoarders on TV, you'll have plenty of space in your databases to add the new products.

2 comments for “How old legacy data can ruin your legacy”

  1. Carrie
    Posted Monday, December 19, 2011 at 10:12:41 AM

    Nice post

  2. Johns Moritz
    Posted Thursday, January 26, 2012 at 1:30:08 AM

    Adding the LinkedIn Share button to your blog posts is as easy as adding the Tweet or Like button, and may be a good way to increase traffic and readers to your blog from the professional social network.
