Spring Cleaning your Data

April is the month which ushers in the arrival of springtime, and sees an end to those cold winter days for another few seasons. April is also when our minds turn toward the annual chore of spring cleaning. You know the drill – start at the top of each room, and work down; clear out all of the clutter and reorganize; give floors, windows and baseboards a good scrubbing; polish up furniture and fixtures to make them shine; and maybe even re-energize the space with a fresh coat of paint. This is the very same approach that we should use to "Spring Clean" the data in our data warehouses, and to do this on an annual basis.

Top-Down Approach

Much like we do when cleaning where we work, one room at a time and work top to bottom, let's start with a single subject area for our data, such as finance, manufacturing, distribution, inventory, projects, services, and so on. Within that subject area, we need to evaluate it starting at the top-level and working into successive levels of detail:

  • Are we capturing and storing the right type of data in our data warehouse?
  • Are we really using this data in our business intelligence initiatives, or are we a "hoarder"; storing it in case we might find some use for it later?
  • Do we have "dark data" out there, which does have business value, but which we've never "lit up" with analytics?
  • What additional types of data might be useful to have for analysis that we're not storing?
  • Who uses this data, and what else do they do with this data that we might be able to optimize?

Declutter and re-organize

On at least an annual basis, we should de-clutter our data by removing data elements from our warehouses that have outlived their usefulness. This would include subject areas that we no longer evaluate or report on, lines of business that we no longer manage, and data that has exceeded its shelf-life, and is too old to be used by the organization. One approach might be to summarize this detailed data and store the summarized information, then archive and purge the detail; to move the data to lower-cost storage devices; or to simply purge it, if appropriate. This is also a great time to reorganize your data.

This might include:

  • Performing health checks and tuning to improve database performance
  • Partitioning your data and optimizing indices for faster access
  • Evaluating scheduled jobs to determine if there is any overlap
  • Benchmarking performance of ETL and other scheduled jobs, and adjusting where necessary
  • Archiving jobs that are no longer needed
  • Assessing the performance of our dashboards, KPIs and reports, to determine if performance improvements are needed
  • Decommissioning any reports, KPIs, or dashboards that are no longer useful in shaping performance or driving the business.

Scrub, scrub, scrub

Much like we might spend time scrubbing our floors to rid them of the built-up wax, dirt and grime, let's do this with our data as well. Do we have individual data elements that are no longer used, such as sales people who have moved on, product families that we no longer carry, or territories we no longer serve? If so, this is a great time of year to remove them from your environment, to streamline your current reporting by not presenting users with massive lists of selection criteria and filters that are antiquated. This is also an opportune time to evaluate the overall quality of our data, and determine if some type of governance or master data management is needed to ensure high-quality data over the coming year, to identify and review errors or inconsistencies in the data, and develop a plan to remediate it or prevent the errors from being introduced at the source.

Break out the polish

We shouldn't only focus on polishing up our woodwork during our spring cleaning. We should also look at adding some polish to any of our digital assets to shine them up. This might include changes to some dashboards, scorecards, and reports to simplify or reorganize the presentation of data to make it more meaningful, or to take advantage of some advancements in technology; it might mean adding more functionality to our business intelligence environment, to reduce some of the manual processing and to provide more value to the user; or simply putting a new "skin" on our existing dashboards and analytics to brighten them up, and give them a newer appearance.

Some fresh paint

Lastly, let's look at freshening things up with some paint, or in the case of your data, with some new dashboards and analytics. We shouldn't just polish up our existing dashboards and analytics, but should also consider what additional dashboards, KPIs, or analytics can provide our organization with an up to date, at-a-glance view of the business without requiring anyone to run a report, massage data in excel, then manually create and format a report for distribution; where we can replace those manually created pivot tables with interactive dashboards; where we can transform all of the sorting and filtering that we do in Excel with true self-service business intelligence capabilities; and possibly even provide our organization's leadership with the ability to view all of the rich analytics we have developed through a mobile device.

As is the case with spring cleaning of our homes, this process takes a little extra effort and some focused time. When complete though, the results will be very rewarding.

Next Post