Data expert Wouter Neef from Data Booster describes five common challenges in data democratization and how to overcome them.
This decade is seeing the emergence of a new data layer in modern enterprises. Companies are morphing into data-driven organizations in which the majority of employees become data citizens. Data are no longer a well-kept secret guarded by analytics and engineering experts but can be used by any member of the organization. This process of making data available across an organization is commonly called data democratization.
In this article, data expert Wouter Neef (CEO, Data Booster) describes five common hurdles he has observed in data democratization. Wouter does not only point out problems but also presents potential solutions, saving you a tremendous amount of time on your data journey.
Wouter Neef: It is clear that data privacy legislation and data security risks require organizations to restrict access to data. As a result, democratizing employees’ access to the company’s data can be controversial. However, if access to data is limited to data teams only, the dependency on those teams becomes too high. Organizations should avoid creating this dependence. Instead of routing every data question through a team of data analysts, self-service for decision-makers should be promoted. This enables decision-makers to answer quick questions themselves instead of passing them on as low-value and unattractive work to data experts. Instead of spending up to 70% of their time on operational work, data analysts, as well as the rarer data scientists and engineers, can focus on higher-value tasks.
Wouter Neef: Once data are available, you need to make sure that they are always up to date. You don’t want to drive your decisions with last week’s data, do you? The traditional approach to making data available is batch-based: from time to time, ETL pipelines load data from source systems into the systems where they are consumed, e.g., data warehouses. Since each run of an ETL pipeline touches all the data, pipelines cannot be executed too often, which leaves the consumed data outdated most of the time. When designing your data architecture, you might consider modern event streaming technologies, such as Apache Kafka, for implementing your data pipelines instead of relying on rusty and inefficient batch approaches. The combination of event streaming with change data capture not only uses compute resources much more efficiently but also ensures that data consumers can always work with fresh data.
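The freshness difference can be sketched in a few lines of plain Python. This is an illustrative toy, not a real Kafka or CDC setup: a batch copy is a point-in-time snapshot that drifts stale, while a copy that consumes change events stays current. All names and the event shape are assumptions made for the example.

```python
def apply_cdc_event(table, event):
    """Apply a single change event ('insert', 'update', or 'delete') to a table dict.

    The event shape here is made up for illustration; real CDC tools
    (e.g., Debezium) emit richer records with before/after images.
    """
    op, key, row = event["op"], event["key"], event.get("row")
    if op == "delete":
        table.pop(key, None)
    else:  # inserts and updates carry the full new row
        table[key] = row
    return table

source = {1: {"name": "Alice", "active": True}}
warehouse_batch = dict(source)   # snapshot from the last full ETL run
warehouse_stream = dict(source)  # kept fresh by consuming change events

# A change happens in the source system and is emitted as an event.
source[2] = {"name": "Bob", "active": False}
apply_cdc_event(warehouse_stream, {"op": "insert", "key": 2, "row": source[2]})

print(warehouse_stream == source)  # True: the streamed copy is already up to date
print(warehouse_batch == source)   # False: the batch copy waits for the next ETL run
```

Note how the streamed copy only processes the one row that changed, which is exactly why combining event streaming with change data capture is cheaper than re-copying everything on every run.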
Wouter Neef: Your data enables your organization to derisk important decisions, but what if your decision-makers don’t know where to find the right data? Or what if they don’t agree on the definition of your key metrics? Whether you call it a Databook (Uber), a Dataportal (Airbnb), or a DataHub (LinkedIn), when your organization is large or growing, what you need is a fast and easy way for your decision-makers to identify the data they need for their decisions. Such a metadata hub can search through your entire data landscape and help you find the right table among the thousands of tables available. The information in the metadata hub can include, for instance, column descriptions and the team responsible for each dataset. But a metadata hub alone is not enough. Businesses depend on metrics and KPIs to track company performance and determine company success. What if your teams disagree on the definition of an active customer? Or what if that definition is hidden away in a spreadsheet or on a wiki page? Your decision-makers would waste a lot of time making sense of it. A centralized metrics hub is not just nice to have: it empowers your organization to deliver accurate and fast insights.
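A minimal sketch of the idea, assuming a metrics hub is, at its core, a single registry mapping each metric name to one agreed-upon definition and owner. All metric names, fields, and the SQL string below are hypothetical examples, not anyone's production schema.

```python
# Hypothetical centralized metrics registry: one place where "active_customer"
# means exactly one thing, instead of competing definitions in spreadsheets.
METRICS = {
    "active_customer": {
        "definition": "a customer with at least one order in the last 30 days",
        "sql": (
            "SELECT COUNT(DISTINCT customer_id) FROM orders "
            "WHERE order_date >= DATE('now', '-30 days')"
        ),
        "owner": "analytics-team",
    },
}

def lookup_metric(name):
    """Return the canonical definition of a metric, or fail loudly if undefined."""
    if name not in METRICS:
        raise KeyError(f"No agreed definition for metric '{name}'")
    return METRICS[name]

metric = lookup_metric("active_customer")
print(metric["owner"])  # every consumer sees the same owner and the same SQL
```

The design point is the failure mode: an undefined metric raises an error instead of silently letting each team invent its own version.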
Wouter Neef: Data will become the operating system of companies in the upcoming decade, across all industries. It’s no surprise that data literacy is one of the most sought-after competencies in the business world these days. To avoid misinterpretation of data, the knowledge and skills needed to work with data are important for everyone, not only data analysts and data scientists. Professionals in your organization should be able to ask the right questions and know how to use your data to support decision-making. Whether they work with data directly through SQL or use tools like Tableau or Looker, they should understand the basics before making data-informed decisions. But knowing the tools and technology is not enough. Decision-makers should understand the data tables they use for decision-making and how the data feeds into the organization’s key metrics. Teaching your employees about the data sources they will have access to, and about the tables they will use the most, results in more data-driven decision-making. And don’t forget: the simpler these data sources are, the easier it is to train your professionals.
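The level of SQL literacy meant here is modest. As a hedged illustration, the snippet below uses Python's built-in sqlite3 with a made-up orders table to answer a typical quick question, how many customers were active after a cutoff date, without involving a data team. Table and column names are invented for the example.

```python
import sqlite3

# In-memory toy database standing in for a real warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-05-01"), (1, "2024-05-03"), (2, "2024-04-28"), (3, "2024-01-15")],
)

# "Active customer" here: anyone with at least one order on or after a cutoff date.
cutoff = "2024-04-01"
active = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders WHERE order_date >= ?",
    (cutoff,),
).fetchone()[0]
print(active)  # 2: customers 1 and 2 ordered after the cutoff, customer 3 did not
```

A decision-maker who can read this one query, a filter, a distinct count, a shared definition of "active", can answer the question themselves instead of filing a ticket.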
Wouter Neef: In the last couple of years, we have seen many enterprises appoint their first Chief Data Officer (CDO) and make data a C-level priority. In 2012, just 12% of Fortune 1000 companies had a CDO; by 2018, 67.9% of surveyed firms reported having one, which represents significant progress. While appointing a CDO is a first step in the right direction, you must make sure that the rest of your leadership team is breathing data, too. Show them the importance of data democratization and educate them on how to turn data into value, so that everyone in your organization is aligned on your data strategy for the upcoming years.
Giving every employee in your organization access to data is easy; however, this is not the end goal of data democratization. To create true value, you must make sure that your employees have access to up-to-date data on which they can base their decisions. Beyond that, make sure they know where to find the right data sources and understand what the data is about. Most importantly, rally your leadership behind your data democratization strategy to incentivize and motivate employees to use data in their decision-making.