The five W's and one H (Who, What, Why, Where, When, and How) are classic writers' tools. When I was a news editor back in college, asking those questions helped me flesh out an otherwise thin story, moving a piece from "maybe next week" to a successful front page. As a trial attorney, building a case around them gave the jury a compelling hearts-and-minds reason to adopt my argument. And now, in my role as a data governance strategist, I recognize that organizations can also benefit from asking the five W's and one H questions; each is essential to developing a comprehensive data-governance program that delivers its expected value.
The following provides a starting point for how each question should be addressed by any organization seeking to improve the accessibility, understanding, and compliance of its data through a robust data-governance program.
Who Drives a Successful Data Governance Program?
The who of successful data governance begins with leadership from the top (the business side), enthusiastic buy-in from the business data users (including owners and stewards), and IT's committed expertise.
Each of these groups needs to understand the value of the program, but in different ways. Leadership needs to understand the value as reflected by increased revenue, decreased costs, and/or controlled risk. Business data users should be excited that improved data governance will help get the right data into their hands to make better, more reliable decisions. IT should understand that governance will lead to clarity with regard to storage/retention protocols, security, and other resources (all leading to predictable budgeting). As icing on the cake, legal will likely be relieved to know the business is gaining a firm grasp on compliance, from Personally Identifiable Information (PII) to industry-specific regulation, especially in the face of a changing privacy landscape.
Actioning the who requires change-management expertise. As Laura Madsen writes in Disrupting Data Governance, "Data Governance is a change agent job, pure and simple." The right people will effectuate the change by incorporating the appropriate vision, messaging, and quick wins before anchoring the new governance program into the company's culture.
What Data Are You Governing?
Knowing what you're governing is similar to answering what a teacher wants to know about her or his class at the beginning of the year: who are my students and how do they shape up against where they should be academically? The data-governance professional must understand both what data is being governed and how well that data "fits" its expected form (ranges, types, etc.). This seems obvious, but how often have you thought you understood all of your company's data, only to find out that there were other data subscriptions only one department knew about, that another department was working off of tables that had been siloed away, or that KPIs tracked in one unit conflicted with those tracked in another?
Once you know what you have, how does it match up against where it should be? A teacher with all students reading at grade level has a very different workload than if only some are at grade level, with the rest reading one or two grades behind or above. Governing data that's generally fit for its intended purpose (higher quality) is a different process than governing data that needs to be cleaned, transformed, or enriched. Knowing what you have and how it fits its intended purpose is essential to keeping a project on track, avoiding time/budget creep, and meeting pre-defined success metrics.
Accomplishing the what (knowing what you have) can involve owner, steward, and business-user interviews; a company-wide data catalog and lineage tool; and/or data-quality profiling tools, among other methods. These methods and tools will help create and define the pool of what exactly you're going to govern.
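To make "fit" concrete, here is a minimal sketch of the kind of check a data-quality profiling tool performs: comparing each column against an expected type and range. The `ColumnRule` class, the `profile` function, and the sample columns are all hypothetical names invented for illustration; they do not represent any particular product's API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ColumnRule:
    """Hypothetical expectation for one column: a type and an optional numeric range."""
    expected_type: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def profile(rows: List[Dict[str, Any]], rules: Dict[str, ColumnRule]) -> Dict[str, float]:
    """Return, per column, the share of rows (0.0 to 1.0) whose value fits the rule."""
    fit = {}
    for column, rule in rules.items():
        ok = 0
        for row in rows:
            value = row.get(column)
            # Missing values and wrong types both count as "does not fit".
            if not isinstance(value, rule.expected_type):
                continue
            if rule.min_value is not None and value < rule.min_value:
                continue
            if rule.max_value is not None and value > rule.max_value:
                continue
            ok += 1
        fit[column] = ok / len(rows) if rows else 0.0
    return fit


# Illustrative sample data (invented for this sketch).
rows = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": -5},      # age out of range
    {"customer_id": "3", "age": 51},    # id stored as text
    {"customer_id": 4, "age": None},    # age missing
]
rules = {
    "customer_id": ColumnRule(int),
    "age": ColumnRule(int, 0, 120),
}

report = profile(rows, rules)
print(report)
```

A report like this gives a rough sense of which data is already fit for purpose and which needs cleaning before it can be governed with confidence.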
Why is it Important to Govern Data?
Why should be at the top of the list, both in terms of importance and getting started. The facts of a story can be gleaned in seconds, but the why drives it; it's where human insight, analysis, and emotion breathe life into words. In your data governance journey, the why is everything: it's the reason a data governance budget gets approved, the reason leaders support the change effort, and the reason users and practitioners buy in and create lasting change.
Madsen distills the why in Disrupting Data Governance: "The only reason we do data governance is so people can actually use the data." How that essential purpose of using the data is put into practice will vary for each enterprise; understanding the specific needs of the enterprise and the business' data users (insufficient or incomplete data, broken reports, noncompliant data, risk and compliance, unmonetized data, etc.) will not only guide you to setting success metrics, but also to having a successful data-governance program.
Putting the why into action requires vision, leadership, and pragmatism. The visionary needs to create a believable, accomplishable narrative about where the program can be and what benefits it can achieve for the enterprise. The leader needs to get the business on board with the vision, and to effectuate change to bring the vision to fruition. The pragmatic implementors (possibly also in the role of the leader) need to meet the benchmarks and socialize the successes; these will keep the program moving and the budget flowing.
Where is Your Data?
Knowing where your data is, in all of its various stages, is critical to executing governance over it. For example, while governing cloud-based and on-prem data is relatively similar, security and access may require different policies, especially if your enterprise uses more than one cloud provider. Similarly, mapping all of the changes your data has undergone from ingestion to production (its lineage) is essential to allowing your analysts and data scientists to fully understand data quality and the conclusions derived from the data: Has the data been cleaned? To what extent? Was it transformed or combined with other data? What issues arose during that process?
Moreover, knowing what personal or confidential data is retained, why, and where is a key component of an organization's privacy program. If you need to retain any unmasked PII, you may mitigate the associated risk by ensuring it's stored in a compliant, secure location. Likewise, inadvertently or unknowingly moving data from one where to another can trigger alerts and/or penalties, depending upon applicable internal, contractual, or regulatory controls. Suffice it to say, knowing where your data flows to and from, and where it's at rest, is crucial to governing it properly. Discovering the where may be implemented in a similar fashion to the what, through interviews and profiling tools.
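As a rough illustration of what lineage captures, the sketch below records each step a dataset passes through and walks back upstream to answer the questions above. The `LineageStep` class, the `trace` function, and the dataset names are hypothetical; real lineage tools collect this information automatically and in far more detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LineageStep:
    """One hypothetical lineage record: what was produced, how, and from what."""
    output: str
    operation: str
    inputs: List[str]
    notes: str = ""


def trace(dataset: str, steps: List[LineageStep]) -> List[LineageStep]:
    """Walk upstream from `dataset` and return its steps, sources first."""
    by_output = {step.output: step for step in steps}
    ordered: List[LineageStep] = []
    seen = set()

    def visit(name: str) -> None:
        step = by_output.get(name)
        if step is None or name in seen:
            return
        seen.add(name)
        for parent in step.inputs:
            visit(parent)
        ordered.append(step)

    visit(dataset)
    return ordered


# Illustrative pipeline (invented for this sketch).
steps = [
    LineageStep("raw_orders", "ingest", []),
    LineageStep("raw_customers", "ingest", []),
    LineageStep("clean_orders", "clean", ["raw_orders"],
                notes="dropped rows with a null order_id"),
    LineageStep("orders_enriched", "join", ["clean_orders", "raw_customers"]),
]

history = trace("orders_enriched", steps)
for step in history:
    print(step.output, "<-", step.operation, step.inputs, step.notes)
```

An analyst reading this history can see that the production table was cleaned, what the cleaning removed, and which sources were combined to build it.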
When Should You Get Started?
Ideally now; but you have to know how to start. Building the initial use case can be driven by several motivating factors, from risk (hopefully before a breach or fine), to confusion (conflicting reports or missing data), to budgeting issues, and beyond. Whatever the motivator, the when should typically be as soon as possible, but ultimately depends upon funding and support from the business.
Initial funding should focus on gap and maturity analyses. These will highlight risks from the present state and the potential benefits from instituting a governance program, as well as how mature your existing governance controls are across the company. These analyses will also clarify areas that are ripe both for quick wins and long-term gains, which will help get the formal program off the ground and keep it moving. Success can still follow when circumstances limit these analyses or early governance programs to only one or two departments; in fact, quick wins there may be more easily achievable and can help broaden the program to additional departments.
How Do You Make it Happen?
As described, data governance is a people-driven change process that requires human vision, interaction, leadership, and eventually, consensus. Governing data means facilitating relationships and agreements between the people who use the data and those who build and support its infrastructure, pipelines, and security. Installing a data catalog or implementing data-quality standards provides valuable insight into, or management of, data. However, successful, long-term data governance requires cross-departmental relationships, a committed network of data owners and stewards, and ongoing support from business leadership, all with the ultimate goal of getting the right data to the right people in the right way.
Nevertheless, "What tool are you using for that?" has to be the most ubiquitous question at any gathering of data-governance professionals, and there's no shortage of answers. Hundreds of tools purport to "do data governance" or various components of it, including data catalogs, data profiling, data-quality monitoring, pipeline transparency, privacy masking and anonymizing, data lifecycle management, security and access controls, and more. Your organization's overall data strategy, budget, and culture will play the most significant role in which tools are compatible or preferable.
Take the time to investigate and determine which tool(s) would best serve your program. The investment in these tools is typically significant, both in purchase dollars and in the time needed across the enterprise to train users and achieve a level of competency in their use. Jumping from tool to tool is also a surefire way to lose ROI, given the cost of training and of achieving lasting buy-in from users and administrators. While governance is not a tool, tools certainly help get governance done!
* * *
Instituting a successful governance program is difficult. Incremental steps with small wins will help, along with answering the questions above. Most of all, developing the relationships that support engagement and buy-in will ease the burden and build long-term success. Partnering with the right experts in data governance will also help you implement a high-ROI data-governance program with staying power.