Data Governance

What Data Governance is, why you need it, and best practices

In this article, we provide definitions, frameworks, and practical advice to help you understand and perform modern data governance.

What is Data Governance?
Data governance refers to the set of roles, processes, policies, and tools that ensure proper data quality throughout the data lifecycle and proper data usage across an organization. Data governance allows users to more easily find, prepare, use, and share trusted datasets on their own, without relying on IT.

Why is it important?
The primary benefit of data governance is providing the high-quality data necessary for data analytics and BI tools. The insights gained from these tools result in better business decisions and improved performance. Additional benefits include:

  • Improved data accuracy, completeness, and consistency
  • Prevention of data misuse
  • Agreement on common data definitions
  • Removal of data silos between departments and systems
  • Increased trust in data for analytics and decision making
  • Easier data discovery, which makes data more widely available
  • Better compliance with data privacy laws and other government regulations such as the EU General Data Protection Regulation (GDPR) and the US Health Insurance Portability and Accountability Act (HIPAA)

In addition, one of the top 10 BI and data trends this year is that regulations are now combining data management, security, privacy, and identity and access management. So, security and governance have become a top priority, especially as you share APIs and data with partners.

Data Governance Framework
The three main components of a data governance framework are people, process, and technology.

PEOPLE: For your governance program, you should consider including the following roles:

  • Steering Committee: Made up of the Chief Data Officer (and/or the head of IT) and executives from each business unit, this group sets the usage policies and data standards. The committee also defines the mission statement and goals for the program, as well as how its success will be measured.
  • Governance Team: Led by a data governance manager, this team implements and maintains the systems and tools. It’s typically composed of data architects and other governance specialists from the IT department.
  • Data Stewards: This team manages the datasets, enforces the rules, and supports the day-to-day data needs of the business.

PROCESS: You’ll also need formal processes (or activities) to ensure consistent execution and enforcement of the usage policies and data standards set by the steering committee. These processes can be described in flow charts that make the inputs and tasks clear for each use case.

TECHNOLOGY: As the name suggests, this component refers to the tools and techniques used to efficiently maintain and manage the security, integrity, lineage, usability, and availability of data. Modern tools can automate most aspects of managing a governance program. For example, a governed data catalog profiles and documents every data source and defines who in an organization can take which actions on which data.
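To make the "who can take which actions on which data" idea concrete, here is a minimal Python sketch of a catalog entry with role-based permissions. It is an illustration only: the `CatalogEntry` class, the `is_allowed` helper, the role names, and the `sales_orders` dataset are all hypothetical and do not reflect the API of any particular catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One documented data source in a hypothetical governed catalog."""
    name: str
    description: str
    # Maps a role name to the set of actions that role may perform.
    permissions: dict[str, set[str]] = field(default_factory=dict)


def is_allowed(entry: CatalogEntry, role: str, action: str) -> bool:
    """Return True only if the role was explicitly granted the action."""
    return action in entry.permissions.get(role, set())


# Hypothetical entry: names, roles, and actions are assumptions.
sales_orders = CatalogEntry(
    name="sales_orders",
    description="Order lines loaded from a hypothetical ERP system",
    permissions={
        "data_engineer": {"read", "write", "publish"},
        "data_steward": {"read", "approve"},
        "data_consumer": {"read"},
    },
)

print(is_allowed(sales_orders, "data_consumer", "read"))
print(is_allowed(sales_orders, "data_consumer", "write"))
```

The sketch denies by default: a role or action that is not listed is refused, which is usually the safer starting point for a governance policy.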

Please watch a 2-minute video here that describes how data engineers, data stewards, and data consumers work with a data catalog as part of a robust data governance process.

Data Governance Best Practices

While you set up the framework described above, keep in mind these three best practices to ensure you’re successful right out of the gate.

Write a glossary
Developing a data glossary (or dictionary) which defines the business terms and concepts you use in your organization will give you consistent business context across multiple tools. For example, everyone should be clear on what qualifies as a “Marketing Qualified Lead” or an “Inactive Customer”.

Map and classify your data
Mapping where your data resides will help you know which system it’s in and how it flows through your organization. Classifying your datasets based on considerations such as privacy or sensitivity determines how your policies are applied to each dataset.
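As a rough illustration of classification driving policy, the Python sketch below labels columns by name and then picks a policy for the whole dataset. The name patterns, the two labels, and the policy wording are assumptions made for this example; a real program would use the classifications and policies defined by its own steering committee.

```python
import re

# Hypothetical patterns: column names that suggest personal data.
SENSITIVE_PATTERNS = [r"email", r"phone", r"ssn", r"birth", r"address"]

# Hypothetical policy applied to each classification label.
POLICIES = {
    "sensitive": "mask values and restrict access to approved roles",
    "internal": "available to all employees",
}


def classify_column(column_name: str) -> str:
    """Label a column 'sensitive' if its name matches any pattern."""
    lowered = column_name.lower()
    if any(re.search(pattern, lowered) for pattern in SENSITIVE_PATTERNS):
        return "sensitive"
    return "internal"


def classify_dataset(columns: list[str]) -> str:
    """A dataset is treated as sensitive if any of its columns is."""
    labels = {classify_column(column) for column in columns}
    return "sensitive" if "sensitive" in labels else "internal"


label = classify_dataset(["order_id", "customer_email", "order_total"])
print(label, "->", POLICIES[label])
```

Matching on column names is only a first pass; it misses sensitive values stored under neutral names, so it is best combined with review by a data steward.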

Establish a data catalog
Building a clear, use case-based data catalog lets you make different kinds of data available to different kinds of users quickly, without increasing risk. Data catalogs give you an indexed inventory of available data assets, along with data lineage information, search functions, and collaboration tools.

The Role of Data Lineage
Data lineage refers to the process of tracking all changes made to data during its journey from the source to its current location. Data lineage tools help you understand and visualize these changes and data flows, so that you know where any data element came from, how it was split and merged with other data, and what transformations were applied to it.
So, in a data governance framework, a data steward or data engineer would use a data lineage visualization, similar to the example below, to know that they can trust the data and/or to trace any error back to its root cause.
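Tracing a data element back to its origins can be sketched as a walk over a small graph. The Python example below is a simplified illustration, not how any specific lineage tool works: the dataset names and the `upstream` and `sources` helpers are hypothetical.

```python
# Hypothetical lineage: each dataset maps to the datasets it was derived from.
LINEAGE = {
    "revenue_dashboard": ["monthly_revenue"],
    "monthly_revenue": ["clean_orders", "exchange_rates"],
    "clean_orders": ["raw_orders"],
    "raw_orders": [],
    "exchange_rates": [],
}


def upstream(dataset: str, lineage: dict[str, list[str]]) -> list[str]:
    """Return every ancestor of `dataset`, nearest first (breadth-first)."""
    seen: list[str] = []
    queue = list(lineage.get(dataset, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # already visited through another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


def sources(dataset: str, lineage: dict[str, list[str]]) -> list[str]:
    """Return the ancestors that have no parents, i.e. the original sources."""
    return [d for d in upstream(dataset, lineage) if not lineage.get(d)]


print(upstream("revenue_dashboard", LINEAGE))
print(sources("revenue_dashboard", LINEAGE))
```

If a figure on the hypothetical `revenue_dashboard` looks wrong, the list from `upstream` gives the order in which to inspect intermediate datasets, and `sources` shows where the data originally entered.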

The Key Challenge: Balancing Speed and Risk
Data governance has traditionally focused on managing finalized data, such as metrics for the financial close, regulatory reporting, and key performance indicators. This type of data requires formal definitions and high data quality.
Today, however, data science and advanced analytics often use raw or semi-processed data, which creates tension between data providers and data consumers. Providers work hard to make data available to everyone responsibly, without exposing the business to risk. Consumers, on the other hand, want the data immediately for their projects.
The tiered system shown below offers a solution to this challenge. The funnel addresses the different needs of users through different types of data, applying an increasing level of control and quality standards as data moves through the system.

This system helps the enterprise governance function focus on a breadth of understanding across the enterprise, including restricting access to sensitive data, as well as a depth of understanding for a smaller number of critical data assets.
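The idea of applying more control as data moves through the system can be sketched in Python as a set of tiers, each requiring more checks than the one before. The tier names and check names below are assumptions invented for this illustration; the article does not specify them.

```python
from typing import Optional

# Hypothetical tiers, ordered from least to most controlled.
TIERS = ["raw", "prepared", "certified"]

# Hypothetical checks required to enter each tier; each tier adds to the last.
REQUIRED_CHECKS = {
    "raw": {"registered_in_catalog"},
    "prepared": {"registered_in_catalog", "schema_validated"},
    "certified": {"registered_in_catalog", "schema_validated", "steward_approved"},
}


def highest_tier(passed_checks: set[str]) -> Optional[str]:
    """Return the most controlled tier whose required checks all passed."""
    result = None
    for tier in TIERS:
        if REQUIRED_CHECKS[tier] <= passed_checks:  # subset test
            result = tier
        else:
            break  # a dataset cannot skip a tier
    return result


print(highest_tier({"registered_in_catalog"}))
print(highest_tier({"registered_in_catalog", "schema_validated", "steward_approved"}))
```

In this sketch, data consumers who need speed can work with lower-tier data soon after it is registered, while only data that has passed every check is treated as fit for formal reporting.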

Article source: Qlik blog.
