Data Catalog
Get the Data Governance eBook: https://www.qlik.com/us/resource-library/deliver-data-with-confidence
What is a Data Catalog?
A data catalog is an inventory of data assets that uses metadata, data management, and search tools to provide on-demand access to business-ready data. A data catalog not only provides an inventory of all available data, it also connects datasets with rich information to help you find the data you need and evaluate its fitness for your particular use case.
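As a rough illustration of the idea, the sketch below models a catalog as an in-memory inventory of datasets that can be searched by their metadata. The class names, fields, and sample datasets are all hypothetical and chosen only for this example; this is not how Qlik Catalog or any other product is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One catalog record: a dataset plus the metadata that describes it."""
    name: str
    source: str                      # hypothetical label for the originating system
    description: str = ""
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """A toy inventory of datasets, searchable by metadata."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return names of datasets whose name, description, or tags mention the term."""
        term = term.lower()
        hits = []
        for entry in self._entries.values():
            haystack = [entry.name, entry.description, *entry.tags]
            if any(term in text.lower() for text in haystack):
                hits.append(entry.name)
        return sorted(hits)


# Made-up datasets, for illustration only.
catalog = DataCatalog()
catalog.register(DatasetEntry("crm_contacts", "crm", "Customer contact records", {"customer", "pii"}))
catalog.register(DatasetEntry("web_sessions", "web analytics", "Daily site sessions", {"marketing"}))

print(catalog.search("customer"))  # ['crm_contacts']
```

The point of the sketch is that the search runs against descriptions and tags rather than the data itself, which is what lets a user judge a dataset's fitness before opening it.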
Why It Matters
To see why, watch the video presentation: https://videos.qlik.com/watch/9DFq9XWpL6h2w1fmSUyrFF
Why do you need a data catalog?
Your organization is likely flooded by large and complex datasets from many sources: financial systems, web analytics, ad platforms, CRM systems, marketing automation, partner data, and maybe even real-time sources and IoT. Finding the right data and knowing you can trust it is a major challenge in the era of data lakes, big data, and self-service analytics. A data catalog helps you:
- View business metadata and data lineage to improve understanding and trust.
- Apply personalized tags, properties, and business metadata for greater utilization.
- Browse dataset samples and profile statistics to ensure that datasets contain the expected information.
Today, many organizations are taking a product management approach to managing data by building data products. Data products are highly trusted, reusable, and consumable data assets purposefully designed for domain-specific business outcomes. A data product catalog provides a comprehensive repository that organizes and documents the various data products within your organization. It serves as a federated resource, providing detailed information about each data product, including its purpose, data sources, processing methods, and intended audience.
Benefits of a Data Catalog
A data catalog puts all your data into one simplified view where all users can more easily find, understand, and use any enterprise data source to gain insights. This brings your organization a competitive advantage, cost savings, operational efficiencies, and better fraud and risk management.
Here are the key specific benefits of an enterprise data catalog:
- Get data insights faster by having on-demand access to analytics-ready data.
- Trust your data by understanding its lineage: a detailed history showing the original source and journey of the data.
- Make different kinds of data available to different kinds of users quickly, without increasing risk.
- Streamline the transformation of raw data into analytics-ready information assets through automated profiling and metadata tools.
- Make your data more understandable by collaborating with and capturing knowledge from different teams to enhance metadata.
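To make the idea of data lineage in the list above more concrete, here is a minimal sketch that walks upstream from a dataset to the original sources it was derived from. The lineage map and dataset names are invented for the example, and a real catalog would typically record lineage automatically rather than in a hand-written dictionary.

```python
# Hypothetical lineage map: each dataset lists the datasets it was derived from.
LINEAGE = {
    "sales_dashboard": ["sales_clean"],
    "sales_clean": ["sales_raw", "currency_rates"],
    "sales_raw": [],
    "currency_rates": [],
}


def trace_sources(dataset: str, lineage: dict[str, list[str]]) -> list[str]:
    """Walk upstream from a dataset and return the original sources it depends on."""
    sources: set[str] = set()
    seen: set[str] = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = lineage.get(current, [])
        if not parents:
            sources.add(current)   # no upstream parents, so this is an original source
        stack.extend(parents)
    return sorted(sources)


print(trace_sources("sales_dashboard", LINEAGE))  # ['currency_rates', 'sales_raw']
```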
Qlik Catalog
When you empower your business with on-demand access to analytics-ready data, you accelerate discovery and people get answers faster. Qlik Catalog is a modern enterprise data management solution that simplifies and speeds up how you catalog, manage, prepare, and deliver your trustworthy, actionable data to business users across your enterprise.
Qlik Catalog builds a secure, enterprise-scale repository of all the data your business has available for analytics, giving your data consumers a single, go-to catalog to find, understand, and gain insights from any underlying enterprise data source. The solution’s data preparation and metadata tools streamline the transformation of raw data into analytics-ready assets, while the product’s Smart Data Catalog and graphical user interface (GUI) help people easily discover and select whatever data they need. Qlik Catalog is built on a platform of hardened data security with governance capabilities, and you can easily integrate it with any of your other data management tools to gain enterprise-grade scalability, reliability, and performance.

Data Agility and Scale for Next-Generation Analytics
All Your Data in One Simplified View
With Qlik Catalog, data consumers across your organization get on-demand access to business-ready data through a data marketplace, which is an integrated, secure, enterprise data catalog. Flexible options let you decide what data gets copied into the Qlik Catalog collection and what stays in place, while still ensuring all data in the collection is fully documented in the Smart Data Catalog. Support for on-premises, cloud-based, or hybrid deployment lets you choose where you want to source, prepare, and deliver your data.
Actionable Data, On Demand
Qlik Catalog is a self-service platform, and the data you manage in it is always transparent, trustworthy, and business-ready for discovery. Data validation, profiling, and quality measures document the exact content and quality of each data source, while easy-to-use data preparation tools let your users quickly promote data from raw to ready and build their own discovery-ready datasets. With Qlik Catalog, analytics-ready data is available on demand, allowing data analysts, downstream applications, algorithms, or your other data consumers to find and begin using new data in minutes, not months.
Enterprise-Grade Data Management
For your marketplace to scale, enterprise policies, such as data protection and appropriate use, need to be in place at all times. Qlik Catalog facilitates the use of metadata to capture, enforce, and monitor data policies and usage from the time data enters your marketplace. Qlik Catalog delivers on the data marketplace vision by ensuring that your data marketplace includes the full range of essential enterprise-grade data management capabilities, including robust security, governance, performance, interoperability, scalability, and reliability.
Reuse and Collaboration
By preserving and cataloging new datasets and data preparation flows as they’re generated, Qlik Catalog empowers your users to reuse and build on previously created assets in the marketplace. Greater reuse lowers data preparation costs and speeds delivery of new data to your data consumers. Your IT experts, data stewards, and business users can make data collections easier to understand and more useful by sharing their insights regarding particular datasets using business metadata, tags, and blog fields as well as standard and custom-defined properties.

Metadata Makes it Possible
Metadata plays a central role in Qlik Catalog, far beyond documenting data with simple cataloging or classification. In Qlik Catalog, metadata captured in a Smart Data Catalog drives the enterprise data management process and many of Qlik Catalog’s key capabilities.
Metadata is used to structure, document, secure, and manage the data collection, ensuring you have a well-governed marketplace (rather than a swamp). IT teams, data stewards, and data consumers access and use metadata through Qlik Catalog’s GUI to find, understand, enrich, and work with data in the marketplace. Data operations teams use Qlik Catalog’s metadata APIs to automate the integration of Qlik Catalog workflows with other enterprise schedulers, applications, or repositories at scale.
Core Capabilities

Onboard Data
Qlik Catalog automatically profiles and documents the exact content, structure, and quality of your enterprise data as it is brought into the data marketplace from any and all sources. As part of the onboarding process, Qlik Catalog generates rich metadata, allowing your new data sources to be added into the Qlik Smart Data Catalog. You also choose whether the data is added to the Qlik Catalog data storage layer or just kept at the source. Either way, users have full access to the rich data profiles in the catalog. Built-in loaders simplify the onboarding process and support a wide variety of source types and locations, including RDBMS, mainframe applications, flat files, JSON, XML, Parquet, Avro, Qlik QVD files, AWS S3, Azure ADLS/WASB, and Kafka queues.
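For readers unfamiliar with data profiling, the sketch below shows the kind of per-column statistics a profile might contain: row count, null count, distinct values, and the most common value. It is a generic illustration with made-up function names and sample rows, not a description of Qlik Catalog's own profiler or its output format.

```python
from collections import Counter


def profile_column(values: list) -> dict:
    """Summarize one column: row count, nulls, distinct values, and most common value."""
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


def profile_rows(rows: list[dict]) -> dict[str, dict]:
    """Profile every column that appears in a list of row dictionaries."""
    columns = {key for row in rows for key in row}
    return {col: profile_column([row.get(col) for row in rows]) for col in sorted(columns)}


# Made-up sample rows, for illustration only.
sample = [
    {"country": "US", "amount": 120},
    {"country": "US", "amount": None},
    {"country": "DE", "amount": 75},
]
profile = profile_rows(sample)
print(profile["country"])
print(profile["amount"])
```

Storing a summary like this alongside each dataset is what lets users check content and quality whether the data was copied into the catalog or left at the source.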

Enrich the Catalog
Qlik Catalog builds a Smart Data Catalog that documents every aspect of the data and data management process. As your users search and explore the catalog, technical, business, and operational metadata makes each data element understandable, transparent, trustworthy, and actionable.

Prepare the Data
Qlik Catalog makes your data business-ready by preparing and enhancing it with data standardization, cleansing, transformation, and protection measures. Starting with the clean, well-documented “raw” data created through Qlik Catalog’s onboard process, your users can easily standardize, enhance, blend, and filter data using Qlik Catalog’s drag-and-drop interface — no coding needed. This enables a wide range of users, from business users to data scientists, to access the data they need on demand and create data sets that perfectly match their unique analytic requirements.
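The standardize, filter, and blend steps described above are performed through a drag-and-drop interface, but the underlying operations can be sketched in a few lines of code. The example below is a hand-written approximation with hypothetical function names and sample data; it only illustrates what each step does to the rows.

```python
def standardize(rows: list[dict]) -> list[dict]:
    """Trim whitespace and upper-case country codes so values are comparable."""
    return [{**row, "country": row["country"].strip().upper()} for row in rows]


def filter_rows(rows: list[dict], min_amount: float) -> list[dict]:
    """Keep only rows whose amount meets the threshold."""
    return [row for row in rows if row["amount"] >= min_amount]


def blend(rows: list[dict], lookup: dict, key: str, new_field: str) -> list[dict]:
    """Enrich each row with a value looked up by key (a simple left join)."""
    return [{**row, new_field: lookup.get(row[key])} for row in rows]


# Made-up raw data and reference table, for illustration only.
raw = [
    {"country": " us ", "amount": 120},
    {"country": "de", "amount": 15},
    {"country": "DE ", "amount": 75},
]
regions = {"US": "Americas", "DE": "EMEA"}

ready = blend(filter_rows(standardize(raw), min_amount=50), regions, "country", "region")
print(ready)
```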

Shop and Publish
Qlik Catalog makes data available for easy, secure consumption by all types of enterprise data users. It supports one-time exports and the recurring, automated publishing of bespoke datasets to downstream data consumers, including data science or analytic platforms, applications (including Qlik Sense), or cloud data stores. Simple integration with workflow schedulers and built-in event logging and notifications allow Qlik Catalog jobs to be seamlessly integrated into your broader dataflow and application integration schemes. Sensitive fields are obfuscated automatically, so data security is enforced, and Qlik Catalog allows you to specify record layout specs, file format, and more.
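One way to picture metadata-driven obfuscation at publish time is sketched below: fields flagged as sensitive in the metadata are replaced with a one-way digest before rows are handed to consumers. The metadata layout, the field names, and the choice of a truncated SHA-256 digest are all assumptions made for this example. The source does not say which obfuscation method Qlik Catalog uses, and a production system would more likely use salted or keyed hashing, masking, or tokenization.

```python
import hashlib

# Assumption: sensitivity is recorded as field-level metadata (hypothetical layout).
FIELD_METADATA = {
    "customer_id": {"sensitive": False},
    "email": {"sensitive": True},
    "order_total": {"sensitive": False},
}


def obfuscate(value: str) -> str:
    """Replace a value with a short one-way SHA-256 digest (illustrative only)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def publish(rows: list[dict], metadata: dict) -> list[dict]:
    """Return copies of the rows with sensitive fields obfuscated."""
    published = []
    for row in rows:
        out = {}
        for name, value in row.items():
            if metadata.get(name, {}).get("sensitive") and value is not None:
                out[name] = obfuscate(str(value))
            else:
                out[name] = value
        published.append(out)
    return published


# Made-up order record, for illustration only.
orders = [{"customer_id": 1, "email": "a@example.com", "order_total": 10.5}]
print(publish(orders, FIELD_METADATA))
```

Because the rule lives in the metadata rather than in each export job, the same policy applies to every publish of that field.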
Modern Data Management, Built for the Enterprise
Qlik Catalog is a Java-based software solution that runs on top of modern data stores and compute platforms, including Hadoop, Amazon Web Services, Microsoft Azure, and Google Cloud Platform. While leveraging these platforms for data storage and compute power, Qlik Catalog gives you additional layers of functionality to onboard, discover, prepare, and deliver trustworthy, actionable enterprise data to your data consumers. The solution also contains a common framework of services to support integrated data management, including security, data governance, and metadata management. Together, these services enable a lights-out operational data-as-a-service platform.
User Interface and Modules
All Qlik Catalog features — including data onboarding, cataloging, preparation, shopping, and publishing — are accessed through an intuitive, browser-based GUI. Any GUI actions also direct activity in the metadata repository and services layer.
Metadata Repository
Qlik Catalog’s metadata repository is a relational database that manages and maintains all the metadata collected and generated along every step of your enterprise data management process. It’s the secure hub for exchanging metadata with other applications and environment modules that leverage or collect metadata.
Common Security and Governance Framework
Qlik Catalog includes a common framework of security, governance, and metadata capabilities that protect data, manage your user access privileges, and track your users’ activity at all times. Consistent application of a unified set of data management practices simplifies platform administration and ensures data security and governance at scale.
Qlik Catalog Services
A Java-based, metadata-driven system, Qlik Catalog takes action on data, such as ingestion (which includes automated validation, profiling, and history management), metadata creation and management, and data preparation. Actions are executed in the underlying data storage and compute infrastructure layer, making Qlik Catalog Services extremely lightweight.
Accelerate Your Transition to Modern Data Management
Qlik Catalog makes providing your business users with analytics-ready data fast and easy. It allows more people in your organization to use analytics to generate new insights that transform your business.
Use Cases

Agile Analytics
Accelerate discovery with self-service access to business-ready data

Enterprise Data On Demand
Provide your business with analytics-ready enterprise data anytime through a modern data management platform

Faster Data Preparation
Migrate from ETL to a speedier, modern data preparation and delivery approach

Data Governance Enablement
Deliver well-governed data to your business

Mainframe Data for Modern Analytics
Modernize complex COBOL-based datasets

Catalog and Share Qlik QVD Files
Easily discover and distribute QVD file data with any application, not just Qlik Sense