Resources for Data Stewards

Audience | Time to read
Data Steward | 3 min

This page includes links to relevant documentation and videos to help Data Stewards perform their tasks in the data quality virtuous cycle.

Before you start

Learn about the fundamental concepts and features in CluedIn.

Activity | Resources
Learn about golden records in CluedIn | Golden records
Learn how to search for golden records in CluedIn | Search
Learn how to work with filters in CluedIn | Filters

Data ingestion

Data Stewards can review the records in quarantine and approve or reject them as needed. Approved records go into processing, where they are either aggregated into existing golden records or used to create new golden records.

Activity | Resources
Learn about the quarantine tool in CluedIn | Quarantine
Learn how to edit quarantined data before submitting it to CluedIn | Video

Data transformation

Data Stewards can identify and correct any data quality issues with the help of clean projects, enhance golden records with information from third-party sources, and establish criteria for finding and eliminating duplicates.

Cleaning

Activity | Resources
Get acquainted with the cleaning process in CluedIn | How to get started with data cleaning
Learn about different ways to create a clean project | Create a clean project
Fix data quality issues in the clean application | Manage a clean project
Build automated rules based on data cleaning activities | Video
Understand the statuses of the clean project | Clean project reference
Use profiling to find data quality issues | Video

Enrichment

Activity | Resources
Get acquainted with the enrichment process | Concept of enricher
Learn about different enrichers and find instructions on how to configure each enricher | Enricher reference
Enrich data from third-party services in CluedIn | Video

Deduplication

Activity | Resources
Get acquainted with the deduplication process in CluedIn | How to get started with deduplication
Watch a step-by-step video on basic data deduplication | Video
Learn strategies for efficiently deduplicating large data sets in CluedIn | Deduplication in practice
Create a deduplication project and configure matching rules for detecting duplicates | Create a deduplication project
Learn how to use probabilistic matching rules to find duplicates | Video
Learn how to process and merge groups of duplicates | Manage groups of duplicates
Learn how to split merged records | Video

Data automation

Data Stewards can create business rules to apply data transformations, capture data quality issues, and determine operational values.

Activity | Resources
Get acquainted with the high-level rule creation process | How to get started with rules
Get acquainted with the different types of rules in CluedIn | Data part rules, survivorship rules, golden record rules
Learn how to create a business rule | Create a rule
Use OpenAI for data automation in CluedIn | Video
Use OpenAI to explain automation in rules | Video

Data export

Data Stewards can configure streams and define which golden records are sent to external systems.

Activity | Resources
Get acquainted with the high-level streaming process | How to get started with streaming
Learn how to create and configure a stream | Create a stream
Watch a video about proactively creating a stream before the data is available | Video
Watch a video about the synchronized and event log stream modes | Video

Additional data management activities

Data Stewards can use hierarchies to visualize relations between golden records, create glossary categories and terms to group golden records, and use other tools in CluedIn to facilitate data management.

Activity | Resources
Visualize relations between golden records with the help of hierarchies | How to get started with hierarchies
Build manual hierarchies | Video
Build automated hierarchies | Video
Create glossary categories and terms | How to get started with glossary
Use glossary terms in streams | Video
Use Copilot to help with the data management process | Video