MDS → CluedIn (Part 1): Why moving makes sense, and how migration is done

What MDS is typically used for (real-world pattern)
Why moving away from MDS increasingly makes sense
- 1) Platform direction
- 2) Modern MDM is distribution-first
What CluedIn provides in this migration context (at a glance)
Migration options into CluedIn
Migration decision matrix: MDS integration vs SQL ingestion vs Hybrid
What you do first (practical sequence)
Best practices: how to approach migration to CluedIn

Audience	Time to read
Data Architect / MDM Lead / Data Engineer	12–18 min

This article explains:

what MDS is typically used for,
why moving to a modern MDM approach makes sense, and
how you migrate into CluedIn using either the MDS integration or database ingestion.

What MDS is typically used for (real-world pattern)

Most MDS implementations follow a hub-style approach:

Define a controlled model
Models/entities/attributes with members (records), including domain-based attributes.
Validate and govern
Business rules for validating and setting values (often used as a quality gate).
Manage hierarchies
Derived and explicit hierarchies and collections for navigation and reporting structures.
Publish mastered data downstream
Often via subscription views (denormalized outputs) consumed by ETL/jobs/apps.

Why moving away from MDS increasingly makes sense

1) Platform direction

If you’re planning around newer SQL Server versions, be aware that Microsoft’s documentation lists MDS as discontinued and removed in SQL Server 2025 (17.x) Preview; it remains supported in SQL Server 2022 (16.x) and earlier.

2) Modern MDM is distribution-first

A modern MDM program is not “a master table and a UI.” It’s:

unifying and mastering across sources
improving data continuously (matching/survivorship/quality)
distributing mastered data reliably to many consumers (apps, platforms, analytics)

CluedIn is designed around publishing mastered outputs through Streams to Export Targets (connectors), including synchronized (mirror) and event log (change events) modes.

What CluedIn provides in this migration context (at a glance)

Migrating from MDS is rarely just about “where the mastered table lives”. It’s about accelerating time-to-value while improving data quality, governance, and downstream delivery patterns. CluedIn supports that by combining flexible modelling, mastering, enrichment, and distribution into one platform.

From a modelling perspective, CluedIn supports both a traditional model-first approach and an agile data-first approach that lets you ingest data early and evolve the model iteratively as you learn more about the domain and the consumers’ needs. This is particularly useful in MDS migrations where the “perfect model” usually changes once teams start comparing real records and edge cases.

For integration and capability building, CluedIn uses connectors in three distinct roles: crawlers to pull data in, enrichers to add or improve data based on keys, and export targets to push golden records out. This is a meaningful difference from many MDS estates that rely heavily on SQL views and procedures as the integration contract.

On top of classic enrichment, CluedIn provides an enrichment framework where you can configure predefined enrichers and also create custom ones, which is a common requirement in MDS programs that have accumulated “validation and standardization logic” over time.

CluedIn also adds a modern AI-assisted layer through AI Agents: they can analyze mastered datasets to suggest data quality rules, identify potential duplicates, and propose improvements, with a human review workflow to approve changes. This is valuable in migrations where the historical rule-set is either incomplete, outdated, or overly complex, and you need faster iteration without losing control.

Finally, for distribution, CluedIn’s Streams and Export Targets are the primary way to publish mastered data as either synchronized datasets (mirrors) or change events (event log), depending on the needs of each downstream consumer. Export target health checks help prevent starting a broken publishing pipeline, which matters when you start replacing “subscription view + job” patterns with operational publishing.

Migration options into CluedIn

Option A — MDS integration (pull from on-prem MDS via Azure Relay)

CluedIn documents a dedicated MDS integration that uses Azure Relay to expose on-prem MDS to CluedIn running in Azure. See the CluedIn documentation on the MDS integration for details.

In practice, this option is less about “moving MDS into CluedIn” and more about creating a safe bridge so CluedIn can read MDS content while you design the modern model and publishing contracts. The reason Azure Relay shows up here is simple: many MDS deployments sit in tightly controlled networks, and Relay enables controlled connectivity without opening inbound firewall access while still keeping the integration operationally manageable.

This is the right choice when you want a coexistence period and need to validate outcomes side-by-side. It is also the most forgiving approach when your MDS estate is messy, because it lets you migrate in waves without first rewriting every extraction surface. That said, you still have to make the hard decisions early: how identifiers behave, how you treat MDS business keys, and whether your downstream consumers need synchronized “mirror” publishing or event-style publishing. Those decisions are what determine whether your migration is smooth or turns into a long-running argument about “parity.”

Operationally, a Relay-based setup adds a small but real footprint: it needs configuration on the MDS side, credential/secret handling, and proper monitoring. If you run this for more than a short transition period, treat it like a production integration with clear ownership, alerts, and a runbook.

Option B — Database ingestion (treat MDS outputs as SQL surfaces)

If your MDS environment already produces stable extraction surfaces—commonly subscription views or curated extract tables—you can ingest those SQL surfaces into CluedIn and then shift governance and publishing into CluedIn.

Choosing this option is essentially saying: “MDS will be treated like another upstream system, and we will ingest the contract it already publishes.” The advantage is speed. If your subscription views are stable and well understood by downstream teams, you can stand up ingestion quickly, get mastered outputs into CluedIn, and begin publishing to modern targets using Streams and Export Targets without waiting for deeper refactoring.

The trade-off is that correctness becomes your responsibility. With SQL ingestion, you must be disciplined about schema stability and incremental strategy. If the views change unexpectedly, your ingestion breaks. If hierarchies matter, you need to ensure the extraction surface contains relationship edges or path information rather than only flat attributes. If MDS business rules were acting as an entry-time gate, you will typically re-express that behavior as contract validation and exception handling around publishing, rather than assuming the legacy “hub gate” behavior carries over automatically.
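The schema-stability point above can be made concrete with a small drift check that runs before each ingestion. The following is a minimal sketch, not part of CluedIn or MDS: the `SchemaDrift` type, the `detect_schema_drift` function, and the column names are hypothetical. It assumes you can obtain the current column-to-type mapping of the view yourself, for example by querying `INFORMATION_SCHEMA.COLUMNS`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaDrift:
    """Result of comparing the agreed extract schema with the live one."""

    missing: frozenset[str]     # columns the contract expects but the view no longer exposes
    unexpected: frozenset[str]  # columns the view exposes that the contract does not know
    retyped: frozenset[str]     # columns present in both, but with a different type

    @property
    def is_breaking(self) -> bool:
        # Assumption: extra columns are tolerable, missing or retyped ones are not.
        return bool(self.missing or self.retyped)


def detect_schema_drift(expected: dict[str, str], actual: dict[str, str]) -> SchemaDrift:
    """Compare two column-name -> type-name mappings (type names compared case-insensitively)."""
    expected_cols, actual_cols = set(expected), set(actual)
    shared = expected_cols & actual_cols
    return SchemaDrift(
        missing=frozenset(expected_cols - actual_cols),
        unexpected=frozenset(actual_cols - expected_cols),
        retyped=frozenset(c for c in shared if expected[c].lower() != actual[c].lower()),
    )
```

A job could call `detect_schema_drift` with the agreed schema and the live one, and refuse to ingest when `is_breaking` is true. That turns an unexpected view change into an explicit alert instead of a silent ingestion failure.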

Option C — Hybrid (common in practice)

Hybrid migration is common because it combines safety with forward momentum. You start by bridging MDS into CluedIn using the MDS integration so you can prove end-to-end publishing and validate parity without disrupting the current operational setup. In parallel, you build a cleaner long-term ingestion path that is less dependent on MDS, typically by ingesting from curated SQL extracts or, better, from the original source systems where master data actually originates. Once the cleaner path is stable and reconciled, you phase out MDS as an ingestion source and keep CluedIn as the mastering and distribution layer.

This approach tends to work best when you have multiple domains and multiple consumers, because it supports a wave-based migration where each wave proves one domain and one contract, while the overall dependency on MDS steadily decreases.

Migration decision matrix: MDS integration vs SQL ingestion vs Hybrid

Use this to choose the approach that best fits your constraints.

Dimension	Option A: MDS integration (Azure Relay)	Option B: SQL ingestion (views/tables)	Option C: Hybrid
Time-to-first-value	Medium	Fast (if extracts exist)	Medium
Risk to operations	Low (coexist friendly)	Medium (depends on extract correctness)	Low–Medium
Dependency on MDS	High during transition	Lower	Medium → Low
Best when	On-prem + need safe connectivity + parity validation	You already have clean “contract tables/views”	You need parity now, but want to modernize ingestion later
Effort profile	More setup for relay + integration	More work to define/validate extracts	Balanced; staged
Handles messy MDS estates	Better (fewer immediate refactors)	Worse (extract logic becomes your responsibility)	Better

Rule of thumb

If your estate is on-prem and high-risk → start with Option A or C.
If you already have clean subscription views/extract tables and want speed → Option B.
If you want safety and long-term cleanliness → Option C.

What you do first (practical sequence)

Inventory what matters in MDS
- entities, identifiers, key attributes
- hierarchies/collections that are actually used
- rules that truly matter (many are legacy workarounds)
- consumers + contracts (who expects what shape, when)
Pick your migration approach using the matrix above.
Publish early: create Streams and Export Targets early to prove downstream contracts, even before mastering is “perfect”.

Best practices: how to approach migration to CluedIn

The migrations that succeed are not the ones with the fanciest model. They’re the ones that treat migration as a delivery program with measurable outcomes. Start by defining what must change for the business to feel value: which consumers must be served first, what latency is acceptable for each consumer, what the minimum usable quality threshold is, and what “correct” means in reconciliation terms. If those answers are fuzzy, you’ll end up debating screenshots of UIs and arguing about whether a record “looks right”, instead of shipping working contracts.

Treat every downstream feed as a contract, not as an export. A contract should have a stable schema, clear field definitions, explicit required-field expectations, and clear update semantics. In CluedIn terms, this usually means agreeing whether a consumer needs a synchronized mirror of the golden record or an event-style feed that emits changes. The point isn’t the mode itself; the point is that publishing must be intentional and repeatable, not “here’s a dump, good luck”. Streams and Export Targets give you the mechanism; your job is to enforce contract discipline around them.
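One way to make "contract, not export" tangible is to write the contract down as data and check records against it before they are published. The sketch below is illustrative only: `FeedContract`, `contract_violations`, and the field names are hypothetical and not part of CluedIn. It assumes records are plain dictionaries.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class FeedContract:
    """A hypothetical, minimal description of what one consumer expects."""

    name: str
    required_fields: frozenset[str]
    field_types: dict[str, type] = field(default_factory=dict)


def contract_violations(record: dict[str, Any], contract: FeedContract) -> list[str]:
    """Return human-readable problems; an empty list means the record satisfies the contract."""
    problems: list[str] = []
    for name in sorted(contract.required_fields):
        value = record.get(name)
        # Treat None and blank strings as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing required field: {name}")
    for name in sorted(contract.field_types):
        value = record.get(name)
        expected_type = contract.field_types[name]
        # Only check the type of values that are actually present.
        if value is not None and not isinstance(value, expected_type):
            problems.append(f"wrong type for {name}: expected {expected_type.__name__}")
    return problems
```

Records with a non-empty violation list would go to exception handling rather than to the consumer, which keeps the legacy "quality gate" behavior explicit instead of implied.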

Decide identity early and stick to it relentlessly. Many MDS-to-modern migrations fail or drag on because teams postpone the decision about identifiers and then discover too late that every consumer has implicitly encoded expectations about keys. Whether you keep MDS identifiers, keep business keys, introduce a new canonical ID, or maintain a crosswalk, you must make the decision explicit and test it in publishing from the first wave. Identity isn’t a modeling detail; it’s the backbone of trust.
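If you choose the crosswalk option, the core behavior is small enough to sketch. The class below is a hypothetical in-memory illustration, not a CluedIn feature: the `IdentityCrosswalk` name, the source labels, and the ID formats are assumptions. The important property is that a source key, once mapped, cannot silently move to a different canonical ID.

```python
from __future__ import annotations

from typing import Optional


class IdentityCrosswalk:
    """Maps (source system, source key) pairs to one canonical ID."""

    def __init__(self) -> None:
        self._to_canonical: dict[tuple[str, str], str] = {}

    def register(self, source: str, source_key: str, canonical_id: str) -> None:
        """Record a mapping; re-registering the same mapping is a no-op."""
        key = (source, source_key)
        existing = self._to_canonical.get(key)
        if existing is not None and existing != canonical_id:
            # Remapping changes identity for every consumer, so make it an explicit error.
            raise ValueError(
                f"{source}:{source_key} already maps to {existing}; "
                f"refusing to remap to {canonical_id}"
            )
        self._to_canonical[key] = canonical_id

    def resolve(self, source: str, source_key: str) -> Optional[str]:
        """Return the canonical ID for a source key, or None if it is unknown."""
        return self._to_canonical.get((source, source_key))

    def source_keys(self, canonical_id: str) -> list[tuple[str, str]]:
        """List every (source, key) pair that points at a canonical ID."""
        return sorted(k for k, v in self._to_canonical.items() if v == canonical_id)
```

In a real migration this mapping would live in a durable store and be published alongside the mastered data, so that consumers who still hold MDS codes can translate them during the coexistence period.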

Don’t migrate everything that exists. MDS environments accumulate rules, hierarchies, and “just in case” structures over years. Some of them are genuine compliance or operational needs; many are historical workarounds or reporting conveniences. A modern migration should translate only what is necessary to deliver working contracts, then iterate. The fastest route to failure is trying to recreate the entire MDS universe before proving that you can publish a reliable mastered dataset to a real consumer.

Publish early, even if mastering is not perfect. Publishing forces reality: missing required fields, mismatched identifiers, wrong latency assumptions, and hidden consumer dependencies show up immediately when you run a contract end-to-end. This is why wave-based migrations work. One wave should include ingestion, mastering decisions, a stream, a target, reconciliation against the legacy feed, and a cutover. Then you repeat with the next contract.
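The reconciliation step in each wave can start very simply. The following sketch assumes both the legacy feed and the new feed have been loaded into dictionaries keyed by the agreed identifier. The `reconcile` function and the field names are hypothetical, and a production version would also need to handle normalization such as whitespace, casing, and date formats.

```python
from __future__ import annotations


def reconcile(
    legacy: dict[str, dict],
    candidate: dict[str, dict],
    compare_fields: list[str],
) -> dict[str, list]:
    """Compare two keyed feeds and report missing, extra, and mismatched records."""
    legacy_ids, candidate_ids = set(legacy), set(candidate)
    mismatched = []
    for record_id in sorted(legacy_ids & candidate_ids):
        # Collect only the fields that differ, so the report is actionable.
        diffs = [
            f for f in compare_fields
            if legacy[record_id].get(f) != candidate[record_id].get(f)
        ]
        if diffs:
            mismatched.append((record_id, diffs))
    return {
        "missing_in_candidate": sorted(legacy_ids - candidate_ids),
        "extra_in_candidate": sorted(candidate_ids - legacy_ids),
        "mismatched": mismatched,
    }
```

Agreeing up front on which of the three buckets must be empty before cutover, and which differences are expected improvements, gives "correct" a measurable definition.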

Design for failure from day one, especially if you go event-driven. Duplicates will happen, retries will happen, and out-of-order delivery will happen. If you ignore this, you’ll ship a “demo integration” that collapses in production. Your publishing should enable idempotent processing downstream, and your integration should have a clear dead-letter or quarantine approach so failures are captured, explainable, and recoverable. CluedIn’s Export Targets include health checks that help prevent starting a broken pipeline, but reliability is still a system-level responsibility that includes your downstream processors and monitoring.
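On the consumer side, idempotent processing plus a quarantine can be sketched as follows. This is an illustration under stated assumptions, not CluedIn's event format: it assumes each event is a dictionary carrying a hypothetical `record_id`, a per-record increasing `version`, and a `payload`. If your events carry no version or sequence number, you would need another ordering signal, such as a timestamp.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Applies change events so that duplicates and stale events are harmless."""

    state: dict[str, dict] = field(default_factory=dict)    # record_id -> latest applied payload
    versions: dict[str, int] = field(default_factory=dict)  # record_id -> latest applied version
    quarantine: list[tuple[object, str]] = field(default_factory=list)  # (event, reason)

    def handle(self, event: dict) -> str:
        """Return 'applied', 'skipped', or 'quarantined' for one event."""
        try:
            record_id = event["record_id"]
            version = int(event["version"])
            payload = event["payload"]
        except (KeyError, TypeError, ValueError) as exc:
            # Malformed events are kept for inspection instead of being dropped or retried forever.
            self.quarantine.append((event, f"malformed event: {exc!r}"))
            return "quarantined"
        if version <= self.versions.get(record_id, -1):
            # Duplicate delivery or an out-of-order older event: ignoring it is safe.
            return "skipped"
        self.state[record_id] = payload
        self.versions[record_id] = version
        return "applied"
```

Because a replayed or reordered event leaves the state unchanged, retries upstream become safe, and the quarantine list provides the captured, explainable, recoverable failure record described above.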

Finally, plan the retirement of MDS as a gated milestone. MDS does not disappear by accident; it lingers because of hidden dependencies such as scheduled jobs, reporting extracts, and undocumented “someone’s Excel” processes. Make retirement explicit, run dual-run reconciliation until confidence is earned, cut over consumer by consumer, and only then decommission. If you don’t treat retirement as a real deliverable with an owner and a checklist, you will keep paying for MDS long after “the migration” was supposedly done.

Next: Part 2 covers adoption and the mindset shift you can’t avoid.