Ingest data


Audience: Data Engineer, Data Analyst
Time to read: 3 min

You are here in the data journey

data-ingestion-you-are-here.png

Before you start

  • Make sure you have conducted the data impact workshop to identify which sources you want to start with.

  • Make sure you are familiar with the available tools for data ingestion and have picked the right tool for your use case.

Now that you have prepared a list of sources and selected a tool for data ingestion, you can start the actual data ingestion process. This process consists of three steps—ingest, map, and process. In this article, we’ll focus on the first step to get your data into CluedIn.

ingest-data-intro.png

While ingesting your data, remember one of our project principles—start small. The idea is to focus on one end-to-end data flow, and not to load 10 million records at once as this will become a burden in the development phase of your project.

Data ingestion instructions

In the following table, you’ll find links to training videos and documentation for each data ingestion tool. Follow the steps for your tool of choice to get your data into CluedIn. If you are using the endpoint tool, a short example of pushing records over HTTP is shown after the table.

Tool | Link to documentation
File | Link to training video and documentation.
Endpoint | Link to training video and documentation.
Database | Link to documentation.
Azure Data Factory | Link to training video.
Microsoft Fabric/Azure Databricks | Link to documentation.
Microsoft Purview | Link to documentation.
Crawler | Link to documentation.
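
With endpoint ingestion, you push JSON records over HTTP to the ingestion URL that CluedIn generates for your endpoint data set. The sketch below is a minimal illustration under stated assumptions: the URL pattern and token are placeholders, not real values, and the exact request details are described in the endpoint documentation linked above.

```python
import requests

# Assumptions: an endpoint data set has already been created in CluedIn,
# and its ingestion URL plus an API token were copied from the UI.
# Both values below are placeholders, not real CluedIn defaults.
ENDPOINT_URL = "https://<your-cluedin-host>/upload/api/endpoint/<endpoint-id>"
API_TOKEN = "<your-api-token>"

# Start small: a handful of records is enough to validate the
# end-to-end flow before loading larger volumes.
records = [
    {"customerId": "C-001", "name": "Ada Lovelace", "country": "UK"},
    {"customerId": "C-002", "name": "Grace Hopper", "country": "US"},
]

response = requests.post(
    ENDPOINT_URL,
    json=records,  # send the records as a JSON payload
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Ingestion request accepted:", response.status_code)
```

If the request succeeds, the pushed records should appear on the Preview tab of the corresponding data set, as described later on this page.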

Data ingestion limitations

The current public release of CluedIn does not support nested data and will flatten nested objects. Depending on the structure of your nested data, this flattening approach might be suitable. However, in some cases, you might need to process the nested objects in a separate data set, as illustrated in the sketch below. Support for nested objects is planned for a future release.
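
To make this concrete, the sketch below shows one way to split a nested object out of a record before ingestion so that it can be loaded as a separate data set. The record shape, field names, and splitting logic are illustrative assumptions; the exact flattening that CluedIn applies to nested objects may differ.

```python
# Illustrative only: the record shape and field names are assumptions.
source_record = {
    "customerId": "C-001",
    "name": "Ada Lovelace",
    "address": {  # nested object that would otherwise be flattened
        "street": "12 King Street",
        "city": "London",
    },
}

# Split the nested object into its own record, keeping the customerId
# so the two data sets can be related to each other during mapping.
customer_record = {k: v for k, v in source_record.items() if k != "address"}
address_record = {"customerId": source_record["customerId"], **source_record["address"]}

print(customer_record)  # ingest into a "customers" data set
print(address_record)   # ingest into a separate "addresses" data set
```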

Structure of ingested data

When the data is ingested into CluedIn, it is represented in the following structure. A concrete example follows the list below.

structure-of-ingested-data.png

  • Group – this is a folder used to organize your sources logically.

  • Data source – this is an object that contains the necessary information on how to connect to the source (if applicable), as well as the users and roles that have permissions to the source. Think of it like a database in SQL Server.

  • Data set – this is the actual data obtained from the source. The data set contains unprocessed records, mapping information, quarantine capabilities, and rules applied to your raw records. Think of it like a table in a SQL Server database.
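
As a purely illustrative example, the hierarchy below shows how a hypothetical CRM ingestion could be organized; all names are assumptions, not defaults shipped with CluedIn.

```python
# Hypothetical layout of ingested data; every name here is an example.
ingestion_layout = {
    "group": "CRM",                       # folder that groups related sources
    "data_sources": [
        {
            "name": "Salesforce export",  # like a database in SQL Server
            "data_sets": [
                "Contacts",               # like tables in SQL Server
                "Accounts",
            ],
        }
    ],
}
print(ingestion_layout["data_sources"][0]["data_sets"])
```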

Data ingestion results

If you followed our instructions, you should see results similar to the following.

ingested-data-sample-1.png

Data source containing data sets

ingested-data-sample-2.png

Data set containing raw records on the Preview tab

The main goal of data ingestion is to have some records on the Preview tab.

ingested-data-sample-3.png

Next step

Ensure that the required records are available on the Preview tab of the data set. Once the necessary data is ingested, you can start the mapping process.