CluedIn for Data Governance Manager — operating manual


Audience: Data Governance Managers, Chiefs of Data/Privacy, Risk & Compliance leads
Goal: Provide a practical, prescriptive playbook for governing data on CluedIn: policy and controls, ownership, privacy and consent, access and retention, data quality (DQ) governance, lineage, AI use, audit evidence, and operating cadence.

This manual is tool-aware (CluedIn), but policy-oriented. It favors governance-as-code, minimal bureaucracy, and measurable outcomes.


0) Your First 48 Hours (Checklist)

People & Ownership

  • Confirm domain owners and data stewards for top entities (e.g., Person, Organization, Order).
  • Stand up the Data Governance Council (DGC) rhythm and RACI.
  • Publish the use‑case brief and policy pack for day‑1 scope.

Policy & Access

  • Approve classification scheme (PII/Restricted/Confidential/Public).
  • Enable SSO‑only, group‑based roles, least privilege defaults.
  • Define SoD (Segregation of Duties) and approval thresholds.

Controls & Observability

  • Turn on audit log export and set retention ≥ your compliance need.
  • Require export contracts and PII masking policies by default.
  • Select DQ KPIs and set initial thresholds/alerts.

Privacy & Retention

  • Map legal bases and purpose for PII processing.
  • Approve retention schedules and deletion/hold workflows.
  • Validate DSR (Data Subject Request) process with runbook.

1) Governance Model & Roles

1.1 RACI Snapshot

| Area | DGC | Governance Manager | Admin | Data Steward | Data Engineer | Legal/Privacy |
|---|---|---|---|---|---|---|
| Classifications & policies | A/R | R | C | C | C | C/A |
| Roles & access | A/R | R | R | C | C | C |
| DQ KPIs & SLAs | A | R | C | R | R | C |
| Retention & deletion | A | R | C | C | C | R |
| AI guardrails | A | R | C | C | C | R |
| Audit evidence | A | R | C | C | C | C |

A = Accountable, R = Responsible, C = Consulted

1.2 Governance Objects

  • Classification scheme (labels/tiers).
  • Policies (masking, row/column access, export approvals).
  • Export contracts (schema, semantics, SLA).
  • DQ metrics & thresholds, issue register.
  • Lineage and glossary, including Critical Data Elements (CDEs).
  • Retention schedule and legal hold register.
  • AI policy and model/agent registry.

2) Classification & Policy-as-Code

2.1 Standard Labels

labels:
  - name: PII
    description: "Personal data subject to privacy rules"
  - name: Restricted
    description: "Sensitive business data; limited access"
  - name: Confidential
  - name: Public

2.2 Masking & Access Policies (CluedIn-style pseudo)

policy: mask_email_default
target: entity:Person.field:email
actions: [read]
effect: allow_with_mask
mask: "partial_email"          # e.g., a***@example.com
unless:
  - role_in: ["Data Steward","Administrator"]
labels_required: ["PII"]

policy: row_filter_region
target: entity:Order
actions: [read]
effect: allow_when
when: "record.region in user.allowed_regions"
applies_to:
  - roles: ["Analyst","Viewer"]
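To make the two pseudo-policies concrete, here is a minimal Python sketch of how an evaluator might apply them. It is not CluedIn's engine: the `User` class, the function names, and the choice to return all rows when `row_filter_region` does not apply are assumptions made for illustration.

```python
from dataclasses import dataclass, field

UNMASKED_ROLES = {"Data Steward", "Administrator"}  # from the `unless` clause
ROW_FILTERED_ROLES = {"Analyst", "Viewer"}          # from `applies_to`


@dataclass
class User:
    """Hypothetical caller context: roles plus regions the user may see."""
    roles: set
    allowed_regions: set = field(default_factory=set)


def partial_email(value: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = value.partition("@")
    if not sep or not local:
        return "***"  # not a well-formed address: hide everything
    return f"{local[0]}***@{domain}"


def read_email(user: User, email: str) -> str:
    """mask_email_default: mask unless the user holds an exempt role."""
    if user.roles & UNMASKED_ROLES:
        return email
    return partial_email(email)


def visible_orders(user: User, orders: list) -> list:
    """row_filter_region: filtered roles see only orders in their regions."""
    if not (user.roles & ROW_FILTERED_ROLES):
        # Assumption: this policy simply does not apply; other policies decide.
        return orders
    return [o for o in orders if o["region"] in user.allowed_regions]
```

An Analyst limited to `EU` would read `alice@example.com` as `a***@example.com` and see only EU orders, while a Data Steward reads the address unmasked.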

2.3 Export Approval for Sensitive Data

policy: export_requires_approval
target: export:*
actions: [promote]              # moving to prod
effect: require_approval
when: "export.contains_label('PII') or export.contains_label('Restricted')"
approvers: ["Data Governance Manager","Data Protection Officer"]

Principle: Policies must be machine-evaluable, versioned, and reviewed via PRs.
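"Machine-evaluable" can be read as: the promotion gate is a pure function of the export's labels and the recorded sign-offs. The sketch below assumes that a single sign-off from any listed approver is enough; the policy text does not say whether one or both are required, so adjust to your rule. The function name is hypothetical.

```python
SENSITIVE_LABELS = {"PII", "Restricted"}
APPROVERS = {"Data Governance Manager", "Data Protection Officer"}


def can_promote(export_labels: set, approvals: set) -> bool:
    """Return True when the export may be promoted to prod.

    export_labels: labels found on the export's fields.
    approvals: roles that have signed off on this promotion.
    """
    if not (export_labels & SENSITIVE_LABELS):
        return True  # no sensitive labels: the policy does not trigger
    # Assumption: one sign-off from any listed approver is sufficient.
    return bool(approvals & APPROVERS)
```

A check like this can run in CI on every PR that touches an export definition, so that a missing approval blocks the merge rather than surfacing later in an audit.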


3) Access Control & Segregation of Duties (SoD)

  • Group-to-role mapping from the IdP; no direct user grants unless time-boxed.
  • Least privilege defaults; read-mostly for wide audiences.
  • SoD matrix to prevent a single actor from authoring and approving sensitive changes.

3.1 Example SoD

| Action | Allowed Roles | Requires Approval From |
|---|---|---|
| Create export with PII | Engineer/Steward | Governance Manager |
| Change masking policy | Admin | Governance Manager + DPO |
| Grant Admin role | Admin | DGC chair |
| Create long-lived API token | Admin/Engineer | Governance Manager |
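The SoD matrix can also be encoded and checked automatically. The following is a sketch under assumptions: the action keys, the role strings, and the `sod_violations` helper are invented for illustration, and "Governance Manager + DPO" is read as "both must approve".

```python
# action -> (roles allowed to perform it, roles that must ALL approve it)
SOD_MATRIX = {
    "create_pii_export": ({"Engineer", "Steward"}, {"Governance Manager"}),
    "change_masking_policy": ({"Admin"}, {"Governance Manager", "DPO"}),
    "grant_admin_role": ({"Admin"}, {"DGC Chair"}),
    "create_long_lived_token": ({"Admin", "Engineer"}, {"Governance Manager"}),
}


def sod_violations(action: str, actor: str, actor_role: str, approvals: dict) -> list:
    """Return a list of problems; an empty list means the change may proceed.

    approvals maps an approver role to the person who signed off in that role.
    """
    allowed, required = SOD_MATRIX[action]
    problems = []
    if actor_role not in allowed:
        problems.append(f"{actor_role} may not perform {action}")
    for role in sorted(required):
        approver = approvals.get(role)
        if approver is None:
            problems.append(f"missing approval from {role}")
        elif approver == actor:
            # The core SoD rule: nobody approves their own change.
            problems.append(f"{actor} cannot approve their own change")
    return problems
```

Comparing people rather than roles matters here: one person can hold two roles, and the matrix is meant to stop that person from both authoring and approving.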


4) Data Quality Governance

4.1 Select KPIs (by entity/CDE)

  • Completeness (required fields non-null %)
  • Validity (regex/domain rules)
  • Uniqueness (duplicate rate)
  • Consistency (cross-field rules)
  • Timeliness (source→export latency)

4.2 KPI Template

entity: Person
kpis:
  # A KPI warns or fails when its condition is true.
  completeness_email: { warn: "< 0.98", fail: "< 0.95" }
  validity_email_regex: { warn: "< 0.98", fail: "< 0.95" }
  duplicate_rate_email: { warn: "> 0.03", fail: "> 0.05" }
alerts:
  - metric: validity_email_regex
    action: "notify #data-quality"
review_cadence: "weekly"
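A sketch of how the template's thresholds might be evaluated, assuming 0.98 (or 0.03 for duplicates) is the target and 0.95 (or 0.05) is the failure bound, with anything in between raising a warning. The `KPIS` table and `kpi_status` function are illustrative names, not a CluedIn API.

```python
# name: (warn_bound, fail_bound, higher_is_better)
KPIS = {
    "completeness_email": (0.98, 0.95, True),
    "validity_email_regex": (0.98, 0.95, True),
    "duplicate_rate_email": (0.03, 0.05, False),
}


def kpi_status(name: str, value: float) -> str:
    """Classify a measured KPI value as 'ok', 'warn', or 'fail'."""
    warn, fail, higher_is_better = KPIS[name]
    if not higher_is_better:
        # Negate so the same "higher is better" comparison applies.
        value, warn, fail = -value, -warn, -fail
    if value < fail:
        return "fail"
    if value < warn:
        return "warn"
    return "ok"
```

With these bounds, an email completeness of 0.96 is a warning and 0.94 is a failure, while a duplicate rate of exactly 0.03 still counts as on target.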

4.3 Issue Management

  • Central DQ issue register with owner, ETA, and business impact.
  • SLAs: triage high-severity breaches within 24h; fix within 7 days or obtain an approved waiver.
  • Evidence: charts, logs, and audit events linked to each closure.

Maintain a register of processing purposes and legal bases (GDPR Art. 6, CCPA/CPRA analogs). Map fields/entities to purposes.

purpose: "Customer Support"
legal_basis: "Legitimate Interests"
entities: ["Person","Ticket"]
fields:
  Person.email: ["PII"]
retention: "3 years from last activity"

5.2 DSR (Access/Deletion/Correction)

  • Standard DSR runbook with time targets (e.g., 30 days).
  • Use CluedIn search and policies to locate, mask, or delete records.
  • Soft-delete first; schedule hard-delete post-hold checks.
  • Record audit trail for every DSR.
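The runbook's time target and the soft-delete-then-hard-delete order can be sketched as two small helpers. The 30-day figure is the example from the list above; the record keys `soft_deleted` and `legal_hold` are hypothetical field names.

```python
from datetime import date, timedelta

DSR_TARGET_DAYS = 30  # example target from the runbook; set to your obligation


def dsr_due_date(received: date) -> date:
    """Date by which the request should be completed."""
    return received + timedelta(days=DSR_TARGET_DAYS)


def deletion_step(record: dict) -> str:
    """Next step for a deletion request on one record.

    Soft-delete always comes first; hard-delete is scheduled only after
    the hold check passes.
    """
    if not record.get("soft_deleted"):
        return "soft_delete"
    if record.get("legal_hold"):
        return "blocked_by_legal_hold"
    return "schedule_hard_delete"
```

Each returned step is a natural point to write the audit event the runbook asks for.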

5.3 Anonymization & Pseudonymization

  • Prefer masking/hashing for analytics where possible.
  • Track re-identification risk; document controls.

6) Retention, Deletion & Legal Hold

6.1 Retention Schedules

Create per-entity schedules with legal references; encode as policy.

policy: retention_person
target: entity:Person
actions: [delete]
effect: schedule_delete
when: "now() > record.last_activity + duration('3 years')"
exceptions:
  - "legal_hold == true"

6.2 Legal Hold

  • Admin can set legal_hold=true at entity or export scope.
  • Prevents deletion; logs an audit event with case ID.
  • Review holds quarterly with Legal.
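Taken together, the `retention_person` rule and the hold exception reduce to one predicate. This sketch approximates "3 years" as 3 × 365 days and ignores leap days; the record keys mirror the pseudo-policy but are otherwise assumptions.

```python
from datetime import datetime, timedelta, timezone

# "3 years" approximated as 1095 days; use calendar arithmetic if the
# exact anniversary matters for your schedule.
RETENTION = timedelta(days=3 * 365)


def due_for_delete(record: dict, now: datetime) -> bool:
    """True when the record is past retention and not under legal hold."""
    if record.get("legal_hold"):
        return False  # a hold always wins over the schedule
    return now > record["last_activity"] + RETENTION


now = datetime(2025, 8, 23, tzinfo=timezone.utc)
stale = {"last_activity": datetime(2021, 1, 1, tzinfo=timezone.utc)}
held = {**stale, "legal_hold": True}
```

Here `due_for_delete(stale, now)` is true and `due_for_delete(held, now)` is false; checking the hold first keeps the exception visible and easy to test.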

6.3 Destruction Evidence

  • Produce destruction certificates with counts, time window, and correlation IDs.
  • Store centrally with retention policy evidence.

7) Catalog, Glossary & Lineage

7.1 Glossary & CDEs

  • Each CDE has definition, owner, calculation notes, and DQ caveats.
  • Stewards maintain; Governance approves changes.

term: "Active Customer"
definition: "Customer with at least one completed order in the last 90 days"
owner: "Sales Ops"
cdes: ["Person.id","Order.completed_at"]
dq_notes: "Exclude orders with status in ('cancelled','fraud')"
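A glossary definition is easier to approve when it can be computed. The sketch below implements the "Active Customer" term as written, assuming orders arrive as dicts with hypothetical keys `person_id`, `completed_at`, and `status`.

```python
from datetime import datetime, timedelta

EXCLUDED_STATUSES = {"cancelled", "fraud"}  # from dq_notes


def active_customers(orders: list, as_of: datetime, window_days: int = 90) -> set:
    """Person ids with at least one qualifying completed order in the window."""
    cutoff = as_of - timedelta(days=window_days)
    return {
        o["person_id"]
        for o in orders
        if o.get("completed_at") is not None
        and cutoff <= o["completed_at"] <= as_of
        and o.get("status") not in EXCLUDED_STATUSES
    }
```

Keeping the window and the excluded statuses as named values means a change to the definition shows up as a one-line diff for Governance to review.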

7.2 Lineage & Purview

  • Ensure exports are scannable by Purview or push Atlas lineage after runs.
  • Require lineage for all prod exports; no “unknown source” data in BI.

8) AI Governance on CluedIn

8.1 Allowed Uses

  • Read‑only analysis by Agents on masked datasets by default.
  • Suggestion workflows (validations, dedup rules) require human approval.
  • Auto-fix limited to deterministic transformations with rollback.

8.2 Guardrails (policy sketch)

policy: ai_mask_pii
target: ai:agents
actions: [read]
effect: allow_when
when: "dataset.view == 'masked' and agent.mode in ['analysis','suggest']"

  • Log prompts & outputs; retain for investigation period.
  • Maintain an AI model/agent registry with owner, purpose, and data scope.
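The guardrail and the registry can be combined into a single read check. This is a sketch: the registry structure, the `dq-helper` entry, and the rule that unregistered agents are denied are assumptions layered on the policy sketch above.

```python
ALLOWED_MODES = {"analysis", "suggest"}

# Hypothetical registry entry: owner, purpose, and data scope per agent.
REGISTRY = {
    "dq-helper": {
        "owner": "Data Steward",
        "purpose": "validation suggestions",
        "data_scope": {"Person"},
    },
}


def agent_may_read(agent_id: str, mode: str, dataset_view: str, entity: str) -> bool:
    """Allow reads only for registered agents, in scope, on masked views."""
    entry = REGISTRY.get(agent_id)
    if entry is None or entity not in entry["data_scope"]:
        return False  # assumption: unregistered or out-of-scope means deny
    return dataset_view == "masked" and mode in ALLOWED_MODES
```

Under these assumptions, `dq-helper` in `suggest` mode can read the masked `Person` view but not the raw one, and an agent missing from the registry is refused regardless of mode.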

9) Audit, Evidence & Compliance

9.1 Audit Log Requirements

  • Retention ≥ policy (e.g., 3–7 years).
  • Immutable storage (WORM) where mandated.
  • Coverage: SSO events, role grants, token lifecycle, policy changes, export promotions, dedup merges, retention deletes.

Audit record (illustrative)

{
  "ts": "2025-08-23T11:02:44Z",
  "actor": "tiw@cluedin.com",
  "action": "policy.update",
  "target": "mask_email_default",
  "old": {"mask": "none"},
  "new": {"mask": "partial_email"},
  "ip": "203.0.113.5",
  "correlation_id": "a6c9-...-4f"
}
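Audit records are only useful as evidence if they are complete. The sketch below validates a record shaped like the illustrative one above; the list of required fields and the rule that an update must show a change are assumptions, not a documented CluedIn schema.

```python
import json
from datetime import datetime

REQUIRED = ("ts", "actor", "action", "target", "correlation_id")


def audit_record_problems(raw: str) -> list:
    """Return a list of problems found in one JSON audit record."""
    record = json.loads(raw)
    problems = [f"missing {key}" for key in REQUIRED if not record.get(key)]
    if record.get("ts"):
        try:
            # Expect UTC timestamps like 2025-08-23T11:02:44Z.
            datetime.strptime(record["ts"], "%Y-%m-%dT%H:%M:%SZ")
        except (ValueError, TypeError):
            problems.append("ts is not a UTC ISO-8601 timestamp")
    action = str(record.get("action", ""))
    if action.endswith(".update") and record.get("old") == record.get("new"):
        problems.append("update without a change")
    return problems
```

Running a check like this on a sample of exported logs is one way to produce the "audit log sampling results" reviewed monthly.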

9.2 Control Library & Evidence Plan

  • Map controls → evidence (log queries, screenshots, configs).
  • Pre‑build audit packets (SSO config, RBAC matrices, policy files, retention jobs, DQ dashboards, incident postmortems).

10) Change Management & Exceptions

  • All policy/config changes via PR with risk notes and rollback.
  • CAB/DGC approves sensitive changes (PII exports, masking off).
  • Exception register with owner, expiry, and mitigation; auto‑review cadence.

Change ticket template

change: "Enable dedup auto-approve @ 0.97 for Person"
risk: "Potential false merges"
mitigation: "Raise reviewer sampling, add unmerge runbook"
rollback: "Set threshold to 1.0; disable auto-approve"
approvers: ["Governance Manager","DPO"]

11) Incident Response (Data Incidents/Breaches)

Trigger examples: PII exposure, policy disabled in prod, unauthorized token use, export schema exposing restricted fields.

Runbook (condensed)

  1. Identify: Alert triage, correlation_id, scope quantification.
  2. Contain: Re-enable policies, revoke tokens, pause exports.
  3. Eradicate: Fix mapping/cleaning/policy root cause.
  4. Recover: Backfill corrected outputs; notify consumers.
  5. Notify: Legal assesses regulatory notifications (GDPR 72h, etc.).
  6. Review: Post-incident with preventive controls & tests.

12) Operating Cadence

Weekly DGC (30–45 min)

  • DQ breaches & trends; top 5 risks.
  • Policy changes awaiting approval.
  • Export lineage gaps and consumers onboarded.
  • Retention/DSR stats; exceptions expiring.

Monthly Governance Review

  • KPI scorecard by domain, audit log sampling results, AI usage review, access recertification outcomes.

Quarterly

  • Maturity assessment; roadmap; control gap remediation.

13) KPIs & Scorecard (Examples)

  • % CDEs with owner, glossary entry, lineage ✅
  • DQ: completeness/validity at or above threshold for top entities ✅
  • Access: % users via group‑based roles; # exceptions open ↓
  • Privacy: DSR SLA hit rate; # PII exports with approvals ✅
  • Retention: % due-for-delete completed; # holds reviewed ✅
  • AI governance: % agents constrained to masked datasets ✅

scorecard:
  cde_coverage: { target: "100%", actual: "92%" }
  pii_export_approvals: { target: "100%", actual: "100%" }
  dsr_sla: { target: ">= 95%", actual: "97%" }
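For a monthly review, the scorecard can be reduced to the list of KPIs that miss their target. The sketch assumes every KPI is "higher is better" and that targets and actuals are percentage strings as in the example; `pct` and `below_target` are illustrative helpers.

```python
def pct(text: str) -> float:
    """Parse strings such as '100%' or '>= 95%' into a number."""
    return float(text.replace(">=", "").replace("%", "").strip())


def below_target(scorecard: dict) -> list:
    """Names of KPIs whose actual value is under the target, sorted."""
    return sorted(
        name
        for name, row in scorecard.items()
        if pct(row["actual"]) < pct(row["target"])
    )


scorecard = {
    "cde_coverage": {"target": "100%", "actual": "92%"},
    "pii_export_approvals": {"target": "100%", "actual": "100%"},
    "dsr_sla": {"target": ">= 95%", "actual": "97%"},
}
```

For the example values, `below_target(scorecard)` returns `["cde_coverage"]`, which is the item to bring to the council.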

14) Templates & Artifacts

14.1 Classification Policy (excerpt)

tiers:
  - Public
  - Confidential
  - Restricted
labels:
  PII:
    description: "Personal data"
    defaults:
      mask_read: true
      export_requires_approval: true

14.2 Export Contract (governance fields)

name: contacts_v1
owner: "Sales Ops"
pii: true
approval_ids: ["chg-2025-1032"]
sla:
  freshness_p95_minutes: 60
lineage_required: true
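The governance fields are simple enough to lint before an export is promoted. This sketch treats `name` and `owner` as mandatory, requires an approval id for PII exports, and requires lineage for all exports (per section 7.2); those rules and the `contract_problems` name are assumptions for illustration.

```python
def contract_problems(contract: dict) -> list:
    """Return governance problems found in one export contract."""
    problems = []
    for key in ("name", "owner"):
        if not contract.get(key):
            problems.append(f"{key} is required")
    if contract.get("pii") and not contract.get("approval_ids"):
        problems.append("PII export needs at least one approval id")
    if not contract.get("lineage_required", False):
        problems.append("lineage must be required for prod exports")
    return problems


contacts_v1 = {
    "name": "contacts_v1",
    "owner": "Sales Ops",
    "pii": True,
    "approval_ids": ["chg-2025-1032"],
    "sla": {"freshness_p95_minutes": 60},
    "lineage_required": True,
}
```

The `contacts_v1` contract above passes with no problems; removing `approval_ids` would flag it because `pii` is true.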

14.3 Access Request Workflow

workflow: request_export_access
steps:
  - submit: requester -> manager_approval
  - review: governance -> approve_or_deny
  - provision: admin -> role_grant(group)
  - audit: evidence_link + expiry_date
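The workflow is a strict sequence, which makes it easy to enforce that no step is skipped (for example, provisioning before governance review). A minimal sketch, with `next_step` as a hypothetical helper:

```python
from typing import Optional

STEPS = ("submit", "review", "provision", "audit")


def next_step(completed: list) -> Optional[str]:
    """Return the next step, or None when the workflow is finished.

    Raises ValueError if the completed steps are out of order.
    """
    if list(completed) != list(STEPS[: len(completed)]):
        raise ValueError("steps must be completed in order")
    if len(completed) < len(STEPS):
        return STEPS[len(completed)]
    return None
```

`next_step(["submit"])` returns `"review"`, while `next_step(["submit", "provision"])` raises, since review was skipped.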

14.4 DPIA Checklist (snapshot)

  • Purpose & legal basis documented
  • Data categories & flows mapped
  • Risks & mitigations listed
  • Residual risk accepted by DPO
  • Re‑assessment date set

15) Maturity Roadmap

Level 1 → 2

  • Classifications applied to top 3 entities; basic masking; manual approvals.
  • DQ KPIs defined; weekly review established.

Level 2 → 3

  • Policies as code; PR reviews; automated approvals with conditions.
  • Full lineage coverage; DSR automation; retention jobs live.

Level 3 → 4

  • Predictive DQ & anomaly detection; auto‑fix playbooks with guardrails.
  • AI agents widely used on masked data; real‑time policy observability.

16) Quick Reference (Who to Call)

  • Admin: SSO/RBAC, tokens, feature toggles.
  • Engineer: mappings, cleaning, exports, incident on-call.
  • Steward: validations, dedup, glossary, labels.
  • Legal/Privacy: DPIA, DSR, notifications, holds.
  • You (Gov Mgr): policies, approvals, exceptions, audits, council.

Bottom line: Treat governance as a product: versioned configs, small safe changes, clear owners, and measurable outcomes. With CluedIn, encode policies, label data, control access, automate retention, prove lineage, and keep your audit shelf ready.