
Data Governance & Data Catalog

Implement enterprise data governance. Covers data classification, data catalog tools, access policies, data stewardship, metadata management, and compliance for data assets.

Data governance without tooling is policy that nobody follows. Data governance without policy is tooling that nobody trusts. You need both: clear policies for how data should be classified, accessed, and used, combined with automated tooling that enforces those policies at scale.


Data Classification

| Level | Data Types | Access | Examples |
|---|---|---|---|
| Public | Marketing content, product info | Anyone | Blog posts, pricing page |
| Internal | Business metrics, internal docs | All employees | Revenue dashboards, wiki |
| Confidential | Customer data, financial data | Need-to-know | Customer PII, contracts |
| Restricted | Cryptographic keys, credentials | Named individuals | API keys, passwords |
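
One way to make these four levels enforceable is to encode them as an ordered type and derive the access decision from it. The sketch below is illustrative only: the `Classification` enum, the `can_access` function, and its boolean parameters are hypothetical names, and the mapping of "need-to-know" and "named individuals" to simple flags is an assumption rather than a prescribed policy model.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels from the table, ordered least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def can_access(level, *, is_employee, has_need_to_know=False, is_named=False):
    """Return True if the caller's attributes satisfy the rule for `level`.

    Assumption: each access rule in the table maps to one boolean flag.
    """
    level = Classification(level)  # also accepts the plain integers 1..4
    if level == Classification.PUBLIC:
        return True
    if level == Classification.INTERNAL:
        return is_employee
    if level == Classification.CONFIDENTIAL:
        return has_need_to_know
    # RESTRICTED: only individuals explicitly named on the asset
    return is_named
```

Because the levels are ordered, a comparison such as `Classification.RESTRICTED > Classification.CONFIDENTIAL` can also drive rules like "mask everything at Confidential or above."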

Data Catalog

A data catalog is the “Google for your data.” It answers: What data do we have? Where is it? What does it mean? Who owns it? Who can access it?

| Tool | Type | Best For |
|---|---|---|
| DataHub (LinkedIn) | Open source | Engineering-driven organizations |
| OpenMetadata | Open source | Modern data stack (dbt, Airflow) |
| Amundsen (Lyft) | Open source | Discovery-focused, Python ecosystem |
| Atlan | Commercial | Enterprise governance + discovery |
| Collibra | Commercial | Large enterprise, regulatory compliance |
| dbt docs | Built-in | Already using dbt (lightweight catalog) |
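
Whichever tool is chosen, the five questions a catalog answers map naturally onto the fields of a record. The following is a toy in-memory sketch, not the data model of any tool listed here: `CatalogEntry`, `search`, and the sample entries (including the `churn_scores` table and its owner) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str         # What data do we have?
    location: str     # Where is it?
    description: str  # What does it mean?
    owner: str        # Who owns it?
    readers: set = field(default_factory=set)  # Who can access it?


def search(entries, term):
    """Case-insensitive substring match on name and description."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


# Hypothetical sample entries for illustration
catalog = [
    CatalogEntry("customers", "analytics.customers",
                 "All registered customer accounts.",
                 "customer-data-team", {"analysts"}),
    CatalogEntry("churn_scores", "analytics.churn_scores",
                 "Predicted customer churn per account.",
                 "data-science-team", {"analysts"}),
]
```

With entries like these, `search(catalog, "churn")` returns the `churn_scores` entry along with its owner and location, which is the lookup a real catalog performs at much larger scale.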

Metadata Management

# Table-level metadata
table: customers
schema: analytics
description: "All registered customer accounts. One row per customer."
owner: customer-data-team
pii: true
classification: confidential
refresh: daily (6:00 AM UTC)

columns:
  - name: customer_id
    type: integer
    description: "Primary identifier for customer"
    pii: false
    
  - name: email
    type: string
    description: "Customer email address"
    pii: true
    masking: "hash in non-production environments"
    
  - name: full_name
    type: string
    description: "Customer legal name"
    pii: true
    masking: "redact in analytics views"
    
  - name: segment
    type: string
    description: "Customer segment: 'enterprise', 'mid-market', 'smb'"
    pii: false
    allowed_values: ["enterprise", "mid-market", "smb"]
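
Metadata like this becomes useful once code reads it. As a rough sketch, the column definitions above can be mirrored in a Python structure and used to mask PII and check `allowed_values`. The names `COLUMNS`, `mask_row`, and `invalid_values` are hypothetical, and the free-text masking rules are reduced to the keywords `hash` and `redact` as a simplifying assumption. Plain SHA-256 is used only to keep the example short; an unsalted hash of an email address can be reversed by dictionary lookup, so a keyed hash would be the safer choice in practice.

```python
import hashlib

# Assumption: a simplified mirror of the column metadata shown above
COLUMNS = [
    {"name": "customer_id", "pii": False},
    {"name": "email", "pii": True, "masking": "hash"},
    {"name": "full_name", "pii": True, "masking": "redact"},
    {"name": "segment", "pii": False,
     "allowed_values": ["enterprise", "mid-market", "smb"]},
]


def mask_row(row, columns=COLUMNS):
    """Return a copy of `row` with PII columns masked per their metadata."""
    masked = dict(row)
    for col in columns:
        name = col["name"]
        if name not in masked or not col.get("pii"):
            continue
        if col.get("masking") == "hash":
            digest = hashlib.sha256(str(masked[name]).encode("utf-8"))
            masked[name] = digest.hexdigest()
        else:
            # "redact", and the safe fallback for PII with no rule
            masked[name] = "[REDACTED]"
    return masked


def invalid_values(row, columns=COLUMNS):
    """Names of columns whose value falls outside `allowed_values`."""
    return [
        c["name"] for c in columns
        if "allowed_values" in c and row.get(c["name"]) not in c["allowed_values"]
    ]
```

Defaulting to redaction when a PII column has no masking rule means a gap in the metadata fails closed rather than leaking the value.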

Data Stewardship Model

| Role | Responsibility | Example |
|---|---|---|
| Data Owner | Accountable for data quality and access | VP of Sales owns CRM data |
| Data Steward | Day-to-day governance, quality rules | Data analyst maintains quality rules |
| Data Engineer | Pipeline reliability, schema management | Builds and monitors pipelines |
| Data Consumer | Uses data responsibly, reports issues | Business analyst building reports |
| Privacy Officer | Compliance, retention policies | Reviews PII handling, GDPR/CCPA |

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| No data catalog | “Where is the customer churn data?” → 3-day search | Catalog all data assets, searchable |
| No data owners | Nobody responsible, quality degrades | Named owner for every data domain |
| Governance = blocking | Data requests take weeks to approve | Self-service with guardrails, not gates |
| Classification in name only | Data labeled but no enforcement | Automated access controls based on classification |
| PII everywhere | Compliance risk, breach impact unlimited | PII detection, masking, access logging |
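
The "PII detection" fix can start very simply: sample rows from a table and flag columns whose values look like personal data. The sketch below checks only for email-like strings; `EMAIL_RE`, `detect_pii_columns`, and the 0.5 threshold are hypothetical choices, and the pattern is deliberately loose rather than a full address validator. Real detection would cover more PII types than this.

```python
import re

# Deliberately loose email-like pattern (an assumption, not a full validator)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detect_pii_columns(rows, threshold=0.5):
    """Return sorted column names where at least `threshold` of the
    non-empty sampled values contain an email-like string."""
    hits, totals = {}, {}
    for row in rows:
        for col, value in row.items():
            if value is None or value == "":
                continue  # empty values do not count toward the ratio
            totals[col] = totals.get(col, 0) + 1
            if EMAIL_RE.search(str(value)):
                hits[col] = hits.get(col, 0) + 1
    return sorted(c for c in totals if hits.get(c, 0) / totals[c] >= threshold)
```

Columns flagged this way are candidates for a `pii: true` flag and a masking rule in the catalog, subject to review by the data steward.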

Checklist

  • Data classification policy defined (4 levels minimum)
  • Data catalog deployed and populated
  • Every table/dataset has a documented owner
  • Column-level metadata: descriptions, PII flags, masking rules
  • Access controls enforced based on classification
  • Data quality rules defined and automated
  • PII detection and masking in non-production
  • Compliance: retention policies, right-to-delete processes
  • Data stewardship roles assigned per domain
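
Several of these checklist items can be checked mechanically against table metadata. The following sketch assumes metadata is available as a Python dict with the same keys as the YAML example earlier in this guide; `VALID_LEVELS` and `audit_table` are hypothetical names, and the lowercase level strings are an assumption.

```python
# Assumption: classification levels stored as lowercase strings
VALID_LEVELS = {"public", "internal", "confidential", "restricted"}


def audit_table(meta):
    """Return a list of human-readable gaps for one table's metadata dict."""
    gaps = []
    if not meta.get("owner"):
        gaps.append("missing owner")
    if meta.get("classification") not in VALID_LEVELS:
        gaps.append("missing or unknown classification")
    for col in meta.get("columns", []):
        if not col.get("description"):
            gaps.append(f"column {col['name']}: missing description")
        if "pii" not in col:
            gaps.append(f"column {col['name']}: missing PII flag")
        elif col["pii"] and not col.get("masking"):
            gaps.append(f"column {col['name']}: PII without masking rule")
    return gaps
```

Run across every table on a schedule, a check like this yields a list of ownership and documentation gaps to work through, with an empty list meaning the table passes.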

:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For data governance consulting, visit garnetgrid.com.
:::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
