80% of ERP project delays are caused by data migration. Not because the technical migration is hard, but because nobody profiled the source data, nobody documented the mapping rules, and nobody validated the results until go-live week. This playbook prevents that by giving you an explicit phase-gate process with validation criteria at every step.
The golden rule of ERP migration: data cleansing happens in the source system, not during migration. If you try to fix data quality issues in your ETL scripts, they’ll grow in complexity until they become unmaintainable. Clean the source first, then migrate clean data.
## Migration Phases
- Phase 1: Data Profiling → Understand what you have
- Phase 2: Field Mapping → Define source → target transformations
- Phase 3: Data Cleansing → Fix quality issues before migration
- Phase 4: Migration Development → Build ETL scripts/tools
- Phase 5: Mock Migration → Run full migration in test environment
- Phase 6: Validation → Verify accuracy and completeness
- Phase 7: Cutover → Execute production migration
## Phase Timeline

| Phase | Duration | Key Deliverable | Gate Criteria |
|---|---|---|---|
| Profiling | 2-3 weeks | Data quality report | Business sign-off on scope |
| Mapping | 3-4 weeks | Signed mapping document | All fields mapped or excluded |
| Cleansing | 3-6 weeks | Clean source data | Quality KPIs met |
| Development | 4-6 weeks | ETL scripts + unit tests | All mappings implemented |
| Mock Migration | 2-3 weeks (×3 mocks) | Reconciliation report | Row counts + financials match |
| Validation | 1-2 weeks | Business sign-off | SME spot-checks pass |
| Cutover | 1 weekend | Production system live | Post-live validation passes |
## Phase 1: Data Profiling
Before mapping anything, understand your source data.
```sql
-- Profile a source table
SELECT
    'customers' AS table_name,
    COUNT(*) AS total_rows,
    COUNT(DISTINCT customer_id) AS unique_ids,
    SUM(CASE WHEN email IS NULL THEN 1 ELSE 0 END) AS null_emails,
    SUM(CASE WHEN phone IS NULL THEN 1 ELSE 0 END) AS null_phones,
    MIN(created_date) AS earliest_record,
    MAX(created_date) AS latest_record,
    SUM(CASE WHEN status = 'inactive' THEN 1 ELSE 0 END) AS inactive_count
FROM source_customers;
```
### Profiling Checklist

| Check | What to Look For | Red Flag |
|---|---|---|
| Row count | Baseline for validation | Millions more than expected |
| NULL rates per column | Mandatory fields that are empty | > 50% NULL on required fields |
| Duplicate keys | IDs that aren’t actually unique | Any duplicates on PK |
| Date ranges | Historical data you may not need | Records from 1900 or future dates |
| Orphan records | Child records without parent | > 1% orphan rate |
| Encoding issues | Unicode, special characters | Mojibake (garbled text) |
| Status codes | What values exist vs what target accepts | Unmapped status values |
| Data volume per entity | How many records per category | One category with 90% of data |
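Several of the checks above can be scripted once and rerun against every source table. A minimal sketch in Python, assuming rows have been loaded as dicts (the `customer_id` and `email` names are illustrative, not from any specific ERP):

```python
from collections import Counter

def profile_rows(rows, key_field, required_fields):
    """Run basic profiling checks: row count, duplicate keys, NULL rates."""
    total = len(rows)
    # Duplicate keys: any key appearing more than once is a red flag.
    key_counts = Counter(r[key_field] for r in rows)
    duplicates = {k: c for k, c in key_counts.items() if c > 1}
    # NULL rate per required field (None or empty string counts as missing).
    null_rates = {
        f: (sum(1 for r in rows if not r.get(f)) / total if total else 0.0)
        for f in required_fields
    }
    return {"total_rows": total, "duplicate_keys": duplicates, "null_rates": null_rates}

rows = [
    {"customer_id": 1, "email": "a@x.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 2, "email": "b@x.com"},  # duplicate key
]
report = profile_rows(rows, "customer_id", ["email"])
```

Running this per entity gives you the baseline numbers the validation phase will reconcile against later.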
### Entity Prioritization

| Entity | Dependencies | Migration Order | Risk Level |
|---|---|---|---|
| Chart of accounts | None (foundation) | 1st | Low |
| Customers | None | 2nd | Medium |
| Vendors/Suppliers | None | 3rd | Medium |
| Products/Items | Categories, UoM | 4th | Medium |
| Open purchase orders | Vendors, Products | 5th | High |
| Open sales orders | Customers, Products | 6th | High |
| Open invoices (AR/AP) | Customers, Vendors | 7th | Critical |
| General ledger balances | Chart of accounts | 8th | Critical |
| Transaction history | All of the above | Last (if at all) | High |
## Phase 2: Field Mapping

### Mapping Document Template

| Source Table | Source Field | Target Table | Target Field | Transform | Validation | Notes |
|---|---|---|---|---|---|---|
| customers | cust_id | Customer | CustomerAccount | Direct map | NOT NULL, unique | PK |
| customers | cust_name | Customer | Name | TRIM + PROPER CASE | Max 100 chars | |
| customers | email | Customer | PrimaryEmail | LOWER + TRIM | Validate format | |
| customers | phone | Customer | Phone | Format to E.164 | Strip non-digits first | |
| customers | status | Customer | Status | CASE: A→Active, I→Inactive | Valid values only | Default: Active |
| customers | credit_limit | Customer | CreditLimit | Convert to target currency | NULL → 0.00, ≥ 0 | |
| orders | order_date | SalesOrder | OrderDate | Convert timezone to UTC | Not in future | |
| orders | total | SalesOrder | TotalAmount | 2 decimal places | Validate > 0 | |
```python
# Common transformation functions
import re

def transform_customer_name(name: str) -> str:
    """Standardize customer names."""
    if not name:
        return "UNNAMED CUSTOMER"
    return name.strip().title()[:100]  # Trim, proper case, max length

def transform_phone(phone: str) -> str:
    """Convert to E.164 format."""
    digits = re.sub(r'\D', '', phone or '')
    if len(digits) == 10:
        return f"+1{digits}"  # US default for bare 10-digit numbers
    return f"+{digits}"  # 11-digit US and international numbers pass through

def map_status(source_status: str) -> str:
    """Map source status codes to target."""
    mapping = {
        'A': 'Active', 'ACTIVE': 'Active',
        'I': 'Inactive', 'INACTIVE': 'Inactive',
        'S': 'Suspended', 'D': 'Deleted',
    }
    return mapping.get((source_status or '').upper(), 'Active')
```
### Handling Unmapped Data

| Scenario | Strategy | Example |
|---|---|---|
| Source field has no target | Park in _legacy table | misc_field_3 → legacy reference table |
| Target field has no source | Set default value | CurrencyCode → ‘USD’ |
| Values don’t match target enum | Map or reject | Source ‘X’ status has no target equivalent |
| Source has multiple values, target has one | Merge with priority rules | 3 phone fields → pick most recent |
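The merge-with-priority-rules strategy can be expressed as a small function. A sketch, assuming each candidate value carries a last-updated date (the tuple shape is an assumption for illustration, not a real source schema):

```python
from datetime import date

def pick_primary_phone(phones):
    """Merge multiple source phone fields into one target field.

    Priority rule: the most recently updated non-empty value wins.
    `phones` is a list of (value, last_updated) tuples.
    """
    candidates = [(value, updated) for value, updated in phones if value]
    if not candidates:
        return None  # no usable value; target field stays empty
    return max(candidates, key=lambda p: p[1])[0]

primary = pick_primary_phone([
    ("555-0100", date(2019, 3, 1)),
    ("", date(2024, 1, 1)),        # empty values never win, however recent
    ("555-0199", date(2023, 6, 15)),
])
```

Whatever rule you choose, record it in the mapping document so SMEs can verify the winner during spot-checks.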
## Phase 3: Data Cleansing

### Common Cleansing Tasks

| Issue | Frequency | Fix Strategy | Where to Fix |
|---|---|---|---|
| Duplicate customer records | 5-15% | Deduplicate merge rules | Source system |
| Invalid email formats | 10-20% | Regex validation + manual review | Source system |
| Missing mandatory fields | Varies | Default values or manual entry | Source system |
| Inconsistent addresses | 20-40% | Address standardization service | ETL or source |
| Orphaned transactions | 2-5% | Re-link or exclude | Source system |
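For email validation, the goal is to flag bad records for manual cleanup in the source system, not to repair them in ETL. A sketch using a deliberately simple pattern (the regex and field names are assumptions; this is not a full RFC 5322 validator):

```python
import re

# Simple pattern: something@something.something, no spaces or extra @.
# Anything it rejects goes to a manual-review list in the source system.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def flag_invalid_emails(rows):
    """Return (row_id, email) pairs that need manual cleanup at the source."""
    return [
        (r["customer_id"], r["email"])
        for r in rows
        if r["email"] and not EMAIL_RE.match(r["email"])
    ]

bad = flag_invalid_emails([
    {"customer_id": 1, "email": "ok@example.com"},
    {"customer_id": 2, "email": "missing-at-sign.com"},
    {"customer_id": 3, "email": None},  # NULLs are tracked as a separate issue
])
```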
## Phase 5: Mock Migration

Run at least 3 full mock migrations before cutover.

| Mock | Purpose | Timing | Success Criteria |
|---|---|---|---|
| Mock 1 | Test ETL scripts, identify mapping errors | 8 weeks before go-live | Scripts execute without errors |
| Mock 2 | Fix issues from Mock 1, validate with business | 4 weeks before go-live | Row counts within 0.5% |
| Mock 3 | Dress rehearsal — full cutover simulation | 2 weeks before go-live | Row counts exact, financials to the penny |
### Mock Migration Runbook

**Pre-Mock:**

- [ ] Refresh test environment to clean state
- [ ] Take snapshot of source data (point-in-time)
- [ ] Notify stakeholders of mock schedule

**Execute:**

- [ ] Run ETL scripts in sequence (master data → transactions)
- [ ] Record execution time for each entity
- [ ] Log all errors and warnings to tracking spreadsheet

**Validate:**

- [ ] Row count reconciliation (source vs target)
- [ ] Financial reconciliation (AR, AP, GL balances)
- [ ] Business spot-checks (50 random records per entity)
- [ ] Integration tests (can users create orders against migrated data?)

**Post-Mock:**

- [ ] Document all issues found
- [ ] Estimate remediation effort
- [ ] Update migration runbook with lessons learned
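The 50-record business spot-check works best with a seeded sampler: consecutive mocks then re-check the same records, so a fix made after Mock 1 can be verified in Mock 2. A possible sketch (function and parameter names are illustrative):

```python
import random

def sample_for_spot_check(record_ids, n=50, seed=42):
    """Pick a reproducible random sample of record IDs for SME review.

    A fixed seed means every mock run checks the same records,
    so remediation from one mock can be verified in the next.
    """
    rng = random.Random(seed)
    pool = sorted(record_ids)  # sort for determinism across runs
    return rng.sample(pool, min(n, len(pool)))

sample = sample_for_spot_check(range(1, 1001), n=5)
```

Change the seed only when you deliberately want a fresh sample, and note the change in the runbook.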
## Phase 6: Validation

### Reconciliation Report
```sql
-- Row count reconciliation
SELECT
    'Customers' AS entity,
    (SELECT COUNT(*) FROM source_customers WHERE status != 'D') AS source_count,
    (SELECT COUNT(*) FROM target.customers) AS target_count,
    (SELECT COUNT(*) FROM source_customers WHERE status != 'D') -
    (SELECT COUNT(*) FROM target.customers) AS difference;

-- Financial reconciliation
SELECT
    'Open AR Balance' AS metric,
    (SELECT SUM(balance) FROM source_ar WHERE status = 'open') AS source_total,
    (SELECT SUM(amount) FROM target.ar_transactions WHERE open = 1) AS target_total;
```
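The counts returned by queries like those above can be fed into a small pass/fail helper that applies the 0.1% row-count gate automatically, so every mock produces the same reconciliation report. A sketch (the helper and its field names are assumptions for illustration):

```python
def reconcile(entity, source_count, target_count, tolerance=0.001):
    """Compare source vs target row counts against a relative tolerance.

    tolerance=0.001 corresponds to the 0.1% row-count gate.
    """
    diff = abs(source_count - target_count)
    rate = diff / source_count if source_count else 0.0
    return {
        "entity": entity,
        "source": source_count,
        "target": target_count,
        "difference": diff,
        "pass": rate <= tolerance,
    }

result = reconcile("Customers", source_count=10_000, target_count=9_995)
```

For financial totals, swap the relative tolerance for an absolute one (for example, a maximum difference of $0.01), since pennies matter regardless of volume.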
### Validation Gates

| Gate | Criteria | Action if Failed |
|---|---|---|
| Row count | Difference < 0.1% of source rows | Investigate, re-migrate |
| Financial totals | Difference < $0.01 | Trace every cent |
| Referential integrity | 0 orphan records | Fix parent records first |
| Business spot-check | 50 random records validated by SME | Fix mapping rules |
| Integration test | Users can create transactions on migrated data | Fix configuration |
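The referential-integrity gate can be checked in application code as well as SQL, which is handy when source and target live in different databases. A sketch, assuming child rows and parent IDs have already been extracted (field names are illustrative):

```python
def find_orphans(child_rows, parent_ids, fk_field):
    """Return child records whose foreign key has no matching parent.

    The gate is strict: any orphan at all fails validation.
    """
    parents = set(parent_ids)  # set lookup keeps this O(children + parents)
    return [r for r in child_rows if r[fk_field] not in parents]

orphans = find_orphans(
    child_rows=[
        {"order_id": 1, "customer_id": 10},
        {"order_id": 2, "customer_id": 99},  # 99 has no parent customer
    ],
    parent_ids=[10, 20],
    fk_field="customer_id",
)
```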
## Phase 7: Cutover Planning

### Cutover Timeline (Example Weekend)

| Time | Activity | Owner | Rollback Point |
|---|---|---|---|
| Friday 6 PM | Freeze source system (read-only) | IT Operations | N/A |
| Friday 7 PM | Final data extract from source | Migration Team | ✅ (restore read-write) |
| Friday 9 PM | Run production migration scripts | Migration Team | ✅ (discard and retry) |
| Saturday 6 AM | Automated validation suite runs | Migration Team | ✅ (restore source) |
| Saturday 9 AM | Business validation (SME spot-checks) | Business Team | ✅ (restore source) |
| Saturday 3 PM | Go/No-Go decision | Steering Committee | ✅ (last rollback point) |
| Sunday | Training day, resolve issues | All | No rollback after this |
| Monday 8 AM | Production go-live | All | Incident response mode |
## Common Pitfalls

| Pitfall | Impact | Prevention |
|---|---|---|
| Not profiling source data | Discover bad data at go-live | Profile in Phase 1, no exceptions |
| Migrating ALL history | 10x migration time | Define cutoff (2-3 years typical) |
| No mock migrations | Untested scripts in production | Minimum 3 mocks |
| Skipping financial reconciliation | Balance discrepancies post go-live | Reconcile to the penny |
| One person owns migration | Single point of failure | Cross-train, document everything |
| Data cleansing during migration | ETL complexity explodes | Clean in source system first |
| No rollback plan | Stuck with bad migration | Document rollback at every step |
:::note[Source]
This guide is derived from operational intelligence at Garnet Grid Consulting. For ERP migration consulting, visit garnetgrid.com.
:::