
ERP Data Migration Playbook

Migrate data to and from ERP systems without disrupting business operations. Covers data mapping, cleansing strategies, cutover planning, parallel running, reconciliation, and the patterns that prevent data migration from becoming a multi-year nightmare.

TL;DR

ERP data migration moves data from legacy systems into the new ERP system and is one of the highest-risk phases of any implementation: done poorly, it disrupts operations long after go-live. This playbook walks through the full process, from discovery and mapping through cleansing, testing, and cutover to post-go-live support.

Why This Matters

ERP data migration is a high-stakes activity that affects the entire organization. According to a Deloitte survey, roughly 80% of ERP implementations miss their cost, schedule, or quality goals, and a poorly executed data migration is one of the most common causes. Bad migrations lead to financial misreporting, lost sales, and customer dissatisfaction. For a mid-sized manufacturer with $500 million in annual revenue, incorrect data migration can easily translate into $1 million or more in losses from rework, stalled orders, and reporting errors.

Core Concepts

Data Migration Overview

Data migration involves moving data from the legacy system to the new ERP system. This process is complex and involves several phases, including discovery and mapping, data cleansing and preparation, development and testing, cutover, and post-go-live support.

Key Terminology

  • Data Migration: The process of moving data from one system to another.
  • ETL (Extract, Transform, Load): A process used to extract data from source systems, transform it into a consistent format, and load it into the target system.
  • Field Mapping: Mapping the fields from the source system to the target system.
  • Data Cleansing: The process of identifying and correcting or removing inaccurate data.
  • Source System: The legacy system from which data is being moved.
  • Target System: The new ERP system into which data is being moved.
  • Data Volume: The amount of data that needs to be migrated.
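The ETL pattern defined above can be sketched in a few lines of pandas. This is a minimal illustration with inline sample data and hypothetical column names; in a real migration the extract would read from the source system and the load would write to a staging database.

```python
import pandas as pd

# Extract: in practice, read from the source system; here, a small inline sample
source = pd.DataFrame({"customer_id": [1, 2], "email": ["A@X.COM", "b@y.com"]})

# Transform: normalize values into the target system's expected format
source["email"] = source["email"].str.lower()

# Load: in practice, write to the target/staging database; here, a stand-in copy
staging = source.copy()
print(staging)
```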

Key Considerations

  • Data Quality: Ensuring that the data being migrated is accurate and consistent.
  • Data Dependencies: Identifying and managing the relationships between data elements.
  • Performance: Ensuring that the migration process does not impact system performance.
  • Business Impact: Minimizing the impact on business operations during the migration.
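Data dependencies also dictate load order: a table must be loaded after every table it references. One way to derive that order is a topological sort, sketched here with Python's standard-library graphlib and a hypothetical dependency map.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each table lists the tables it references,
# which must therefore be loaded first
depends_on = {
    "orders": ["customers", "products"],
    "invoices": ["orders", "customers"],
    "customers": [],
    "products": [],
}

# static_order() yields referenced tables before the tables that need them
load_order = list(TopologicalSorter(depends_on).static_order())
print(load_order)
```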

Diagram: Data Migration Process

graph TB
    A[Discovery & Mapping] --> B[Cleansing & Preparation]
    B --> C[Development & Testing]
    C --> D[Cutover]
    D --> E[Post-Go-Live Support]

Implementation Guide

Phase 1: Discovery & Mapping (Weeks 1-4)

Step 1: Inventory All Source Systems and Data Stores

import pandas as pd

# Example code for inventorying source systems
source_systems = [
    {"name": "CRM", "tables": ["customers", "orders", "products"]},
    {"name": "Legacy ERP", "tables": ["business_partners", "invoices", "transactions"]},
    # Add more source systems as needed
]

# Convert to DataFrame for easier manipulation
df_source_systems = pd.DataFrame(source_systems)
print(df_source_systems)

Step 2: Document Data Relationships and Dependencies

# Example code for documenting data relationships
dependencies = {
    "customers": ["orders", "invoices"],
    "orders": ["products", "invoices"],
    "invoices": ["transactions"],
}

# Example code for creating field-level mapping
field_mappings = {
    "customers": {
        "source_table": "legacy_crm.customers",
        "target_table": "erp.business_partners",
        "field_map": {
            "customer_id": "customer_id",
            "first_name": "first_name",
            "last_name": "last_name",
            "email": "email",
            "phone": "phone",
        },
    },
    # Add more mappings as needed
}

# Example code for identifying data quality issues
data_quality_issues = {
    "customers": ["Duplicate emails", "Invalid phone numbers"],
    "orders": ["Missing order dates", "Invalid product codes"],
    # Add more issues as needed
}

# Example code for defining business rules
business_rules = {
    "customers": ["Transform email to lowercase", "Remove non-numeric characters from phone"],
    "orders": ["Convert order dates to ISO format", "Map product codes to new codes"],
    # Add more rules as needed
}

# Example code for estimating data volumes
data_volumes = {
    "customers": {"current": 10000, "target": 8000},
    "orders": {"current": 50000, "target": 45000},
    # Add more volumes as needed
}

Phase 2: Cleansing & Preparation (Weeks 5-8)

Step 1: Deduplicate Master Data

# Example code for deduplication
df_customers = pd.read_csv("customers.csv")
df_customers.drop_duplicates(subset=["email"], inplace=True)
print(df_customers)

df_vendors = pd.read_csv("vendors.csv")
df_vendors.drop_duplicates(subset=["vendor_id"], inplace=True)
print(df_vendors)

Step 2: Standardize Formats

# Example code for standardizing formats
# Handle missing values before string operations, then format 10-digit numbers
df_customers["phone"] = df_customers["phone"].fillna("").astype(str).str.replace("[^0-9]", "", regex=True)
df_customers["phone"] = df_customers["phone"].apply(lambda x: f"{x[:3]}-{x[3:6]}-{x[6:]}" if len(x) == 10 else x)
print(df_customers)

df_vendors["address"] = df_vendors["address"].str.title()
print(df_vendors)

Step 3: Resolve Orphaned Records

# Example code for resolving orphaned records
df_orders = pd.read_csv("orders.csv")
df_customers = pd.read_csv("customers.csv")

# Drop orders that reference a customer_id missing from the customer master;
# checking the join key after a left merge cannot catch these, because the
# key column itself is never null
df_orders = df_orders[df_orders["customer_id"].isin(df_customers["customer_id"])]
print(df_orders)

Step 4: Archive Historical Data

# Example code for archiving historical data
import os

archive_path = "archive/"
os.makedirs(archive_path, exist_ok=True)
df_customers.to_csv(f"{archive_path}customers_archive.csv", index=False)
print(f"Data archived to {archive_path}customers_archive.csv")

df_orders.to_csv(f"{archive_path}orders_archive.csv", index=False)
print(f"Data archived to {archive_path}orders_archive.csv")

Step 5: Get Business Signoff on Cleansed Data

# Example code for business signoff
def get_business_signoff(df):
    print(df.head())  # show the reviewer a sample of the cleansed data
    signoff = input("Do you approve the data (yes/no)? ")
    return signoff.lower() == "yes"

if get_business_signoff(df_customers):
    print("Business signoff approved.")
else:
    print("Business signoff not approved.")

Phase 3: Development & Testing (Weeks 9-16)

Step 1: Build ETL/Migration Scripts

# Example code for generating ETL SQL. This assumes the cleansed data has
# already been loaded into staging tables; note that DataFrame.to_sql()
# writes to a database and returns None, so it cannot be interpolated
# into SQL text.
def build_etl_script():
    etl_script = """
    INSERT INTO erp.business_partners (customer_id, first_name, last_name, email, phone)
    SELECT customer_id, first_name, last_name, email, phone
    FROM staging.customers;

    INSERT INTO erp.orders (order_id, customer_id, order_date, product_id, quantity)
    SELECT order_id, customer_id, order_date, product_id, quantity
    FROM staging.orders;

    INSERT INTO erp.vendors (vendor_id, name, address, phone)
    SELECT vendor_id, name, address, phone
    FROM staging.vendors;
    """
    return etl_script

etl_script = build_etl_script()
print(etl_script)

Step 2: Mock Migration 1: Load to Test Environment

# Example code for loading to the test environment. DataFrame.to_sql()
# requires a SQLAlchemy connectable, not a plain string.
from sqlalchemy import create_engine

def load_to_test_environment(df_customers, df_orders, df_vendors):
    engine = create_engine("postgresql://user:password@test-db/erp_test")  # example connection string
    df_customers.to_sql("business_partners", con=engine, if_exists="replace", index=False)
    df_orders.to_sql("orders", con=engine, if_exists="replace", index=False)
    df_vendors.to_sql("vendors", con=engine, if_exists="replace", index=False)
    print("Data loaded to test environment.")

load_to_test_environment(df_customers, df_orders, df_vendors)

Step 3: Validate with Business Stakeholders

# Example code for business validation
def validate_with_stakeholders(df_customers, df_orders, df_vendors):
    for df in [df_customers, df_orders, df_vendors]:
        print(df.head())
        feedback = input("Is the data correct (yes/no)? ")
        if feedback.lower() != "yes":
            print("Data validation failed.")
            return False
    print("Data validation successful.")
    return True

if validate_with_stakeholders(df_customers, df_orders, df_vendors):
    print("Data validated successfully.")
else:
    print("Data validation failed.")

Step 4: Mock Migration 2: Full Dress Rehearsal

# Example code for full dress rehearsal
def full_dress_rehearsal(df_customers, df_orders, df_vendors):
    load_to_test_environment(df_customers, df_orders, df_vendors)
    if validate_with_stakeholders(df_customers, df_orders, df_vendors):
        print("Full dress rehearsal completed successfully.")
    else:
        print("Dress rehearsal failed; fix issues and rerun before cutover.")

full_dress_rehearsal(df_customers, df_orders, df_vendors)

Step 5: Performance Test at Production Volumes

# Example code for performance testing: time the load at production volumes
# so slow steps surface before cutover weekend
import time

def performance_test(df_customers, df_orders, df_vendors):
    start = time.perf_counter()
    load_to_test_environment(df_customers, df_orders, df_vendors)
    elapsed = time.perf_counter() - start
    print(f"Performance test completed in {elapsed:.1f} seconds.")

performance_test(df_customers, df_orders, df_vendors)

Step 6: Document Cutover Runbook

# Example code for documenting the cutover runbook
def document_cutover_runbook():
    steps = [
        "Freeze source system",
        "Extract final delta data",
        "Run migration scripts",
        "Reconcile source and target counts",
        "Get business signoff",
        "Go live or rollback",
    ]
    with open("cutover_runbook.txt", "w") as file:
        file.write("Cutover Runbook\n")
        file.write("\n".join(steps) + "\n")
    print("Cutover runbook document created successfully.")

document_cutover_runbook()

Phase 4: Cutover (Weekend)

Step 1: Freeze Source System

# Example code for freezing the source system
def freeze_source_system():
    print("Freezing the source system...")
    # Code to freeze the source system goes here
    print("Source system frozen successfully.")

freeze_source_system()

Step 2: Extract Final Delta Data

# Example code for extracting final delta data
def extract_final_delta_data(df_customers, df_orders, df_vendors):
    df_customers.to_csv("delta_customers.csv", index=False)
    df_orders.to_csv("delta_orders.csv", index=False)
    df_vendors.to_csv("delta_vendors.csv", index=False)
    print("Delta data extracted successfully.")

extract_final_delta_data(df_customers, df_orders, df_vendors)

Step 3: Run Migration Scripts

# Example code for running migration scripts
def run_migration_scripts(df_customers, df_orders, df_vendors):
    # Code to run migration scripts goes here
    print("Migration scripts run successfully.")

run_migration_scripts(df_customers, df_orders, df_vendors)

Step 4: Reconcile: Source Counts = Target Counts

# Example code for reconciliation
def reconcile_counts():
    for entity in ("customers", "orders", "vendors"):
        source = pd.read_csv(f"legacy_{entity}.csv")
        target = pd.read_csv(f"delta_{entity}.csv")
        print(f"{entity.capitalize()} counts match:", len(source) == len(target))

reconcile_counts()

Step 5: Business Validation Signoff

# Example code for business validation signoff
def business_validation_signoff():
    feedback = input("Do you approve the migration (yes/no)? ")
    if feedback.lower() == "yes":
        print("Business validation signoff approved.")
    else:
        print("Business validation signoff not approved.")

business_validation_signoff()

Step 6: Go-Live or Rollback Decision

# Example code for go-live or rollback decision
def go_live_or_rollback():
    feedback = input("Do you want to go live (yes/no)? ")
    if feedback.lower() == "yes":
        print("Go live successful.")
    else:
        print("Rollback initiated.")

go_live_or_rollback()

Phase 5: Hypercare (Weeks 1-4 Post-Go-Live)

Step 1: Daily Reconciliation of Key Entities

# Example code for daily reconciliation
def daily_reconciliation():
    for entity in ("customers", "orders", "vendors"):
        source = pd.read_csv(f"legacy_{entity}.csv")
        target = pd.read_csv(f"current_{entity}.csv")
        print(f"{entity.capitalize()} reconciliation:", len(source) == len(target))

daily_reconciliation()

Step 2: Rapid Issue Triage and Fix

# Example code for rapid issue triage
def rapid_issue_triage():
    issues = ["Customer data missing", "Order quantities incorrect", "Vendor contact details outdated"]
    for issue in issues:
        print(f"Handling {issue}...")
        # Code to handle issues goes here
        print(f"{issue} handled successfully.")

rapid_issue_triage()

Step 3: User Support Escalation Path

# Example code for user support escalation
def user_support_escalation():
    feedback = input("Do you need support (yes/no)? ")
    if feedback.lower() == "yes":
        print("Support initiated.")
    else:
        print("No support needed.")

user_support_escalation()

Step 4: Source System Available for Reference

# Example code for source system availability
def source_system_reference():
    feedback = input("Do you need the source system available for reference (yes/no)? ")
    if feedback.lower() == "yes":
        print("Source system available for reference.")
    else:
        print("Source system not available for reference.")

source_system_reference()

Anti-Patterns

Failing to Define Business Rules

One of the most common mistakes is not defining clear business rules for data transformation. Without these rules, data can be transformed incorrectly, leading to data quality issues and business disruptions.
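One lightweight mitigation is to capture each business rule as a named, testable transformation rather than ad-hoc edits scattered through scripts. A hedged sketch, with illustrative columns and rules:

```python
import pandas as pd

# Hypothetical rule registry: each rule is a named function over a column,
# so it can be reviewed by the business and unit tested independently
RULES = {
    "email": lambda s: s.str.strip().str.lower(),
    "phone": lambda s: s.str.replace(r"[^0-9]", "", regex=True),
}

df = pd.DataFrame({"email": [" Jo@X.com "], "phone": ["(555) 123-4567"]})
for column, rule in RULES.items():
    df[column] = rule(df[column])
print(df)
```

Because each rule is a plain function, a reviewer can approve the registry itself, and a change to one rule never silently alters another.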

Overloading the Source System

Another common mistake is overloading the source system with data migration activities. This can cause system performance issues and data inconsistencies. It is crucial to plan the migration process carefully to minimize the impact on the source system.
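One common mitigation is to extract in fixed-size batches (for example, via the `chunksize` parameter of `pd.read_sql`) so each query stays small and can be throttled to off-peak windows. A minimal sketch, using an in-memory DataFrame as a stand-in for the source table:

```python
import pandas as pd

# Stand-in for the source table; in practice, pd.read_sql(..., chunksize=N)
# streams batches directly from the database
df = pd.DataFrame({"order_id": range(25)})
batch_size = 10

batches = [df.iloc[i:i + batch_size] for i in range(0, len(df), batch_size)]
for batch in batches:
    # In practice: stage the batch, then pause/throttle before the next query
    pass
print(f"{len(batches)} batches of up to {batch_size} rows")
```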

Ignoring Data Quality

Ignoring data quality issues can lead to incorrect data being moved to the target system. This can result in financial misreporting, customer dissatisfaction, and operational disruptions. It is essential to identify and correct data quality issues before the migration process begins.
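A quick profiling pass quantifies quality issues before any migration code is written. The sketch below uses inline sample data and hypothetical metric names; the same checks would run against the real extracts.

```python
import pandas as pd

# Inline sample standing in for a source extract
df = pd.DataFrame({
    "email": ["a@x.com", "a@x.com", None],
    "order_date": ["2024-01-05", "not-a-date", None],
})

# Count duplicates, missing values, and unparseable dates
report = {
    "duplicate_emails": int(df["email"].duplicated(keep=False).sum()),
    "missing_emails": int(df["email"].isna().sum()),
    "bad_dates": int(pd.to_datetime(df["order_date"], errors="coerce").isna().sum()),
}
print(report)
```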

Inadequate Testing

Inadequate testing can lead to issues in the target system that are not discovered until after the go-live. This can cause significant disruptions to business operations. It is crucial to perform thorough testing, including performance testing, to ensure that the migration process is successful.
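Testing is cheapest when post-load checks are automated: compare row counts, key sets, and control totals between source and target after every mock migration, and fail loudly on any mismatch. A minimal sketch with inline sample data:

```python
import pandas as pd

# Inline stand-ins for the source extract and the loaded target table
source = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
target = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})

# Row count, control total, and key-set checks
checks = {
    "row_count": len(source) == len(target),
    "amount_total": source["amount"].sum() == target["amount"].sum(),
    "key_set": set(source["order_id"]) == set(target["order_id"]),
}
assert all(checks.values()), f"Reconciliation failed: {checks}"
print("All reconciliation checks passed.")
```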

Decision Framework

| Criteria | Option A | Option B | Option C |
| --- | --- | --- | --- |
| Data Quality | Thoroughly clean data before migration | Use existing data without cleaning | Clean data after migration |
| Testing | Thoroughly test data migration | Perform basic testing | Skip testing |
| Source System Impact | Minimize impact on source system | Accept heavy load on source system | No impact on source system |
| Business Rules | Define clear business rules | Use default rules | No business rules |

Summary

  • Ensure a comprehensive discovery and mapping process to understand the source systems and data dependencies.
  • Thoroughly clean and prepare the data to ensure accuracy and consistency.
  • Develop and test migration scripts to ensure a successful cutover.
  • Plan the cutover process carefully to minimize disruptions.
  • Provide post-go-live support to address any issues that arise.

Key takeaways:

  • Data quality is critical and must be addressed early in the process.
  • Thorough testing is essential to ensure the success of the migration.
  • Minimizing the impact on the source system is crucial to maintain business operations.
  • Clear business rules and well-defined data mappings are necessary for a successful migration.
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
