GDPR for Engineers: Building Privacy-Compliant Systems

Implement GDPR compliance as an engineering practice, not a legal checkbox. Covers data minimization, consent management, right to erasure, data portability, privacy by design patterns, and the technical architecture that makes compliance maintainable.

GDPR is not a legal problem you hand to the legal team and forget about. It is an engineering constraint that affects how you design databases, APIs, logging systems, and data pipelines. Every INSERT INTO users statement, every log line that contains an email address, every analytics event that includes a user ID — these are all privacy decisions that engineers make dozens of times a day, usually without realizing it.

This guide covers the engineering patterns that make GDPR compliance a natural part of your architecture rather than a painful retrofit.


The 7 Principles (Plus Data Subject Rights) That Affect Your Architecture

| GDPR Principle | What It Means for Engineering |
| --- | --- |
| Lawful basis | Every piece of personal data needs a legal reason to exist in your system |
| Purpose limitation | Data collected for login cannot be used for marketing without its own lawful basis, typically separate consent |
| Data minimization | Do not collect data you do not need |
| Accuracy | Users can correct their data. Your system must support updates |
| Storage limitation | Data has a shelf life. Auto-delete when no longer needed |
| Integrity & confidentiality | Encrypt at rest and in transit. Access controls everywhere |
| Accountability | You must prove compliance, not just claim it. Audit trails required |
| Rights of data subjects | Users can access, export, correct, and delete their data |
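
The "rights of data subjects" row usually ends up as concrete endpoints. As one illustration of the export right, here is a minimal sketch of assembling a machine-readable export. The function name `build_export` and the `sources` mapping are hypothetical and not part of any library; each source stands in for a lookup against one of your own systems.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable, Dict


def build_export(user_id: str, sources: Dict[str, Callable[[str], Any]]) -> str:
    """Collect one user's data from every registered source into a JSON document.

    `sources` maps a section name to a function that returns that user's
    records from one system (hypothetical; wire in your own lookups).
    """
    payload = {
        "user_id": user_id,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "data": {name: fetch(user_id) for name, fetch in sources.items()},
    }
    # default=str keeps dates and decimals from breaking serialization.
    return json.dumps(payload, indent=2, sort_keys=True, default=str)
```

Registering sources in one mapping makes it harder to forget a system when a new data store is added, which is the usual way exports drift out of date.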

Data Classification: Know What You Have

Before you can protect personal data, you need to know where it lives. This is harder than it sounds because personal data spreads through systems like water finding cracks.

┌─────────────────────────────────────────┐
│  PERSONAL DATA CATEGORIES               │
├─────────────────────────────────────────┤
│  Directly Identifying:                  │
│    Name, Email, Phone, Address          │
│    → Always encrypted at rest           │
│    → Never in logs                      │
│    → Retention: legal minimum only      │
├─────────────────────────────────────────┤
│  Indirectly Identifying:                │
│    User ID, IP Address, Device ID       │
│    → Pseudonymized where possible       │
│    → Hashed in analytics                │
│    → Retention: purpose-dependent       │
├─────────────────────────────────────────┤
│  Sensitive (Special Category):          │
│    Health, Religion, Ethnicity,         │
│    Biometrics, Political opinions       │
│    → Explicit consent required          │
│    → Encrypted with separate keys       │
│    → Strict access controls + audit log │
├─────────────────────────────────────────┤
│  Non-Personal:                          │
│    Aggregated metrics, anonymized data  │
│    → No GDPR restrictions               │
│    → Still good practice to protect     │
└─────────────────────────────────────────┘
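
The categories above can be made machine-checkable. The following is a minimal sketch, assuming a hand-maintained field registry; the field names, the `LOGGABLE` set, and the choice to treat unknown fields as special category are illustrative assumptions, not requirements stated by GDPR.

```python
from enum import Enum


class DataCategory(Enum):
    DIRECT = "directly_identifying"
    INDIRECT = "indirectly_identifying"
    SPECIAL = "special_category"
    NON_PERSONAL = "non_personal"


# Hypothetical field registry; in practice this would live next to the schema.
FIELD_CATEGORIES = {
    "name": DataCategory.DIRECT,
    "email": DataCategory.DIRECT,
    "phone": DataCategory.DIRECT,
    "user_id": DataCategory.INDIRECT,
    "ip_address": DataCategory.INDIRECT,
    "health_notes": DataCategory.SPECIAL,
    "daily_active_users": DataCategory.NON_PERSONAL,
}

# Assumption for this sketch: only these categories may appear in logs.
LOGGABLE = {DataCategory.INDIRECT, DataCategory.NON_PERSONAL}


def classify(field: str) -> DataCategory:
    """Unregistered fields get the strictest category until someone reviews them."""
    return FIELD_CATEGORIES.get(field, DataCategory.SPECIAL)


def strip_for_logging(record: dict) -> dict:
    """Keep only the fields whose category is allowed in logs."""
    return {k: v for k, v in record.items() if classify(k) in LOGGABLE}
```

Defaulting unknown fields to the strictest category means a newly added column is excluded from logs until someone classifies it, rather than leaking by omission.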

Data Inventory Template

| Data Field | Category | Where Stored | Legal Basis | Retention | Deletion Method |
| --- | --- | --- | --- | --- | --- |
| Email | Direct PII | users table, email service, logs | Contract | Account lifetime + 30 days | Hard delete + provider API |
| IP Address | Indirect PII | access logs, CDN logs | Legitimate interest | 90 days | Auto-purge cron |
| Name | Direct PII | users table, billing | Contract | Account lifetime | Hard delete |
| Analytics events | Indirect PII | analytics DB | Consent | 26 months | Automated TTL |
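
An inventory is easier to enforce when it also exists as code that a purge job can read. Below is a minimal sketch under a few assumptions: the `InventoryEntry` type and `is_expired` helper are hypothetical, 26 months is approximated as 26 × 30 days, and entries tied to account lifetime are modeled as having no fixed window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class InventoryEntry:
    field: str
    category: str
    stores: Tuple[str, ...]
    legal_basis: str
    retention: Optional[timedelta]  # None: tied to account lifetime, not a fixed window
    deletion_method: str


INVENTORY = [
    InventoryEntry("ip_address", "Indirect PII", ("access logs", "CDN logs"),
                   "Legitimate interest", timedelta(days=90), "Auto-purge cron"),
    # 26 months approximated as 26 * 30 days for this sketch.
    InventoryEntry("analytics_events", "Indirect PII", ("analytics DB",),
                   "Consent", timedelta(days=26 * 30), "Automated TTL"),
    InventoryEntry("name", "Direct PII", ("users table", "billing"),
                   "Contract", None, "Hard delete"),
]


def is_expired(entry: InventoryEntry, collected_at: datetime, now: datetime) -> bool:
    """True when a record is older than the entry's fixed retention window."""
    if entry.retention is None:
        return False  # handled by account-closure logic instead
    return now - collected_at > entry.retention
```

A scheduled job could loop over `INVENTORY` and delete rows for which `is_expired` is true, so the retention column in the table and the behavior of the system cannot silently diverge.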

Right to Erasure: “Delete My Data”

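This is the most technically challenging GDPR requirement. A user requests deletion, and you must remove their personal data from everywhere it lives: primary databases, backups, logs, analytics, third-party services, and caches. Requests generally have to be answered within one month (Art. 12(3)), and some data may be kept where a legal obligation requires it (Art. 17(3)), for example invoices retained for tax purposes.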

Erasure Architecture

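import hashlib
from datetime import datetime, timezone

# The handler classes and ErasureReport are assumed to be defined elsewhere.


class DataErasureService:
    """Orchestrate user data deletion across all systems."""

    def __init__(self, audit_log):
        # audit_log: any object exposing an async record(**fields) method
        self.audit_log = audit_log
        self.handlers = [
            PrimaryDatabaseHandler(),
            AnalyticsDatabaseHandler(),
            LogPurgeHandler(),
            ThirdPartyServiceHandler(),   # Stripe, Sendgrid, etc.
            SearchIndexHandler(),
            CacheInvalidationHandler(),
            BackupRedactionHandler(),
        ]

    async def erase_user(self, user_id: str, request_id: str) -> ErasureReport:
        report = ErasureReport(user_id=user_id, request_id=request_id)

        for handler in self.handlers:
            try:
                result = await handler.erase(user_id)
                report.add_success(handler.name, result)
            except Exception as e:
                # Record only the exception type: messages can echo personal data.
                report.add_failure(handler.name, type(e).__name__)
                # Do not stop: continue with the other handlers.
                # Failed handlers must be retried separately (e.g. from a job queue).

        # Audit trail (must NOT contain personal data)
        await self.audit_log.record(
            event="data_erasure",
            request_id=request_id,
            # Stable SHA-256 digest. The built-in hash() is salted per process
            # for strings, so its output cannot be compared across runs.
            user_id_hash=hashlib.sha256(user_id.encode("utf-8")).hexdigest(),
            results=report.summary(),
            timestamp=datetime.now(timezone.utc),
        )

        return report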
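
The orchestrator above depends on two things it does not show: a common handler interface and a retry path for failures. Here is a minimal sketch of both. The `ErasureHandler` protocol, `InMemoryCacheHandler`, and `erase_with_retry` are hypothetical names introduced for illustration, and the cache key format `user:<id>:...` is an assumption.

```python
import asyncio
from typing import Protocol


class ErasureHandler(Protocol):
    """Interface each handler is assumed to satisfy."""
    name: str

    async def erase(self, user_id: str) -> int: ...


class InMemoryCacheHandler:
    """Hypothetical handler: removes a user's entries from a dict-backed cache."""
    name = "cache"

    def __init__(self, cache: dict):
        self.cache = cache

    async def erase(self, user_id: str) -> int:
        prefix = f"user:{user_id}:"
        keys = [k for k in self.cache if k.startswith(prefix)]
        for k in keys:
            del self.cache[k]
        return len(keys)  # number of entries removed


async def erase_with_retry(handler: ErasureHandler, user_id: str,
                           attempts: int = 3, base_delay: float = 0.01) -> int:
    """Call handler.erase, retrying with exponential backoff on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await handler.erase(user_id)
        except Exception:
            if attempt == attempts:
                raise  # give up and let the caller record the failure
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Retrying is only safe when handlers are idempotent: erasing an already-erased user should succeed and report zero removals, as the cache handler does here.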

Soft Delete vs Hard Delete

| Approach | GDPR Compliant? | Use When |
| --- | --- | --- |
| Soft delete (set deleted=true) | ❌ No: data still exists | Only as a temporary step before hard delete |
| Hard delete (remove from DB) | ✅ Yes | Primary databases |
| Anonymize (replace with dummy data) | ✅ Yes | When you need to keep the record structure (e.g., for order history) |
| Crypto shredding (delete encryption key) | ✅ Yes | When data is encrypted per-user |

-- Anonymization example: keep order records, remove personal data
UPDATE orders
SET
  customer_name = 'REDACTED',
  customer_email = 'deleted-' || id || '@redacted.local',
  shipping_address = 'REDACTED',
  phone = NULL,
  anonymized_at = NOW()
WHERE customer_id = $1;

-- Then delete the user record entirely
DELETE FROM users WHERE id = $1;
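
The two statements above should succeed or fail together; otherwise a crash between them leaves a deleted user with identifiable orders, or the reverse. A minimal sketch of running them in one transaction follows, using Python's built-in `sqlite3` so it is self-contained. The function name is hypothetical, the table layout is assumed from the SQL above, and `CURRENT_TIMESTAMP` and `?` placeholders replace `NOW()` and `$1` because SQLite does not support the latter forms used here.

```python
import sqlite3


def anonymize_and_delete(conn: sqlite3.Connection, user_id: int) -> None:
    """Anonymize a user's orders and delete the user row atomically."""
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the UPDATE and DELETE are applied together or not at all.
    with conn:
        conn.execute(
            """
            UPDATE orders
            SET customer_name = 'REDACTED',
                customer_email = 'deleted-' || id || '@redacted.local',
                shipping_address = 'REDACTED',
                phone = NULL,
                anonymized_at = CURRENT_TIMESTAMP
            WHERE customer_id = ?
            """,
            (user_id,),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

Note that `customer_id` itself stays on the order rows. Whether that is acceptable depends on whether anything else in your systems can still map that ID back to a person.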

Consent Management

from datetime import datetime, timezone


class ConsentManager:
    """Track user consent with full audit trail."""

    PURPOSES = {
        'essential': 'Required for service operation (no consent needed)',
        'analytics': 'Anonymous usage analytics',
        'marketing': 'Email marketing and promotions',
        'personalization': 'Personalized content recommendations',
        'third_party': 'Data sharing with partners',
    }

    def __init__(self, db, current_policy_version: str):
        # db: any async client exposing insert() and query()
        self.db = db
        self.current_policy_version = current_policy_version

    async def record_consent(self, user_id: str, purpose: str,
                              granted: bool, source: str):
        if purpose not in self.PURPOSES:
            raise ValueError(f"Unknown purpose: {purpose}")
        # Append-only: old rows are never updated, so the table is the audit trail.
        await self.db.insert('consent_records', {
            'user_id': user_id,
            'purpose': purpose,
            'granted': granted,
            'source': source,           # 'signup_form', 'settings_page', etc.
            'ip_address': None,         # Do NOT store IP with consent
            'timestamp': datetime.now(timezone.utc),
            'version': self.current_policy_version,
        })

    async def check_consent(self, user_id: str, purpose: str) -> bool:
        # Get most recent consent record for this purpose
        record = await self.db.query(
            'SELECT granted FROM consent_records '
            'WHERE user_id = $1 AND purpose = $2 '
            'ORDER BY timestamp DESC LIMIT 1',
            user_id, purpose
        )
        # No record means no consent; always return a real bool.
        return bool(record and record['granted'])
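
Two behaviors of a consent check are worth pinning down with a small example: a later withdrawal overrides an earlier grant, and the `essential` purpose has no consent record at all, so a plain latest-record lookup would deny it. Here is a minimal in-memory sketch; `ConsentRecord`, `NO_CONSENT_NEEDED`, and `is_allowed` are hypothetical names, and a real system would read from the consent table instead of a list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable

# Assumption: purposes listed here never require a consent record.
NO_CONSENT_NEEDED = {"essential"}


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    purpose: str
    granted: bool
    timestamp: datetime


def is_allowed(records: Iterable[ConsentRecord], user_id: str, purpose: str) -> bool:
    """Latest record wins; no record means no consent (default deny)."""
    if purpose in NO_CONSENT_NEEDED:
        return True
    matching = [r for r in records
                if r.user_id == user_id and r.purpose == purpose]
    if not matching:
        return False
    return max(matching, key=lambda r: r.timestamp).granted
```

Call `is_allowed` at the point where data is used (for example, immediately before sending a marketing email) rather than only at collection time, so a withdrawal takes effect on the next use.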

Privacy by Design: Patterns That Work

| Pattern | Implementation | Example |
| --- | --- | --- |
| Data minimization | Collect only what you need | Do not ask for birthdate if you do not need it |
| Pseudonymization | Replace identifiers with tokens | Use opaque user IDs in analytics, never emails |
| Encryption at rest | Encrypt PII columns or entire tables | AES-256 for PII fields, per-user keys for crypto shredding |
| Purpose binding | Tag data with its purpose at collection time | purpose='authentication' on email, cannot use for marketing |
| Automatic expiry | TTL on data that should not live forever | Log entries expire after 90 days |
| Access controls | Limit who can see what personal data | PII accessible only to specific service accounts |
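
As an illustration of the pseudonymization row, here is a minimal sketch that derives a stable token from an identifier using a keyed hash (HMAC-SHA256) from the Python standard library. The function name `pseudonymize` is hypothetical, and how the secret key is stored and rotated is left as an assumption outside this sketch.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable, keyed token for an identifier.

    The same identifier and key always give the same token, so events can
    still be joined per user, but the token cannot be recomputed from a
    guessed email without the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

A keyed hash is used rather than a plain SHA-256 because emails and sequential IDs are easy to enumerate and hash. Pseudonymized data is still personal data under GDPR as long as the key exists, so the tokens reduce exposure but do not take the data out of scope.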

Implementation Checklist

  • Create a data inventory: list every personal data field, where it is stored, and its legal basis
  • Implement data retention policies: auto-delete data past its purpose
  • Build the erasure pipeline: orchestrate deletion across all systems (DB, logs, third parties)
  • Never log PII directly — use user IDs in logs, never emails or names
  • Implement consent management with full audit trail
  • Encrypt PII at rest (column-level or table-level encryption)
  • Add data classification to your schema documentation
  • Build data export API for right-to-portability (JSON format)
  • Test the erasure pipeline monthly: create test user, request deletion, verify complete removal
  • Train engineers on data classification: what is PII, what requires consent, what must auto-expire
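
For the "never log PII directly" item, a safety net at the logging layer helps catch mistakes. The following is a minimal sketch of a `logging.Filter` that masks email addresses before a record is emitted. The class name and the regular expression are assumptions; the pattern is deliberately simple and will not catch every address format or any other kind of PII.

```python
import logging
import re

# Simplified email pattern; an assumption for this sketch, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace email addresses in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges msg and args, so addresses passed as
        # arguments are redacted too.
        message = record.getMessage()
        record.msg = EMAIL_RE.sub("[REDACTED_EMAIL]", message)
        record.args = ()  # already merged into msg above
        return True  # never drop the record, only rewrite it
```

Attach it with `handler.addFilter(RedactEmailFilter())` on each handler, since filters on a logger do not apply to records propagated from child loggers. Treat it as a backstop, not a substitute for keeping PII out of log calls in the first place.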
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
