
Graph Databases: Neo4j Use Cases & Architecture

When and how to use graph databases in enterprise applications. Covers Neo4j fundamentals, Cypher queries, data modeling, performance tuning, and real-world use cases from fraud detection to recommendation engines.

Graph databases model data as nodes and relationships, not tables and rows. When your queries care about how entities connect to each other, a graph database can outperform a relational database by orders of magnitude, and the gap widens with traversal depth. A 12-hop relationship traversal that takes 30 minutes in PostgreSQL can complete in milliseconds in Neo4j, although the exact figures depend on data volume, fan-out, and indexing.

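The fundamental insight: relational databases store data and reconstruct relationships at query time with JOINs. Graph databases store relationships as first-class citizens. With Neo4j's index-free adjacency, following a relationship is a direct lookup whose cost depends on how many relationships the current node has, not on the total size of the graph. Each JOIN, by contrast, pays for an index lookup or a scan over the whole table.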
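
To make the contrast concrete, here is a minimal Python sketch. The edge list and function names are hypothetical, and it models only the idea (scan per hop versus direct neighbor lookup), not how PostgreSQL or Neo4j are actually implemented.

```python
from collections import defaultdict

# Hypothetical toy data: (source, target) friendship edges.
EDGES = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]


def neighbors_by_scan(edges, node):
    """Join-style lookup without an index: every hop scans the whole edge table."""
    return [dst for src, dst in edges if src == node]


def build_adjacency(edges):
    """Graph-style storage: each node keeps direct references to its neighbors."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    return adjacency


def neighbors_by_adjacency(adjacency, node):
    """One lookup, then work proportional to the node's own degree."""
    return adjacency.get(node, [])


adjacency = build_adjacency(EDGES)
print(neighbors_by_scan(EDGES, "alice"))
print(neighbors_by_adjacency(adjacency, "alice"))
```

Both calls return the same neighbors; the difference is that the scan touches every edge on every hop, while the adjacency lookup touches only the edges of the node being expanded.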


When to Use a Graph Database

| Use Case | Why Graph Wins | Complexity |
| --- | --- | --- |
| Social networks | “Friends of friends who like X” (multi-hop traversal) | Medium |
| Fraud detection | Ring detection, suspicious transaction patterns | High |
| Recommendation engines | “Users who bought X also bought Y” (collaborative filtering) | Medium |
| Knowledge graphs | Entity relationships, semantic search | High |
| Network/IT infrastructure | Dependency mapping, impact analysis | Medium |
| Supply chain | Track goods across multi-tier supplier networks | Medium |
| Authorization | Role hierarchies, permission inheritance | Low-Medium |

The Test: When SQL Gets Painful

If your SQL has self-joins, recursive CTEs, or you’re implementing a BFS/DFS algorithm in application code, a graph database is likely a better fit.

-- SQL: Find all friends-of-friends (painful)
SELECT DISTINCT f2.friend_id
FROM friendships f1
JOIN friendships f2 ON f1.friend_id = f2.user_id
WHERE f1.user_id = 'alice'
  AND f2.friend_id != 'alice'
  AND f2.friend_id NOT IN (
    SELECT friend_id FROM friendships WHERE user_id = 'alice'
  );

-- 3 hops? Add another JOIN. 4 hops? Another. N hops? Recursive CTE.
-- Performance degrades exponentially with depth.
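
For comparison, this is roughly what the "BFS in application code" workaround looks like. It is a sketch over a hypothetical in-memory friendship map (`FRIENDS`), standing in for rows you would otherwise fetch from the `friendships` table one hop at a time.

```python
from collections import deque

# Hypothetical friendship data, stored symmetrically.
FRIENDS = {
    "alice": {"bob", "erin"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
    "erin": {"alice"},
}


def within_hops(graph, start, max_hops):
    """Return {person: hop_distance} for everyone reachable in 1..max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the depth limit
        for friend in graph.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]
    return distance


# Friends-of-friends: exactly two hops away, so direct friends are excluded.
fof = sorted(p for p, d in within_hops(FRIENDS, "alice", 2).items() if d == 2)
print(fof)
```

Changing the depth is a one-argument change here, but each expansion would be a database round trip or a growing JOIN in a real SQL-backed application.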

Performance Comparison at Depth

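| Query Depth | SQL (PostgreSQL) | Graph (Neo4j) |
| --- | --- | --- |
| 1 hop | <10 ms | <5 ms |
| 2 hops | 10-50 ms | <5 ms |
| 3 hops | 100-500 ms | <10 ms |
| 4 hops | 1-10 seconds | <10 ms |
| 6 hops | 30+ seconds (may time out) | <50 ms |
| 12 hops | Not feasible | <100 ms |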

Neo4j Fundamentals

Data Model

Nodes:       (Person {name: "Alice", age: 30})
             (Company {name: "Acme Corp"})

Relationships: (Alice)-[:WORKS_AT {since: 2022}]->(Acme)
               (Alice)-[:FRIENDS_WITH]->(Bob)

Labels:      Person, Company, Project (like table names)
Properties:  Key-value pairs on nodes and relationships
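
The same property-graph model can be sketched with two small Python classes. The class and field names below are illustrative only and are not part of any Neo4j API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                  # e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str                                    # e.g. "WORKS_AT"
    end: Node
    properties: dict = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice", "age": 30})
acme = Node({"Company"}, {"name": "Acme Corp"})
works_at = Relationship(alice, "WORKS_AT", acme, {"since": 2022})

print(works_at.start.properties["name"], works_at.type, works_at.end.properties["name"])
```

Note that the relationship carries its own type, direction, and properties, which is what distinguishes this model from a foreign key column.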

Cypher Query Language

// Create nodes and relationships
CREATE (alice:Person {name: "Alice", role: "Engineer"})
CREATE (bob:Person {name: "Bob", role: "Manager"})
CREATE (acme:Company {name: "Acme Corp", industry: "Tech"})
CREATE (alice)-[:WORKS_AT {since: 2022, team: "Platform"}]->(acme)
CREATE (bob)-[:WORKS_AT {since: 2020, team: "Platform"}]->(acme)
CREATE (alice)-[:REPORTS_TO]->(bob)

// Query: Find Alice's coworkers
MATCH (alice:Person {name: "Alice"})-[:WORKS_AT]->(company)<-[:WORKS_AT]-(coworker)
RETURN coworker.name, company.name

// Query: Friends of friends (2 hops — trivial in Cypher)
MATCH (me:Person {name: "Alice"})-[:FRIENDS_WITH*2]-(fof:Person)
WHERE fof <> me
  AND NOT (me)-[:FRIENDS_WITH]-(fof)  // exclude direct friends, matching the SQL version
RETURN DISTINCT fof.name

// Query: Shortest path between two people
MATCH path = shortestPath(
  (a:Person {name: "Alice"})-[:FRIENDS_WITH*..6]-(b:Person {name: "Zara"})
)
RETURN path, length(path)
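
Conceptually, a bounded shortest-path search over an unweighted graph is a breadth-first search that remembers each node's parent. The sketch below assumes a hypothetical in-memory graph (`GRAPH`) and is not a description of how Neo4j implements `shortestPath` internally.

```python
from collections import deque

# Hypothetical undirected friendship graph.
GRAPH = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Zara"],
    "Carol": ["Alice", "Dan"],
    "Dan": ["Carol", "Zara"],
    "Zara": ["Bob", "Dan"],
}


def shortest_path(graph, start, goal, max_hops=6):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # respect the upper bound, like *..6 in the Cypher query
        for neighbor in graph.get(node, ()):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start, then reverse.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append((neighbor, depth + 1))
    return None


path = shortest_path(GRAPH, "Alice", "Zara")
print(path, len(path) - 1)
```

The hop count is the number of nodes minus one, which corresponds to `length(path)` in the Cypher version.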

Real-World Use Cases

Fraud Detection

Detect circular payment patterns and suspicious clusters:

// Find circular money flows (potential money laundering)
MATCH path = (a:Account)-[:TRANSFERRED_TO*3..6]->(a)
WHERE ALL(t IN relationships(path) WHERE t.amount > 10000)
RETURN path, 
       reduce(total = 0, t IN relationships(path) | total + t.amount) AS total_flow

// Find accounts with unusually dense connections
MATCH (a:Account)-[t:TRANSFERRED_TO]->(b:Account)
WITH a, COUNT(DISTINCT b) AS connections, SUM(t.amount) AS total
WHERE connections > 50 AND total > 1000000
RETURN a.id, connections, total
ORDER BY connections DESC
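
The ring-detection idea can be sketched in plain Python as a depth-limited search for cycles. The transfer data is made up, and unlike the Cypher query this sketch only reports simple cycles (no repeated accounts) and reports each ring once rather than once per starting account.

```python
# Hypothetical transfers: (from_account, to_account, amount).
TRANSFERS = [
    ("A", "B", 15000),
    ("B", "C", 20000),
    ("C", "A", 12000),
    ("C", "D", 500),
    ("D", "A", 30000),
]


def find_cycles(transfers, min_len=3, max_len=6, min_amount=10000):
    """Return (accounts, total_flow) for transfer cycles of min_len..max_len hops.

    Only transfers above min_amount are followed. Each cycle is reported once,
    rooted at its alphabetically smallest account.
    """
    outgoing = {}
    for src, dst, amount in transfers:
        if amount > min_amount:
            outgoing.setdefault(src, []).append((dst, amount))

    cycles = []

    def walk(start, node, path, total):
        for dst, amount in outgoing.get(node, []):
            if dst == start and len(path) >= min_len:
                cycles.append((path + [start], total + amount))
            elif dst > start and dst not in path and len(path) < max_len:
                walk(start, dst, path + [dst], total + amount)

    for start in sorted(outgoing):
        walk(start, start, [start], 0)
    return cycles


print(find_cycles(TRANSFERS))
```

With the sample data, the only ring is A to B to C and back to A; the small C to D transfer is filtered out before the search starts.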

Recommendation Engine

// Collaborative filtering: "People who bought X also bought..."
MATCH (user:Customer {id: "cust_123"})-[:PURCHASED]->(product:Product)
      <-[:PURCHASED]-(other:Customer)-[:PURCHASED]->(rec:Product)
WHERE NOT (user)-[:PURCHASED]->(rec)
// Score = number of distinct customers who overlap with the user and bought rec
RETURN rec.name, COUNT(DISTINCT other) AS score
ORDER BY score DESC
LIMIT 10
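
The scoring logic of the collaborative-filtering query can be sketched in Python as follows. The purchase data is hypothetical, and the score here is the number of distinct customers who share at least one purchase with the user and also bought the candidate product.

```python
from collections import Counter

# Hypothetical purchase history: customer id -> set of product names.
PURCHASES = {
    "cust_123": {"laptop", "mouse"},
    "cust_200": {"laptop", "keyboard", "monitor"},
    "cust_300": {"mouse", "keyboard"},
    "cust_400": {"desk"},
}


def recommend(purchases, user, limit=10):
    """Rank products the user has not bought by how many similar customers bought them."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user or not (owned & items):
            continue  # skip the user and customers with no purchases in common
        for item in items - owned:
            scores[item] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


print(recommend(PURCHASES, "cust_123"))
```

In a graph database the same overlap is found by walking PURCHASED relationships out from the user and back, rather than iterating over every customer.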

// Content-based: Similar products by shared attributes
MATCH (p:Product {id: "prod_456"})-[:IN_CATEGORY]->(cat)<-[:IN_CATEGORY]-(similar)
WHERE similar <> p
OPTIONAL MATCH (similar)<-[r:PURCHASED]-()
RETURN similar.name, similar.price, COUNT(r) AS popularity
ORDER BY popularity DESC
LIMIT 10

Access Control / Authorization

// Does Alice have access to this document through any permission path?
MATCH path = (user:User {name: "Alice"})-[:MEMBER_OF|HAS_ROLE|INHERITS*1..5]->
             (perm)-[:GRANTS_ACCESS]->(doc:Document {id: "doc_789"})
RETURN COUNT(path) > 0 AS has_access

// What can Alice access? (full permission tree)
MATCH (user:User {name: "Alice"})-[:MEMBER_OF|HAS_ROLE|INHERITS*1..5]->
      (perm)-[:GRANTS_ACCESS]->(resource)
RETURN DISTINCT resource.name, resource.type, labels(resource)
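
The permission check amounts to a bounded traversal over a fixed set of relationship types, followed by a lookup of what each reached node grants. A minimal Python sketch, with a hypothetical hierarchy (`EDGES`, `GRANTS`) standing in for the graph:

```python
# Hypothetical permission graph: node -> list of (relationship type, target).
EDGES = {
    "Alice": [("MEMBER_OF", "Engineering")],
    "Engineering": [("HAS_ROLE", "Editor")],
    "Editor": [("INHERITS", "Viewer")],
}
# Hypothetical GRANTS_ACCESS edges: permission node -> document ids.
GRANTS = {"Viewer": {"doc_789"}, "Editor": {"doc_100"}}
ALLOWED = {"MEMBER_OF", "HAS_ROLE", "INHERITS"}


def accessible(user, max_hops=5):
    """Collect every resource granted by a node reachable in 1..max_hops."""
    resources = set()
    frontier, seen = [user], {user}
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for rel_type, target in EDGES.get(node, []):
                if rel_type in ALLOWED and target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
                    resources |= GRANTS.get(target, set())
        frontier = next_frontier
    return resources


def has_access(user, doc_id, max_hops=5):
    return doc_id in accessible(user, max_hops)


print(has_access("Alice", "doc_789"), sorted(accessible("Alice")))
```

The hop limit plays the same role as `*1..5` in the Cypher pattern: it keeps the check bounded even if the hierarchy contains cycles.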

Data Modeling Best Practices

1. Relationships Are First-Class Citizens

// ❌ Don't store relationships as properties
CREATE (a:Person {friends: ["bob", "charlie"]})

// ✅ Model them as graph relationships (a, b, c are Person nodes bound by an earlier MATCH)
CREATE (a)-[:FRIENDS_WITH {since: 2023}]->(b)
CREATE (a)-[:FRIENDS_WITH {since: 2024}]->(c)

2. Choose Between Embedding and Connecting

| Approach | Use When | Example |
| --- | --- | --- |
| Embed (properties) | Few values, rarely queried independently | skills: ["Python", "Go"] |
| Connect (nodes) | Many values, shared across entities, queried independently | (Person)-[:HAS_SKILL]->(Skill) |

// Embed: few values, rarely queried independently
CREATE (p:Person {name: "Alice", skills: ["Python", "Go", "SQL"]})

// Connect: many values, queried independently, shared across nodes
CREATE (p:Person {name: "Alice"})
CREATE (s:Skill {name: "Python"})
CREATE (p)-[:HAS_SKILL {level: "expert", years: 8}]->(s)

3. Use Relationship Types Liberally

// ❌ Generic relationship with type property
(a)-[:INTERACTED {type: "purchased", date: "2026-01-15"}]->(b)

// ✅ Specific relationship types (faster traversal)
(a)-[:PURCHASED {date: "2026-01-15"}]->(b)
(a)-[:VIEWED {date: "2026-01-14"}]->(b)
(a)-[:WISHLISTED {date: "2026-01-10"}]->(b)
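
One way to see why specific types help: if relationships are grouped by type, a traversal can jump straight to the relevant group instead of inspecting a property on every relationship. The Python sketch below models only that idea with hypothetical data; Neo4j's actual storage format is more involved.

```python
from collections import defaultdict

# Hypothetical interaction events: (source, kind, target).
EVENTS = [
    ("alice", "PURCHASED", "headphones"),
    ("alice", "VIEWED", "laptop"),
    ("alice", "VIEWED", "headphones"),
    ("alice", "WISHLISTED", "camera"),
]

# Generic modeling: one list per node, with the kind stored as a property.
generic = defaultdict(list)
for src, kind, dst in EVENTS:
    generic[src].append({"type": kind, "target": dst})

# Typed modeling: relationships grouped by (node, relationship type).
typed = defaultdict(list)
for src, kind, dst in EVENTS:
    typed[(src, kind)].append(dst)


def purchased_generic(node):
    """Must inspect every relationship on the node and filter by property."""
    return [r["target"] for r in generic.get(node, []) if r["type"] == "PURCHASED"]


def purchased_typed(node):
    """Goes directly to the group for the requested type."""
    return typed.get((node, "PURCHASED"), [])


print(purchased_generic("alice"), purchased_typed("alice"))
```

Both functions return the same result; the generic version examines all four of Alice's relationships, while the typed version touches only the one purchase.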

Performance Optimization

Indexes

// Unique constraint + index
CREATE CONSTRAINT FOR (p:Person) REQUIRE p.id IS UNIQUE;

// Composite index for lookup patterns
CREATE INDEX FOR (p:Product) ON (p.category, p.status);

// Full-text search index
CREATE FULLTEXT INDEX productSearch FOR (p:Product) ON EACH [p.name, p.description];
CALL db.index.fulltext.queryNodes("productSearch", "wireless headphones") 
YIELD node, score
RETURN node.name, score LIMIT 10;

Query Optimization Rules

| Rule | Bad | Good |
| --- | --- | --- |
| Bound depth | MATCH (a)-[*]->(b) | MATCH (a)-[*1..5]->(b) |
| Avoid cartesian products | MATCH (a:Person), (b:Product) | MATCH (a:Person)-[:PURCHASED]->(b:Product) |
| Use PROFILE | Guess at performance | PROFILE MATCH ... RETURN ... |
| Filter early | Filter after collecting all data | Use WHERE in the MATCH clause |
| Use parameters | Inline values (no plan caching) | $userId parameter (plan caching) |
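
The plan-caching rule can be illustrated with a toy cache keyed by query text. This is a simplified model with a hypothetical `PlanCache` class, not Neo4j's planner: the point is only that inlined literals produce a different query string for every value, while a parameterized query keeps one string.

```python
class PlanCache:
    """Toy stand-in for a query planner cache keyed by query text."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def run(self, query, params=None):
        # A new query string means a new (expensive) planning step.
        if query not in self.plans:
            self.compilations += 1
            self.plans[query] = f"plan#{self.compilations}"
        return self.plans[query]


# Inline values: three distinct query strings, three compilations.
inline = PlanCache()
for user_id in ("u1", "u2", "u3"):
    inline.run(f"MATCH (u:User {{id: '{user_id}'}}) RETURN u")

# Parameterized: one query string, reused for every value.
parameterized = PlanCache()
for user_id in ("u1", "u2", "u3"):
    parameterized.run("MATCH (u:User {id: $userId}) RETURN u", {"userId": user_id})

print(inline.compilations, parameterized.compilations)
```

Parameters also keep user-supplied values out of the query text, which avoids injection problems in addition to helping plan reuse.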

Neo4j vs Alternatives

| Feature | Neo4j | Amazon Neptune | ArangoDB | JanusGraph |
| --- | --- | --- | --- | --- |
| Query language | Cypher | Gremlin/SPARQL/openCypher | AQL | Gremlin |
| Hosting | Self-hosted + Aura (cloud) | AWS managed | Self-hosted + cloud | Self-hosted |
| ACID | Full | Full | Full | Eventual (configurable) |
| Scalability | Fabric (sharding) | Auto-scaling | Native sharding | Distributed |
| Visualization | Neo4j Browser (excellent) | Limited | Built-in | Third-party |
| Ecosystem | Largest, most mature | AWS integrated | Multi-model | Open-source |
| Best for | Most use cases, richest community | AWS-native shops | Multi-model needs | JVM/Hadoop ecosystems |

When NOT to Use a Graph Database

| Scenario | Better Alternative |
| --- | --- |
| Simple CRUD with no relationship queries | PostgreSQL, MySQL |
| Tabular analytics and aggregations | Data warehouse (Snowflake, BigQuery) |
| High-volume writes with minimal reads | Time-series DB (TimescaleDB), append-only store |
| Your “graph” is really just a tree | Nested sets or materialized paths in SQL |
| Full-text search as primary use case | Elasticsearch |
| Key-value lookups only | Redis, DynamoDB |

Graph Database Market Comparison

| Database | License | Hosting | Query Language | Best For |
| --- | --- | --- | --- | --- |
| Neo4j | Community (free) / Enterprise | Self-hosted, AuraDB (cloud) | Cypher | General-purpose graph, knowledge graphs |
| Amazon Neptune | Proprietary | AWS only | Gremlin, SPARQL, openCypher | AWS-native, RDF support |
| ArangoDB | Apache 2.0 through v3.11; BSL 1.1 from v3.12 | Self-hosted, ArangoDB Cloud | AQL | Multi-model (graph + document + key-value) |
| TigerGraph | Community / Enterprise | Self-hosted, TigerGraph Cloud | GSQL | Large-scale analytics, deep link traversal |
| Dgraph | Apache 2.0 | Self-hosted, Dgraph Cloud | DQL (GraphQL-like) | GraphQL-native applications |

When to Start with a Graph Database

Start with a graph database from day one when:

  • Your core value proposition depends on relationships (social networks, recommendation engines, fraud detection)
  • You need real-time traversals at depth 3+ (friends-of-friends, supply chain tracing)
  • Your schema evolves frequently and relationships are first-class entities
  • You are building a knowledge graph that connects disparate data sources

Getting Started Checklist

  • Identified the core entities (nodes) and their relationships
  • Drawn the data model as a whiteboard graph
  • Validated that traversal queries are a primary access pattern
  • Compared performance at depth: SQL vs graph on your data
  • Chosen between Neo4j Aura (managed) and self-hosted
  • Created unique constraints on primary identifiers
  • Bounded all variable-length path queries
  • Used specific relationship types (not generic with type property)
  • Populated test data and benchmarked critical queries with PROFILE
  • Decided embed vs connect for each property
  • Set up APOC library for advanced procedures

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For database architecture consulting, visit garnetgrid.com. :::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
