Comparing query performance: SQL JOINs vs Cypher traversals on Northwind data.
How SQL vs Cypher resolve relationships:
| Aspect |
SQL (Relational) |
Cypher (Graph) |
Relationship Storage |
Implicit (foreign keys) |
Explicit (stored relationships) |
Join Operation |
Computed at query time using indexes |
Pre-computed, direct pointer traversal |
Performance Scaling |
Degrades with data volume and JOIN count |
Constant time per relationship hop |
Index Dependency |
Critical for JOIN performance |
Less critical for traversals |
Query Comparison 1: Simple Lookup
Business Question: Find all orders for customer "ALFKI"
SQL Approach
How SQL executes this:
1. Look up customer 'ALFKI' using index on customer_id
2. Scan orders table for matching customer_id (or use index)
3. Join the results
Complexity: O(log n) for indexed lookup + O(m) for matching orders
Cypher Approach
How Cypher executes this:
1. Look up customer 'ALFKI' using index on customerID
2. Follow PLACED relationship pointers directly to Order nodes
3. Return results
Complexity: O(log n) for indexed lookup + O(k) where k is number of relationships (constant time per relationship)
For this simple query, both perform similarly because:
Winner: Tie for simple lookups
Query Comparison 2: Multi-Table JOIN
Business Question: Find all products ordered by customer "ALFKI" with their categories
SQL Approach (3 JOINs)
How SQL executes this:
1. Look up customer 'ALFKI'
2. JOIN to orders (scan/index lookup)
3. JOIN to order_details (scan/index lookup)
4. JOIN to products (scan/index lookup)
5. JOIN to categories (scan/index lookup)
6. Apply DISTINCT and ORDER BY
Complexity: Multiple index lookups and hash/merge joins
Cypher Approach
How Cypher executes this:
1. Look up customer 'ALFKI'
2. Traverse PLACED relationships to Orders
3. Traverse CONTAINS relationships to Products
4. Traverse IN_CATEGORY relationships to Categories
5. Apply DISTINCT and ORDER BY
Complexity: Index lookup + direct pointer traversal (constant time per hop)
As JOINs increase, SQL performance degrades while Cypher remains efficient:
| Metric |
SQL |
Cypher |
Tables/Nodes involved |
5 tables |
4 node types |
Join operations |
4 JOINs |
3 traversals |
Index dependencies |
Needs indexes on all join columns |
Only needs index on starting node |
Scaling behavior |
Slower with more data |
Consistent regardless of total data size |
Winner: Cypher - more efficient for multi-hop queries
Query Comparison 3: Variable-Depth Traversal
Business Question: Find the management chain for employee Nancy Davolio
SQL Approach (Recursive CTE)
How SQL executes this:
1. Execute base query to find Nancy
2. Iteratively execute recursive query until no more results
3. Union all results
4. Complex query plan with multiple iterations
Complexity: O(n * d) where n is employees and d is hierarchy depth
Cypher Approach
How Cypher executes this:
1. Find Nancy using index
2. Traverse REPORTS_TO relationships to any depth
3. Return path information
Complexity: O(k) where k is the actual path length
| Metric |
SQL |
Cypher |
Query complexity |
Complex recursive CTE |
Simple pattern with * |
Execution model |
Iterative with unions |
Direct traversal |
Depth handling |
Must iterate for each level |
Handles any depth naturally |
Code readability |
15+ lines |
4 lines |
Winner: Cypher — simpler and more efficient for hierarchical queries (4 lines compared to 15+ lines)
Query Comparison 4: Pattern Matching
Business Question: Find customers who ordered the same products as customer "ALFKI"
SQL Approach
How SQL executes this:
1. Find ALFKI’s orders
2. Find products in those orders
3. Find other order_details with same products
4. Find orders containing those details
5. Find customers who placed those orders
6. Exclude ALFKI
Complexity: 5 JOINs with potential for large intermediate result sets
Cypher Approach
How Cypher executes this:
1. Find ALFKI
2. Traverse to their ordered products
3. Traverse back through other orders to other customers
4. Filter out ALFKI
Complexity: Direct bidirectional traversal
| Metric |
SQL |
Cypher |
Join operations |
5 JOINs |
4 traversals (bidirectional) |
Pattern clarity |
Obscured by JOIN syntax |
Visual pattern matching |
Intermediate results |
Large join products |
Only relevant paths |
Query intent |
Hard to understand |
Clear and readable |
Winner: Cypher - pattern matching is what graphs were built for
Query Comparison 5: Aggregation with Relationships
Business Question: Find the top 5 customers by total order value
For aggregation queries, both databases perform similarly because:
-
Both must scan all relevant data
-
Both can use indexes for initial filtering
-
Aggregation computation is similar
Winner: Tie for aggregation-heavy queries
SQL can be more efficient for:
| Scenario |
Reason |
Simple aggregations without relationships |
No traversal benefit; relational optimization is mature |
Bulk data operations |
SQL batch operations can be highly optimized |
Single-table queries |
No relationship traversal needed |
Complex mathematical computations |
SQL has more mature numeric functions |
Cypher excels at:
| Scenario |
Reason |
Multi-hop relationship queries |
Direct traversal instead of multiple JOINs |
Variable-depth paths |
Native support instead of recursive CTEs |
Pattern matching |
Built-in pattern syntax instead of complex JOINs |
Real-time recommendations |
Fast traversal for "customers who bought X also bought Y" |
Network/hierarchy analysis |
Shortest path, centrality algorithms built-in |
| Query Type |
SQL Performance |
Cypher Performance |
Winner |
Simple lookup |
Fast |
Fast |
Tie |
2-3 table JOIN |
Good |
Good |
Slight edge to Cypher |
4+ table JOIN |
Degrades |
Consistent |
Cypher |
Recursive/hierarchical |
Complex, slow |
Simple, fast |
Cypher |
Pattern matching |
Very complex |
Natural |
Cypher |
Bulk aggregation |
Optimized |
Good |
Slight edge to SQL |
Organizing Query Examples in Neo4j Aura
The Cypher queries in this lesson demonstrate common query patterns. Save them in Neo4j Aura for reference:
-
Folder: Query-Patterns-Simple - Save the simple lookup query from Query Comparison 1
-
Folder: Query-Patterns-Multi-Hop - Save the multi-hop traversal query from Query Comparison 2
-
Folder: Query-Patterns-Hierarchical - Save the variable-depth path query from Query Comparison 3
-
Folder: Query-Patterns-Matching - Save the pattern matching query from Query Comparison 4
-
Folder: Query-Patterns-Aggregation - Save the aggregation query from Query Comparison 5
These queries serve as templates. Adapt them to your specific node labels and relationship types when working with other datasets.
Bookmark the performance comparison table
Bookmark this lesson. The performance comparison table at the end summarizes when to use graph traversals versus when SQL-style aggregations may be more appropriate. This guidance applies beyond Northwind to any graph database project.
Check Your Understanding
For which type of query does Cypher typically outperform SQL?
-
❏ Simple aggregations on a single table
-
❏ Bulk insert operations
-
✓ Queries requiring 4 or more table JOINs
-
❏ Mathematical computations
Hint
Cypher follows relationship pointers in constant time per hop; SQL JOINs are computed at query time and cost increases as more tables are joined. Multi-hop or multi-table JOIN queries are where graphs excel.
Solution
Cypher outperforms SQL for queries requiring multiple JOINs, typically 4 or more tables.
Why?
* SQL JOINs are computed at query time, requiring index lookups for each join
* Cypher traversals follow pre-computed relationship pointers in constant time per hop
* As JOIN count increases, SQL performance degrades while Cypher remains consistent
SQL may be faster for:
* Simple single-table aggregations without traversal benefit
* Bulk operations with mature optimization
* Mathematical computations with more mature functions
Hierarchical Queries
You need to find all managers in an employee’s reporting chain up to the CEO. Which approach is more efficient and why?
-
❏ SQL with multiple self-joins - one for each level
-
❏ SQL with a recursive CTE - iterates until complete
-
✓ Cypher with variable-length path - direct traversal
-
❏ Both are equally efficient
Hint
Consider how each approach handles unknown depth in a hierarchy.
Solution
Cypher with variable-length paths is most efficient for hierarchical queries.
Why Cypher wins:
* The * operator handles any depth naturally
* Direct relationship traversal (no iteration needed)
* Simple, readable syntax
SQL alternatives are less efficient:
* Multiple self-joins require knowing the max depth in advance
* Recursive CTEs iterate for each level, building unions
* Both approaches are more complex and slower
Summary
In this lesson, you learned:
-
How SQL JOINs differ from Cypher traversals at a fundamental level
-
Query comparisons for simple lookups, multi-table JOINs, hierarchical queries, and pattern matching
-
When Cypher outperforms SQL (multi-hop, variable-depth, pattern matching)
-
When SQL may be preferable (simple aggregations, bulk operations)
-
How to choose the right query approach for your use case
In the next lesson, you will find resources for continuing your learning journey.