Project a graph

Introduction

In the previous lesson, you learned the three-step GDS workflow: Project → Run → Write. You saw how this pattern enables fast, safe, iterative analysis by separating your source data from algorithmic operations.

This lesson focuses exclusively on Step 1: Projection—the foundation of all GDS work.

By the end of this lesson, you will understand:

How Cypher projections work in detail
What graph structures you’re creating when you project
Why different projection types matter for algorithms

Algorithm requirements drive projection choices

Different algorithms have different requirements for graph structure. Some algorithms work optimally on monopartite graphs (single node type), while others are designed for bipartite graphs (two distinct node types). As you learn projection techniques throughout this module, keep in mind that your projection choices should be guided by which algorithms you plan to use. You’ll learn more about algorithm-specific requirements in Module 3.

Cypher Projection Anatomy

Let’s revisit the example from the previous lesson and break it down completely.

cypher

Basic Cypher projection example

MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actors-graph', // (3)
  source, // (4)
  target // (5)
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (6)

Projection breakdown

(1) Match Actor nodes connected to Movie nodes via ACTED_IN relationships
(2) Call the GDS projection function
(3) Name the projection 'actors-graph'
(4) Include source nodes (Actors) in the projection
(5) Include target nodes (Movies) in the projection
(6) Return projection statistics

This projection has three main components:

1. The Cypher Pattern

cypher

Match the pattern

MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie) // (1)

Query breakdown

Match Actor nodes connected to Movie nodes via ACTED_IN relationships

This is standard Cypher. You’re matching a pattern in your database:

source nodes with the Actor label
target nodes with the Movie label
ACTED_IN relationships connecting them
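
Because the pattern is plain Cypher, you can run it on its own to preview what it matches before projecting anything. The query below is a minimal sketch that assumes the Actor and Movie data used in this lesson; the column aliases are illustrative only.

```cypher
// Preview what the pattern matches before projecting it
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
RETURN count(DISTINCT source) AS actors,   // distinct nodes bound to source
       count(DISTINCT target) AS movies,   // distinct nodes bound to target
       count(r) AS actedInRelationships    // one per matched row
```

Assuming no node carries both the Actor and Movie labels, actors plus movies should match the node count reported by the projection, and actedInRelationships should match its relationship count.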

The names source and target are ordinary Cypher variables that you reference later in the projection call. Any variable names would work, but this course consistently uses source and target because:

They clearly indicate directionality in the pattern
GDS configuration keys such as sourceNodeLabels and targetNodeLabels follow the same naming convention
Descriptive names improve code readability and maintainability
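
To illustrate that the variable names are a convention rather than a requirement, here is a sketch of the same projection written with the shorter names a and m. The graph name 'actors-graph-short-names' is hypothetical, chosen only so it does not clash with 'actors-graph'.

```cypher
// Same projection with different variable names (a and m are arbitrary)
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WITH gds.graph.project(
  'actors-graph-short-names', // hypothetical graph name
  a,                          // plays the role of source
  m                           // plays the role of target
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

What matters is the position of each argument: the second argument supplies the source node and the third supplies the target node, whatever the variables are called.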

2. The projection call

cypher

Project with configuration options

WITH gds.graph.project( // (1)
  'actors-graph', // (2)
  source, // (3)
  target, // (4)
  {}, // (5)
  {} // (6)
) AS g

Projection breakdown

(1) Call the GDS projection function
(2) Name the projection 'actors-graph'
(3) Include source nodes in the projection
(4) Include target nodes in the projection
(5) First configuration map (data configuration: node labels, node properties, relationship type, and relationship properties)
(6) Second configuration map (general projection options)

The WITH clause pipes your matched pattern into gds.graph.project():

'actors-graph': The name you’ll use to reference this projection when running algorithms

source: Nodes matched by your source variable become nodes in the projection

target: Nodes matched by your target variable also become nodes in the projection

Relationships: Not passed explicitly. Each matched row contributes one relationship from its source node to its target node

The two empty maps ({}) are placeholders for optional configuration settings. A simple projection does not need them, so you can write the same projection without them:

cypher

Simplified projection without configuration

WITH gds.graph.project( // (1)
  'actors-graph', // (2)
  source, // (3)
  target // (4)
) AS g

Projection breakdown

(1) Call the GDS projection function
(2) Name the projection 'actors-graph'
(3) Include source nodes in the projection
(4) Include target nodes in the projection

The empty maps appeared in the earlier version only to show where configuration goes. Leaving them out produces the same projection.
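
As a sketch of what the configuration maps can hold, the example below fills the first map with node labels and the relationship type, and leaves the second map empty. It assumes a recent GDS version in which the keys sourceNodeLabels, targetNodeLabels, and relationshipType are available in the first map. The graph name 'actors-graph-labeled' is hypothetical.

```cypher
// Sketch: the same pattern, with the first configuration map filled in
MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'actors-graph-labeled',             // hypothetical graph name
  source,
  target,
  {
    sourceNodeLabels: labels(source), // keep the labels of each source node
    targetNodeLabels: labels(target), // keep the labels of each target node
    relationshipType: type(r)         // keep the relationship type
  },
  {}                                  // no additional projection options
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
```

The node and relationship counts should be the same as in the simple projection. The difference is that this version also carries label and type information into the in-memory graph.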

3. The return statement

You have already seen this part. It works like the RETURN clause of any other Cypher query.

cypher

Return projection metadata

RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels // (1)

Query breakdown

Return the graph name, node count, and relationship count

This returns metadata about your projection:

How many nodes were projected
How many relationships were created
The graph name for verification
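
You can read the same metadata back later from the graph catalog, and remove a projection when you no longer need it. The sketch below assumes the 'actors-graph' projection from this lesson already exists and that your client accepts multiple statements separated by semicolons; otherwise, run the two statements one at a time.

```cypher
// Look up the projection in the graph catalog
CALL gds.graph.list('actors-graph')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Drop the projection to free memory or to reuse its name;
// false means "do not fail if the graph does not exist"
CALL gds.graph.drop('actors-graph', false)
YIELD graphName
RETURN graphName;
```

Dropping matters in practice because projecting a second graph under a name that is already in the catalog fails.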

Check your understanding

Understanding Projection Components

You run this projection:

cypher

MATCH (source:Author)-[r:WROTE]->(target:Article)
WITH gds.graph.project('research-network', source, target) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

What nodes are included in the 'research-network' projection?

❏ Only Author nodes
❏ Only Article nodes
✓ Both Author nodes and Article nodes
❏ All nodes in the database

Hint

Look at what variables are passed into gds.graph.project(). Both source and target are included.

Solution

Both Author nodes and Article nodes is correct.

When you pass source and target to gds.graph.project(), both sets of matched nodes become part of the projection:

source (Author nodes) are included
target (Article nodes) are included
The WROTE relationships between them are automatically inferred

This creates a bipartite graph with both node types.

Complete the Projection

Complete the Cypher projection below to create a graph of Actors and Movies:

cypher

MATCH (source:Actor)-[r:ACTED_IN]->(target:Movie)
WITH gds.graph.project(
  'my-projection',
  /*select:source*/,
  target
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels

✓ source
❏ Actor
❏ :Actor
❏ sourceNode

Hint

You need to pass the Cypher variable that represents the Actor nodes.

Solution

The answer is source.

In gds.graph.project(), you pass the Cypher variables (not the labels) that were defined in your MATCH pattern. Since we matched (source:Actor), we pass source to include those nodes in the projection.

Summary

You now understand how Cypher projections work: you match a pattern with standard Cypher, then pipe it into gds.graph.project() to create an in-memory graph.

A projection has three components:

Cypher pattern - Standard MATCH defining which nodes and relationships to include
Projection call - WITH gds.graph.project() piping the matched pattern into GDS
Return statement - Returns metadata about the created projection

But there’s an important detail about what you’ve just projected that might surprise you. In the next lesson, you’ll learn about different graph structure types and discover what type of graph your actors-movies projection actually created.
