Running algorithms

Introduction

You’ve learned about execution modes and configuration. Now it’s time to practice running algorithms with different configurations and see how those changes affect results.

In this lesson, you’ll run degree centrality and Louvain community detection on the actor collaboration network, experimenting with algorithm-specific configurations.

By the end of this lesson, you will understand:

How to run algorithms on a directed graph
How to interpret results from a directed graph
How algorithm configuration affects results

Setup: Create the Actor Network

First, check if the 'actor-network' graph exists and drop it if it does:

cypher

Drop actor-network if it exists (replace ????)

CALL gds.????.????(????) // (1)
YIELD graphName // (2)
RETURN graphName // (3)

Query breakdown

Call the graph drop procedure with error suppression (fill in procedure and parameters)
Yield the graph name
Return the graph name

Now, let’s create a fresh actor collaboration projection in which Actor nodes who appeared in the same movies connect directly to each other:

cypher

Project actor collaboration network (replace ????)

MATCH ???? // (1)
????( // (2)
  'actor-network', // (3)
  source, // (4)
  target, // (5)
  {},
  {}
) ???? // (6)
RETURN g.graphName, g.nodeCount, g.relationshipCount // (7)

Projection breakdown

Match Actor nodes connected through Movie nodes (fill in pattern)
Call the GDS projection function (fill in function name)
Name the projection 'actor-network'
Include source nodes
Include target nodes
Alias the projection result (fill in alias)
Return projection statistics

This creates a directed graph of actors connected through shared movies. Every actor who appeared in a movie receives a relationship from every other actor who appeared in that movie.
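As a rough picture of what this projection produces, the sketch below builds the same kind of directed co-actor pairs in plain Python. The cast data and the `co_actor_edges` helper are made up for illustration; they are not part of GDS or the course dataset.

```python
from itertools import permutations

# Hypothetical sample data: movie title -> cast list.
CAST = {
    "Toy Story": ["Tim Allen", "Jim Varney", "Tom Hanks"],
    "Heat": ["Al Pacino", "Robert De Niro"],
}

def co_actor_edges(cast_by_movie):
    """Return one directed (source, target) pair per ordered co-star pair per movie."""
    edges = []
    for cast in cast_by_movie.values():
        # permutations yields both (a, b) and (b, a), so every
        # collaboration appears once in each direction.
        edges.extend(permutations(cast, 2))
    return edges

edges = co_actor_edges(CAST)
print(len(edges))  # 3*2 + 2*1 = 8 directed relationships
```

Because each pair appears in both directions, the resulting graph is directed but symmetrical, which matters for the orientation experiments later in this lesson.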

Now let’s create a second directed projection of actors to movies. Note that the actor names are included for illustrative purposes only; GDS does not actually see them.

cypher

Project Actor-Movie network

MATCH (source:Actor)-[:ACTED_IN]->(target:Movie) // (1)
WITH gds.graph.project( // (2)
  'actor-movie-network', // (3)
  source, // (4)
  target, // (5)
  {}
) AS g
RETURN g.graphName, g.relationshipCount, g.nodeCount // (6)

Projection breakdown

Match Actor nodes connected to Movie nodes via ACTED_IN relationships
Call the GDS projection function
Name the projection 'actor-movie-network'
Include source (Actor) nodes
Include target (Movie) nodes
Return projection statistics

This creates a graph of Actor nodes connected to the Movie nodes in which they appeared.

In theory, one might expect these to return similar results — but let’s run some experiments to find out what really happens.

Degree centrality: default configuration

By now, you’re familiar with degree centrality. It counts how many outgoing relationships each node has.

Let’s run it in stream mode on our 'actor-network' graph with default settings:

cypher

Stream degree centrality (replace ????)

CALL gds.degree.????('actor-network', {}) // (1)
YIELD nodeId, score // (2)
RETURN gds.util.asNode(????).name AS actor, score AS degree // (3)
ORDER BY degree DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

Call degree centrality in stream mode with default configuration
Yield node IDs and degree scores
Convert node IDs to names (fill in nodeId parameter)
Sort by degree in descending order
Limit to top 10 results

Since relationships are directed, this counts outgoing relationships—how many other actors each actor is connected to.

In our current graph, actor collaborations are symmetrical; two actors who appeared in the same movie will each receive a relationship from the other.

You can reconfigure degree centrality to count relationships in the opposite direction.

Complete the query below and run it to see what happens:

cypher

Degree centrality with reversed orientation (replace ????)

???? gds.????.stream('????', { // (1)
  orientation: 'REVERSE' // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score AS degree // (4)
ORDER BY degree DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call degree centrality in stream mode (fill in the procedure call and graph name)
Configure orientation to count incoming relationships
Yield node IDs and degree scores
Convert node IDs to actor names and return with degree
Sort by degree in descending order
Limit to top 10 results

This counts incoming relationships instead. However, our graph is completely symmetrical — every actor is connected to their collaborators in both directions — so the results remain the same.
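To see why the numbers do not move, here is a minimal Python sketch of out-degree counting with an optional reversal step. The `degree` helper and the three-node graph are hypothetical stand-ins, not the GDS implementation.

```python
from collections import Counter

# Hypothetical symmetric collaboration graph: each pair appears in both directions.
EDGES = [
    ("A", "B"), ("B", "A"),
    ("A", "C"), ("C", "A"),
]

def degree(edges, orientation="NATURAL"):
    """Count relationships per node: outgoing for NATURAL, incoming for REVERSE."""
    if orientation == "REVERSE":
        # Flip every relationship, then count outgoing as usual.
        edges = [(target, source) for source, target in edges]
    return Counter(source for source, _ in edges)

natural = degree(EDGES)
reverse = degree(EDGES, "REVERSE")
print(natural == reverse)  # True: reversing a symmetric graph changes nothing
```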

Now rerun degree centrality on the Actor → Movie graph.

cypher

Degree centrality on Actor-Movie network (replace ????)

???? ???.?????.?????('actor-movie-network', {}) // (1)
YIELD ??????, score // (2)
RETURN gds.util.asNode(??????).name AS actor, score AS degree // (3)
ORDER BY degree DESC // (4)
LIMIT 10 // (5)

Algorithm breakdown

Call degree centrality in stream mode on 'actor-movie-network'
Yield node IDs and degree scores
Convert node IDs to names and return with degree
Sort by degree in descending order
Limit to top 10 results

Running this should return a table that looks much like the two previous tables. Robert De Niro still has the most movies, so he is still number 1.

Now run it again with the orientation reversed, as before.

cypher

Degree centrality with reversed orientation on Actor-Movie network

CALL gds.degree.stream('actor-movie-network', { // (1)
  orientation: 'REVERSE' // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).name AS actor, score AS degree // (4)
ORDER BY degree DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call degree centrality in stream mode on 'actor-movie-network'
Configure orientation to count incoming relationships
Yield node IDs and degree scores
Convert node IDs to names and return with degree
Sort by degree in descending order
Limit to top 10 results

The command returns a table that looks like this:

actor	degree
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0
null	4.0

This has happened because the algorithm is now reversing the relationships. Remember, the bipartite graph looks like this:

Jim Varney and Tim Allen both connected to Toy Story.

However, when we reversed the relationships, the algorithm interpreted it as this:

Jim Varney and Tim Allen both connected from Toy Story.

With reversed relationships, degree centrality now calculates the out-degree of Movie nodes instead, so the movies receive the scores rather than the actors.

In the algorithm run, we specified:

RETURN gds.util.asNode(nodeId).name AS actor

However, we should have written:

RETURN gds.util.asNode(nodeId).title AS movie

Run it again now, with the correct node property:

cypher

Degree centrality with reversed orientation on Actor-Movie network

CALL gds.degree.stream('actor-movie-network', { // (1)
  orientation: 'REVERSE' // (2)
})
YIELD nodeId, score // (3)
RETURN gds.util.asNode(nodeId).title AS movie, score AS degree // (4)
ORDER BY degree DESC // (5)
LIMIT 10 // (6)

You should have received a table that looks like this:

movie	degree
"Sabrina"	4.0
"Sudden Death"	4.0
"Waiting to Exhale"	4.0
"Heat"	4.0
"Tom and Huck"	4.0
"Jumanji"	4.0
"Toy Story"	4.0
"Grumpier Old Men"	4.0
"Father of the Bride Part II"	4.0
"GoldenEye"	4.0

This exercise illustrates how the projection influences what we can gain from the algorithm.

When you reversed the relationships on the 'actor-network' projection, nothing fundamentally changed. Each pair of collaborating actors is connected in both directions, so reversing both relationships produces the same graph.

However, when you reversed the relationships on the 'actor-movie-network' graph, each Actor → Movie relationship became a Movie → Actor relationship. Toy Story, for example, is now connected to Tim Allen and Jim Varney via reversed relationships.

Think about what degree centrality is actually doing here. It’s counting outgoing relationships to define 'centrality'. So, when we change the direction, it is no longer determining the centrality of Actor nodes. It is determining the centrality of Movie nodes.

In this graph, every movie has at most 4 cast members. So, when we use 'number of ACTED_IN relationships' as the metric, we receive a slew of top results, all with a score of 4.

The behavior of the algorithm is determined by the projection’s data model and the direction of its relationships.
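The same effect can be reproduced in a few lines of Python. This is only a sketch under assumed data: the node IDs, the `NODES` property map, and the `degree` helper are hypothetical and simply mimic how a reversed bipartite graph shifts the scores from actors to movies.

```python
from collections import Counter

# Hypothetical nodes with the properties each label carries.
NODES = {
    "a1": {"name": "Tim Allen"},
    "a2": {"name": "Jim Varney"},
    "m1": {"title": "Toy Story"},
}
# Actor -> Movie relationships, as in 'actor-movie-network'.
EDGES = [("a1", "m1"), ("a2", "m1")]

def degree(edges, orientation="NATURAL"):
    """Count outgoing relationships, flipping them first for REVERSE."""
    if orientation == "REVERSE":
        edges = [(target, source) for source, target in edges]
    return Counter(source for source, _ in edges)

natural = degree(EDGES)             # scores land on the actors
reverse = degree(EDGES, "REVERSE")  # scores land on the movie

top_id = max(reverse, key=reverse.get)
print(NODES[top_id].get("name"))   # None: Movie nodes have no 'name' property
print(NODES[top_id].get("title"))  # Toy Story
```

Asking a Movie node for a `name` is what produced the column of null values earlier; asking for `title` returns something readable.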

Let’s test another configuration behavior with Louvain.

Louvain: Default configuration

Louvain detects communities by grouping nodes that are more densely connected to each other than to the rest of the network.

It does this by progressively lumping densely connected nodes together into hierarchical clusters.

For example, if you were to imagine what Louvain was doing, it might look something like this:

Level 1: No clustering

Level 2: Identifies natural clusters

5 clusters of nodes with a few connections between them

Level 3: Tries to make each cluster more internally connected

3 clusters of nodes with a few connections between them

Level 4: Clusters can no longer be any more internally connected than they currently are

2 clusters of nodes with a few connections between them

It doesn’t actually create these clusters with relationships, and it doesn’t literally move nodes around — it simply assigns them a communityId value.

The communityId value represents what clusters nodes would belong to if we hypothetically were to cluster them together. For example, in the images above, we created a 'Hub' node for each communityId.

We then connected nodes with that communityId to their relevant hubs to simulate what Louvain is seeing.
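To make "more densely connected" a little more concrete: Louvain scores a grouping with a measure called modularity, which you will see reported by stats mode later in this lesson. The sketch below computes the standard modularity of a given community assignment in plain Python. It assumes an unweighted, undirected toy graph, and the `modularity` helper is hypothetical; GDS performs its own calculation internally.

```python
# Hypothetical undirected graph: two triangles joined by one bridge (c-d).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]

def modularity(edges, community_of):
    """Standard modularity of a partition on an unweighted, undirected graph."""
    m = len(edges)
    internal = {}    # relationships with both ends in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# communityId per node: one label per triangle vs. everything in one group.
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {n: 0 for n in two_groups}

print(round(modularity(EDGES, two_groups), 3))  # 0.357
print(round(modularity(EDGES, one_group), 3))   # 0.0
```

Splitting the graph along the bridge scores higher than lumping everything together, which is the kind of grouping a modularity-based algorithm prefers.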

To see this in action, let’s run Louvain in stream mode with default settings:

cypher

Stream Louvain communities by size

CALL gds.louvain.stream('actor-network', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, count(*) AS numActors // (3)
RETURN communityId, numActors // (4)
ORDER BY numActors DESC // (5)
LIMIT 10 // (6)

Algorithm breakdown

Call Louvain algorithm in stream mode with default settings
Yield node IDs and their community assignments
Group by community and count members
Return community ID and size
Sort by size in descending order
Limit to top 10 communities

This shows the top 10 communities by size. Notice how many actors are in each community.

You will likely notice a clear decline in size from the most populated community to the least populated.
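The grouping and sorting steps in the query amount to a simple tally. Here is a small Python sketch of the same idea; the streamed rows and community IDs are invented for illustration.

```python
from collections import Counter

# Hypothetical streamed rows: (nodeId, communityId).
rows = [(0, 7), (1, 7), (2, 7), (3, 12), (4, 12), (5, 30)]

# Tally members per community, then list the largest first.
sizes = Counter(community_id for _, community_id in rows)
print(sizes.most_common(2))  # [(7, 3), (12, 2)]
```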

Louvain: Analyzing default results

Let’s see some actual actors in the largest community:

cypher

View sample actors from largest community

CALL gds.louvain.stream('actor-network', {}) // (1)
YIELD nodeId, communityId // (2)
WITH communityId, COLLECT(gds.util.asNode(nodeId).name) AS actors, COUNT(*) AS size // (3)
ORDER BY size DESC // (4)
LIMIT 1 // (5)
RETURN communityId, size, actors[0..20] AS sampleActors // (6)

Algorithm breakdown

Call Louvain algorithm in stream mode with default settings
Yield node IDs and their community assignments
Group by community, collect actor names, and count members
Sort by size in descending order
Limit to the largest community only
Return community ID, size, and first 20 actor names as sample

These actors likely work in similar circles, even if not together. You will also notice some more famous names here.

However, the power of Louvain, and of graphs in general, is that no two actors need to have worked directly with one another to be included in the same group.

Imagine that two actors had never met, but through a two-hop relationship, they are both connected to the same densely connected group of actors. They may be considered part of the same group if:

They are more densely connected to other actors in this group than any other
Moving them from any other group to this group does not decrease the overall average density of all clusters in the graph

For example, you likely got 'Tim Allen' and 'Al Pacino' in your top results. But they have never starred in a movie together.

These are not two actors that you would think of putting together. Yet, when we check the full IMDb graph, we see that they are far more connected than one might initially presume.

Tim Allen and Al Pacino connected to each other through multiple actors.

Louvain: Custom Configuration

Louvain has several configuration options. Let’s experiment with maxLevels.

By default, Louvain runs with 10 levels. That means that it will attempt to create more modular — and usually larger — clusters up to ten times.

Run the queries below, in sequence, to see what happens when we run Louvain at different maxLevels settings:

cypher

Louvain with maxLevels: 1

CALL gds.louvain.stats('actor-network', { // (1)
  maxLevels: 1 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)

Algorithm breakdown

Call Louvain in stats mode on 'actor-network'
Set maximum hierarchy levels to 1
Yield community statistics
Return community count, modularity, and levels run

Note the communityCount and try with 2 levels:

cypher

Louvain with maxLevels: 2

CALL gds.louvain.stats('actor-network', { // (1)
  maxLevels: 2 // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)

Algorithm breakdown

Call Louvain in stats mode on 'actor-network'
Set maximum hierarchy levels to 2
Yield community statistics
Return community count, modularity, and levels run

Notice how the communityCount drops significantly. This happens because Louvain has found that these smaller communities are more densely connected to each other than to the rest of the graph, so it merges them.

Try again with maxLevels 20. Replace the ???? with the correct values and run the algorithm:

cypher

Louvain with maxLevels: 20 (replace ????)

CALL gds.louvain.stats('actor-network', { // (1)
???? // (2)
})
YIELD communityCount, modularity, ranLevels // (3)
RETURN communityCount, modularity, ranLevels // (4)

Algorithm breakdown

Call Louvain in stats mode on 'actor-network'
Set maximum hierarchy levels configuration (fill in maxLevels: 20)
Yield community statistics
Return community count, modularity, and levels run

Even though we told Louvain to try 20 levels, it has stopped after only 4. That happens because the communities have 'converged'.

In other words, the connections within each community are not going to become any more dense through further iterations — so the algorithm stops.
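The relationship between a level cap and early convergence can be imitated with a toy loop. To be clear, this is not Louvain: as a simplifying assumption, each "level" here performs just one merge, picking the pair of communities whose merge raises modularity the most, and the loop stops as soon as no merge helps. The `greedy_merge` helper and the graph are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected graph: two triangles joined by one bridge (c-d).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]

def greedy_merge(edges, max_levels):
    """Toy stand-in for levels: one best merge per level, stop when nothing improves.

    Returns (community_count, ran_levels).
    """
    m = len(edges)
    nodes = sorted({n for edge in edges for n in edge})
    community = {n: i for i, n in enumerate(nodes)}  # start with singletons
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    ran_levels = 0
    for _ in range(max_levels):
        ids = sorted(set(community.values()))
        deg_sum = {c: 0 for c in ids}
        for n, c in community.items():
            deg_sum[c] += degree[n]

        best_gain, best_pair = 1e-12, None
        for a, b in combinations(ids, 2):
            between = sum(1 for u, v in edges
                          if {community[u], community[v]} == {a, b})
            # Modularity change from merging communities a and b.
            gain = between / m - deg_sum[a] * deg_sum[b] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)

        if best_pair is None:
            break  # converged: no merge improves modularity
        a, b = best_pair
        for n in community:
            if community[n] == b:
                community[n] = a
        ran_levels += 1

    return len(set(community.values())), ran_levels

print(greedy_merge(EDGES, 1))   # (5, 1)
print(greedy_merge(EDGES, 2))   # (4, 2)
print(greedy_merge(EDGES, 20))  # (2, 4): stops early once converged
```

The cap only matters while improvements remain. Once the two triangles have formed, merging them would lower modularity, so the loop ends well before the limit of 20.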

There are many other configuration settings to play around with. You can review these in the Louvain documentation.

Check your understanding

Degree centrality orientation

You run Degree Centrality with orientation: 'NATURAL' on a directed actor collaboration network. What is the algorithm counting?

✓ Outgoing relationships—how many other actors each actor is connected to
❏ Incoming relationships—how many actors connect to each actor
❏ Total relationships—all connections regardless of direction
❏ Bidirectional relationships—only connections that go both ways

Hint

'NATURAL' orientation means the algorithm follows the relationships as they exist in the projection. In a directed graph, what direction do the relationships go?

Solution

Outgoing relationships—how many other actors each actor is connected to.

Degree Centrality with orientation: 'NATURAL' counts relationships in their natural direction as they exist in the projection.

In a directed graph, this means counting outgoing relationships from each node.

The three orientation options are:

'NATURAL': Counts outgoing relationships (follows projection direction)
'REVERSE': Counts incoming relationships
'UNDIRECTED': Counts all relationships regardless of direction

In an actor collaboration network with directed relationships, NATURAL tells you how many co-stars each actor is connected to (out-degree).

Understanding orientation is crucial because it changes what "centrality" means: are you measuring who connects outward (out-degree), who receives connections (in-degree), or total connectivity (undirected)?

Summary

Degree centrality counts relationships. You can configure orientation to count outgoing, incoming, or total connections.

Louvain detects communities through hierarchical clustering. Configuration options like maxLevels control the granularity of community detection. Use stats mode to test different configurations and find the right balance for your analysis.

You’ve practiced running degree centrality and Louvain with different configurations on a directed graph. You’ve seen how algorithm parameters affect results. In the next lesson, you’ll learn how projection configuration affects algorithm behavior—specifically, how to change relationships to undirected and run PageRank and Leiden. Before you continue, play around with the algorithm configurations in the sandbox or your own environment if you set one up.
