Create your graph data model

You have learned how to prepare your data and add it to the Data Importer.

Next, you will create a graph data model that maps your CSV data to nodes and relationships, defining nodes, relationships, and properties in the Data Importer interface.

In this lesson, you will learn how to:

Create a data model in the Data Importer
Define Movie and Person nodes with their properties
Define ACTED_IN relationships connecting actors to movies
Review and confirm your model before importing

Step 4: Create your data model

The data model defines how your CSV data becomes a graph. In the Movies dataset example, you will create:

* Movie nodes - Each movie becomes a node with properties like title and movieId
* Person nodes - Each actor becomes a node with properties like name and personId
* ACTED_IN relationships - These connections link actors to movies, enabling queries that traverse relationships

How to do it: Click Create model manually to start building your graph structure.

Understanding nodes and relationships in context

How this relates to Neo4j Fundamentals:

If you’ve taken the Neo4j Fundamentals course, you learned that graphs consist of:

* Nodes (vertices) - Entities in your domain (like movies and actors)
* Relationships (edges) - Connections between entities (like ACTED_IN)
* Properties - Attributes stored on nodes and relationships (like title on Movie nodes, characters on ACTED_IN relationships)
* Labels - Categories for nodes (like Movie and Person)

In this import, you’re creating these graph elements from your CSV data:

* Each unique movie becomes a Movie node with a title property
* Each unique actor becomes a Person node with a name property
* Each actor-movie pair becomes an ACTED_IN relationship with a characters property

How this relates to Graph Data Modeling Fundamentals:

If you’ve taken the Graph Data Modeling Fundamentals course, you learned about:

* Instance models - The actual nodes and relationships in your graph (what you’re creating now)
* Domain models - The conceptual design of your graph (Movie and Person connected by ACTED_IN)

The Data Importer helps you create an instance model from your CSV. You’re deciding:

* Which entities become nodes (Movie, Person)
* Which connections become relationships (ACTED_IN)
* Which CSV columns become properties (title, name, characters)

This modeling step is critical—a well-designed model makes recommendation queries fast and intuitive.

How this relates to Importing Data Fundamentals:

If you’ve taken the Importing Data Fundamentals course, you learned about:

* Import methods - Different ways to load data (Data Importer, LOAD CSV, neo4j-admin import)
* Data transformation - Converting tabular data (CSV) into graph structures (nodes and relationships)
* Unique constraints - Ensuring nodes aren’t duplicated (using movieId and personId as unique identifiers)

In this lesson, you’re using the Data Importer (a visual, no-code tool) to transform your CSV into a graph. The Data Importer automatically handles:

* Creating unique nodes (using movieId and personId as keys)
* Mapping CSV columns to node and relationship properties
* Generating efficient Cypher statements for the import
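To make the unique-key idea concrete, here is a minimal sketch of uniqueness constraints written by hand, assuming Neo4j 5 syntax. The constraint names are illustrative and not taken from the course, and the statements the Data Importer actually generates may differ.

```cypher
// Constraint names (movie_id_unique, person_id_unique) are hypothetical.
// Each constraint rejects a second node with the same key value.
CREATE CONSTRAINT movie_id_unique IF NOT EXISTS
FOR (m:Movie) REQUIRE m.movieId IS UNIQUE;

CREATE CONSTRAINT person_id_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.personId IS UNIQUE;
```

A uniqueness constraint is also backed by an index, so looking up a node by movieId or personId stays fast as the graph grows.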

Key takeaway: Nodes and relationships are the building blocks you learned about in Neo4j Fundamentals. The modeling principles from Graph Data Modeling Fundamentals guide how you structure them. The import techniques from Importing Data Fundamentals show you how to create them from your data. This lesson combines all three: you are applying those fundamentals to create a graph data model.

Step 5: Define Movie nodes

In the Movies dataset example, Movie nodes represent the main entities. Each Movie node has properties (title, movieId) that you can use in queries.

How to do it:

Click the Add node label button (or the + icon)
In the details panel on the right, set the label to Movie
Click Map from table to connect CSV columns to node properties
Map movieId → This becomes the unique identifier for each Movie node
Map title → This becomes a property to search and display

How it works in the background: When you map movieId and title, the Data Importer will create Cypher statements like:

MERGE (m:Movie {movieId: '123'})
SET m.title = 'The Matrix'

This creates Movie nodes that your recommendation queries can traverse.
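The effect of using movieId as the key can be sketched with a few inline rows standing in for CSV lines. The sample rows below are made up for illustration, and this is an approximation of the idea rather than the Data Importer's actual output. Note that running it writes nodes to your database.

```cypher
// Three illustrative rows; the first two share movieId '1'.
UNWIND [
  {movieId: '1', title: 'The Matrix'},
  {movieId: '1', title: 'The Matrix'},
  {movieId: '2', title: 'Cast Away'}
] AS row
// MERGE matches an existing node by key, or creates one if none exists.
MERGE (m:Movie {movieId: row.movieId})
SET m.title = row.title
RETURN count(DISTINCT m) AS movieNodes
```

Because the repeated row matches the node created for the first row, movieNodes is 2, not 3. Using CREATE in place of MERGE would add a new node for every row.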

After adding the label, you can edit the model structure to refine how your CSV data maps to graph elements.

Step 6: Define Person nodes

In the Movies dataset example, Person nodes represent actors. When you query "Find movies with Tom Hanks," you’re traversing from a Person node through ACTED_IN relationships to Movie nodes.

How to do it:

Click the Add node label button again to create a second node type
Set the label to Person
Click Map from table
Map personId → Unique identifier for each Person node
Map name → Property to search (e.g., "Tom Hanks")

Example for recommendations: Once imported, you’ll be able to query:

MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie)
RETURN m.title

This finds all movies Tom Hanks acted in—the foundation of actor-based recommendations.

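Optional: Edit property types by clicking the pencil icon next to each property. For example, you might want personId stored as an integer rather than a string. If you change the type, use the same type in your queries (personId: 123, not personId: '123'), because Cypher does not treat a number and a string as equal.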

Step 7: Define ACTED_IN relationships

Relationships connect nodes in your graph. The ACTED_IN relationship connects Person nodes to Movie nodes, enabling queries like:

* "Find all movies with the same actors" (traverse from Movie through ACTED_IN to Person, then back to other Movies)
* "Find actors who worked together" (find two Person nodes connected to the same Movie)
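As a sketch of the second query, assuming the import has run and a Person named 'Tom Hanks' exists, the pattern below walks from one actor to a movie and back out to another actor. The aliases coActor and moviesTogether are illustrative.

```cypher
// Two Person nodes connected to the same Movie through ACTED_IN.
MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(b:Person)
RETURN b.name AS coActor, count(m) AS moviesTogether
ORDER BY moviesTogether DESC
LIMIT 10
```

Cypher does not reuse the same relationship twice within one MATCH pattern, so the starting actor is not returned as their own co-actor.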

How to do it:

Hover over the edge of the Person node—you’ll see a plus-sign (+)
Click and drag from Person to Movie node
Name the relationship type ACTED_IN
The Data Importer automatically maps personId and movieId to connect the right nodes
Click Map from table and select characters—this stores the character name as a property on the relationship

How it works in the background: The Data Importer creates Cypher statements like:

MATCH (p:Person {personId: '123'}), (m:Movie {movieId: '456'})
MERGE (p)-[:ACTED_IN {characters: ['Neo']}]->(m)

This creates the connections that enable relationship traversal in your graph.
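A row-driven version of the same idea might look like the sketch below. The single inline row stands in for one CSV line and reuses the sample IDs from above; it is an approximation, not the exact statement the Data Importer generates.

```cypher
// One illustrative row linking a person to a movie.
UNWIND [
  {personId: '123', movieId: '456', characters: ['Neo']}
] AS row
// Both endpoints must already exist; otherwise the row produces nothing.
MATCH (p:Person {personId: row.personId})
MATCH (m:Movie {movieId: row.movieId})
// MERGE keeps a single ACTED_IN relationship per person-movie pair.
MERGE (p)-[r:ACTED_IN]->(m)
SET r.characters = row.characters
```

If either MATCH finds no node, no relationship is created for that row, which is why the node mappings need to be correct before the relationship mapping can succeed.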

Verification: The green checkmark indicates that the relationship mapping is correct. Your model now shows Person nodes connected to Movie nodes via ACTED_IN relationships—exactly what you need for recommendation queries.

Step 8: Review and confirm your model

Before importing, verify that your model correctly maps CSV data to graph structure. An incorrect mapping produces missing, duplicated, or wrongly connected nodes, so later queries return incomplete or empty results.

How to do it:

Review the model diagram—you should see Person and Movie nodes connected by ACTED_IN relationships
Click on each node to verify property mappings (movieId, title for Movie; personId, name for Person)
Verify the ACTED_IN relationship maps personId and movieId correctly
Confirm primary keys: The Data Importer uses movieId and personId as unique identifiers to avoid creating duplicate nodes

How it works in the background: The Data Importer analyzes your CSV to ensure:

* No duplicate nodes (uses movieId/personId as unique keys)
* All relationships can be created (both Person and Movie nodes exist)
* Data types are correct (strings, numbers, etc.)
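Once the import has run, the first two guarantees can also be spot-checked by hand. The queries below are a sketch of such checks, not something the Data Importer runs itself; both should return no rows if the mappings were correct and every actor in the CSV has at least one movie.

```cypher
// Any row returned is a movieId shared by more than one Movie node.
MATCH (m:Movie)
WITH m.movieId AS movieId, count(*) AS copies
WHERE copies > 1
RETURN movieId, copies;

// Person nodes that ended up without any ACTED_IN relationship.
MATCH (p:Person)
WHERE NOT (p)-[:ACTED_IN]->(:Movie)
RETURN p.personId, p.name
LIMIT 10;
```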

Summary

In this lesson, you created your graph data model. You:

Created your data model: Started building the graph structure in the Data Importer
Defined Movie nodes: Mapped movieId and title properties to Movie nodes
Defined Person nodes: Mapped personId and name properties to Person nodes
Defined ACTED_IN relationships: Connected Person and Movie nodes with relationship properties
Reviewed your model: Verified all mappings are correct before importing

The graph structure you created (Person -[:ACTED_IN]→ Movie) enables recommendation queries that traverse from movies to actors and back. This model allows you to:

* Find movies with the same actors
* Discover actors who worked together
* Identify similar movies based on shared cast
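For instance, similarity by shared cast could be sketched as follows, assuming the import has run and a movie titled 'The Matrix' exists. The aliases similarMovie and sharedActors are illustrative.

```cypher
// From one movie, go to its actors, then to the other movies they acted in.
MATCH (m:Movie {title: 'The Matrix'})<-[:ACTED_IN]-(p:Person)-[:ACTED_IN]->(other:Movie)
RETURN other.title AS similarMovie, count(p) AS sharedActors
ORDER BY sharedActors DESC
LIMIT 5
```

Movies that share more actors with the starting movie rank higher, which gives a simple cast-based recommendation.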

Check your understanding

Model Structure

In the movie recommendation model, what connects Person nodes to Movie nodes?

❏ A shared property value
✓ ACTED_IN relationships
❏ Both nodes have the same label
❏ They are in the same subgraph

Hint

Relationships in a graph connect nodes. For actors and movies, the relationship represents the actor appearing in the movie.

Solution

ACTED_IN relationships connect Person nodes to Movie nodes.

Relationships are the connections between nodes in a graph. The ACTED_IN relationship connects Person (actor) nodes to Movie nodes, enabling queries like "Find all movies with Tom Hanks" by traversing from Person through ACTED_IN to Movie.

What’s next: In the next lesson, you’ll run the import and verify that your movie data was loaded correctly into your Aura instance.

For more information on data modeling, see the Neo4j Aura Import documentation.
