Overview
The GraphRAG for Python package (neo4j-graphrag) allows you to access Neo4j Generative AI functions.
During this course, you will use the neo4j_graphrag package to build a knowledge graph and to create retrievers that use LLMs to extract information from the graph.
In this lesson you will review how a graph schema can be extracted from text using an LLM.
Using the SchemaFromTextExtractor
Open workshop-genai/extract_schema.py
import os
from dotenv import load_dotenv
load_dotenv()
from neo4j_graphrag.experimental.components.schema import SchemaFromTextExtractor
from neo4j_graphrag.llm import OpenAILLM
import asyncio
schema_extractor = SchemaFromTextExtractor(
    llm=OpenAILLM(
        model_name="gpt-5-nano",
    ),
    use_structured_output=True,
)
text = """
Neo4j is a graph database management system (GDBMS) developed by Neo4j Inc.
"""
# Extract the schema from the text
extracted_schema = asyncio.run(schema_extractor.run(text=text))
print(extracted_schema)
The code uses the SchemaFromTextExtractor class to extract a schema from a given text input.
The extractor:
- Creates a prompt instructing the LLM to:
  - Identify entities and relationships in any given text
  - Format the output as JSON
- Passes the prompt and text to the LLM for processing
- Parses the JSON response to create a schema object
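The three steps above can be sketched in plain Python. This is an illustration only, not the library's implementation: the prompt wording, the SketchSchema class, and the fake_llm function (which returns a canned response instead of calling a model) are all assumptions made for this example.

```python
import json
from dataclasses import dataclass

# Hypothetical prompt; the real extractor uses its own template.
PROMPT_TEMPLATE = (
    "Identify the entity types and relationship types in the text below.\n"
    "Return JSON with the keys node_types, relationship_types and patterns.\n\n"
    "Text:\n{text}"
)


@dataclass
class SketchSchema:
    """Illustrative schema object (not a neo4j_graphrag class)."""
    node_types: list
    relationship_types: list
    patterns: list


def fake_llm(prompt):
    """Hypothetical stand-in for an LLM call; returns a canned JSON string."""
    return json.dumps({
        "node_types": ["GraphDatabase", "Company"],
        "relationship_types": ["DEVELOPED_BY"],
        "patterns": [["GraphDatabase", "DEVELOPED_BY", "Company"]],
    })


def extract_schema(text, llm=fake_llm):
    # 1. Create the prompt that carries the instructions and the text.
    prompt = PROMPT_TEMPLATE.format(text=text)
    # 2. Pass the prompt to the LLM.
    response = llm(prompt)
    # 3. Parse the JSON response into a schema object.
    data = json.loads(response)
    return SketchSchema(
        node_types=data["node_types"],
        relationship_types=data["relationship_types"],
        patterns=[tuple(p) for p in data["patterns"]],
    )


schema = extract_schema(
    "Neo4j is a graph database management system (GDBMS) developed by Neo4j Inc."
)
print(schema)
```

Keeping the LLM call behind a plain function argument makes the prompt-building and parsing steps easy to test without an API key.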
Output
Given the text, "Neo4j is a graph database management system (GDBMS) developed by Neo4j Inc.", a simplified version of the extracted schema would be:
node_types=(
    NodeType(label='GraphDatabase'),
    NodeType(label='Company')
)
relationship_types=(
    RelationshipType(label='DEVELOPED_BY'),
)
patterns=(
    ('GraphDatabase', 'DEVELOPED_BY', 'Company')
)
Execute
Run the program and observe the output. You will see a more detailed schema based on the text provided.
This schema can be used to store the data held within the text.
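As a rough illustration of how a schema can guide storage, the sketch below turns one fact from the example text into a Cypher MERGE statement, but only if the fact matches a pattern in the schema. The patterns set is a plain-Python stand-in for the simplified schema shown earlier, and to_cypher is a hypothetical helper written for this example; neither is part of neo4j_graphrag.

```python
# Plain-Python stand-in for the simplified schema patterns (not library classes).
patterns = {("GraphDatabase", "DEVELOPED_BY", "Company")}


def to_cypher(source, rel, target):
    """Build a MERGE statement for one fact, rejecting anything outside the schema.

    source and target are (label, name) pairs; rel is a relationship type.
    """
    (source_label, source_name), (target_label, target_name) = source, target
    if (source_label, rel, target_label) not in patterns:
        raise ValueError(
            f"Pattern not in schema: {source_label}-{rel}->{target_label}"
        )
    # Labels come from the validated schema; names are passed as query parameters.
    query = (
        f"MERGE (a:{source_label} {{name: $source_name}}) "
        f"MERGE (b:{target_label} {{name: $target_name}}) "
        f"MERGE (a)-[:{rel}]->(b)"
    )
    return query, {"source_name": source_name, "target_name": target_name}


query, params = to_cypher(
    ("GraphDatabase", "Neo4j"), "DEVELOPED_BY", ("Company", "Neo4j Inc.")
)
print(query)
print(params)
```

The statement describes two nodes joined by one relationship, which is the structure the schema allows for this text.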
graph LR
A(("Neo4j<br/>(GraphDatabase)"))
B(("Neo4j<br/>(Company)"))
A -->|DEVELOPED_BY| B
Experiment
Experiment with different text inputs to see how the schema extraction varies based on the content provided, for example:
- Python is a programming language created by Guido van Rossum.
- The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.
- Large Language Models (LLMs) are a type of artificial intelligence model designed to generate human-like text.
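One way to try several inputs in a single run is to loop over them with a small helper. The sketch below assumes only that the extractor exposes the async run(text=...) call used in extract_schema.py. The extract_schemas helper and the EchoExtractor class are hypothetical: EchoExtractor is a stand-in that lets the example run without an API key and does not produce a real schema.

```python
import asyncio


async def extract_schemas(extractor, texts):
    """Run the extractor on each text in turn and pair each text with its result."""
    results = []
    for text in texts:
        schema = await extractor.run(text=text)
        results.append((text, schema))
    return results


class EchoExtractor:
    """Hypothetical stand-in that mimics the async run(text=...) call.

    It returns a placeholder string rather than an extracted schema.
    """

    async def run(self, text):
        return f"<placeholder schema for: {text[:20]}...>"


texts = [
    "Python is a programming language created by Guido van Rossum.",
    "The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France.",
    "Large Language Models (LLMs) are a type of artificial intelligence model "
    "designed to generate human-like text.",
]

# A single asyncio.run call drives the whole batch.
for text, schema in asyncio.run(extract_schemas(EchoExtractor(), texts)):
    print(text)
    print(schema)
    print()
```

To use it with the lesson code, pass the schema_extractor object from extract_schema.py in place of EchoExtractor().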
When you have experimented with the schema extraction, you can continue.
Lesson Summary
In this lesson, you:
- Learned how to extract a graph schema from unstructured text using an LLM.
- Explored how different text inputs can lead to different schema extractions.
In the next lesson, you will create a knowledge graph construction pipeline using the SimpleKGPipeline class.