How to create a query module in Python
The Python API provided by Memgraph lets you develop query modules. It is accompanied by the mock API, which makes it possible to develop and test query modules for Memgraph without having to run a Memgraph instance.
In this tutorial, we will learn how to develop a query module in Python, using the random walk algorithm as an example.
Prerequisites
There are three options for installing and working with Memgraph MAGE:
- Pulling the memgraph/memgraph-mage image: check the Docker Hub installation guide.
- Building a Docker image from the MAGE repository: check the Docker build installation guide.
- Building MAGE from source: check the Build from source on Linux installation guide.
Developing a module
These steps are the same for all MAGE installation options (Docker Hub, Docker build and Build from source on Linux).
Position yourself in the MAGE repository you cloned earlier. Specifically, go to the python subdirectory and create a new file named random_walk.py.
mage
└── python
    └── random_walk.py
To code the query module, we'll use the mgp package, which contains the Memgraph Python API, including the key graph data structures Vertex and Edge. To install mgp, run pip install mgp.
Here's the code for the random walk algorithm:
import mgp
import random


@mgp.read_proc
def get_path(
    start: mgp.Vertex,
    length: int = 10,
) -> mgp.Record(path=mgp.Path):
    """Generates a random path of length `length` or less starting
    from the `start` vertex.

    :param mgp.Vertex start: The starting node of the walk.
    :param int length: The number of edges to traverse.
    :return: Random path.
    :rtype: mgp.Record(mgp.Path)
    """
    path = mgp.Path(start)
    vertex = start
    for _ in range(length):
        try:
            edge = random.choice(list(vertex.out_edges))
            path.expand(edge)
            vertex = edge.to_vertex
        except IndexError:
            break
    return mgp.Record(path=path)
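The get_path procedure is decorated with the @mgp.read_proc decorator, which tells Memgraph it's a read procedure, meaning it won't make changes to the graph. The path is created from the start node, and edges are appended to it iteratively. If the current vertex has no outgoing edges, random.choice raises an IndexError and the walk ends early, which is why the path can be shorter than length.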
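To see the walk logic in isolation, here is a minimal sketch that mirrors get_path on a plain adjacency dictionary instead of Memgraph's graph objects. The random_walk function, its rng parameter, and toy_graph are hypothetical names introduced only for this illustration; they are not part of the mgp package.

```python
import random
from typing import Dict, Hashable, List, Optional


def random_walk(
    graph: Dict[Hashable, List[Hashable]],
    start: Hashable,
    length: int = 10,
    rng: Optional[random.Random] = None,
) -> List[Hashable]:
    """Return the nodes visited by a random walk of at most `length` steps."""
    if rng is None:
        rng = random.Random()
    path = [start]
    node = start
    for _ in range(length):
        neighbors = graph.get(node, [])
        if not neighbors:
            # Dead end: corresponds to the IndexError branch in get_path.
            break
        node = rng.choice(neighbors)
        path.append(node)
    return path


# Hypothetical toy graph: node "d" has no outgoing edges.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}
walk = random_walk(toy_graph, "a", length=5, rng=random.Random(42))
print(walk)
```

Passing a seeded random.Random makes the walk reproducible, which can be handy when testing the logic before wiring it into a procedure.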
Terminate procedure execution
Just as the execution of a Cypher query can be terminated with the TERMINATE TRANSACTIONS "id"; query, the execution of a procedure can be terminated as well if it takes too long to yield a response or gets stuck in an infinite loop due to unexpected input data. The transaction ID is visible upon calling the SHOW TRANSACTIONS; query.
To make a procedure terminable, it has to call the ctx.check_must_abort() function before crucial parts of the code, such as while loops, or similar points where the procedure might become costly.
Consider the following example:
import mgp


@mgp.read_proc
def long_query(ctx: mgp.ProcCtx) -> mgp.Record(my_id=int):
    id = 1
    try:
        while True:
            if ctx.check_must_abort():
                break
            id += 1
    except mgp.AbortError:
        return mgp.Record(my_id=id)
Handling mgp.AbortError ensures that the correct message about termination is sent to the session where the procedure was originally run.
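The cooperative termination pattern can be tried out without a Memgraph instance. The sketch below uses two hypothetical stand-ins, AbortError and FakeCtx, in place of mgp.AbortError and mgp.ProcCtx. It assumes that check_must_abort() signals termination by raising the abort error, which is what the except branch in the example above handles; the real behavior should be checked against the mgp API reference.

```python
class AbortError(Exception):
    """Hypothetical stand-in for mgp.AbortError."""


class FakeCtx:
    """Hypothetical stand-in for mgp.ProcCtx.

    Requests termination on the `abort_after`-th call to check_must_abort().
    """

    def __init__(self, abort_after: int) -> None:
        self._remaining = abort_after

    def check_must_abort(self) -> None:
        # Assumption: termination is signalled by raising the abort error.
        self._remaining -= 1
        if self._remaining <= 0:
            raise AbortError("procedure terminated")


def long_running(ctx: FakeCtx) -> int:
    """Count iterations until the context asks the loop to stop."""
    counter = 0
    try:
        while True:
            # Check before each unit of potentially costly work.
            ctx.check_must_abort()
            counter += 1
    except AbortError:
        # Return the partial result gathered before termination.
        return counter


print(long_running(FakeCtx(abort_after=1000)))
```

Placing the check at the top of the loop body means at most one iteration of work is done after termination is requested.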
Importing, querying and testing a module
To import, query and test a module, check out the following page.
Feel free to create an issue or open a pull request on our GitHub repo to speed up the development.
Also, don't forget to throw us a star on GitHub. ⭐
Working with the mock API
The mock Python API lets you develop and test query modules for Memgraph without having to run a Memgraph instance. As it's compatible with the Python API, you can add modules developed with it to Memgraph as-is, without having to refactor your code.
The documentation on importing the mock API and running query modules with it is available here, accompanied by examples.