Data Queries

There are two fundamental types of AQL queries:

  • queries which access data (read documents)
  • queries which modify data (create, update, replace, delete documents)

Data Access Queries

Retrieving data from the database with AQL does always include a RETURN operation. It can be used to return a static value, such as a string:

RETURN "Hello ArangoDB!"

The query result is always an array of elements, even if a single element was returned and contains a single element in that case: ["Hello ArangoDB!"]

The function DOCUMENT() can be called to retrieve a single document via its document handle, for instance:

RETURN DOCUMENT("users/phil")

RETURN is usually accompanied by a FOR loop to iterate over the documents of a collection. The following query executes the loop body for all documents of a collection called users. Each document is returned unchanged in this example:

FOR doc IN users
  RETURN doc

Instead of returning the raw doc, one can easily create a projection:

FOR doc IN users
  RETURN { user: doc, newAttribute: true }

For every user document, an object with two attributes is returned. The value of the attribute user is set to the content of the user document, and newAttribute is a static attribute with the boolean value true.

Operations like FILTER, SORT and LIMIT can be added to the loop body to narrow and order the result. Instead of above shown call to DOCUMENT(), one can also retrieve the document that describes user phil like so:

FOR doc IN users
  FILTER doc._key == "phil"
  RETURN doc

The document key is used in this example, but any other attribute could equally be used for filtering. Since the document key is guaranteed to be unique, no more than a single document can match this filter. For other attributes this may not be the case. To return a subset of active users (determined by an attribute called status), sorted by name in ascending order, you can do:

FOR doc IN users
  FILTER doc.status == "active"
  SORT doc.name
  LIMIT 10

Note that operations do not have to occur in a fixed order and that their order can influence the result significantly. Limiting the number of documents before a filter is usually not what you want, because it easily misses a lot of documents that would fulfill the filter criterion, but are ignored because of a premature LIMIT clause. Because of the aforementioned reasons, LIMIT is usually put at the very end, after FILTER, SORT and other operations.

See the High Level Operations chapter for more details.

Data Modification Queries

AQL supports the following data modification operations:

  • INSERT: insert new documents into a collection
  • UPDATE: partially update existing documents in a collection
  • REPLACE: completely replace existing documents in a collection
  • REMOVE: remove existing documents from a collection
  • UPSERT: conditionally insert or update documents in a collection

You can use them to modify the data of one or multiple documents with a single query. This is superior to fetching and updating the documents individually with multiple queries. However, if only a single document needs to be modified, ArangoDB’s specialized data modification operations for single documents might execute faster.

Below you find some simple example queries that use these operations. The operations are detailed in the chapter High Level Operations.

Modifying a single document

Let’s start with the basics: INSERT, UPDATE and REMOVE operations on single documents. Here is an example that inserts a document into a collection called users with the INSERT operation:

INSERT {
  firstName: "Anna",
  name: "Pavlova",
  profession: "artist"
} INTO users

The collection needs to exist before executing the query. AQL queries cannot create collections.

If you run the above query, the result is an empty array because we did not specify what to return using a RETURN keyword. It is optional in modification queries, but mandatory in data access queries. Despite the empty result, the above query still creates a new user document.

You may provide a key for the new document; if not provided, ArangoDB creates one for you.

INSERT {
  _key: "GilbertoGil",
  firstName: "Gilberto",
  name: "Gil",
  city: "Fortalezza"
} INTO users

As ArangoDB is schema-free, attributes of the documents may vary:

INSERT {
  _key: "PhilCarpenter",
  firstName: "Phil",
  name: "Carpenter",
  middleName: "G.",
  status: "inactive"
} INTO users
INSERT {
  _key: "NatachaDeclerck",
  firstName: "Natacha",
  name: "Declerck",
  location: "Antwerp"
} INTO users 

The UPDATE operation lets you add or change attributes of existing documents. The following query modifies a previously created user, changing the status attribute and adding a location attribute:

UPDATE "PhilCarpenter" WITH {
  status: "active",
  location: "Beijing"
} IN users

The REPLACE operation is an alternative to the UPDATE operation that lets you replace all attributes of a document (except for attributes that cannot be changed, like _key):

REPLACE {
  _key: "NatachaDeclerck",
  firstName: "Natacha",
  name: "Leclerc",
  status: "active",
  level: "premium"
} IN users

You can delete a document with the REMOVE operation, only requiring the document key to identify it:

REMOVE "GilbertoGil" IN users

Modifying multiple documents

Data modification operations are normally combined with FOR loops to iterate over a given list of documents. They can optionally be combined with FILTER statements and the like.

To create multiple new documents, use the INSERT operation together with FOR. You can also use INSERT to generate copies of existing documents from other collections, or to create synthetic documents (e.g. for testing purposes). The following query creates 1000 test users with some attributes and stores them in the users collection:

FOR i IN 1..1000
  INSERT {
    id: 100000 + i,
    age: 18 + FLOOR(RAND() * 25),
    name: CONCAT('test', TO_STRING(i)),
    status: i % 2 == 0 ? "active" : "not active",
    active: false,
    gender: i % 3 == 0 ? "male" : i % 3 == 1 ? "female" : "diverse"
  } IN users

Let’s modify existing documents that match some condition:

FOR u IN users
  FILTER u.status == "not active"
  UPDATE u WITH { status: "inactive" } IN users

You can also update existing attributes based on their previous value:

FOR u IN users
  FILTER u.active == true
  UPDATE u WITH { numberOfLogins: u.numberOfLogins + 1 } IN users

The above query only works if there is already a numberOfLogins attribute present in the document. If it is unclear whether there is a numberOfLogins attribute in the document, the increase must be made conditional:

FOR u IN users
  FILTER u.active == true
  UPDATE u WITH {
    numberOfLogins: HAS(u, "numberOfLogins") ? u.numberOfLogins + 1 : 1
  } IN users

Updates of multiple attributes can be combined in a single query:

FOR u IN users
  FILTER u.active == true
  UPDATE u WITH {
    lastLogin: DATE_NOW(),
    numberOfLogins: HAS(u, "numberOfLogins") ? u.numberOfLogins + 1 : 1
  } IN users

Note than an update query might fail during execution, for example, because a document to be updated does not exist. In this case, the query aborts at the first error. In single server mode, all modifications done by the query are rolled back as if they never happened.

You can copy documents from one collection to another by reading from one collection but write to another. Let’s copy the contents of the users collection into the backup collection:

FOR u IN users
  INSERT u IN backup

Note that both collections must already exist when the query is executed. The query might fail if the backup collection already contains documents, as executing the insert might attempt to insert the same document (identified by the _key attribute) again. This triggers a unique key constraint violation and aborts the query. In single server mode, all changes made by the query are also rolled back. To make such a copy operation work in all cases, the target collection can be emptied beforehand, using a REMOVE query or by truncating it by other means.

To not just partially update, but completely replace existing documents, use the REPLACE operation. The following query replaces all documents in the backup collection with the documents found in the users collection. Documents common to both collections are replaced. All other documents remain unchanged. Documents are compared using their _key attributes:

FOR u IN users
  REPLACE u IN backup

The above query fails if there are documents in the users collection that are not in the backup collection yet. In this case, the query would attempt to replace documents that do not exist. If such case is detected while executing the query, the query is aborted. In single server mode, all changes made by the query are rolled back.

To make the query succeed regardless of the errors, use the ignoreErrors query option:

FOR u IN users
  REPLACE u IN backup OPTIONS { ignoreErrors: true }

This continues the query execution if errors occur during a REPLACE, UPDATE, INSERT, or REMOVE operation.

Finally, let’s find some documents in collection users and remove them from collection backup. The link between the documents in both collections is established via the documents’ keys:

FOR u IN users
  FILTER u.status == "deleted"
  REMOVE u IN backup

The following example removes all documents from both users and backup:

LET r1 = (FOR u IN users  REMOVE u IN users)
LET r2 = (FOR u IN backup REMOVE u IN backup)
RETURN true

Altering substructures

To modify lists in documents, for example, to update specific attributes of objects in an array, you can compute a new array and then update the document attribute in question. This may involve the use of subqueries and temporary variables.

Create a collection named complexCollection and run the following query:

FOR doc IN [
  {
    "topLevelAttribute": "a",
    "subList": [
      {
        "attributeToAlter": "value to change",
        "filterByMe": true
      },
      {
        "attributeToAlter": "another value to change",
        "filterByMe": true
      },
      {
        "attributeToAlter": "keep this value",
        "filterByMe": false
      }
    ]
  },
  {
    "topLevelAttribute": "b",
    "subList": [
      {
        "attributeToAlter": "keep this value",
        "filterByMe": false
      }
    ]
  }
] INSERT doc INTO complexCollection

The following query updates the subList top-level attribute of documents. The attributeToAlter values in the nested object are changed if the adjacent filterByMe attribute is true:

FOR doc in complexCollection
  LET alteredList = (
    FOR element IN doc.subList
       RETURN element.filterByMe
              ? MERGE(element, { attributeToAlter: "new value" })
              : element
  )
  UPDATE doc WITH { subList: alteredList } IN complexCollection
  RETURN NEW
[
  {
    "_key": "2607",
    "_id": "complexCollection/2607",
    "_rev": "_fWb_iOO---",
    "topLevelAttribute": "a",
    "subList": [
      {
        "attributeToAlter": "new value",
        "filterByMe": true
      },
      {
        "attributeToAlter": "new value",
        "filterByMe": true
      },
      {
        "attributeToAlter": "keep this value",
        "filterByMe": false
      }
    ]
  },
  {
    "_key": "2608",
    "_id": "complexCollection/2608",
    "_rev": "_fWb_iOO--_",
    "topLevelAttribute": "b",
    "subList": [
      {
        "attributeToAlter": "keep this value",
        "filterByMe": false
      }
    ]
  }
]

To improve the query’s performance, you can only update documents if there is a change to the subList to be saved. Instead of comparing the current and the altered list directly, you may compare their hash values using the HASH() function, which is faster for larger objects and arrays. You can also replace the subquery with an inline expression:

FOR doc in complexCollection
  LET alteredList = doc.subList[*
    RETURN CURRENT.filterByMe
    ? MERGE(CURRENT, { attributeToAlter: "new value" })
    : CURRENT
  ]
  FILTER HASH(doc.subList) != HASH(alteredList)
  UPDATE doc WITH { subList: alteredList } IN complexCollection
  RETURN NEW

Returning documents

Data modification queries can optionally return documents. In order to reference the inserted, removed or modified documents in a RETURN statement, data modification statements introduce the OLD and/or NEW pseudo-values:

FOR i IN 1..100
  INSERT { value: i } IN test 
  RETURN NEW
FOR u IN users
  FILTER u.status == "deleted"
  REMOVE u IN users 
  RETURN OLD
FOR u IN users
  FILTER u.status == "not active"
  UPDATE u WITH { status: "inactive" } IN users 
  RETURN NEW

NEW refers to the inserted or modified document revision, and OLD refers to the document revision before update or removal. INSERT statements can only refer to the NEW pseudo-value, and REMOVE operations only to OLD. UPDATE, REPLACE and UPSERT can refer to either.

In all cases, the full documents are returned with all their attributes, including the potentially auto-generated attributes, such as _id, _key, and _rev, and the attributes not specified in the update expression of a partial update.

Projections of OLD and NEW

It is possible to return a projection of the documents with OLD or NEW instead of returning the entire documents. This can be used to reduce the amount of data returned by queries.

For example, the following query returns only the keys of the inserted documents:

FOR i IN 1..100
  INSERT { value: i } IN test 
  RETURN NEW._key

Using OLD and NEW in the same query

For UPDATE, REPLACE, and UPSERT operations, both OLD and NEW can be used to return the previous revision of a document together with the updated revision:

FOR u IN users
  FILTER u.status == "not active"
  UPDATE u WITH { status: "inactive" } IN users 
  RETURN { old: OLD, new: NEW }

Calculations with OLD or NEW

It is also possible to run additional calculations with LET statements between the data modification part and the final RETURN of an AQL query. For example, the following query performs an upsert operation and returns whether an existing document was updated, or a new document was inserted. It does so by checking the OLD variable after the UPSERT and using a LET statement to store a temporary string for the operation type:

UPSERT { name: "test" }
  INSERT { name: "test" }
  UPDATE { } IN users
LET opType = IS_NULL(OLD) ? "insert" : "update"
RETURN { _key: NEW._key, type: opType }

Restrictions

The name of the modified collection (users and backup in the above cases) must be known to the AQL executor at query-compile time and cannot change at runtime. Using a bind parameter to specify the collection name is allowed.

It is not possible to use multiple data modification operations for the same collection in the same query, or follow up a data modification operation for a specific collection with a read operation for the same collection. Neither is it possible to follow up any data modification operation with a traversal query (which may read from arbitrary collections not necessarily known at the start of the traversal).

That means you may not place several REMOVE or UPDATE statements for the same collection into the same query. It is however possible to modify different collections by using multiple data modification operations for different collections in the same query. In case you have a query with several places that need to remove documents from the same collection, it is recommended to collect these documents or their keys in an array and have the documents from that array removed using a single REMOVE operation.

Data modification operations can optionally be followed by LET operations to perform further calculations and a RETURN operation to return data.

Transactional Execution

On a single server, data modification operations are executed transactionally. If a data modification operation fails, any changes made by it are rolled back automatically as if they never happened.

If the RocksDB engine is used and intermediate commits are enabled, a query may execute intermediate transaction commits in case the running transaction (AQL query) hits the specified size thresholds. In this case, the query’s operations carried out so far are committed and not rolled back in case of a later abort/rollback. That behavior can be controlled by adjusting the intermediate commit settings for the RocksDB engine.

In a cluster, AQL data modification queries are not executed transactionally. Additionally, AQL queries with UPDATE, REPLACE, UPSERT, or REMOVE operations require the _key attribute to be specified for all documents that should be modified or removed, even if a shard key attribute other than _key is chosen for the collection.