HTTP interface for WAL access
The WAL Access API is used to facilitate faster and more reliable asynchronous replication. The API offers access to the write-ahead log or operations log of the ArangoDB server. As a public API, these REST endpoints are only supported on single-server instances. While these APIs are also available on DB-Server instances, accessing them as a user is not supported. This API replaces some of the APIs in /_api/replication.
Return tick ranges available in the operations of WAL
Returns the tick ranges available in the write-ahead log.
GET /_api/wal/range
Returns the currently available ranges of tick values for all WAL files. The tick values can be used to determine if certain data (identified by tick value) are still available for replication.
The body of the response contains a JSON object.
- tickMin: minimum tick available
- tickMax: maximum tick available
- time: the server time as string in format “YYYY-MM-DDTHH:MM:SSZ”
- server: An object with fields version and serverId
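As a rough illustration of the availability check described above, the following Python sketch tests whether a given tick still falls within the reported range. The helper name `tick_is_available` is hypothetical and the sample values are made up. The sketch assumes that `tickMin` and `tickMax` may arrive either as numbers or as stringified numbers, and that both bounds are inclusive.

```python
def tick_is_available(range_response: dict, tick) -> bool:
    """Return True if `tick` lies within [tickMin, tickMax] of a range response."""
    # Convert everything to int so string and numeric ticks compare numerically.
    tick_min = int(range_response["tickMin"])
    tick_max = int(range_response["tickMax"])
    return tick_min <= int(tick) <= tick_max


# Made-up sample body in the shape described above (assumption, not real output).
sample = {"tickMin": "5", "tickMax": "186138"}
print(tick_is_available(sample, "1000"))   # inside the range
print(tick_is_available(sample, 200000))   # beyond tickMax
```

A replication client could run such a check before resuming from a stored tick and fall back to a full resynchronization when it fails.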
Responses
HTTP 200: is returned if the tick ranges could be determined successfully.
HTTP 405: is returned when an invalid HTTP method is used.
HTTP 500: is returned if the server operations state could not be determined.
HTTP 501: is returned when this operation is called on a Coordinator in a cluster.
Examples
Returns the available tick ranges.
shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/wal/range
HTTP/1.1 200 OK
content-type: application/json
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0, max-age=0, s-maxage=0
connection: Keep-Alive
content-length: 128
content-security-policy: frame-ancestors 'self'; form-action 'self';
expires: 0
pragma: no-cache
server: ArangoDB
strict-transport-security: max-age=31536000 ; includeSubDomains
x-arango-queue-time-seconds: 0.000000
x-content-type-options: nosniff
Return last available tick value
GET /_api/wal/lastTick
Returns the last available tick value that can be served from the server’s replication log. This corresponds to the tick of the latest successful operation.
The result is a JSON object containing the attributes tick, time and server.
- tick: contains the last available tick
- time: the server time as string in format “YYYY-MM-DDTHH:MM:SSZ”
- server: An object with fields version and serverId
Note: this method is not supported on a Coordinator in a cluster.
Responses
HTTP 200: is returned if the request was executed successfully.
HTTP 405: is returned when an invalid HTTP method is used.
HTTP 500: is returned if an error occurred while assembling the response.
HTTP 501: is returned when this operation is called on a Coordinator in a cluster.
Examples
Returning the last available tick
shell> curl --header 'accept: application/json' --dump - http://localhost:8529/_api/wal/lastTick
HTTP/1.1 200 OK
content-type: application/json
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0, max-age=0, s-maxage=0
connection: Keep-Alive
content-length: 106
content-security-policy: frame-ancestors 'self'; form-action 'self';
expires: 0
pragma: no-cache
server: ArangoDB
strict-transport-security: max-age=31536000 ; includeSubDomains
x-arango-queue-time-seconds: 0.000000
x-content-type-options: nosniff
Tail recent server operations
Fetch recent operations
GET /_api/wal/tail
Query Parameters
- global (boolean, optional): Whether operations for all databases should be included. If set to false, only the operations for the current database are included. The value true is only valid on the _system database. The default is false.
- from (number, optional): Exclusive lower bound tick value for results. On successive calls to this API you should set this to the value returned with the x-arango-replication-lastincluded header (unless that header contains 0).
- to (number, optional): Inclusive upper bound tick value for results.
- lastScanned (number, optional): Should be set to the value of the x-arango-replication-lastscanned header or alternatively 0 on first try. This allows the RocksDB storage engine to break up large transactions over multiple responses.
- chunkSize (number, optional): Approximate maximum size of the returned result.
- syncerId (number, optional): The ID of the client used to tail results. The server uses this to keep operations until the client has fetched them. Must be a positive integer. Note: syncerId or serverId is required to have a chance at fetching all operations with the RocksDB storage engine.
- serverId (number, optional): The ID of the client machine. If syncerId is unset, the server uses this to keep operations until the client has fetched them. Must be a positive integer. Note: serverId or syncerId is required to have a chance at fetching all operations with the RocksDB storage engine.
- clientInfo (string, optional): Short description of the client, used for informative purposes only.
Returns data from the server’s write-ahead log (also named replication log). This method can be called by replication clients after an initial synchronization of data. The method returns all “recent” logged operations from the server. Clients can replay and apply these operations locally so they get to the same data state as the server.
Clients can call this method repeatedly to incrementally fetch all changes from the server. In this case, they should provide the from value so that they only receive the log events since their last fetch.
When the from query parameter is not used, the server returns log entries starting at the beginning of its replication log. When the from parameter is used, the server only returns log entries which have higher tick values than the specified from value (note: the log entry with a tick value equal to from is excluded). Use the from value when incrementally fetching log data.
The to query parameter can be used to optionally restrict the upper bound of the result to a certain tick value. If used, the result contains only log events with tick values up to (including) to. In incremental fetching, there is no need to use the to parameter. It only makes sense in special situations, when only parts of the change log are required.
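The bounds described above (from is exclusive, to is inclusive) can be pictured with a small Python sketch. The function `in_requested_range` is a hypothetical helper that only mimics the documented selection rule on made-up tick values; it is not how the server is implemented.

```python
def in_requested_range(tick, from_tick=None, to_tick=None):
    """Mimic the documented bounds: `from` is exclusive, `to` is inclusive."""
    tick = int(tick)  # ticks may be stringified numbers
    if from_tick is not None and tick <= int(from_tick):
        return False  # entries at or below `from` are excluded
    if to_tick is not None and tick > int(to_tick):
        return False  # entries above `to` are excluded
    return True


ticks = ["100", "101", "102", "103"]  # made-up tick values
selected = [t for t in ticks if in_requested_range(t, from_tick=100, to_tick=102)]
print(selected)  # ['101', '102']
```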
The chunkSize query parameter can be used to control the size of the result. It must be specified in bytes. The chunkSize value is only honored approximately; otherwise, a chunkSize value that is too low could prevent the server from returning even a single log entry. Therefore, the chunkSize value is only consulted after a log entry has been written into the result. If the result size is then bigger than chunkSize, the server responds with the log entries that are already in the result. If the result size is still smaller than chunkSize, the server tries to return more data if there is more data left to return.
If chunkSize is not specified, a server-side default value is used.
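The following Python sketch models the chunking rule described above: the size limit is checked only after an entry has been added, so at least one entry is always returned. This is a simplified model under stated assumptions (entries are pre-serialized strings, one byte is counted per line separator); the function name `fill_response` is hypothetical and the server's real accounting may differ.

```python
def fill_response(entries, chunk_size):
    """Model of the documented rule: check the size only after adding an entry."""
    result = []
    size = 0
    for entry in entries:
        result.append(entry)
        size += len(entry.encode("utf-8")) + 1  # +1 for the newline separator
        if size > chunk_size:
            break  # limit exceeded: stop, but keep what was already added
    return result


entries = ['{"tick":"1"}', '{"tick":"2"}', '{"tick":"3"}']  # 12 bytes each
print(len(fill_response(entries, chunk_size=1)))     # 1
print(len(fill_response(entries, chunk_size=20)))    # 2
print(len(fill_response(entries, chunk_size=1000)))  # 3
```

Even with a limit of one byte, the first entry is still returned, which is why the response can be larger than the requested chunkSize.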
The Content-Type of the result is application/x-arango-dump. This is an easy-to-process format, with all log events going onto separate lines in the response body. Each log event itself is a JSON object, with at least the following attributes:
- tick: the log event tick value
- type: the log event type
Individual log events also have additional attributes, depending on the event type. A few common attributes which are used for multiple event types are:
- cuid: globally unique id of the View or collection the event was for
- db: the database name the event was for
- tid: id of the transaction the event was contained in
- data: the original document data
A more detailed description of the individual replication event types and their data structures can be found in Operation Types.
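Because the body carries one JSON object per line, a client can split it into lines and decode each line separately. The Python sketch below shows one way to do this with the standard library; the helper name `parse_dump` is hypothetical, and the sample body is assembled from the transaction examples in the Operation Types section rather than taken from a real response.

```python
import json


def parse_dump(body: str) -> list:
    """Parse a body with one JSON object per line, skipping blank lines."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if line:
            events.append(json.loads(line))
    return events


# Sample body built from documented event shapes (assumption, not real output).
body = (
    '{"tick":"3651","type":2200,"db":"_system","tid":"556"}\n'
    '{"tick":"3652","type":2201,"db":"_system","tid":"556"}\n'
)
for event in parse_dump(body):
    print(event["tick"], event["type"])
```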
The response also contains the following HTTP headers:
- x-arango-replication-active: whether or not the logger is active. Clients can use this flag as an indication for their polling frequency. If the logger is not active and there are no more replication events available, it might be sensible for a client to abort, or to go to sleep for a long time and try again later to check whether the logger has been activated.
- x-arango-replication-lastincluded: the tick value of the last included value in the result. In incremental log fetching, this value can be used as the from value for the following request. Note that if the result is empty, the value is 0. In that case, clients should not use it as the from value in the next request (otherwise the server would return the log events from the start of the log again).
- x-arango-replication-lastscanned: the last tick the server scanned while computing the operation log. This might include operations the server did not return to you for various reasons (for example, because the value was filtered or skipped). You may use this value in the lastScanned query parameter to allow the RocksDB storage engine to break up requests over multiple responses.
- x-arango-replication-lasttick: the last tick value the server has logged in its write-ahead log (not necessarily included in the result). By comparing the last tick and last included tick values, clients have an approximate indication of how many events there are still left to fetch.
- x-arango-replication-frompresent: set to true if the server returned all tick values starting from the tick specified in the from parameter. If it is set to false, the server no longer had these operations and the client might have missed operations.
- x-arango-replication-checkmore: whether or not there already exists more log data which the client could fetch immediately. If there is more log data available, the client could call the tailing API again with an adjusted from value to fetch remaining log entries until there are no more. If there isn't any more log data to fetch, the client might decide to go to sleep for a while before calling the logger again.
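To make the interplay of these headers concrete, the Python sketch below derives the parameters for the next poll from one response. The function `plan_next_request` and the keys of the returned dictionary are hypothetical names chosen for illustration. The sample header values are copied from the examples on this page, and the `ticks_behind` figure is only the rough indication mentioned above, not an event count.

```python
def plan_next_request(headers, previous_from):
    """Derive the next polling step from the replication response headers."""
    # HTTP header names are case-insensitive, so normalize to lowercase.
    h = {k.lower(): v for k, v in headers.items()}
    last_included = int(h.get("x-arango-replication-lastincluded", "0"))
    last_tick = int(h.get("x-arango-replication-lasttick", "0"))
    # A lastincluded value of 0 signals an empty result: keep the previous from.
    next_from = last_included if last_included != 0 else previous_from
    return {
        "from": next_from,
        "lastScanned": int(h.get("x-arango-replication-lastscanned", "0")),
        "fetch_immediately": h.get("x-arango-replication-checkmore") == "true",
        "may_have_missed_operations":
            h.get("x-arango-replication-frompresent") == "false",
        "ticks_behind": max(last_tick - next_from, 0),
    }


headers = {
    "x-arango-replication-checkmore": "true",
    "x-arango-replication-frompresent": "true",
    "x-arango-replication-lastincluded": "186154",
    "x-arango-replication-lastscanned": "186167",
    "x-arango-replication-lasttick": "186167",
}
print(plan_next_request(headers, previous_from=186138))
```

A client loop would typically repeat the request right away while `fetch_immediately` is true and otherwise sleep before polling again.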
Note: this method is not supported on a Coordinator in a cluster.
Responses
HTTP 200: is returned if the request was executed successfully, and there are log events available for the requested range. The response body is not empty in this case.
HTTP 204: is returned if the request was executed successfully, but there are no log events available for the requested range. The response body is empty in this case.
HTTP 400: is returned if either the from or to values are invalid.
HTTP 405: is returned when an invalid HTTP method is used.
HTTP 500: is returned if an error occurred while assembling the response.
HTTP 501: is returned when this operation is called on a Coordinator in a cluster.
Examples
No log events available
shell> curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=186138'
HTTP/1.1 204 No Content
content-type: application/x-arango-dump
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0, max-age=0, s-maxage=0
connection: Keep-Alive
content-length: 0
content-security-policy: frame-ancestors 'self'; form-action 'self';
expires: 0
pragma: no-cache
server: ArangoDB
strict-transport-security: max-age=31536000 ; includeSubDomains
x-arango-queue-time-seconds: 0.000000
x-arango-replication-checkmore: false
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 0
x-arango-replication-lastscanned: 186135
x-arango-replication-lasttick: 186138
x-content-type-options: nosniff
A few log events (One JSON document per line)
shell> curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=186138'
HTTP/1.1 200 OK
content-type: application/x-arango-dump
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0, max-age=0, s-maxage=0
connection: Keep-Alive
content-length: 74
content-security-policy: frame-ancestors 'self'; form-action 'self';
expires: 0
pragma: no-cache
server: ArangoDB
strict-transport-security: max-age=31536000 ; includeSubDomains
x-arango-queue-time-seconds: 0.000000
x-arango-replication-checkmore: true
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 186154
x-arango-replication-lastscanned: 186167
x-arango-replication-lasttick: 186167
x-content-type-options: nosniff
More events than would fit into the response
shell> curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=186112&chunkSize=400'
HTTP/1.1 200 OK
content-type: application/x-arango-dump
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0, max-age=0, s-maxage=0
connection: Keep-Alive
content-length: 74
content-security-policy: frame-ancestors 'self'; form-action 'self';
expires: 0
pragma: no-cache
server: ArangoDB
strict-transport-security: max-age=31536000 ; includeSubDomains
x-arango-queue-time-seconds: 0.000000
x-arango-replication-checkmore: true
x-arango-replication-frompresent: true
x-arango-replication-lastincluded: 186128
x-arango-replication-lastscanned: 186138
x-arango-replication-lasttick: 186138
x-content-type-options: nosniff
Operation Types
There are several different operation types that an ArangoDB server might print. All operations include a tick value which identifies their place in the operations log. The numeric fields tick and tid always contain stringified numbers to avoid problems with drivers where numbers in JSON might be mishandled.
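One practical consequence of stringified ticks is that they should be converted to integers before they are compared or sorted, because string comparison orders them lexicographically. A minimal Python illustration with made-up tick values:

```python
events = [{"tick": "1000"}, {"tick": "999"}]  # made-up tick values

# Plain string sorting compares character by character: "1" sorts before "9".
by_string = sorted(e["tick"] for e in events)
# Converting to int restores the numeric order of the operations log.
by_number = sorted((e["tick"] for e in events), key=int)

print(by_string)  # ['1000', '999']
print(by_number)  # ['999', '1000']
```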
The following operation types are used in ArangoDB:
Create Database (1100)
Create a database. Contains the field db with the database name and the field data, which contains the database definition.
{
"tick": "2103",
"type": 1100,
"db": "test",
"data": {
"database": 337,
"id": "337",
"name": "test"
}
}
Drop Database (1101)
Drop a database. Contains the field db with the database name.
{
"tick": "3453",
"type": 1101,
"db": "test"
}
Create Collection (2000)
Create a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the collection definition.
{
"tick": "3702",
"db": "_system",
"cuid": "hC0CF79DA83B4/555",
"type": 2000,
"data": {
"allowUserKeys": true,
"cacheEnabled": false,
"cid": "555",
"deleted": false,
"globallyUniqueId": "hC0CF79DA83B4/555",
"id": "555",
"indexes": [],
"isSystem": false,
"keyOptions": {
"allowUserKeys": true,
"lastValue": 0,
"type": "traditional"
},
"name": "test"
}
}
Drop Collection (2001)
Drop a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.
{
"tick": "154",
"type": 2001,
"db": "_system",
"cuid": "hD15F8FE99859/555"
}
Rename Collection (2002)
Rename a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data field contains the name field with the new name.
{
"tick": "385",
"db": "_system",
"cuid": "hD15F8FE99859/135",
"type": 2002,
"data": {
"name": "other"
}
}
Change Collection (2003)
Change collection properties. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the updated collection definition.
{
"tick": "154",
"type": 2003,
"db": "_system",
"cuid": "hD15F8FE99859/555",
"data": {
"waitForSync": true
}
}
Truncate Collection (2004)
Truncate a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.
{
"tick": "154",
"type": 2004,
"db": "_system",
"cuid": "hD15F8FE99859/555"
}
Create Index (2100)
Create an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the index definition.
{
"tick": "1327",
"type": 2100,
"db": "_system",
"cuid": "hD15F8FE99859/555",
"data": {
"deduplicate": true,
"fields": [
"value"
],
"id": "260",
"selectivityEstimate": 1,
"sparse": false,
"type": "persistent",
"unique": false
}
}
Drop Index (2101)
Drop an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the field id with the index id.
{
"tick": "1522",
"type": 2101,
"db": "_system",
"cuid": "hD15F8FE99859/555",
"data": {
"id": "260"
}
}
Create View (2110)
Create a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view. The field data contains the view definition.
{
"tick": "1833",
"type": 2110,
"db": "_system",
"cuid": "hD15F8FE99859/322",
"data": {
"cleanupIntervalStep": 10,
"collections": [],
"commitIntervalMsec": 60000,
"consolidate": {
"segmentThreshold": 300,
"threshold": 0.8500000238418579,
"type": "tier"
},
"deleted": false,
"globallyUniqueId": "hD15F8FE99859/322",
"id": "322",
"isSystem": false,
"locale": "C",
"name": "myview",
"type": "arangosearch"
}
}
Drop View (2111)
Drop a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view.
{
"tick": "3113",
"type": 2111,
"db": "_system",
"cuid": "hD15F8FE99859/322"
}
Change View (2112)
Change view properties (including the name). Contains the field db with the database name and cuid with the globally unique id to identify this view. The data attribute contains the updated properties.
{
"tick": "3014",
"type": 2112,
"db": "_system",
"cuid": "hD15F8FE99859/457",
"data": {
"cleanupIntervalStep": 10,
"collections": [
135
],
"commitIntervalMsec": 60000,
"consolidate": {
"segmentThreshold": 300,
"threshold": 0.8500000238418579,
"type": "tier"
},
"deleted": false,
"globallyUniqueId": "hD15F8FE99859/457",
"id": "457",
"isSystem": false,
"locale": "C",
"name": "renamedview",
"type": "arangosearch"
}
}
Start Transaction (2200)
Mark the beginning of a transaction. Contains the field db with the database name and the field tid for the transaction id. This log entry might be followed by zero or more document operations and then either one commit or an abort operation (i.e. types 2300, 2302 and 2201 / 2202) with the same tid value.
{
"tick": "3651",
"type": 2200,
"db": "_system",
"tid": "556"
}
Commit Transaction (2201)
Mark the successful end of a transaction. Contains the field db with the database name and the field tid for the transaction id.
{
"tick": "3652",
"type": 2201,
"db": "_system",
"tid": "556"
}
Abort Transaction (2202)
Mark the abort of a transaction. Contains the field db with the database name and the field tid for the transaction id.
{
"tick": "3654",
"type": 2202,
"db": "_system",
"tid": "556"
}
Insert / Replace Document (2300)
Insert or replace a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the document. If the field _rev exists the client can choose to perform a revision check against a locally available version of the document to ensure consistency.
{
"tick": "196",
"type": 2300,
"db": "_system",
"tid": "0",
"cuid": "hE0E3D7BE511D/119",
"data": {
"_id": "users/194",
"_key": "194",
"_rev": "_XUJFD3C---",
"value": "test"
}
}
Remove Document (2302)
Remove a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the _key and _rev of the removed document. The client can choose to perform a revision check against a locally available version of the document to ensure consistency.
{
"cuid": "hE0E3D7BE511D/119",
"data": {
"_key": "194",
"_rev": "_XUJIbS---_"
},
"db": "_system",
"tick": "397",
"tid": "0",
"type": 2302
}
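The transaction and document operation types above can be combined into a small replay loop. The Python sketch below is one possible client-side approach, not the replication logic ArangoDB itself uses: it buffers document operations between a start (2200) and its commit (2201) or abort (2202), and applies operations outside an open transaction (such as those with tid "0") immediately. The function `apply_events` and the in-memory `store` keyed by (cuid, _key) are hypothetical. The sketch ignores database, collection, index, and View operations and assumes each transaction is fully contained in the given list.

```python
START, COMMIT, ABORT = 2200, 2201, 2202
INSERT_REPLACE, REMOVE = 2300, 2302


def apply_events(events, store):
    """Apply document operations to `store`, a dict keyed by (cuid, _key)."""
    pending = {}  # tid -> buffered document operations of an open transaction

    def apply_one(event):
        key = (event["cuid"], event["data"]["_key"])
        if event["type"] == INSERT_REPLACE:
            store[key] = event["data"]
        else:  # REMOVE
            store.pop(key, None)

    for event in events:
        kind, tid = event["type"], event.get("tid")
        if kind == START:
            pending[tid] = []
        elif kind == COMMIT:
            for buffered in pending.pop(tid, []):
                apply_one(buffered)
        elif kind == ABORT:
            pending.pop(tid, None)  # discard everything the transaction did
        elif kind in (INSERT_REPLACE, REMOVE):
            if tid in pending:
                pending[tid].append(event)  # wait for commit or abort
            else:
                apply_one(event)  # standalone operation, e.g. tid "0"
    return store


C = "hE0E3D7BE511D/119"  # collection id taken from the examples above
events = [
    {"tick": "196", "type": 2300, "tid": "0", "cuid": C,
     "data": {"_key": "194", "value": "test"}},
    {"tick": "200", "type": 2200, "tid": "556"},
    {"tick": "201", "type": 2300, "tid": "556", "cuid": C,
     "data": {"_key": "195", "value": "in transaction"}},
    {"tick": "202", "type": 2201, "tid": "556"},
    {"tick": "203", "type": 2200, "tid": "557"},
    {"tick": "204", "type": 2302, "tid": "557", "cuid": C,
     "data": {"_key": "194"}},
    {"tick": "205", "type": 2202, "tid": "557"},
]
store = apply_events(events, {})
print(sorted(key for _, key in store))  # ['194', '195']
```

The removal of document 194 belongs to the aborted transaction 557, so it never reaches the store, while the insert from the committed transaction 556 does.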