ArangoDB Spark Connector - Scala Reference
This library has been deprecated in favor of the new ArangoDB Datasource for Apache Spark.
ArangoSpark.save
ArangoSpark.save[T](rdd: RDD[T], collection: String, options: WriteOptions)
ArangoSpark.save[T](dataset: Dataset[T], collection: String, options: WriteOptions)
Save data from rdd or dataset into ArangoDB
Arguments
-
rdd / dataset:
RDD[T]
orDataset[T]
The rdd or dataset with the data to save
-
collection:
String
The collection to save in
-
options:
WriteOptions
-
database:
String
Database to write into
-
hosts:
String
Alternative hosts to context property
arangodb.hosts
-
user:
String
Alternative user to context property
arangodb.user
-
password:
String
Alternative password to context property
arangodb.password
-
useSsl:
Boolean
Alternative useSsl to context property
arangodb.useSsl
-
sslKeyStoreFile:
String
Alternative sslKeyStoreFile to context property
arangodb.ssl.keyStoreFile
-
sslPassPhrase:
String
Alternative sslPassPhrase to context property
arangodb.ssl.passPhrase
-
sslProtocol:
String
Alternative sslProtocol to context property
arangodb.ssl.protocol
-
method:
WriteOptions.Method
Write method to use, it can be one of:
WriteOptions.INSERT
WriteOptions.UPDATE
WriteOptions.REPLACE
-
Examples
val sc: SparkContext = ...
val documents = sc.parallelize((1 to 100).map { i => MyBean(i) })
ArangoSpark.save(documents, "myCollection", WriteOptions("myDB"))
ArangoSpark.saveDF
ArangoSpark.saveDF(dataframe: DataFrame, collection: String, options: WriteOptions)
Save data from dataframe into ArangoDB
Arguments
-
dataframe: DataFrame`
The dataFrame with the data to save
-
collection:
String
The collection to save in
-
options:
WriteOptions
-
database:
String
Database to write into
-
hosts:
String
Alternative hosts to context property
arangodb.hosts
-
user:
String
Alternative user to context property
arangodb.user
-
password:
String
Alternative password to context property
arangodb.password
-
useSsl:
Boolean
Alternative useSsl to context property
arangodb.useSsl
-
sslKeyStoreFile:
String
Alternative sslKeyStoreFile to context property
arangodb.ssl.keyStoreFile
-
sslPassPhrase:
String
Alternative sslPassPhrase to context property
arangodb.ssl.passPhrase
-
sslProtocol:
String
Alternative sslProtocol to context property
arangodb.ssl.protocol
-
method:
WriteOptions.Method
Write method to use, it can be one of:
WriteOptions.INSERT
WriteOptions.UPDATE
WriteOptions.REPLACE
-
Examples
val sc: SparkContext = ...
val documents = sc.parallelize((1 to 100).map { i => MyBean(i) })
val sql: SQLContext = SQLContext.getOrCreate(sc);
val df = sql.createDataFrame(documents, classOf[MyBean])
ArangoSpark.saveDF(df, "myCollection", WriteOptions("myDB"))
ArangoSpark.load
ArangoSpark.load[T: ClassTag](sparkContext: SparkContext, collection: String, options: ReadOptions): ArangoRDD[T]
Load data from ArangoDB into rdd
Arguments
-
sparkContext:
SparkContext
The sparkContext containing the ArangoDB configuration
-
collection:
String
The collection to load data from
-
options:
ReadOptions
-
database:
String
Database to write into
-
hosts:
String
Alternative hosts to context property
arangodb.hosts
-
user:
String
Alternative user to context property
arangodb.user
-
password:
String
Alternative password to context property
arangodb.password
-
useSsl:
Boolean
Alternative useSsl to context property
arangodb.useSsl
-
sslKeyStoreFile:
String
Alternative sslKeyStoreFile to context property
arangodb.ssl.keyStoreFile
-
sslPassPhrase:
String
Alternative sslPassPhrase to context property
arangodb.ssl.passPhrase
-
sslProtocol:
String
Alternative sslProtocol to context property
arangodb.ssl.protocol
-
Examples
val sc: SparkContext = ...
val rdd = ArangoSpark.load[MyBean](sc, "myCollection", ReadOptions("myDB"))
ArangoRDD.filter
ArangoRDD.filter(condition: String): ArangoRDD[T]
Adds a filter condition. If used multiple times, the conditions will be combined with a logical AND.
Arguments
-
condition:
String
The condition for the filter statement. Use
doc
inside to reference the document. e.g."doc.name == 'John'"
Examples
val sc: SparkContext = ...
val rdd = ArangoSpark.load[MyBean](sc, "myCollection").filter("doc.name == 'John'")
Spark Streaming Integration
RDDs can also be saved to ArangoDB from Spark Streaming using ArangoSpark.save().
Example
dStream.foreachRDD(rdd =>
ArangoSpark.save(rdd, COLLECTION, new WriteOptions().database(DB)))