Analytic Query Support
Overview
Elide includes a semantic modeling layer and analytic query API for OLAP style queries against our database.
A semantic model is the view of the data we want our users to understand. It is typically non-relational (for simplicity) and consists of concepts like tables, measures, and dimensions. End users refer to these concepts by name only (they are not expected to derive formulas or know about the physical storage or serialization of data).
A virtual semantic layer maps a semantic model to columns and tables in a physical database. Elide's virtual semantic layer accomplishes this mapping through a Hjson configuration language. Hjson is a human friendly adaptation of JSON that allows comments and a relaxed syntax among other features. Elide's virtual semantic layer includes the following information:
- The defintions of tables, measures, and dimensions we want to expose to the end user.
- Metadata like descriptions, categories, and tags that better describe and label the semantic model.
- For every table, measure, and dimension, a SQL fragment that maps it to the physical data. These fragements are used by elide to generate native SQL queries against the target database.
Elide leverages the AggregationDataStore
store to expose the read-only models defined in the semantic model. Model
attributes represent either metrics (for aggregating, filtering, and sorting) and dimensions (for grouping, filtering,
and sorting). Models exposed through the aggregation store are flat and do not contain relationships to other models.
The Aggregation store includes a companion store, the MetaDataStore
, which exposes metadata about the Aggregation
store models including their metrics and dimensions. The metadata store models are predefined, read-only, and served
from server memory.
There are two mechanisms to create models in the Aggregation store's semantic layer:
- Through Hjson configuration files that can be maintained without writing code or rebuilding the application.
- Through JVM language classes annotated with Elide annotations.
The former is preferred for most use cases because of better ergonomics for non-developers. The latter is useful to add custom Elide security rules or life cycle hooks.
Querying
Models managed by the AggregationDataStore
can be queried via JSON-API or GraphQL similar to other Elide models.
There are a few important distinctions:
- If one or more metrics are included in the query, every requested dimension will be used to aggregate the selected metrics.
- If only dimensions (no metrics) are included in the query, Elide will return a distinct list of the requested dimension value combinations.
- Every elide model includes an ID field. The ID field returned from aggregation store models is not a true identifier. It represents the row number from a returned result. Attempts to load the model by its identifier will result in an error.
Analytic Queries
Similar to other Elide models, analytic models can be sorted, filtered, and paginated. A typical analytic query might look like:
- JSON-API
- GraphQL
/playerStats?fields[playerStats]=highScore,overallRating,countryIsoCode&sort=highScore
{
playerStats(sort: "highScore") {
edges {
node {
highScore
overallRating
countryIsoCode
}
}
}
}
Conceptually, these queries might generate SQL similar to:
SELECT MAX(highScore), overallRating, countryIsoCode FROM playerStats GROUP BY overallRating, countryIsoCode ORDER BY MAX(highScore) ASC;
Here are the respective responses:
- JSON-API
- GraphQL
{
"data": [
{
"type": "playerStats",
"id": "0",
"attributes": {
"countryIsoCode": "HKG",
"highScore": 1000,
"overallRating": "Good"
}
},
{
"type": "playerStats",
"id": "1",
"attributes": {
"countryIsoCode": "USA",
"highScore": 1234,
"overallRating": "Good"
}
},
{
"type": "playerStats",
"id": "2",
"attributes": {
"countryIsoCode": "USA",
"highScore": 2412,
"overallRating": "Great"
}
}
]
}
{
"data": {
"playerStats": {
"edges": [
{
"node": {
"highScore": 1000,
"overallRating": "Good",
"countryIsoCode": "HKG"
}
},
{
"node": {
"highScore": 1234,
"overallRating": "Good",
"countryIsoCode": "USA"
}
},
{
"node": {
"highScore": 2412,
"overallRating": "Great",
"countryIsoCode": "USA"
}
}
]
}
}
}
Metadata Queries
A full list of available table and column metadata is covered in the configuration section. Metadata can be queried through the table model and its associated relationships.
- JSON-API
- GraphQL
/table/playerStats?fields[table]=name,category,description,requiredFilter,tags,metrics,dimensions,timeDimensions
{
table(ids: ["playerStats"]) {
edges {
node {
name
category
description
requiredFilter
tags
metrics {edges {node {id}}}
dimensions {edges {node {id}}}
timeDimensions {edges {node {id}}}
}
}
}
}
Here are the respective responses:
- JSON-API
- GraphQL
{
"data": {
"type": "table",
"id": "playerStats",
"attributes": {
"category": "Sports Category",
"description": "Player Statistics",
"name": "playerStats",
"requiredFilter": "",
"tags": [
"Game",
"Statistics"
]
},
"relationships": {
"dimensions": {
"data": [
{
"type": "dimension",
"id": "playerStats.playerName"
},
{
"type": "dimension",
"id": "playerStats.player2Name"
},
{
"type": "dimension",
"id": "playerStats.playerLevel"
},
{
"type": "dimension",
"id": "playerStats.overallRating"
},
{
"type": "dimension",
"id": "playerStats.countryIsInUsa"
},
{
"type": "dimension",
"id": "playerStats.countryIsoCode"
},
{
"type": "dimension",
"id": "playerStats.countryUnSeats"
},
{
"type": "dimension",
"id": "playerStats.countryNickName"
},
{
"type": "dimension",
"id": "playerStats.subCountryIsoCode"
}
]
},
"metrics": {
"data": [
{
"type": "dimension",
"id": "playerStats.id"
},
{
"type": "metric",
"id": "playerStats.lowScore"
},
{
"type": "metric",
"id": "playerStats.highScore"
},
{
"type": "metric",
"id": "playerStats.highScoreNoAgg"
}
]
},
"timeDimensions": {
"data": [
{
"type": "timeDimension",
"id": "playerStats.updatedDate"
},
{
"type": "timeDimension",
"id": "playerStats.recordedDate"
}
]
}
}
}
}
{
"data": {
"table": {
"edges": [
{
"node": {
"name": "playerStats",
"category": "Sports Category",
"description": "Player Statistics",
"requiredFilter": "",
"tags": [
"Game",
"Statistics"
],
"metrics": {
"edges": [
{
"node": {
"id": "playerStats.id"
}
},
{
"node": {
"id": "playerStats.highScoreNoAgg"
}
},
{
"node": {
"id": "playerStats.lowScore"
}
},
{
"node": {
"id": "playerStats.highScore"
}
}
]
},
"dimensions": {
"edges": [
{
"node": {
"id": "playerStats.countryUnSeats"
}
},
{
"node": {
"id": "playerStats.overallRating"
}
},
{
"node": {
"id": "playerStats.countryNickName"
}
},
{
"node": {
"id": "playerStats.player2Name"
}
},
{
"node": {
"id": "playerStats.countryIsoCode"
}
},
{
"node": {
"id": "playerStats.playerName"
}
},
{
"node": {
"id": "playerStats.playerLevel"
}
},
{
"node": {
"id": "playerStats.countryIsInUsa"
}
},
{
"node": {
"id": "playerStats.subCountryIsoCode"
}
}
]
},
"timeDimensions": {
"edges": [
{
"node": {
"id": "playerStats.recordedDate"
}
},
{
"node": {
"id": "playerStats.updatedDate"
}
}
]
}
}
}
]
}
}
}
Configuration
Feature Flags
There are feature flags that enable Hjson configuration, analytic queries, and Metadata queries respectively:
Name | Description | Default |
---|---|---|
elide.aggregation-store.dynamic-config.enabled | Enable model creation through the Hjson configuration files. | false |
elide.aggregation-store.enabled | Enable support for data analytic queries. | false |
elide.aggregation-store.metadata-store.enabled | Enable the metadata query APIs exposing the metadata about the Aggregation store models including their metrics and dimensions. | false |
- Spring
- Elide Standalone
Configure in application.yaml
.
elide:
aggregation-store:
enabled: true
metadata-store:
enabled: true
dynamic-config:
enabled: true
Override ElideStandaloneSettings
.
public abstract class Settings implements ElideStandaloneSettings {
@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
return new ElideStandaloneAnalyticSettings() {
@Override
public boolean enableDynamicModelConfig() {
return true;
}
@Override
public boolean enableAggregationDataStore() {
return true;
}
@Override
public boolean enableMetaDataStore() {
return true;
}
};
}
}
File Layout
Analtyic model configuration can either be specified through JVM classes decorated with Elide annotations or Hjson configuration files. Hjson configuration files can be sourced either from the local filesystem or the classpath. If Hjson configuration is found in the classpath, the filesystem is ignored. All Hjson configuration must conform to the following directory structure:
CONFIG_ROOT/
├── models/
| ├── tables/
| | ├── model1.hjson
| | ├── model2.hjson
| ├── namespaces/
| | ├── namespace1.hjson
| | ├── namespace2.hjson
| ├── security.hjson
| └── variables.hjson
├── db/
| ├── sql/
| | ├── db1.hjson
| ├── variables.hjson
- Analytic model files are stored in
/models/tables
. Multiple models can be grouped together into a single file. - Analytic models can optionally belong to a namespace - a grouping of related models with the same API prefix.
Namespace configuration is defined in
/models/namespaces
. - Security rules are stored in
/models/security.hjson
. - Model, namespace, and security Hjson files support variable substitution with variables defined in
/models/variables.hjson
. - Data source configurations are stored in
/db/sql
. Multiple configurations can be grouped together into a single file. - Data source Hjson files support variable substitution with variables defined in
/db/variables.hjson
.
CONFIG_ROOT can be any directory in the filesystem or classpath. The root configuration location can be set as follows:
- Spring
- Elide Standalone
Configure in application.yaml
.
elide:
aggregation-store:
dynamic-config:
path: src/resources/configs
Override ElideStandaloneSettings
.
public abstract class Settings implements ElideStandaloneSettings {
@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
return new ElideStandaloneAnalyticSettings() {
@Override
public String getDynamicConfigPath() {
return File.separator + "configs" + File.separator;
}
};
}
}
Data Source Configuration
The Aggregation Data Store does not leverage JPA, but rather uses JDBC directly. By default, Elide will leverage the default JPA configuration for establishing connections through the Aggregation Data Store. However, more complex configurations are possible including:
- Using a different JDBC data source other than what is configured for JPA.
- Leveraging multiple JDBC data sources for different Elide models.
For these complex configurations, we must configure Elide using the Aggregation Store's Hjson configuration language. The following configuration file illustrates two data sources. Each data source configuration includes:
- A name that will be referenced in our Analytic models (effectively binding them to a data source).
- A JDBC URL
- A JDBC driver
- A user name
- An Elide SQL Dialect. This can either be the name of an Elide supported dialect or it can be the fully qualified class name of an implementation of an Elide dialect.
- A map of driver specific properties.
{
dbconfigs:
[
{
name: Presto Data Source
url: jdbc:presto://localhost:4443/testdb
driver: com.facebook.presto.jdbc.PrestoDriver
user: guestdb2
dialect: PrestoDB
}
{
name: Hive Data Source
url: jdbc:hive2://localhost:4444/dbName
driver: org.apache.hive.jdbc.HiveDriver
user: guestmysql
dialect: com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.HiveDialect
propertyMap:
{
sslEnabled : true
}
}
]
}
By default, Elide uses HikariCP's DataSource for JDBC connection pool. A custom DataSourceConfiguration
can be
configured by the following override:
- Spring
- Elide Standalone
Create a @Configuration
class that defines our custom implementation as a @Bean
.
@Configuration
public class ElideConfiguration {
@Bean
public DataSourceConfiguration dataSourceConfiguration() {
return new DataSourceConfiguration() {
@Override
public DataSource getDataSource(DBConfig dbConfig, DBPasswordExtractor dbPasswordExtractor) {
HikariConfig config = new HikariConfig();
config.setJdbcUrl(dbConfig.getUrl());
config.setUsername(dbConfig.getUser());
config.setPassword(dbPasswordExtractor.getDBPassword(dbConfig));
config.setDriverClassName(dbConfig.getDriver());
dbConfig.getPropertyMap().forEach((k, v) -> config.addDataSourceProperty(k, v));
return new HikariDataSource(config);
}
};
}
}
Override ElideStandaloneSettings
.
public abstract class Settings implements ElideStandaloneSettings {
@Override
public DataSourceConfiguration getDataSourceConfiguration() {
return new DataSourceConfiguration() {
@Override
public DataSource getDataSource(DBConfig dbConfig, DBPasswordExtractor dbPasswordExtractor) {
HikariConfig config = new HikariConfig();
config.setJdbcUrl(dbConfig.getUrl());
config.setUsername(dbConfig.getUser());
config.setPassword(dbPasswordExtractor.getDBPassword(dbConfig));
config.setDriverClassName(dbConfig.getDriver());
dbConfig.getPropertyMap().forEach((k, v) -> config.addDataSourceProperty(k, v));
return new HikariDataSource(config);
}
};
}
}
Data Source Passwords
Data source passwords are provided out of band by implementing a DBPasswordExtractor
:
public interface DBPasswordExtractor {
String getDBPassword(DBConfig config);
}
A custom DBPasswordExtractor
can be configured by the following override:
- Spring
- Elide Standalone
Create a @Configuration
class that defines our custom implementation as a @Bean
.
@Configuration
public class ElideConfiguration {
@Bean
public DBPasswordExtractor dbPasswordExtractor() {
return new DBPasswordExtractor() {
@Override
public String getDBPassword(DBConfig config) {
return StringUtils.EMPTY;
}
};
}
}
Override ElideStandaloneSettings
.
public abstract class Settings implements ElideStandaloneSettings {
@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
return new ElideStandaloneAnalyticSettings() {
@Override
public DBPasswordExtractor getDBPasswordExtractor() {
return new DBPasswordExtractor() {
@Override
public String getDBPassword(DBConfig config) {
return StringUtils.EMPTY;
}
};
}
};
}
}
Dialects
A dialect must be configured for Elide to correctly generate analytic SQL queries. Elide supports the following dialects out of the box:
Friendly Name | Class |
---|---|
H2 | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.H2Dialect |
Hive | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.HiveDialect |
PrestoDB | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.PrestoDBDialect |
Postgres | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.PostgresDialect |
MySQL | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.MySQLDialect |
Druid | com.paiondata.elide.datastores.aggregation.queryengines.sql.dialects.impl.DruidDialect |
If not leveraging Hjson configuration, a default dialect can be configured for analytic queries:
- Spring
- Elide Standalone
Configure in application.yaml
.
elide:
aggregation-store:
default-dialect: H2
Override ElideStandaloneSettings
.
public abstract class Settings implements ElideStandaloneSettings {
@Override
public ElideStandaloneAnalyticSettings getAnalyticProperties() {
return new ElideStandaloneAnalyticSettings() {
@Override
public String getDefaultDialect() {
return "Hive";
}
};
}
}
Model Configuration
Concepts
Elide exposes a virtual semantic model of tables and columns that represents a data warehouse. The virtual semantic model can be mapped to one or more physical databases, tables, and columns through configuration by a data analyst. The analyst maps virtual tables and columns to fragments of native SQL queries that are later assembled into complete SQL statements at query time.
Analytic models are called Tables in Elide. They are made up of:
- Metrics - Numeric columns that can be aggregated, filtered on, and sorted on.
- Dimensions - Columns that can be grouped on, filtered on, and sorted on.
- TimeDimension - A type of Dimension that represents time. Time dimensions are tied to grain (a period) and a timezone.
- Columns - The supertype of Metrics, Dimensions, and TimeDimensions. All columns share a set of common metadata.
- Joins - Even though Elide analytic models are flat (there are no relationships to other models), individual model columns can be sourced from multiple physical tables. Joins provide Elide the information it needs to join other database tables at query time to compute a given column.
- Namespace - Every Table maps to one Namespace or the default Namespace if undefined. Namespaces group related tables together that share a common API prefix.
Other concepts include:
- Arguments - Tables and Columns can optionally have Arguments. They represent parameters that are supplied by the client to change how the column or table SQL is generated.
- Table Source - Columns and Arguments can optionally include metadata about their distinct legal values. Table Source references another Column in a different Table where the values are stored.
Example Configuration
- Hjson
- Java
{
tables: [{
name: PlayerStats
table: playerStats
dbConnectionName: Presto Data Source
friendlyName: Player Stats
description:
'''
A long description
'''
category: Sports
cardinality : large
readAccess : '(user AND member) OR (admin.user AND NOT guest user)'
filterTemplate : createdOn>={{start}};createdOn<{{end}}
isFact : true
tags: ['Game', 'Player']
joins: [
{
name: playerCountry
to: PlayerCountry
kind: toOne
type: left
definition: '{{playerCountry.$id}} = {{$country_id}}'
}
]
measures : [
{
name : highScore
type : INTEGER
definition: 'MAX({{$highScore}})'
friendlyName: High Score
}
]
dimensions : [
{
name : name
type : TEXT
definition : '{{$name}}'
cardinality : large
},
{
name : countryCode
type : TEXT
definition : '{{playerCountry.isoCode}}'
friendlyName: Country Code
},
{
name : gameType
type : TEXT
definition : '{{$game_type}}'
friendlyName: Game Type
},
{
name : gameOn
type : TIME
definition : '{{$game_on}}'
grains:
[
{
type: MONTH
sql: PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM'), 'yyyy-MM')
},
{
type: DAY
sql: PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')
},
{
type: SECOND
sql: PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss')
}
]
},
{
name : createdOn
type : TIME
definition : '{{$created_on}}'
grains:
[{
type : DAY
sql : '''
PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')
'''
}]
},
{
name : updatedOn
type : TIME
definition : '{{$updated_on}}'
grains:
[{
type : MONTH
sql : '''
PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM'), 'yyyy-MM')
'''
}]
}
]
}]
}
@Include
@VersionQuery(sql = "SELECT COUNT(*) from playerStats")
@FromTable(name = "playerStats", dbConnectionName = "Presto Data Source")
@TableMeta(description = "A long description", category = "Sports", tags = {"Game", "Player"}, filterTemplate = "createdOn>={{start}};createdOn<{{end}}", size = CardinalitySize.LARGE, friendlyName = "Player Stats")
@ReadPermission(expression = "(user AND member) OR (admin.user AND NOT guest user)")
public class PlayerStats {
public static final String DATE_FORMAT = "PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')";
public static final String YEAR_MONTH_FORMAT = "PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM'), 'yyyy-MM')";
@Id
private String id;
@MetricFormula("MAX({{$highScore}})")
@ColumnMeta(friendlyName = "High Score")
private long highScore;
@ColumnMeta(size = CardinalitySize.LARGE)
private String name;
@Join("{{$country_id}} = {{playerCountry.$id}}", type = JoinType.LEFT)
private Country playerCountry;
@DimensionFormula("{{playerCountry.isoCode}}")
@ColumnMeta(friendlyName = "Country Code")
private String countryCode;
@DimensionFormula("{{$game_type}}")
@ColumnMeta(friendlyName = "Game Type")
private String gameType;
@Temporal(grains = {
@TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT),
@TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT)
}, timeZone = "UTC")
@DimensionFormula("{{$game_on}}")
private Time gameOn;
@Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT) }, timeZone = "UTC")
@DimensionFormula("{{$created_on}}")
private Time createdOn;
@Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT) }, timeZone = "UTC")
@DimensionFormula("{{$updated_on}}")
private Time updatedOn;
}
Handlebars Templates
There are a number of locations in the model configuration that require a SQL fragment. These include:
- Column definitions
- Table query definitions
- Table join expressions
SQL fragments cannot refer to physical database tables or columns directly by name. Elide generates SQL queries at runtime, and these queries reference tables and columns by aliases that are also generated. Without the correct alias, the generated SQL query will be invalid. Instead, physical table and column names should be substituted with handlebars template expressions.
All SQL fragments support handlebars template expressions. The handlebars context includes the following fields we can reference in our templated SQL:
{{$columnName}}
- Expands to the correctly aliased, physical database column name for the current Elide model.{{columnName}}
- Expands another column in the current Elide model.{{joinName.column}}
- Expands to a column in another Elide model joined to the current model through the referenced join.{{joinName.$column}}
- Expands to the correctly aliased, physical database column name for another Elide model joined to the current model through the referenced join.{{$$table.args.argumentName}}
- Expands to a table argument passed by the client or extracted from the client query filter through a table filterTemplate.{{$$column.args.argumentName}}
- Expands to a column argument passed by the client or extracted from the client query filter through a column filterTemplate. $$column always refers to the current column that is being expanded.{{$$column.expr}}
- Expands to a column's SQL fragment. $$column always refers to the current column that is being expanded.
Join names can be linked together to create a path from one model to another model's column through a set of joins. For
example the handlebar expression: {{join1.join2.join3.column}}
references a column that requires three separate joins.
The templating engine also supports a custom handlebars helper that can reference another column and provide overridden column arguments:
{{sql column='columnName[arg1:value1][arg2:value2]'}}
- Expands to a column in the current Elide model with argument values explicitly set.{{sql from='joinName' column='columnName'}}
- Identical to{{joinName.columnName}}
.{{sql from='joinName' column='$columnName'}}
- Identical to{{joinName.$columnName}}
.{{sql from='joinName' column='columnName[arg1:value1]'}}
- Identical to{{joinName.columnName}}
but passing 'value1' for the column argument, 'arg1'.
The helper takes two arguments:
- column - The column to expand. Optional column arguments (
[argumentName:argumentValue]
) can be appended after the column name. - from - An optional argument containing the join name where to source the column from. If not present, the column is sourced from the current model.
Tables
Tables must source their columns from somewhere. There are three, mutually exclusive options:
- Tables can source their columns from a physical table by its name.
- Tables can source their columns from a SQL subquery.
- Tables can extend (override or add columns to) an existing Table. More details can be found here.
These options are configured via the 'table', 'sql', and 'extend' properties.
Table Properties
Tables include the following simple properties:
Hjson Property | Explanation | Hjson Value | Annotation/Java Equivalent |
---|---|---|---|
name | The name of the elide model. It will be exposed through the API with this name. | tableName | @Include(name="tableName") |
version | If leveraging Elide API versions, the API version associated with this model. | 1.0 | @ApiVersion(version="1.0") |
friendlyName | The friendly name for this table. Unicode characters are supported. | 'Player Stats' | @TableMeta(friendlyName="Player Stats") |
description | A description of the table. | 'A description for tableName' | @TableMeta(description="A description for tableName") |
category | A free-form text category for the table. | 'Some Category' | @TableMeta(category="Some Category") |
tags | A list of free-form text labels for the table. | ['label1', 'label2'] | @TableMeta(tags={"label1","label2"}) |
cardinality | tiny, small, medium, large, huge - A hint about the number of records in the table. | small | @TableMeta(size=CardinalitySize.SMALL) |
dbConnectionName | The name of the physical data source where this table can be queried. This name must match a data source configuration name. | MysqlDB | @FromTable(dbConnectionName="MysqlDB") |
schema | The database schema where the physical data resides | schemaName | @FromTable(name=schemaName.tableName) |
table | Exactly one of table, sql, and extend must be provided. Provides the name of the physical base table where data will be sourced from. | tableName | @FromTable(name=tableName) |
sql | Exactly one of table, sql, and extend must be provided. Provides a SQL subquery where the data will be sourced from. | 'SELECT foo, bar FROM blah;' | @FromSubquery(sql="SELECT foo, bar FROM blah;") |
extend | Exactly one of table, sql, and extend must be provided. This model extends or inherits from another analytic model. | tableName | class Foo extends Bar |
readAccess | An elide permission rule that governs read access to the table. | 'member and admin.user' | @ReadPermission(expression="member and admin.user") |
filterTemplate | An RSQL filter expression template that either must directly match the client provided filter or be conjoined with logical 'and' to the client provided filter. | countryIsoCode=={{code}} | @TableMeta(filterTemplate="countryIsoCode=={{code}}") |
hidden | The table is not exposed through the API. | true | @Exclude |
isFact | Is the table a fact table. Models annotated using FromTable or FromSubquery or TableMeta or configured through Hjson default to true unless marked otherwise. Yavin will use this flag to determine which tables can be used to build reports. | true | @TableMeta(isFact=false) |
namespace | The namepsace this table belongs to. If none is provided, the default namespace is presumed. | SalesNamespace | @Include(name="namespace") on the Java package. |
hints | A list of optimizer hints to enable for this particular table. This is an experimental feature. | ['AggregateBeforeJoin'] | @TableMeta(hints="AggregateBeforeJoin") |
Tables also include:
- A list of columns including measures, dimensions, and time dimensions.
- A list of joins.
- A list of arguments.
- Hjson
- Java Package
- Java Model
{
tables:
[
{
namespace: SalesNamespace
name: orderDetails
friendlyName: Order Details
description: Sales orders broken out by line item.
category: revenue
tags: [Sales, Revenue]
cardinality: large
isFact: true
filterTemplate: 'recordedDate>={{start}};recordedDate<{{end}}'
#Instead of table, could also specify either 'sql' or 'extend'.
table: order_details
schema: revenue
dbConnectionName: SalesDBConnection
hints: [AggregateBeforeJoin]
readAccess: guest user
arguments: []
joins: []
measures: []
dimensions: []
}
]
}
@Include(name = "SalesNamespace")
package example;
import com.paiondata.elide.annotation.Include;
@Include(name = "orderDetails") //Tells Elide to expose this model in the API.
@VersionQuery(sql = "SELECT COUNT(*) from playerStats") //Used to detect when the cache is stale.
@FromTable( //Could also be @FromSubquery
name = "revenue.order_details",
dbConnectionName = "SalesDBConnection"
)
@TableMeta(
friendlyName = "Order Details",
description = "Sales orders broken out by line item.",
category = "revenue",
tags = {"Sales", "Revenue"},
size = CardinalitySize.LARGE,
isFact = true,
filterTemplate = "recordedDate>={{start}};recordedDate<{{end}}",
hints = {"AggregateBeforeJoin"},
)
@ReadPermission(expression = "guest user")
public class OrderDetails extends ParameterizedModel { //ParameterizedModel is a required base class if any columns take arguments.
//...
}
Columns
Columns are either measures, dimensions, or time dimensions. They all share a number of common properties. The most important properties are:
- The name of the column.
- The data type of the column.
- The definition of the column.
Column definitions are templated, native SQL fragments. Columns definitions can include references to other column definitions or physical column names that are expanded at query time. Column expressions can be defined in Hjson or Java:
- Hjson
- Java Package
{
measures : [
{
name : highScore
type : INTEGER
definition: 'MAX({{$highScore}})'
}
]
dimensions : [
{
name : name
type : TEXT
definition : '{{$name}}'
},
{
name : countryCode
type : TEXT
definition : '{{playerCountry.isoCode}}'
}
]
}
// A Dimension
@DimensionFormula("CASE WHEN {{$name}} = 'United States' THEN true ELSE false END")
private boolean inUsa;
// A metric
@MetricFormula("{{wins}} / {{totalGames}} * 100")
private float winRatio;
Column Properties
Columns include the following properties:
Hjson Property | Explanation | Example Hjson Value | Annotation/Java Equivalent |
---|---|---|---|
name | The name of the column. It will be exposed through the API with this name. | columnName | String columnName; |
friendlyName | The friendly name for this column to be displayed in the UI. | 'Country Code' | @ColumnMeta(friendlyName = "Country Code") |
description | A description of the column. | 'A description for columnA' | @ColumnMeta(description="A description for columnA") |
category | A free-form text category for the column. | 'Some Category' | @ColumnMeta(category="Some Category") |
tags | A list of free-form text labels for the column. | ['label1', 'label2'] | @ColumnMeta(tags={"label1","label2"}) |
cardinality | tiny, small, medium, large, huge - A hint about the dimension's cardinality. | small | @ColumnMeta(size=CardinalitySize.SMALL) |
readAccess | An elide permission rule that governs read access to the column. | 'admin.user' | @ReadPermission(expression="admin.user") |
definition | A SQL fragment that describes how to generate the column. | MAX({{sessions}}) | @DimensionFormula("CASE WHEN {{name}} = 'United States' THEN true ELSE false END") |
type | The data type of the column. One of 'INTEGER', 'DECIMAL', 'MONEY', 'TEXT', 'COORDINATE', 'BOOLEAN' or a fully qualified java class name (dimensions only). | 'BOOLEAN' | String columnName; |
hidden | The column is not exposed through the API. | true | @Exclude |
Non-time dimensions include the following, mutually exclusive properties that describe the set of discrete, legal values (for type-ahead search or other usecases) :
Hjson Property | Explanation | Example Hjson Value | Annotation/Java Equivalent |
---|---|---|---|
values | An optional enumerated list of dimension values for small cardinality dimensions | ['Africa', 'Asia', 'North America'] | @ColumnMeta(values = {"Africa", "Asia", "North America") |
tableSource | The semantic table and column names where to find the values | See the section on Table Source. | See the section on Table Source. |
Time Dimensions And Time Grains
Time dimensions represent time and include one or more time grains. The time grain determines how time is represented as text in query filters and query results. Supported time grains include:
Grain | Text Format |
---|---|
SECOND | "yyyy-MM-dd HH:mm:ss" |
MINUTE | "yyyy-MM-dd HH:mm" |
HOUR | "yyyy-MM-dd HH" |
DAY | "yyyy-MM-dd" |
WEEK | "yyyy-MM-dd" |
ISOWEEK | "yyyy-MM-dd" |
MONTH | "yyyy-MM" |
QUARTER | "yyyy-MM" |
YEAR | "yyyy" |
When defining a time dimension, a native SQL expression may be provided with the grain to convert the underlying column
(represented as {{$$column.expr}}
) to its expanded SQL definition:
- Hjson
- Java Package
{
name : createdOn
type : TIME
definition : "FORMATDATETIME({{$createdOn}}, 'yyyy-MM')"
grains:
[{
type : MONTH
sql : '''
PARSEDATETIME({{$$column.expr}}, 'yyyy-MM')
'''
}]
}
public static final String DATE_FORMAT = "PARSEDATETIME({{$$column.expr}}, 'yyyy-MM')";
@Temporal(grains = {@TimeGrainDefinition(grain = TimeGrain.MONTH, expression = DATE_FORMAT)}, timeZone = "UTC")
@DimensionFormula("FORMATDATETIME({{$createdOn}}, 'yyyy-MM')")
private Date createdOn;
Elide would expand the above example to this SQL fragment:
PARSEDATETIME(FORMATDATETIME(createdOn, 'yyyy-MM'), 'yyyy-MM')
.
Time grain definitions are optional and default to type 'DAY' with a native SQL expression of {{$\$column.expr}}
.
Joins
Table joins allow column expressions to reference fields from other tables. At query time, if a column requires a join, the join will be added to the generated SQL query. Each table configuration can include zero or more join definitions:
- Hjson
- Java Package
joins: [
{
name: playerCountry
to: country
kind: toOne
type: left
definition: '{{$country_id}} = {{playerCountry.$id}}' # 'playerCounty' here is the join name.
},
{
name: playerTeam
to: team
kind: toMany
type: full
definition: '{{$team_id}} = {{playerTeam.$id}}' # 'playerTeam' here is the joinName.
}
]
private Country country;
private Team team;
//'country' here is the the join/field name.
@Join("{{$country_id}} = {{country.$id}}", type = JoinType.LEFT)
public Country getCountry() {
return country;
}
//'team' here is the the join/field name.
@Join("{{$team_id}} = {{team.$id}}", type = JoinType.FULL)
public Team getTeam() {
return team;
}
Join Properties
Each join definition includes the following properties:
Hjson Property | Explanation |
---|---|
name | A unique name for the join. The name can be referenced in column definitions. |
namespace | The namepsace the join table belongs to. If none is provided, the default namespace is presumed. |
to | The name of the Elide model being joined against. This can be a semantic model or a CRUD model. |
kind | 'toMany' or 'toOne' (Default: toOne) |
type | 'left', 'inner', 'full' or 'cross' (Default: left) |
definition | A templated SQL join expression. See below. |
Join Definition
Join definitions are templated SQL expressions that represent the ON clause of a SQL statement:
definition: "{{$orderId}} = {{delivery.$orderId}} AND {{delivery.$delivered_on }} > '1970-01-01'"
Arguments
Columns and tables can both be parameterized with arguments. Arguments include the following properties:
Hjson Property | Explanation |
---|---|
name | The name of the argument |
description | The argument description |
type | The primitive type of the argument |
values | An optional list of allowed values |
default | An optional default value if none is supplied by the client |
In addition, arguments can also optionally reference a Table Source. The properties values
and
tableSource
are mutually exclusive.
Column And Argument Types
Column and argument values are mapped to primitive types which are used for validation, serialization, deserialization, and formatting.
The following primitive types are supported:
- Time - Maps to Elide supported time grains.
- Integer - Integer number.
- Decimal - Decimal number.
- Money - A decimal number that represents money.
- Text - A text string.
- Coordinate - A text representation of latitude, longitude or both.
- Boolean - true or false.
- Id - Represents the ID of the model. For analytic models, this is the row number and not an actual primary key.
Input values (filter values, column arguments, or table arguments) are validated by:
- Type coercion to the underlying Java data type.
- Regular expression matching using the following rules.
Table Source
Table sources contain additional metadata about where distinct legal values of a column or argument can be found. This metadata is intended to aid presentation layers with search suggestions.
Hjson Property | Explanation |
---|---|
table | The table where the distinct values can be located. |
namespace | The namespace that qualifies the table. If not provided, the default namespace is presumed. |
column | The column in the table where the distinct values can be located |
suggestionColumns | Zero or more additional columns that should be searched in conjunction with the primary column to locate a particular value. |
- Hjson
- Java Package
dimensions : [
{
name : countryNickname
type : TEXT
definition : '{{country.nickName}}'
tableSource : {
table: country
column: nickName
suggestionColumns: [name, description]
}
}
]
@DimensionFormula("{{country.nickName}}")
@ColumnMeta(
tableSource = @TableSource(table = "country", column = "nickName", suggestionColumn = {"name", "description"})
)
private String countryNickName;
Namespaces
Namespaces organize a set of related tables together so that they can share:
- Package level metadata like name, friendly name, and description.
- Default read permission rules for every table in the namespace.
- A common API prefix that is prepended to each model name in the namespace (
NamespaceName_ModelName
).
While, namespaces are optional, all tables belong to one and only one namespace. If no namespace is defined, the table will belong to the 'default' namespace. The default namespace does not have an API prefix.
- Hjson
- Java Package
{
namespaces:
[
{
name: SalesNamespace
description: Namespace for Sales Schema Tables
friendlyName: Sales
readAccess: Admin or SalesTeam
}
]
}
@Include(
name = "SalesNamespace",
description = "Namespaces for Sales Schema Tables",
friendlyName = "Sales"
)
@ReadPermission(expression = "Admin or SalesTeam")
package example;
import com.paiondata.elide.annotation.Include;
import com.paiondata.elide.annotation.ReadPermission;
Inheritance
Tables can extend another existing Table. The following actions can be performed:
- New columns can be added.
- Existing columns can be modified.
- Table properties can be modified.
The Table properties listed below can be inherited without re-declaration. Any Table property not listed below, has to be re-declared.
dbConnectionName
schema
table
sql
Unlike Table properties, Column properties are not inherited. When overriding a Column in an extended Table, the column properties have to be redefined.
Hjson inheritance v.s. Java inheritance
Hjson inheritance and Java inheritance differ in one key way. Hjson inheritance allows the type of a measure or dimension to be changed in the subclassed model. Changing the type of an inherited measure or dimension in Java might generate a compilation error.
Example Extend Configuration
The sample below uses the Example Configuration as its parent model. Let's assume we are a club that exposes the Player Stats from the intra-squad practice games and the tournament games to coaches using the PlayerStats model. We want to expose the data from the same persistent store to the general public with the following differences:
- Exclude the intra-squad games from
highScore
calculation. - Modify the Grain of
game_on
column fromDAY
toYEAR
. - Accessible by Admins and Guest users.
To avoid the compilation error highlighted above, we will have to write the
new JVM class with all the columns and properties instead of inheriting unchanged ones from the Parent model. With the
Hjson extend
, it will be a few lines of simple changes to inherit from the Parent model without duplication as
highlighted in the example below.
- Hjson
- Java Package
{
tables: [{
name: TournamentPlayerStats
extend: PlayerStats
readAccess : 'admin.user OR guest user'
measures : [
{
name : highScore
type : INTEGER
definition: MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)
}
],
dimensions : [
{
name : gameOn
type : TIME
definition : '{{$game_on}}'
# Change Type from MONTH, DAY & SECOND to YEAR & MONTH
grains:
[
{
type: YEAR
sql: PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy'), 'yyyy')
},
{
type: MONTH
sql: PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM'), 'yyyy-MM')
}
]
}
]
}]
}
@Include
@VersionQuery(sql = "SELECT COUNT(*) from playerStats")
@ReadPermission(expression = "admin.user OR guest user")
@FromTable(name = "playerStats", dbConnectionName = "Presto Data Source")
public class TournamentPlayerStats {
public static final String DATE_FORMAT = "PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM-dd'), 'yyyy-MM-dd')";
public static final String YEAR_MONTH_FORMAT = "PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy-MM'), 'yyyy-MM')";
public static final String YEAR_FORMAT = "PARSEDATETIME(FORMATDATETIME({{$$column.expr}}, 'yyyy'), 'yyyy')";
@Id
private String id;
// Change formula to filter on Tournament Games
@MetricFormula("MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)")
@ColumnMeta(friendlyName = "High Score")
private long highScore;
@ColumnMeta(size = CardinalitySize.LARGE)
private String name;
@Join("{{$country_id}} = {{playerCountry.$id}}", type = JoinType.LEFT)
private Country playerCountry;
@DimensionFormula("{{playerCountry.isoCode}}")
@ColumnMeta(friendlyName = "Country Code")
private String countryCode;
@DimensionFormula("{{$game_type}}")
@ColumnMeta(friendlyName = "Game Type")
private String gameType;
@Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT) }, timeZone = "UTC")
@DimensionFormula("{{$updated_on}}")
private Time updatedOn;
@Temporal(grains = { @TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT) }, timeZone = "UTC")
@DimensionFormula("{{$created_on}}")
private Time createdOn;
// Change types of gameOn from Day & Month to Day, Month & Year
@Temporal(grains = {
@TimeGrainDefinition(grain = TimeGrain.DAY, expression = DATE_FORMAT),
@TimeGrainDefinition(grain = TimeGrain.MONTH, expression = YEAR_MONTH_FORMAT)
@TimeGrainDefinition(grain = TimeGrain.YEAR, expression = YEAR_FORMAT)
}, timeZone = "UTC")
@DimensionFormula("{{$game_on}}")
private Time gameOn;
}
We can use Java's inheritance, if the goal does not involve changing the type of columns. Hjson extend
will still require a few lines of simple changes.
- Hjson
- Java Package
{
tables: [{
name: TournamentPlayerStats
extend: PlayerStats
readAccess : 'admin.user OR guest user'
measures : [
{
name : highScore
type : INTEGER
definition: MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)
}
]
}]
}
@Include
@ReadPermission(expression = "admin.user OR guest user")
public class TournamentPlayerStats extends PlayerStats {
// Change formula to filter on Tournament Games
@MetricFormula("MAX(CASE WHEN {{gameType}} = 'tournament' THEN {{highScore}}) ELSE NULL END)")
private long highScore;
}
Security Configuration
The semantics of security are described here.
HJSON has limited support for security definitions. Currently, only role based access controls (user checks) can be defined in HJSON. For more elaborate rules, the Elide security checks must be written in code.
A list of available user roles can be defined in HJSON in the security.hjson
file:
{
roles : [
admin.user
guest user
member
user
]
}
Each role defined generates an Elide user check that extends RoleMemberCheck defined in Role.
These roles can then be referenced in security rules applied to entire tables or individual columns in their respective Hjson configuration:
readAccess = 'member OR guest user'
The readAccess
table and column attribute can also reference Elide checks that are compiled with our application to
implement row level security or other more complex security rules.
Variable Substitution
To avoid repeated configuration blocks, all Hjson files (table, security, and data source) support variable substitution. Variables are defined in the variables.hjson file:
{
foo: [1, 2, 3]
bar: blah
hour: hour_replace
measure_type: MAX
name: PlayerStats
table: player_stats
}
The file format is a simple mapping from the variable name to a JSON structure. At server startup, Elide will replace
any variable name surrounded by <%
and %>
tags with the corresponding JSON structure.
Caching
The Aggregation data store supports a configurable caching strategy to cache query results. More details can be found in the performance section.
Bypassing Cache
Elide JAX-RS endpoints (elide-standalone) and Spring controllers (Spring) support a Bypass Cache header ('bypasscache')
that can be set to true
for caching to be disabled on a per query basis. If no bypasscache header is specified by the
client or a value other than true
is used, caching is enabled by default.
Security
Elide analytic models differ from CRUD models in some important ways. In a client query on a CRUD model backed by JPA, all model fields are hydrated (in some cases with lazy proxies) regardless of what fields the client requests. In an analytic query, only the model fields requested are hydrated. Checks which can execute in memory on the Elide server (Operation & Filter Expression checks) may examine fields that are not hydrated and result in errors for analytic queries. To avoid this scenario, the Aggregation Store implements its own permission executor with different restrictions and semantics.
The aggregation store enforces the following model permission restrictions:
- Operation checks are forbidden.
- Filter Expression checks may only decorate the model but not its fields.
- User checks are allowed anywhere.
Unlike CRUD models, model 'read' permissions are not interpreted as field permission defaults. Model and field permissions are interpreted independently.
Elide performs the following authorization steps when reading records:
- Determine if the database query can be avoided (by only evaluating checks on the user principal).
- Filter records in the database (by evaluating only filter expression checks).
- Filter records in memory (by evaluating all checks on each record returned from the database).
- Verify the client has permission to filter on the fields in the client's filter expression (by evaluating field level permissions).
- Prune fields from the response that the client cannot see (by evaluating field level permissions).
The aggregation store will prune rows returned in the response (steps 1-3) by evaluating the following expression:
(entityRule AND (field1Rule OR field2Rule ... OR fieldNRule)
Step 4 and 5 simply evaluates the user checks on each individual field.
Experimental Features
Configuration Validation
All Hjson configuration files are validated by a JSON schema. The schemas for each file type can be found here:
Hjson configuration files can be validated against schemas using a command-line utility following these steps:
-
Build your Elide project to generate a Fat JAR. Make sure to include a Fat JAR build configuration in your POM file.
mvn clean install
-
Using the generated JAR for validation:
java -cp elide-*-example.jar com.paiondata.elide.modelconfig.validator.DynamicConfigValidator --help
java -cp elide-*-example.jar com.paiondata.elide.modelconfig.validator.DynamicConfigValidator --configDir <Path for Config Directory>
-
The config directory needs to adhere to this file layout.
Query Optimization
Some queries run faster if aggregation is performed prior to joins (for dense joins). Others my run faster if aggregation is performed after joins (for sparse joins). By default, Elide generates queries that first aggregatoin and then join. Elide includes an experimental optimizer that will rewrite the queries to aggregate first and then join. This can be enabled at the table level by providing the hint, 'AggregateBeforeJoin' in the table configuration.
Filter Templates
A filter template is a RSQL filter expression that must match (in whole or in part) the client's query (or the client query will be rejected). Filter templates can be added to either table or column definitions. At the table level, the filter template must match every query against the table. At the column level, the template is only required to match if the client query explicitly requests the particular column.
Variable extraction
A filter template can optionally contain a template variable on the right hand side of any predicate. These variables are assigned to the values provided in the client query filter and added to the table arguments (for table filter templates) or the column arguments (for column filter templates). For example, the following filterTemplate would add the variables 'start' and 'end' to the table arguments:
{
tables:
[
{
name: orderDetails
filterTemplate : deliveryTime>={{start}};deliveryTime<{{end}}
...
Matching
A filter templates matches a client query if one of the two conditions holds:
- The filter template exactly matches the client filter.
- The filter template exactly matches part of the client filter that is conjoined with logical 'and' to the remainder of the client filter.
For example, the client RSQL filter lowScore>100;(highScore>=100;highScore<999)
matches the template
highScore>={{low}};highScore<{{high}}
.