Several components need to determine whether a condition holds for the JSON documents being enriched. A simple DSL exists for defining those conditions. The language is currently integrated into the following components:

  • Global Validation
  • Threat Triage

The Language

The query language supports the following:

  • Referencing fields in the enriched JSON
  • Simple boolean operations: and, not, or
  • Simple comparison operations: <, >, <=, >=
  • Determining whether a field exists (via exists)
  • Parentheses to make the order of operations explicit
  • A fixed set of functions that take strings and return a boolean. Currently:
    • IN_SUBNET(ip, cidr1, cidr2, ...)
    • IS_EMPTY(str)
    • STARTS_WITH(str, prefix)
    • ENDS_WITH(str, suffix)
    • REGEXP_MATCH(str, pattern)
    • IS_IP : Validates that the input field is an IP address. If no second argument is given, IPV4 is assumed; pass either IPV6 or IPV4 as the second argument to specify the type.
    • IS_DOMAIN
    • IS_EMAIL
    • IS_URL
    • IS_DATE
    • IS_INTEGER
  • A fixed set of transformation functions:
    • TO_LOWER(string) : Transforms the first argument to a lowercase string
    • TO_UPPER(string) : Transforms the first argument to an uppercase string
    • TO_STRING(string) : Transforms the first argument to a string
    • TO_INTEGER(x) : Transforms the first argument to an integer
    • TO_DOUBLE(x) : Transforms the first argument to a double
    • TRIM(string) : Trims whitespace from both sides of a string.
    • JOIN(list, delim) : Joins the components of the list with the specified delimiter
    • SPLIT(string, delim) : Splits the string by the delimiter. Returns a list.
    • GET_FIRST(list) : Returns the first element of the list
    • GET_LAST(list) : Returns the last element of the list
    • GET(list, i) : Returns the i'th element of the list (i is 0-based).
    • MAP_GET(key, map, default) : Returns the value associated with the key in the map. If the key does not exist, the default will be returned. If the default is unspecified, then null will be returned.
    • DOMAIN_TO_TLD(domain) : Returns the TLD of the domain.
    • DOMAIN_REMOVE_TLD(domain) : Removes the TLD from the domain.
    • DOMAIN_REMOVE_SUBDOMAINS(domain) : Removes the subdomains from the domain.
    • REMOVE_TLD(domain) : Removes the TLD from the domain.
    • URL_TO_HOST(url) : Returns the host from a URL
    • URL_TO_PROTOCOL(url) : Returns the protocol from a URL
    • URL_TO_PORT(url) : Returns the port from a URL
    • URL_TO_PATH(url) : Returns the path from a URL
    • TO_EPOCH_TIMESTAMP(dateTime, format, timezone) : Returns the epoch timestamp of the dateTime given the format. If the format does not include a timezone and you wish to assume one, you may optionally specify the timezone.
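
To build intuition for a few of the transformation functions above, here is a rough sketch of Python analogues. This is not how the DSL is implemented; the Python function names are hypothetical stand-ins chosen to mirror the DSL names, and edge-case behavior (missing ports, malformed URLs, out-of-range indices) may differ from the real functions.

```python
from urllib.parse import urlparse

# Hypothetical Python stand-ins for a few DSL functions; illustrative only.

def split(string, delim):
    """Analogue of SPLIT(string, delim): returns a list."""
    return string.split(delim)

def join(items, delim):
    """Analogue of JOIN(list, delim)."""
    return delim.join(items)

def get(items, i):
    """Analogue of GET(list, i): i is 0-based."""
    return items[i]

def get_last(items):
    """Analogue of GET_LAST(list)."""
    return items[-1]

def map_get(key, mapping, default=None):
    """Analogue of MAP_GET(key, map, default): the default is None if unspecified."""
    return mapping.get(key, default)

def url_to_host(url):
    """Analogue of URL_TO_HOST(url)."""
    return urlparse(url).hostname

def url_to_protocol(url):
    """Analogue of URL_TO_PROTOCOL(url)."""
    return urlparse(url).scheme

def url_to_port(url):
    """Analogue of URL_TO_PORT(url); assumes the URL carries an explicit port."""
    return urlparse(url).port

def url_to_path(url):
    """Analogue of URL_TO_PATH(url)."""
    return urlparse(url).path
```

For example, under these assumptions get_last(split('a.b.c', '.')) picks out the final dot-separated component, and map_get falls back to the supplied default when the key is absent.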

Example query:

IN_SUBNET( ip, '192.168.0.0/24') or ip in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local)

This evaluates to true precisely when at least one of the following is true:

  • The value of the ip field is in the 192.168.0.0/24 subnet
  • The value of the ip field is 10.0.0.1 or 10.0.0.2
  • The field is_local exists
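
The semantics of the example query can be mimicked in Python as a minimal sketch. It assumes the enriched message is a plain dict, that exists(field) corresponds to the key being present, and that a missing ip field simply fails the subnet test; the helper names in_subnet and example_query are hypothetical and not part of the DSL.

```python
import ipaddress

def in_subnet(ip, *cidrs):
    """Rough analogue of IN_SUBNET(ip, cidr1, cidr2, ...)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)

def example_query(message):
    """Mirror of the example query, evaluated against a dict-shaped message."""
    ip = message.get("ip")
    return (
        # Assumption: a missing ip field is treated as not being in the subnet.
        (ip is not None and in_subnet(ip, "192.168.0.0/24"))
        or ip in ["10.0.0.1", "10.0.0.2"]
        # Assumption: exists(is_local) only checks for presence, not the value.
        or "is_local" in message
    )
```

Under these assumptions, a message with ip set to 192.168.0.42 or 10.0.0.2 satisfies the query, as does any message that carries an is_local field regardless of its value.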