Introduction
The audit framework underwent major enhancements between Apache Ranger 0.4 and 0.5. The major changes added in 0.5 are:
Audit to Solr: This is now the preferred and recommended audit store. Ranger Admin can now show audits stored in Solr.
While audit to DB continues to be supported, its use has been deprecated. Support for it may be withdrawn in future releases.
Audit aggregation: Audit messages logged within a configurable time interval can be aggregated and logged as a single audit event along with the count. This can be particularly useful for plugins with a large number of audit events, e.g. Kafka, HBase, Solr, etc.
Scope of this document
As a result of these changes, the audit configuration in 0.5 differs from that in 0.4. This document provides those configuration details. For historical reasons these are also known as v3 style configurations. The name v3 is a nod to the prior configurations, which were named v2 style configurations.
Configuration properties naming convention
Audit configuration properties follow this naming convention: xasecure.audit.destination.<sink-type>.<cfg-name-element1>.<cfg-name-element2>....
Where:
sink-type denotes the type of audit sink, e.g. hdfs, solr, db, etc.
cfg-name-element1, cfg-name-element2, etc. denote the parts of the configuration item for that specific sink-type, e.g. for a sink-type of db a couple of audit configuration properties are xasecure.audit.destination.db.jdbc.driver and xasecure.audit.destination.db.jdbc.url.
For a concrete example, please refer to the details of one of the audit sinks below.
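As a quick illustration of the naming convention, the two db properties named above could be written as follows. This is only a sketch: the driver class and JDBC URL are placeholder values chosen for illustration (a MySQL driver and a made-up host and database name), not values prescribed by Ranger.

```properties
# sink-type = db, cfg-name-element1 = jdbc, cfg-name-element2 = driver / url
# The values below are illustrative assumptions; substitute your own.
xasecure.audit.destination.db.jdbc.driver=com.mysql.jdbc.Driver
xasecure.audit.destination.db.jdbc.url=jdbc:mysql://db.example.com/ranger_audit
```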
Audit to Solr
SolrCloud is the preferred audit store. Audit messages stored in Solr can be viewed via the Ranger Admin web app. Solr can be configured to purge audits older than, say, a month or so, with the HDFS sink used for long-term storage.
All properties for solr listed below start with the following prefix: “xasecure.audit.destination.solr.”. For example, the full name of the first property below would be: xasecure.audit.destination.solr.urls
To enable audit to solr, set the property xasecure.audit.destination.solr to true.
The following are the configuration details for Ranger audit to Solr.
Property name | Details |
|
|
|
|
| Example value:
|
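A minimal sketch of enabling the Solr sink, using only the two property names mentioned in this section. The Solr host, port, and collection in the URL are hypothetical and should be replaced with the values for your own Solr installation.

```properties
# Turn on the Solr audit destination
xasecure.audit.destination.solr=true
# Hypothetical Solr URL; host, port and collection name are assumptions
xasecure.audit.destination.solr.urls=http://solr.example.com:8983/solr/ranger_audits
```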
Audit to Db
Solr is the preferred and recommended audit store. Use of a database to store Ranger audits is deprecated. Users are strongly encouraged to move to Solr to store their audit messages. The new DB Audit Provider exists only to ease the migration of Apache Ranger 0.4 audit-to-DB users to the Ranger 0.5 audit framework. The DB Audit Provider might be removed in future releases.
All properties for db listed below start with the following prefix: “xasecure.audit.destination.db.”. For example, the full name of the first property below would be: xasecure.audit.destination.db.jdbc.driver.
To enable audit to db, set the property xasecure.audit.destination.db to true.
The following are the configuration details for Ranger audit to db.
Property name | Details |
|
|
|
|
| For example, database user for database where ranger audit data is to be stored: |
| Password to be used to connect to the target database. This property is ignored if a password can be found in the credentials file. |
|
|
Audit to HDFS
HDFS is the preferred and recommended long-term store for Ranger audit messages, along with Solr for keeping short-term audit messages that might need to be searched. Audits in Solr are used to view audit logs in the Ranger Admin UI, whereas audits kept in HDFS can be used for compliance or other off-line purposes like threat detection. Solr can be configured to purge audits older than, say, a month or so.
All properties for hdfs listed below start with the following prefix: “xasecure.audit.destination.hdfs.”. For example, the full name of the first property below would be: xasecure.audit.destination.hdfs.dir.
To enable audit to hdfs, set the property xasecure.audit.destination.hdfs to true.
The following are the configuration details for Ranger audit to hdfs.
Property name | Details |
|
|
|
|
|
|
| Age of the audit log file in seconds after which it would get rolled over to a new file. Default is set to |
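A minimal sketch of enabling the HDFS sink with the two property names given in this section. The NameNode address and directory path are hypothetical values used only for illustration.

```properties
# Turn on the HDFS audit destination
xasecure.audit.destination.hdfs=true
# Hypothetical target directory; NameNode host, port and path are assumptions
xasecure.audit.destination.hdfs.dir=hdfs://namenode.example.com:8020/ranger/audit
```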
Audit to Log4j
To enable Ranger to send audit logs to a log4j appender, set the property xasecure.audit.destination.log4j to true. Also make sure that the property logger is specified as described below.
Property name | Details |
logger | The name of the logger to which the audit logs should be sent, as specified in the component's log4j configuration file. Ranger writes audit logs at INFO level. Please ensure that the log4j configuration has INFO level enabled for the logger specified above. |
Example
Below are the configuration details to enable Ranger Hive plugin to write audit logs to log4j.
Configure a log4j appender for audit logs in the component's log4j configuration file (hive-log4j.properties for Hive):
log4j.appender.RANGER_AUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RANGER_AUDIT.File=${hive.log.dir}/ranger-hive-audit.log
log4j.appender.RANGER_AUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RANGER_AUDIT.layout.ConversionPattern=%m%n
log4j.logger.ranger.audit=INFO,RANGER_AUDIT
Configure the Ranger plugin to write audit logs to log4j (ranger-hive-audit.xml for Hive):
xasecure.audit.destination.log4j=true
xasecure.audit.destination.log4j.logger=ranger.audit
Ambari Examples
If you are using Ambari, update the properties in the corresponding service config sections and restart the services using Ambari.
If you modify the service log4j properties manually (outside Ambari), Ambari will overwrite your changes when it restarts. Always update the properties from the Ambari config sections.
HiveServer2 Configuration
Append this within the section "Advanced hive-log4j":
log4j.appender.RANGER_AUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RANGER_AUDIT.File=${hive.log.dir}/ranger-hive-audit.log
log4j.appender.RANGER_AUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RANGER_AUDIT.layout.ConversionPattern=%m%n
log4j.logger.ranger.audit=INFO,RANGER_AUDIT
Add the following properties in the "Custom ranger-hive-audit" section:
xasecure.audit.destination.log4j=true
xasecure.audit.destination.log4j.logger=ranger.audit
Audit Queues
A system of queues handles audit messages before they are written to their final destination. These queues provide various features. The following diagram gives an overview, and subsequent sections provide details of each one of them.
Asynchronous logging to in-memory buffer queue
Audit providers log audit messages to sinks asynchronously so that the host service’s operations are not blocked or slowed down by a slow or unavailable audit sink. Further, when an audit sink is slow or unavailable, the framework buffers the audit messages in memory to minimize local disk access. In case of an extended outage of an audit sink, it spools the unwritten audit messages to disk files so they can be sent to the audit sink when it becomes available again.
Various aspects of these queue providers can be configured via the following settings. All properties below start with the following prefix: “xasecure.audit.destination.async.”. For example, the full name of the first property below would be: xasecure.audit.destination.async.queue.size.
Configuration name | Notes |
|
|
Summarization
In high-volume systems like Kafka, a very large number of audit messages can be generated in a short amount of time. For compliance and for other practical reasons, like threat detection, it may not be desirable to throttle back the amount or granularity of auditing.
Ranger 0.5 adds the ability to summarize audit messages in such situations while preserving the distinguishing traits of each audit message. To ensure that no unique or distinguishing information is lost during summarization, audit messages are aggregated if and only if they differ only in their timestamp. If anything else about an audit message is different, it is preserved as a separate audit message. Further, in the interest of capturing as much information as possible, the time interval on the aggregate audit message denotes the min and max times of the actual audit events that were part of that summary event.
The following properties control the behavior of audit summarization.
Configuration name | Notes |
|
|
|
|
|
|
Summarization Batch size |
|
Batching and bulk write of audit messages
It can be faster to write several messages to Solr in a batch rather than write them one at a time. Similarly, when writing audit messages to a database it is much faster to batch the writes of several messages into a single transaction. The Ranger audit framework provides this via the use of buffer queues.
The following example assumes that:
You are configuring the queue provider for solr.
You are using the standard queue provider, i.e. batch.
Accordingly, each property configuration name below should be prefixed by: xasecure.audit.destination.solr.batch. Change the values of audit sink type and queue name to suit your configuration.
Configuration name | Notes |
| By default up to |
|
|
Configuration related to File spooling
If the audit framework detects that an audit destination is down, it buffers the audit messages in memory. Once the memory buffer fills up, it can be configured to spool the unsent messages to disk files to prevent or minimize the loss of audit messages. The following configuration settings control the behavior of disk spooling of audit messages.
The following example assumes that:
You are configuring the queue provider for solr.
You are using the standard queue provider, i.e. batch.
Accordingly, each property configuration name is prefixed by: xasecure.audit.destination.solr.batch.filespool. Change the values of audit sink type and queue name to suit your configuration.
Configuration name | Default value | Notes |
enabled | false | Controls if audit messages would be spooled to local disk files if in-memory buffer queue gets filled up. |
dir | N/A | Local disk directory where spool files would be kept. This value must be specified. |
filename.format | spool_%app-type%_%time:yyyyMMdd-HHmm.ss%.log |
|
archive.dir | archive subdirectory of the spool file dir. | For example, if spool file for solr sink is configured to be /var/log/hadoop/hdfs/audit/solr/spool then by default the spool files would get archived to /var/log/hadoop/hdfs/audit/solr/spool/archive directory. |
archive.max.files | 100 | Max number of files to archive. If the number of files in the archive directory exceeds this number, the oldest file(s) are deleted. |
file.rollover.sec | 86400 | Age of the spool file in seconds after which it is rolled over to a new file. The default is one day (24 * 60 * 60 = 86400 seconds). |
destination.retry.ms | 30000 | How often, in milliseconds, the spooler should try to reconnect to a destination that was down on the last attempt. The default is 30 seconds (30 * 1000 = 30000). |
drain.threshold.percent | 80 | Don’t start spooling to disk unless the in-memory queue is at least this percent full. As long as the audit destination is able to keep up and the in-memory queue is adequately sized, a high enough value ensures that messages are never flushed to local disk. |
drain.full.wait.ms | 300000 | Once a destination comes back up, the amount of time, in milliseconds, to let new audit messages get buffered in memory before spooling them. By default this is set to 5 minutes. If the spooler is given enough time to send on-disk messages to the final destination and the in-memory queue is properly sized, disk spooling of new messages can be avoided and the system can revert to in-memory buffering with no disk access. |
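Putting the table together, a file spooling setup for the solr sink with the batch queue provider might look like the sketch below. The property names combine the prefix and the names from the table above, and the values are the documented defaults, except for enabled (switched on here) and dir, which reuses the example path from the archive.dir row and is an assumption about your local layout.

```properties
# Spool to local disk when the in-memory queue fills up (default is false)
xasecure.audit.destination.solr.batch.filespool.enabled=true
# Must be specified; this path is only an example
xasecure.audit.destination.solr.batch.filespool.dir=/var/log/hadoop/hdfs/audit/solr/spool
# The remaining values are the defaults listed in the table
xasecure.audit.destination.solr.batch.filespool.archive.max.files=100
xasecure.audit.destination.solr.batch.filespool.file.rollover.sec=86400
xasecure.audit.destination.solr.batch.filespool.destination.retry.ms=30000
xasecure.audit.destination.solr.batch.filespool.drain.threshold.percent=80
xasecure.audit.destination.solr.batch.filespool.drain.full.wait.ms=300000
```

Since archive.dir is left unset in this sketch, spool files would be archived to the archive subdirectory of the spool directory, as described in the table.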
Suppressing the Spooling of Audit messages
If you wish to suppress the automatic spooling of audit messages, set the following property. Please note that doing so has consequences, since audit messages can be lost.
Configuration name | Notes |
xasecure.audit.destination.<sink-type>.queue |
|
Common configuration Properties
Below are a few properties that are common to the audit framework as a whole or that apply to all audit providers.
Configuration name | Default value | Notes |
xasecure.audit.log.failure.report.min.interval.ms | 60000 | In event of a failure to send audit events to an audit sink, say, due to a connectivity issue, this is the interval at which WARN messages would be logged to log4j. |
xasecure.audit.credential.provider.file | N/A |
|
Using Custom Audit Providers and Queue Providers
The audit framework allows users to plug in custom implementations not only of Audit Destination Providers (e.g. a custom Solr or HDFS provider) but also of the Queue Providers used by the framework for buffering audit messages on their way to the final audit sink.
The standard Audit Providers and Queue Providers are quite robust and feature rich. You can ignore this section if you don’t need custom implementations.
Configuration name | Notes |
| If you wanted to use a new audit sink, say, JMS to store audit messages then you could define a new property to signal that by setting xasecure.audit.destination.jms to true. |
| Since there isn’t a standard Audit Provider for JMS one needs to let the framework know about the class which implements it. Set the property xasecure.audit.destination.jms.classname to the fully qualified class name of the implementation, e.g. com.company.JmsAuditDestination. |
|
|
| If you also want to use a custom Queue Provider, use this property to identify that Queue Provider type. To use the default Queue Provider, either leave this property unspecified or set it to batch. |
| This property provides the full name of the class which implements the custom Queue Provider. For example, to use a Queue Provider that uses a ring buffer with your JMS Audit Provider:
|
|
|
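As a sketch of the JMS example described in the table, the two properties that register a custom audit destination could be set as follows. Both names and the class name come from the rows above; com.company.JmsAuditDestination is a placeholder for your own implementation class, and the Queue Provider properties are omitted here.

```properties
# Signal that a custom sink named "jms" is in use
xasecure.audit.destination.jms=true
# Fully qualified name of the class implementing the custom destination (placeholder)
xasecure.audit.destination.jms.classname=com.company.JmsAuditDestination
```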
Passing Custom config properties to standard Audit Providers
For any audit sink, the framework also loads any custom property named as follows: xasecure.audit.destination.<sink-type>.config.<custom1.elem1>.<custom.elem2>....
Where:
sink-type denotes the type of audit sink, e.g. hdfs, solr, db, etc.
config is the configuration name element which signals to the framework that the properties that follow should be made available to the audit provider as custom properties for its use.
cfg-name-element1, cfg-name-element2, etc. denote the parts of the configuration item for that specific sink-type, e.g. for a sink-type of db a couple of audit configuration properties are xasecure.audit.destination.db.jdbc.driver and xasecure.audit.destination.db.jdbc.url.
Use of standard HDFS Audit provider to Audit to Azure Blob Storage is an example of how this provision for custom properties is used by standard audit providers to extend their functionality merely via configuration.
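A hedged sketch of what such a configuration could look like for Azure Blob Storage. It assumes that the part of the name after config. is passed through as a Hadoop property, here the hadoop-azure account key setting fs.azure.account.key.<account>.blob.core.windows.net. The storage account name (myaccount), container name (ranger-audit), and path are hypothetical, and the key value is left as a placeholder.

```properties
# Point the standard HDFS audit provider at a wasb:// location (names are assumptions)
xasecure.audit.destination.hdfs=true
xasecure.audit.destination.hdfs.dir=wasb://ranger-audit@myaccount.blob.core.windows.net/ranger/audit
# Custom property handed to the provider via the "config" name element
xasecure.audit.destination.hdfs.config.fs.azure.account.key.myaccount.blob.core.windows.net=<storage-account-key>
```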
Backward compatibility
A brief note about backward compatibility. Old v2 style configurations are still supported and will work as is. Old configurations trigger the use of older implementations of Audit Providers. Please refer to this Blog posting for a refresher on those details.
In addition, the following should be noted about continued use of v2 style configurations.
Future enhancements to audit framework would be made only to the v3 (new Ranger 0.5 Audit) Providers. Hence, users are encouraged to move to new v3 style configurations.
Further, it is not possible to mix v2 and v3 style configurations. Presence of any v3 style configuration would suppress any v2 style Audit Providers.