OBJECTIVE
An Admin of Apache Eagle should be able to get audit details of actions performed on Policies and Sites/Datasources.
The audit details should include the User, Action (Create/Update/Delete), Timestamp, a link to the Policy/Site/Datasource, etc.
PROPOSED APPROACH
Approach #1
- Request for Create/Update/Delete comes in for Policies, Site or Datasource.
- After the requested data is persisted in corresponding HBase tables, the audit information will be stored on separate audit tables (one for each of the parent tables where actual data is stored for the given service request).
- Response for the request is sent back to the user.
Approach #2
- Request for Create/Update/Delete comes in for Policies, Site or Datasource.
- After the requested data is persisted in corresponding HBase tables, the audit information will be stored on a single audit table (for any action by any service).
- Response for the request is sent back to the user.
Approach #2 is more generic, as new table definitions and implementations need not be added to audit a new service.
PROPOSED DESIGN
The following is the initial design, upon which changes will be made to accommodate the auditing feature.
To explain the design, we will use the Policy Definition service as an example.
#1 – Client sends a request to create a Policy to the service component with a certain payload.
http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDefinitionService
#2 – After authentication and preprocessing, the call lands in the create(Entities, EntityDefinition) method in HBaseStorage.java
#3 – The required data for the operation is persisted onto HBase.
#4 – At this point, the method would build and send the response back to the client. The auditing of the service request will be done here, before the response is sent (implementation of one of the approaches mentioned above).
Since HBaseStorage.java has separate methods for Create, Update and Delete, we can determine which action was performed for a particular request and record it in the audit entry.
#5 – The audit data will be persisted in a single HBase Audit table or multiple HBase Audit tables (as given in the approach)
#6 – After the audit information is persisted in HBase, the response for the service will be sent back to the client.
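The ordering of steps #3 to #6 can be sketched as follows. This is only an illustration under stated assumptions: the class AuditedCreateFlowSketch, its methods, and the placeholder row key are hypothetical and use an in-memory list instead of HBase; they are not part of Eagle's HBaseStorage.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the create path; not Eagle's real HBaseStorage. */
class AuditedCreateFlowSketch {
    /** Records the order in which the steps ran, for illustration only. */
    final List<String> steps = new ArrayList<>();

    /** Steps #3 to #6: persist the data, persist the audit entry, then respond. */
    String create(String serviceName, String userId, String payload) {
        String encodedRowKey = persist(payload);              // step #3
        audit(serviceName, encodedRowKey, userId, "CREATE");  // steps #4 and #5
        steps.add("respond");                                 // step #6
        return encodedRowKey;
    }

    private String persist(String payload) {
        steps.add("persist");
        // Placeholder row key; the real key would come from the HBase write.
        return "row-" + Integer.toHexString(payload.hashCode());
    }

    private void audit(String serviceName, String rowKey, String userId, String action) {
        // A real implementation would write to one of the audit tables here.
        steps.add("audit:" + serviceName + ":" + action + ":" + userId);
    }
}
```

The point of the sketch is only the ordering: the audit entry is written after the data is persisted and before the response is returned.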
SAMPLE TABLE DESIGN
Approach #1
// Individual audit tables for individual data tables
Service #1: Policy Definition
Data Table: alertdef
Audit Table: alertdefAudit
Audit Columns:
- encodedRowKey (encoded format of the row key as obtained from persisting the policy data)
- userID
- actionTaken (CREATE/UPDATE/DELETE)
- auditTimestamp
Service #2: Alert Data Source
Data Table: alertDataSource
Audit Table: alertDataSourceAudit
Audit Columns:
- encodedRowKey (encoded format of the row key as obtained from persisting the datasource information)
- userID
- actionTaken (CREATE/UPDATE/DELETE)
- auditTimestamp
Approach #2
// Single audit table for all data tables
Service #1: Policy Definition
Service #2: Alert Data Source
Audit Table: serviceAudit
Audit Columns:
- serviceName (to differentiate which service the audit entry belongs to)
- encodedRowKey (encoded format of the row key as obtained from persisting the policy or datasource information)
- userID
- actionTaken (CREATE/UPDATE/DELETE)
- auditTimestamp
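As a rough illustration of the Approach #2 columns, one audit row could be modeled as a small immutable value class. The names ServiceAuditEntry and AuditAction are hypothetical, and storing auditTimestamp as epoch milliseconds is an assumption; the proposal does not specify column types.

```java
import java.util.Objects;

/** Actions listed for the actionTaken column. */
enum AuditAction { CREATE, UPDATE, DELETE }

/** One row of the single serviceAudit table from Approach #2 (illustrative only). */
final class ServiceAuditEntry {
    final String serviceName;      // which service the entry belongs to
    final String encodedRowKey;    // row key returned when the data was persisted
    final String userID;
    final AuditAction actionTaken;
    final long auditTimestamp;     // epoch milliseconds (assumption)

    ServiceAuditEntry(String serviceName, String encodedRowKey, String userID,
                      AuditAction actionTaken, long auditTimestamp) {
        this.serviceName = Objects.requireNonNull(serviceName, "serviceName");
        this.encodedRowKey = Objects.requireNonNull(encodedRowKey, "encodedRowKey");
        this.userID = Objects.requireNonNull(userID, "userID");
        this.actionTaken = Objects.requireNonNull(actionTaken, "actionTaken");
        this.auditTimestamp = auditTimestamp;
    }

    @Override
    public String toString() {
        return serviceName + "/" + encodedRowKey + " " + actionTaken
                + " by " + userID + " at " + auditTimestamp;
    }
}
```

For Approach #1 the same shape would apply per audit table, minus the serviceName field, since the table itself identifies the service.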
AUDIT RETRIEVAL APPROACH
Below are the designs for retrieving audit data for each of the proposed approaches.
APPROACH #1
As this approach uses multiple tables, we would need to create an Entity Definition and a DAO implementation for each of the audit tables created.
SAMPLE SERVICE CALL
http://localhost:8080/eagle-service/rest/list?query=AlertDefinitionServiceAudit[@encodedRowKey="ABC_DEF" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100
APPROACH #2
As this approach uses a single audit table regardless of the number of data tables, we need only one Entity Definition and DAO implementation.
In the service call, though, we would pass one additional parameter compared to Approach #1 (serviceName), which identifies the service whose audit entries need to be retrieved.
SAMPLE SERVICE CALL
http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @encodedRowKey="ABC__DEF"]{*}&pageSize=100
http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @encodedRowKey="ABC__DEF" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100
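A client could assemble the sample calls above with a small helper such as the one below. This is a sketch: AuditQuerySketch and its methods are hypothetical names, the query shape is copied from the samples, and the URL encoding step is an assumption about what the client needs (the samples are shown unencoded).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Hypothetical helper that composes the list queries shown in the samples. */
final class AuditQuerySketch {

    /** Builds e.g. AuditService[@serviceName="X" AND @encodedRowKey="Y"]{*}. */
    static String buildQuery(String entityService, Map<String, String> filters) {
        StringBuilder sb = new StringBuilder(entityService).append('[');
        boolean first = true;
        for (Map.Entry<String, String> f : filters.entrySet()) {
            if (!first) {
                sb.append(" AND ");
            }
            // Note: values containing double quotes are not escaped in this sketch.
            sb.append('@').append(f.getKey()).append("=\"").append(f.getValue()).append('"');
            first = false;
        }
        return sb.append("]{*}").toString();
    }

    /** Appends the form-encoded query and the page size to the service base URL. */
    static String buildUrl(String baseUrl, String query, int pageSize) {
        try {
            return baseUrl + "/rest/list?query=" + URLEncoder.encode(query, "UTF-8")
                    + "&pageSize=" + pageSize;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

Passing the filters in a LinkedHashMap keeps them in insertion order, so the generated query matches the order used in the samples.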
GENERIC INTERFACE DESIGN
A generic interface for implementing a custom audit source, modeled on the PropertyChangeListener used in TaggedLogAPIEntity.
- AuditListener Interface - Contains the method to be implemented for auditing purposes.
- HBaseStorageAudit Class - Implements the AuditListener interface. Register the implementation class for callback and override the above method with the custom logic, which builds up the audit data and persists it to an audit table in HBase. Here we enforce the condition that auditing happens only for HBase operations related to entities used for Policy/Datasource.
- HBaseStorage Class - From the create/update/delete methods, call the method in HBaseStorageAudit that triggers the listener method.
- AuditSupport Class - Used for maintaining the AuditListener implementations and invoking them.
Adding to the audit columns mentioned in Approach #2, we can also persist the columns available in the @Tags annotation of an entity, so that we can model the audit retrieval service based on the data retrieval service of different entities.
We can also add a configuration parameter that enables or disables the audit feature, if required.
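The listener design above might look roughly like the following. Only the names AuditListener and AuditSupport come from this page; the method names, the AuditEvent class, the auditEnabled flag, and InMemoryStorageAudit (an in-memory stand-in for HBaseStorageAudit that does not touch HBase) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

/** Audit payload; fields mirror the Approach #2 columns. */
final class AuditEvent {
    final String serviceName;
    final String encodedRowKey;
    final String userID;
    final String actionTaken;   // CREATE, UPDATE or DELETE
    final long auditTimestamp;  // epoch milliseconds (assumption)

    AuditEvent(String serviceName, String encodedRowKey, String userID,
               String actionTaken, long auditTimestamp) {
        this.serviceName = serviceName;
        this.encodedRowKey = encodedRowKey;
        this.userID = userID;
        this.actionTaken = actionTaken;
        this.auditTimestamp = auditTimestamp;
    }
}

/** Callback implemented by each audit sink; the method name is hypothetical. */
interface AuditListener {
    void auditPerformed(AuditEvent event);
}

/** Maintains the registered listeners and invokes them. */
final class AuditSupport {
    private final List<AuditListener> listeners = new CopyOnWriteArrayList<>();
    private final boolean auditEnabled; // stands in for the configuration parameter

    AuditSupport(boolean auditEnabled) {
        this.auditEnabled = auditEnabled;
    }

    void addAuditListener(AuditListener listener) {
        listeners.add(listener);
    }

    void fireAudit(AuditEvent event) {
        if (!auditEnabled) {
            return; // auditing switched off by configuration
        }
        for (AuditListener listener : listeners) {
            listener.auditPerformed(event);
        }
    }
}

/** Stand-in for HBaseStorageAudit: keeps entries in memory instead of writing to HBase. */
final class InMemoryStorageAudit implements AuditListener {
    final List<AuditEvent> persisted = new ArrayList<>();
    private final Set<String> auditedServices;

    InMemoryStorageAudit(Set<String> auditedServices) {
        this.auditedServices = auditedServices;
    }

    @Override
    public void auditPerformed(AuditEvent event) {
        // Audit only the services related to Policy/Datasource entities.
        if (auditedServices.contains(event.serviceName)) {
            persisted.add(event);
        }
    }
}
```

In this arrangement the create/update/delete methods of HBaseStorage would only need to call fireAudit; which services are audited and where the entries go stays inside the listener implementation.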
9 Comments
Edward Zhang
I think option 2 makes more sense than option 1.
This design tries to audit table modifications at a very low level, but the problem is that users want high-level audit information, for example who created which policy. How does the current design achieve this goal?
Murali Krishna
Edward Zhang, we could add three more columns to the current table design for Approach #2, say source_id, source_name and source_description. These columns would contain high-level details about the Policy/Site, such as the policy/site ID and policy/site name.
Another service call can be added to the retrieval to get the details of the policies/sites created by a specific user. For example,
-- Get user specific audits
http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @userID="admin"]{*}&pageSize=100
-- Get user and action specific audits
http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @userID="admin" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100
Edward Zhang
Thanks. I did not mean that we need to add source_id, source_name, source_description. My original question is: if we do the audit at a low level, for example in class HBaseStorage.java, how do we know the content of the updated/created/deleted entity? In HBaseStorage everything is generic; we don't know whether the entity is for a policy, a site or a data source.
Murali Krishna
The methods available in HBaseStorage.java for the HBase operations take an EntityDefinition parameter, from which we can tell which service the call is for. Using this, we can store the value of the serviceName column in the audit table to differentiate whether the audit entry is for a policy/site/datasource.
For example,
Policy creation request >> http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDefinitionService
AlertDefinitionService - this Service Name will be available in the entity definition
Datasource creation request >> http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDataSourceService
AlertDataSourceService - this Service Name will be available in the entity definition
We would not need the content of the entity if we just go with the encodedRowKeys, as they are readily available in the response object for the CREATE/UPDATE/DELETE operations in HBaseStorage.java.
Senthil Kumar
Edward Zhang and Murali Krishna, here is a way to read the contents of a generic entity:
for (TaggedLogAPIEntity entity : entities) {
    try {
        Object policyId = entityDefinition.getValue(entity, "policyId");
        Object policyName = entityDefinition.getValue(entity, "policyName");
        System.out.println(policyId.toString() + "--" + policyName.toString());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Hopefully this way we can fetch the required fields from the entity and audit them.
Edward Zhang
Thanks for the investigation. I think we probably need some generic callback interface where, for policy create/update/delete, we can implement the logic you have mentioned, even by forcing type conversion to AlertDefinitionAPIEntity. Please suggest.
Senthil Kumar
Edward Zhang, can you please review the Generic Interface Design approach that was added?
Edward Zhang
Thanks Senthil for this proposal. Nice!
Murali Krishna
Edward Zhang, in the Eagle services for PolicyDefinition and DataSourceDefinition, both the create and update operations go through the create method of HBaseStorage.java. So how do we differentiate between a create and an update operation without hitting HBase again to check whether the row exists (is this acceptable from a performance point of view)? Or is there any other way to differentiate?
Also, the delete operation is done by passing the encoded row key. We would need to do a GET operation here to get the row details (for example, the policy ID or datasource or site) for the audit.