LensDriver is an interface which allows developers to integrate query processing engines into Lens.
Responsibilities of a Lens Driver:
- Provide query estimates
- Execute queries - must support both blocking and non-blocking execute calls.
- HQL to * mapping - convert HQL to the query language understood by the backend engine.
  For example, the JDBC driver has a columnar rewriter which converts HQL to ANSI SQL and does some optimizations required by InfoBright.
- Provide in-memory and persistent result sets
Adding a new driver for a new query processing engine:
When adding a new driver, consider all of the following.
- estimate implementation - estimate should at least do semantic validation. Actual cost estimate can be symbolic. For example currently we assume JDBC cost to be zero and Hive cost to be 1.
- Does the backend support HQL? If not, you will have to translate HQL to the appropriate query language.
- Does the backend support async execution of queries? If not, the driver should handle that.
- Is there a get status API?
- Result set implementation:
  - Is it possible to persist the result set to HDFS?
  - Is it possible to stream the result set instead of loading it entirely in Lens memory?
- What happens when the backend engine goes down while Lens is running? Is there a way to recover queries?
- Does engine support cancel query? If not, what would happen to abandoned queries?
- Mapping of driver query statuses to Lens query statuses
- If Lens restarts while the backend engine is still running, is it possible to recover queries if you have a reference to the remote query?
- Do we need to maintain connections per user?
- Do we need to pool connections?
- Do we need a separate pool for estimate queries?
- Connection thread safety?
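Several of the questions above (async execution, a status API, cancel, status mapping) can be illustrated with a minimal sketch for a backend that only offers blocking execution. The names AsyncBackendAdapter and SketchState are hypothetical and not part of Lens. A real driver would translate these states into Lens query statuses and might need a different threading or connection model.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical driver-side states; a real driver would map these to Lens query statuses. */
enum SketchState { RUNNING, SUCCESSFUL, FAILED, CANCELLED, UNKNOWN }

/** Wraps a blocking-only backend so callers can submit, poll and cancel queries. */
class AsyncBackendAdapter<R> implements AutoCloseable {
    private final ExecutorService pool;
    private final Map<String, Future<R>> submitted = new ConcurrentHashMap<>();

    AsyncBackendAdapter(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Non-blocking submit: the blocking backend call runs on a pool thread. */
    void executeAsync(String queryId, Callable<R> blockingCall) {
        submitted.put(queryId, pool.submit(blockingCall));
    }

    /** Derives a status from the Future, standing in for a backend "get status" API. */
    SketchState status(String queryId) {
        Future<R> f = submitted.get(queryId);
        if (f == null) return SketchState.UNKNOWN;
        if (!f.isDone()) return SketchState.RUNNING; // queued or executing
        if (f.isCancelled()) return SketchState.CANCELLED;
        try {
            f.get(); // already done, so this returns immediately
            return SketchState.SUCCESSFUL;
        } catch (ExecutionException e) {
            return SketchState.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return SketchState.UNKNOWN;
        }
    }

    /** Blocks until the query finishes, then reports its state. */
    SketchState awaitCompletion(String queryId) {
        Future<R> f = submitted.get(queryId);
        if (f == null) return SketchState.UNKNOWN;
        try {
            f.get();
        } catch (ExecutionException | CancellationException e) {
            // The outcome is reported by status() below.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return status(queryId);
    }

    /** Interrupts the worker thread; whether the backend stops work is backend specific. */
    boolean cancel(String queryId) {
        Future<R> f = submitted.get(queryId);
        return f != null && f.cancel(true);
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

Note that cancelling the Future only interrupts the local thread. If the engine itself has no cancel call, the abandoned query may keep running on the backend, which is exactly the case the checklist asks you to think through.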
Driver usage in Lens:
QueryExecutionService in Lens loads drivers at initialization time. The driver lifecycle is managed by the query service.
Driver state is persisted by the query service. Not all drivers need persistent state; it depends on the query engine. If a driver needs to persist state across restarts, it should implement writeExternal and readExternal accordingly.
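As an illustration of persisting driver state, here is a minimal sketch of a class that writes and reads a map of query references through java.io.Externalizable. The class StatefulDriverSketch, its methods, and the choice of state (a Lens handle id mapped to a backend query reference) are assumptions made for the example. What a real driver needs to persist depends on its engine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical driver fragment that persists references to remote queries. */
class StatefulDriverSketch implements Externalizable {
    // Assumed state: Lens query handle id -> backend query reference.
    private final Map<String, String> remoteRefs = new HashMap<>();

    /** Externalizable classes need a public no-arg constructor. */
    public StatefulDriverSketch() { }

    void track(String handleId, String remoteRef) {
        remoteRefs.put(handleId, remoteRef);
    }

    String remoteRef(String handleId) {
        return remoteRefs.get(handleId);
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Write the entry count first so readExternal knows how much to read.
        out.writeInt(remoteRefs.size());
        for (Map.Entry<String, String> e : remoteRefs.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        remoteRefs.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String handleId = in.readUTF();
            String remoteRef = in.readUTF();
            remoteRefs.put(handleId, remoteRef);
        }
    }

    /** Serializes the driver state to bytes, as a caller persisting it might do. */
    static byte[] save(StatefulDriverSketch driver) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            driver.writeExternal(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds a driver from previously saved bytes, as after a restart. */
    static StatefulDriverSketch restore(byte[] data) {
        StatefulDriverSketch driver = new StatefulDriverSketch();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            driver.readExternal(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return driver;
    }
}
```

The important property is symmetry: readExternal must consume fields in exactly the order writeExternal produced them.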
Query execution workflow:
1. Submit query to query queue
2. The query submitter thread 'takes' the query from the queue and:
  - Rewrites the query for each driver
  - Asks each driver for a cost estimate
  - Selects the driver with minimum cost (driver selector)
  - Calls executeAsync on the selected driver
3. A background thread polls the query status from the driver. The driver sets DriverQueryStatus.
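The minimum-cost selection step in the workflow above can be sketched as follows. This is not the Lens driver selector: CostingDriver, FixedCostDriver and MinCostSelector are hypothetical stand-ins, costs are plain doubles, and a driver that cannot handle a query is assumed to throw from estimate. The symbolic costs mirror the ones mentioned earlier (JDBC 0, Hive 1).

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a driver; only what selection needs is modelled. */
interface CostingDriver {
    String name();

    /** Returns a symbolic cost, or throws if the driver cannot run the query. */
    double estimate(String query);
}

/** Test-friendly driver with a fixed cost; a null cost means "cannot run this query". */
class FixedCostDriver implements CostingDriver {
    private final String name;
    private final Double cost;

    FixedCostDriver(String name, Double cost) {
        this.name = name;
        this.cost = cost;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public double estimate(String query) {
        if (cost == null) {
            throw new UnsupportedOperationException(name + " cannot run: " + query);
        }
        return cost;
    }
}

class MinCostSelector {
    /** Asks every driver for an estimate and returns the cheapest one that accepted the query. */
    static Optional<CostingDriver> select(List<CostingDriver> drivers, String query) {
        CostingDriver best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (CostingDriver driver : drivers) {
            double cost;
            try {
                cost = driver.estimate(query);
            } catch (RuntimeException e) {
                continue; // this driver rejected the query, so skip it
            }
            // Strict comparison keeps the first driver when costs tie.
            if (best == null || cost < bestCost) {
                best = driver;
                bestCost = cost;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With a Hive-like driver at cost 1 and a JDBC-like driver at cost 0, select returns the JDBC-like driver. If every driver rejects the query, the result is empty and the caller has to fail the query.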
LensDriver interface:

    /**
     * The Interface LensDriver.
     */
    public interface LensDriver extends Externalizable {

      /**
       * Get driver configuration
       */
      Configuration getConf();

      /**
       * Configure driver with {@link Configuration} passed.
       *
       * @param conf The configuration object
       * @throws LensException the lens exception
       */
      void configure(Configuration conf) throws LensException;

      /**
       * Estimate the cost of execution for given query.
       *
       * This should be returned with very low latency - should return within tens of milliseconds.
       *
       * @param qctx The query context
       *
       * @return The QueryCost object
       *
       * @throws LensException the lens exception if driver cannot estimate
       */
      QueryCost estimate(AbstractQueryContext qctx) throws LensException;

      /**
       * Explain the given query.
       *
       * @param explainCtx The explain context
       *
       * @return The query plan object
       * @throws LensException the lens exception
       */
      DriverQueryPlan explain(AbstractQueryContext explainCtx) throws LensException;

      /**
       * Prepare the given query.
       *
       * @param pContext the context
       * @throws LensException the lens exception
       */
      void prepare(PreparedQueryContext pContext) throws LensException;

      /**
       * Explain and prepare the given query.
       *
       * @param pContext the context
       * @return The query plan object
       * @throws LensException the lens exception
       */
      DriverQueryPlan explainAndPrepare(PreparedQueryContext pContext) throws LensException;

      /**
       * Close the prepared query specified by the prepared handle, releases all the resources held by the prepared query.
       *
       * @param handle The query handle
       * @throws LensException the lens exception
       */
      void closePreparedQuery(QueryPrepareHandle handle) throws LensException;

      /**
       * Blocking execute of the query
       * <p></p>
       * The driver would be closing the driver handle, once the results are fetched.
       *
       * @param context the context
       * @return returns the result set, null if there is no result available
       * @throws LensException the lens exception
       */
      LensResultSet execute(QueryContext context) throws LensException;

      /**
       * Asynchronously execute the query.
       *
       * @param context The query context
       * @throws LensException the lens exception
       */
      void executeAsync(QueryContext context) throws LensException;

      /**
       * Register for query completion notification.
       *
       * @param handle the handle
       * @param timeoutMillis the timeout millis
       * @param listener the listener
       * @throws LensException the lens exception
       */
      void registerForCompletionNotification(QueryHandle handle, long timeoutMillis, QueryCompletionListener listener)
        throws LensException;

      /**
       * Update driver query status in the context object.
       *
       * @param context The query context
       * @throws LensException the lens exception
       */
      void updateStatus(QueryContext context) throws LensException;

      /**
       * Fetch the results of the query, specified by the handle.
       *
       * @param context The query context
       * @return returns the result set
       * @throws LensException the lens exception
       */
      LensResultSet fetchResultSet(QueryContext context) throws LensException;

      /**
       * Close the resultset for the query.
       *
       * @param handle The query handle
       * @throws LensException the lens exception
       */
      void closeResultSet(QueryHandle handle) throws LensException;

      /**
       * Cancel the execution of the query, specified by the handle.
       *
       * @param handle The query handle.
       * @return true if cancel was successful, false otherwise
       * @throws LensException the lens exception
       */
      boolean cancelQuery(QueryHandle handle) throws LensException;

      /**
       * Close the query specified by the handle, releases all the resources held by the query.
       *
       * @param handle The query handle
       * @throws LensException the lens exception
       */
      void closeQuery(QueryHandle handle) throws LensException;

      /**
       * Close the driver, releasing all resources used up by the driver.
       *
       * @throws LensException the lens exception
       */
      void close() throws LensException;

      /**
       * Add a listener for driver events.
       *
       * @param driverEventListener the driver event listener
       */
      void registerDriverEventListener(LensEventListener<DriverEvent> driverEventListener);

      /**
       * Add the user config loader to driver for use
       * @param userConfigLoader
       */
      void registerUserConfigLoader(UserConfigLoader userConfigLoader);
    }