
Overview

Engineers with systems operating on and caching flat domain objects want to easily persist those objects to a relational database table.  Ex: put an Employee object into a region, and read/write the Employee object from/to an Employee table.

Goals

  1. User can easily map a region to a JDBC table with minimal configuration.

  2. User can persist entity objects put into the region, with support for both reading and writing.

Approach

  1. Expose new jdbc-mapping and jdbc-connection commands.

  2. Implement JdbcLoader, JdbcWriter, and JdbcAsyncWriter to be added to a region on region create.

Out of Scope

  1. This won’t solve complex use cases involving complex entity objects, nested structures, etc., and it won’t solve every inline caching use case. It is designed to be a general fit, so it won’t solve unique use cases; users are still free to code their own solutions for those. Normalized flat tables, with corresponding domain objects covering only one topic, are encouraged.


 
GFSH Changes

create jdbc-mapping
alter jdbc-mapping
describe jdbc-mapping
destroy jdbc-mapping
list jdbc-mappings

create jdbc-connection
alter jdbc-connection
describe jdbc-connection
destroy jdbc-connection
list jdbc-connections

Notable New Classes

JdbcLoader

JdbcWriter

JdbcAsyncWriter

 

Out-of-Box Implementation

 

The JDBC Connector is designed to be usable out of the box. It plugs into a pattern where you have a specific table name that (probably) won’t change.

1) Create a region with the same name as the table, and add a JdbcWriter (for write-through), a JdbcAsyncWriter (for write-behind), or a JdbcLoader (for read-through).

2) Create a jdbc-connection to your backend DB, then create a jdbc-mapping to match the region to the connection.

3) Begin putting entity objects into the region. The default behavior will attempt to write to a table matching the name of the region, into columns matching the field names on your object.

If your names don’t match, you can configure specific region-to-table and field-to-column name mappings with the jdbc-mapping command, as in the example below.
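For illustration, a mapping with mismatched names might look like the following. The --table and --field-mapping options shown here are assumptions; check your version’s create jdbc-mapping help for the exact option names.

gfsh>create jdbc-mapping --region="employee" --connection="aws_mysql" --table="EMPLOYEES" --field-mapping="name:EMP_NAME,age:EMP_AGE"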

 

Getting Started

The JDBC Connector works best when starting from an empty region and an empty table (so the two begin in sync), and when using one-application, one-region, one-table tenancy.



Setup

1. You’ll need an existing JDBC data source with a table you want to map a region to, and an entity object that implements PdxSerializable.

    ex: a MySQL RDS instance running in AWS


2. In gfsh, you’ll need to add the JDBC driver jar to the server classpath when you start the server.

gfsh>start server --name=s1 --classpath="/Users/fkrone/workspace/mysql-connector-java-5.1.45-bin.jar"


3. Create a region with the same name as the table you want to persist to (optionally, you can use mismatched table/region names and configure the mapping with the jdbc-mapping command).

 

gfsh>create region --name="employee" --type=REPLICATE --cache-loader="org.apache.geode.connectors.jdbc.JdbcLoader"


4. Create a jdbc-connection with the DB URL and credentials.

gfsh>create jdbc-connection --name=aws_mysql --url="jdbc:mysql://rds-mysql-test.123.us-west-2.rds.amazonaws.com:3306/test?user=emma&password=iamthegreatest"


5. Map the new connection to the region using the jdbc-mapping command.

 

gfsh>create jdbc-mapping --connection="aws_mysql" --region="employee"
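At this point the region is wired to the table, and a get on a missing key will load the matching row through the JdbcLoader. A minimal client-side sketch of read-through, assuming the cluster above, a locator on localhost[10334], and the Employee entity class shown under Entity Object Structure below (host, port, and key value are illustrative):

import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ReadThroughExample {
  public static void main(String[] args) {
    // Connect to the cluster started above (locator host/port assumed)
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("localhost", 10334)
        .create();

    // PROXY region: all operations are forwarded to the servers
    Region<String, Employee> employees = cache
        .<String, Employee>createClientRegionFactory(ClientRegionShortcut.PROXY)
        .create("employee");

    // A get on a key not in the region triggers the server-side JdbcLoader,
    // which reads the matching row from the employee table and returns it
    Employee e = employees.get("1");
    System.out.println(e);

    cache.close();
  }
}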

 

 

Risks and Mitigations

 

It is strongly recommended that you do your own use-case specific testing when using the JDBC Connector. Test with a realistic read/write load from the region to the DB, and make sure the JDBC Connector performs according to your SLAs. Make sure you understand the number of connections made to your DB, and review it against your licensing agreement where applicable. It is also recommended that you not read or write directly to a DB which is your system of record. If possible, use a DB that is designed to be a backup of your cache, and ETL from there.

 

Region to Database Table Mapping

By default, the connector uses the region name as the table name and the PDX field name as the column name. If the region or field name differs from the database, a jdbc-mapping can be used to specify the mapping.

Default Mapping and Case Sensitivity

The default mapping first looks for an exact match. If one is not found, it then looks for a case-insensitive match. If multiple case-insensitive matches are found, an exception is thrown.

Write Mapping

Every PDX field must map to an existing column name. If a column does not exist, an exception is thrown when a create or update is done.

Read Mapping

If the jdbc-mapping has a pdx-class-name, then every table column must correspond to an existing PDX field name; if one does not, the read throws an exception. If the jdbc-mapping does not have a pdx-class-name, then a PDX field name is created for every table column.

Type mapping

The connector tries to map the Java PDX field type to an appropriate SQL type. The user needs to make sure that the Java PDX type is compatible with the database column type.

Type mapping while writing

The JDBC Connector relies on the JDBC driver’s ability to map a PDX type to the column type. The JDBC driver will throw an exception if the types are not compatible. If the PDX data is null, the JDBC PreparedStatement.setNull method is called. For non-null PDX data, the JDBC PreparedStatement.setObject method is called with the PDX data. The PDX data is passed to PreparedStatement.setObject unchanged, with the following two exceptions (a combined sketch follows the two cases below):

Writing a PDX field that is a Character or char

For characters whose value is zero, JDBC setNull will be called. All non-zero characters are changed to a String. We advise against using Character: it will be written with PreparedStatement.setObject, JDBC will convert it for the DB column on insert, and a read will use ResultSet.getObject and won’t convert back to Character (more under type mapping below).

Writing a PDX field that is a java.util.Date

java.util.Date instances are changed to a SQL date type (java.sql.Date, java.sql.Time, or java.sql.Timestamp) based on the column type.
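Taken together, the write-path rules above amount to something like the following sketch. This is simplified and illustrative, not the connector’s actual code; the real connector also derives the SQL statement and column order from the jdbc-mapping.

import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

public class WritePathSketch {
  // Bind one PDX field value to a PreparedStatement parameter
  static void bindField(PreparedStatement stmt, int index, Object pdxValue)
      throws SQLException {
    if (pdxValue == null) {
      stmt.setNull(index, Types.NULL); // null PDX data uses setNull (actual SQL type depends on the column)
    } else if (pdxValue instanceof Character) {
      char c = (Character) pdxValue;
      if (c == 0) {
        stmt.setNull(index, Types.NULL); // zero char is treated as null
      } else {
        stmt.setObject(index, String.valueOf(c)); // non-zero chars become Strings
      }
    } else {
      // Everything else (including java.util.Date, which the connector first
      // converts to the matching java.sql type) passes through to setObject
      stmt.setObject(index, pdxValue);
    }
  }
}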

Type mapping while reading with a pdx-class-name

When data is read from JDBC and the region mapping has a pdx-class-name, the data is converted to the PDX field type by calling the matching getXXX method on the ResultSet.

For example, if the PDX field type is short, then ResultSet.getShort is called.

The JDBC getXXX method will throw an exception if it is unable to convert the column data.

The following special cases exist when reading with a pdx-class-name (a combined sketch follows the cases below):

Reading a char field

ResultSet.getString is called and the first char of the String is put in the PDX field. If the String is null or empty then zero is put in the PDX field.

Reading a date field

The column type is used to call one of the following: ResultSet.getDate, ResultSet.getTime, or ResultSet.getTimestamp. The resulting object is put in the PDX field as a java.util.Date.

Reading a byte array field

If the column type is BLOB then ResultSet.getBlob is called and the entire blob is read into a byte array. Otherwise ResultSet.getBytes is called.

Reading an array field other than byte array

ResultSet.getObject is called and cast to the field’s array type. In most, if not all, cases this will result in a class cast exception.

Reading an object field

If the column type is BLOB, then ResultSet.getBlob is called and the entire BLOB is read into a byte array; otherwise ResultSet.getObject is called. If it returns a java.sql.Date, java.sql.Time, or java.sql.Timestamp, then it is converted to a java.util.Date.

Note that if an object field originally contained an instance of java.lang.Character, then it is written to JDBC as an instance of String using PreparedStatement.setObject.  So when it is read using ResultSet.getObject, it will not be an instance of java.lang.Character.
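The per-column dispatch described above looks roughly like this sketch. It is illustrative only; the real connector derives the field names and types from the pdx-class-name mapping.

import java.sql.ResultSet;
import java.sql.SQLException;

public class ReadPathSketch {
  // Read one column into the value expected by the given PDX field type
  static Object readColumn(ResultSet rs, String column, Class<?> pdxFieldType)
      throws SQLException {
    if (pdxFieldType == short.class) {
      return rs.getShort(column); // PDX short -> ResultSet.getShort
    }
    if (pdxFieldType == char.class) {
      String s = rs.getString(column); // char fields are read via getString
      return (s == null || s.isEmpty()) ? (char) 0 : s.charAt(0);
    }
    if (pdxFieldType == java.util.Date.class) {
      // The column type decides getDate/getTime/getTimestamp; Timestamp shown here
      java.sql.Timestamp ts = rs.getTimestamp(column);
      return ts == null ? null : new java.util.Date(ts.getTime());
    }
    // Other types use their matching getXXX method; getXXX throws if the
    // column data cannot be converted
    return rs.getObject(column);
  }
}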

When data is read from JDBC and the region mapping has no pdx-class-name, then the type of every field is object.

Registering PDX Metadata

When a write is done on a region, PDX metadata that describes the class is registered. If an application reads before doing any writes, it can register the PDX metadata itself using the cache API Cache.registerPdxMetaData(). This situation can also occur if the PDX registry is not persisted and the cache server is restarted.
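For example, a client could register the metadata up front (a sketch, using the Employee class shown below):

import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;

public class RegisterMetadataExample {
  public static void main(String[] args) {
    ClientCache cache = new ClientCacheFactory().create();
    // Register PDX metadata for Employee before the first read, so the
    // connector can build PDX instances even though no write has happened yet
    cache.registerPdxMetaData(new Employee("placeholder", 0));
    cache.close();
  }
}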

Entity Object Structure

Nested structures are not supported. Flat table structures with unique key columns must be used.

This connector is for persisting flat domain objects to a table; pull from those flat tables for more complex transformations, etc.


Your entity object needs to implement PdxSerializable.


public class Employee implements PdxSerializable {


Implement the toData and fromData methods from the interface. JDBC returns objects in its results, which the JdbcLoader will try to convert; implement toData and fromData so those returned objects are read into fields of specific types.
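A minimal sketch completing the declaration above (the name and age fields are illustrative):

import org.apache.geode.pdx.PdxReader;
import org.apache.geode.pdx.PdxSerializable;
import org.apache.geode.pdx.PdxWriter;

public class Employee implements PdxSerializable {
  private String name; // maps to a "name" column by default
  private int age;     // maps to an "age" column by default

  public Employee() {
    // a no-arg constructor is required for deserialization
  }

  public Employee(String name, int age) {
    this.name = name;
    this.age = age;
  }

  @Override
  public void toData(PdxWriter writer) {
    writer.writeString("name", name);
    writer.writeInt("age", age);
  }

  @Override
  public void fromData(PdxReader reader) {
    name = reader.readString("name");
    age = reader.readInt("age");
  }
}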