Our goal is to define a single place where all type information resides. We define a TypeConfiguration class that carries multiple type extensions for different features. This configuration is set for the whole Ignite instance, with the ability to override it at the per-cache level.

TypeConfiguration

public class TypeConfiguration implements Serializable {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;
 
    /** 
     * Type name. Can be one of three things:
     * - Simple type name. Use it when type is not in classpath;
     * - Fully qualified type name. Prefer it when type is in classpath to avoid conflicts. 
     *   E.g. "my.package.employee.Address", "my.package.organization.Address";
     * - Package wildcard. Must end with ".*" in this case. E.g. "my.package.*".
     */
    private String typeName;
 
    /** Used to configure single type when it is not in classpath. */
    public void setTypeName(String typeName) {...}
 
    /** Used to configure specific class. Both typeName and packageName fields will be set. */
    public void setClass(Class<?> cls) {...}
 
    /** Affinity key field name. */
    private String affKeyFieldName;

    /** Type info extensions. */
    private Map<Class<? extends TypeInfo>, TypeInfo> typeInfos;

    public TypeInfo[] getTypeInfo() {...} 
    public void setTypeInfo(TypeInfo... typeInfos) {...}  

    public <T extends TypeInfo> T getTypeInfo(Class<T> infoCls) {...}
}

Notes:

  • TypeInfos are set as varargs to follow general Ignite conventions (e.g. IgniteConfiguration.setCacheConfiguration()).
  • The TypeInfo getter and setter have no "s" at the end, again to follow general Ignite conventions (e.g. IgniteConfiguration.setCacheConfiguration()).
  • Shouldn't we move "affKeyFieldName" to a separate TypeInfo, e.g. AffinityKeyTypeInfo?
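
The three accepted forms of typeName (simple name, fully qualified name, package wildcard) imply a matching rule that the page does not spell out. The following is a minimal sketch of one possible rule. TypeNameMatcher is a hypothetical helper, not part of the proposed API, and the choice that a wildcard covers only direct members of the package (not subpackages) is an assumption.

```java
import java.util.Objects;

/** Hypothetical helper (not part of the proposal) that matches a configured type name against a class name. */
final class TypeNameMatcher {
    private TypeNameMatcher() {
        // No instances.
    }

    /**
     * @param typeName Configured name: simple name, fully qualified name, or package wildcard ending with ".*".
     * @param clsName Fully qualified class name to test.
     * @return {@code true} if the class is covered by the configured name.
     */
    static boolean matches(String typeName, String clsName) {
        Objects.requireNonNull(typeName, "typeName");
        Objects.requireNonNull(clsName, "clsName");

        if (typeName.endsWith(".*")) {
            // Package wildcard: drop only the '*', so "my.package.*" becomes "my.package.".
            String pkgPrefix = typeName.substring(0, typeName.length() - 1);

            // Assumption: the wildcard covers direct members of the package only.
            return clsName.startsWith(pkgPrefix) && clsName.indexOf('.', pkgPrefix.length()) < 0;
        }

        // A name containing a dot is treated as fully qualified and must match exactly.
        if (typeName.indexOf('.') >= 0)
            return typeName.equals(clsName);

        // Simple name: compare against the last segment of the class name.
        String simpleName = clsName.substring(clsName.lastIndexOf('.') + 1);

        return typeName.equals(simpleName);
    }
}
```

With this rule, the simple name "Address" matches both "my.package.employee.Address" and "my.package.organization.Address", which is the conflict the Javadoc warns about when it recommends fully qualified names.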

TypeInfo

public interface TypeInfo extends Serializable {
    /** Whether implementation supports single type. */
    boolean supportSingleType();
 
    /** Whether implementation supports multiple types. */
    boolean supportMultipleTypes();
}

Notes:

  • Package configurations are only supported by PortableTypeInfo for now. For this reason it makes sense to add "support*" methods to prevent misconfiguration.
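
As an illustration of how the "support*" methods could prevent misconfiguration, here is a minimal sketch of a startup check. TypeInfoValidator, SingleTypeOnlyInfo and AnyTypeInfo are hypothetical names introduced only for this example, and the TypeInfo interface is repeated so the snippet compiles on its own.

```java
import java.io.Serializable;

/** Copy of the proposed interface, repeated here so the sketch is self-contained. */
interface TypeInfo extends Serializable {
    boolean supportSingleType();

    boolean supportMultipleTypes();
}

/** Hypothetical stand-in for an extension that only makes sense for a single type. */
class SingleTypeOnlyInfo implements TypeInfo {
    private static final long serialVersionUID = 0L;

    @Override public boolean supportSingleType() { return true; }

    @Override public boolean supportMultipleTypes() { return false; }
}

/** Hypothetical stand-in for an extension that also accepts package wildcards. */
class AnyTypeInfo implements TypeInfo {
    private static final long serialVersionUID = 0L;

    @Override public boolean supportSingleType() { return true; }

    @Override public boolean supportMultipleTypes() { return true; }
}

/** Hypothetical validator: rejects extensions that do not fit the kind of type name they are attached to. */
final class TypeInfoValidator {
    private TypeInfoValidator() {
        // No instances.
    }

    static void validate(String typeName, TypeInfo... infos) {
        // A name ending with ".*" is a package wildcard, i.e. it configures multiple types.
        boolean wildcard = typeName.endsWith(".*");

        for (TypeInfo info : infos) {
            boolean ok = wildcard ? info.supportMultipleTypes() : info.supportSingleType();

            if (!ok)
                throw new IllegalArgumentException(info.getClass().getSimpleName() + " does not support " +
                    (wildcard ? "package wildcard" : "single type") + " configuration: " + typeName);
        }
    }
}
```

A call such as TypeInfoValidator.validate("my.package.*", new SingleTypeOnlyInfo()) fails fast with an IllegalArgumentException instead of letting the wildcard be silently ignored.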

PersistenceTypeInfo

public class PersistenceTypeInfo implements TypeInfo {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** Schema name in database. */
    private String dbSchema;

    /** Table name in database. */
    private String dbTbl;

    /** Persisted fields. */
    private Collection<PersistenceField> fields;
}
public class PersistenceField implements Serializable {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** Column name in database. */
    private String dbFieldName;

    /** Column JDBC type in database. */
    private int dbFieldType;

    /** Field name in java object. */
    private String javaFieldName;

    /** Corresponding java type. */
    private Class<?> javaFieldType;
}

Notes:

  • Package wildcards are not supported.

PortableTypeInfo

public class PortableTypeInfo implements TypeInfo {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** ID mapper. */
    private PortableIdMapper idMapper;

    /** Serializer. */
    private PortableSerializer serializer;

    /** Portable metadata enabled flag. When disabled, queries and pretty toString() will not work. */
    private Boolean metaDataEnabled = true;
}

Notes:

  • Supported for both types and package wildcards.

QueryTypeInfo

public class QueryTypeInfo implements TypeInfo {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** Fields. */
    private QueryField[] flds;
 
    /** Group indexes. */
    private QueryCompoundIndex[] grpIdxs;
 
    /** Fields to index as text. */
    private String[] txtIdxs;    
}
public class QueryField implements Serializable {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** Field name. */
    private String name;
 
    /** Field class. */
    private Class<?> cls;
 
    /** Whether to index the field. Disabled by default. */
    private QueryFieldIndexType idxTyp = QueryFieldIndexType.NONE;
}
public enum QueryFieldIndexType {
    /** Do not index the field. */
    NONE,
 
    /** Ascending order. */ 
    ASC,
 
    /** Descending order. */ 
    DESC
}
public class QueryCompoundIndex implements Serializable {
    /** Serial version UID. */
    private static final long serialVersionUID = 0L;

    /** Index name. */
    private String name;
 
    /** Participating fields. */
    private QueryField[] flds;
}

Notes:

  • QueryField.cls is a problem for non-Java users, because they have to write things like "java.lang.Integer", which are hard to understand outside of Java. Let's switch to an enum here?
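
To make the enum question in the note above concrete, here is a minimal sketch of what such an enum could look like. QueryFieldType, its constants, and the fromName() helper are all hypothetical; the page does not define them, and the real set of supported types would need to be agreed on.

```java
import java.util.Date;
import java.util.Locale;

/** Hypothetical platform-neutral field type that hides Java class names from non-Java users. */
enum QueryFieldType {
    BOOLEAN(Boolean.class),
    INT(Integer.class),
    LONG(Long.class),
    DOUBLE(Double.class),
    STRING(String.class),
    DATE(Date.class);

    /** Java class used internally for this field type. */
    private final Class<?> javaCls;

    QueryFieldType(Class<?> javaCls) {
        this.javaCls = javaCls;
    }

    /** @return Java class the field is mapped to. */
    Class<?> javaClass() {
        return javaCls;
    }

    /** Case-insensitive lookup, convenient for XML or other text-based configuration. */
    static QueryFieldType fromName(String name) {
        return valueOf(name.trim().toUpperCase(Locale.ROOT));
    }
}
```

A configuration file could then say "int" or "string" instead of "java.lang.Integer" or "java.lang.String", at the cost of supporting only the types listed in the enum.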

IgniteConfiguration

public class IgniteConfiguration {
    /** Type configurations. */
    private TypeConfiguration[] typeCfg;
 
    public TypeConfiguration[] getTypeConfiguration();
    public void setTypeConfiguration(TypeConfiguration... typeCfg);
}

CacheConfiguration

public class CacheConfiguration {
    /** Type configurations. */
    private TypeConfiguration[] typeCfg;
 
    public TypeConfiguration[] getTypeConfiguration();
    public void setTypeConfiguration(TypeConfiguration... typeCfg);
}
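
Since the same TypeConfiguration array appears on both IgniteConfiguration and CacheConfiguration, some rule is needed to decide which entry wins. The sketch below assumes the simplest one: a cache-level entry with the same type name replaces the instance-level entry as a whole. TypeCfg and TypeCfgResolver are hypothetical stand-ins, and the exact-name matching is an assumption; the page does not specify how the override is resolved.

```java
import java.util.Optional;

/** Hypothetical, reduced stand-in for TypeConfiguration with only the fields this sketch needs. */
final class TypeCfg {
    final String typeName;
    final String affKeyFieldName;

    TypeCfg(String typeName, String affKeyFieldName) {
        this.typeName = typeName;
        this.affKeyFieldName = affKeyFieldName;
    }
}

/** Hypothetical resolver: cache-level configuration overrides instance-level configuration. */
final class TypeCfgResolver {
    private TypeCfgResolver() {
        // No instances.
    }

    static Optional<TypeCfg> resolve(String typeName, TypeCfg[] cacheCfgs, TypeCfg[] igniteCfgs) {
        // Look at the per-cache configuration first.
        Optional<TypeCfg> res = find(typeName, cacheCfgs);

        // Fall back to the instance-wide configuration.
        return res.isPresent() ? res : find(typeName, igniteCfgs);
    }

    private static Optional<TypeCfg> find(String typeName, TypeCfg[] cfgs) {
        if (cfgs == null)
            return Optional.empty();

        for (TypeCfg cfg : cfgs) {
            // Assumption: exact name match only, wildcards are not expanded here.
            if (typeName.equals(cfg.typeName))
                return Optional.of(cfg);
        }

        return Optional.empty();
    }
}
```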

21 Comments

    1. Not sure if I like the name Configuration, I would prefer Metadata.
    2. It looks like different Ignite components may add additional type metadata. Should we have generic support for it? For example, we could have a list of metadata beans added to the type metadata.
    3. I think IndexingConfiguration should be renamed to QueryMetadata.
    4. Does it make sense to get rid of all these maps and lists in QueryMetadata and create a data structure that would support it in a more readable fashion?

     

    1. I do not think that Metadata fits: metadata is something provided by a system, but these objects are configured by a user.
    2. Added generic support via marker interface.
  1. I like the latest version better. A few more comments:

    • useTs flag in PortableTypeInfo is no longer valid, as we use the most precise timestamp on every platform
    • need to investigate if metaDataEnabled in PortableTypeInfo makes sense (only if not having metadata yields better performance)
    • we should add CacheConfiguration.isKeepDeserialized() method to allow caching deserialized values on-heap
    • I think we need to add QueryFieldInfo class to simplify indexing configuration (similar to PersistenceFieldInfo)
    • we should add CacheConfiguration.setIndexedTypes(String, String, String...) method to specify key-value pairs for indexing (in addition to the current method that accepts Class<?> parameters)
    • we should also change the signature of the existing setIndexedTypes(...) method to CacheConfiguration.setIndexedTypes(Class<?>, Class<?>, Class<?>...) to ensure that user provides at least a key-value type pair, which is the most common use case.
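
The effect of the proposed setIndexedTypes(Class<?>, Class<?>, Class<?>...) signature can be sketched as follows: the compiler enforces the first key-value pair, and the remaining varargs are checked at runtime to come in pairs. IndexedTypesSketch and toPairs() are hypothetical names used only for illustration; this is not the actual CacheConfiguration implementation.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper showing how the proposed signature could validate key-value type pairs. */
final class IndexedTypesSketch {
    private IndexedTypesSketch() {
        // No instances.
    }

    /**
     * @param keyType First key type (required by the signature).
     * @param valType First value type (required by the signature).
     * @param more Optional additional types, which must alternate key, value, key, value...
     * @return Key-value type pairs in declaration order.
     */
    static List<Map.Entry<Class<?>, Class<?>>> toPairs(Class<?> keyType, Class<?> valType, Class<?>... more) {
        if (more.length % 2 != 0)
            throw new IllegalArgumentException("Additional indexed types must come in key-value pairs, got " +
                more.length + " extra type(s).");

        List<Map.Entry<Class<?>, Class<?>>> pairs = new ArrayList<>();

        pairs.add(new AbstractMap.SimpleImmutableEntry<Class<?>, Class<?>>(keyType, valType));

        for (int i = 0; i < more.length; i += 2)
            pairs.add(new AbstractMap.SimpleImmutableEntry<Class<?>, Class<?>>(more[i], more[i + 1]));

        return pairs;
    }
}
```

A list of entries is used rather than a map so that the same key type can be paired with several value types.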
  2. Folks,

    I updated the design a bit. Please review it, paying attention to the notes after each section.

    Two issues are still to be addressed:

    • CacheConfiguration.setIndexedTypes - why are we passing them as String... or Class...? This looks error-prone to me. Let's group them into pairs, e.g. IgniteType<String, String> or so?
    • It is not clear what to do with PlatformDotNetPortableTypeConfiguration. This is something like (TypeConfiguration + PortableTypeConfiguration), but for .Net and is very specific to this platform. It would be cool to convert it to "TypeInfo", but I have doubts about it, because it will make configuration for .Net users who are not used to Spring XML more complex.
    1. Looks very good. I made a few minor changes, e.g. added Serializable where it was missing, and also changed QueryField to extend Serializable, and not TypeInfo.

  3. Not so good, though (smile)

    It looks like we should not put portable configuration here. First, portable configuration is always global; it cannot be defined at the per-cache level. Second, users will normally configure portables with wildcards. On the other hand, neither QueryTypeInfo nor PersistenceTypeInfo makes sense at the global level. They are always configured at the per-cache level. I cannot imagine a scenario where a user would want to share them between different caches.

    So we can configure portable separately on Ignite level, and queries/persistence on cache level. But this doesn't differ from current configuration much. Yes, it is a bit cleaner, but no major differences.

  4. I actually now believe that all the PersistenceTypeInfo configuration should go to the CacheJdbcPojoStore, which is the only place where it is used. This would leave us with only 2 things to cover: affinity-key and query-configuration.

  5. How about just having CacheTypeConfiguration which will have the optional affinity field and the query config:

    class CacheTypeConfiguration {
        String getTypeName();
     
        // Optional affinity field name for keys.
        String getAffinityField();
    
        // Query type configuration.
        QueryTypeConfiguration getQueryTypeConfiguration();
    }
    class QueryTypeConfiguration {
        // Queryable fields.
        QueryField[] getQueryFields();
     
        // Not sure if this is enough for compound indexes.
        String[][] getCompoundIndexes();
     
        // Fields to index as text.
        String[] getTextIndexFields();
    }

     

    I do understand that we merge key-type and value-type into one SQL table for SQL queries. This can be supported by having the following 2 methods:

    class CacheConfiguration {
        // Actually should be setQueryTypes(),
        // but we are unfortunately stuck with this name.
        void setIndexedTypes(Class<?> keyType, Class<?> valType, Class<?>... moreKeyValPairs);
        void setIndexedTypes(String keyType, String valType, String... moreKeyValPairs);
    }

    Thoughts?

    1. Here is an updated (better?) version of the configuration:

      class AffinityKeyConfiguration {
          String getKeyTypeName();
       
          // Optional affinity field name for keys.
          String getAffinityField();
      }
      class QueryTypeConfiguration {
          // Key type name.
          String getKeyTypeName();
       
          // Value type name.
          String getValueTypeName();
      
          // Queryable fields.
          QueryField[] getQueryFields();
        
          // Not sure if this is enough for compound indexes.
          String[][] getCompoundIndexes();
        
          // Fields to index as text.
          String[] getTextIndexFields();
      }

      Here we decoupled query type configuration from affinity key configuration. The querying engine will still take advantage of the affinity key configuration during query execution.  

  6. About JdbcCacheStoreConfiguration.

    I propose the following:

    public class CacheJdbcStoreField implements Serializable {
        /** Serial version UID. */
        private static final long serialVersionUID = 0L;
     
        /** Column name in database. */
        private String dbFieldName;
     
        /** Column JDBC type in database. */
        private int dbFieldType;
     
        /** Field name in java object. */
        private String javaFieldName;
     
        /** Corresponding java type. */
        private Class<?> javaFieldType;
        ...
    }
     
    public class CacheJdbcStoreConfiguration {
        /** */
        private static final long serialVersionUID = 0L;
    
        /** Schema name in database. */
        private String dbSchema;
    
        /** Table name in database. */
        private String dbTbl;
    
        /** Key class used to store key in cache. */
        private String keyType;
    
        /** Value class used to store value in cache. */
        private String valType;
    
        /** Key fields. */
        private Collection<CacheJdbcStoreField> keyFields;
    
        /** Value fields. */
        private Collection<CacheJdbcStoreField> valFields;
        ...
    }
     
    public abstract class CacheJdbcStoreFactory<K, V> implements Factory<CacheJdbcPojoStore<K, V>> {
        /** Name of data source bean. */
        private String dataSrcBean;
    
        /** Data source. */
        private transient DataSource dataSrc;
    
        /** Database dialect. */
        private JdbcDialect dialect;
     
        /** Store configuration. */
        private CacheJdbcStoreConfiguration cfg;
        ....
    }
     
    public class CacheJdbcPojoStoreFactory<K, V> extends CacheJdbcStoreFactory<K, V> {
        ...
    }
     
    public class CacheJdbcPortableStoreFactory<K, V> extends CacheJdbcStoreFactory<K, V> {
        ...
    }
    1. The persistence design looks good. I think we are going to merge portable marshaller with regular marshaller, so I don't think we will need 2 factories. What is the current purpose of having 2 marshallers?

      1. Dmitriy, thanks for pointing this out. I did some investigation, so let's discuss the following design:

        Note: I think we could drop the "Cache" prefix. Class names will be shorter, and these classes will be in the package org.apache.ignite.cache.store.jdbc anyway.

        public class JdbcStoreField implements Serializable {
            /** Serial version UID. */
            private static final long serialVersionUID = 0L;
          
            /** Column name in database. */
            private String dbFieldName;
          
            /** Column JDBC type in database. */
            private int dbFieldType;
          
            /** Field name in java object. */
            private String javaFieldName;
          
            /** Corresponding java type. */
            private Class<?> javaFieldType;
            ...
        }
         
        public class JdbcStoreTypeConfiguration {
            /** */
            private static final long serialVersionUID = 0L;
         
            /** Schema name in database. */
            private String dbSchema;
         
            /** Table name in database. */
            private String dbTbl;
         
            /** Key class used to store key in cache. */
            private String keyType;
         
            /** Value class used to store value in cache. */
            private String valType;
         
            /** Key fields. */
            private JdbcStoreField[] keyFields;
         
            /** Value fields. */
            private JdbcStoreField[] valFields;
         
            /** If {@code true} object is stored as IgniteObject. */
            private boolean keepSerialized;
            ...
        }
         
        public class JdbcStoreConfiguration {
            /** Types that store could process. */
            private JdbcStoreTypeConfiguration[] types;
         
            /** Name of data source bean. */
            private String dataSrcBean;
         
            /** Database dialect. */
            private JdbcDialect dialect;
            ...
        }
         
        public class JdbcStoreFactory<K, V> implements Factory<JdbcStore<K, V>> {
            /** Data source. */
            private transient DataSource dataSrc;
        
            /** Store configuration. */
            private JdbcStoreConfiguration cfg;
            ....
        }
    2. Alexey, do we really need such long names?

      How about JdbcTypeField and JdbcType?

  7. Guys,

    I think query and indexing related configuration should look like the following:

    /**
     * Query entity is a description of a {@link IgniteCache cache} entry (composed of key and value)
     * that defines how the entry must be indexed and how it can be queried.
     */
    public class QueryEntity {
        private String keyType;
        private String valType;
     
        // Map of field names to type names.
        private LinkedHashMap<String, String> flds;
     
        // Collection of indexes.
        private Collection<QueryIndex> idxs;
     
        // In addition to the obvious getters and setters,
        // we should also have these convenience methods.
        // All these methods should throw an exception
        // if a duplicate field or index already exists.
        public void addField(String name, String type);
        public void addField(String name, Class<?> type);
        public void addIndex(QueryIndex idx);
    }
    
    
    /**
     * Contains list of fields to be indexed. It is possible to provide field name
     * suffixed with index specific extension, for example for {@link Type#SORTED sorted} index
     * the list can be provided as following {@code ("id", "name asc", "age desc")}.
     */
    public class QueryIndex {
        private List<String> fields;
        private Type type;
        /**
         * Index type.
         */
        public enum Type {
            SORTED, FULLTEXT, GEOSPATIAL
        }
    }

    Accordingly, in addition to the setIndexedTypes method, the cache configuration must have a setQueryEntities method.
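
The suffix convention from the QueryIndex Javadoc ("id", "name asc", "age desc") can be illustrated with a small parser. This is a minimal sketch, not part of the proposal: IndexFieldParser is a hypothetical name, and treating a missing suffix as ascending and rejecting duplicate fields are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;

/** Hypothetical parser for index field declarations such as "name asc" or "age desc". */
final class IndexFieldParser {
    private IndexFieldParser() {
        // No instances.
    }

    /**
     * @param fields Field declarations, each a field name with an optional "asc" or "desc" suffix.
     * @return Field name to ascending flag, in declaration order.
     */
    static LinkedHashMap<String, Boolean> parse(List<String> fields) {
        LinkedHashMap<String, Boolean> res = new LinkedHashMap<>();

        for (String decl : fields) {
            String[] parts = decl.trim().split("\\s+");

            if (parts[0].isEmpty() || parts.length > 2)
                throw new IllegalArgumentException("Malformed index field: '" + decl + "'");

            // Assumption: fields without a suffix are sorted in ascending order.
            boolean asc = true;

            if (parts.length == 2) {
                String order = parts[1].toLowerCase(Locale.ROOT);

                if (order.equals("desc"))
                    asc = false;
                else if (!order.equals("asc"))
                    throw new IllegalArgumentException("Unknown sort order in: '" + decl + "'");
            }

            // put() returns the previous value, which is non-null only for a duplicate field.
            if (res.put(parts[0], asc) != null)
                throw new IllegalArgumentException("Duplicate index field: " + parts[0]);
        }

        return res;
    }
}
```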

     

    1. Sergi, can you specify how to configure individual field indexes vs. group indexes?

      1. They differ only in the number of fields. Otherwise it is the same thing.

        1. Now I am confused. Are you suggesting that a group index can contain different types of indexes, like so?

          1. field1, SORTED
          2. field2, FULLTEXT
          3. field3, GEOSPATIAL

          Also, how do I specify ASC vs DESC property?

           

          1. No, this is impossible with the proposed API.

        2. I am generally OK with the design, except for the following:

          • I don't like how "asc" and "desc" parameters are declared in the index.
          • I think index type should be an outer class, not inner class.
          • I also think it is important to specify the convenience constructors and setters for QueryIndex, as it will have direct impact on the usability.

          I would like to propose the following changes:

          // I only show constructors here, but we should also have
          // corresponding setter methods.
          public class QueryIndex {
              private LinkedHashMap<String, Boolean> fields;
              private QueryIndexType idxType;
              // Creates index for one field. 
              //
              // If index is sorted, then ascending sorting is used by default.
              // To specify sort order, use the next method.
              // 
              // This constructor should also have a corresponding setter method.
              public QueryIndex(String field, QueryIndexType type) {...}
          
              // Creates index for one field. The last boolean parameter 
              // is ignored for non-sorted indexes. 
              // 
              // This constructor should also have a corresponding setter method.
              public QueryIndex(String field, QueryIndexType type, boolean asc) {...}
          
              // Creates index for multiple fields. 
              // 
              // If index is sorted, then ascending sorting is used by default.
              // To specify sort order, use the next method.
              // 
              // This constructor should also have a corresponding setter method.
              public QueryIndex(Collection<String> fields, QueryIndexType type) {...}
          
              // Creates index for multiple fields. 
              // Fields are passed in as a map, with field name as a key and sort order 
              // as a value (true for ascending). The value is ignored for non-sorted indexes.
              // 
              // This constructor is useful for sorted indexes, where it is necessary to specify
              // a separate sort order for each field.
              // 
              // This constructor should also have a corresponding setter method.
              public QueryIndex(LinkedHashMap<String, Boolean> fields, QueryIndexType type) {...}
          
              // Basic getters.
              public LinkedHashMap<String, Boolean> getFields();
              public QueryIndexType getIndexType();
              public List<String> getFieldNames();
              public boolean hasField(String field);
          
              // Returns null if field does not exist.
              public Boolean getSortOrder(String field);
          }
          
          enum QueryIndexType {
              SORTED, FULLTEXT, GEOSPATIAL
          }
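
For reference, the constructor set proposed above could be implemented roughly as follows. This is a minimal sketch under stated assumptions: setters are omitted, the fields are made final, the map constructor takes a defensive copy, and the collection constructor defaults every field to ascending. None of these details are fixed by the proposal.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;

enum QueryIndexType {
    SORTED, FULLTEXT, GEOSPATIAL
}

/** Sketch of the proposed QueryIndex: field name mapped to sort order (true for ascending). */
class QueryIndex {
    private final LinkedHashMap<String, Boolean> fields;
    private final QueryIndexType idxType;

    /** Index for one field, ascending by default. */
    QueryIndex(String field, QueryIndexType type) {
        this(field, type, true);
    }

    /** Index for one field. The sort order is stored but has no meaning for non-sorted indexes. */
    QueryIndex(String field, QueryIndexType type, boolean asc) {
        this.fields = new LinkedHashMap<>();
        this.fields.put(field, asc);
        this.idxType = type;
    }

    /** Index for multiple fields, all ascending by default. */
    QueryIndex(Collection<String> fields, QueryIndexType type) {
        this.fields = new LinkedHashMap<>();

        for (String field : fields)
            this.fields.put(field, true);

        this.idxType = type;
    }

    /** Index for multiple fields with a separate sort order per field. */
    QueryIndex(LinkedHashMap<String, Boolean> fields, QueryIndexType type) {
        // Defensive copy (an assumption) so later changes to the argument do not affect the index.
        this.fields = new LinkedHashMap<>(fields);
        this.idxType = type;
    }

    LinkedHashMap<String, Boolean> getFields() { return fields; }

    QueryIndexType getIndexType() { return idxType; }

    List<String> getFieldNames() { return new ArrayList<>(fields.keySet()); }

    boolean hasField(String field) { return fields.containsKey(field); }

    /** Returns null if the field does not exist. */
    Boolean getSortOrder(String field) { return fields.get(field); }
}
```

With these constructors, a single-field full-text index is new QueryIndex("description", QueryIndexType.FULLTEXT), and the caller never has to mention a sort order.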
          1. Dmitry, in the case of FULLTEXT and GEOSPATIAL indexes, there are no such settings as ASC and DESC.

            So, for such indexes, should I set NULL?

            Personally, I prefer Sergi's way.

            1. I think this can be handled with convenience setters and constructors as shown above.