Apache Kylin : Analytical Data Warehouse for Big Data

Kylin 4.0.0 Support Matrix


| Kylin Binary | Hadoop Distribution | Spark | Hadoop | Hive | Cluster Manager | Distributed Filesystem | Verified? | Comment |
|---|---|---|---|---|---|---|---|---|
| Kylin 4.0.0-spark2 | CDH 5.7 | 2.4.7 | 2.6.0-cdh5.7.6 | 1.1.0-cdh5.7.6 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark2 | HDP 2.4 | 2.4.7 | 2.7.1.2.4.0.0-16 | 1.2.1000.2.4.0.0-16 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark2 | AWS EMR 5.33.0 | 2.4.7 | 2.10.1-amzn-1 | 2.3.7-amzn-4 | YARN | HDFS/S3 | verified | |
| Kylin 4.0.0-spark2 | CDH 6.2.0 | 2.4.7 | 3.0.0-cdh6.2.0 | 2.1.1-cdh6.2.0 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark3 | AWS EMR 6.3.0 | 3.1.1 | 3.2.1-amzn-3 | 3.1.2-amzn-4 | YARN | HDFS/S3 | verified | |
| Kylin 4.0.0-spark3 | CDH 6.2.0 | 3.1.1 | 3.0.0-cdh6.2.0 | 2.1.1-cdh6.2.0 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark3 | Apache | 3.1.1 | 3.2.0 | 2.3.9 | YARN, Standalone | S3 | verified | http://kylin.apache.org/docs40/install/deploy_without_hadoop.html |


Note:

  1. Object storage such as S3 is not well tested and is tagged as an experimental feature; its performance is not as good as that of HDFS. It is therefore not recommended in production environments without a storage cache layer (such as Alluxio).
  2. When using Standalone as the cluster manager, Kylin 4.0.0 only supports client as the deployMode.
  3. Please configure kylin.engine.spark-conf.spark.sql.hive.metastore.version, kylin.engine.spark-conf.spark.sql.hive.metastore.jars, kylin.query.spark-conf.spark.sql.hive.metastore.version and kylin.query.spark-conf.spark.sql.hive.metastore.jars properly; see http://spark.apache.org/docs/latest/configuration.html or http://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html for details on how Spark connects to Hive.
  4. On some Hadoop platforms or custom Hadoop version combinations, you may still face class conflict issues. Some of them are related to Hive libs/jars. Please report them to the user mailing list to find a solution.
  5. In a Hadoop 3.X environment, you may find that Kylin does not print logger output into 'kylin.log' and that only part of it appears in 'kylin.out'. This is usually caused by SLF4J not working as expected; in that case you can find output like 'SLF4J: Class path contains multiple SLF4J bindings.' in 'kylin.out'. I suggest copying 'log4j-1.2.17.jar' and 'slf4j-log4j12-1.7.25.jar' (these two jars may be found under $SPARK_HOME/jars) into $KYLIN_HOME/ext and restarting the Kylin instance.
  6. Class conflicts may happen on some Hadoop platforms we have not tested. Some users have reported them; the related issues are KYLIN-5073 and KYLIN-5069 (ASF JIRA). If you face these problems, please check the comments under these issues before opening a new JIRA issue.
  7. If you face "java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(Lorg/apache/hadoop/hive/conf/HiveConf;)V" when using Hive 3.X, please check KYLIN-5084 (ASF JIRA) for a solution.
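
The manual steps in notes 3 and 5 can be sketched as two small shell helpers. This is only an illustration, not an official Kylin tool: the function names are hypothetical, the properties file path and the example values in the usage note are assumptions, and writing the same two Spark settings under both the kylin.engine.spark-conf and kylin.query.spark-conf prefixes is an assumption based on note 3.

```shell
#!/usr/bin/env bash
# Sketch only: helper functions for notes 3 and 5 above.
# Function names are hypothetical; paths and values are assumptions.

# Note 5: copy the log4j 1.x jar and its SLF4J binding from Spark into
# $KYLIN_HOME/ext. Fails if either jar is missing under <spark_home>/jars.
copy_logging_jars() {
  local spark_home="$1" kylin_home="$2" jar
  mkdir -p "$kylin_home/ext" || return 1
  for jar in log4j-1.2.17.jar slf4j-log4j12-1.7.25.jar; do
    if [ -f "$spark_home/jars/$jar" ]; then
      cp "$spark_home/jars/$jar" "$kylin_home/ext/" || return 1
    else
      echo "missing: $spark_home/jars/$jar" >&2
      return 1
    fi
  done
}

# Note 3: append the Hive metastore version and jars settings for both the
# build engine and the query engine to a properties file.
append_metastore_conf() {
  local props_file="$1" version="$2" jars="$3" prefix
  for prefix in kylin.engine.spark-conf kylin.query.spark-conf; do
    printf '%s.spark.sql.hive.metastore.version=%s\n' "$prefix" "$version"
    printf '%s.spark.sql.hive.metastore.jars=%s\n' "$prefix" "$jars"
  done >> "$props_file"
}
```

A possible invocation, with placeholder values that must be replaced by the Hive version and jar location of your own cluster, is `copy_logging_jars "$SPARK_HOME" "$KYLIN_HOME"` followed by `append_metastore_conf "$KYLIN_HOME/conf/kylin.properties" 1.2.1 '/path/to/hive/lib/*'`, and then a restart of the Kylin instance. Check the Spark documentation linked in note 3 for the metastore versions your Spark release accepts.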

Kylin 4.0.1 Support Matrix

| Kylin Binary | Hadoop Distribution | Spark | Hadoop | Hive | Cluster Manager | Distributed Filesystem | Verified? | Comment |
|---|---|---|---|---|---|---|---|---|
| Kylin 4.0.0-spark2 | CDH 5.7 | 2.4.7 | 2.6.0-cdh5.7.6 | 1.1.0-cdh5.7.6 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark2 | HDP 2.4 | 2.4.7 | 2.7.1.2.4.0.0-16 | 1.2.1000.2.4.0.0-16 | YARN | HDFS | verified | |
| Kylin 4.0.0-spark2 | AWS EMR 5.33.0 | 2.4.7 | 2.10.1-amzn-1 | 2.3.7-amzn-4 | YARN | HDFS/S3 | verified | Deploy Kylin 4 on AWS EMR |
| Kylin 4.0.0-spark2 | CDH 6.2.0 | 2.4.7 | 3.0.0-cdh6.2.0 | 2.1.1-cdh6.2.0 | YARN | HDFS | verified | Deploy Kylin 4 on CDH 6 |
| Kylin 4.0.0-spark3 | AWS EMR 6.3.0 | 3.1.1 | 3.2.1-amzn-3 | 3.1.2-amzn-4 | YARN | HDFS/S3 | verified | Deploy Kylin 4 on AWS EMR |
| Kylin 4.0.0-spark3 | CDH 6.2.0 | 3.1.1 | 3.0.0-cdh6.2.0 | 2.1.1-cdh6.2.0 | YARN | HDFS | verified | Deploy Kylin 4 on CDH 6 |
| Kylin 4.0.0-spark3 | Apache | 3.1.1 | 3.2.0 | 2.3.9 | YARN, Standalone | S3 | verified | http://kylin.apache.org/docs40/install/deploy_without_hadoop.html |