Apache Kylin : Analytical Data Warehouse for Big Data


Color legend:

Action | Color
Can be removed | Red
Needs discussion | Green
Recommended to keep | Black

Existing Kylin configuration items

Module | Property | Default Value | Optional Value | Description
Env

kylin.env

DEV

QA, PROD, LOCAL

The Kylin environment. DEV turns on some development features; QA and PROD do not differ functionally.

kylin.env.hdfs-working-dir

/kylin


Working folder in HDFS. Preferably a fully qualified absolute path; make sure the user has the right permissions on this directory.

kylin.env.zookeeper-base-path

/kylin


Base path for Kylin in ZooKeeper

kylin.env.zookeeper-is-local

false


Run a local Curator TestingServer instead of connecting to a remote ZooKeeper

kylin.env.zookeeper-connect-string

sandbox.hortonworks.com


URL of the remote ZooKeeper to connect to; kylin.env.zookeeper-is-local must be set to false.

kylin.env.hadoop-conf-dir

/etc/hadoop/conf


Hadoop configuration folder, exported as "HADOOP_CONF_DIR" when running spark-submit. It must contain the site XML files of core, YARN, Hive, and HBase in one folder.
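As a rough illustration of the requirement above, the sketch below checks that a single folder holds the four site XML files before the folder is exported as HADOOP_CONF_DIR. The file names (core-site.xml, yarn-site.xml, hive-site.xml, hbase-site.xml) are the conventional Hadoop, Hive, and HBase names and are an assumption here, since the table only says "site xmls". The helper names are hypothetical and not part of Kylin.

```python
import os
import tempfile

# Conventional site file names; assumed, the table only says "site xmls".
REQUIRED_SITE_FILES = ("core-site.xml", "yarn-site.xml", "hive-site.xml", "hbase-site.xml")


def missing_site_files(conf_dir):
    """Return the required site XML files that are absent from conf_dir."""
    return [name for name in REQUIRED_SITE_FILES
            if not os.path.isfile(os.path.join(conf_dir, name))]


def spark_submit_env(conf_dir, base_env=None):
    """Build an environment dict with HADOOP_CONF_DIR set; fail fast if files are missing."""
    missing = missing_site_files(conf_dir)
    if missing:
        raise FileNotFoundError("missing in %s: %s" % (conf_dir, ", ".join(missing)))
    env = dict(os.environ if base_env is None else base_env)
    env["HADOOP_CONF_DIR"] = conf_dir
    return env


if __name__ == "__main__":
    # Demo against a throwaway folder instead of /etc/hadoop/conf.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in REQUIRED_SITE_FILES[:3]:
            open(os.path.join(demo_dir, name), "w").close()
        print("still missing:", missing_site_files(demo_dir))
```

The returned dict could be passed as the `env` argument of `subprocess.run` when launching spark-submit; that wiring is left out because it depends on the local installation.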

kylin.env.hdfs-metastore-bigcell-dir

kylin.env.hdfs-metastore-bigcell-dir



kylin.env.zookeeper-base-sleep-time

3000



kylin.env.zookeeper.zk-auth

digest:ADMIN:KYLIN



kylin.env.zookeeper-acl-enabled

false








Job


kylin.job.zookeeper-monitor-interval

30



kylin.job.log-dir

/tmp/kylin/logs



kylin.job.use-remote-cli




kylin.job.remote-cli-port

22

Remote hadoop client port

kylin.job.remote-cli-hostname


Remote hadoop client host name

kylin.job.remote-cli-username


Remote hadoop client user name

kylin.job.remote-cli-password


Remote hadoop client password

kylin.job.remote-cli-working-dir




kylin.job.allow-empty-segment

true



kylin.job.max-concurrent-jobs

10



kylin.job.sampling-percentage

100



kylin.job.dependency-filter-list

"[^,]*hive-exec[^,]*?\\.jar" + "|" + "[^,]*hive-metastore[^,]*?\\.jar" + "|" + "[^,]*hive-hcatalog-core[^,]*?\\.jar"
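To make the default filter easier to read, the sketch below joins the three alternatives into one regular expression (with the Java string escapes removed) and applies it to a comma-separated jar list. The jar paths are made up for illustration, and how Kylin itself applies the filter is not described on this page, so treat the function as an assumption about usage rather than Kylin's implementation.

```python
import re

# The three alternatives from the default value, with the Java string escapes removed.
DEPENDENCY_FILTER = (
    r"[^,]*hive-exec[^,]*?\.jar"
    r"|[^,]*hive-metastore[^,]*?\.jar"
    r"|[^,]*hive-hcatalog-core[^,]*?\.jar"
)


def filter_dependencies(classpath):
    """Return the jar paths in a comma-separated list that match the filter."""
    return re.findall(DEPENDENCY_FILTER, classpath)


if __name__ == "__main__":
    # Hypothetical comma-separated jar list.
    jars = ",".join([
        "/opt/hive/lib/hive-exec-1.2.1.jar",
        "/opt/hive/lib/hive-metastore-1.2.1.jar",
        "/opt/hive/lib/guava-14.0.1.jar",
    ])
    for path in filter_dependencies(jars):
        print(path)
```

Because every alternative starts with `[^,]*`, a match never crosses a comma, so each result is one complete entry of the list.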



kylin.job.notification-enabled

false

Whether email notification is enabled

kylin.job.notification-mail-enable-starttls

false



kylin.job.notification-mail-port

25



kylin.job.notification-mail-host




kylin.job.notification-mail-username




kylin.job.notification-mail-password




kylin.job.notification-mail-sender




kylin.job.notification-admin-emails

null



kylin.job.retry

0



kylin.job.retry-interval

30000

Retry interval, in milliseconds
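The two retry settings above can be pictured with a small retry loop. This is only a sketch under assumptions: it treats kylin.job.retry as the number of extra attempts and kylin.job.retry-interval as the wait between attempts, and it ignores kylin.job.retry-exception-classes. The function name and parameters are hypothetical.

```python
import time


def run_with_retry(step, retry=0, retry_interval_ms=30000, sleep=time.sleep):
    """Run step(); on failure retry up to `retry` more times, waiting between attempts.

    Defaults mirror the table: kylin.job.retry=0, kylin.job.retry-interval=30000 ms.
    """
    attempt = 0
    while True:
        try:
            return step()
        except Exception:
            if attempt >= retry:
                raise  # retries exhausted, surface the last error
            attempt += 1
            sleep(retry_interval_ms / 1000.0)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_step():
        # Fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("transient failure")
        return "finished after %d calls" % state["calls"]

    print(run_with_retry(flaky_step, retry=2, retry_interval_ms=10))
```

With the default of 0 retries the first failure is raised immediately, which matches a configuration where retrying is effectively off.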

kylin.job.retry-exception-classes




kylin.job.sampling-hll-precision

14



kylin.job.scheduler.default

0



kylin.job.scheduler.priority-considered

false



kylin.job.scheduler.priority-bar-fetch-from-queue

20



kylin.job.scheduler.poll-interval-second

30



kylin.job.scheduler.safemode

false



kylin.job.scheduler.safemode.runnable-projects




kylin.job.error-record-threshold

0



kylin.job.use-advanced-flat-table

false



kylin.job.advanced-flat-table.class




kylin.job.tracking-url-pattern




kylin.job.metadata-persist-retry

5



kylin.job.cube-auto-ready-enabled

true



kylin.job.cube-inmem-builder-class

org.apache.kylin.cube.inmemcubing.DoggedCubeBuilder



kylin.job.execute-output.max-size

10485760








Metadata


kylin.metadata.url

kylin_metadata@hbase


The metadata store, in HBase

kylin.metadata.sync-retries

3


Number of retries for metadata cache sync

kylin.metadata.sync-error-handler


MockupErrHandler







Server


kylin.server.mode

all

query, job

Kylin server mode; valid values are all, query, and job

kylin.server.cluster-servers

localhost:7070


List of web servers in use; this enables one web server instance to sync up with the other servers.

kylin.server.query-metrics-percentiles-intervals

60, 300, 3600


Percentile intervals for Kylin query metrics; the default is 60, 300, 3600.

kylin.server.host-address

localhost:7070


REST address of this server

kylin.server.cluster-name




kylin.server.init-tasks




kylin.server.sequence-sql.workers-per-server

1



kylin.server.sequence-sql.expire-time

86400000



kylin.server.query-metrics-enabled

false



kylin.server.query-metrics2-enabled

false



kylin.server.auth-user-cache.expire-seconds

300



kylin.server.auth-user-cache.max-entries

100



kylin.server.external-acl-provider




kylin.server.cluster-servers-with-mode









Storage


kylin.storage.hbase.owner-tag

whoami@kylin.apache.org


Optional information about the owner of the Kylin platform, such as your team's email address. Currently it is attached to each Kylin HTable as an attribute.

kylin.storage.url

hbase


The storage for the final cube files, in HBase

kylin.storage.hbase.cluster-fs



File system of the cluster that serves HBase, in the format hdfs://hbase-cluster:8020. Leave empty if HBase runs on the same cluster as Hive and MapReduce.

kylin.storage.hbase.compression-codec

gzip


Default compression codec for HTables; options include snappy, lzo, gzip, and lz4

kylin.storage.hbase.namespace

default



kylin.storage.hbase.cluster-hdfs-config-file




kylin.storage.hbase.region-cut-gb

0.1


The cut size for HBase regions, in GB. For example, for a cube whose capacity is marked as "SMALL", regions are split per 10 GB by default.

kylin.storage.hbase.min-region-count

1



kylin.storage.hbase.max-region-count

500
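The three region settings (region-cut-gb, min-region-count, max-region-count) suggest a simple sizing rule. The sketch below assumes one region per region-cut-gb of cube data, rounded up and clamped to the minimum and maximum counts. That rule is an assumption made for illustration and is not stated on this page. The function name is hypothetical, and 5.0 is used as the cut size because it is one of the two defaults listed in this table.

```python
import math


def estimate_region_count(cube_size_gb, region_cut_gb=5.0, min_regions=1, max_regions=500):
    """Assumed rule: one region per region_cut_gb of data, clamped to [min, max]."""
    if region_cut_gb <= 0:
        raise ValueError("region_cut_gb must be positive")
    wanted = math.ceil(cube_size_gb / region_cut_gb)
    return max(min_regions, min(max_regions, wanted))


if __name__ == "__main__":
    for size_gb in (0.5, 12, 10000):
        print(size_gb, "GB ->", estimate_region_count(size_gb), "regions")
```

Under this assumption a very small cube still gets min-region-count regions, and a very large one is capped at max-region-count.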



kylin.storage.hbase.hfile-size-gb

2


The HFile size, in GB. Smaller HFiles give the HFile-conversion MR job more reducers and make it faster. Set to 0 to disable this optimization.

kylin.storage.hbase.run-local-coprocessor

false



kylin.storage.hbase.coprocessor-mem-gb

3.0



kylin.storage.partition.aggr-spill-enabled

true



kylin.storage.partition.max-scan-bytes

3L * 1024 * 1024 * 1024



kylin.storage.hbase.coprocessor-timeout-seconds

0



kylin.storage.hbase.max-fuzzykey-scan

200



kylin.storage.hbase.max-fuzzykey-scan-split

1



kylin.storage.hbase.max-visit-scanrange

1000000



kylin.storage.hbase.gtstorage

org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC



kylin.storage.hbase.scan-cache-rows

1024



kylin.storage.hbase.region-cut-gb

5.0



kylin.storage.hbase.max-scan-result-bytes

5 * 1024 * 1024



kylin.storage.hbase.compression-codec

none



kylin.storage.hbase.rowkey-encoding

FAST_DIFF



kylin.storage.hbase.block-size-bytes

1048576



kylin.storage.hbase.small-family-block-size-bytes

65536



kylin.storage.hbase.endpoint-compress-result

true



kylin.storage.hbase.max-hconnection-threads

2048



kylin.storage.hbase.core-hconnection-threads

2048



kylin.storage.hbase.hconnection-threads-alive-seconds

60



kylin.storage.hbase.replication-scope

0



kylin.storage.clean-after-delete-operation

false



kylin.storage.project-isolation-enable

true



kylin.storage.limit-push-down-enabled

true



kylin.storage.default

2



kylin.storage.hbase.table-name-prefix

KYLIN_








Web











kylin.web.timezone

GMT+8


Time zone displayed in the UI, in the format GMT+N or GMT-N
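For readers who need to handle this value in a script, the sketch below turns a GMT+N or GMT-N string into a fixed-offset time zone. It assumes N is a whole number of hours, as the format description suggests. The function name is hypothetical and this is not how the Kylin UI parses the value.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_gmt_offset(value):
    """Turn 'GMT+N' or 'GMT-N' into a datetime.timezone with an N-hour offset."""
    text = value.strip()
    match = re.fullmatch(r"GMT([+-])(\d{1,2})", text)
    if match is None:
        raise ValueError("expected GMT+N or GMT-N, got %r" % value)
    hours = int(match.group(2))
    if match.group(1) == "-":
        hours = -hours
    return timezone(timedelta(hours=hours), name=text)


if __name__ == "__main__":
    display_zone = parse_gmt_offset("GMT+8")  # the default from the table
    utc_midnight = datetime(2020, 1, 1, tzinfo=timezone.utc)
    print(utc_midnight.astimezone(display_zone).isoformat())
```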

kylin.web.help.length

4



kylin.web.help.0

start|Getting Started|http://kylin.apache.org/docs/tutorial/kylin_sample.html



kylin.web.help.1

odbc|ODBC Driver|http://kylin.apache.org/docs/tutorial/odbc.html



kylin.web.help.2

tableau|Tableau Guide|http://kylin.apache.org/docs/tutorial/tableau_91.html



kylin.web.help.3

onboard|Cube Design Tutorial|http://kylin.apache.org/docs/howto/howto_optimize_cubes.html
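The help entries above follow a visible pattern: kylin.web.help.length gives the number of entries, and each kylin.web.help.N value has three fields separated by `|`. The sketch below reads them from a plain dict. The field names key, title, and url are labels chosen here for illustration, and the function is hypothetical rather than Kylin code.

```python
from collections import namedtuple

# Field names are illustrative labels for the three pipe-separated parts.
HelpLink = namedtuple("HelpLink", ["key", "title", "url"])


def parse_help_links(props):
    """Read kylin.web.help.length, then kylin.web.help.0 .. N-1, into HelpLink tuples."""
    count = int(props.get("kylin.web.help.length", 0))
    links = []
    for index in range(count):
        raw = props.get("kylin.web.help.%d" % index)
        if not raw:
            continue  # skip entries that are not set
        key, title, url = raw.split("|", 2)
        links.append(HelpLink(key, title, url))
    return links


if __name__ == "__main__":
    props = {
        "kylin.web.help.length": "2",
        "kylin.web.help.0": "start|Getting Started|http://kylin.apache.org/docs/tutorial/kylin_sample.html",
        "kylin.web.help.1": "odbc|ODBC Driver|http://kylin.apache.org/docs/tutorial/odbc.html",
    }
    for link in parse_help_links(props):
        print(link.key, "->", link.title)
```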



kylin.web.export-allow-admin

true



kylin.web.export-allow-other

true



kylin.web.hide-measures

RAW



kylin.web.default-time-filter

0



kylin.web.cross-domain-enabled

true



kylin.web.dashboard-enabled

false









kylin.engine.default

2



MapReduce


kylin.engine.mr.config-override.




kylin.engine.mr.mem-hungry-config-override.




kylin.engine.mr.uhc-config-override.




kylin.engine.mr.base-cuboid-config-override.




kylin.engine.mr.reduce-input-mb

500



kylin.engine.mr.reduce-count-ratio

1.0



kylin.engine.mr.min-reducer-number

1



kylin.engine.mr.max-reducer-number

500



kylin.engine.mr.mapper-input-rows

1000000



kylin.engine.mr.max-cuboid-stats-calculator-number

1


Set to 1 to disable multi-threaded statistics calculation

kylin.engine.mr.cuboid-number-per-stats-calculator

100



kylin.engine.mr.per-reducer-hll-cuboid-number

100



kylin.engine.mr.hll-max-reducer-number

1



kylin.engine.mr.build-dict-in-reducer

true



kylin.engine.mr.yarn-check-status-url

null



kylin.engine.mr.yarn-check-interval-seconds

10



kylin.engine.mr.use-local-classpath

true



kylin.engine.mr.uhc-config-override.mapreduce.reduce.memory.mb

500



kylin.engine.mr.uhc-config-override.mapred.reduce.child.java.opts

-Xmx400M



kylin.engine.mr.yarn-check-interval-seconds

10


Interval, in seconds, for checking Hadoop job status

kylin.engine.mr.max-reducer-number

5



kylin.engine.mr.uhc-reducer-count

3


For testing

kylin.engine.mr.lib-dir









Spark












kylin.engine.spark-conf.



getSparkConfigOverride

kylin.engine.spark.additional-jars




kylin.engine.spark.rdd-partition-cut-mb

10.0



kylin.engine.spark.min-partition

1



kylin.engine.spark.max-partition

5000



kylin.engine.spark.storage-level

MEMORY_AND_DISK_SER



kylin.engine.spark.sanity-check-enabled

false



kylin.engine.spark-fact-distinct

false



kylin.engine.spark-uhc-dictionary

false



kylin.engine.spark-cardinality

false



kylin.engine.spark.output.max-size

10485760



kylin.engine.spark-dimension-dictionary

false



kylin.engine.spark-create-table-enabled

false








Flink





kylin.engine.flink-conf.




kylin.engine.flink.additional-jars




kylin.engine.flink.partition-cut-mb

10.0



kylin.engine.flink.min-partition

1



kylin.engine.flink.max-partition

5000



kylin.engine.flink.sanity-check-enabled

false








Livy





kylin.engine.livy-conf.livy-enabled

false



kylin.engine.livy.backtick.quote




kylin.engine.livy-conf.livy-url




kylin.engine.livy-conf.livy-key.




kylin.engine.livy-conf.livy-arr.




kylin.engine.livy-conf.livy-map.









Query


kylin.query.enable-dict-enumerator

false



kylin.query.calcite.enumerable-rules-enabled

false



kylin.query.calcite.reduce-rules-enabled

true



kylin.query.convert-create-table-to-with

false



kylin.query.calcite.extras-props.




kylin.query.calcite.add-rule




kylin.query.calcite.remove-rule




kylin.query.enable-dynamic-column

false



kylin.query.skip-empty-segments

true

Used by the segment pruner

kylin.query.disable-cube-noagg-sql

false



kylin.query.stream-aggregate-enabled

true



kylin.query.max-limit-pushdown

10000



kylin.query.force-limit

-1



kylin.query.scan-threshold

10000000



kylin.query.lazy-query-enabled

false



kylin.query.lazy-query-waiting-timeout-milliseconds

60000



kylin.query.project-concurrent-running-threshold

0



kylin.query.max-scan-bytes

0



kylin.query.max-return-rows

5000000



kylin.query.translated-in-clause-max-size

1024 * 1024



kylin.query.derived-filter-translation-threshold

20



kylin.query.badquery-stacktrace-depth

10



kylin.query.badquery-history-number

50



kylin.query.badquery-alerting-seconds

90



kylin.query.timeout-seconds-coefficient

0.5



kylin.query.badquery-persistent-enabled

true



kylin.query.transformers




kylin.query.cache-threshold-duration

2000



kylin.query.cache-threshold-scan-count

10240



kylin.query.cache-threshold-scan-bytes

1024 * 1024



kylin.query.security-enabled

true



kylin.query.cache-enabled

true



kylin.query.ignore-unknown-function

false



kylin.cache.memcached.hosts




kylin.query.segment-cache-enabled

false



kylin.query.segment-cache-timeout

2000



kylin.query.segment-cache-max-size

200



kylin.query.access-controller

null



kylin.query.statement-cache-max-num

50000



kylin.query.statement-cache-max-num-per-key

50



kylin.query.statement-cache-enabled

true



kylin.query.max-dimension-count-distinct

5000000



kylin.query.timeout-seconds

0



kylin.query.pushdown.enabled

false



kylin.query.pushdown.update-enabled

false



kylin.query.schema-factory

org.apache.kylin.query.schema.OLAPSchemaFactory



kylin.query.pushdown.runner-class-name




kylin.query.pushdown.runner.ids




kylin.query.pushdown.converter-class-names

{org.apache.kylin.source.adhocquery.HivePushDownConverter}



kylin.query.pushdown.cache-enabled

false



kylin.query.pushdown.jdbc.url




kylin.query.pushdown.jdbc.driver




kylin.query.pushdown.jdbc.username




kylin.query.pushdown.jdbc.password




kylin.query.pushdown.jdbc.pool-max-total

0



kylin.query.pushdown.jdbc.pool-max-idle

8



kylin.query.pushdown.jdbc.pool-min-idle

0



kylin.query.security.table-acl-enabled

true



kylin.query.escape-default-keyword

false



kylin.query.realization-filter

null



kylin.query.signature-class

org.apache.kylin.rest.signature.FactTableRealizationSetCalculator



kylin.query.cache-signature-enabled

false



kylin.query.flat-filter-max-children

500000








Newly added configurations



Module | Property | Default Value | Optional Value | Description

kylin.engine.spark.build-class-name

org.apache.kylin.engine.spark.job.CubeBuildJob




kylin.engine.spark.task-impact-instance-enabled

true


Whether to calculate cpu_cores for the cube build job. If so, executor_instance = cpu_cores / executor_cores (executor_cores comes from kylin.properties, or the default is used).
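The formula in the description can be written out as a tiny helper. The integer division and the floor of one executor are assumptions added here so the result is always a usable instance count; the page only gives executor_instance = cpu_cores / executor_cores. The function name is hypothetical.

```python
def executor_instances(cpu_cores, executor_cores):
    """executor_instance = cpu_cores / executor_cores, per the description above.

    Integer division and a minimum of one executor are assumptions of this sketch.
    """
    if executor_cores <= 0:
        raise ValueError("executor_cores must be positive")
    return max(1, cpu_cores // executor_cores)


if __name__ == "__main__":
    # Example numbers only; real values come from the job and kylin.properties.
    print(executor_instances(cpu_cores=20, executor_cores=4))
```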

kylin.engine.spark.task-core-factor

3


The factor to calculate number of cores

kylin.engine.driver-memory-base

1024




kylin.engine.driver-memory-strategy

{"2", "20", "100" }


Automatically adjusts the driver memory


kylin.engine.driver-memory-maximum

4096




kylin.engine.persist-flattable-threshold

1


Unclear; to be explained later.

kylin.engine.spark.cluster-info-fetcher-class-name

org.apache.kylin.cluster.YarnInfoFetcher




kylin.engine.spark.merge-class-name

org.apache.kylin.engine.spark.job.CubeMergeJob




kylin.engine.max-retry-time

3


Automatically retries a failed job, depending on the exception information

kylin.engine.retry-memory-gradient

1.5




kylin.engine.retry-overheadMemory-gradient

0.2




kylin.engine.max-allocation-proportion

0.9




kylin.engine.base-executor-instance

5




kylin.engine.executor-instance-strategy

100,2,500,3,1000,4




kylin.engine.submit-hadoop-conf-dir



Kind of redundant






kylin.snapshot.parallel-build-enabled

true




kylin.snapshot.parallel-build-timeout-seconds

3600




kylin.snapshot.shard-size-mb

128









kylin.storage.provider

org.apache.kylin.common.storage.DefaultStorageProvider









kylin.storage.columnar.shard-size-mb

128


Not used now

kylin.storage.columnar.shard-rowcount

2500000


Not used now

kylin.storage.columnar.shard-countdistinct-rowcount

1000000


Not used now

kylin.storage.columnar.repartition-threshold-size-mb

128


Not used now

kylin.storage.columnar.shard-min

1


Not used now

kylin.storage.columnar.shard-max

1000


Not used now

kylin.storage.columnar.hdfs-blocksize-bytes

5 * shard_size


Not used now

kylin.storage.columnar.shard-expand-factor

10


Not used now

kylin.storage.columnar.dfs-replication

3


Not used now

kylin.spark-conf.auto.prior

true


Whether to automatically adjust the Spark configuration






kylin.job.log-print-enabled

true









kylin.query.spark-engine.join-memory-fraction

0.3


Fraction of driver memory that joins (mostly BHJ, broadcast hash join) can use


kylin.query.spark-engine.enabled

true




kylin.query.spark-engine.partition-split-size-mb

64




kylin.query.spark-engine.expose-sharding-trait

true




kylin.query.spark-engine.spark-sql-shuffle-partitions

-1




kylin.query.spark-conf.



Spark configuration overrides

kylin.query.engine.sparder-additional-files





kylin.query.engine.sparder-additional-jars





kylin.query.pushdown.auto-set-shuffle-partitions-enabled

true




kylin.query.pushdown.base-shuffle-partition-size

48




kylin.query.intersect.separator

|









kylin.kerberos.enabled

false




kylin.kerberos.keytab





kylin.kerberos.zookeeper.server.principal

zookeeper/hadoop




kylin.kerberos.ticket.refresh.interval.minutes

720




kylin.kerberos.monitor.interval.minutes

10




kylin.kerberos.platform





kylin.platform.zk.kerberos.enable





kylin.kerberos.krb5.conf

krb5.conf




kylin.kerberos.jaas.conf

jaas.conf




kylin.kerberos.principal









Configurations to be removed

Module | Property | Description








