Apache Kylin : Analytical Data Warehouse for Big Data


Color legend:

Action | Color
Can be removed | Red
Needs discussion | Green
Recommended to keep | Black


Module | Property | Default Value | Optional Value | Description
Env

kylin.env

DEV

QA, PROD, LOCAL

The Kylin environment. DEV turns on some development features; QA and PROD do not differ functionally.

kylin.env.hdfs-working-dir

/kylin


Working folder in HDFS. Preferably a fully qualified absolute path; make sure the user has the right permissions on this directory.

kylin.env.zookeeper-base-path

/kylin


Base path for Kylin in ZooKeeper

kylin.env.zookeeper-is-local

false


Run a TestingServer for Curator locally

kylin.env.zookeeper-connect-string

sandbox.hortonworks.com


Connect to a remote ZooKeeper at this URL; kylin.env.zookeeper-is-local should be set to false

kylin.env.hadoop-conf-dir

/etc/hadoop/conf


Hadoop conf folder, exported as "HADOOP_CONF_DIR" to run spark-submit. It must contain the site XML files for core, yarn, hive, and hbase in one folder.
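The requirement above (site XML files for core, yarn, hive, and hbase in one folder) can be checked before starting Kylin. The following is a minimal sketch, not part of Kylin: the helper name `missing_site_files` is hypothetical, and the file names assume the conventional Hadoop naming (`core-site.xml`, `yarn-site.xml`, `hive-site.xml`, `hbase-site.xml`).

```python
from pathlib import Path

# Assumption: conventional site file names for the four components
# named in the description (core, yarn, hive, hbase).
REQUIRED_SITE_FILES = (
    "core-site.xml",
    "yarn-site.xml",
    "hive-site.xml",
    "hbase-site.xml",
)


def missing_site_files(conf_dir):
    """Return the required site XML files that are absent from conf_dir."""
    root = Path(conf_dir)
    return [name for name in REQUIRED_SITE_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # /etc/hadoop/conf is the default listed for kylin.env.hadoop-conf-dir.
    missing = missing_site_files("/etc/hadoop/conf")
    print("missing:", missing or "none")
```

An empty result suggests the folder is usable as `kylin.env.hadoop-conf-dir`; any listed file would need to be copied or linked into the folder first.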

kylin.env.hdfs-metastore-bigcell-dir

kylin.env.hdfs-metastore-bigcell-dir



kylin.env.zookeeper-base-sleep-time

3000



kylin.env.zookeeper.zk-auth

digest:ADMIN:KYLIN



kylin.env.zookeeper-acl-enabled

false








Job











































































kylin.job.zookeeper-monitor-interval

30



kylin.job.log-dir

/tmp/kylin/logs



kylin.job.use-remote-cli




kylin.job.remote-cli-port

22



kylin.job.remote-cli-hostname




kylin.job.remote-cli-username




kylin.job.remote-cli-password




kylin.job.remote-cli-working-dir




kylin.job.allow-empty-segment

true



kylin.job.max-concurrent-jobs

10



kylin.job.sampling-percentage

100



kylin.job.dependency-filter-list

"[^,]*hive-exec[^,]*?\\.jar" + "|" + "[^,]*hive-metastore[^,]*?\\.jar" + "|" + "[^,]*hive-hcatalog-core[^,]*?\\.jar"



kylin.job.notification-enabled

false



kylin.job.notification-mail-enable-starttls

false



kylin.job.notification-mail-port

25



kylin.job.notification-mail-host




kylin.job.notification-mail-username




kylin.job.notification-mail-password




kylin.job.notification-mail-sender




kylin.job.notification-admin-emails

null



kylin.job.retry

0



kylin.job.retry-interval

30000



kylin.job.retry-exception-classes




kylin.job.sampling-hll-precision

14



kylin.job.scheduler.default

0



kylin.job.scheduler.priority-considered

false



kylin.job.scheduler.priority-bar-fetch-from-queue

20



kylin.job.scheduler.poll-interval-second

30



kylin.job.scheduler.safemode

false



kylin.job.scheduler.safemode.runnable-projects




kylin.job.error-record-threshold

0



kylin.job.use-advanced-flat-table

false



kylin.job.advanced-flat-table.class




kylin.job.tracking-url-pattern




kylin.job.metadata-persist-retry

5



kylin.job.cube-auto-ready-enabled

true



kylin.job.cube-inmem-builder-class

org.apache.kylin.cube.inmemcubing.DoggedCubeBuilder



kylin.job.execute-output.max-size

10485760








Metadata


kylin.metadata.url

kylin_metadata@hbase


The metadata store in HBase

kylin.metadata.sync-retries

3


Number of retries for metadata cache sync

kylin.metadata.sync-error-handler


MockupErrHandler







Server













kylin.server.mode

all

query, job

Kylin server mode; valid values: all, query, job

kylin.server.cluster-servers

localhost:7070


List of web servers in use; this enables one web server instance to sync up with other servers.

kylin.server.query-metrics-percentiles-intervals

60, 360, 3600


Kylin query metrics percentiles intervals; default=60, 300, 3600

kylin.server.host-address

localhost:7070


REST address of this server

kylin.server.cluster-name




kylin.server.init-tasks




kylin.server.sequence-sql.workers-per-server

1



kylin.server.sequence-sql.expire-time

86400000



kylin.server.query-metrics-enabled

false



kylin.server.query-metrics2-enabled

false



kylin.server.auth-user-cache.expire-seconds

300



kylin.server.auth-user-cache.max-entries

100



kylin.server.external-acl-provider




kylin.server.cluster-servers-with-mode









Storage





































































kylin.storage.hbase.owner-tag

whoami@kylin.apache.org


Optional information about the owner of the Kylin platform; it can be your team's email. Currently it is attached to each Kylin HTable as an attribute.

kylin.storage.url

hbase


The storage for final cube files in HBase

kylin.storage.hbase.cluster-fs



HBase cluster FileSystem, which serves HBase, in the format hdfs://hbase-cluster:8020. Leave empty if HBase runs on the same cluster as Hive and MapReduce.

kylin.storage.hbase.compression-codec

gzip


Default compression codec for HTable: snappy, lzo, gzip, lz4

kylin.storage.hbase.namespace

default



kylin.storage.hbase.cluster-hdfs-config-file




kylin.storage.hbase.region-cut-gb

0.1


The cut size for an HBase region, in GB. E.g., for a cube whose capacity is marked as "SMALL", split regions per 10GB by default

kylin.storage.hbase.min-region-count

1



kylin.storage.hbase.max-region-count

500
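To illustrate how the region cut size and the min/max region counts interact, here is a minimal sketch. It assumes the region count is the cube size divided by the cut size, rounded up and clamped to the configured bounds; Kylin's actual calculation may round or weight differently. The function name `estimate_region_count` is hypothetical, and the defaults are taken from values listed on this page (5.0 is one of the two region-cut-gb defaults shown).

```python
import math


def estimate_region_count(cube_size_gb, region_cut_gb=5.0,
                          min_regions=1, max_regions=500):
    """Hypothetical helper: split by cut size, then clamp to [min, max].

    This is an illustration of the three settings, not Kylin's own code.
    """
    if region_cut_gb <= 0:
        raise ValueError("region_cut_gb must be positive")
    raw = math.ceil(cube_size_gb / region_cut_gb)
    return max(min_regions, min(max_regions, raw))


if __name__ == "__main__":
    for size_gb in (0.5, 23, 10000):
        print(size_gb, "GB ->", estimate_region_count(size_gb), "regions")
```

Under these assumptions a very small cube still gets `min_regions`, and a very large one is capped at `max_regions` regardless of the cut size.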



kylin.storage.hbase.hfile-size-gb

2


The HFile size in GB. A smaller HFile gives the HFile-conversion MR job more reducers and makes it faster. Set to 0 to disable this optimization.

kylin.storage.hbase.run-local-coprocessor

false



kylin.storage.hbase.coprocessor-mem-gb

3.0



kylin.storage.partition.aggr-spill-enabled

true



kylin.storage.partition.max-scan-bytes

3L * 1024 * 1024 * 1024



kylin.storage.hbase.coprocessor-timeout-seconds

0



kylin.storage.hbase.max-fuzzykey-scan

200



kylin.storage.hbase.max-fuzzykey-scan-split

1



kylin.storage.hbase.max-visit-scanrange

1000000



kylin.storage.hbase.gtstorage

org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC



kylin.storage.hbase.scan-cache-rows

1024



kylin.storage.hbase.region-cut-gb

5.0



kylin.storage.hbase.max-scan-result-bytes

5 * 1024 * 1024



kylin.storage.hbase.compression-codec

none



kylin.storage.hbase.rowkey-encoding

FAST_DIFF



kylin.storage.hbase.block-size-bytes

1048576



kylin.storage.hbase.small-family-block-size-bytes

65536



kylin.storage.hbase.endpoint-compress-result

true



kylin.storage.hbase.max-hconnection-threads

2048



kylin.storage.hbase.core-hconnection-threads

2048



kylin.storage.hbase.hconnection-threads-alive-seconds

60



kylin.storage.hbase.replication-scope

0



kylin.storage.clean-after-delete-operation

false



kylin.storage.project-isolation-enable

true



kylin.storage.limit-push-down-enabled

true



kylin.storage.default

2



kylin.storage.hbase.table-name-prefix

KYLIN_



kylin.storage.project-isolation-enable

true








Web











kylin.web.timezone

GMT+8


Display timezone on the UI; format like GMT+N or GMT-N
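The GMT+N / GMT-N format can be validated before the value is written to the configuration. This is a minimal sketch, not Kylin code: the helper name `parse_gmt_offset` and the 14-hour upper bound are assumptions made for illustration.

```python
import re

# Matches the documented forms GMT+N and GMT-N, with N as one or two digits.
_GMT_OFFSET = re.compile(r"GMT([+-])(\d{1,2})")


def parse_gmt_offset(value):
    """Return the signed hour offset for 'GMT+N' / 'GMT-N', or None if malformed."""
    match = _GMT_OFFSET.fullmatch(value.strip())
    if match is None:
        return None
    hours = int(match.group(2))
    if hours > 14:  # assumption: no real-world UTC offset exceeds 14 hours
        return None
    return hours if match.group(1) == "+" else -hours


if __name__ == "__main__":
    for candidate in ("GMT+8", "GMT-5", "UTC+8"):
        print(candidate, "->", parse_gmt_offset(candidate))
```

A `None` result signals a value that does not follow the documented format, such as `UTC+8`.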

kylin.web.help.length

4



kylin.web.help.0

start|Getting Started|http://kylin.apache.org/docs/tutorial/kylin_sample.html



kylin.web.help.1

odbc|ODBC Driver|http://kylin.apache.org/docs/tutorial/odbc.html



kylin.web.help.2

tableau|Tableau Guide|http://kylin.apache.org/docs/tutorial/tableau_91.html



kylin.web.help.3

onboard|Cube Design Tutorial|http://kylin.apache.org/docs/howto/howto_optimize_cubes.html
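The help entries follow a key|label|URL pattern, indexed from 0 up to kylin.web.help.length. The sketch below shows one way such entries could be read from an already loaded property map. The function name `parse_help_links` is hypothetical, and skipping missing indexes is an assumption, not documented Kylin behavior.

```python
def parse_help_links(props):
    """Collect kylin.web.help.N entries into (key, label, url) tuples."""
    count = int(props.get("kylin.web.help.length", 0))
    links = []
    for i in range(count):
        raw = props.get(f"kylin.web.help.{i}")
        if raw is None:
            continue  # assumption: tolerate gaps rather than fail
        # Split on the first two separators only, so the URL stays intact.
        key, label, url = raw.split("|", 2)
        links.append((key, label, url))
    return links


if __name__ == "__main__":
    sample = {
        "kylin.web.help.length": "2",
        "kylin.web.help.0": "start|Getting Started|http://kylin.apache.org/docs/tutorial/kylin_sample.html",
        "kylin.web.help.1": "odbc|ODBC Driver|http://kylin.apache.org/docs/tutorial/odbc.html",
    }
    for key, label, url in parse_help_links(sample):
        print(key, label, url)
```

Entries beyond kylin.web.help.length are ignored in this sketch, so the length value has to be raised when a new link is added.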



kylin.web.export-allow-admin

true



kylin.web.export-allow-other

true



kylin.web.hide-measures

RAW



kylin.web.default-time-filter

0



kylin.web.cross-domain-enabled

true



kylin.web.dashboard-enabled

false









kylin.engine.default

2



MapReduce






















kylin.engine.mr.config-override.




kylin.engine.mr.mem-hungry-config-override.




kylin.engine.mr.uhc-config-override.




kylin.engine.mr.base-cuboid-config-override.




kylin.engine.mr.reduce-input-mb

500



kylin.engine.mr.reduce-count-ratio

1.0



kylin.engine.mr.min-reducer-number

1



kylin.engine.mr.max-reducer-number

500



kylin.engine.mr.mapper-input-rows

1000000



kylin.engine.mr.max-cuboid-stats-calculator-number

1


Set to 1 to disable multi-threaded statistics calculation

kylin.engine.mr.cuboid-number-per-stats-calculator

100



kylin.engine.mr.per-reducer-hll-cuboid-number

100



kylin.engine.mr.hll-max-reducer-number

1



kylin.engine.mr.build-dict-in-reducer

true



kylin.engine.mr.yarn-check-status-url

null



kylin.engine.mr.yarn-check-interval-seconds

10



kylin.engine.mr.use-local-classpath

true



kylin.engine.mr.uhc-config-override.mapreduce.reduce.memory.mb

500



kylin.engine.mr.uhc-config-override.mapred.reduce.child.java.opts

-Xmx400M



kylin.engine.mr.yarn-check-interval-seconds

10


Interval, in seconds, for checking Hadoop job status

kylin.engine.mr.max-reducer-number

5



kylin.engine.mr.uhc-reducer-count

3


for test

kylin.engine.mr.lib-dir









Spark












kylin.engine.spark-conf.



getSparkConfigOverride

kylin.engine.spark.additional-jars




kylin.engine.spark.rdd-partition-cut-mb

10.0



kylin.engine.spark.min-partition

1



kylin.engine.spark.max-partition

5000



kylin.engine.spark.storage-level

MEMORY_AND_DISK_SER



kylin.engine.spark.sanity-check-enabled

false



kylin.engine.spark-fact-distinct

false



kylin.engine.spark-uhc-dictionary

false



kylin.engine.spark-cardinality

false



kylin.engine.spark.output.max-size

10485760



kylin.engine.spark-dimension-dictionary

false



kylin.engine.spark-create-table-enabled

false








Flink





kylin.engine.flink-conf.




kylin.engine.flink.additional-jars




kylin.engine.flink.partition-cut-mb

10.0



kylin.engine.flink.min-partition

1



kylin.engine.flink.max-partition

5000



kylin.engine.flink.sanity-check-enabled

false








Livy





kylin.engine.livy-conf.livy-enabled

false



kylin.engine.livy.backtick.quote




kylin.engine.livy-conf.livy-url




kylin.engine.livy-conf.livy-key.




kylin.engine.livy-conf.livy-arr.




kylin.engine.livy-conf.livy-map.

kylin.engine.default
2













Query































































kylin.query.enable-dict-enumerator

false



kylin.query.calcite.enumerable-rules-enabled

false



kylin.query.calcite.reduce-rules-enabled

true



kylin.query.convert-create-table-to-with

false



kylin.query.calcite.extras-props.




kylin.query.calcite.add-rule




kylin.query.calcite.remove-rule




kylin.query.enable-dynamic-column

false



kylin.query.skip-empty-segments

true



kylin.query.disable-cube-noagg-sql

false



kylin.query.stream-aggregate-enabled

true



kylin.query.max-limit-pushdown

10000



kylin.query.force-limit

-1



kylin.query.scan-threshold

10000000



kylin.query.lazy-query-enabled

false



kylin.query.lazy-query-waiting-timeout-milliseconds

60000



kylin.query.project-concurrent-running-threshold

0



kylin.query.max-scan-bytes

0



kylin.query.max-return-rows

5000000



kylin.query.translated-in-clause-max-size

1024 * 1024



kylin.query.derived-filter-translation-threshold

20



kylin.query.badquery-stacktrace-depth

10



kylin.query.badquery-history-number

50



kylin.query.badquery-alerting-seconds

90



kylin.query.timeout-seconds-coefficient

0.5



kylin.query.badquery-persistent-enabled

true



kylin.query.transformers




kylin.query.transformers




kylin.query.cache-threshold-duration

2000



kylin.query.cache-threshold-scan-count

10240



kylin.query.cache-threshold-scan-bytes

1024 * 1024



kylin.query.security-enabled

true



kylin.query.cache-enabled

true



kylin.query.ignore-unknown-function

false



kylin.cache.memcached.hosts




kylin.query.segment-cache-enabled

false



kylin.query.segment-cache-timeout

2000



kylin.query.segment-cache-max-size

200



kylin.query.access-controller

null



kylin.query.statement-cache-max-num

50000



kylin.query.statement-cache-max-num-per-key

50



kylin.query.statement-cache-enabled

true



kylin.query.max-dimension-count-distinct

5000000



kylin.query.timeout-seconds

0



kylin.query.pushdown.enabled

false



kylin.query.pushdown.update-enabled

false



kylin.query.schema-factory

org.apache.kylin.query.schema.OLAPSchemaFactory



kylin.query.pushdown.runner-class-name




kylin.query.pushdown.runner.ids




kylin.query.pushdown.converter-class-names

{org.apache.kylin.source.adhocquery.HivePushDownConverter}



kylin.query.pushdown.cache-enabled

false



kylin.query.pushdown.jdbc.url




kylin.query.pushdown.jdbc.driver




kylin.query.pushdown.jdbc.username




kylin.query.pushdown.jdbc.password




kylin.query.pushdown.jdbc.pool-max-total

0



kylin.query.pushdown.jdbc.pool-max-idle

8



kylin.query.pushdown.jdbc.pool-min-idle

0



kylin.query.security.table-acl-enabled

true



kylin.query.escape-default-keyword

false



kylin.query.realization-filter

null



kylin.query.signature-class

org.apache.kylin.rest.signature.FactTableRealizationSetCalculator



kylin.query.cache-signature-enabled

false



kylin.query.flat-filter-max-children

500000








Newly added configurations

...

Module | Property | Default Value | Optional Value | Description

kylin.engine.spark.build-class-name

org.apache.kylin.engine.spark.job.CubeBuildJob




kylin.engine.spark.task-impact-instance-enabled

true




kylin.engine.spark.task-core-factor

3




kylin.engine.driver-memory-base

1024




kylin.engine.driver-memory-strategy

{"2", "20", "100"}


Automatically adjusts the driver memory


kylin.engine.driver-memory-maximum

4096




kylin.engine.persist-flattable-threshold

1




kylin.engine.spark.cluster-info-fetcher-class-name

org.apache.kylin.cluster.YarnInfoFetcher




kylin.engine.spark.merge-class-name

org.apache.kylin.engine.spark.job.CubeMergeJob




kylin.engine.max-retry-time

3




kylin.engine.retry-memory-gradient

1.5




kylin.engine.retry-overheadMemory-gradient

0.2




kylin.engine.max-allocation-proportion

0.9




kylin.engine.base-executor-instance

5




kylin.engine.executor-instance-strategy

100,2,500,3,1000,4




kylin.engine.submit-hadoop-conf-dir










kylin.snapshot.parallel-build-enabled

true




kylin.snapshot.parallel-build-timeout-seconds

3600




kylin.snapshot.shard-size-mb

128









kylin.storage.provider

org.apache.kylin.common.storage.DefaultStorageProvider




kylin.storage.columnar.shard-size-mb

128




kylin.storage.columnar.shard-rowcount

2500000




kylin.storage.columnar.shard-countdistinct-rowcount

1000000




kylin.storage.columnar.repartition-threshold-size-mb

128




kylin.storage.columnar.shard-min

1




kylin.storage.columnar.shard-max

1000




kylin.storage.columnar.hdfs-blocksize-bytes

5 * shard_size




kylin.storage.columnar.shard-expand-factor

10




kylin.storage.columnar.dfs-replication

3




kylin.spark-conf.auto.prior

true









kylin.job.log-print-enabled

true









kylin.query.spark-engine.join-memory-fraction

0.3


Fraction of driver memory that can be used by joins (mostly BHJ)


kylin.query.spark-engine.enabled

true




kylin.query.spark-engine.partition-split-size-mb

64




kylin.query.spark-engine.expose-sharding-trait

true




kylin.query.spark-engine.spark-sql-shuffle-partitions

-1




kylin.query.spark-conf.





kylin.query.engine.sparder-additional-files





kylin.query.engine.sparder-additional-jars





kylin.query.pushdown.auto-set-shuffle-partitions-enabled

true




kylin.query.pushdown.base-shuffle-partition-size

48




kylin.query.intersect.separator

|









kylin.kerberos.enabled

false




kylin.kerberos.keytab





kylin.kerberos.zookeeper.server.principal

zookeeper/hadoop




kylin.kerberos.ticket.refresh.interval.minutes

720




kylin.kerberos.monitor.interval.minutes

10




kylin.kerberos.platform





kylin.platform.zk.kerberos.enable





kylin.kerberos.krb5.conf

krb5.conf




kylin.kerberos.jaas.conf

jaas.conf




kylin.kerberos.principal









Configurations retained from Kylin 3.x

...