Apache Kylin : Analytical Data Warehouse for Big Data


Background

In previous Kylin releases, the logs of Kylin's build engine and query engine were collected and stored by the resource manager (e.g., yarn logs -applicationId xxx) or by the HBase Region Server instance.

This made it difficult to find the root cause of a failed job or a slow query. To solve this problem, Kylin 4.0.0 refactored the logging of the build job. In Kylin 4.0.0, these logs are collected and stored under Kylin's working dir (HDFS or S3).

The log4j configuration files for this logging are the following two: ${KYLIN_HOME}/conf/spark-driver-log4j.properties and ${KYLIN_HOME}/conf/spark-executor-log4j.properties.
Kylin also provides default log4j configurations for users who do not want to upload logs to Kylin's working dir (HDFS or S3).

The default log4j configuration files are the following two: ${KYLIN_HOME}/conf/spark-driver-log4j-default.properties and ${KYLIN_HOME}/conf/spark-executor-log4j-default.properties. Without modification to kylin.properties, Kylin starts with the default log4j configuration.
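
For reference, the corresponding entries in kylin.properties look like the following. This is a minimal sketch: the property keys appear elsewhere on this page, but the exact default values are an assumption inferred from the behavior described above.

Code Block
vim ${KYLIN_HOME}/conf/kylin.properties

# Assumed defaults: Kylin falls back to the *-default files when these are unmodified
kylin.spark.driver.log4j.properties=spark-driver-log4j-default.properties
kylin.spark.executor.log4j.properties=spark-executor-log4j-default.properties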

Default logger for Query and Cubing

ConsoleAppender

The default configuration for the spark driver in Query and Cubing is spark-driver-log4j-default.properties:

Code Block
vim ${KYLIN_HOME}/conf/spark-driver-log4j-default.properties

log4j.rootLogger=INFO,stderr


The default configuration for the spark executor in Query and Cubing is spark-executor-log4j-default.properties:

Code Block
vim ${KYLIN_HOME}/conf/spark-executor-log4j-default.properties

log4j.rootLogger=INFO,stderr
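
Both default files route all logging to stderr through a ConsoleAppender. For orientation, a minimal sketch of what such a configuration typically looks like in full; the appender options below are standard log4j 1.x settings chosen to match the log format shown later on this page, not copied verbatim from the shipped files:

Code Block
# Illustrative ConsoleAppender wiring (assumed, not verbatim from Kylin's files)
log4j.rootLogger=INFO,stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L : %m%n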


If using default log4j properties, you will see the following messages:

Code Block
2022-06-17 17:50:22,610 INFO [Scheduler 1342122509 Job 6ba9cf9f-18e2-4290-93b9-36674b2cfca8-65] job.NSparkExecutable:451 : Current using default log4j properties for spark driver in using `ConsoleAppender`.Please modify `kylin.spark.driver.log4j.properties` to be `spark-driver-log4j.properties`for uploading log file to hdfs.

2022-06-17 17:50:22,610 INFO [Scheduler 1342122509 Job 6ba9cf9f-18e2-4290-93b9-36674b2cfca8-65] job.NSparkExecutable:457 : Current using default log4j properties for spark executor in using `ConsoleAppender`.Please modify `kylin.spark.executor.log4j.properties` to be `spark-executor-log4j.properties`for uploading log file to hdfs.




Logger for Cubing

Using spark-driver-log4j.properties and spark-executor-log4j.properties

Users can change the default log4j configuration files for Kylin.

Code Block
vim ${KYLIN_HOME}/conf/kylin.properties

kylin.spark.driver.log4j.properties=spark-driver-log4j.properties
kylin.spark.executor.log4j.properties=spark-executor-log4j.properties


Then Kylin will start with spark-driver-log4j.properties and spark-executor-log4j.properties to collect and store logs of Kylin's build engine.
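
A restart is required for the change to take effect. A minimal sequence, assuming the standard kylin.sh control script:

Code Block
# Restart Kylin so the new log4j configuration is picked up
${KYLIN_HOME}/bin/kylin.sh stop
${KYLIN_HOME}/bin/kylin.sh start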



Driver Log

SparkDriverHdfsLogAppender in spark-driver-log4j.properties

spark-driver-log4j.properties is used to configure the output path, appender, layout, etc. of the spark driver log in the build job. By default, the spark driver log of one step of a build job is output to a file in HDFS.

The file path is spliced from kylin.env.hdfs-working-dir, kylin.metadata.url, the project name, the step id, etc. The step id is spliced from the job id and a two-digit counter starting at 00: a build job's first step has step id jobId-00, the second step jobId-01, and so on. The specific path of the log file is `${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs/driver/${step_id}/execute_output.json.timestamp.log`.
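
For example, with hypothetical values (working dir hdfs://nn/kylin, metadata url kylin_metadata, project my_project, job id abc123), the driver log of the second step would land at:

Code Block
# Hypothetical values; the trailing timestamp is epoch milliseconds
hdfs://nn/kylin/kylin_metadata/my_project/spark_logs/driver/abc123-01/execute_output.json.1655459422610.log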




View logs through Kylin WebUI

When SparkDriverHdfsLogAppender is enabled, users can download driver logs from Kylin's Web UI, even when spark.submit.deployMode is cluster (meaning the driver is not located on the same node as the Kylin Job Server).

By default, the Output panel shows only the first and last 100 lines of all logs of this step.

If you need to view all logs, click "download the log file" at the top of the Output window; the complete spark driver log file of this step will then be downloaded locally by the browser.



FileAppender in spark-driver-log4j.properties

If the user does not want to upload the spark driver log to HDFS during the build job, the configuration item in spark-driver-log4j.properties can be changed:

Code Block
vim ${KYLIN_HOME}/conf/spark-driver-log4j.properties

log4j.rootLogger=INFO,logFile

After modifying the configuration, restart Kylin; the spark driver log of one step of a job will then be output to the local file ${KYLIN_HOME}/logs/spark/${step_id}.log.
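
For orientation, a minimal sketch of what the logFile appender behind this setting could look like; the appender class and options below are standard log4j 1.x settings offered as an assumption, not copied from the shipped file:

Code Block
# Illustrative FileAppender wiring (assumed, not verbatim from Kylin's file);
# ${KYLIN_HOME} resolves only if it is passed to the JVM as a system property
log4j.rootLogger=INFO,logFile
log4j.appender.logFile=org.apache.log4j.FileAppender
log4j.appender.logFile.File=${KYLIN_HOME}/logs/spark/driver.log
log4j.appender.logFile.layout=org.apache.log4j.PatternLayout
log4j.appender.logFile.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %p [%t] %c{1}:%L : %m%n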

Executor Log

SparkExecutorHdfsAppender in spark-executor-log4j.properties

spark-executor-log4j.properties is used to configure the output path, appender, layout, etc. of the spark executor log in the build job. Similar to the spark driver log, the spark executor log of one step of a build job is output to a folder in HDFS.

Each file in this folder corresponds to one executor's log. The path is ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs/executor/yyyy-mm-dd/${job_id}/${step_id}/executor-x.log.
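
Continuing the hypothetical example above, the first executor's log of step abc123-01 written on 2022-06-17 would land at:

Code Block
# Hypothetical values as in the driver example
hdfs://nn/kylin/kylin_metadata/my_project/spark_logs/executor/2022-06-17/abc123/abc123-01/executor-1.log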



Logger for Query

Using spark-executor-log4j.properties

Users can change the default log4j configuration files for Kylin.

Code Block
vim ${KYLIN_HOME}/conf/kylin.properties

kylin.spark.executor.log4j.properties=spark-executor-log4j.properties

Then Kylin will start with spark-executor-log4j.properties to collect and store the logs of Kylin's query engine.


Executor Log

SparkExecutorHdfsAppender in spark-executor-log4j.properties

spark-executor-log4j.properties is used to configure the output path, appender, layout, etc. of spark executor log in the query job.

Similar to the spark driver log, the spark executor log of a query job is output to a folder in HDFS. Each file in this folder corresponds to one executor's log. The path is ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/_sparder_logs/yyyy-mm-dd/${job_id}/executor-x.log.
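
These files can also be inspected directly from HDFS. A minimal sketch, using hypothetical concrete values for the working dir, metadata url, and project:

Code Block
# List a day's query executor logs, then read one of them
hadoop fs -ls hdfs://nn/kylin/kylin_metadata/my_project/_sparder_logs/2022-06-17
hadoop fs -cat hdfs://nn/kylin/kylin_metadata/my_project/_sparder_logs/2022-06-17/<job_id>/executor-1.log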



Troubleshooting

When the spark job submitted by Kylin is executed on the YARN cluster, the user who uploads the spark executor log to HDFS may be yarn.

This yarn user may not have write permission to the HDFS directory ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs, which causes the upload of the spark executor log to fail.

In that case, when viewing the task log with "yarn logs -applicationId <Application ID>", you will see an error indicating that the yarn user lacks write permission to this directory.

This error can be solved by the following command:

Code Block
hadoop fs -setfacl -R -m user:yarn:rwx ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs
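
To confirm that the ACL was applied, the directory's ACLs can be listed afterwards:

Code Block
# Verify that user yarn now has rwx on the log directory
hadoop fs -getfacl ${kylin.env.hdfs-working-dir}/${kylin.metadata.url}/${project_name}/spark_logs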
