Apache Airavata
If you want to enable the Elasticsearch-based logging feature for Airavata, you first have to set up a Kafka cluster so that all the logs are pushed to a Kafka topic created based on the configuration you provide. Once you have the Kafka cluster running, you can start Airavata by simply changing two parameters in airavata-server.properties. If you want to know how to set up a Kafka cluster, please use the Related Articles links at the end of this page for instructions.
Overwrite the following properties with values that match your Kafka setup.
# Kafka Logging related configuration

# Set to true if you are running Airavata on AWS
isRunningOnAws=false

# One or more Kafka broker addresses with the port. Giving one is enough,
# because the KafkaProducer will discover the addresses of the other nodes.
kafka.broker.list=localhost:9092

# Topic prefix you want to use; Airavata will create the topic names for you
kafka.topic.prefix=staging

# Enable the Kafka appender to register as a log appender
enable.kafka.logging=true
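For context, the Kafka appender enabled by enable.kafka.logging behaves roughly like the following Logback-style sketch. This is a minimal illustration, not Airavata's actual implementation: the class name SimpleKafkaAppender is made up, and the real appender also serializes server metadata and MDC fields into the JSON document shown in the sample message below.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import ch.qos.logback.classic.spi.ILoggingEvent;
    import ch.qos.logback.core.AppenderBase;

    // Illustrative sketch of a Kafka-backed log appender, not Airavata's real class.
    public class SimpleKafkaAppender extends AppenderBase<ILoggingEvent> {
        private KafkaProducer<String, String> producer;
        private String topic;      // e.g. staging_gfac_logs
        private String brokerList; // e.g. localhost:9092 (kafka.broker.list)

        public void setTopic(String topic) { this.topic = topic; }
        public void setBrokerList(String brokerList) { this.brokerList = brokerList; }

        @Override
        public void start() {
            Properties props = new Properties();
            props.put("bootstrap.servers", brokerList);
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            producer = new KafkaProducer<>(props);
            super.start();
        }

        @Override
        protected void append(ILoggingEvent event) {
            // A real appender would emit the full JSON document shown below,
            // including serverId, roles, and MDC fields.
            String json = String.format(
                "{\"level\":\"%s\",\"loggerName\":\"%s\",\"message\":\"%s\"}",
                event.getLevel(), event.getLoggerName(), event.getFormattedMessage());
            producer.send(new ProducerRecord<>(topic, json));
        }

        @Override
        public void stop() {
            if (producer != null) producer.close();
            super.stop();
        }
    }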
{ "serverId" => { "serverId" => "192.168.59.3", "hostName" => "192.168.59.3", "version" => "airavata-0.16-135-gac0cae6", "roles" => [ [0] "gfac" ] }, "message" => "Skipping Zookeeper embedded startup ...", "timestamp" => "2016-09-09T20:57:08.329Z", "level" => "INFO", "loggerName" => "org.apache.airavata.common.utils.AiravataZKUtils", "mdc" => { "gateway_id": "21845d02-7d2c-11e6-ae22-562311499611", "experiment_id": "21845d02-7d2c-11e6-ae22-34b6b6499611", "process_id": "21845d02-7d2c-11e6-ae22-56b6b6499611", "token_id": "21845d02-7d2c-11e6-ae22-56b6b6499611" }, "threadName" => "main", "@version" => "1", "@timestamp" => "2016-09-09T20:57:11.678Z", "type" => "gfac_logs", "tags" => [ [0] "local", [1] "CoreOS-899.13.0" ], "timestamp_usec" => 0 }
Airavata has a few services, and you are completely free to deploy them in the way you like: you can deploy all the services (Apache Thrift services) in one JVM, create one JVM for each component, or merge only a few of them into one JVM. In the log above you can see that the roles section contains only gfac, which means the log was taken from a GFac server node and no other component was running in that JVM. The topic creation logic is based on the roles of the JVM. To keep the deployment clean, we recommend deploying one component per JVM so that it is easier to scale and diagnose the system. During topic creation we check the number of roles configured in the JVM and derive the topic name using the two rules below (a code sketch follows them).
If the number of roles is greater than 4 => <kafka_topic_prefix>_all_logs, e.g. staging_all_logs (kafka.topic.prefix = staging)
Otherwise we pick the first role => <kafka_topic_prefix>_<first role>_logs, e.g. staging_gfac_logs (kafka.topic.prefix = staging)
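Expressed in code, the naming rule looks roughly like this (a minimal sketch; the class and method names are illustrative, only the rule itself comes from the description above):

    import java.util.Arrays;
    import java.util.List;

    public class KafkaTopicNames {
        // Derives the Kafka topic name from the configured topic prefix and
        // the roles of this JVM, following the two rules described above.
        public static String topicFor(String topicPrefix, List<String> roles) {
            if (roles.size() > 4) {
                return topicPrefix + "_all_logs";              // e.g. staging_all_logs
            }
            return topicPrefix + "_" + roles.get(0) + "_logs"; // e.g. staging_gfac_logs
        }

        public static void main(String[] args) {
            System.out.println(topicFor("staging", Arrays.asList("gfac")));
            // prints: staging_gfac_logs
        }
    }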
We have tried taking the log messages, pushing them to an Elasticsearch cluster, and viewing them in Kibana. To learn how to set up Elasticsearch and Logstash, please refer to the Related Articles section. We have used the Logstash configuration below to read Airavata log messages and push them to Elasticsearch.
input {
    kafka {
        topic_id => "local_all_logs"
        zk_connect => "127.0.0.1:2181"
        auto_offset_reset => "smallest"
        type => "all_logs"
    }
    kafka {
        topic_id => "local_apiserver_logs"
        zk_connect => "127.0.0.1:2181"
        auto_offset_reset => "smallest"
        type => "apiserver_logs"
    }
    kafka {
        topic_id => "local_gfac_logs"
        zk_connect => "127.0.0.1:2181"
        auto_offset_reset => "smallest"
        type => "gfac_logs"
    }
    kafka {
        topic_id => "local_orchestrator_logs"
        zk_connect => "127.0.0.1:2181"
        auto_offset_reset => "smallest"
        type => "orchestrator_logs"
    }
    kafka {
        topic_id => "local_credentialstore_logs"
        zk_connect => "127.0.0.1:2181"
        auto_offset_reset => "smallest"
        type => "credentialstore_logs"
    }
}

filter {
    mutate {
        add_field => { "[@metadata][level]" => "%{[level]}" }
    }
    mutate {
        lowercase => ["[@metadata][level]"]
    }
    mutate {
        gsub => ["level", "LOG_", ""]
    }
    mutate {
        add_tag => ["local", "CoreOS-899.13.0"]
    }
    ruby {
        code => "
            begin
                t = Time.iso8601(event['timestamp'])
            rescue ArgumentError => e
                # drop the event if format is invalid
                event.cancel
                return
            end
            event['timestamp_usec'] = t.usec % 1000
            event['timestamp'] = t.utc.strftime('%FT%T.%LZ')
        "
    }
}

output {
    stdout { codec => rubydebug }
    if [type] == "apiserver_logs" {
        elasticsearch {
            hosts => ["elasticsearch.us-east-1.aws.found.io:9200"]
            user => "admin"
            password => "adminpassword"
            index => "local-apiserver-logs-logstash-%{+YYYY.MM.dd}"
        }
    } else if [type] == "gfac_logs" {
        elasticsearch {
            hosts => ["elasticsearch.us-east-1.aws.found.io:9200"]
            user => "admin"
            password => "adminpassword"
            index => "local-gfac-logs-logstash-%{+YYYY.MM.dd}"
        }
    } else if [type] == "orchestrator_logs" {
        elasticsearch {
            hosts => ["elasticsearch.us-east-1.aws.found.io:9200"]
            user => "admin"
            password => "adminpassword"
            index => "local-orchestrator-logs-logstash-%{+YYYY.MM.dd}"
        }
    } else if [type] == "credentialstore_logs" {
        elasticsearch {
            hosts => ["elasticsearch.us-east-1.aws.found.io:9200"]
            user => "admin"
            password => "adminpassword"
            index => "local-credentialstore-logs-logstash-%{+YYYY.MM.dd}"
        }
    } else {
        elasticsearch {
            hosts => ["elasticsearch.us-east-1.aws.found.io:9200"]
            user => "admin"
            password => "adminpassword"
            index => "local-airavata-logs-logstash-%{+YYYY.MM.dd}"
        }
    }
}
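Assuming you save the configuration above to a file, for example airavata-logstash.conf (a hypothetical name), you can start Logstash against it and watch events echoed to stdout by the rubydebug codec while they are indexed into Elasticsearch:

    bin/logstash -f airavata-logstash.conf

Note that the ruby filter in the configuration normalizes the timestamp field to UTC with millisecond precision, extracts timestamp_usec, and cancels any event whose timestamp is not valid ISO 8601.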
The easiest and fastest way to use Elasticsearch is the hosted version from a cloud provider; a number of companies provide Elasticsearch as a service, so you can set up an Elasticsearch cluster with a few clicks. However, most of these services charge you based on the load you have. If you have a very low load and require a relatively low TTL for your logs, it may be efficient and make financial sense to use an ES cluster from one of these providers. If you need a relatively high TTL for your logs, then setting up your own cluster is also an option. To start setting up your own Elasticsearch cluster and Kibana, follow the last few links below. If you want to secure Kibana, you can use another Elastic product called Shield to add security to your ES cluster and Kibana.
Related Articles
http://kafka.apache.org/documentation.html
https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html
https://www.elastic.co/cloud/as-a-service/signup
https://www.elastic.co/guide/en/kibana/current/production.html
https://www.elastic.co/guide/en/shield/shield-1.0/marvel.html