Summary of New Feature

This is a draft implementation of FLUME-1687:

https://issues.apache.org/jira/browse/FLUME-1687

Please check it out and let me know your feedback.

The Apache Solr sink picks up batches of events from a channel and serializes them into SolrInputDocument objects that are sent to Apache Solr.

It uses a thread-safe client for Apache Solr (ConcurrentUpdateSolrServer), which buffers all added documents and then transmits them over open HTTP connections.

Internally, ConcurrentUpdateSolrServer has worker threads that drain the buffer.

The number of worker threads, the threshold at which documents are sent to the server, and the URL of the Solr index are all configurable.
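The buffer-and-drain behaviour described above can be illustrated with a small standard-library model. This is only a sketch and not SolrJ code: the class BufferingClientSketch and its methods are hypothetical, plain strings stand in for SolrInputDocument objects, and an in-memory list stands in for the HTTP requests sent to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified model of a buffering client. NOT the SolrJ implementation.
class BufferingClientSketch {
    private final BlockingQueue<String> queue;                      // bounded buffer of pending documents
    private final List<Thread> workers = new ArrayList<>();
    private final List<String> sent = new CopyOnWriteArrayList<>(); // stands in for the remote server
    private volatile boolean running = true;

    BufferingClientSketch(int queueSize, int threadCount) {
        queue = new LinkedBlockingQueue<>(queueSize);
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(this::drain, "drain-" + i);
            workers.add(t);
            t.start();
        }
    }

    // Buffers one document; blocks the caller while the buffer is full.
    void add(String doc) {
        try {
            queue.put(doc);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while buffering", e);
        }
    }

    // Worker loop: keeps draining until shutdown is requested and the buffer is empty.
    private void drain() {
        try {
            while (running || !queue.isEmpty()) {
                String doc = queue.poll(50, TimeUnit.MILLISECONDS);
                if (doc != null) {
                    sent.add(doc); // a real client would transmit the document here
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Signals shutdown, waits for the workers to flush the buffer,
    // and returns a copy of everything that was "sent".
    // Callers must not call add() after this method has been invoked.
    List<String> shutdownAndGetSent() {
        running = false;
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while shutting down", e);
            }
        }
        return new ArrayList<>(sent);
    }
}
```

In this model the queueSize and threadCount constructor arguments play roughly the role of the buffer threshold and worker thread count mentioned above. Documents may arrive out of order when more than one worker is draining.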

How to Install

This sink is designed to work with Flume 1.3.1.

It has not yet been tested with other versions of Flume.

1. Copy the jar files below (the dependencies are packaged as a zip archive) to your $FLUME_HOME/lib folder.

https://issues.apache.org/jira/secure/attachment/12579661/flume-new-feature-dependencies.zip

https://issues.apache.org/jira/secure/attachment/12579660/flume-new-features-1.3.1.jar

2. Configure the sink within your agent, similar to the example below, and start Flume.

########################################################################
################## SINK CONFIGURATION ##################################
########################################################################

# Component type: the fully qualified class name (FQCN) of the sink
octopus.sinks.solr1.type = org.apache.flume.sink.solr.SolrSink

# Channel for this sink
octopus.sinks.solr1.channel = c1

# URL where the Solr index is located
octopus.sinks.solr1.serverUrl = http://localhost:8983/solr/flume

# Number of events to be written per transaction
octopus.sinks.solr1.batchSize = 500

# The number of background worker threads used by ConcurrentUpdateSolrServer to empty the queue
octopus.sinks.solr1.threadCount = 2

# Serializes the headers and body of an event into SolrInputDocuments that are sent to Apache Solr
octopus.sinks.solr1.serializer = org.apache.flume.sink.solr.SolrBasicEventSerializer

# A comma-delimited list of headers allowed.
# These must also be valid field names in the schema for this index
octopus.sinks.solr1.serializer.validHeaderFields = loglevel,timestamp,hostname

# The name of the field in the schema used for the event body. 'body' by default.
octopus.sinks.solr1.serializer.bodyFieldname = body
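
To make the effect of validHeaderFields and bodyFieldname more concrete, here is a minimal sketch of how a serializer could apply them. It is not the actual SolrBasicEventSerializer: the class HeaderFilterSketch and its methods are hypothetical, a plain Map stands in for SolrInputDocument, and decoding the body as UTF-8 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the code shipped in the attached jar.
class HeaderFilterSketch {

    // Parses a comma-delimited property value such as "loglevel,timestamp,hostname".
    static Set<String> parseValidHeaders(String csv) {
        Set<String> names = new LinkedHashSet<>();
        for (String part : csv.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Copies only the allowed headers into the document, then stores the body
    // under the configured field name. The Map stands in for a SolrInputDocument.
    static Map<String, Object> toDocument(Map<String, String> headers,
                                          byte[] body,
                                          Set<String> validHeaders,
                                          String bodyFieldName) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : headers.entrySet()) {
            if (validHeaders.contains(header.getKey())) {
                doc.put(header.getKey(), header.getValue());
            }
        }
        // Assumption: the event body is UTF-8 text.
        doc.put(bodyFieldName, new String(body, StandardCharsets.UTF_8));
        return doc;
    }
}
```

With the example configuration above, an event carrying a header that is not in the list would have that header dropped, while loglevel, timestamp and hostname would be copied into the document alongside the body field.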


The sources jar is available here:

https://issues.apache.org/jira/secure/attachment/12579662/flume-new-features-1.3.1-sources.jar
