Syslog Stress Test 2012-04-28

Who ran the test: Mike Percy <mpercy at cloudera dot com>

Test setup

Overview
The Flume NG agent ran on its own physical machine in a single JVM. A separate client machine generated syslog-format load against the Flume box. Flume wrote the data to a 9-node HDFS cluster running on its own separate hardware. No virtual machines were used in this test.

Hardware specs
CPU: 2 x Intel Xeon L5630 (quad-core, Hyper-Threading) @ 2.13GHz (8 physical cores)
Memory: 48GB
OS: SLES 11sp1 (SuSE Linux 64-bit)

Flume configuration
Java version: 1.6.0u26 (Server Hotspot VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: 10
Source: SyslogTcpSource
Channel: MemoryChannel
Sink: HDFSEventSink with avro_event serialization and snappy serializer compression

Single-flow config

agent.channels.svc_0_chan.type = memory
agent.channels.svc_0_chan.capacity = 100000
agent.channels.svc_0_chan.transactionCapacity = 1000

agent.sources.svc_0_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_0_src.port = 10001
agent.sources.svc_0_src.channels = svc_0_chan

agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.path = hdfs://xxxxxx.cloudera.com/service/20120428/flow0
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.hdfs.rollInterval = 300
agent.sinks.svc_0_sink.hdfs.rollSize = 0
agent.sinks.svc_0_sink.hdfs.rollCount = 0
agent.sinks.svc_0_sink.hdfs.batchSize = 1000
agent.sinks.svc_0_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_0_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_0_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
agent.sinks.svc_0_sink.channel = svc_0_chan
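
The test ran 10 parallel flows, but only flow 0 is shown above. The following is a minimal sketch of how the remaining flows could be generated from the same settings. It assumes that flow n listens on port 10001 + n and writes to a .../flow<n> directory; the original page does not state the numbering used for flows 1 through 9, and render_flow and render_agent are hypothetical helper names.

```python
# Sketch: expand the flow-0 settings into N parallel flows.
# ASSUMPTION: flow n uses port 10001 + n and HDFS directory .../flow<n>.
# Only flow 0 appears in the test report, so this numbering is a guess.

FLOW_TEMPLATE = """\
agent.channels.svc_{n}_chan.type = memory
agent.channels.svc_{n}_chan.capacity = 100000
agent.channels.svc_{n}_chan.transactionCapacity = 1000

agent.sources.svc_{n}_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_{n}_src.port = {port}
agent.sources.svc_{n}_src.channels = svc_{n}_chan

agent.sinks.svc_{n}_sink.type = hdfs
agent.sinks.svc_{n}_sink.hdfs.path = {hdfs_base}/flow{n}
agent.sinks.svc_{n}_sink.hdfs.fileType = DataStream
agent.sinks.svc_{n}_sink.hdfs.rollInterval = 300
agent.sinks.svc_{n}_sink.hdfs.rollSize = 0
agent.sinks.svc_{n}_sink.hdfs.rollCount = 0
agent.sinks.svc_{n}_sink.hdfs.batchSize = 1000
agent.sinks.svc_{n}_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_{n}_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_{n}_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_{n}_sink.serializer = avro_event
agent.sinks.svc_{n}_sink.serializer.compressionCodec = snappy
agent.sinks.svc_{n}_sink.channel = svc_{n}_chan
"""


def render_flow(n, base_port=10001,
                hdfs_base="hdfs://xxxxxx.cloudera.com/service/20120428"):
    """Render the properties for one flow (hostname left redacted as in the report)."""
    return FLOW_TEMPLATE.format(n=n, port=base_port + n, hdfs_base=hdfs_base)


def render_agent(num_flows=10):
    """Render the component lists plus one block of properties per flow."""
    ids = range(num_flows)
    header = "\n".join([
        "agent.sources = " + " ".join("svc_%d_src" % n for n in ids),
        "agent.channels = " + " ".join("svc_%d_chan" % n for n in ids),
        "agent.sinks = " + " ".join("svc_%d_sink" % n for n in ids),
    ])
    return header + "\n\n" + "\n".join(render_flow(n) for n in ids)
```

Calling print(render_agent(10)) would emit a properties file with ten source/channel/sink triples. Each flow gets its own MemoryChannel, so a slow sink on one flow does not consume channel capacity belonging to another.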

Hadoop configuration
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.

Data description
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
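
For illustration, a test event of this shape could be built as sketched below. The load generator used in the test is not shown on this page, so everything here is an assumption: the RFC 3164-style header, the fixed timestamp, the host and tag names, the 12-digit sequence width, and the reading of "300 bytes" as the full line including the trailing newline. make_event is a hypothetical name.

```python
def make_event(seq, size=300, host="loadgen", tag="hammer-0"):
    """Build one syslog line carrying a sequence number, padded to `size` bytes.

    ASSUMPTIONS (not from the test report): header layout, host/tag names,
    fixed timestamp, and that the size counts the whole line with its newline.
    """
    # <13> is priority = facility user (1) * 8 + severity notice (5).
    head = "<13>Apr 28 00:00:00 {} {}: {:012d} ".format(host, tag, seq)
    pad = size - len(head) - 1  # reserve one byte for the trailing newline
    if pad < 0:
        raise ValueError("size too small for header and sequence number")
    return (head + "x" * pad + "\n").encode("ascii")
```

Because each event carries a strictly increasing integer, a checker reading the HDFS output can detect lost events as gaps in the sequence and duplicates as repeated values.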

Results

Summary analysis
Load: 58,582 events/sec aggregate, or approx. 5,858 events/sec per flow on average across 10 flows.
Event size: 300 bytes/event.
Duration: The load test ran for 23 hours and 20 minutes.
Result: 4,920,930,988 events sent in total. No events were lost; 7,000 were duplicated (7 retried transactions of 1,000 events each).
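
The summary figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported on this page; summarize is a hypothetical helper name, and the byte rate is an estimate based on the 300-byte event size before any compression in the sink.

```python
def summarize(total_events=4_920_930_988, hours=23, minutes=20,
              flows=10, event_bytes=300, duplicates=7_000):
    """Derive rates from the totals reported in the summary."""
    seconds = hours * 3600 + minutes * 60
    rate = total_events / seconds
    return {
        "seconds": seconds,
        "events_per_sec": rate,
        "events_per_sec_per_flow": rate / flows,
        # Uncompressed estimate: event count times nominal event size.
        "bytes_per_sec": rate * event_bytes,
        "duplicate_fraction": duplicates / total_events,
    }
```

With the defaults, the run lasts 84,000 seconds, which gives roughly 58,582.5 events/sec in aggregate, about 5,858 events/sec per flow, and about 17.6 MB/sec of uncompressed event data. The 7,000 duplicates amount to roughly 1.4 events per million sent.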

Automated data integrity report

WARNING: xxxxxx.cloudera.com/hammer-0: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-0: total=493252903 is OK
WARNING: xxxxxx.cloudera.com/hammer-1: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-1: total=491792986 is OK
INFO: xxxxxx.cloudera.com/hammer-2: total=494361441 is OK
WARNING: xxxxxx.cloudera.com/hammer-3: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-3: total=491668462 is OK
WARNING: xxxxxx.cloudera.com/hammer-4: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-4: total=493222420 is OK
WARNING: xxxxxx.cloudera.com/hammer-5: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-5: total=492295186 is OK
WARNING: xxxxxx.cloudera.com/hammer-6: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-6: total=490704238 is OK
WARNING: xxxxxx.cloudera.com/hammer-7: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-7: total=491209003 is OK
INFO: xxxxxx.cloudera.com/hammer-8: total=493007823 is OK
INFO: xxxxxx.cloudera.com/hammer-9: total=489416526 is OK
INFO: grand total=4920930988
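
The report above can be tallied mechanically. The following sketch parses lines in the format shown and checks that the per-flow totals add up to the grand total. The tool that produced the report is not described on this page, so the regular expressions are inferred from the output alone, and summarize_report is a hypothetical name.

```python
import re

# Patterns inferred from the report lines above (an assumption, not a spec).
FLOW_LINE = re.compile(
    r"^(?:WARNING|INFO): (\S+): (?:dups: (\d+)|total=(\d+) is OK)$")
GRAND_LINE = re.compile(r"^INFO: grand total=(\d+)$")


def summarize_report(text):
    """Sum per-flow totals and duplicate counts from an integrity report."""
    totals, dups, grand = {}, {}, None
    for line in text.splitlines():
        line = line.strip()
        m = GRAND_LINE.match(line)
        if m:
            grand = int(m.group(1))
            continue
        m = FLOW_LINE.match(line)
        if not m:
            continue  # ignore anything that is not a report line
        flow, dup, total = m.groups()
        if dup is not None:
            dups[flow] = int(dup)
        else:
            totals[flow] = int(total)
    return {
        "flows": len(totals),
        "flows_with_dups": len(dups),
        "duplicates": sum(dups.values()),
        "sum_of_totals": sum(totals.values()),
        "grand_total": grand,
        "consistent": grand == sum(totals.values()),
    }
```

Applied to the report above, this finds 10 flows, 7 of which reported 1,000 duplicates each (7,000 in all), and the per-flow totals sum to the grand total of 4,920,930,988.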