Syslog Stress Test 2012-04-28
Who ran the test: Mike Percy <mpercy at cloudera dot com>
Test setup
Overview
The Flume NG agent ran in a single JVM on its own physical machine. A separate client machine generated syslog-formatted load against the Flume machine, and Flume wrote the data to a 9-node HDFS cluster running on its own separate hardware. No virtual machines were used in this test.
Hardware specs
CPU: 2 x quad-core Intel Xeon L5630 with Hyper-Threading @ 2133MHz (8 physical cores total)
Memory: 48GB
OS: SLES 11sp1 (SuSE Linux 64-bit)
Flume configuration
Java version: 1.6.0u26 (Server Hotspot VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: 10
Source: SyslogTcpSource
Channel: MemoryChannel
Sink: HDFSEventSink with avro_event serialization and snappy serializer compression
Single-flow config
agent.channels.svc_0_chan.type = memory
agent.channels.svc_0_chan.capacity = 100000
agent.channels.svc_0_chan.transactionCapacity = 1000
agent.sources.svc_0_src.type = org.apache.flume.source.SyslogTcpSource
agent.sources.svc_0_src.port = 10001
agent.sources.svc_0_src.channels = svc_0_chan
agent.sinks.svc_0_sink.type = hdfs
agent.sinks.svc_0_sink.hdfs.path = hdfs://xxxxxx.cloudera.com/service/20120428/flow0
agent.sinks.svc_0_sink.hdfs.fileType = DataStream
agent.sinks.svc_0_sink.hdfs.rollInterval = 300
agent.sinks.svc_0_sink.hdfs.rollSize = 0
agent.sinks.svc_0_sink.hdfs.rollCount = 0
agent.sinks.svc_0_sink.hdfs.batchSize = 1000
agent.sinks.svc_0_sink.hdfs.txnEventMax = 1000
agent.sinks.svc_0_sink.hdfs.kerberosPrincipal = flume/_HOST@CLOUDERA.COM
agent.sinks.svc_0_sink.hdfs.kerberosKeytab = /etc/flume-ng/conf/flume-xxxxxx.keytab
agent.sinks.svc_0_sink.serializer = avro_event
agent.sinks.svc_0_sink.serializer.compressionCodec = snappy
agent.sinks.svc_0_sink.channel = svc_0_chan
Hadoop configuration
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.
Data description
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
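The load pattern described above (fixed-size syslog lines carrying sequential integers plus padding, sent over TCP) can be approximated with a small generator like the following sketch. The header format, hostname, and padding character are illustrative assumptions; the actual test harness was not included in this report.

```python
import socket

EVENT_SIZE = 300  # bytes per event, matching the test's event size


def make_event(seq):
    """Build one syslog-style line carrying a sequence number, padded
    to exactly EVENT_SIZE bytes including the trailing newline.
    Header format and padding scheme are illustrative guesses."""
    msg = f"<13>Apr 28 12:00:00 hammer-0 stress: seq={seq} "
    msg += "x" * (EVENT_SIZE - len(msg) - 1)  # pad out to fixed size
    return msg + "\n"


def send_events(host, port, count):
    """Stream `count` sequential events to a listening SyslogTcpSource."""
    with socket.create_connection((host, port)) as sock:
        for seq in range(count):
            sock.sendall(make_event(seq).encode("ascii"))
```

One such generator per flow, pointed at ports 10001-10010, would mirror the 10-flow setup used here.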
Results
Summary analysis
Load: 58,582 events/sec aggregate, i.e. approx. 5,858 events/sec per flow on average across 10 flows.
Event size: 300 bytes/event.
Duration: The load test ran for 23 hours and 20 minutes.
Result: 4,920,930,988 total events sent; no lost events and only 7,000 duplicates (7 retried transactions of 1,000 events each).
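As a sanity check, the aggregate rate times the run duration should come close to the reported total. A quick back-of-the-envelope calculation:

```python
rate = 58_582                      # reported aggregate events/sec
duration_s = 23 * 3600 + 20 * 60   # 23h 20m = 84,000 seconds
expected = rate * duration_s
print(expected)                    # 4920888000
# The reported total of 4,920,930,988 differs by well under 0.001%,
# consistent with the rate being a rounded average over the run.
```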
Automated data integrity report
WARNING: xxxxxx.cloudera.com/hammer-0: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-0: total=493252903 is OK
WARNING: xxxxxx.cloudera.com/hammer-1: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-1: total=491792986 is OK
INFO: xxxxxx.cloudera.com/hammer-2: total=494361441 is OK
WARNING: xxxxxx.cloudera.com/hammer-3: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-3: total=491668462 is OK
WARNING: xxxxxx.cloudera.com/hammer-4: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-4: total=493222420 is OK
WARNING: xxxxxx.cloudera.com/hammer-5: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-5: total=492295186 is OK
WARNING: xxxxxx.cloudera.com/hammer-6: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-6: total=490704238 is OK
WARNING: xxxxxx.cloudera.com/hammer-7: dups: 1000
INFO: xxxxxx.cloudera.com/hammer-7: total=491209003 is OK
INFO: xxxxxx.cloudera.com/hammer-8: total=493007823 is OK
INFO: xxxxxx.cloudera.com/hammer-9: total=489416526 is OK
INFO: grand total=4920930988
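The per-flow totals above can be cross-checked against the reported grand total (values copied directly from the report):

```python
# Per-flow totals from the integrity report above, hammer-0 through hammer-9.
flow_totals = [
    493252903, 491792986, 494361441, 491668462, 493222420,
    492295186, 490704238, 491209003, 493007823, 489416526,
]
assert sum(flow_totals) == 4920930988  # matches the reported grand total

# Seven flows each logged 1000 duplicates: 7 * 1000 = 7,000, matching the
# 7 retried transactions of 1,000 events noted in the summary analysis.
print(sum(flow_totals))
```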