Syslog Stress Test 2012-04-28
Who ran the test: Mike Percy <mpercy at cloudera dot com>
The Flume NG agent ran in a single JVM on its own physical machine. A separate client machine generated syslog-format load against the Flume machine. Flume wrote the data to a 9-node HDFS cluster running on its own separate hardware. No virtual machines were used in this test.
CPU: 2 x quad-core Intel Xeon L5630 with Hyper-Threading @ 2133 MHz (8 physical cores)
OS: SLES 11sp1 (SuSE Linux 64-bit)
Java version: 1.6.0u26 (HotSpot Server VM)
Java heap size: 2GB
Num. agents: 1
Num. parallel flows: 10
Serialization: avro_event, with snappy serializer compression
The HDFS sink was connected to a 9-node Hadoop cluster running CDH3u3 with MIT Kerberos v5 security enabled.
Syslog entries containing sequentially increasing integers plus padding
Event size: 300 bytes
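The load generator used in the test is not shown here. As a rough illustration of the data description above, the following sketch builds syslog lines that carry a sequence number and are padded to a fixed size. The function name `make_event`, the host and tag values, the RFC 3164-style header, and the assumption that the 300 bytes include the header and trailing newline are all hypothetical.

```python
def make_event(seq: int, size: int = 300,
               host: str = "loadgen", tag: str = "stress") -> bytes:
    """Build one syslog line of exactly `size` bytes, newline included.

    Assumption: the 300-byte event size covers the whole line. The real
    test may have measured only the message body.
    """
    # <13> = facility "user" (1) * 8 + severity "notice" (5)
    prefix = f"<13>Apr 28 00:00:00 {host} {tag}: {seq} "
    pad_len = size - len(prefix) - 1  # reserve one byte for the newline
    if pad_len < 0:
        raise ValueError("size too small for header and sequence number")
    return (prefix + "x" * pad_len + "\n").encode("ascii")


# Sequentially increasing integers, one per event.
events = [make_event(i) for i in range(3)]
for e in events:
    print(len(e), e[:40])
```

Padding after the sequence number keeps every event the same length regardless of how many digits the counter has reached.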
Load: 58,582 events/sec aggregate (approx. 5,850 events/sec per flow on average, across 10 flows).
Duration: The load test ran for 23 hours and 20 minutes.
Result: 4,920,930,988 events sent in total. No events were lost; 7,000 events were duplicated as a result of 7 retried transactions.
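The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the byte figures assume the 300-byte event size applies to every event and ignore compression and protocol overhead.

```python
RATE_EPS = 58_582                    # aggregate events/sec
FLOWS = 10                           # parallel flows
EVENT_BYTES = 300                    # bytes per event
DURATION_S = 23 * 3600 + 20 * 60     # 23 h 20 min in seconds
TOTAL_EVENTS = 4_920_930_988         # total events sent
DUPLICATES = 7_000
RETRIED_TXNS = 7

per_flow = RATE_EPS / FLOWS                  # average events/sec per flow
bytes_per_sec = RATE_EPS * EVENT_BYTES       # uncompressed payload rate
expected_total = RATE_EPS * DURATION_S       # events implied by rate x time
implied_rate = TOTAL_EVENTS / DURATION_S     # rate implied by the total
total_bytes = TOTAL_EVENTS * EVENT_BYTES     # uncompressed payload volume
dup_fraction = DUPLICATES / TOTAL_EVENTS     # share of duplicated events
events_per_retry = DUPLICATES / RETRIED_TXNS

print(f"duration:          {DURATION_S:,} s")
print(f"per-flow rate:     {per_flow:,.1f} events/s")
print(f"payload rate:      {bytes_per_sec / 1e6:.1f} MB/s")
print(f"rate x duration:   {expected_total:,} events")
print(f"implied rate:      {implied_rate:,.2f} events/s")
print(f"payload volume:    {total_bytes / 1e12:.2f} TB")
print(f"duplicate share:   {dup_fraction:.2e}")
print(f"events per retry:  {events_per_retry:,.0f}")
```

The rate multiplied by the duration lands within about 43,000 events of the reported total, so the stated load and duration agree with each other. The 7,000 duplicates over 7 retries are consistent with 1,000 events per retried transaction, although the transaction size is not stated in the report.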
Automated data integrity report
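The report itself is not reproduced here. Because the test events carry sequentially increasing integers, loss and duplication can in principle be detected by counting how often each integer appears in the stored data. The following is a minimal sketch of that idea, not the tool used in the test: the function name `integrity_report` and the assumption that sequence numbers run from 0 to `expected - 1` in a single numbering are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def integrity_report(seqs: Iterable[int], expected: int) -> Dict[str, int]:
    """Count lost and duplicated events from the sequence numbers read back.

    Assumption: valid sequence numbers are 0 .. expected-1. Numbers outside
    that range are counted as received but never as lost.
    """
    counts = Counter(seqs)
    lost = sum(1 for i in range(expected) if i not in counts)
    duplicates = sum(c - 1 for c in counts.values() if c > 1)
    return {
        "expected": expected,
        "received": sum(counts.values()),
        "lost": lost,
        "duplicates": duplicates,
    }


# Toy input: event 3 is missing, event 2 arrived twice.
report = integrity_report([0, 1, 2, 2, 4], expected=5)
print(report)
```

An in-memory counter like this only works for small samples. At the scale of this test (billions of events) the same counting would have to be done by a distributed job over the HDFS output.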