Install S4
There are 2 ways:
- Download the 0.5.0 release. We recommend getting the "source" release and building it, because the "binary" release requires some dependencies that may not be available on your machine.
- Or check out from the Apache git repository, by following the instructions. The 0.5.0 tag corresponds to the current release.
...
- Compile and install S4 in the local maven repository (you can also let the tests run by omitting the -DskipTests option, but this currently takes quite long because the tests do not yet use mocks):
Code Block
S4:incubator-s4$ ./gradlew install -DskipTests
.... verbose logs ...
- Build the startup scripts:
Code Block
S4:incubator-s4$ ./gradlew s4-tools:installApp
.... verbose logs ...
:s4-tools:installApp
...
- HelloApp.java: defines a simple application: exposes an input stream ("names"), connected to the HelloPE. See the event dispatch configuration page for more information about how events are dispatched.
Code Block
// App parent class provides integration with the S4 platform
public class HelloApp extends App {

    @Override
    protected void onStart() {
    }

    @Override
    protected void onInit() {
        // That's where we define PEs and streams
        // create a prototype
        HelloPE helloPE = createPE(HelloPE.class);
        // Create a stream that listens to the "names" stream and passes events to the helloPE instance.
        createInputStream("names", new KeyFinder<Event>() {
            // the KeyFinder is used to identify keys
            @Override
            public List<String> get(Event event) {
                return Arrays.asList(new String[] { event.get("name") });
            }
        }, helloPE);
    }

    // skipped remaining methods
}
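To make the role of the KeyFinder more concrete, here is a minimal, self-contained sketch of keyed dispatch. It does not use the S4 API: SimpleEvent, SimpleKeyFinder and dispatch are hypothetical stand-ins written for this illustration, and the sketch assumes (as we understand the S4 model) that events sharing the same key end up at the same PE instance. Each bucket in the returned map plays the role of one keyed instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these types belong to S4.
class KeyedDispatchSketch {

    // Hypothetical minimal event: a bag of string attributes.
    static final class SimpleEvent {
        private final Map<String, String> attributes = new LinkedHashMap<>();

        SimpleEvent put(String name, String value) {
            attributes.put(name, value);
            return this;
        }

        String get(String name) {
            return attributes.get(name);
        }
    }

    // Hypothetical interface with the same shape as the KeyFinder used in HelloApp.
    interface SimpleKeyFinder {
        List<String> get(SimpleEvent event);
    }

    // Groups events by the key extracted by the finder.
    // Assumption: one bucket stands in for one keyed PE instance.
    static Map<String, List<SimpleEvent>> dispatch(List<SimpleEvent> events, SimpleKeyFinder finder) {
        Map<String, List<SimpleEvent>> perKey = new LinkedHashMap<>();
        for (SimpleEvent event : events) {
            String key = String.join("|", finder.get(event));
            perKey.computeIfAbsent(key, k -> new ArrayList<>()).add(event);
        }
        return perKey;
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = Arrays.asList(
                new SimpleEvent().put("name", "alice"),
                new SimpleEvent().put("name", "bob"),
                new SimpleEvent().put("name", "alice"));
        // Same key extraction as in HelloApp: the "name" attribute.
        SimpleKeyFinder byName = event -> Arrays.asList(event.get("name"));
        for (Map.Entry<String, List<SimpleEvent>> entry : dispatch(events, byName).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " event(s)");
        }
    }
}
```

With the three sample events, the two "alice" events land in the same bucket, which is the behaviour the "name" key is meant to give HelloPE.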
...
- Set up the cluster:
- In 2 steps:
- Start a Zookeeper server instance:
Code Block
S4:incubator-s4$ ./s4 zkServer
S4:myApp$ calling referenced s4 script : /Users/S4/tmp/incubator-s4/s4
[main] INFO org.apache.s4.tools.ZKServer - Starting zookeeper server on port [2181]
[main] INFO org.apache.s4.tools.ZKServer - cleaning existing data in [/var/folders/8V/8VdgKWU3HCiy2yV4dzFpDk+++TI/-Tmp-/tmp/zookeeper/data] and [/var/folders/8V/8VdgKWU3HCiy2yV4dzFpDk+++TI/-Tmp-/tmp/zookeeper/log]
- Define a new cluster, for example a cluster named "cluster1" with 2 partitions, whose nodes listen on ports starting from 12000:
Code Block
S4:myApp$ ./s4 newCluster -c=cluster1 -nbTasks=2 -flp=12000
calling referenced s4 script : /Users/S4/tmp/incubator-s4/s4
[main] INFO org.apache.s4.tools.DefineCluster - preparing new cluster [cluster1] with [2] node(s)
[main] INFO org.apache.s4.tools.DefineCluster - New cluster configuration uploaded into zookeeper
- Alternatively, you may combine these two steps into a single one by passing the cluster configuration inline with the zkServer command:
Code Block
S4:incubator-s4$ ./s4 zkServer -clusters=c=cluster1:flp=12000:nbTasks=2
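As a rough illustration of what this cluster definition means, the following sketch splits an inline specification such as c=cluster1:flp=12000:nbTasks=2 into its fields and derives one port per partition. This is not how S4 itself parses the option: ClusterSpecSketch, parse and ports are hypothetical names, and the code assumes that flp is the first listening port and that each partition gets the next consecutive port (which matches the ports 12000 and 12001 visible in the node logs further down).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: not part of S4.
class ClusterSpecSketch {

    // Splits "c=cluster1:flp=12000:nbTasks=2" into {c=cluster1, flp=12000, nbTasks=2}.
    static Map<String, String> parse(String spec) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : spec.split(":")) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + part);
            }
            fields.put(part.substring(0, eq), part.substring(eq + 1));
        }
        return fields;
    }

    // Assumption: partition i listens on firstListeningPort + i.
    static int[] ports(int firstListeningPort, int nbTasks) {
        int[] ports = new int[nbTasks];
        for (int i = 0; i < nbTasks; i++) {
            ports[i] = firstListeningPort + i;
        }
        return ports;
    }

    public static void main(String[] args) {
        Map<String, String> spec = parse("c=cluster1:flp=12000:nbTasks=2");
        int[] ports = ports(Integer.parseInt(spec.get("flp")), Integer.parseInt(spec.get("nbTasks")));
        for (int partition = 0; partition < ports.length; partition++) {
            System.out.println(spec.get("c") + " partition " + partition + " -> port " + ports[partition]);
        }
    }
}
```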
- Start 2 S4 nodes with the default configuration, and attach them to cluster "cluster1":
Code Block
S4:myApp$ ./s4 node -c=cluster1
calling referenced s4 script : /Users/S4/tmp/incubator-s4/s4
15:50:18.996 [main] INFO org.apache.s4.core.Main - Initializing S4 node with :
- comm module class [org.apache.s4.comm.DefaultCommModule]
- comm configuration file [default.s4.comm.properties from classpath]
- core module class [org.apache.s4.core.DefaultCoreModule]
- core configuration file[default.s4.core.properties from classpath]
-extra modules: []
[main] INFO org.apache.s4.core.Main - Starting S4 node. This node will automatically download applications published for the cluster it belongs to
and again (maybe in another shell):
Code Block
S4:myApp$ ./s4 node -c=cluster1
- Build, package and publish the app to cluster1:
- You may do that in a single step (currently, you must use the name of the current project, and you need to specify the gradle build file with a complete path). Note that specifying the app class is optional but avoids issues when the script tries to guess the app class automatically:
Code Block
S4:myApp$ ./s4 deploy -appName=myApp -c=cluster1 -b=`pwd`/build.gradle -a=hello.HelloApp
.... verbose logs for compiling, building the package, and publishing it to Zookeeper...
15:46:16.486 [main] INFO org.apache.s4.tools.Deploy - uploaded application [myApp] to cluster [cluster1], using zookeeper znode [/s4/clusters/cluster1/apps/myApp]
- You may also do that in 2 separate steps:
- Create an s4r archive. The following creates an archive named myApp.s4r (here you may specify an arbitrary name) in build/libs. Again, specifying the app class is optional:
Code Block
./s4 s4r -a=hello.HelloApp -b=`pwd`/build.gradle myApp
- Publish the s4r archive (you may first copy it to a more adequate place). The name of the app is arbitrary:
Code Block
./s4 deploy -s4r=`pwd`/build/libs/myApp.s4r -c=cluster1 -appName=myApp
You can follow this method for a distributed deployment (by copying the s4r to a shared location on a distributed file system).
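If you want to script the two-step deployment, one possible approach is to assemble the two command lines programmatically. The sketch below only builds the argument lists, using the options shown above (-a, -b, -s4r, -c, -appName); TwoStepDeploySketch and its methods are hypothetical helpers, not part of the S4 tools, and the code assumes the archive ends up in build/libs under the project directory, as described above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helper, not part of the S4 tools.
class TwoStepDeploySketch {

    // Step 1: arguments for creating the s4r archive from the gradle build file.
    static List<String> packageCommand(Path projectDir, String appClass, String archiveName) {
        return Arrays.asList(
                "./s4", "s4r",
                "-a=" + appClass,
                "-b=" + projectDir.resolve("build.gradle"),
                archiveName);
    }

    // Step 2: arguments for publishing the archive to a cluster.
    // Assumption: the archive is located in <projectDir>/build/libs/<archiveName>.s4r.
    static List<String> publishCommand(Path projectDir, String archiveName, String cluster, String appName) {
        Path archive = projectDir.resolve("build").resolve("libs").resolve(archiveName + ".s4r");
        return Arrays.asList(
                "./s4", "deploy",
                "-s4r=" + archive,
                "-c=" + cluster,
                "-appName=" + appName);
    }

    public static void main(String[] args) {
        // Equivalent of `pwd` in the shell commands above.
        Path projectDir = Paths.get("").toAbsolutePath();
        System.out.println(String.join(" ", packageCommand(projectDir, "hello.HelloApp", "myApp")));
        System.out.println(String.join(" ", publishCommand(projectDir, "myApp", "cluster1", "myApp")));
    }
}
```

The resulting lists could be handed to java.lang.ProcessBuilder to actually run the commands; that part is left out here because it depends on where the s4 script lives on your machine.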
- S4 nodes will detect the new application, download it, load it and start it. You will get something like:
Code Block
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s.d.DistributedDeploymentManager - Detected new application(s) to deploy {}[myApp]
[ZkClient-EventThread-15-localhost:2181] INFO org.apache.s4.core.Server - Local app deployment: using s4r file name [myApp] as application name
[ZkClient-EventThread-15-localhost:2181] INFO org.apache.s4.core.Server - App class name is: hello.HelloApp
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClusterFromZK - Changing cluster topology to { nbNodes=0,name=unknown,mode=unicast,type=,nodes=[]} from null
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClusterFromZK - Adding topology change listener:org.apache.s4.comm.tcp.TCPEmitter@79b2591c
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s.comm.topology.AssignmentFromZK - New session:87684175268872203; state is : SyncConnected
[ZkClient-EventThread-19-localhost:2181] INFO o.a.s4.comm.topology.ClusterFromZK - Changing cluster topology to { nbNodes=1,name=cluster1,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=myMachine.myNetwork,taskId=Task-0}]} from { nbNodes=0,name=unknown,mode=unicast,type=,nodes=[]}
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s.comm.topology.AssignmentFromZK - Successfully acquired task:Task-1 by myMachine.myNetwork
[ZkClient-EventThread-19-localhost:2181] INFO o.a.s4.comm.topology.ClusterFromZK - Changing cluster topology to { nbNodes=2,name=cluster1,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=myMachine.myNetwork,taskId=Task-0}, {partition=1,port=12001,machineName=myMachine.myNetwork,taskId=Task-1}]} from { nbNodes=1,name=cluster1,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=myMachine.myNetwork,taskId=Task-0}]}
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClustersFromZK - New session:87684175268872205
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClustersFromZK - Detected new stream [names]
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClustersFromZK - New session:87684175268872206
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s4.comm.topology.ClusterFromZK - Changing cluster topology to { nbNodes=2,name=cluster1,mode=unicast,type=,nodes=[{partition=0,port=12000,machineName=myMachine.myNetwork,taskId=Task-0}, {partition=1,port=12001,machineName=myMachine.myNetwork,taskId=Task-1}]} from null
[ZkClient-EventThread-15-localhost:2181] INFO org.apache.s4.core.Server - Loaded application from file /tmp/deploy-test/cluster1/myApp.s4r
[ZkClient-EventThread-15-localhost:2181] INFO o.a.s.d.DistributedDeploymentManager - Successfully installed application myApp
[ZkClient-EventThread-15-localhost:2181] DEBUG o.a.s.c.g.OverloadDispatcherGenerator - Dumping generated overload dispatcher class for PE of class [class hello.HelloPE]
[ZkClient-EventThread-15-localhost:2181] DEBUG o.a.s4.comm.topology.ClustersFromZK - Adding input stream [names] for app [-1] in cluster [cluster1]
[ZkClient-EventThread-15-localhost:2181] INFO org.apache.s4.core.App - Init prototype [hello.HelloPE].
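Since the node output is verbose, it can help to check for the one line that confirms the deployment. The sketch below scans log lines for the "Successfully installed application" message shown above and extracts the application name. DeploymentLogSketch is a hypothetical helper, and the exact wording of the message is assumed to match the sample output on this page; it may differ in other S4 versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: hypothetical helper, not part of S4.
class DeploymentLogSketch {

    // Assumption: the message text matches the sample node output above.
    private static final Pattern INSTALLED =
            Pattern.compile("Successfully installed application (\\S+)");

    // Returns the names of all applications reported as installed, in log order.
    static List<String> installedApps(List<String> logLines) {
        List<String> apps = new ArrayList<>();
        for (String line : logLines) {
            Matcher matcher = INSTALLED.matcher(line);
            if (matcher.find()) {
                apps.add(matcher.group(1));
            }
        }
        return apps;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "[ZkClient-EventThread-15-localhost:2181] INFO org.apache.s4.core.Server - App class name is: hello.HelloApp",
                "[ZkClient-EventThread-15-localhost:2181] INFO o.a.s.d.DistributedDeploymentManager - Successfully installed application myApp");
        System.out.println("Installed: " + installedApps(lines));
    }
}
```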
...
You may also customize the communication and the core layers of S4 by tweaking configuration files and modules.
Last, the javadoc will help you when writing applications.
We hope this will help you start rapidly. Edges are still a bit rough, more aspects need to be documented, and this is not a final version, but it should let you get started. And remember: we're happy to help!