Setting Up an External ZooKeeper Ensemble

Although Solr comes bundled with Apache ZooKeeper, you should consider yourself discouraged from using this internal ZooKeeper in production, because shutting down a redundant Solr instance will also shut down its ZooKeeper server, which might not be quite so redundant. Because a ZooKeeper ensemble must have a quorum of more than half its servers running at any given time, this can be a problem.

The solution to this problem is to set up an external ZooKeeper ensemble. Fortunately, while this process can seem intimidating due to the number of powerful options, setting up a simple ensemble is actually quite straightforward, as described below.

How Many ZooKeepers?

"For a ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines . Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can only handle two failures since three machines is not a majority.

For this reason, ZooKeeper deployments are usually made up of an odd number of machines."

-- ZooKeeper Administrator's Guide.

 

When planning how many ZooKeeper nodes to configure, keep in mind that the main principle for a ZooKeeper ensemble is maintaining a majority of servers to serve requests. This majority is also called a quorum. It is generally recommended to have an odd number of ZooKeeper servers in your ensemble, so a majority is maintained. For example, if you only have two ZooKeeper nodes and one goes down, 50% of available servers is not a majority, so ZooKeeper will no longer serve requests. However, if you have three ZooKeeper nodes and one goes down, 66% of your servers are still available, and ZooKeeper will continue normally while you repair the one down node. If you have 5 nodes, you could continue operating with two down nodes if necessary. More information on ZooKeeper clusters is available from the ZooKeeper documentation at http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_zkMulitServerSetup.

Download Apache ZooKeeper

The first step in setting up Apache ZooKeeper is, of course, to download the software. It's available from http://zookeeper.apache.org/releases.html.

When using stand-alone ZooKeeper, you need to take care to keep your version of ZooKeeper updated with the latest version distributed with Solr. Since you are using it as a stand-alone application, it does not get upgraded when you upgrade Solr.

Solr currently uses Apache ZooKeeper v3.4.6.

Setting Up a Single ZooKeeper

Create the instance

Creating the instance is a simple matter of extracting the files into a specific target directory. The actual directory itself doesn't matter, as long as you know where it is, and where you'd like to have ZooKeeper store its internal data.

Configure the instance

The next step is to configure your ZooKeeper instance. To do that, create the following file: <ZOOKEEPER_HOME>/conf/zoo.cfg. To this file, add the following information:
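For example, a minimal configuration might look like the following (the dataDir value shown here is just an example; point it at any empty directory you like):

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181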

The parameters are as follows:

tickTime: Part of what ZooKeeper does is to determine which servers are up and running at any given time, and the minimum session timeout is defined as two "ticks". The tickTime parameter specifies, in milliseconds, how long each tick should be.

dataDir: This is the directory in which ZooKeeper will store data about the cluster. This directory should start out empty.

clientPort: This is the port on which Solr will access ZooKeeper.

Once this file is in place, you're ready to start the ZooKeeper instance.

Run the instance

To run the instance, you can simply use the <ZOOKEEPER_HOME>/bin/zkServer.sh script provided, as with this command: zkServer.sh start
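For example, run from <ZOOKEEPER_HOME>:

bin/zkServer.sh start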

Again, ZooKeeper provides a great deal of power through additional configurations, but delving into them is beyond the scope of this tutorial. For more information, see the ZooKeeper Getting Started page. For this example, however, the defaults are fine.

Point Solr at the instance

Pointing Solr at the ZooKeeper instance you've created is a simple matter of using the -z parameter when using the bin/solr script. For example, in order to point the Solr instance to the ZooKeeper you've started on port 2181, this is what you'd need to do:

Starting the cloud example with ZooKeeper already running at port 2181 (with all other defaults):
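A command along these lines should work (this assumes ZooKeeper is listening on localhost:2181; adjust the host and port to match your instance):

bin/solr start -e cloud -z localhost:2181 -noprompt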

Add a node pointing to an existing ZooKeeper at port 2181:
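For example (the Solr home path and port shown here are only placeholders; substitute values appropriate for your new node):

bin/solr start -cloud -s <path to solr home for new node> -p 8987 -z localhost:2181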

NOTE: When you are not using an example to start Solr, make sure you upload your configuration set to ZooKeeper before creating the collection.
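One way to do this is with the bin/solr zk upconfig command; the configuration name "myconfig" and the local path below are just examples:

bin/solr zk upconfig -z localhost:2181 -n myconfig -d /path/to/configset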

Shut down ZooKeeper

To shut down ZooKeeper, use the zkServer.sh script with the "stop" command: zkServer.sh stop.

Setting up a ZooKeeper Ensemble

With an external ZooKeeper ensemble, you need to set things up just a little more carefully as compared to the Getting Started example.

The difference is that rather than simply starting up the servers, you need to configure them to know about and talk to each other first. So your original zoo.cfg file might look like this:
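For example, for three ZooKeeper instances running on a single machine for testing, a zoo.cfg along these lines would work (the dataDir path is only an example):

tickTime=2000
dataDir=/var/lib/zookeeperdata/1
clientPort=2181
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890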

Here you see three new parameters:

initLimit: Amount of time, in ticks, to allow followers to connect and sync to a leader. In this case, you have 5 ticks, each of which is 2000 milliseconds long, so the server will wait as long as 10 seconds to connect and sync with the leader.

syncLimit: Amount of time, in ticks, to allow followers to sync with ZooKeeper. If followers fall too far behind a leader, they will be dropped.

server.X: These are the IDs and locations of all servers in the ensemble, and the ports on which they communicate with each other. The server ID must additionally be stored in a file named myid, located in the dataDir of each ZooKeeper instance. The ID identifies each server, so for this first instance you would create the file /var/lib/zookeeperdata/1/myid with the content "1".

Now, whereas with Solr you need to create entirely new directories to run multiple instances, all you need for a new ZooKeeper instance, even if it's on the same machine for testing purposes, is a new configuration file. To complete the example you'll create two more configuration files.

The <ZOOKEEPER_HOME>/conf/zoo2.cfg file should have the content:
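It could look something like the following; only dataDir and clientPort differ from zoo.cfg, since these test instances share one machine:

tickTime=2000
dataDir=/var/lib/zookeeperdata/2
clientPort=2182
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890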

You'll also need to create <ZOOKEEPER_HOME>/conf/zoo3.cfg:
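Again, only dataDir and clientPort change:

tickTime=2000
dataDir=/var/lib/zookeeperdata/3
clientPort=2183
initLimit=5
syncLimit=2
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890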

Finally, create your myid files in each of the dataDir directories so that each server knows which instance it is. The id in the myid file on each machine must match the "server.X" definition. So, the ZooKeeper instance (or machine) named "server.1" in the above example must have a myid file containing the value "1". The myid file can contain any integer between 1 and 255, and must match the server IDs assigned in the zoo.cfg file.
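For example, on a Unix-like system you could create the file for the first instance like this (the path matches the example dataDir used above):

echo "1" > /var/lib/zookeeperdata/1/myid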

To start the servers, explicitly reference the appropriate configuration file for each instance:
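For example, using the three configuration files described above:

cd <ZOOKEEPER_HOME>
bin/zkServer.sh start zoo.cfg
bin/zkServer.sh start zoo2.cfg
bin/zkServer.sh start zoo3.cfg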

Once these servers are running, you can reference them from Solr just as you did before:
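For example, listing all three client ports in the -z connection string:

bin/solr start -e cloud -z localhost:2181,localhost:2182,localhost:2183 -noprompt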

Securing the ZooKeeper connection

You may also want to secure the communication between ZooKeeper and Solr.

To set up ACL protection of znodes, see ZooKeeper Access Control.

For more information on getting the most power from your ZooKeeper installation, check out the ZooKeeper Administrator's Guide.

 


25 Comments

  1. I suggest specifying the part related to creating the "myid" files more clearly, because it is not really clear ...
    I suggest (from a ZooKeeper guide):

    "create a file named myid, one for each server, which resides in that server's data directory, as specified by the configuration file parameter dataDir.
    The myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255."

    1. Thanks Alessandro.

      I didn't copy the text exactly from the ZK guide, but I made some edits that I hope make it more clear what the file is and when it is required.

      I also found a number of other problematic formatting and verbiage issues, so thanks so much for pointing out the problem.

  2. Must Solr 4.8 and higher use Apache ZooKeeper v3.4.6? What about v3.4.3?

  3. The initLimit and syncLimit explanations are different from the ZK docs - we should just copy it over from there: http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html

    1. Changed that to be what the ZooKeeper 3.4.6 docs say.

  4. I'm looking for standard / recommended way to setup SolrCloud in production environment with external zookeepers using the bin/solr script.

    I have consulted the references (and the bin/solr script), but the bootstrapping step is still a bit unclear to me ... any help will be appreciated.

    My understanding is below. I have built branch_5x (r1650706) for trial.

    With "-e cloud" option, solr.solr.home is forced to set to example/cloud/nodeX and -s option is ignored. If I would not want to use example/cloud/nodeX directories in production, I will have to remove the initial nodes by Collections API after uploading config sets and creating collections.

    Or I must not to use bin/solr script in such cases for bootstrapping (maybe I will upload config sets via zkcli.sh to the zookeeper cluster before setup solr instances.)

     Is my understanding correct, or I've missed the important point?

  5. I started with embedded ZooKeeper as per Getting Started with SolrCloud. Can I run the same cluster, without losing anything, using an external ZooKeeper?

    I tried to run the same cluster nodes by passing the external ZooKeeper address as described above, but it's not working. The following exceptions appear on ZooKeeper:

    2015-03-16 11:44:23,779 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:58513
    2015-03-16 11:44:23,787 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /127.0.0.1:58513
    2015-03-16 11:44:23,790 [myid:] - INFO  [SyncThread:0:FileTxnLog@199] - Creating new log file: log.1
    2015-03-16 11:44:23,818 [myid:] - INFO  [SyncThread:0:ZooKeeperServer@617] - Established session 0x14c237868450000 with negotiated timeout 15000 for client /127.0.0.1:58513
    2015-03-16 11:44:23,898 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x7 zxid:0x4 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:23,931 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0xd zxid:0x6 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:23,963 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x13 zxid:0x8 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:23,996 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x19 zxid:0xa txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,053 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:delete cxid:0x25 zxid:0xd txntype:-1 reqpath:n/a Error Path:/live_nodes/10.15.0.160:7574_solr Error:KeeperErrorCode = NoNode for /live_nodes/10.15.0.160:7574_solr
    2015-03-16 11:44:24,155 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:delete cxid:0x39 zxid:0x13 txntype:-1 reqpath:n/a Error Path:/overseer_elect/leader Error:KeeperErrorCode = NoNode for /overseer_elect/leader
    2015-03-16 11:44:24,186 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x3e zxid:0x15 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,201 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x3f zxid:0x16 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,218 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x41 zxid:0x17 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,251 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x47 zxid:0x19 txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,267 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x49 zxid:0x1a txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,284 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x4b zxid:0x1b txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,312 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x4e zxid:0x1c txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,329 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x50 zxid:0x1d txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,344 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x52 zxid:0x1e txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer
    2015-03-16 11:44:24,362 [myid:] - INFO  [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException when processing sessionid:0x14c237868450000 type:create cxid:0x54 zxid:0x1f txntype:-1 reqpath:n/a Error Path:/overseer Error:KeeperErrorCode = NodeExists for /overseer

    while on Solr admin you see the errors as

    • gettingstarted_shard2_replica2: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection gettingstarted found:null
    • gettingstarted_shard1_replica2: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection gettingstarted found:null
    1. Please use the solr-user mailing list for help requests with Solr. These comments are meant for discussing the documentation; your comment has more to do with the usage of Solr itself.

      http://lucene.apache.org/solr/resources.html#community

      You could also try the IRC channel, if you are familiar with that as a communications method.

      http://wiki.apache.org/solr/IRCChannels

       

  6. I'd suggest including a paragraph about how many ZooKeeper nodes are needed in a SolrCloud cluster environment, especially about failure tolerance, e.g. we need 2k+1 ZooKeeper nodes to tolerate k ZooKeeper nodes failing.

    1. I added a paragraph to explain this. I thought it already existed somewhere, but apparently it didn't.

  7. How do I run it on a group of machines? Please explain the configuration step by step, and also the commands I need to run on the different machines.

     

    I set up ZooKeeper on 3 machines and installed Solr there. ZooKeeper is up on all machines with different ports. What do I need to do next?

    1. an entire section on this page is called "Setting up a ZooKeeper Ensemble" and specifically addresses the basics of running zookeeper on a 3 node ensemble - with links to the Apache ZooKeeper documentation for more details.

      • If you have concerns about the wording of this documentation, and have suggestions to make related to improving it, please post the specifics in a comment here.
      • If you are having difficulties with your setup, please email the solr-user list with specific details regarding your setup and the problems/errors you are having (but don't post them as a comment here – these comments are for discussing the documentation itself)
  8. It is generally recommended to have an odd number of ZooKeeper servers in your ensemble, so a majority is maintained. For example, if you only have two ZooKeeper nodes and one goes down, 50% of available servers is not a majority, so ZooKeeper will no longer serve requests. However, if you have three ZooKeeper nodes and one goes down, you have 66% of available servers available, and ZooKeeper will continue normally while you repair the one down node. If you have 5 nodes, you could continue operating with two down nodes if necessary.

    We have the above, which is essentially what the ZK docs have. However, the example doesn't clearly show why having "odd number of ZK servers" is a good idea. I suggest we have something like the following:

    It is generally recommended to have an odd number of ZooKeeper servers in your ensemble, so a majority is maintained. For example, if you only have two ZooKeeper nodes and one goes down, 50% of available servers is not a majority, so ZooKeeper will no longer serve requests. However, if you have three ZooKeeper nodes and one goes down, you have 66% of available servers available, and ZooKeeper will continue normally while you repair the one down node. If you have 5 nodes, you could continue operating with two down nodes if necessary. This means that, while setting up the ensemble, adding an additional node to an odd-numbered ensemble will not increase its failure tolerance (e.g. a three node ensemble and a four node ensemble can each tolerate only one failed node). Also, one downside of an even-numbered ensemble is that during half split brain partitions (a network failure situation where half of the nodes are unable to reach the other half of the nodes), neither partition has a majority and the entire ZooKeeper ensemble is unavailable.

    1. ishan: i understand your point, but i feel like that paragraph was already way too long, and we should try to keep the info as succinct as possible to maximize the impact of the advice.

      fortunately there was already a great quote from the ZK admin guide to really drive home the 2xF+1 needed to survive failure of F. so i just used that, cited it as a quote, and put it in an info box to really draw attention to it.

      it still doesn't mention split brain, but honestly i'm not sure that trying to explain split brain in this context really adds anything to the key takeaway, it just seems like a distraction from the main point of the page: "IF YOU ARE SETTING UP ZK, DO THIS STUFF"

      Maybe instead there should be a sub-section in Read and Write Side Fault Tolerance that talks about network partitions and includes what happens in zk ensemble partitions?

      1. Great find, Hoss. +1 to keeping it as succinct as possible. Yes, I agree that split brain partitions can be discussed in that page. What we have now makes good sense and is easy to understand, and is much better than what was there originally.

  9. Why are there 2 ports in the server.X line?

    1. This question would be better answered on the zookeeper mailing list, but I found some information.

      http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_configuration

      Quoting from that page: "There are two port numbers nnnnn. The first followers use to connect to the leader, and the second is for leader election. The leader election port is only necessary if electionAlg is 1, 2, or 3 (default). If electionAlg is 0, then the second port is not necessary. If you want to test multiple servers on a single machine, then different ports can be used for each server."

      This information has nothing at all to do with Solr – ZooKeeper does elections for its own internal operation.

  10. I used the example zoo.cfg here, but when I start the 1st node with zoo1.cfg, it returns "Cannot open channel 2 to election addres .... blah blah blah". What's wrong?

    1. The zoo.cfg needs to be identical between all zookeeper servers.

      If you only start one server, then it will not be able to contact any other servers for the leader election, and zookeeper will not be able to establish quorum or start properly.

      The election ports mentioned on server.N lines in zoo.cfg must be accessible from all zookeeper servers – not firewalled.

  11. Hi

    Can we set up SolrCloud on multiple servers?

    Thanks

  12. Hi,

    The single ZooKeeper (zoo.cfg) configuration works like a charm:

    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181

     

     

    But when I try to configure 3 files, e.g. zoo.cfg, zoo2.cfg, zoo3.cfg, with:

    zoo.cfg

    dataDir=/var/lib/zookeeperdata/1
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=localhost:2888:3888
    server.2=localhost:2889:3889
    server.3=localhost:2890:3890

    zoo2.cfg

    tickTime=2000
    dataDir=c:/sw/zookeeperdata/2
    clientPort=2182
    initLimit=5
    syncLimit=2
    server.1=localhost:2888:3888
    server.2=localhost:2889:3889
    server.3=localhost:2890:3890


    zoo3.cfg

    tickTime=2000
    dataDir=c:/sw/zookeeperdata/3
    clientPort=2183
    initLimit=5
    syncLimit=2
    server.1=localhost:2888:3888
    server.2=localhost:2889:3889
    server.3=localhost:2890:3890

     

     

    I'm getting this error:

    sudo bin/zkServer.sh start zoo.cfg

    /usr/bin/java
    ZooKeeper JMX enabled by default
    Using config: /usr/local/bin/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
    Starting zookeeper ... FAILED TO START

     

     

     

     

    1. I'd suggest taking the question to the Solr-user mailing list, we can't answer support questions in the comments section of the documentation. Information on how to join the mailing list is available from: http://lucene.apache.org/solr/resources.html#community.

    1. Thanks Furkan, I've fixed.