Apache Solr Documentation

*** As of June 2017, the latest Solr Ref Guide is located at https://lucene.apache.org/solr/guide ***

This Page Has Been Removed From the Guide

This page was deemed more harmful than helpful, and it was decided to archive it and remove it from the ref guide proper, rather than try to fix the content – most of the details covered here are more accurately explained in other existing pages.

Please see How SolrCloud Works instead.

 

Nodes and Cores

In SolrCloud, a node is a Java Virtual Machine instance running Solr, commonly called a server. Each Solr core can also be considered a node. Any node can contain both an instance of Solr and various kinds of data.

A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple "cores", which are kept separate from each other based on local criteria. It might be that they provide different search interfaces to users (customers in the US and customers in Canada, for example), or they have different security concerns (some users cannot have access to some documents), or the documents are so different that they just won't mix well in the same index (a shoe database and a DVD database).

When you start a new core in SolrCloud mode, it registers itself with ZooKeeper. This involves creating an ephemeral node that will go away if the Solr instance goes down, as well as registering information about the core and how to contact it (such as the base Solr URL, core name, etc.). Smart clients and nodes in the cluster can use this information to determine which nodes they need to talk to in order to fulfill a request.
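
A quick way to see which nodes are currently registered is to look at the /live_nodes path in ZooKeeper. This is a minimal sketch using ZooKeeper's own zkCli.sh client, assuming a ZooKeeper instance at localhost:2181 (the address is a placeholder; the entry names will vary with your hosts and ports):

bin/zkCli.sh -server localhost:2181
ls /live_nodes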

New Solr cores may also be created and associated with a collection via CoreAdmin. Additional cloud-related parameters are discussed in the Parameter Reference page. Parameters used for the CREATE action are:

  • collection: the name of the collection to which this core belongs. Default is the name of the core.
  • shard: the shard ID this core represents. (Optional: normally you want to be auto-assigned a shard ID.)
  • collection.<param>=<value>: causes a property of <param>=<value> to be set if a new collection is being created. For example, use collection.configName=<configname> to point to the config for a new collection.

For example:

curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&collection=my_collection&shard=shard2'
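
If the target collection does not exist yet, the collection.configName parameter from the list above can be added to point the new collection at a configuration already uploaded to ZooKeeper. A hedged variation of the same request (the configset name myconf is just a placeholder):

curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&collection=my_collection&shard=shard1&collection.configName=myconf'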

Clusters

A cluster is a set of Solr nodes managed by ZooKeeper as a single unit. When you have a cluster, you can always make requests to the cluster, and if a request is acknowledged, you can be sure it will be managed as a unit and be durable; i.e., you won't lose data. Updates can be seen right after they are made, and the cluster can be expanded or contracted.

Creating a Cluster

A cluster is created as soon as you have more than one Solr instance registered with ZooKeeper. The section Getting Started with SolrCloud reviews how to set up a simple cluster.

Resizing a Cluster

Clusters contain a settable number of shards. You set the number of shards for a new cluster by passing a system property, numShards, when you start up Solr. The numShards parameter must be passed on the first startup of any Solr node, and is used to auto-assign which shard each instance should be part of. Once you have started up more Solr nodes than numShards, the nodes will create replicas for each shard, distributing them evenly across the nodes, as long as they all belong to the same collection.
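
As a rough sketch of what that looks like in practice, the first node of a two-shard cluster in the Jetty-based example layout of the Solr 4.x era might be started like this (the ZooKeeper address is an assumption; nodes started afterwards only need the zkHost property):

cd example
java -DzkHost=localhost:2181 -DnumShards=2 -jar start.jar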

To add more cores to your collection, simply start the new core. You can do this at any time and the new core will sync its data with the current replicas in the shard before becoming active.

You can also avoid numShards and manually assign a core a shard ID if you choose.

The number of shards determines how the data in your index is broken up, so you cannot change the number of shards of the index after initially setting up the cluster.

However, you do have the option of breaking your index into multiple shards to start with, even if you are only using a single machine. You can then expand to multiple machines later. To do that, follow these steps (a rough sketch of the corresponding commands appears after the list):

  1. Set up your collection by hosting multiple cores on a single physical machine (or group of machines), one core per shard. Each of these cores will be the leader of its shard.
  2. When you're ready, you can migrate shards onto new machines by starting up a new replica for a given shard on each new machine.
  3. Remove the original core from the first machine. ZooKeeper will promote the new replica to be the leader for that shard.
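
A hedged sketch of steps 2 and 3 using CoreAdmin, assuming the original core for shard1 lives on oldhost and the new machine is newhost (all host names and core names here are placeholders):

# step 2: create a new replica of shard1 on the new machine
curl 'http://newhost:8983/solr/admin/cores?action=CREATE&name=my_collection_shard1_replica2&collection=my_collection&shard=shard1'
# step 3: once the new replica is active, unload the original core on the old machine
curl 'http://oldhost:8983/solr/admin/cores?action=UNLOAD&core=my_collection_shard1_replica1'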

Leaders and Replicas

The concept of a leader is similar to that of a master in traditional Solr replication. The leader is responsible for making sure the replicas contain the same up-to-date information as the leader.

However, with SolrCloud you don't simply have one master and one or more "slaves"; instead, you have likely distributed your search and index traffic across multiple machines. If you have bootstrapped Solr with numShards=2, for example, your indexes are split across both shards. In this case, both shards are considered leaders. If you start more Solr nodes after the initial two, these will be automatically assigned as replicas for the leaders.

Replicas are assigned to shards in the order they are started the first time they join the cluster. This is done in a round-robin manner, unless the new node is manually assigned to a shard with the shardId parameter during startup. This parameter is used as a system property, as in -DshardId=1, the value of which is the ID number of the shard the new node should be attached to.

On subsequent restarts, each node joins the same shard that it was assigned to the first time the node was started (whether that assignment happened manually or automatically). A node that was previously a replica, however, may become the leader if the previously assigned leader is not available.

Consider this example:

  • Node A is started with the bootstrap parameters, pointing to a stand-alone ZooKeeper, with the numShards parameter set to 2.
  • Node B is started and pointed to the stand-alone ZooKeeper.

Nodes A and B are both shards, and have fulfilled the 2 shard slots we defined when we started Node A. If we look in the Solr Admin UI, we'll see that both nodes are considered leaders (indicated with a solid black circle).

  • Node C is started and pointed to the stand-alone ZooKeeper.

Node C will automatically become a replica of Node A because we didn't specify any other shard for it to belong to, and it cannot become a new shard because we only defined two shards and those have both been taken.

  • Node D is started and pointed to the stand-alone ZooKeeper.

Node D will automatically become a replica of Node B, for the same reasons why Node C is a replica of Node A.

Upon restart, suppose that Node C starts before Node A. What happens? Node C will become the leader, while Node A becomes a replica of Node C.


29 Comments

  1. I have a question about "core" – I'm confused.
    I started two Tomcat instances (each running Solr, under a ZooKeeper cluster) in SolrCloud mode. In one, core.properties defines the core with name=test_core and coreNodeName=core_node1; in the other, with name=test_core and coreNodeName=core_node2. What sort of relationship is there between them – nothing, or are they actually both part of the same core "test_one"?

    1. When you're using SolrCloud, you should not try to administer your cloud with CoreAdmin or manually create cores with core.properties unless you fully understand how everything ties together.

      I have no desire to be rude, so I hope you can take this next part as I intended it, which is constructive criticism: Your question about core.properties indicates that you do not fully understand these things.  You should only be creating or modifying your collections using the Collections API until you do have that understanding.

      I actually have no idea what the CoreNodeName feature does or how to use it.  The minimum you should have in a core.properties file when running SolrCloud would be something like this:

      name=test_core
      collection=test_one
      shard=shard1

      A collection created by the Collections API on newer versions may have additional information in core.properties ... but I haven't used a new enough version of Solr in cloud mode myself – on the 4.2.1 version that I'm using, cores are defined in solr.xml and core.properties files don't exist.  I do have newer versions installed that use core.properties, but they are not in cloud mode.

      1. Thanks for the reply, very grateful.
        But it seems like you didn't answer my question; maybe my description was not clear enough, sorry about that.
        In version 3.6.x I defined solr.xml for each Solr core, and every core was physical on disk. Now, in 4.x or 5.x, a core seems to be not only a physical thing but also a logical concept. So I want to know: can we believe that cores in different web containers (with the same name and different coreNodeName) are actually JUST THE SAME CORE?

  2. This page is very confusing, and in places, possibly wrong.

    The last section talks about nodes automatically becoming replicas.  Does this actually happen in situations where the user did not explicitly ask for it to happen?

    This page also talks about starting Solr with -DnumShards, and creating cores with CoreAdmin.  Both of these are practices that I would never recommend for a new SolrCloud user.  I believe that beginning users should use the Collections API and zkCli exclusively, and never start Solr with cloud options beyond zkHost.  That's usually good advice even for seasoned users.

    1. Maybe I have a different opinion about it. For one thing, creating a collection with the Collections API is not convenient; I recommend the Core API, because whatever the collection looks like, it must depend on Solr cores, and with the Core API you can name them in a common way.

      1. The automatic core names that you get when you create a collection named "foo" with the Collections API are foo_shardN_replicaM (with N and M being numbers) ... which I find to be an extremely convenient and logical way to name them.  You can create a collection that has a few hundred cores with one HTTP request ... doing that with the Core Admin would be tedious.

        If everyone uses the Collections API, then it is easier to obtain help for your install, because your cores will have predictable names.  In truth the core names don't matter all that much, but the default does make it a lot easier when you're trying to describe your setup to someone else.
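
        For reference, a minimal sketch of such a one-request collection creation, assuming a configset named myconf has already been uploaded to ZooKeeper (the collection name, shard count, and replicationFactor are placeholders):

        curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=foo&numShards=2&replicationFactor=2&collection.configName=myconf'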

  3. In the SolrCloud world, I think "shard" is a logical name for the stuff (index and fields, etc.) on some node and "core" is the physical name for that stuff. Furthermore, "replica" is just another copy of the core of a specific shard. Am I right? If yes, shouldn't there be a less confusing name for the very first core of a specific shard, rather than just "shard", since the replica will also become a shard some time later? There was once a name "slice" for that very first core, to distinguish it from the later "shard", right? Does anyone know what I am talking about here (I almost bit my own tongue, ha!)?

    1. Here's the terminology from the user perspective, boiled down to three very short sentences:  Collections are made up of shards.  Shards are made up of replicas.  Each replica is a core.

      "Slice" is a term used in some places in the code for SolrCloud.  As I understand it, it is interchangeable with shard.  We are trying very hard to eliminate "slice" from all user-facing documentation, to prevent confusion, and I think there may even be some efforts to remove it from the code as well.

      1. You said "Shard are made up of replicas" but the very 1st one isn't a "replica". Doesn't replica mean "copied from something"? But the very 1st one is some inital part of the large index that the a collection repesents, it's not a replica itself at all, isn't it? Or I misunderstand (or dig too deep about) the meaning of replica?

        1. If your replicationFactor is 2 when you create your collection, you will have two copies.  They are both called replicas.  Solr will temporarily elect one of them to be the leader for the shard, but this does not change the terminology – they are still replicas.

          If your replicationFactor is 1, then you have one replica.  This is somewhat confusing if you are focused on the fact that replica and replication share a word root, but the terminology is correct in the context of SolrCloud.

  4. Due to consideration of "single point of failure", we might install at least 2 ZooKeepers (one primary and one backup). How can I switch an already-running cluster's ZooKeeper to the backup one? Do I have to stop and restart each of the nodes in the cluster? Furthermore, is there a standby architecture for ZooKeeper, whether hot or cold?

    1. A minimum of three zookeeper nodes are required for redundancy.  A two-node ensemble is actually less stable than even a single node, because the failure of EITHER node means the loss of quorum.  Each machine in a two-node ensemble is a single point of failure.

      The need for three machines is in the zookeeper documentation, but it is a little buried.  See this URL:

      http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_designing

      In the "Cross Machine Requirements" section it says "Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures."

      1. I read about the meaning of quorum and know ZooKeeper itself has a replicated mode. So if I establish a SolrCloud production cluster with an odd number of ZooKeeper nodes, I don't have to worry about the failure of some ZooKeeper node; it will elect another healthy ZooKeeper node to lead the work. Am I right?

        Note: I presume the suggestion of an odd number of ZooKeepers has a similar rationale to that of the RAID 5 architecture.

        1. RAID5 isn't exactly the same as zookeeper quorum.  No matter how many disks there are in a RAID5 array, you can only lose one and still keep your data.  A four-drive array is larger than a three-drive array, so it is worthwhile (and commonplace) to have an even number of disks.

          In Zookeeper, a four-node ensemble has exactly the same ability to handle failure as a three-node ensemble – if one machine fails, you're good, but if two machines fail, quorum is gone.  It costs more for the same redundancy, and there's an extra piece that can fail in that fourth server.  The situation is similar when you compare a six-node ensemble to a five-node ensemble.  Both of them can handle exactly two failures, but the six-node version costs more.

          1. Clear enough! Thanks for your explanation!

  5. I'm seeing a strange behavior. I have configured 2 shards in my SolrCloud. Each shard is associated with 3 nodes.

    Shard 1 - node 1, node 2, node 3; of these, node 1 is the leader and the others are replicas. The same goes for the other shard.

    This mapping shows properly when the nodes start. But after a few days I can see that node 2 became the leader and node 1 & node 3 became replicas.

    Does this change happen automatically? I also checked, and there is no issue with the nodes. All nodes are active and running.

    Will this cause any issue in getting results from the indices? Currently I'm getting different results each time.

    Please suggest.

    1. What probably happened to cause the leader election is that some event took longer than zkClientTimeout, which forced a new leader election, and node 2 won.  This timeout defaults to 15 seconds internally in SolrCloud, and is explicitly set to 30 seconds by recent example configurations.

      The most common reason for events exceeding zkClientTimeout is long garbage collection pauses, but sometimes, especially on Solr 5.x which has good GC tuning in its startup scripts, it is general performance problems.

      https://wiki.apache.org/solr/SolrPerformanceProblems

      The 15 and 30 second timeouts are quite long, so whatever happened is definitely serious.
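
      For what it's worth, the timeout normally lives in the <solrcloud> section of solr.xml; a sketch along the lines of the stock example (the 30000 ms fallback mirrors the 30-second value mentioned above):

      <solrcloud>
        <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
      </solrcloud>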

  6. Dear all,

    How can I know which configuration is used by my SolrCloud collection?

    I know how to upload and link but I also need to know which one is currently used by a collection.

    Many thanks

    1. You can see the current configuration for each core with the Files link under the core in the admin UI.  If you put something in each solrconfig.xml to identify it, you can tell with a glance which configuration is there.  If you haven't reloaded the collection or restarted your Solr instances since uploading it, it's possible that there's an old configuration that's active, but what you see there will definitely be the active configuration on reload.

      1. Hi Shawn,

        thank you for your reply.

        What I mean is that I can check the config and do a comparison, but what I really need to know is the config name stored in ZK that my collection is currently linked to.

        Thanks

         

        1. In the admin UI, go to Cloud->Tree, open collections, and then click on the collection.  The configName linked on that collection will be at the bottom of the right pane.
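
          If you prefer the command line, the same configName is stored in the /collections/<collection> znode; a hedged sketch using the zkcli.sh script that ships with Solr (the script location varies by Solr version, and the ZooKeeper address and collection name are placeholders):

          server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd get /collections/mycollection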

  7. If I send a query to a replica (leader or not), will the query be redirected to a random replica or only the leader node will perform the query?

    1. I've added some intro material on this to Distributed Requests

  8. In general, this page is probably more harmful than helpful – most of the content seems to assume only using bootstrap options to convert older setups, and not using ZK as truth, and it conflates a lot of basic concepts like nodes vs cores.

    I think the best thing to do is just straight up remove this page, and try to consolidate some of the basic concepts explained here that are useful into the parent page ... I'm going to take a stab at that and then move this page out of the ref guide proper.

  9. For sentence "Set up your collection by hosting multiple cores on a single physical machine" in "Resizing Cluster" paragraph, shouldn't it be "multiple shards" rather than "multiple cores"?

    1. This page is no longer in the reference guide.  It was moved to an internal "dead pages" section over a month ago.  See the comment from Hoss preceding your comment, and the top of the page.

      1. Yes! I just saw it. However, "Resizing a cluster" is really a very practical and necessary topic. Will there be a section for this topic in future reference guide?