Thanks again Dave, helpful information!  Yes, we have one slave as primary
- serving queries - and the other for failover.  We are slowly moving to
SolrCloud on Solr 8/9, but a lot of work still goes into maintaining our
Solr 6 deployment, so learning more about this type of replication is
valuable.

Matt

On Tue, Oct 11, 2022 at 12:32 PM Dave <hastings.recurs...@gmail.com> wrote:

> I’ve seen this happen where the slaves behave differently from each other,
> or get their index versions out of whack; usually it happened when the
> latency to the master wasn’t the same from one slave to another. But again,
> that’s why you should have at least double the index size available on your
> slaves. I also wonder whether all of the slaves have the exact same disk
> space and memory, and if one slave is outside of the network, it may be
> worth increasing the replication timeout so it doesn’t have to compete with
> the closer server.
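>
> For reference, here is a rough sketch of the slave-side replication handler
> config where that timeout lives (this assumes Solr 6 legacy master/slave
> replication in solrconfig.xml; the hostname, core name, and values are just
> placeholders):
>
>   <requestHandler name="/replication" class="solr.ReplicationHandler">
>     <lst name="slave">
>       <str name="masterUrl">http://master-host:8983/solr/mycore/replication</str>
>       <str name="pollInterval">00:01:00</str>
>       <!-- connection/read timeouts in ms; raise these for a more distant slave -->
>       <str name="httpConnTimeout">5000</str>
>       <str name="httpReadTimeout">60000</str>
>     </lst>
>   </requestHandler>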
>
> The entire thing is a dance. For example, if the one server was giving
> issues and re-replicating, it could be because the index changed partway
> through, so it had to make another temp index folder and repeat the same
> process. So many entertaining things to watch.
>
> Take care and good luck. Also, another tip from experience: use only one
> slave for queries and have the others as backup, since a single server can
> cache the fields much faster than with round robin or whatever other metric
> is used to determine who serves.
> -Dave
>
>
> > On Oct 11, 2022, at 1:32 PM, mtn search <search...@gmail.com> wrote:
> >
> > Thanks Dave!  Yes, we ran into this issue yesterday and do need to review
> > the disk space we have available (as well as the large size of our cores).
> > Also, there was some interesting context for this event.  We have 2 slaves
> > on separate servers replicating from the master.  One slave replicated fine
> > over the weekend, with only a fraction of the files needing to be updated.
> > However, on the other slave, Solr believed it needed to do a full
> > replication.  Over and over it filled up the disk, failed, appeared to
> > clean up the failed attempt, and tried again.  Yesterday, after a couple of
> > Solr restarts and then a full Solr stop/start, it appears that Solr
> > recognized it did not need to perform a full replication, and it completed
> > successfully by copying over only the subset of index files needed (like
> > the other slave did).
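> >
> > For what it's worth, one way to sanity-check which slave has drifted is to
> > compare what each node reports from the replication handler (the hostnames
> > and core name below are placeholders for our setup):
> >
> >   curl "http://master-host:8983/solr/mycore/replication?command=indexversion"
> >   curl "http://slave1-host:8983/solr/mycore/replication?command=indexversion"
> >   curl "http://slave2-host:8983/solr/mycore/replication?command=details"
> >
> > The indexversion command just returns the index version and generation, and
> > details gives a fuller picture of master/slave state, which at least shows
> > which slave no longer matches the master.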
> >
> > I am not sure how to explain it other than that, for a time, Solr was in a
> > state that required a full replication, and the stop/start forced it to
> > reassess what was actually needed for the replication.  Replication is
> > healthy on both today.
> >
> > Matt
> >
> >> On Mon, Oct 10, 2022 at 1:18 PM Dave <hastings.recurs...@gmail.com>
> wrote:
> >>
> >> Only an optimize or a large segment merge would cause large file deposits
> >> there. That’s why “slaves” should always have double the index size
> >> available: Solr will decide on its own when to merge or optimize on the
> >> master, so the slaves need to be ready for double the size, and the master
> >> needs to be ready for triple the size.  If you don’t have the disk space
> >> ready to handle this, you’re going to eventually run into some serious
> >> issues, or just not be able to replicate.
> >>
> >> -dave
> >>
> >>>> On Oct 10, 2022, at 2:56 PM, mtn search <search...@gmail.com> wrote:
> >>>
> >>> As I go back through
> >>> https://solr.apache.org/guide/6_6/index-replication.html, the picture is
> >>> filling in a little more.  My guess is that the tmp dir referenced is the
> >>> index.<timestamp> dir.
> >>>
> >>> Very interested in cases that might generate a full replication.  To my
> >>> knowledge no optimize command has been issued against the core in
> >>> question.
> >>>
> >>>> On Mon, Oct 10, 2022 at 12:38 PM mtn search <search...@gmail.com>
> >> wrote:
> >>>>
> >>>> Hello,  I am learning more about replication as I maintain a large set
> >>>> of Solr 6 servers configured for master/slave.
> >>>>
> >>>> I noticed that during some replication activities, in addition to the
> >>>> original index dir under the core name on the file system, there is a
> >>>> dir named "index" with a timestamp: index.<timestamp>.  Files are
> >>>> written to this timestamped dir during replication.  I am interested in
> >>>> how this works:
> >>>>
> >>>> For every core replicating from its master, is this timestamped dir
> >>>> created?
> >>>>
> >>>> Or is this timestamped dir created/used only in special circumstances?
> >>>> If so, what are they?
> >>>>
> >>>>     - Are there cases that cause a full replication within Solr 6?
> >>>>
> >>>> Is the original index dir removed and the timestamped dir renamed to
> >>>> "index" after replication?
> >>>>
> >>>> I initially figured all replication activities happened within the index
> >>>> dir, but that does not appear to be the case.
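> >>>>
> >>>> In case it is useful context, one thing that can be watched while a
> >>>> fetch is in progress is the replication handler's details output (host
> >>>> and core name here are placeholders):
> >>>>
> >>>>   curl "http://slave-host:8983/solr/mycore/replication?command=details"
> >>>>
> >>>> While files are being pulled into the new directory, that response
> >>>> should include fetch progress for the slave, though I may be reading it
> >>>> wrong.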
> >>>>
> >>>> Any tips or documentation references would be appreciated.
> >>>>
> >>>> Thanks,
> >>>> Matt
> >>>>
> >>
>
