I remember the ZK coordinates (hosts, ports and root) are set as system properties in Solr nodes (please open the admin console to see their names). So it would be just a matter of

System.getProperty(ZK ensemble coordinates|root)

Before going in that direction: I don't know/remember whether there is some Solr-specific ZK class where they can be asked for. If that class exists, it would be the better way; otherwise you can go with the system property approach.

Andrea

On Thu, 29 Aug 2019, 21:32 Arnold Bronley, <[email protected]> wrote:

> @Andrea: I agree with you. Do you know if there is a way to initialize
> CloudSolrClient directly from some information that I get
> from SolrQueryRequest or from AddUpdateCommand object?
>
> @Erick: Thank you for the information about
> StatelessScriptUpdateProcessorFactory.
>
> "In your situation, add this _before_ the update is distributed and instead
> of coreB, ask for collectionB."
>
> Right, but how do I ask for collectionB?
>
> "Next, you want to get the value from “coreB”. Don’t do that, get it from
> _collection_ B."
>
> Right, but how do I get the value from _collection_ B?
>
> On Thu, Aug 29, 2019 at 2:17 PM Erick Erickson <[email protected]>
> wrote:
>
> > Have you looked at using one of the update processors?
> >
> > Consider StatelessScriptUpdateProcessorFactory for instance. You can do
> > anything you’d like to do in a script (Groovy, JavaScript, Python I think,
> > and others). See ./example/files/conf/update-script.js for one example.
> >
> > You put it in your solrconfig file in the update handler, then put the
> > script in your conf directory and push it to ZK and the rest is automagical.
> >
> > There are a bunch of other update processors that you can use that are also
> > pretty much by configuration, but the one I referenced is the one that is
> > the most general-purpose.
> >
> > In your situation, add this _before_ the update is distributed and instead
> > of coreB, ask for collectionB.
> >
> > Distributed updates go like this:
> > 1. the doc gets routed to a leader for a shard
> > 2. the doc gets forwarded to each replica.
> >
> > Now, depending on where you put the update processor (and you’ll have to
> > dig a bit. Much of this distribution logic is implicit, but you can
> > explicitly define it in solrconfig.xml), this either happens _before_ the
> > docs are sent to the rest of the replicas or _after_ the docs arrive at
> > each replica. From what you’ve described, you want to do this before
> > distribution so all copies have the new field. You don’t care what replica
> > is the leader. You don’t care how many other replicas exist or where they
> > are. You don’t even care if there’s any replica hosting this particular
> > collection on the node that does this, it happens before distribution.
> >
> > Next, you want to get the value from “coreB”. Don’t do that, get it from
> > _collection_ B. Since you have the doc ID (presumably the <uniqueKey>),
> > using get-by-id instead of a standard query will be very efficient. I can
> > imagine under very heavy load this might introduce too much overhead, but
> > it’s where I’d start.
> >
> > Best,
> > Erick
> >
> > > On Aug 29, 2019, at 1:45 PM, Arnold Bronley <[email protected]>
> > > wrote:
> > >
> > > I can't use CloudSolrClient because I need to intercept the incoming
> > > indexing request and then add one more field to it. All this happens on
> > > the Solr side and not the client side.
> > >
> > > On Thu, Aug 29, 2019 at 1:05 PM Andrea Gazzarini <[email protected]>
> > > wrote:
> > >
> > >> Hi Arnold,
> > >> why don't you use solrj (in this case a CloudSolrClient) instead of
> > >> dealing with such low-level details? The actual location of the document
> > >> you are looking for would be completely abstracted.
> > >>
> > >> Best,
> > >> Andrea
> > >>
> > >> On Thu, 29 Aug 2019, 18:50 Arnold Bronley, <[email protected]>
> > >> wrote:
> > >>
> > >>> So, here is the problem that I am trying to solve. I am moving from Solr
> > >>> master-slave architecture to SolrCloud architecture.
> > >>> I have one custom Solr plugin that does the following:
> > >>>
> > >>> 1. When a document (say a document with unique id doc1) is getting indexed
> > >>> to a core, say core A, this plugin adds one more field to the indexing
> > >>> request. It fetches this new field from core B. Core B in our case
> > >>> maintains a popularity score field for each document, which gets
> > >>> calculated in a different project. It fetches the popularity score from
> > >>> core B for doc1 and adds it to the indexing request.
> > >>> 2. In the following code, dataInfo.dataSource is the name of core B.
> > >>>
> > >>> I can use the name of core B like collection_shard1_replica_n21 and
> > >>> it works. But it is not a good solution. What if I had multiple shards
> > >>> for core B? In that case the doc1 that I am trying to find might not be
> > >>> present in collection_shard1_replica_n21.
> > >>>
> > >>> So is there something like,
> > >>>
> > >>> SolrCollection dataCollection = getCollection(dataInfo.dataSource);
> > >>>
> > >>> @Override
> > >>> public void processAdd(AddUpdateCommand cmd) throws IOException {
> > >>>     SolrInputDocument doc = cmd.getSolrInputDocument();
> > >>>     String uniqueId = getUniqueId(doc);
> > >>>
> > >>>     // Look up core B by its core name on this node
> > >>>     SolrCore dataCore =
> > >>>         req.getCore().getCoreContainer().getCore(dataInfo.dataSource);
> > >>>
> > >>>     if (dataCore == null) {
> > >>>         LOG.error("Solr core '{}' to use as data source could not be found! "
> > >>>             + "Please check if it is loaded.", dataInfo.dataSource);
> > >>>     } else {
> > >>>         // Copy the extra field(s) from the matching document in core B
> > >>>         Document sourceDoc = getSourceDocument(dataCore, uniqueId);
> > >>>
> > >>>         if (sourceDoc != null) {
> > >>>             populateDocToBeAddedFromSourceDoc(doc, sourceDoc);
> > >>>         }
> > >>>     }
> > >>>
> > >>>     // pass it up the chain
> > >>>     super.processAdd(cmd);
> > >>> }
> > >>>
> > >>> On Wed, Aug 28, 2019 at 6:15 PM Erick Erickson <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> No, you cannot just use the collection name. Replicas are just cores.
> > >>>> You can host many replicas of a single collection on a single Solr node
> > >>>> in a single CoreContainer (there’s only one per Solr JVM). If you just
> > >>>> specified a collection name how would the code have any clue which
> > >>>> of the possibilities to return?
> > >>>>
> > >>>> The name is in the form collection_shard1_replica_n21
> > >>>>
> > >>>> How do you know where the doc you’re working on lives? Put the ID through
> > >>>> the hashing mechanism.
> > >>>>
> > >>>> This isn’t the same at all if you’re running stand-alone, then there’s
> > >>>> only one name.
> > >>>>
> > >>>> But as I indicated above, your ask for just using the collection name
> > >>>> isn’t going to work by definition.
> > >>>>
> > >>>> So perhaps this is an XY problem. You’re asking about getCore, which is
> > >>>> a very specific, low-level concept. What are you trying to do at a
> > >>>> higher level? Why do you think you need to get a core? What do you want
> > >>>> to _do_ with the doc that you need the core it resides in?
> > >>>>
> > >>>> Best,
> > >>>> Erick
> > >>>>
> > >>>>> On Aug 28, 2019, at 5:28 PM, Arnold Bronley <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>> Wait, would I need to use core name like collection1_shard1_replica_n4
> > >>>>> etc.? Can't I use collection name?
> > >>>>> What if I have multiple shards, how would I know where the document
> > >>>>> that I am working with currently lives?
> > >>>>> I would prefer to use the collection name and expect the core
> > >>>>> information to be abstracted out that way.
> > >>>>>
> > >>>>> On Wed, Aug 28, 2019 at 5:13 PM Erick Erickson <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hmmm, should work. What is your core_name? There are strings like
> > >>>>>> collection1_shard1_replica_n4 and core_node6. Are you sure you’re
> > >>>>>> using the right one?
> > >>>>>>
> > >>>>>>> On Aug 28, 2019, at 3:56 PM, Arnold Bronley <[email protected]>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>> In a custom Solr plugin code,
> > >>>>>>> req.getCore().getCoreContainer().getCore(core_name) is returning null
> > >>>>>>> even if core by name core_name is loaded and up in Solr. req is object
> > >>>>>>> of SolrQueryRequest class. I am using Solr 8.2.0 in SolrCloud mode.
> > >>>>>>>
> > >>>>>>> Any ideas on why this might be the case?
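
For reference, the system-property approach mentioned at the top of this message could look roughly like the sketch below. It is only an illustration under assumptions: the class name ZkCoordinates is made up for this example, and the property name is passed in by the caller because the thread does not state it ("zkHost" is a guess to verify in the admin console). The only thing the code relies on is the standard ZooKeeper connect-string layout, a comma-separated host:port list optionally followed by a /chroot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: the two parts of a ZooKeeper connect string. */
final class ZkCoordinates {
    final List<String> hosts; // e.g. ["zk1:2181", "zk2:2181"]
    final String chroot;      // e.g. "/solr", or "" when no root is given

    private ZkCoordinates(List<String> hosts, String chroot) {
        this.hosts = hosts;
        this.chroot = chroot;
    }

    /** Splits "host1:2181,host2:2181/root" into the host list and "/root". */
    static ZkCoordinates parse(String connectString) {
        if (connectString == null || connectString.trim().isEmpty()) {
            throw new IllegalArgumentException("empty ZooKeeper connect string");
        }
        String s = connectString.trim();
        // Everything from the first '/' onwards is the chroot (the ZK "root").
        int slash = s.indexOf('/');
        String hostPart = slash < 0 ? s : s.substring(0, slash);
        String chroot = slash < 0 ? "" : s.substring(slash);

        List<String> hosts = new ArrayList<>();
        for (String h : hostPart.split(",")) {
            if (!h.trim().isEmpty()) {
                hosts.add(h.trim());
            }
        }
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts in: " + connectString);
        }
        return new ZkCoordinates(Collections.unmodifiableList(hosts), chroot);
    }

    /**
     * Reads the connect string from a JVM system property.
     * ASSUMPTION: the caller knows the property name (possibly "zkHost");
     * check the admin console for the name your nodes actually use.
     */
    static ZkCoordinates fromSystemProperty(String propertyName) {
        String value = System.getProperty(propertyName);
        if (value == null) {
            throw new IllegalStateException("system property not set: " + propertyName);
        }
        return parse(value);
    }
}
```

If SolrJ is on the plugin's classpath, the host list and root obtained this way could presumably be passed to a CloudSolrClient.Builder (with an empty Optional when there is no root), and the popularity score fetched from the collection with a get-by-id call, as Erick suggested. Creating that client once and reusing it, rather than per document, is probably the safer design.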
