Content search and applying ACL

2021-03-23 Thread k-jingyang
Hello everyone, I have a use case for my users which I'm having issues implementing. Hoping to find some insights here. We are trying to let our users search for almost any content data that they have, while respecting access control policies. My users are grouped into teams, and policies are app

Control tlog size in solr-6.6 cloud

2021-03-23 Thread Ritvik Sharma
Hi, I am facing an issue where the tlog size in each shard's replica has grown to ~150GB, whereas the actual index is ~40GB. I have also enabled hard commits and passed *commit=true* while indexing the data. Still no luck. Can you help in this regard?

Solr Cloud "Octet " issue

2021-03-23 Thread Ritvik Sharma
It seems that an intermittent error is occurring: *org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://x.x.x.x:8983/solr/solrcollection_shard2_replica2 : Expected mime type application/o

RE: Distributed IDF for Solr using ExactStatsCache issue

2021-03-23 Thread Cameron M VandenBerg
Hi Michael, I have 8 shards (on 8 different nodes) and no replicas with about 500 million documents. Additionally, I have a collection with just 2 shards and no replicas (and significantly fewer documents) where I see the same behavior. I do observe this behavior even when I route the query th

Re: Control tlog size in solr-6.6 cloud

2021-03-23 Thread Bernd Fehling
Hi, without any more info from your system and configs it is impossible to guess what the problem could be. But generally I can say that SolrCloud 6.6 has no problems with it, as I have a cloud with 5 shards and 2 replicas each on 5 nodes and a total of 260 million records. Somewhere between 1 to 3

Re: Control tlog size in solr-6.6 cloud

2021-03-23 Thread Ritvik Sharma
Hi Bernd, Thanks for the reply. The errors I am getting: INFO - 2021-03-23 19:45:32.433; [c:solrcollection s:shard2 r:core_node4 x:solrcollection_shard2_replica2] org.apache.solr.update.processor.DistributedUpdateProcessor; Ignoring commit while not ACTIVE - state: APPLYING_BUFFERED replay: fa

Re: Control tlog size in solr-6.6 cloud

2021-03-23 Thread Brian Lininger
Hi Ritvik, It looks like the replica(s) are in recovery while you are indexing, and you're indexing faster than the replica can replay its tlogs, so it's not able to catch up. I've had this occur in our production Solr clusters (which are also on 6.6) many times; when it happens we have to throttle our ind

Re: Solr Cloud "Octet " issue

2021-03-23 Thread Walter Underwood
This usually means the client is receiving an HTML error message. Look on the server for non-200 responses. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog)
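
The check suggested above can be sketched in Python. This is only an illustration: the helper name `looks_like_html_error` is hypothetical, and it assumes the status code and Content-Type header have already been pulled from server logs or a captured response.

```python
def looks_like_html_error(status_code: int, content_type: str) -> bool:
    """Flag responses that are probably an HTML error page, not a Solr payload.

    Hypothetical helper: a non-200 status or a text/html media type is treated
    as a sign that something returned an error page in place of the expected
    response body.
    """
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return status_code != 200 or media_type == "text/html"
```

Running such a check over the responses logged around the time of the exception should show whether the failures line up with non-200 answers.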

StrField not matching content "equals" way

2021-03-23 Thread Subhajit Das
Hi, I have a field with type string (StrField). The data in it is like “abc xyz”. But when I search for “xyz”, I am getting one document containing “abc xyz”. The result is the same whether I use q or fq. When I filter like field="xyz" it works, but field=xyz doesn’t. Thanks and Regards, Subhajit

Re: Solr Cloud "Octet " issue

2021-03-23 Thread Ritvik Sharma
Hi Walter, thanks for the reply! Actually, these logs are written on the server as well. I am using SolrCloud. This error occurs frequently while I am indexing.

Re: StrField not matching content "equals" way

2021-03-23 Thread Alessandro Benedetti
Hi Subhajit, that's weird: if you use StrField, no text analysis happens. This means that in the inverted index a single token is built for the text "abc xyz". So neither the query field:xyz nor the phrase query field:"xyz" should return such a document. I would recommend running the query

Difference between * and [* TO *]

2021-03-23 Thread Rahul Goswami
Hello, Can someone please tell me what the difference is between q=myfield:* and q=myfield:[* TO *] for an indexed field "myfield"? The available answers online seem more like a best guess than a definitive answer, so I wanted to get my understanding clarified. Also, my understanding is that myfiel

Re: Content search and applying ACL

2021-03-23 Thread Alessandro Benedetti
Hi, it's definitely an interesting question; I have personally worked on ACL designs in the past. It has been a while since I looked at the Lucene/Solr internals of that part, but first of all, I suspect you are going to get a performance boost if you store documents and ACLs in the same

Re: Replication and Score Issue

2021-03-23 Thread Alessandro Benedetti
When calculating the DF (document frequency) component of a BM25 score, Apache Lucene BM25 similarity uses: org.apache.lucene.search.similarities.BM25Similarity#idfExplain(org.apache.lucene.search.CollectionStatistics, org.apache.lucene.search.TermStatistics) *Note that CollectionStatistics.docCou
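
For reference, the IDF that Lucene's BM25Similarity derives from these statistics is log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). A minimal Python sketch of that formula follows; the function name is illustrative, and any numbers passed to it are made up rather than taken from the thread.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF as computed by Lucene's BM25Similarity.

    doc_freq  -- number of documents containing the term
    doc_count -- number of documents with at least one value for the field
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
```

Because the result depends on both statistics, two cores that report different counts for the same term will produce different scores for the same document.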

Re: Difference between * and [* TO *]

2021-03-23 Thread Shawn Heisey
On 3/23/2021 12:51 PM, Rahul Goswami wrote: Hello, Can someone please tell me what is the difference between q=myfield:* vs q=myfield:[* TO *] for an indexed field "myfield". The available answers online seem more like a best guess than a definitive answer, so I wanted to get my understanding cl

Re: Difference between * and [* TO *]

2021-03-23 Thread Houston Putman
So for a lot of fields, starting in 8.5 (SOLR-11746), the two are equivalent and use the exact same, very fast (when possible), query mechanism. If you are using Solr 8.4 or below, refer to Shawn's advice. The difference between the two is documen

RE: StrField not matching content "equals" way

2021-03-23 Thread Subhajit Das
Hi Alessandro, Thanks for the suggestion. The issue is that, without the double quotes, a query like “field: ola hello hi” parses into “field: ola (text:hello text:hi)”. This means that it splits on spaces and applies the field only to the first term, and all the parts are ORed together. Wrapping in d

Re: StrField not matching content "equals" way

2021-03-23 Thread Alexandre Rafalovitch
That's correct behavior for the query parser you are using (default/lucene it seems): https://solr.apache.org/guide/8_8/the-standard-query-parser.html. You could explore other query parsers, e.g. field: https://solr.apache.org/guide/8_8/other-parsers.html#field-query-parser , possibly combined with
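
As a rough sketch of the field query parser suggestion, the whole value can be passed through the `{!field f=...}` local-params syntax so that it is not split on whitespace. The field name `myfield` and the helper `field_parser_query` are assumptions for illustration; only the request parameters are built here and nothing is sent to Solr.

```python
from urllib.parse import urlencode


def field_parser_query(field: str, value: str) -> str:
    """Build a query string that hands the whole value to the field query parser."""
    return "{!field f=%s}%s" % (field, value)


# Hypothetical field name and value; the filter keeps the value as one unit.
params = urlencode({
    "q": "*:*",
    "fq": field_parser_query("myfield", "ola hello hi"),
})
```

The encoded `params` string can then be appended to a collection's `/select` URL.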

RE: StrField not matching content "equals" way

2021-03-23 Thread Subhajit Das
Thanks, Alexandre, for the clarification.

Error creating TLOG collection

2021-03-23 Thread Webster Homer
We use Solr 7.7.2 in a solrcloud consisting of 4 nodes. Each collection has 2 shards and we try to place a replica on each node and don't want to have more than one replica for a collection on the same node. One of our collections has very specific requirements that force us to use TLOG replic

Controlling dynamic field creation on guessed field

2021-03-23 Thread Steven White
Hi Everyone, I have the following block of code in my solrconfig.xml java.lang.String text_en true This is creating a new field like so: I need it to include additional field settings, so that I would have the following: I need to be able to s

Re: Controlling dynamic field creation on guessed field

2021-03-23 Thread Alexandre Rafalovitch
Can you just define a new field type with all the parameters you want (e.g. "text_en_mine") and map to that? Regards, Alex On Tue., Mar. 23, 2021, 7:32 p.m. Steven White, wrote: > Hi Everyone, > > I have the following block of code in my solrconfig.xml > >name="add-schema-fields"> > >

Re: Controlling dynamic field creation on guessed field

2021-03-23 Thread Steven White
Hi Alex, I have already defined my field type, which in this case is called "text_en". Here is what it looks like: .. .. Are you saying I can add to the fieldType properties such as "multiValued" and "stored"? If so, I never knew this and I don

Re: Controlling dynamic field creation on guessed field

2021-03-23 Thread Alexandre Rafalovitch
Hi Steven, Yes, you can define most of the defaults on field types and then - if needed - override them per-field. Your example looks correct. You can find the relevant Ref Guide section at: https://solr.apache.org/guide/8_8/field-type-definitions-and-properties.html#field-default-properties Rega
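
The resolution order described above (defaults on the field type, optionally overridden per field) can be modeled in a few lines of Python. This is a conceptual model only, not Solr's implementation; the property values and the `effective_properties` helper are hypothetical.

```python
def effective_properties(type_defaults: dict, field_overrides: dict) -> dict:
    """Field-level settings win; anything unset falls back to the field type."""
    return {**type_defaults, **field_overrides}


# Hypothetical defaults declared on a "text_en_mine" field type.
text_en_mine = {"multiValued": True, "stored": True}

# A guessed field that declares nothing inherits both defaults.
inherited = effective_properties(text_en_mine, {})

# A field that sets stored=false overrides only that property.
overridden = effective_properties(text_en_mine, {"stored": False})
```

Mapping guessed fields to such a type gives every auto-created field the desired settings without listing them on each field.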