Apache Solr Query Issue with huge data

2024-04-05 Thread prasad bezavada
Dear Team, I'm currently using Solr version 8.11.3 on a machine with 125 GB of physical memory and a 64 GB heap. The collection comprises 4 shards on the same node. Through our Java application (SolrJ), we indexed approximately 8 million records from an RDBMS table into Solr. Pres

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread uyil...@vivaldi.net.INVALID
Hi, Solr usually fills the heap with various caches, so I wouldn't worry much about it consuming 90% of the heap unless I get OutOfMemory errors. Pagination using the rows parameter is intended for cases where the row count is very low and the page number is also small (e.g. rows=10 page=2 etc.). It's problematic

Re: Symlink indexing

2024-04-05 Thread Hendrik Jilderda | ASERVO Software
The current implementation I have uses a Dockerfile. The Dockerfile starts the Solr container by using a custom script which is run within the container. The script does the following: * checks at the base of the archive which directories are there and which need to be indexed. * starts solr

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread Thomas Corthals
Hi Prasad, This is expected with "deep paging": https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html#performance-problems-with-deep-paging Have a look at cursors instead, that should solve your problem: https://solr.apache.org/guide/solr/latest/query-guide/pagination-o
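To make the "deep paging" cost concrete, here is a rough back-of-the-envelope sketch. It assumes the behaviour described in the linked reference guide section: with start/rows paging on a sharded collection, every shard has to produce its top start+rows candidates and the coordinating node has to merge all of them. The function name is made up for illustration.

```python
def docs_merged_by_coordinator(start: int, rows: int, shards: int) -> int:
    """Estimate how many sorted entries the coordinating node must merge.

    Assumption: each shard returns its top (start + rows) entries, so the
    total grows with the page offset, not with the page size.
    """
    per_shard = start + rows
    return per_shard * shards


# 4 shards as in the original question; the offset is a made-up example.
print(docs_merged_by_coordinator(start=1_000_000, rows=10, shards=4))
```

With 4 shards, fetching 10 rows at offset 1,000,000 means merging 4,000,040 entries just to return 10 documents, which is why the cost climbs steadily as the offset increases.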

Re: Atomic updates with CBOR?

2024-04-05 Thread Thomas Corthals
Hi Paul, I think it should. If the ref guide only says "It’s much faster and efficient compared to JSON." it's not unreasonable for users to expect feature parity between CBOR and JSON. Thomas On Thu, 4 Apr 2024 at 22:12, Noble Paul wrote: > No. It's not yet supported. However it can be added >

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread prasad bezavada
Hello Thomas Corthals, Thank you very much for your valuable reply. I am trying to use cursors, but even the first query takes a long time to return results, and on the next query I get a heap memory error in my Java application. On Fri, Apr 5, 2024 at 2:41 PM Thomas Corthals wrote: >

Re: Apache Solr Query Issue with huge data

2024-04-05 Thread Thomas Corthals
Hi Prasad, Have you tried with a smaller page size? Just how many documents you can fetch in one page with the given memory depends on the size of the documents. You'll have to try out what works for you. Regardless of page size a cursor will still be the way to go to page through a large set of
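A minimal sketch of the cursor approach with a modest page size, written in Python with only the standard library rather than SolrJ. The `cursorMark` and `nextCursorMark` names and the requirement that the sort include the uniqueKey field come from the Solr reference guide; the helper names, the `id` uniqueKey, the page size of 500, and the URL in the usage note are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def http_fetch(collection_url: str) -> Callable[[Dict[str, str]], dict]:
    """Build a fetch function that sends params to <collection_url>/select."""
    def fetch(params: Dict[str, str]) -> dict:
        url = collection_url + "/select?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    return fetch


def iter_docs(fetch: Callable[[Dict[str, str]], dict],
              query: str = "*:*",
              rows: int = 500,
              unique_key: str = "id") -> Iterator[dict]:
    """Yield documents one page at a time using cursorMark."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        result = fetch({
            "q": query,
            "rows": str(rows),
            # The sort must include the uniqueKey field for cursors to work.
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from result["response"]["docs"]
        next_cursor = result["nextCursorMark"]
        if next_cursor == cursor:  # same mark returned: nothing left to read
            break
        cursor = next_cursor
```

Usage would look something like `for doc in iter_docs(http_fetch("http://localhost:8983/solr/mycollection")): ...`, where the URL is hypothetical. Because the generator hands over one page at a time, the client never holds more than `rows` documents unless the caller collects them all into a list, which may be what exhausts the heap on the application side.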

Re: Atomic updates with CBOR?

2024-04-05 Thread Ishan Chattopadhyaya
Thanks Thomas. Patches/PRs welcome. On Fri, 5 Apr, 2024, 3:29 pm Thomas Corthals, wrote: > Hi Paul, > > I think it should. If the ref guide only says "It’s much faster and > efficient compared to JSON." it's not unreasonable for users to expect > feature parity between CBOR and JSON. > > Thomas

Date syntax : Filter by month independent of the year

2024-04-05 Thread rajani m
Hi Solr Users, Is there any creative date range syntax that allows filtering by month independent of the year? Such as date_added:[*-01-29T00:00:00Z TO *-04-09T00:00:00Z] The date format in the index is tries/kd-trees, so I'm thinking this could be possible. Appreciate any thoughts. Thank you for your time,

Re: Date syntax : Filter by month independent of the year

2024-04-05 Thread Alexandre Rafalovitch
If you know you are going to search by it, clone the field without storage and preprocess to just leave the months behind. That's like 12 possible values - super efficient for filtering. Or set all years to year 1 in the copy if you are still doing day as well. Regards, Alex. On Fri, 5 Apr 202
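A small sketch of what that index-time preprocessing could look like on the client side before documents are sent to Solr. The field names `date_added_month` and `date_added_noyear` are hypothetical, and the sketch pins the year to 2000 instead of year 1 (an assumption made here so that 29 February stays a valid date, since 2000 is a leap year).

```python
from datetime import datetime, timezone

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
FIXED_YEAR = 2000  # assumption: a leap year, so Feb 29 survives the rewrite


def month_fields(date_added: str) -> dict:
    """Derive year-independent fields from a Solr-style UTC timestamp."""
    parsed = datetime.strptime(date_added, SOLR_DATE_FORMAT).replace(
        tzinfo=timezone.utc
    )
    normalized = parsed.replace(year=FIXED_YEAR)
    return {
        # Only 12 possible values: cheap to filter and facet on.
        "date_added_month": parsed.month,
        # Same month/day/time with the year pinned, for day-level ranges.
        "date_added_noyear": normalized.strftime(SOLR_DATE_FORMAT),
    }


print(month_fields("2023-02-14T10:30:00Z"))
```

Under these assumptions the original range would become an ordinary filter such as `date_added_noyear:[2000-01-29T00:00:00Z TO 2000-04-09T00:00:00Z]`, or `date_added_month:[1 TO 4]` when whole months are enough.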

Re: Date syntax : Filter by month independent of the year

2024-04-05 Thread Walter Underwood
That is what I was going to suggest. Make a month field. —wunder > On Apr 5, 2024, at 8:22 AM, Alexandre Rafalovitch wrote: > > If you know you are going to search by it, clone the field without storage > and preprocess to just leave the months behind. That's like 12 possible > values - super e

Re: Date syntax : Filter by month independent of the year

2024-04-05 Thread rajani m
yeah, makes sense, thank you. On Fri, Apr 5, 2024 at 1:24 PM Walter Underwood wrote: > That is what I was going to suggest. Make a month field. —wunder > > > On Apr 5, 2024, at 8:22 AM, Alexandre Rafalovitch > wrote: > > > > If you know you are going to search by it, clone the field without >

Re: Date syntax : Filter by month independent of the year

2024-04-05 Thread Walter Underwood
This is a great example of a general technique to make Solr fast. Do the parsing and selection at index time to make the query as simple as possible. —wunder > On Apr 5, 2024, at 10:55 AM, rajani m wrote: > > yeah, makes sense, thank you. > > On Fri, Apr 5, 2024 at 1:24 PM Walter Underwood >

Re: Date syntax : Filter by month independent of the year

2024-04-05 Thread uyil...@vivaldi.net.INVALID
Also, if your index is fairly large, invest in an architecture that makes reindexing very easy, as you will probably need to change the schema and reindex multiple times. --ufuk yilmaz From: Walter Underwood Sent: Friday, April 5, 2024 8:59 PM To: users@solr.apach