primitive integer field

2023-06-27 Thread Szűcs Roland
Hi Solr developers,

I would like to have an integer price field in Solr. I need to store it.
Apart from showing it in the search results, the only role of this field
is to serve as a range filter.

My question is which fieldType I should use as a best practice. I have read
that:
"For general numeric needs, consider using one of the IntPointField,
LongPointField, FloatPointField, or DoublePointField classes, depending on
the specific values you expect. These "Dimensional Point" based numeric
classes use specially encoded data structures to support efficient range
queries regardless of the size of the ranges used. Enable DocValues
<https://solr.apache.org/guide/solr/latest/indexing-guide/docvalues.html> on
these fields as needed for sorting and/or faceting."
Based on this, am I correct that I should use IntPointField with
indexed="false" stored="true" docValues="true" for my use case?

Thanks in advance,
Roland

P.S.: It is not clear at all what "Dimensional Point" means for a
scalar value.


Re: primitive integer field

2023-06-27 Thread Ishan Chattopadhyaya
You can use an IntPointField with indexed=false, stored=false,
docValues=true.



Re: primitive integer field

2023-06-27 Thread Ishan Chattopadhyaya
You can also set indexed=true to get range queries backed by BKD
trees.



Solr Cloud Backup Strategy and Data Corruption Prevention

2023-06-27 Thread Saksham Gupta
Hi Solr Developers,
I am reaching out to ask about best practices for implementing a backup
strategy in Solr Cloud. We recently migrated from standalone Solr (6.5)
to Solr 8.10, where we have a collection with data divided among 8 shards
using implicit routing. Until now, we have maintained the standalone Solr
as a backup in case something goes wrong on Solr Cloud (due to data
corruption, deletion, etc.).
However, we now wish to discard the standalone Solr and fully transition to
Solr Cloud. My concern is what would happen if the data in Solr Cloud were
to become corrupted or deleted, necessitating the replacement or reindexing
of the entire dataset, which can be a time-consuming process. We aim to
minimize downtime as much as possible.
I would greatly appreciate any insights or recommendations you could
provide to address this concern.

Thank you in advance.

Best regards,
Saksham


Re: primitive integer field

2023-06-27 Thread Jan Høydahl
You need indexed="true" to enable the dimensional index structure supporting 
range filters. If you do not ever need sorting on the field I suppose you could 
disable docValues.

Jan




Re: primitive integer field

2023-06-27 Thread Ishan Chattopadhyaya
If you disable docValues, then you would need stored=true to return the
values along with the search results.



Re: primitive integer field

2023-06-27 Thread Szűcs Roland
I planned to use only docValues="true" for an IntPointField. Is it not
enough for efficient faceting, range queries and sorting? Do I need
indexed="true" in addition to docValues?

Roland



Re: primitive integer field

2023-06-27 Thread Houston Putman
Sorting and faceting are fastest when they use docValues; range queries
are fastest when they use the index.

In general, I would advise keeping everything (indexed, docValues, stored)
turned on unless you have an explicit reason not to.

- Houston
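
To make the advice above concrete, here is a minimal sketch (not from the
thread) that builds a Schema API "add-field" command with all three flags
enabled, plus a matching range filter. The field name "price" and the
fieldType name "pint" (assumed to be defined with class solr.IntPointField)
are assumptions; adjust them to your schema. The code only builds the payload
and the filter string; it does not contact Solr.

```python
import json


def price_field_definition(indexed=True, doc_values=True, stored=True):
    """Build a Schema API 'add-field' command for an integer price field."""
    return {
        "add-field": {
            "name": "price",          # hypothetical field name
            "type": "pint",           # assumed fieldType using solr.IntPointField
            "indexed": indexed,       # points (BKD) index, used for range queries
            "docValues": doc_values,  # column store, used for sorting and faceting
            "stored": stored,         # stored value returned with search results
        }
    }


def price_range_filter(low, high):
    """Build an inclusive range filter (for the fq parameter) on the price field."""
    return f"price:[{low} TO {high}]"


if __name__ == "__main__":
    print(json.dumps(price_field_definition(), indent=2))
    print(price_range_filter(1000, 5000))
```

With a manually edited schema.xml, the same three attributes (indexed,
docValues, stored) would go on the corresponding field element instead.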



Re: Limiting Backup IO

2023-06-27 Thread David Smiley
Here's a POC: https://github.com/apache/solr/pull/1729

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jun 26, 2023 at 1:53 PM Jason Gerlowski 
wrote:

> Sounds like something that would be very useful for folks.
>
> I'm sure it'd be very dependent on your data and the type of backup,
> but I'm curious - if you can share Pierre - is there a number of
> cores-per-node being backed up where you start to see problems?
>
> Jason
>
> On Wed, Jun 21, 2023 at 8:34 AM Pierre Salagnac
>  wrote:
> >
> > Thanks for starting this thread David.
> >
> > I've been internally working on this, since we have issues (query
> > failures) during backups of big collections because of IO saturation.
> >
> > I see two different approaches to solve this:
> > 1. Throttle at the IO level, like David mentioned.
> > 2. Limit the number of cores we back up concurrently.
> > (These two options are *not* mutually exclusive.)
> >
> > I've been focused on the second option, to limit the number of concurrent
> > backups per node. Currently, the overseer sends shard requests to all
> > shards in a simple 'for' loop. If the collection has one thousand shards,
> > we'll start 1 thousand concurrent backups. The idea is to only send shard
> > level requests up to a certain limit per node, and then each time a shard
> > is complete, we send the next one for this node.
> > If you're interested, I integrated my experiment (for non incremental
> > backups) here:
> >
> > https://github.com/psalagnac/solr/commit/c77c94e9a3c20aee3e45ec1198f00ab9cf0f76c5
> >
> > I don't think backup is the only operation that should be considered.
> > Restore, at least, is another one; I'm not sure whether we have other
> > IO-intensive operations at the collection level. Ideally, we should have
> > something generic and not consider each type of operation individually.
> >
> > Thanks
> >
> >
> > On Tue, 20 Jun 2023 at 09:58, Ishan Chattopadhyaya <
> > ichattopadhy...@gmail.com> wrote:
> >
> > > Might be a good question for users@ list, I guess. I'm sure other
> > > users must've thought about this.
> > > Cross posting there, as I'm curious myself too.
> > >
> > > On Tue, 20 Jun 2023 at 01:07, David Smiley  wrote:
> > >
> > > > Has anyone mitigated the potentially large IO impact of doing a
> > > > backup of a large collection or just in general?  If the collection
> > > > is large enough, there very well could be many shards on one host and
> > > > it could saturate the IO.  I wonder if there should be a rate limit
> > > > mechanism or some other mechanism.
> > > >
> > > > Not the same but I know that at a segment level, the merges are rate
> > > > limited -- ConcurrentMergeScheduler doesn't quite let you set it but
> > > > adjusts itself automatically ("ioThrottle" boolean).
> > > >
> > > > ~ David Smiley
> > > > Apache Lucene/Solr Search Developer
> > > > http://www.linkedin.com/in/davidwsmiley
> > > >
> > >
>
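
The per-node limit Pierre describes in the quoted message can be sketched
roughly as follows. This is not the code from his commit or from the PR; it
is only an illustration in Python that uses one semaphore per node. The names
backup_all, backup_shard and max_per_node are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def backup_all(shards, backup_shard, max_per_node=2):
    """Run backup_shard(node, shard) for every (node, shard) pair.

    At most max_per_node backups run at the same time on any single node.
    Results are returned in the same order as the input pairs.
    """
    # One semaphore per node, created up front in the calling thread.
    limits = {node: threading.Semaphore(max_per_node) for node, _ in shards}

    def run(node, shard):
        # Wait for a free slot on this node; the slot is released when the
        # backup finishes, which lets the next shard of that node start.
        with limits[node]:
            return backup_shard(node, shard)

    # Sketch only: one worker thread per shard, most of them just waiting.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        futures = [pool.submit(run, node, shard) for node, shard in shards]
        return [f.result() for f in futures]


if __name__ == "__main__":
    pairs = [("node1", "shard1"), ("node1", "shard2"), ("node2", "shard3")]
    print(backup_all(pairs, lambda node, shard: f"{shard}@{node} done", 1))
```

A real implementation would more likely keep a queue of pending shard
requests per node rather than park one thread per shard, but the throttling
behaviour is the same: a new request for a node starts only when an earlier
one for that node completes.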


Re: Limiting Backup IO

2023-06-27 Thread David Smiley
Here's a POC: https://github.com/apache/solr/pull/1729

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jun 19, 2023 at 3:36 PM David Smiley  wrote:

> Has anyone mitigated the potentially large IO impact of doing a backup of
> a large collection or just in general?  If the collection is large enough,
> there very well could be many shards on one host and it could saturate the
> IO.  I wonder if there should be a rate limit mechanism or some other
> mechanism.
>
> Not the same but I know that at a segment level, the merges are rate
> limited -- ConcurrentMergeScheduler doesn't quite let you set it but
> adjusts itself automatically ("ioThrottle" boolean).
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Plugin Initializing failure for [schema.xml] fieldType

2023-06-27 Thread Szűcs Roland
Hey Solr community,

I have a core named books. It does not use the managed schema but a
manually edited schema.xml.

When I tried to access the admin on localhost, I got the following error:
*books:*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core books: Can't load schema
/var/solr/data/books/conf/schema.xml: Plugin Initializing failure for
[schema.xml] fieldType

When I checked the log in the Admin UI, I found no further details about
which fieldType is causing the error.

Here are my custom field types:

[fieldType definitions missing from the archived message]

I do not have any idea what is going wrong here.

Roland


Re: Solr Cloud Backup Strategy and Data Corruption Prevention

2023-06-27 Thread Saksham Gupta
Hi All,
Any help regarding this problem? What is the standard practice for creating
backups on Solr Cloud?
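
For reference, one commonly used option is the Collections API, which has
BACKUP and RESTORE actions for SolrCloud collections. The sketch below only
builds the request URLs and does not contact Solr. The host, collection name,
backup name and location are placeholders, and the location is assumed to be
a directory that every node can reach (for example a shared filesystem).

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # placeholder host and port


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def backup_url(collection, name, location):
    """URL that asks Solr to back up a collection under the given name."""
    return collections_api_url(
        "BACKUP", name=name, collection=collection, location=location
    )


def restore_url(collection, name, location):
    """URL that asks Solr to restore a named backup into a collection."""
    return collections_api_url(
        "RESTORE", name=name, collection=collection, location=location
    )


if __name__ == "__main__":
    # All values below are hypothetical.
    print(backup_url("mycollection", "nightly", "/backups/solr"))
    print(restore_url("mycollection_restored", "nightly", "/backups/solr"))
```

Restoring into a separate collection first, as in the second call, is one way
to check a backup before switching traffic to it.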
