Get only year from a date field for a copy field ?

2024-07-01 Thread Bruno Mannina
Hi Solr Team,

Is it possible to create a copy field whose source is a date-time field,
so that only the year is extracted?

I have 2024-07-01T23:59:59Z in the date field and would like to store/index
a copy field containing only 2024.

The goal is to facet on the year only.

Note: I cannot re-create my source files for indexing because of the huge
volume (more than 15 TB of data).

Note 2: my index is not built yet.

Best Regards,
Bruno Mannina
www.matheo-software.com
www.patent-pulse.com
Mob. +33 0 634 421 817

--
This email has been checked by Avast antivirus software.
www.avast.com

Re: Searching for synonyms

2024-07-01 Thread Carsten Klement
I changed my configuration as suggested, but it didn't help.
I can't search for tera* instead of terra*, so I think there is also another
problem.


  
  
  


kind regards
Carsten

On 27.06.24, 17:50, "Marcus Bergner" <marcus.berg...@vizrt.com.INVALID> wrote:


I think you've confused tokenizers and filters. Try something like this:











/ Marcus





Re: Get only year from a date field for a copy field ?

2024-07-01 Thread ufuk yılmaz
Hi Bruno,

Did you consider using a range facet with gap being 1 year? If that works for 
you, you can avoid reindexing altogether. 

https://solr.apache.org/guide/8_1/json-facet-api.html

Try specifying gap as "1YEARS" or "+1YEARS" (I forgot if plus sign was required 
or not)
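
As a rough sketch of the range-facet idea, the Python snippet below only builds a JSON Facet API request body; it does not contact Solr. The field name `pd` is the one that appears later in this thread, while the facet label `per_year`, the match-all query, and the start/end dates are assumptions for illustration. The gap is written as "+1YEAR" in Solr date-math form.

```python
import json


def year_range_facet(field, start, end, gap="+1YEAR"):
    """Build a JSON request body with one range facet bucketed per year.

    'per_year' is just a label chosen for this sketch.
    """
    return {
        "query": "*:*",   # assumption: match everything, adjust as needed
        "limit": 0,        # only the facet counts are of interest here
        "facet": {
            "per_year": {
                "type": "range",
                "field": field,
                "start": start,
                "end": end,
                "gap": gap,
            }
        },
    }


if __name__ == "__main__":
    body = year_range_facet("pd", "1800-01-01T00:00:00Z", "2025-01-01T00:00:00Z")
    # This body would be POSTed to the collection's query handler as JSON.
    print(json.dumps(body, indent=2))
```

Because the buckets are computed at query time, nothing has to change in the schema or in the indexed documents.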

Sincerely 
Ufuk Yilmaz



Re: Problem with atomic updates after upgrading from 8.11 to 9.6

2024-07-01 Thread Christos Malliaridis
Hi Jeremy,

Based on the information you provided I would say that your price_list_url
is recognized as an object instead of a field update. Depending on how you
update your document(s), this may succeed and do what you want, succeed but
create flattened documents, or fail. A flattened object would look like
this in your case:
{
  "id":"contracts 36F79718D0274 | 65 II I",
  " price_list_url.set":["https://prices.anywhere.com"],
  "_version_":1803386312791687168
}

How exactly are you updating your documents? What endpoint are you using
and which request handler is processing your request? One potential root
cause I can think of is mixing the endpoints /update/json/docs and /update.
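
To make the two interpretations concrete, here is a small Python sketch. `atomic_update` builds the payload shape quoted in this thread. `flatten` is a hypothetical helper that only mimics the dotted field name shown in the flattened document above; it is not Solr's implementation, and it does not wrap values in a list the way the stored document does.

```python
import json


def atomic_update(doc_id, field, value):
    """Atomic-update payload: a 'set' modifier nested under the field name."""
    return [{"id": doc_id, field: {"set": value}}]


def flatten(doc, prefix=""):
    """Illustrative only: collapse nested objects into dotted field names."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    payload = atomic_update(
        "contracts 36F79718D0274 | 65 II C",
        "price_list_url",
        "https://prices.anywhere.com",
    )
    print("Read as an atomic update:")
    print(json.dumps(payload, indent=2))
    print("Read as a plain nested document and flattened:")
    print(json.dumps(flatten(payload[0]), indent=2))
```

The same JSON body therefore produces very different results depending on which handler interprets it.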

Sincerely,
Christos

On Sun, Jun 30, 2024 at 8:50 PM Jeremy Buckley - IQS-C wrote:

> After updating to 9.6.1, the following update is failing:
>
> [{
>   "id":"contracts 36F79718D0274 | 65 II C",
>   "price_list_url" : { "set" : "https://prices.anywhere.com" }
> }]
>
> Solr responds with:
>
> {
>   "responseHeader": {
>     "rf": 1,
>     "status": 400,
>     "QTime": 4
>   },
>   "error": {
>     "metadata": [
>       "error-class",
>       "org.apache.solr.common.SolrException",
>       "root-error-class",
>       "org.apache.solr.common.SolrException"
>     ],
>     "msg": "Unable to index docs with children: the schema must include definitions for both a uniqueKey field and the '_root_' field, using the exact same fieldType",
>     "code": 400
>   }
> }
>
> We do not have nested child documents, at least not intentionally. Schema
> has:
>
>  omitNorms="true" omitPositions="true" omitTermFreqAndPositions="true"
> stored="true" termVectors="false"/>
> ...
>  multiValued="false" />
> ...
> id
>
> There is no _root_ field defined in the schema, and it is
> using ClassicIndexSchemaFactory.
> We are running Solr Cloud, this collection has one shard and two replicas.
>
> Any ideas what could be causing this error or how to fix it?
>
> Thanks in advance!
>


Re: Problem with atomic updates after upgrading from 8.11 to 9.6

2024-07-01 Thread Jeremy Buckley - IQS-C
Hi Christos, thanks for the reply. I am using the /update endpoint.  If I
change to /update/json/docs, it does what you suggest and creates a
flattened document.  But that isn't what I want.

Somewhat strangely, I only have one collection that is acting this way -
atomic updates on other collections are working fine.  Also, everything
worked as expected under Solr 8.11.2 and before.



RE: Get only year from a date field for a copy field ?

2024-07-01 Thread Bruno Mannina
Hi Ufuk,

Thanks for the tip!

It works fine with this:

http://localhost:8983/solr/db001/select?
facet.contains.ignoreCase=true
&fl=ap,in
&indent=true
&q.op=OR
&q=ti%3Atreat*
&rows=1
&facet=true
&facet.range.gap=%2B1YEAR (the gap)
&facet.range=pd (the field to count)
&facet.range.start=1800-01-01T23:59:59Z (start date)
&facet.range.end=2024-07-01T23:59:59Z (end date; I don't know yet how to
say "take all")
&facet.mincount=1 (to drop buckets with a count of 0)

Thanks a lot!

PS: I am still looking for a way to indicate "take all".
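
For reference, the same request can be assembled in Python so that the encoding (for example `+` becoming `%2B`) is handled automatically. This is only a sketch that builds the URL; it does not send anything. The second call uses the date-math expression `NOW/YEAR+1YEAR` as the end of the range, which is one possible way to avoid a hard-coded end date; whether it fits the "take all" requirement is an assumption to verify against your data.

```python
from urllib.parse import urlencode, parse_qs

# Host, port and collection name are taken from the URL quoted above.
BASE_URL = "http://localhost:8983/solr/db001/select"


def build_year_facet_url(start="1800-01-01T23:59:59Z",
                         end="2024-07-01T23:59:59Z"):
    """Return the select URL with a per-year range facet on the 'pd' field."""
    params = [
        ("q", "ti:treat*"),
        ("q.op", "OR"),
        ("fl", "ap,in"),
        ("rows", 1),
        ("indent", "true"),
        ("facet", "true"),
        ("facet.contains.ignoreCase", "true"),
        ("facet.range", "pd"),             # field to count
        ("facet.range.start", start),      # start date
        ("facet.range.end", end),          # end date
        ("facet.range.gap", "+1YEAR"),     # one bucket per year
        ("facet.mincount", 1),             # drop buckets with a count of 0
    ]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_year_facet_url())
    # Assumption: date math instead of a fixed end date.
    print(build_year_facet_url(end="NOW/YEAR+1YEAR"))
```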


Best Regards,
Bruno Mannina
www.matheo-software.com
www.patent-pulse.com
Mob. +33 0 634 421 817




Re: Problem with atomic updates after upgrading from 8.11 to 9.6

2024-07-01 Thread Christos Malliaridis
With a correctly configured configset and a collection with data in it, I
can only reproduce the error if there are documents that are wrongly
indexed. In that situation, fixing the documents in the collection (so that
they are no longer flattened documents) and reloading / reindexing the
affected collections may solve the issue. I don't think the issue is
directly related to the Solr version. It sounds more like a problem with
documents that were reindexed during the upgrade, which exposed an
inconsistency that was previously ignored or skipped.

What I would do in this case is:
- Make sure there are no invalid documents (with wrong fields or values) in
your collection
- Try fully reindexing your documents (see
https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#reindexing-after-upgrade)
- Make sure all your clients run a version compatible with (preferably the
same as) your Solr deployment

The issue may also not be caused by the price_list_url, but instead by some
other field (or even document) that is identified as a nested object.

If you can provide a simplified reproducer, it will be easier to suggest
working solutions.
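
A simplified reproducer could be as small as the Python sketch below. It only builds the two requests (a full document followed by the atomic update from this thread) and prints them; the base URL, the collection name `contracts_test`, and the initial field value are hypothetical placeholders. Sending them with `urllib.request.urlopen(req)` against a fresh collection that uses the affected schema would show whether the error appears with a single document.

```python
import json
from urllib.request import Request

SOLR_URL = "http://localhost:8983/solr"  # assumption: local Solr instance
COLLECTION = "contracts_test"            # hypothetical collection name

DOC_ID = "contracts 36F79718D0274 | 65 II C"

# Step 1: index a plain document. Step 2: apply the atomic update.
FULL_DOC = [{"id": DOC_ID, "price_list_url": "https://old.example.com"}]
ATOMIC_UPDATE = [{"id": DOC_ID,
                  "price_list_url": {"set": "https://prices.anywhere.com"}}]


def build_update_request(docs, commit=True):
    """Build (but do not send) a JSON POST to the /update handler."""
    url = f"{SOLR_URL}/{COLLECTION}/update"
    if commit:
        url += "?commit=true"
    return Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    for label, docs in (("full document", FULL_DOC),
                        ("atomic update", ATOMIC_UPDATE)):
        req = build_update_request(docs)
        print(label, req.get_method(), req.full_url)
        print(req.data.decode("utf-8"))
```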



Re: Problem with atomic updates after upgrading from 8.11 to 9.6

2024-07-01 Thread Jeremy Buckley - IQS-C
I can reproduce the error on a fresh collection with only a single document
added, so it may be something related to my schema.

I think at this point I'm about ready to punt and just do non-atomic full
updates for this scenario, which actually won't be that difficult.

Thanks for your suggestions!

On Mon, Jul 1, 2024 at 2:58 PM Christos Malliaridis wrote:

> With a correctly configured configset and a collection with data in it, I
> can only reproduce the error if there are documents that are wrongly
> indexed. In that situation, fixing the documents in the collection (so that
> they are no flattened documents) and reloading / reindexing the affected
> collections may solve the issue. I don't think it's an issue directly
> related to the version of Solr. It sounds like an issue with the documents
> that may have been reindexed during the upgrade process and identified an
> inconsistency that previously was ignored or skipped.
>
> What I would do in this case is:
> - Make sure there are no invalid documents (with wrong fields or values) in
> your collection
> - Try fully reindex your documents (see
> https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-9.html#reindexing-after-upgrade)
> - Make sure all your clients run on a compatible (preferably the same)
> version with your Solr deployment
>
> The issue may also not be caused by the price_list_url, but instead by some
> other field (or even document) that is identified as a nested object.
>
> If you can provide a simplified reproducer, it will be easier to suggest
> working solutions.


-- 
Jeremy Buckley
Principal Software Applications Engineer

Halley’s COMET Contract | Octo, an IBM Company
Mobile: 703-626-6107
Email: jeremy.buck...@gsa.gov