Re: Fastest way to index data to solr

2022-09-30 Thread Thomas Corthals
Hi Gus,

I have a followup question. Is JSON parsed faster than XML by Solr if they
represent the exact same documents?

Thomas

On Fri, 30 Sep 2022 at 06:58, Gus Heck wrote:

> If you are using a non-java language you can use JSON.
>


Re: Fastest way to index data to solr

2022-09-30 Thread Dave
I don’t have any tests, but I know anything is faster than XML. You may as well 
stick to text files. XML is garbage; that’s why they made YAML, which is the 
parent of JSON.

> On Sep 30, 2022, at 3:47 AM, Thomas Corthals  wrote:
> 
> Hi Gus,
> 
> I have a followup question. Is JSON parsed faster than XML by Solr if they
> represent the exact same documents?
> 
> Thomas
> 
> On Fri, 30 Sep 2022 at 06:58, Gus Heck wrote:
> 
>> If you are using a non-java language you can use JSON.
>> 


Re: Fastest way to index data to solr

2022-09-30 Thread Andy Lester
I can’t imagine a case where the speed in parsing the input data won’t be 
dwarfed by the time spent on everything else. You’re talking about an in-memory 
operation that does a ton of I/O. 

It’s not going to make a noticeable difference one way or the other. 

> I have a followup question. Is JSON parsed faster than XML by Solr



Is this possible? Transform JSON to conform to existing UpdateRequestHandler

2022-09-30 Thread Matthew Castrigno
Hello,

I am new to Solr and I am trying to configure it to accept an existing API 
definition.

I need to create a requestHandler that will perform index operations based on 
the value in a specific field, as opposed to using a parameter in the request.

I.e., a payload that may look like this:

{"id":"unique id",
"command":"add",
"content":"mycontent"}

Based on the value of command, either an add or a delete operation would 
occur. I would like to know if I can accomplish this strictly with 
solrconfig.xml, utilizing the solr.UpdateRequestHandler like this:

...

Or do I have to write a custom class?

I realize this is odd, but I am working with requests that are formed in an 
existing way that I cannot change; I can only change the domain.

All comments appreciated.

Thank you.

--
"This message is intended for the use of the person or entity to which it is 
addressed and may contain information that is confidential or privileged, the 
disclosure of which is governed by applicable law. If the reader of this 
message is not the intended recipient, you are hereby notified that any 
dissemination, distribution, or copying of this information is strictly 
prohibited. If you have received this message by error, please notify us 
immediately and destroy the related message."
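
For readers following this question: the sketch below does not settle whether 
solrconfig.xml alone can do this. It only illustrates one possible workaround, 
a thin translation layer placed in front of Solr that rewrites the payload 
above into Solr's standard JSON update commands ("add" with a "doc", or 
"delete" with an "id"). The function name to_solr_update is hypothetical, and 
treating every field other than "command" as document content is an 
assumption.

```python
import json


def to_solr_update(payload: dict) -> dict:
    """Translate an {"id", "command", ...} payload into a Solr JSON update command.

    Hypothetical helper: the name and the field handling are assumptions,
    not part of Solr itself.
    """
    command = payload.get("command")
    if command == "add":
        # Everything except the routing field becomes the document to index.
        doc = {key: value for key, value in payload.items() if key != "command"}
        return {"add": {"doc": doc}}
    if command == "delete":
        # Delete by unique key; the other fields are ignored.
        return {"delete": {"id": payload["id"]}}
    raise ValueError(f"unsupported command: {command!r}")


if __name__ == "__main__":
    incoming = {"id": "unique id", "command": "add", "content": "mycontent"}
    # The resulting body could be POSTed to the collection's /update endpoint
    # with Content-Type: application/json.
    print(json.dumps(to_solr_update(incoming)))
```

The same mapping could presumably live in a custom request handler or update 
processor instead, at the cost of writing and deploying Java code.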


Re: Is this possible? Transform JSON to conform to existing UpdateRequestHandler

2022-09-30 Thread Matthew Castrigno
Maybe this is what I need?
https://solr.apache.org/guide/8_11/json-request-api.html


From: Matthew Castrigno 
Sent: Friday, September 30, 2022 12:16 PM
To: users@solr.apache.org 
Subject: Is this possible? Transform JSON to conform to existing 
UpdateRequestHandler



Re: Fastest way to index data to solr

2022-09-30 Thread Joel Bernstein
Unless something has changed recently, you will have a memory leak if you
don't at least soft commit during the load. This is due to the in-memory
tlog data used for real-time get. This in-memory tlog data is released when
a new searcher is opened.

So, if you're having memory issues while bulk loading data without a soft
commit, then set autoSoftCommit to an interval that balances load
performance with memory retention.



Joel Bernstein
http://joelsolr.blogspot.com/
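
As an illustration of the advice above: the message recommends autoSoftCommit, 
which is set in solrconfig.xml. The sketch below shows a client-driven 
variant of the same idea instead, sending documents in batches and adding the 
update handler's softCommit=true parameter to every Nth request. The function 
names, the base URL, and the batch sizes are all assumptions chosen for the 
example, and the code only plans the requests; it does not send them.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def _request(base_url: str, batch: List[dict], soft_commit: bool) -> Tuple[str, str]:
    """Build the (url, JSON body) pair for one batch of documents."""
    url = base_url + "/update"
    if soft_commit:
        # softCommit=true asks Solr to open a new searcher after this update.
        url += "?softCommit=true"
    return url, json.dumps(batch)


def plan_update_requests(
    docs: Iterable[dict],
    base_url: str,
    batch_size: int = 1000,
    soft_commit_every: int = 10,
) -> Iterator[Tuple[str, str]]:
    """Yield (url, body) pairs; every Nth full batch carries a soft commit."""
    batch: List[dict] = []
    sent = 0
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            sent += 1
            yield _request(base_url, batch, sent % soft_commit_every == 0)
            batch = []
    if batch:
        # A trailing partial batch is always sent with a soft commit.
        yield _request(base_url, batch, True)


if __name__ == "__main__":
    # Hypothetical core name and tiny sizes, just to show the shape of the output.
    documents = ({"id": str(i)} for i in range(5))
    for url, body in plan_update_requests(
        documents, "http://localhost:8983/solr/mycore", batch_size=2, soft_commit_every=2
    ):
        print(url, body)
```

Each pair could then be POSTed with Content-Type: application/json. How often 
to soft commit is the trade-off described above: more frequent commits release 
memory sooner but slow the load.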


On Fri, Sep 30, 2022 at 12:37 PM Andy Lester  wrote:

> I can’t imagine a case where the speed in parsing the input data won’t be
> dwarfed by the time spent on everything else. You’re talking about an
> in-memory operation that does a ton of I/O.
>
> It’s not going to make a noticeable difference one way or the other.
>
> > I have a followup question. Is JSON parsed faster than XML by Solr
>
>


Is this possible?

2022-09-30 Thread Matthew Castrigno


Hello,

I am new to Solr and I am trying to configure it to accept an existing API 
definition.

I need to create a requestHandler that will perform index operations based on 
the value in a specific field, as opposed to using a parameter in the request.

I.e., a payload that may look like this:

{"id":"unique id",
"command":"add",
"content":"mycontent"}

Based on the value of command, either an add or a delete operation would 
occur. I would like to know if I can accomplish this strictly with 
solrconfig.xml, utilizing the solr.UpdateRequestHandler like this:

...

Or do I have to write a custom class?

I realize this is odd, but I am working with requests that are formed in an 
existing way that I cannot change; I can only change the domain.

All comments appreciated.

Thank you.


