Retain Data Import Handler In Solr9.0

2022-07-22 Thread GANESHAN RAMAN
Dear Team,
Hope you are doing well,
So far we have been using Solr 8.9.0 for one of our applications, which uses
DataImportHandler.
As we need to move to the latest version, Solr 9.0.0, the configuration below
in solrconfig.xml is throwing an exception:
"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'"


dbconfig.xml


*We understand that DataImportHandler, which was deprecated in the earlier
version, is now completely removed from Solr 9.0.* We need to retain this
tool (DIH) with the latest Solr 9.0.
Can you please advise us how this issue can be resolved? It is
becoming a blocker for continuing with the latest version.
Thanks in advance.

Thanks,
Ganeshan


Docvalues in Unique key field

2022-07-22 Thread Syam Krishnan R
Hi all,

We are introducing docValues on our unique key field (the id field), and we are
using Solr 8.4.1. The unique key field will be set to both stored=true and
docValues=true. Please let me know if this is supported in the version we
are using, and whether we need to add any additional configuration.

Thanks,
Syam


Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread Charlie Hull

You have two options basically:

1. consider using the externally maintained DIH from 
https://github.com/rohitbemax/dataimporthandler (there appears to be a 
9.x branch)


2. move away from DIH and write new ingestor code to pull data from your 
sources


Cheers

Charlie

--
Charlie Hull - Managing Consultant at OpenSource Connections Limited
Founding member of The Search Network  
and co-author of Searching the Enterprise 


tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828

OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
Amtsgericht Charlottenburg | HRB 230712 B
Geschäftsführer: John M. Woodell | David E. Pugh
Finanzamt: Berlin Finanzamt für Körperschaften II


Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread David Smiley
The DIH does not yet support Solr 9 but I don't think it'll be long before
it does.
https://github.com/rohitbemax/dataimporthandler/issues/32

Note that "deprecated" was a dubious choice of word; it was used because DIH is
no longer a part of Solr.  Practically speaking, it *moved* and isn't gone.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



BufferUnderFlowException

2022-07-22 Thread Hasmik Sarkezians
Does anyone know what could cause a BufferUnderflowException
while Solr is reading?
We have a profiler set up, and at times we see a lot of
buffer underflow exceptions:

[image: image.png]

Any insights would be appreciated.

thanks,
hasmik
-- 
Hasmik Sarkezians
Senior Director, Applications
E: hasmik.sarkezi...@zoominfo.com
275 Wyman St.
Waltham, MA 02451
www.zoominfo.com


Re: BufferUnderFlowException

2022-07-22 Thread Shawn Heisey

On 7/22/22 08:16, Hasmik Sarkezians wrote:
> Does anyone know what would be the reason for BufferUnderFlowException
> while solr is reading?
> We have a profiler setup and at times we are seeing a lot of
> exceptions related to buffer underflow exception:


The image did not make it through.  The mailing list ate it.

You will either have to paste the exception text into an email or place 
a file on a sharing site and provide a URL to access it.


Thanks,
Shawn



Re: [External Email] Re: BufferUnderFlowException

2022-07-22 Thread Hasmik Sarkezians
Basically the call stack is this:

SegmentTermsEnum.seekExact
SegmentTermsEnumFrame.loadBlock()
ByteBufferIndexInput.readBytes
ByteBufferGuard.getBytes
DirectByteBuffer.get
ByteBuffer.get
Buffer.nextGetIndex
BufferUnderflowException.init

At times we see a lot of these exceptions and cannot find a
possible explanation for them.

thanks,
hasmik



Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread dmitri maziuk

On 2022-07-22 8:46 AM, David Smiley wrote:
> The DIH does not yet support Solr 9 but I don't think it'll be long before
> it does.

FWIW I've been gradually migrating our DIH imports to little Python 
scripts; with all the extra things you can do in those, and less bloat 
in the main JVM, you gotta wonder how much interest there's gonna be in 
keeping that alive long-term.


$.02
Dima



Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread Andy Lester



And I’m sure the DIH is slower, too.

We used to have the DIH pull from our Oracle database.  It took about 10 hours 
to do all 45M records.

I migrated to a simple Perl program that pulled from Oracle, created JSON, and 
sent it to the update handlers. We can easily run 10 in parallel and finish it 
off in about 45 minutes.

Andy
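
A minimal sketch of the kind of external indexer described above, written in Python rather than Perl. It is not the program from this thread: the function names, the batch size of 1000, and the shape of the row dictionaries are assumptions for illustration. It relies on Solr's update handler accepting a JSON array of documents POSTed to /update with Content-Type application/json.

```python
import json
import urllib.request
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield lists of at most `size` rows so each request stays bounded."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_update_request(solr_url: str, core: str, docs: List[Dict]) -> urllib.request.Request:
    """Build (but do not send) a POST of a JSON array of documents to /update."""
    return urllib.request.Request(
        f"{solr_url}/{core}/update",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_rows(rows: Iterable[Dict], solr_url: str, core: str, batch_size: int = 1000) -> int:
    """Send all rows to Solr in batches; returns the number of documents sent."""
    sent = 0
    for batch in batched(rows, batch_size):
        with urllib.request.urlopen(build_update_request(solr_url, core, batch)) as resp:
            resp.read()  # urlopen raises HTTPError on a non-2xx response
        sent += len(batch)
    return sent
```

A call might look like index_rows(rows_from_database, "http://localhost:8983/solr", "mycore"), where rows_from_database is any iterable of dicts keyed by schema field names. The sketch sends no explicit commit, so it assumes commits are handled by the autoCommit settings in solrconfig.xml.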

Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread Dave
Not to mention that using dynamic fields on the fly in the indexer, applying code 
logic to the documents, and just having full control over it has a lot of 
benefits, to the point that DIH was a cute idea when it came out but in 
reality it was just hand-holding.


Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread Dave
Oh, look into Perl's fork manager module:

https://metacpan.org/pod/Parallel::ForkManager

The only trick is that each time it spawns a process you have to redeclare the dbh and 
any stored procedures, but it's a small price to pay for being able to simply 
adjust the number of parallel jobs it will do in one script. Want 25? Sure, run 
25!  The other trick is that if you do incremental commits based on doc count, you should 
set that in Solr itself, because once a process spawns, any outside variables, like a 
doc counter, can't be modified across the processes and persist.
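
The same two constraints show up with Python's multiprocessing, and a rough sketch may make them concrete. Everything here is an assumption for illustration: sqlite3 stands in for the real database, the records table and its id and title columns are hypothetical, and the split by id modulo the worker count is just one way to divide the work. Each worker opens its own connection instead of inheriting the parent's, and returns its own count because a counter in the parent would not be updated by the children.

```python
import sqlite3
from multiprocessing import Pool
from typing import List, Tuple

Job = Tuple[str, int, int]  # (db_path, worker_index, worker_count)


def index_slice(job: Job) -> int:
    """Runs inside one worker process and handles only that worker's rows."""
    db_path, worker, workers = job
    conn = sqlite3.connect(db_path)  # fresh handle per process, never shared
    try:
        handled = 0
        query = "SELECT id, title FROM records WHERE id % ? = ?"
        for _id, _title in conn.execute(query, (workers, worker)):
            # A real worker would batch these rows and POST them to Solr here.
            handled += 1
        return handled  # report back; the parent's variables are not shared
    finally:
        conn.close()


def run_parallel(db_path: str, workers: int) -> int:
    """Fan the slices out to `workers` processes and total their counts."""
    jobs: List[Job] = [(db_path, w, workers) for w in range(workers)]
    with Pool(processes=workers) as pool:
        return sum(pool.map(index_slice, jobs))
```

On platforms that start workers with spawn, run_parallel should be called from under an `if __name__ == "__main__":` guard. Changing the degree of parallelism is then a matter of changing the workers argument, and commit policy is left to Solr, as suggested above.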


Re: Retain Data Import Handler In Solr9.0

2022-07-22 Thread Andy Lester


I’m aware of the numerous tools like that (I’ve been doing Perl since the 90s 
https://metacpan.org/author/PETDANCE), but for as often as we have to do the 
full import (maybe every couple of months on a schema change) it was easier to 
just assign 1/10th of the records to each of ten updaters that run 
concurrently.  For normal day-to-day incremental, our updater runs every five 
or ten minutes and sends them to Solr.

The other huge win was getting core swapping working.  Build the new core with 
the new schema, index it for an hour, and swap old with new.  So nice.  No 
downtime for schema changes.

Andy
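
For reference, the swap step can be driven through Solr's CoreAdmin API, whose SWAP action takes the two core names as the core and other parameters. The helper below is a small sketch that only builds the request URL; the function name and the core names in the usage note are hypothetical. SWAP applies to standalone cores; it is not meant for SolrCloud collections, where aliases are the usual way to get a similar effect.

```python
from urllib.parse import urlencode


def swap_cores_url(solr_url: str, live_core: str, rebuilt_core: str) -> str:
    """Return the CoreAdmin URL that swaps the two named cores."""
    query = urlencode({"action": "SWAP", "core": live_core, "other": rebuilt_core})
    return f"{solr_url}/admin/cores?{query}"
```

For example, swap_cores_url("http://localhost:8983/solr", "products", "products_rebuild") gives a URL that can be requested with curl or urllib.request.urlopen once the rebuilt core has finished indexing.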