I dug some more into a workaround and found the SortableTextField field type:
https://solr.apache.org/guide/7_4/field-types-included-with-solr.html

My max length is 3945.

Any concerns about changing my solr.TextField type to a SortableTextField type 
in order to enable docValues?  
I would then set maxCharsForDocValues to 4096.
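
Something like this, adapted from my current PipeToken definition (an untested
sketch; maxCharsForDocValues="4096" covers my 3945-character max):

<fieldType name="PipeToken" class="solr.SortableTextField"
           positionIncrementGap="100" multiValued="false"
           maxCharsForDocValues="4096">
  <analyzer>
    <tokenizer class="solr.SimplePatternSplitTokenizerFactory" pattern="|"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>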

Is this a bad idea, or am I on the right track?
Is there another way to enable docValues for a pipe delimited string of tokens?

-----Original Message-----
From: Jon Morisi <jon.mor...@hsc.utah.edu> 
Sent: Thursday, July 22, 2021 8:45 AM
To: users@solr.apache.org
Subject: RE: Solr nodes crashing

I looked into this (https://solr.apache.org/guide/7_4/docvalues.html), and it
looks like I can't use docValues because my field type is solr.TextField.
Specifically:

<fieldType name="PipeToken" class="solr.TextField" positionIncrementGap="100"
           multiValued="false">
  <analyzer>
    <tokenizer class="solr.SimplePatternSplitTokenizerFactory" pattern="|"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I'm passing in a string of tokens separated by '|'. 

Some (made up) example data would be: 
41654165|This is a phrase|6579813|phrases are all one 
41654165|token|65798761|There can be multiple phrases or tokens per doc

Is there a workaround?

My search would look something like:
.../select?q=ptokens:41654165%20AND%20ptokens:65798761


-----Original Message-----
From: Mike Drob <md...@mdrob.com>
Sent: Wednesday, July 21, 2021 12:36 PM
To: users@solr.apache.org
Subject: Re: Solr nodes crashing

You may want to look into enabling docValues for your fields in your schema, if
not already enabled. That often helps with memory usage during queries, but
requires a reindex of your data.

There are also firstSearcher and newSearcher queries you can configure in your
Solr config; those can warm your caches for you if cold caches are the problem.
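
A minimal sketch of what those listeners can look like, inside the <query>
section of solrconfig.xml (the warm-up query here is made up; use queries that
exercise your own fields and sorts):

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">ptokens:41654165 AND ptokens:65798761</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str></lst>
  </arr>
</listener>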

Mike

On Wed, Jul 21, 2021 at 11:06 AM Jon Morisi <jon.mor...@hsc.utah.edu> wrote:

> Thanks for the help Shawn and Walter.  After increasing the open files 
> setting to 128000 and increasing the JVM-Memory to 16 GB, I was able 
> to load my documents.
>
> I now have a collection with 2.3 T rows / ~480 GB running on a 4-node 
> cluster.  I have found that complicated queries (searching for two 
> search terms in a field with "AND" for example), often timeout.  If I 
> try multiple times the query does eventually complete.  I'm assuming 
> this is a caching / warm-up issue.
>
> Is there a configuration option I can use to cache the indexes for one 
> of the columns or increase the timeout?  Any other advice to get this 
> performing quicker is appreciated.
>
> Thanks again,
> Jon
>
> -----Original Message-----
> From: Shawn Heisey <apa...@elyograg.org>
> Sent: Thursday, July 1, 2021 6:48 PM
> To: users@solr.apache.org
> Subject: Re: Solr nodes crashing
>
> On 7/1/2021 4:23 PM, Jon Morisi wrote:
> > I've had an indexing job running for 24+ hours.  I'm importing 100m+
> > documents.  After about 8 hours both of the replica nodes crashed but
> > the primary nodes have continued to run and index.
>
> There's a common misconception.  Java programs, including Solr, almost 
> never crash.
>
> If you've started a recent Solr version on a platform other than 
> Windows, then Solr is started with a Java option that runs a script 
> whenever an OutOfMemoryError exception is thrown by the program.  What 
> that script does is simple -- it logs a line to a logfile and then 
> kills Solr with the -9
> (kill) signal.  Note that there are a number of resource depletion 
> scenarios, other than memory, which can result in an OutOfMemoryError.
> That's why you were asked about open file and process limits.
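>
> For reference, the start script wires this up with a JVM option along
> these lines (an illustrative example; the exact script path, port, and
> log directory depend on your install):
>
>   -XX:OnOutOfMemoryError="/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs"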
>
> Most operating systems also have what has been named the "oom killer".
> When system memory becomes extremely tight, the OS will find programs 
> using a lot of memory and kill one of them.
>
> These two things will LOOK like a crash, but they're not really crashes.
>
> > JVM-Memory 50.7%
> > 981.38 MB
> > 981.38 MB
> > 497
>
> This indicates that your max heap setting for Solr is in the ballpark 
> of 1GB.  This is extremely small, and so you're probably throwing 
> OutOfMemoryError because of heap space.  Which, on a non-Windows 
> system, will basically cause Solr to commit suicide.  It does this 
> because when OOME is thrown, program operation becomes completely 
> unpredictable, and index corruption is a very real possibility.
>
> There are precisely two ways to deal with OOME.  One is to increase 
> the size of the resource that is being depleted.  The other is to 
> change the program or the program configuration so that it doesn't 
> require as much of that resource.  Often, especially with Solr, the 
> second option is simply not possible.
>
> Most likely you're going to need to increase Solr's heap far beyond 1GB.
> There's no way for us to come up with a recommendation for you
> without asking you a lot of very detailed questions about your setup 
> ... and even with that, it's possible that we would give you an 
> incorrect recommendation.  I'll give you a number, and warn you that 
> it could be wrong, either way too small or way too large.  Try an 8GB 
> heap.  You have lots of memory in this system, 8GB is barely a drop in the 
> bucket.
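>
> (One common way to do that on a non-Windows install is to set
> SOLR_HEAP="8g" in solr.in.sh, or to start Solr with bin/solr start -m 8g.
> That's a sketch; adjust it to however you launch Solr.)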
>
> Thanks,
> Shawn
>
