PS: wrt requesting a "literal, complete search url" to aid
troubleshooting: facets, `sort`, `offset`, and `rows` params would all
be of particular interest.

On Thu, Jul 22, 2021 at 12:25 PM Michael Gibney <mich...@michaelgibney.net>
wrote:

> SortableTextField uses docValues in a very specific way, and is not a
> general-purpose workaround for enabling docValues on TextFields. Possibly
> of interest: https://issues.apache.org/jira/browse/SOLR-8362
>
> That said, DocValues are relevant mainly (only?) wrt full-domain per-doc
> value-access (e.g., for faceting, sorting, functions, export ...). Enabling
> docValues for any field against which you're only running _searches_ is
> unlikely to help.
>
> If search latency is the main issue for you now, sharing more detail about
> the queries you're running would be helpful (e.g., are you only running
> searches? are you also running facets? how are you sorting? etc.). Pasting
> a literal, complete search url (and any configured param defaults, if
> applicable) would also be useful (fwiw, the example search you provided
> earlier, ".../select?q=ptokens: 41654165%20AND% ptokens: 65798761", looks a
> bit odd in several respects and may not be interpreted the way you
> think it should be; e.g., the field spec should be immediately adjacent
> to the field value, with no intervening whitespace, etc.).
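>
> For comparison, a minimal sketch of what a query of that shape would
> conventionally look like (assuming `ptokens` is the actual field name,
> as in your example; raw form first, URL-encoded form second):
>
> .../select?q=ptokens:41654165 AND ptokens:65798761
> .../select?q=ptokens%3A41654165%20AND%20ptokens%3A65798761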
>
> I note that you have a small amount of swap space being used; "small
> amount used" or not, I would _strongly_ recommend disabling swap entirely
> (`swapoff -a`). There are risks associated with disabling swap in general; but
> with an index that large, you should be running with enough memory headroom
> for the OS page cache that you shouldn't get anywhere near a situation
> where application memory actually _needs_ swap. Also, a shot in the dark:
> is there any chance you're running this index on a network filesystem?
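>
> To be concrete about the swap suggestion (a sketch; how you persist the
> change across reboots is distro-specific, but commenting out the fstab
> entry is the usual approach):
>
> swapoff -a    # disables swap until the next reboot
> free -h       # verify: the Swap line should now show 0B used
> # to keep swap off permanently, comment out the swap line in /etc/fstab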
>
> On Thu, Jul 22, 2021 at 11:51 AM Jon Morisi <jon.mor...@hsc.utah.edu>
> wrote:
>
>> I dug some more into a workaround and found the SortableTextField field
>> type:
>> https://solr.apache.org/guide/7_4/field-types-included-with-solr.html
>>
>> My max length is 3945.
>>
>> Any concerns about changing my solr.TextField type to a SortableTextField
>> type in order to enable docValues?
>> I would then set maxCharsForDocValues to 4096.
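>>
>> Concretely, I'm thinking of something like this (untested sketch; the
>> field type name is made up):
>>
>> <fieldType name="PipeTokenSortable" class="solr.SortableTextField"
>>     positionIncrementGap="100" multiValued="false"
>>     maxCharsForDocValues="4096">
>>   <analyzer>
>>     <tokenizer class="solr.SimplePatternSplitTokenizerFactory" pattern="|"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>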
>>
>> Is this a bad idea, or am I on the right track?
>> Is there another way to enable docValues for a pipe-delimited string of
>> tokens?
>>
>> -----Original Message-----
>> From: Jon Morisi <jon.mor...@hsc.utah.edu>
>> Sent: Thursday, July 22, 2021 8:45 AM
>> To: users@solr.apache.org
>> Subject: RE: Solr nodes crashing
>>
>> I looked into this (https://solr.apache.org/guide/7_4/docvalues.html),
>> and it looks like I can't use docValues because my field type is
>> solr.TextField.  Specifically:
>>
>> <fieldType name="PipeToken" class="solr.TextField"
>>     positionIncrementGap="100" multiValued="false">
>>   <analyzer>
>>     <tokenizer class="solr.SimplePatternSplitTokenizerFactory" pattern="|"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> I'm passing in a string of tokens separated by '|'.
>>
>> Some (made up) example data would be:
>> 41654165|This is a phrase|6579813|phrases are all one
>> 41654165|token|65798761|There can be multiple phrases or tokens per doc
>>
>> Is there a workaround?
>>
>> My search would look something like:
>> .../select?q=ptokens: 41654165%20AND% ptokens: 65798761
>>
>>
>> -----Original Message-----
>> From: Mike Drob <md...@mdrob.com>
>> Sent: Wednesday, July 21, 2021 12:36 PM
>> To: users@solr.apache.org
>> Subject: Re: Solr nodes crashing
>>
>> You may want to look into enabling docValues for your fields in your
>> schema, if not already enabled. That often helps with memory usage during
>> query, but requires a reindex of your data.
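>>
>> On a non-analyzed field that would look something like this in the
>> schema (field name illustrative):
>>
>> <field name="ptokens_str" type="string" indexed="true" stored="true"
>>     docValues="true"/>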
>>
>> There are also firstSearcher and newSearcher warming queries you can
>> configure in solrconfig.xml; those can warm your caches for you if
>> warm-up is indeed the issue.
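>>
>> A minimal sketch of what those look like in the <query> section of
>> solrconfig.xml (the warming query itself is illustrative; use something
>> representative of your real traffic):
>>
>> <listener event="firstSearcher" class="solr.QuerySenderListener">
>>   <arr name="queries">
>>     <lst><str name="q">ptokens:41654165 AND ptokens:65798761</str></lst>
>>   </arr>
>> </listener>
>> <listener event="newSearcher" class="solr.QuerySenderListener">
>>   <arr name="queries">
>>     <lst><str name="q">ptokens:41654165 AND ptokens:65798761</str></lst>
>>   </arr>
>> </listener>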
>>
>> Mike
>>
>> On Wed, Jul 21, 2021 at 11:06 AM Jon Morisi <jon.mor...@hsc.utah.edu>
>> wrote:
>>
>> > Thanks for the help Shawn and Walter.  After increasing the open files
>> > setting to 128000 and increasing the JVM-Memory to 16 GB, I was able
>> > to load my documents.
>> >
>> > I now have a collection with 2.3 T rows / ~480 GB running on a 4-node
>> > cluster.  I have found that complicated queries (for example, searching
>> > for two terms in a field with "AND") often time out.  If I try
>> > multiple times, the query does eventually complete.  I'm assuming
>> > this is a caching / warm-up issue.
>> >
>> > Is there a configuration option I can use to cache the index for one
>> > of the fields, or to increase the timeout?  Any other advice to get
>> > this performing more quickly is appreciated.
>> >
>> > Thanks again,
>> > Jon
>> >
>> > -----Original Message-----
>> > From: Shawn Heisey <apa...@elyograg.org>
>> > Sent: Thursday, July 1, 2021 6:48 PM
>> > To: users@solr.apache.org
>> > Subject: Re: Solr nodes crashing
>> >
>> > On 7/1/2021 4:23 PM, Jon Morisi wrote:
>> > > I've had an indexing job running for 24+ hours.  I'm importing 100m+
>> > documents.  After about 8 hours both of the replica nodes crashed but
>> > the primary nodes have continued to run and index.
>> >
>> > There's a common misconception here: Java programs, including Solr,
>> > almost never crash.
>> >
>> > If you've started a recent Solr version on a platform other than
>> > Windows, then Solr is started with a Java option that runs a script
>> > whenever an OutOfMemoryError exception is thrown by the program.  What
>> > that script does is simple -- it logs a line to a logfile and then
>> > kills Solr with the -9 (SIGKILL) signal.  Note that there are a number
>> > of resource depletion scenarios, other than memory, which can result
>> > in an OutOfMemoryError.  That's why you were asked about open file
>> > and process limits.
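>> >
>> > To check the effective limits for the user that runs Solr:
>> >
>> > ulimit -n    # max open files
>> > ulimit -u    # max user processes
>> >
>> > Defaults on many distros are far too low for a large Solr install.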
>> >
>> > Most operating systems also have what has been named the "oom killer".
>> > When system memory becomes extremely tight, the OS will find programs
>> > using a lot of memory and kill one of them.
>> >
>> > These two things will LOOK like a crash, but they're not really crashes.
>> >
>> > > JVM-Memory 50.7%
>> > > 981.38 MB
>> > > 981.38 MB
>> > > 497
>> >
>> > This indicates that your max heap setting for Solr is in the ballpark
>> > of 1GB.  This is extremely small, and so you're probably throwing
>> > OutOfMemoryError because of heap space.  Which, on a non-Windows
>> > system, will basically cause Solr to commit suicide.  It does this
>> > because when OOME is thrown, program operation becomes completely
>> > unpredictable, and index corruption is a very real possibility.
>> >
>> > There are precisely two ways to deal with OOME.  One is to increase
>> > the size of the resource that is being depleted.  The other is to
>> > change the program or the program configuration so that it doesn't
>> > require as much of that resource.  Often, especially with Solr, the
>> > second option is simply not possible.
>> >
>> > Most likely you're going to need to increase Solr's heap far beyond
>> > 1GB.  There's no way for us to come up with a recommendation for you
>> > without asking you a lot of very detailed questions about your setup
>> > ... and even with that, it's possible that we would give you an
>> > incorrect recommendation.  I'll give you a number, and warn you that
>> > it could be wrong, either way too small or way too large.  Try an 8GB
>> > heap.  You have lots of memory in this system; 8GB is barely a drop
>> > in the bucket.
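>> >
>> > One way to set that, assuming a standard bin/solr install (the exact
>> > location of solr.in.sh varies by how Solr was installed):
>> >
>> > # in solr.in.sh:
>> > SOLR_HEAP="8g"
>> >
>> > # or, for a one-off test at startup:
>> > bin/solr start -m 8g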
>> >
>> > Thanks,
>> > Shawn
>> >
>>
>
