PS: wrt the request for a "literal, complete search URL" to aid troubleshooting: any facet params, plus `sort`, `offset`, and `rows`, would all be of particular interest.
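Something of roughly this shape would show everything relevant at once (host, collection, and param values below are placeholders, not a recommendation):

http://localhost:8983/solr/mycollection/select?q=ptokens:41654165%20AND%20ptokens:65798761&sort=score%20desc&start=0&rows=10&facet=true&facet.field=ptokens

Note that the q param here is also a well-formed version of the example query discussed below: each field name immediately adjacent to its value, and the AND fully encoded as %20AND%20.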
On Thu, Jul 22, 2021 at 12:25 PM Michael Gibney <mich...@michaelgibney.net> wrote:

> SortableTextField uses docValues in a very specific way, and is not a
> general-purpose workaround for enabling docValues on TextFields. Possibly
> of interest: https://issues.apache.org/jira/browse/SOLR-8362
>
> That said, docValues are relevant mainly (only?) for full-domain, per-doc
> value access (e.g., for faceting, sorting, functions, export, ...).
> Enabling docValues on a field that you only run _searches_ against is
> unlikely to help.
>
> If search latency is the main issue for you now, sharing more detail
> about the queries you're running would be helpful (e.g., are you only
> running searches? are you also running facets? how are you sorting?
> etc.). Pasting a literal, complete search URL (and any configured param
> defaults, if applicable) could also help. FWIW, the example search you
> provided earlier, ".../select?q=ptokens: 41654165%20AND% ptokens:
> 65798761", looks a bit odd in several respects and may not be interpreted
> the way you think it should be; e.g., the field name should be
> immediately adjacent to the field value, with no intervening whitespace,
> and the "%20AND%" encoding looks truncated.
>
> I note that you have a small amount of swap space in use; "small amount
> used" or not, I would _strongly_ recommend disabling swap entirely
> (`swapoff -a`). There are risks to disabling swap in general, but with an
> index that large you should be running with enough memory headroom for
> the OS page cache that you never get anywhere near a situation where
> application memory actually _needs_ swap. Also, a shot in the dark: is
> there any chance you're running this index on a network filesystem?
>
> On Thu, Jul 22, 2021 at 11:51 AM Jon Morisi <jon.mor...@hsc.utah.edu>
> wrote:
>
>> I dug some more into a workaround and found the SortableTextField field
>> type:
>> https://solr.apache.org/guide/7_4/field-types-included-with-solr.html
>>
>> My max length is 3945.
>>
>> Any concerns about changing my solr.TextField type to a
>> SortableTextField type in order to enable docValues?
>> I would then configure maxCharsForDocValues to 4096.
>>
>> Is this a bad idea, or am I on the right track?
>> Is there another way to enable docValues for a pipe-delimited string of
>> tokens?
>>
>> -----Original Message-----
>> From: Jon Morisi <jon.mor...@hsc.utah.edu>
>> Sent: Thursday, July 22, 2021 8:45 AM
>> To: users@solr.apache.org
>> Subject: RE: Solr nodes crashing
>>
>> I looked into this (https://solr.apache.org/guide/7_4/docvalues.html),
>> and it looks like I can't use docValues because my field type is
>> solr.TextField. Specifically:
>>
>> <fieldType name="PipeToken" class="solr.TextField"
>>            positionIncrementGap="100" multiValued="false">
>>   <analyzer>
>>     <tokenizer class="solr.SimplePatternSplitTokenizerFactory"
>>                pattern="|"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> I'm passing in a string of tokens separated by '|'.
>>
>> Some (made-up) example data would be:
>> 41654165|This is a phrase|6579813|phrases are all one
>> 41654165|token|65798761|There can be multiple phrases or tokens per doc
>>
>> Is there a workaround?
>>
>> My search would look something like:
>> .../select?q=ptokens: 41654165%20AND% ptokens: 65798761
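(For reference, the schema change Jon asks about above would look roughly like the following. This is only a sketch, not a recommendation: everything except the class and maxCharsForDocValues="4096" is carried over from his existing PipeToken definition, and docValues="true" simply makes the SortableTextField default explicit.)

<fieldType name="PipeToken" class="solr.SortableTextField"
           positionIncrementGap="100" multiValued="false"
           docValues="true" maxCharsForDocValues="4096">
  <analyzer>
    <tokenizer class="solr.SimplePatternSplitTokenizerFactory"
               pattern="|"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

(As Michael points out above, though, the docValues this creates hold the original, pre-analysis input string, for sorting and similar full-domain access; they would not speed up searches against the analyzed tokens.)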
>> -----Original Message-----
>> From: Mike Drob <md...@mdrob.com>
>> Sent: Wednesday, July 21, 2021 12:36 PM
>> To: users@solr.apache.org
>> Subject: Re: Solr nodes crashing
>>
>> You may want to look into enabling docValues for your fields in your
>> schema, if not already enabled. That often helps with memory usage at
>> query time, but requires a reindex of your data.
>>
>> There are also firstSearcher and newSearcher queries you can configure
>> in your Solr config; those would warm your caches for you if that is
>> the issue.
>>
>> Mike
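(The firstSearcher/newSearcher warming Mike describes goes in the <query> section of solrconfig.xml. A minimal sketch, with the warming query itself only a placeholder:)

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">ptokens:41654165 AND ptokens:65798761</str></lst>
  </arr>
</listener>
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">ptokens:41654165 AND ptokens:65798761</str></lst>
  </arr>
</listener>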
>> On Wed, Jul 21, 2021 at 11:06 AM Jon Morisi <jon.mor...@hsc.utah.edu>
>> wrote:
>>
>> > Thanks for the help, Shawn and Walter. After increasing the open-files
>> > limit to 128000 and increasing the JVM memory to 16 GB, I was able to
>> > load my documents.
>> >
>> > I now have a collection with 2.3 T rows / ~480 GB running on a 4-node
>> > cluster. I have found that complicated queries (for example, searching
>> > a field for two terms joined with "AND") often time out. If I try
>> > multiple times, the query does eventually complete. I'm assuming this
>> > is a caching / warm-up issue.
>> >
>> > Is there a configuration option I can use to cache the indexes for one
>> > of the columns, or to increase the timeout? Any other advice to get
>> > this performing quicker is appreciated.
>> >
>> > Thanks again,
>> > Jon
>> >
>> > -----Original Message-----
>> > From: Shawn Heisey <apa...@elyograg.org>
>> > Sent: Thursday, July 1, 2021 6:48 PM
>> > To: users@solr.apache.org
>> > Subject: Re: Solr nodes crashing
>> >
>> > On 7/1/2021 4:23 PM, Jon Morisi wrote:
>> > > I've had an indexing job running for 24+ hours. I'm importing 100m+
>> > > documents. After about 8 hours both of the replica nodes crashed,
>> > > but the primary nodes have continued to run and index.
>> >
>> > There's a common misconception here. Java programs, including Solr,
>> > almost never crash.
>> >
>> > If you've started a recent Solr version on a platform other than
>> > Windows, then Solr is started with a Java option that runs a script
>> > whenever the program throws an OutOfMemoryError. What that script does
>> > is simple -- it logs a line to a logfile and then kills Solr with the
>> > -9 (SIGKILL) signal. Note that there are a number of resource-depletion
>> > scenarios other than memory that can result in an OutOfMemoryError;
>> > that's why you were asked about open-file and process limits.
>> >
>> > Most operating systems also have what is known as the "OOM killer":
>> > when system memory becomes extremely tight, the OS will find programs
>> > using a lot of memory and kill one of them.
>> >
>> > Both of these will LOOK like a crash, but they're not really crashes.
>> >
>> > > JVM-Memory 50.7%
>> > > 981.38 MB
>> > > 981.38 MB
>> > > 497
>> >
>> > This indicates that your max heap setting for Solr is in the ballpark
>> > of 1GB. That is extremely small, so you're probably throwing
>> > OutOfMemoryError because of heap space -- which, on a non-Windows
>> > system, will basically cause Solr to commit suicide. It does this
>> > because once OOME is thrown, program operation becomes completely
>> > unpredictable, and index corruption is a very real possibility.
>> >
>> > There are precisely two ways to deal with OOME. One is to increase the
>> > size of the resource that is being depleted. The other is to change
>> > the program or the program configuration so that it doesn't require as
>> > much of that resource. Often, especially with Solr, the second option
>> > is simply not possible.
>> >
>> > Most likely you're going to need to increase Solr's heap far beyond
>> > 1GB. There's no way for us to come up with a recommendation without
>> > asking you a lot of very detailed questions about your setup ... and
>> > even then, it's possible that we would give you an incorrect
>> > recommendation. I'll give you a number, with the warning that it could
>> > be wrong in either direction, way too small or way too large: try an
>> > 8GB heap. You have lots of memory in this system; 8GB is barely a drop
>> > in the bucket.
>> >
>> > Thanks,
>> > Shawn
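(A sketch of how Shawn's 8GB suggestion is typically applied; the include-file path assumes a standard service-style install:)

# in /etc/default/solr.in.sh (or bin/solr.in.sh):
SOLR_HEAP="8g"

# or equivalently, when starting a node by hand:
bin/solr start -m 8g

(Either form sets both -Xms and -Xmx, so the heap is fixed at 8GB rather than growing into it.)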