Hi Jim,

Welcome to the Solr user list. I'm not sure why you are asking about list
liveness; I don't see any prior messages from you:
https://lists.apache.org/list?users@solr.apache.org:lte=1M:jim

Probably the most important thing you haven't told us is the current size
of your indexes. You said 20k/day input, but at the start do you have
0 days, 1 day, 10 days, 100 days, 1000 days, or 10000 days (27y) on disk
already?

If you are starting from zero, then there is likely a 20x or more growth in
the size of the index between the first and second measurements. Indexes do
get slower with size, though you would need fantastically large documents or
some sort of disk problem for that to explain it.

However, maybe you do have huge documents or disk issues, since your query
time at time1 is already abysmal. Either you are creating a fantastically
expensive query, or your system is badly overloaded. New systems, properly
sized, with moderate-sized documents ought to be serving simple queries in
tens of milliseconds.

As others have said, it is *critical that you show us the entire query
request*. If you are doing something like attempting to return the entire
index with rows=999999, that would almost certainly explain your issues...
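If deep paging is actually the goal, Solr's cursorMark is the usual
alternative to a huge rows value. A minimal sketch (the host, collection,
field names, and query below are made up for illustration; note that
cursorMark requires a sort clause that includes the uniqueKey field):

```python
import json
import urllib.parse
import urllib.request


def cursor_url(base, params, cursor):
    """Build a select URL with the current cursorMark attached."""
    return base + "?" + urllib.parse.urlencode(dict(params, cursorMark=cursor))


def fetch_all(base, params, get_json=None):
    """Stream every matching doc in pages instead of one giant rows= request."""
    if get_json is None:
        def get_json(url):
            with urllib.request.urlopen(url) as resp:
                return json.load(resp)
    cursor = "*"  # initial cursor per the cursorMark protocol
    while True:
        rsp = get_json(cursor_url(base, params, cursor))
        yield from rsp["response"]["docs"]
        next_cursor = rsp["nextCursorMark"]
        if next_cursor == cursor:  # same cursor returned means no more pages
            return
        cursor = next_cursor


# Hypothetical usage -- adjust host/collection/fields to your setup:
# docs = list(fetch_all(
#     "http://localhost:8983/solr/mycollection/select",
#     {"q": "businessId:7016274253", "rows": 500,
#      "sort": "id asc", "wt": "json"},
# ))
```

Each page then costs a bounded amount of work on the server instead of one
request that must materialize everything at once.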

How large are your average documents (in terms of bytes)?

Also what version of Solr?

r5.xlarge only has 4 CPUs and 32 GB of memory. That's not very large
(despite the name). However, since it's unclear what your total index size
looks like, it might be OK.

What are your IOPS constraints with EFS? Are you running out of a quota
there? (bursting mode?)

Note that EFS is an encrypted file system and stunnel is encrypted
transport, so each disk read is likely causing:

   - read raw encrypted data from disk to memory (at AWS)
   - decrypt the disk data in memory (at AWS)
   - encrypt the memory data for stunnel transport (at AWS)
   - send the data over the wire
   - decrypt the data for use by Solr (on hardware you specify)

That's guaranteed to be slow, and worse yet, you have no control at all
over the size or loading of the hardware performing anything but the last
step. You are completely at the mercy of AWS's cost/speed tradeoffs which
are unlikely to be targeting the level of performance usually desired for
search disk IO.

I'll also echo others and say that it's a bad idea to allow Solr instances
to compete for disk IO in any way. I've seen people succeed with setups
that use invisibly provisioned disks, but one typically has to run more
hardware to compensate. A shared disk creates competition, and it also
creates a single point of failure, partially invalidating the notion of
running 3 servers in cloud mode for high availability. If you can't have
more than one disk, then you might as well run a single node, especially at
small data sizes like 20k/day. A single node on well-chosen hardware can
usually serve tens of millions of normal-sized documents, which would be
several years of data for you (assuming low query rates; handling high
rates of course starts to require more hardware).

Finally, you will want to get away from using single queries as a
measurement of latency. If you care about response time I HIGHLY suggest
you watch this YouTube video on how NOT to measure latency:
https://www.youtube.com/watch?v=lJ8ydIuPFeU
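As a rough sketch of what "measuring latency" means in practice: collect
many samples and look at percentiles, never a single request. The timings
below are simulated for illustration, not from any real system; in practice
you would time each real query (e.g. with time.perf_counter around the
HTTP call).

```python
import math
import random


def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (0 < p <= 100)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[k - 1]


# Simulated latencies in ms -- real measurements go here instead.
random.seed(0)
latencies = [random.lognormvariate(3.0, 0.8) for _ in range(1000)]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies, p):.1f} ms")
```

With a long-tailed distribution like this, the p99 is typically several
times the median, which is exactly what a single timed query hides.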

On Fri, Feb 23, 2024 at 6:44 PM Jan Høydahl <jan....@cominvent.com> wrote:

> I think EFS is a terribly slow file system to use for Solr, who
> recommended it? :)
> Better use one EBS per node.
> Not sure if the gradually slower performance is due to EFS though. We need
> to know more about your setup to get a clue. What role does stunnel play
> here? How are you indexing the content etc.
>
> Jan
>
> > 23. feb. 2024 kl. 19:58 skrev Walter Underwood <wun...@wunderwood.org>:
> >
> > First, a shared disk is not a good idea. Each node should have its own
> local disk. Solr makes heavy use of the disk.
> >
> > If the indexes are shared, I’m surprised it works at all. Solr is not
> designed to share indexes.
> >
> > Please share the full query string.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> >> On Feb 23, 2024, at 10:01 AM, Beale, Jim (US-KOP)
> <jim.be...@hibu.com.INVALID> wrote:
> >>
> >> I have a Solrcloud installation of three servers on three r5.xlarge EC2
> with a shared disk drive using EFS and stunnel.
> >>
> >> I have documents coming in about 20000 per day and I am trying to
> perform indexing along with some regular queries and some special queries
> for some new functionality.
> >>
> >> When I just restart Solr, these queries run very fast but over time
> become slower and slower.
> >>
> >> This is typical for the numbers. At time1, the request only took 2.16
> sec but over night the response took 18.137 sec. That is just typical.
> >>
> >> businessId, all count, reduced count, time1, time2
> >> 7016274253,8433,4769,2.162,18.137
> >>
> >> The same query is so far different. Overnight the Solr servers slow
> down and give terrible response. I don’t even know if this list is alive.
> >>
> >>
> >> Jim Beale
> >> Lead Software Engineer
> >> hibu.com
> >> 2201 Renaissance Boulevard, King of Prussia, PA, 19406
> >> Office: 610-879-3864
> >> Mobile: 610-220-3067
> >>
> >>
> >>
> >> The information contained in this email message, including any
> attachments, is intended solely for use by the individual or entity named
> above and may be confidential. If the reader of this message is not the
> intended recipient, you are hereby notified that you must not read, use,
> disclose, distribute or copy any part of this communication. If you have
> received this communication in error, please immediately notify me by email
> and destroy the original message, including any attachments. Thank you.
> **Hibu IT Code:1414593000000**
> >
>
>

-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
