failure.
Back with Solr 1.3, before DIH, I wrote a Java program to fetch from the
database, then load. That did some transformation, mostly making queue adds
comparable with views (this was at Netflix).
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Multi-threaded indexing can speed things up. Use two threads per CPU
to get maximum throughput. I wrote a simple Python program to do that.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
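A minimal sketch of the multi-threaded indexing advice above, not the Python program the message mentions. The names `batches`, `post_batch`, and `index_all`, the batch size, and the update URL in the usage note are all assumptions for illustration. The thread count is twice the number of CPUs on the Solr side, passed in as `solr_cpus`.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(update_url, batch):
    """POST one batch of documents to a Solr update handler as a JSON array."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_all(docs, send, batch_size=500, solr_cpus=1):
    """Send every batch through `send` using two client threads per Solr CPU."""
    threads = 2 * solr_cpus
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the same order as the batches.
        return list(pool.map(send, batches(docs, batch_size)))
```

One way to call it, assuming a collection named `mycollection` on localhost: `index_all(docs, lambda b: post_batch("http://localhost:8983/solr/mycollection/update", b), solr_cpus=8)`. Raise or lower `batch_size` and `solr_cpus` while watching CPU utilization on the Solr hosts.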
> On Apr 6, 2025, at 5:11 PM, Robi Petersen wrote:
>
>
change from IUPUI? I went to North Central High School.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 18, 2025, at 9:19 AM, mw...@iu.edu wrote:
>
> So *how* does copyField work? Do I wind up with two identical copies
> of the data s
s field. The only attributes documented
> there are source, dest, and maxChars.
copyField is not a field. It is an instruction to take the text headed to
one field and also send it to another field.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
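To make the "instruction, not a field" point concrete, here is a hedged sketch that builds a Schema API command adding one copyField rule. It uses only the three attributes named in the quoted question (source, dest, maxChars). The helper name and the field names in the usage note are hypothetical.

```python
import json


def add_copy_field_command(source, dest, max_chars=None):
    """Build a Schema API request body that adds one copyField rule."""
    rule = {"source": source, "dest": dest}
    if max_chars is not None:
        rule["maxChars"] = max_chars
    return json.dumps({"add-copy-field": rule})
```

For example, `add_copy_field_command("title", "text_all")` produces a body that could be POSTed to the collection's `/schema` endpoint. The rule only says where incoming text is also sent; `text_all` must still be defined as a real field with its own type.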
. Something like:
score desc, id desc
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
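A small sketch of the tie-breaking sort shown above, assuming the unique key field is named `id` as in the snippet. The function name and the `rows` default are illustrative only.

```python
from urllib.parse import urlencode


def search_params(query, rows=10):
    """Encode query parameters with a secondary sort that breaks score ties."""
    return urlencode({
        "q": query,
        "rows": rows,
        # Equal scores fall back to the unique key, so order is repeatable.
        "sort": "score desc, id desc",
    })
```

The resulting string can be appended to a `/select` URL. Two indexes holding the same documents should then return tied documents in the same order.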
> On Jan 15, 2025, at 4:30 AM, Binal Panchal wrote:
>
> Hello Team,
>
> I have two Solr indexes having the same documents. If I do sorting, the
>
. Your book data will not change that frequently.
I ran search for Netflix, which is not that different from searching books. I
also ran search for Chegg, searching textbooks.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 4, 2025, at 1:46 PM, Nik
authors
title^8 authors^2
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 23, 2024, at 10:07 PM, Nikola Smolenski wrote:
>
> Thank you for the suggestion, but that wouldn't work because there could be
> multiple authors with t
26, 2024, at 8:06 AM, Walter Underwood wrote:
>
> Use multiple threads to send batches. I use two moderate sized batches and
> two threads per CPU. You can tune it until you see near 100% CPU utilization.
>
> Why two client threads per CPU? Roughly, one batch being processed by
processed.
Indexing is CPU-intensive, so once it approaches 100% utilization, it is maxed
out.
Add more CPUs to go faster.
I doubt that messing with commits will make a meaningful difference. Use auto
commit so the indexing threads aren’t waiting.
wunder
Walter Underwood
wun...@wunderwood.org
a new user
registered).
I actually can’t remember any index corruption in Solr and I’ve run versions
from 1.3 to 9.1 with both high query load (Netflix) and massive content
(LexisNexis).
I would look at system-level causes, not Solr.
wunder
Walter Underwood
wun...@wunderwood.org
http
.
I dealt with PDF documents in search for over twenty years. You are lucky to
get searchable text out of them.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 4, 2024, at 8:28 AM, Uwe Amberger wrote:
>
> Hallo!
>
> Problem descri
Honestly, there is a missing feature here. Solr should have a free text query
parser. Run the query through standard tokenizer, ignore all the syntax, and
make a bunch of word/phrase queries.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May
word. A more conservative
approach is to remove “*” and “?”, so you prevent script kiddie queries like
“a* b* c* d* e* f* …”
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
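The conservative approach described above can be sketched in a few lines. The function name is hypothetical, and this is only the wildcard-stripping step, not a full query sanitizer.

```python
def strip_wildcards(user_query):
    """Remove '*' and '?' so user input cannot expand into wildcard queries."""
    cleaned = user_query.replace("*", "").replace("?", "")
    # Collapse any whitespace left behind by the removed characters.
    return " ".join(cleaned.split())
```

With this, `strip_wildcards("a* b* c*")` becomes plain term queries for `a`, `b`, and `c` rather than three expensive prefix expansions.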
> On May 29, 2024, at 7:11 AM, Dmitri Maziuk wrote:
>
> Hi all,
>
>
, like a replica going down.
Pretty easy to test, shut down all the Zookeeper nodes in the middle of a load
test.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 21, 2024, at 8:30 AM, Matt Kuiper wrote:
>
> Thanks for the responses!
>
You can send the timeAllowed parameter. It is only checked at certain points in
request processing, but it will stop requests that run too long.
https://solr.apache.org/guide/solr/latest/query-guide/common-query-parameters.html#timeallowed-parameter
wunder
Walter Underwood
wun
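A hedged sketch of sending timeAllowed and checking whether a response was cut short. The helper names are made up, and the check assumes the JSON response flags a timed-out request with `partialResults` in its `responseHeader`.

```python
from urllib.parse import urlencode


def params_with_time_limit(query, limit_ms):
    """Encode a query with a timeAllowed limit given in milliseconds."""
    return urlencode({"q": query, "timeAllowed": limit_ms})


def is_partial(response):
    """Return True if a parsed JSON response is marked as partial results."""
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))
```

Callers would typically log or retry when `is_partial` is true, since the result list may be incomplete.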
/search/IndexSearcher.java at
3024e66e4aba942b039fcad7daf958aa4c90b8bf · apache/lucene
github.com
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 17, 2024, at 3:29 PM, Walter Underwood wrote:
>
> I know about both of those user-specifi
I know about both of those user-specified limits. They are documented, as is
the change in counting clauses in 9.0.
I’ll ask again, is there a hard upper limit on the value of maxBooleanClauses?
wunder
> On Apr 17, 2024, at 2:33 PM, Chris Hostetter wrote:
>
>
>
> : Is there a hard upper lim
Is there a hard upper limit for maxBooleanClauses? We have someone hitting a
limit at 64k clauses after upgrading to 9.x.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
are stored in the database. Having
Solr generate the IDs makes it impossible to update the documents. I’ve used
Solr in production for over 15 years and I’ve never had Solr generate the IDs.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 8, 2
This is a great example of a general technique to make Solr fast. Do the
parsing and selection at index time to make the query as simple as possible.
—wunder
> On Apr 5, 2024, at 10:55 AM, rajani m wrote:
>
> yeah, makes sense, thank you.
>
> On Fri, Apr 5, 2024 at 1:24 PM W
That is what I was going to suggest. Make a month field. —wunder
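As an illustration of the month-field idea, here is a sketch that derives the month on the client before the document is sent to Solr. The field names `published` and `published_month` and the timestamp format are assumptions, not taken from the thread.

```python
from datetime import datetime


def add_month_field(doc, date_field="published", month_field="published_month"):
    """Return a copy of doc with the month number (1-12) in its own field."""
    parsed = datetime.strptime(doc[date_field], "%Y-%m-%dT%H:%M:%SZ")
    enriched = dict(doc)
    enriched[month_field] = parsed.month
    return enriched
```

A query for April is then a simple term match such as `published_month:4` instead of date arithmetic at query time.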
> On Apr 5, 2024, at 8:22 AM, Alexandre Rafalovitch wrote:
>
> If you know you are going to search by it, clone the field without storage
> and preprocess to just leave the months behind. That's like 12 possible
> values - super e
term (empty), then there is
nothing to have a position.
It might be possible to do that with a string field, but this is TextField.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 4, 2024, at 6:38 AM, Carsten Klement
> wrote:
>
://repost.aws/questions/QUqyZD98d0TbiluqPBW_zALw/how-to-get-comparable-performance-to-gp2-gp3-on-efs
How to get comparable performance to gp2/gp3 on EFS?
repost.aws
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 28, 2024, at 11:18 PM, Gus H
node should have its
own EBS volume, preferably GP3.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 28, 2024, at 7:51 PM, Beale, Jim (US-KOP)
> wrote:
>
> I did send the query. Here it is:
>
> http://samisolrcld.aws01.hibu.i
First, a shared disk is not a good idea. Each node should have its own local
disk. Solr makes heavy use of the disk.
If the indexes are shared, I’m surprised it works at all. Solr is not designed
to share indexes.
Please share the full query string.
wunder
Walter Underwood
wun
was virtual, all
bare metal.
If you are in a mass market hit-oriented business, document cache might pay
off. Where I work now, every client has a different need (legal support), so
our cache hit rates are very small.
It all comes back to the users.
wunder
Walter Underwood
wun
You seem to be jumping to conclusions about causes. Might want to step back and
do some measurements.
Try eliminating parts of the query one at a time, including returning fields.
You might need to do this with a query set of a few thousand queries to avoid
cache effects.
wunder
Walter
include an aggregate popularity, for example.
Maybe add overall recency.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 4, 2024, at 10:28 AM, rajani m wrote:
>
> Hi Wunder,
>
> The base ranker takes care of matching and rankin
reRankDocs is set to 1000. I would try with a lower number, like 100. If the
best match is not in the top 100 documents, something is wrong with the base
relevance algorithm.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 4, 2024, at 9:28
documentation, such as it is, is in solr/modules/*/README.md.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 22, 2023, at 2:05 AM, Jan Høydahl wrote:
>
> Really, if you have a pointing to ../../contrib// that would
> translate in
The use for this is migrating from 8.x to 9.x and replacing <lib> directives with modules.
Folks need to know which modules replace the directives they are removing.
wunder
> On Dec 20, 2023, at 8:30 AM, Walter Underwood wrote:
>
> Is there a list of modules and what they include? It seems scatter
Is there a list of modules and what they include? It seems scattered around the
docs.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
product} instead of {product}.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 14, 2023, at 12:07 PM, Shawn Heisey
> wrote:
>
> On 12/14/23 13:04, Shawn Heisey wrote:
>> On 12/14/23 12:58, Shawn Heisey wrote:
>>> On
Thanks for the recommendation. Are you running this on Intel or ARM64? We’ve
mostly moved to ARM64. —wunder
> On Dec 12, 2023, at 9:55 AM, Shawn Heisey wrote:
>
> Java 11 is a good solid choice. Java 17 seems to perform a little better
> than 11 on Solr 9.x, but I haven't actually measured it
I think the Velocity support is moved to contrib.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 21, 2023, at 4:42 AM, Vince McMahon
> wrote:
>
> Oh, browse endpoint is deprecated... Thanks!
>
> On Tue, Nov 21, 2023 at 5
Thanks for confirming. Yes, we’ll use the CloneFieldUpdateProcessorFactory.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 23, 2023, at 11:36 PM, Mikhail Khludnev wrote:
>
> Hello Walter.
> I'm afraid the copyField directive
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
.
This page has some size comparisons for one data set.
https://www.adaltas.com/en/2021/03/22/performance-comparison-of-file-formats/
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 17, 2023, at 11:09 AM, Christine Poerschke (BLOOMBERG/ LONDON
was because they got allocated on a different EC2 instance type. Oops.
We do see some persistent cohorts with different performance, but nothing like
30 ms vs 1000 ms.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 29, 2023, at 9:24 PM, Sh
a real-time get before the update to check whether the document is
really there? That should be pretty fast.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 25, 2023, at 9:53 AM, Dmitri Maziuk wrote:
>
> On 9/25/23 08:24, Shawn Hei
=nmjyrl9z0n92lgidfei45vq4q&dl=0
The collection currently has about 2.5 billion documents. When I worked at
Infoseek, our index of the entire web was 12 million documents.
This is at LexisNexis.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 13, 2
a or fail. I
>> don't
>>> think join query (unless crossCollection) bothers about shard preference.
>>>
>>> On Wed, Sep 13, 2023 at 7:21 PM Walter Underwood
>>> wrote:
>>>
>>>> We have a sharded collection that joins with a non-s
ference.
>
> On Wed, Sep 13, 2023 at 7:21 PM Walter Underwood
> wrote:
>
>> We have a sharded collection that joins with a non-sharded collection. The
>> non-sharded collection has a replica on every node. Does the join
>> automatically choose the local replica or
We have a sharded collection that joins with a non-sharded collection. The
non-sharded collection has a replica on every node. Does the join automatically
choose the local replica or do we need to pass in a shard preference param?
wunder
Walter Underwood
wun...@wunderwood.org
http
sharding.
Double the shards, halve the response time.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 13, 2023, at 4:48 AM, Jan Høydahl wrote:
>
> Hi,
>
> There are no hard rules wrt sharding, it often comes down to measuring and
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 8, 2023, at 7:43 AM, Lyn Evans
> wrote:
>
> We will need to see the status of the Circuit Breaker ( enabled | disabled )
> after this endpoint is invoked.
>
> The ~/config API will
ack when Solr was new (version 1.3). The
synonyms covered “superman”, “babysitter”, “manhunt”, “fullmetal”, etc. The
last was for “Full Metal Jacket” and “Fullmetal Alchemist”. There were about
300 synonyms.
You might also need to consider hyphenated versions, like “Spider-man”.
wunder
Wal
I’ve seen this kind of thing happen when the overseer is stuck for some reason.
Look for a long queue of work for the overseer in zookeeper. I’ve fixed that by
restarting the node which is the overseer. The new one wakes up and clears the
queue. I’ve only seen that twice.
Wunder
> On Jun 5, 20
I wouldn’t call it semantic sugar, more like a different compact format. The
compact format also avoids duplicate keys, which are legal in JSON but hard to
create in some systems.
The XML command format is working fine.
wunder
Walter Underwood
wun...@wunderwood.org
http
.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14234208
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 25, 2023, at 12:19 AM, Thomas Corthals wrote:
>
> Hi Walter
>
> Deleting multiple IDs at once with JSON is mentioned here
back into a consistent state while we wait for the next
full reindex.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2023, at 1:44 PM, Shawn Heisey wrote:
>
> On 5/24/23 10:48, Walter Underwood wrote:
>> I think I know how w
-006H-40F0-0-00
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2023, at 1:13 PM, Ishan Chattopadhyaya
> wrote:
>
> Ah, now I remember this comment:
> https://issues.apache.org/jira/browse/SOLR-5890?focusedCommentI
Nice catch. This issue looks exactly like what I’m seeing, it returns success
but does not delete the document.
SOLR-5890
Delete silently fails if not sent to shard where document was added
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May
to avoid changing
the number of shards without a reindex. One of the other clusters has 320
shards.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2023, at 10:12 AM, Gus Heck wrote:
>
> Understood, of course I've seen your na
volumes were mounted for the matching shards. New shards got
empty volumes. Then the content was reloaded without a delete-all.
Would it work to send the deletes directly to the leader for the shard? That
might bypass the hash-based routing.
wunder
Walter Underwood
wun...@wunderwood.org
http
not an everyday
occurrence. I’m trying to clean up the minor problem of 675k documents with
dupes.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 24, 2023, at 8:06 AM, Jan Høydahl wrote:
>
> I thought deletes were "broadcast"
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
No, Solr Cloud automatically routes it to the correct shard.
wunder
> On May 11, 2023, at 6:41 PM, Anjali Maurya
> wrote:
>
> But it needs a route parameter to find the right shard from where we need
> to delete the document.
>
> On Tue, May 9, 2023 at 11:24 PM Walt
Leave off the routing and send multiple IDs. Solr Cloud will route them to the
correct shards for you. This is just as fast as Solr Cloud reading the route
parameter and sending it to the right shard. The whole point of Solr Cloud is
that it manages shards and replicas for you.
wunder
Walter
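A minimal sketch of a delete request body carrying several IDs and no route parameter, as suggested above. The helper name is hypothetical. The body uses the JSON update format's list form for delete-by-ID.

```python
import json


def delete_by_ids_body(ids):
    """Build a JSON update body that deletes several documents by ID."""
    return json.dumps({"delete": list(ids)})
```

The body would be POSTed to the collection's `/update` handler with a JSON content type, leaving shard selection to Solr Cloud.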
We are looking at changing a field property to be large=true. Can we do that
without reindexing?
Also, I’d appreciate pointers to discussions about the performance implications.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
a leader to
replicate from.
Any ideas on how to unwedge this?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
://solr.apache.org/guide/8_11/solrcloud-query-routing-and-read-tolerance.html#shards-tolerant-parameter
I looked at the original Jira for that, but it is for 4.0 and things have
changed just a little bit (https://issues.apache.org/jira/browse/SOLR-3134).
wunder
Walter Underwood
wun
the sawtooth, then add some headroom, maybe
a gigabyte. Test with that value.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 14, 2023, at 7:01 AM, HariBabu kuruva
> wrote:
>
> Hi ,
>
> Till now it was running with 45GB heap m
How frequent are your commits?
wunder
Walter Underwood
wun...@wunderwood.org
https://observer.wunderwood.org/ (my blog)
> On Mar 10, 2023, at 12:27 AM, Hakan Özler wrote:
>
> Regarding the problem, we're able to mitigate it by increasing the time
> between
> commits to the
Use a heap analysis tool. You’ll see a sawtooth pattern in the heap size. The
bottom of that sawtooth is the actual amount of memory that Solr is using. Pick
the highest point of the bottom of the sawtooth, then add some headroom, maybe
a gigabyte. Test with that value.
wunder
Walter Underwood
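The sizing rule above reduces to simple arithmetic. This sketch assumes you have already pulled the heap-used-after-collection values (the bottoms of the sawtooth) out of a GC log or analysis tool. The function name and the megabyte units are illustrative.

```python
def suggested_heap_mb(used_after_gc_mb, headroom_mb=1024):
    """Highest trough of the sawtooth plus headroom, in megabytes."""
    if not used_after_gc_mb:
        raise ValueError("need at least one post-collection heap sample")
    return max(used_after_gc_mb) + headroom_mb
```

For troughs of 3100, 3400, and 3250 MB this suggests 4424 MB as a starting point, which still needs to be confirmed under a load test.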
Is it supposed to be:
{"delete": {"id": "1E089335-892C-41F6-B767-632EB5361775"}}
wunder
Walter Underwood
wun...@wunderwood.org
https://observer.wunderwood.org/ (my blog)
> On Mar 7, 2023, at 1:20 PM, Thomas Corthals wrote:
>
> Got blindsided by the quotes and didn't noti
up an extra
downstream machine to play with until you get it right.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Mar 2, 2023, at 10:42 AM, gnandre wrote:
>
> Thanks! I am using non-cloud mode at the moment. So, there is no way to
> j
You need to send a build request to each node. I used to have some code to dig
out the nodes from a cluster status, then send a build to each one, but I think
that is marooned at my previous company. It isn’t super hard, just dig it out
of the JSON.
wunder
Walter Underwood
wun
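This is not the lost code mentioned above, only a sketch of the "dig it out of the JSON" step. It assumes the parsed response of a Collections API CLUSTERSTATUS call, where each replica entry carries a `base_url`. The function name is hypothetical, and the build request itself is left out because the snippet does not say which component is being built.

```python
def node_base_urls(cluster_status, collection):
    """Collect the distinct node base URLs hosting replicas of a collection."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    urls = set()
    for shard in shards.values():
        for replica in shard["replicas"].values():
            urls.add(replica["base_url"])
    return sorted(urls)
```

A caller would loop over the returned URLs and send the build request to each one in turn.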

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 17, 2023, at 11:52 AM, Mark Hieber wrote:
>
> We have a cluster of hosts running Solr 8.4 Each host has an application
> which listens to an external source for updated documents. Wh
Just reboot it. Solr will shut down all connections, interrupting any
in-progress replication. The replication will be retried after it starts back
up.
Failure of the master during replication has been safe for many years.
wunder
Walter Underwood
wun...@wunderwood.org
http
consistent ordering.
Exact score ties are common with one word queries and short documents, like
book or movie titles.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 11, 2023, at 4:10 AM, Peter Lancaster
> wrote:
>
> Hi Mikhail,
>
>
single cloud. Any suggestions?
It was challenging to manage with 8 shards and a replication factor of 8. At
that point, we scaled vertically to bigger AWS instances. It scaled smoothly up
to 72 CPU instances.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
update, but those shouldn’t be frequent.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
invented
by Infoseek. That patent expired several years ago, so we should implement it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 28, 2022, at 5:35 AM, Eric Pugh
> wrote:
>
> For a very long time, that was what folks always
neighboring zip codes would be to find what DMA (direct
marketing area) the address is in, then find the zip codes that are in that DMA.
This thread has some relevant discussion:
https://www.reddit.com/r/adops/comments/oxdthy/zip_code_to_dma_converter/
wunder
Walter Underwood
wun...@wunderwood.org
/data=!4m5!3m4!1s0x80ba30d165da8f09:0xaf8f27eb9fd93664!8m2!3d38.3675335!4d-115.9467997
What does the current API do, exactly?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 14, 2022, at 9:37 AM, dmitri maziuk wrote:
>
> On 2022-12-14
than the caching in a
single Solr server.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 13, 2022, at 11:27 AM, David Hastings
> wrote:
>
> Ah, that makes sense. If you can do sticky sessions and such with your
> balancers, plus
could send the same query back to the same
host, but AWS load balancers aren’t very smart.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 13, 2022, at 3:50 AM, Dave wrote:
>
> Ha I meant qtimes not atone. Also in general you should
If you want apple OR pear, use:
myField:apple myField:pear
If you want apple AND pear, use:
+myField:apple +myField:pear
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
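The two query shapes above can be generated from a list of terms. A small sketch with hypothetical helper names; it assumes the default operator is OR and that the terms contain no characters needing escaping.

```python
def any_of(field, terms):
    """Optional clauses: a document matches if any term is present."""
    return " ".join(f"{field}:{term}" for term in terms)


def all_of(field, terms):
    """Required clauses: the '+' prefix makes every term mandatory."""
    return " ".join(f"+{field}:{term}" for term in terms)
```

`any_of("myField", ["apple", "pear"])` and `all_of("myField", ["apple", "pear"])` reproduce the two query strings shown in the message.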
> On Dec 9, 2022, at 9:22 AM, Matthew Castrigno wrote:
>
> I am havin
format.
https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-update-handlers.html#json-formatted-index-updates
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 28, 2022, at 2:59 PM, Matthew Castrigno wrote:
>
> Thank you
,\"Date\":\"2022-10-03T12:30:17.3388537\",\"ContentType\":\"Blog\",\"Body\":{\"Fields\":[{\"Name\":\"Heading
Background Image\",\"Type\":\"Image\",\"Value\":\"\”},...
would add fields like
That is invalid JSON. The client needs to fix it. I’m surprised it indexes at
all. This should not be your problem.
Paste that string into this: https://jsonlint.com
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
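For checking payloads locally instead of pasting them into a web page, a short sketch using Python's standard `json` module. The function name is made up, and Python's parser is slightly more lenient than a strict validator (it accepts duplicate keys, for instance).

```python
import json


def json_error(text):
    """Return None for parseable JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None
```

Running this over the client's payload before indexing gives the line and column to report back to whoever produces the data.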
> On Nov 28, 2022, at 12:57 PM, Matt
. That is more predictable.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 16, 2022, at 3:55 AM, Jan Høydahl wrote:
>
> Also see the Ref Guide about Request
.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 15, 2022, at 3:49 AM, DAVID MARTIN NIETO wrote:
>
> hello solr users
>
> We have a production cluster
ng of name/value pairs.”
https://www.ecma-international.org/publications-and-standards/standards/ecma-404/
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 31, 2022, at 1:42 PM, Adam Constabaris
> wrote:
>
> I don't know if there
Run a GC analyzer on that JVM. I cannot imagine that they need 80 GB of heap.
I’ve never run with more than 16 GB, even for a collection with 70 million
documents.
Look at the amount of heap used after full collections. Add a safety factor to
that, then use that heap size.
wunder
Walter
speed and capacity of the disk system.
If the index does fit in RAM, then you should be fine.
You may want to spend some effort on reducing index size if it is near the
limit.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Oct 6, 2022, at 8:18
was a movie titled “+/-“, but that is a different problem.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Sep 27, 2022, at 12:49 PM, Miguel Joy
> wrote:
>
> Hi Walter,
>
> Thanks very much for your honest feedback. As I ment
Honestly, this analysis chain is a mess.
* StandardTokenizer has parsing support for email addresses, so that is a
better choice.
* Never mix phonetic transformation and stemming, use different chains.
Phonetic tokens aren’t stemmable.
* Don’t stem email addresses.
* Don’t do phonetic transforms
I’ve always used the HTTP (access) log. In that, queries to shards are POST
requests, so if the external requests are all GET, they are easy to sort out.
wunder
Walter Underwood
https://observer.wunderwood.org/
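A sketch of pulling external queries out of an HTTP access log using the GET/POST distinction described above. It assumes an NCSA-style request line such as `"GET /solr/books/select?q=foo HTTP/1.1"` and that client queries go to a `/select` handler. Both are assumptions about the deployment, as are the names used here.

```python
import re

# Matches the quoted request part of an NCSA-style access log line.
REQUEST = re.compile(r'"(GET|POST) (\S+) HTTP/[\d.]+"')


def external_queries(log_lines):
    """Return request paths of GET /select requests, keeping duplicates."""
    found = []
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group(1) == "GET" and "/select" in match.group(2):
            found.append(match.group(2))
    return found
```

Duplicates are kept on purpose so the extracted list reflects real traffic if it is later replayed in a load test.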
> On Sep 22, 2022, at 7:02 PM, Shawn Heisey wrote:
>
> On 9/22/22 09:1
In the real world, many queries are repeated, so it is best to replay logged
queries keeping all the dupes.
wunder
Walter Underwood
https://observer.wunderwood.org/
> On Sep 21, 2022, at 4:31 PM, Derek C wrote:
>
> Thanks Deepak,
>
> I'm going to do more testing and c
I made this work with 6.x but don’t remember the details, sorry. I think it
wanted application/something, maybe the POST format.
wunder
> On Sep 9, 2022, at 1:38 PM, Mikhail Khludnev wrote:
>
> Hold on. JSON query DSL lets you pass quite long content via body. It
> should support {!mlt}. At
just aren’t designed for persistent data.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jul 17, 2022, at 7:58 AM, Dave wrote:
>
> Three nodes with nginx in front will handle well over 50k searches a day on a
> half terabyte index,
change? Not very often, I bet.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 29, 2022, at 12:50 AM, Noah Torp-Smith wrote:
>
> Interestingly, I found that
>
> [child childFilter=$pidfilter limit=-1]&pidfilter=+instance.agen
200k documents, but
response times were well under 100 ms, as I remember.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 1, 2022, at 3:00 PM, Christopher Schultz
> wrote:
>
> All,
>
> Since Solr / Lucene can't def
We had one 4.x cluster that was difficult to migrate, so until recently we were
using SolrJ 4.x with our new Solr 8.7 clusters and with a Solr 4.10.4 cluster.
We were not doing anything fancy like faceting.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)