Hi Greg, Matt,
Thank you for the responses; they're very helpful, and it's great to hear
that Taxonomy is successfully used for large-scale products!
Our biggest concern with it right now is future complications related to
index split and merge, which we are most likely going to use to implement
sharding a
During this change I had to change the way I store indexes. This change
results in many more .cfs and .fdt files being generated than before.
Previously there were 5-7 files in the index folder; now it has grown to 40+.
Does it cause any problems that the way indexes are stored internally has
changed with this ch
// Method to create a document from a map of field names to values
private static Document createDocumentTextField(HashMap<String, String> fields) {
    Document document = new Document();
    for (String key : fields.keySet()) {
        String val = fields.get(key);
        Field f = new TextField(key, val, Field.Store.YES);
        document.add(f);
    }
    return document;
}
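For what it's worth, here is a minimal sketch of how such documents might be
fed to an IndexWriter, together with the two settings that most directly
affect how many files each segment produces: the compound-file setting
(.cfs) and merging. The index path, analyzer, field values, and the
IndexingSketch class name are placeholders, not taken from the setup above,
and forceMerge(1) is shown only for illustration since it is expensive on
large indexes.

import java.nio.file.Paths;
import java.util.HashMap;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class IndexingSketch {
    public static void main(String[] args) throws Exception {
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        // Compound files (.cfs) bundle most per-segment files; leaving this
        // enabled keeps the overall file count lower.
        config.setUseCompoundFile(true);
        try (FSDirectory dir = FSDirectory.open(Paths.get("/tmp/index"));
             IndexWriter writer = new IndexWriter(dir, config)) {
            HashMap<String, String> fields = new HashMap<>();
            fields.put("title", "example value");
            // createDocumentTextField(...) is assumed to be the method above,
            // placed in the same class.
            writer.addDocument(createDocumentTextField(fields));
            // Merging down to fewer segments also reduces the file count,
            // but is costly on large indexes.
            writer.forceMerge(1);
        }
    }
}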
Yes, it would be great if you could share code snippets. Maybe it will
help others, or maybe someone will have a suggestion for improvement or an
alternative.
All the best
Michael
On 29.04.2021 at 14:35, amitesh116 wrote:
Thank you Michael!
I solved this requirement by setting the tokenStream at t
Hi Alex-
Amazon's product search engine is built on top of Lucene and is a
fairly large-scale application (w.r.t. index size, traffic, and
use-case complexity). We have found taxonomy-based faceting to work
well for us generally, and haven't needed to do much to optimize
beyond what's alrea
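For readers who haven't used it, here is a minimal sketch of what
taxonomy-based faceting with its sidecar taxonomy index looks like. The
directories, the "category" dimension, and the value are illustrative only,
not our actual setup.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.facet.FacetField;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class TaxonomyFacetSketch {
    public static void main(String[] args) throws Exception {
        Directory indexDir = new ByteBuffersDirectory();
        Directory taxoDir = new ByteBuffersDirectory(); // the "sidecar" taxonomy index
        FacetsConfig facetsConfig = new FacetsConfig();
        try (IndexWriter writer =
                 new IndexWriter(indexDir, new IndexWriterConfig(new StandardAnalyzer()));
             DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir)) {
            Document doc = new Document();
            doc.add(new FacetField("category", "books"));
            // build() rewrites the facet fields and records their ordinals in
            // the taxonomy index, which is why the two indexes must stay in sync.
            writer.addDocument(facetsConfig.build(taxoWriter, doc));
        }
    }
}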
Thank you Michael!
I solved this requirement by setting the tokenStream at the field level and
not leaving it to the analyzer. This gives control over altering the full
text before tokenization using custom methods.
This has some memory overhead, which is handled by writing the documents one
at a time.
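A minimal sketch of that idea, with a hypothetical preprocess() helper
standing in for the custom text-altering methods (the field name and analyzer
are placeholders too):

import java.util.Locale;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;

public class FieldLevelTokenStreamSketch {
    // Placeholder for any custom rewrite of the full text before tokenization.
    private static String preprocess(String text) {
        return text.toLowerCase(Locale.ROOT).replace('-', ' ');
    }

    static Document build(String fieldName, String rawText, Analyzer analyzer) {
        Document doc = new Document();
        // Tokenize the altered text ourselves instead of handing the raw value
        // to the IndexWriter's analyzer. A TokenStream-backed field cannot be
        // stored, so keep the original text in a separate field if needed.
        TokenStream ts = analyzer.tokenStream(fieldName, preprocess(rawText));
        doc.add(new Field(fieldName, ts, TextField.TYPE_NOT_STORED));
        return doc;
    }
}

The resulting document can then be passed to IndexWriter.addDocument as
usual, one document at a time.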
Alex,
We did consider trying to optimize Taxonomy indexing performance but we
never really got around to it. The sidecar index is annoying to deal with
and we have had occasional issues with it. Zulia has sharding implemented.
The main issue here is not the taxonomy but rather just getting exact
Hi all,
I've been contacted by someone from the hardware arena who wants to
understand which areas of Lucene could benefit from performance
acceleration, removal of bottlenecks, etc. I've asked a few of the usual
suspects to no avail, so I'm putting out a wider call for anyone
(probably a comm