Hey Matt,
You basically don't need to use DDQ in that case. You can construct a
BooleanQuery with a MUST_NOT clause to filter out the facet path. Here's a
short code snippet:
String indexedField = config.getDimConfig("Author").indexFieldName; // Find
the field of the "Author" facet
Query q = new
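A fuller sketch of that snippet, assuming config is the FacetsConfig used at indexing time, baseQuery is your original user query, and Author/John is the path to exclude (all of those names are illustrative; recent Lucene versions build the query with BooleanQuery.Builder):

import org.apache.lucene.facet.DrillDownQuery;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Find the field of the "Author" facet
String indexedField = config.getDimConfig("Author").indexFieldName;

// The drill-down term that matches documents tagged with Author/John
Term facetTerm = DrillDownQuery.term(indexedField, "Author", "John");

// Keep the original matches, but drop anything under Author/John
Query q = new BooleanQuery.Builder()
    .add(baseQuery, Occur.MUST)
    .add(new TermQuery(facetTerm), Occur.MUST_NOT)
    .build();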
This feature is not available in Lucene currently, but it shouldn't be hard
to add it. See Mike's comment here:
http://blog.mikemccandless.com/2013/05/dynamic-faceting-with-lucene.html?showComment=1412777154420#c363162440067733144
One more tricky (yet nicer) feature would be to have it all in one
We removed the userguide a long time ago. We have a set of example files
under lucene-demo, e.g. here
https://lucene.apache.org/core/6_3_0/demo/src-html/org/apache/lucene/demo/facet/
.
Also, you can read some blog posts, start here:
http://shaierera.blogspot.com/2012/11/lucene-facets-part-1.htm
Hi
The reason IMO is historic - ES and Solr had faceting solutions before
Lucene had one. There were discussions in the past about using the Lucene
faceting module in Solr (can't speak for ES) but, sadly, I can't say I see
it happening at this point.
Regarding your other question, IMO the Lucene fa
> However, that should not lead to NSFE. At worst it should lead to
> "ordinal is not known" (maybe as an AIOOBE) from the taxonomy reader.
That is correct, this interleaving indexing case can potentially result in
an AIOOBE-like exception during faceted search, when the facets that are in
the "
Hmm ... the commit part of the two indexes is always tricky. The javadocs
are correct because the order of indexing is as follows: when you index a
document with facets, the facets are first added to the taxonomy index and
only then the document is indexed in IW.
Therefore if you concurrently inde
Hey,
Here's a blog I wrote a couple of years ago about using facet associations:
http://shaierera.blogspot.com/2013/01/facet-associations.html. Note that
the examples in the blog were written against a very old Lucene version
(4.7 maybe). We have a couple of demo files that are maintained with the
co
True, but Erick's questions are still valid :-). We need more info to
answer these questions. So Simona, the more info you can give us the better
we'll be able to answer.
On Fri, Feb 26, 2016, 10:54 Uwe Schindler wrote:
> Hi Erick,
>
> this was a question about Lucene so "&debug=true" won't help
You should use Lucene's replicator module, which helps you take backups
from live snapshots of your index, even while indexing happens. You can
read about how to use it here:
http://shaierera.blogspot.co.il/2013/05/the-replicator.html
Shai
On Wed, Jan 13, 2016, 19:14 Erick Erickson wrote:
> Jus
I think you can just write a TokenFilter which sets the
PositionIncrementAttribute of every other token to 0. Then you can use
StandardTokenizer and wrap it with that filter.
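A minimal sketch of such a filter (the class name is mine; it assumes the first token keeps its own position and every second token is stacked on the one before it):

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

// Stacks every second token on the position of the token before it.
public final class EveryOtherSamePositionFilter extends TokenFilter {
  private final PositionIncrementAttribute posIncrAtt =
      addAttribute(PositionIncrementAttribute.class);
  private boolean keepPosition = true; // true for tokens that keep their own position

  public EveryOtherSamePositionFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }
    if (!keepPosition) {
      posIncrAtt.setPositionIncrement(0); // same position as the previous token
    }
    keepPosition = !keepPosition;
    return true;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    keepPosition = true;
  }
}

You would then wrap it around StandardTokenizer inside your Analyzer's createComponents().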
Shai
On Aug 8, 2015 6:33 AM, "Văn Châu" wrote:
> Hi,
>
> I'm looking for a solution for the following format in solr/lucene 5
It deal with possible out of memory issue?
> >
> > I am thinking of using the same Database to store the merged indices. But
> > the problem is the original sharded indices can be updated, when new
> > entries come in. So the merged final indices also needs to be updated
> &
In some cases, MMapDirectory offers even better performance, since the JVM
doesn't need to manage that RAM when it's doing GC.
Also, using only RAMDirectory is not safe: if the JVM crashes, your
index is lost.
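A minimal sketch, assuming a recent Lucene version and an illustrative index path; FSDirectory.open usually picks MMapDirectory on 64-bit JVMs:

import java.nio.file.Paths;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.MMapDirectory;

// Let Lucene pick the best implementation (typically MMapDirectory on 64-bit JVMs)
Directory dir = FSDirectory.open(Paths.get("/path/to/index"));

// ... or ask for memory-mapped files explicitly
Directory mmapDir = new MMapDirectory(Paths.get("/path/to/index"));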
On Thu, Apr 2, 2015 at 12:54 PM, Christoph Kaser
wrote:
> Hi Gimantha,
>
> why
I don't see that you use acceptDocs in your MyNDVFilter. I think it would
return false for all userB docs, but you should confirm that.
Anyway, because you use an NDV field, you can't automatically skip
unrelated documents, but rather your code would look something like:
for (int i = 0; i < reade
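A rough sketch of that loop, using the 4.x-era random-access doc values API; the field name, the expectedValue check, and the helper signature are all illustrative:

import java.io.IOException;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

// Collect docs whose NDV value equals expectedValue, honoring acceptDocs.
static FixedBitSet matchingDocs(AtomicReader reader, Bits acceptDocs, long expectedValue)
    throws IOException {
  NumericDocValues ndv = reader.getNumericDocValues("myNdvField");
  FixedBitSet result = new FixedBitSet(reader.maxDoc());
  if (ndv == null) {
    return result; // this segment has no values for the field
  }
  for (int i = 0; i < reader.maxDoc(); i++) {
    if (acceptDocs != null && !acceptDocs.get(i)) {
      continue; // deleted, or excluded by an earlier filter
    }
    if (ndv.get(i) == expectedValue) {
      result.set(i);
    }
  }
  return result;
}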
prepare the Ranges
> manually and pass them to LongRangeFacetsCounts.
>
> On Tue, Mar 10, 2015 at 4:54 PM, Shai Erera wrote:
>
> > I am not sure that splitting the ranges into smaller ranges is the same
> as
> > sampling.
> >
> > Take a look RandomSamplingFa
I am not sure that splitting the ranges into smaller ranges is the same as
sampling.
Take a look at RandomSamplingFacetsCollector - it implements sampling by
sampling the document space, not the facet values space.
So if for instance you use a LongRangeFacetCounts in conjunction with a
RandomSamplin
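A minimal sketch of combining the two, assuming searcher is your IndexSearcher and using an illustrative "timestamp" field, sample size and ranges; counts are computed over the sampled hits only:

import org.apache.lucene.facet.FacetResult;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.RandomSamplingFacetsCollector;
import org.apache.lucene.facet.range.LongRange;
import org.apache.lucene.facet.range.LongRangeFacetCounts;
import org.apache.lucene.search.MatchAllDocsQuery;

// Sample roughly 10,000 of the matching documents instead of collecting them all
RandomSamplingFacetsCollector fc = new RandomSamplingFacetsCollector(10000);
searcher.search(new MatchAllDocsQuery(), fc);

long now = System.currentTimeMillis() / 1000;
Facets facets = new LongRangeFacetCounts("timestamp", fc,
    new LongRange("past hour", now - 3600, true, now, true),
    new LongRange("past day", now - 86400, true, now, true));

// Range counts over the sampled document space
FacetResult result = facets.getTopChildren(10, "timestamp");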
)
>
> Can Lucene internally index like above, as 'India' value already exist as
> path of some other document ?
> Or some other ways that can be explored within Lucene.
>
>
>
> On Thu, Jan 8, 2015 at 5:26 PM, Shai Erera wrote:
>
> > Lucene does not underst
Lucene does not understand the word "India"; therefore the facets that are
actually indexed are:
Doc1: Asia + Asia/India
Doc2: India + India/Gujarat
When you ask for top children, you will get Asia + India, both with a count
of 1.
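For illustration, this is roughly how the two documents above would be indexed with the taxonomy facets API (the "Place" dimension name, indexWriter and taxoWriter are assumptions):

import org.apache.lucene.document.Document;
import org.apache.lucene.facet.FacetField;
import org.apache.lucene.facet.FacetsConfig;

FacetsConfig config = new FacetsConfig();
config.setHierarchical("Place", true); // the dimension name is illustrative

Document doc1 = new Document();
doc1.add(new FacetField("Place", "Asia", "India"));
indexWriter.addDocument(config.build(taxoWriter, doc1));

Document doc2 = new Document();
doc2.add(new FacetField("Place", "India", "Gujarat"));
indexWriter.addDocument(config.build(taxoWriter, doc2));

// getTopChildren(10, "Place") returns Asia (1) and India (1): the path
// components are opaque strings, so Asia/India and India/Gujarat share no node.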
Shai
On Thu, Jan 8, 2015 at 1:48 PM, Jigar Shah wrote:
> Very
Hi Mrugesh,
This is strange indeed, as the facets are ordered by count, and we use a
facet ordinal (integer code) as a tie breaker. What do you mean by
"refreshed"? Do you have a sample test that shows this behavior?
Shai
On Fri, Dec 12, 2014 at 8:37 AM, patel mrugesh
wrote:
>
>
> Hi All,
> I a
post, we use Lucene 4.2.1.
>
> On Thu, Dec 4, 2014 at 9:29 AM, Shai Erera wrote:
>
> > Do you use Lucene or Solr? Lucene also has a replication module, which
> will
> > allow you to replicate index changes.
> >
> > On Thu, Dec 4, 2014 at 4:19 PM, Vijay B wrote:
Do you use Lucene or Solr? Lucene also has a replication module, which will
allow you to replicate index changes.
On Thu, Dec 4, 2014 at 4:19 PM, Vijay B wrote:
> Hello,
>
> We index docs coming from database nightly. Current index is sitting on
> NFS. Due to obvious performance reasons, we are
Yes, hierarchical faceting in Lucene is only supported by the taxonomy
index, at least currently.
Shai
On Tue, Nov 25, 2014 at 3:46 PM, Vincent Sevel
wrote:
> hi,
> I saw that SortedSetDocValuesFacetCounts does not support hierarchical
> facets.
> Is that to say that hierarchical facets are onl
e not matched.
>
> And I have set hitpage =10 .
>
>
> Thanks
> Priyanka
>
>
> On Mon, Oct 27, 2014 at 6:14 AM, Shai Erera wrote:
>
> > Hi
> >
> > Your question is a bit fuzzy -- what do you mean by not showing "low
> > scores"? Are you
Hi
Your question is a bit fuzzy -- what do you mean by not showing "low
scores"? Are you sure that these 2 documents are matched by the query? Can
you boil it down to a short test case that demonstrates the problem?
In general though, when you search through IndexSearcher.search(Query, int),
you wo
lyAllDeletes=false)
>
> Will "IndexSearcher" and "TaxonomyReader" be in sync, in both
> SearcherTaxonomyManager ?
>
> On Fri, Oct 10, 2014 at 12:08 AM, Shai Erera wrote:
>
> > This usually means that your IndexReader and TaxonomyReader are out of
> &g
This usually means that your IndexReader and TaxonomyReader are out of
sync. That is, the IndexReader sees category ordinals that the
TaxonomyReader does not yet see.
Do you use SearcherTaxonomyManager in your application? It ensures that the
two are always in sync, i.e. reopened together and that
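A minimal NRT sketch of using SearcherTaxonomyManager (the writer variable names and the applyAllDeletes flag value are illustrative):

import org.apache.lucene.facet.taxonomy.SearcherTaxonomyManager;
import org.apache.lucene.facet.taxonomy.SearcherTaxonomyManager.SearcherAndTaxonomy;
import org.apache.lucene.search.IndexSearcher;

SearcherTaxonomyManager mgr =
    new SearcherTaxonomyManager(indexWriter, true, null, taxoWriter);

// after indexing some documents (and their facets):
mgr.maybeRefresh(); // reopens the searcher and taxonomy reader together

SearcherAndTaxonomy sat = mgr.acquire();
try {
  IndexSearcher searcher = sat.searcher;
  // sat.taxonomyReader is guaranteed to see every ordinal the searcher sees
  // ... run the query and facet counting here ...
} finally {
  mgr.release(sat);
}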
The facets translation should be done at the application level. So if you
index the dimension A w/ two facets A/A1 and A/A2, where A1 should also be
translated to B1 and A2 translated to B2, there are several options:
Index the dimensions A and B with their respective facets, and count the
relevan
Hi
You cannot remove facets from the taxonomy index, but you can reindex a
single document and update its facets. This will add new facets to the
taxonomy index (if they do not already exist). You do that just like you
reindex any document, by calling IndexWriter.updateDocument(). Just make
sure t
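A minimal sketch of such a re-index, assuming an "id" field identifies the document and config/taxoWriter are the usual FacetsConfig and taxonomy writer (field names and values are illustrative):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.facet.FacetField;
import org.apache.lucene.index.Term;

// Rebuild the document with its new set of facets, then replace the old copy
Document doc = new Document();
doc.add(new StringField("id", "42", Store.YES));
doc.add(new FacetField("Author", "Shai"));
doc.add(new FacetField("Tags", "lucene"));

indexWriter.updateDocument(new Term("id", "42"), config.build(taxoWriter, doc));
// New categories (if any) are added to the taxonomy; existing ones are reused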
Hi
The FacetsConfig object is the one that you use to index facets, and at
search time it is consulted about the facets' attributes (multi-valued,
hierarchical, etc.). You can make changes to the FacetsConfig, as long as
they don't contradict the indexed data in a problematic manner.
Usually the fa
Thanks Yonghui,
I will commit a fix - we need to initialize the example class before each
example is run!
Shai
On Tue, Sep 30, 2014 at 1:26 PM, Yonghui Zhao wrote:
>
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_8/lucene/demo/src/java/org/apache/lucene/demo/facet/SimpleFac
Hi
The taxonomy faceting approach maintains a sidecar index where it keeps the
taxonomy and assigns an integer (ordinal) to each category. Those integers
are encoded in a BinaryDocValues field for each document. It supports
hierarchical faceting as well as assigning additional metadata to each
fac
You can read some discussion here:
http://search-lucene.com/m/Z2GP220szmS&subj=RE+What+is+equivalent+to+Document+setBoost+from+Lucene+3+6+inLucene+4+1+
.
I wrote a post on how to achieve that with the new API:
http://shaierera.blogspot.com/2013/09/boosting-documents-in-lucene.html.
Shai
On Sun,
gt; forceMerge().
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -Original Message-
> > From: Shai Erera [mailto:ser...@gmail.com]
> > Sent: Thursday, August 07, 2014
similar to how we store the payload :) We use an
> integer as payload for each token, and store more complicated information
> in another Lucene index with the integer payload as the key for each
> document.
>
> Sheng
>
> On Wednesday, August 13, 2014, Shai Erera wrote:
>
Sheng,
I assume that you're using the Lucene faceting module, so I'll answer
based on that:
(1) A document can be associated with many facet labels, e.g. Tags/lucene
and Author/Shai. The way to extract all facet labels for a particular
document is this:
OrdinalsReader ordinals = new DocValuesOrd
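A rough sketch of the rest of that snippet, using the 4.x-era ordinals API; leafContext (a segment's reader context), docIdInSegment and taxoReader are assumed to be available:

import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.DocValuesOrdinalsReader;
import org.apache.lucene.facet.taxonomy.FacetLabel;
import org.apache.lucene.facet.taxonomy.OrdinalsReader;
import org.apache.lucene.util.IntsRef;

OrdinalsReader ordinals =
    new DocValuesOrdinalsReader(FacetsConfig.DEFAULT_INDEX_FIELD_NAME);
OrdinalsReader.OrdinalsSegmentReader segOrds = ordinals.getReader(leafContext);

IntsRef buf = new IntsRef();
segOrds.get(docIdInSegment, buf); // ordinals of all facets on this document
for (int i = 0; i < buf.length; i++) {
  FacetLabel label = taxoReader.getPath(buf.ints[buf.offset + i]);
  System.out.println(label); // e.g. Tags/lucene, Author/Shai
}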
looks like the MergePolicy is set
> through IndexWriterConfig but I don't see a way to update an IWC on an
> IW.
>
> Thanks,
>
> Jon
>
>
> On Thu, Aug 7, 2014 at 7:37 AM, Shai Erera wrote:
> > Using NoMergePolicy for online indexes is usually not recommende
Using NoMergePolicy for online indexes is usually not recommended. You want
to use NoMP in cases where you build an index in a batch job; then, at the
end, before the index is "published", you run a forceMerge or maybeMerge
(with a real MergePolicy).
For online indexes, i.e. indexes that are being sea
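A minimal batch-build sketch along those lines (recent-Lucene constructors; dir and the analyzer choice are assumptions):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.index.TieredMergePolicy;

// Phase 1: batch index with merges disabled
IndexWriterConfig batchConf = new IndexWriterConfig(new StandardAnalyzer());
batchConf.setMergePolicy(NoMergePolicy.INSTANCE);
try (IndexWriter writer = new IndexWriter(dir, batchConf)) {
  // ... add all documents ...
}

// Phase 2: reopen with a real MergePolicy and merge before "publishing"
IndexWriterConfig mergeConf = new IndexWriterConfig(new StandardAnalyzer());
mergeConf.setMergePolicy(new TieredMergePolicy());
try (IndexWriter writer = new IndexWriter(dir, mergeConf)) {
  writer.forceMerge(1);
}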
Hi
Currently we do not provide the means to use a single SortedSetDVField for
both faceting and sorting. You can add a SortedSetDVFacetField to a
Document, then use FacetsConfig.build(), but that encodes all your
dimensions under a single SSDV field. It's done for efficiency, since at
search time,
------
> Thanks n Regards,
> Sandeep Ramesh Khanzode
>
>
> On Tuesday, July 1, 2014 9:53 PM, Shai Erera wrote:
>
>
>
> Except that Lucene now offers efficient numeric and binary DocValues
> updates. See IndexWriter.updateNumeric/Binary...
>
> On
Except that Lucene now offers efficient numeric and binary DocValues
updates. See IndexWriter.updateNumeric/Binary...
On Jul 1, 2014 5:51 PM, "Erick Erickson" wrote:
> This JIRA is "complicated", don't really expect it in 4.9 as it's
> been hanging around for quite a while. Everyone would like th
ere any advantage of indexing some facets as not providing any
> indexFieldName ?
>
> Thanks
>
>
>
>
> On Mon, Jun 23, 2014 at 12:55 PM, Shai Erera wrote:
>
> > There is no sample code for doing that but it's quite straightforward -
> if
> > you know y
There is no sample code for doing that but it's quite straightforward - if
you know you indexed some dimensions under different indexFieldNames,
initialize a FacetCounts per such field name, e.g.:
FastTaxoFacetCounts defaultCounts = new FastTaxoFacetCounts(...); // for
your regular facets
FastTaxo
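Spelled out with the actual class name, FastTaxonomyFacetCounts, and an illustrative custom field "$author" (configured via FacetsConfig.setIndexFieldName("Author", "$author")); taxoReader, config and fc are the usual taxonomy reader, FacetsConfig and FacetsCollector:

import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts;

// Counts for dimensions indexed under the default "$facets" field
Facets defaultCounts = new FastTaxonomyFacetCounts(taxoReader, config, fc);

// Counts for dimensions indexed under a custom indexFieldName
Facets authorCounts = new FastTaxonomyFacetCounts("$author", taxoReader, config, fc);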
Reply wasn't sent to the list.
On Jun 22, 2014 8:15 PM, "Shai Erera" wrote:
> Can you post an example which demonstrates the problem? It's also
> interesting how you count the facets, eg do you use a TaxonomyFacets object
> or something else?
>
> Have yo
on 'CITY'.
>
> FastTaxonomyFacetCounts(String indexFieldName, TaxonomyReader taxoReader,
> FacetsConfig config, FacetsCollector fc) throws IOException {
> super(indexFieldName, taxoReader, config);
> ...
> }
>
> Thanks
> Jigar Shah.
>
>
>
> On Sat, Ju
What do you mean by "does not index anything"? Do you get an exception when
you add a String[] with more than one element?
You should probably call conf.setHierarchical(dimension), but if you don't
do that you should receive an IllegalArgumentException telling you to do
that...
Shai
On Sun, Jun 2
If you can, while in debug mode try to note the instance ID of the
FacetsConfig, and assert it is indeed the same (i.e. indexConfig ==
searchConfig).
Shai
On Sat, Jun 21, 2014 at 8:26 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> Are you sure it's the same FacetsConfig at search
How do you add facets to your documents? Did you play with the
FacetsConfig, such as alter the field under which the CITY dimension is
indexed?
If you can reproduce this failure in a simple program, I guess it will be
easy to spot the error. Looks like a configuration error to me...
Shai
On Fri
shows this count.
>
> I will check on a Linux box to make sure. Thanks,
>
> ---
> Thanks n Regards,
> Sandeep Ramesh Khanzode
>
>
> On Tuesday, June 17, 2014 11:28 PM, Shai Erera wrote:
>
>
>
> Nothing suspicious ... code looks fine. The c
(1000, "F5"));
> results.add(facets.getTopChildren(1000, "F6"));
> results.add(facets.getTopChildren(1000, "F7"));
> System.out.println("3. End Date: " + new Date());
> // Above part takes approx less than 1 second
> ===
that way ... I look at e.g how
doc-values are merged .. not sure it will improve performance. But if you
want to cons up a patch, that'd be awesome!
Shai
On Tue, Jun 17, 2014 at 8:01 PM, Shai Erera wrote:
> OK I think I now understand what you're asking :). It's unrelated thoug
7;t need any memory
>
> I was trying to get a heads-up on these 2 approaches. Please do let me know
> if I have understood correctly
>
> --
> Ravi
>
>
>
>
> On Tue, Jun 17, 2014 at 5:42 PM, Shai Erera wrote:
>
> > >
> > > I am afraid the DocMap sti
Execution: 11 seconds
>Facet counts execution: < 1 second
>
>With 4.9M hits (1 different value for the 1 term): (Without
> Flushing
> Windows File Cache on Next run)
> Query Execution: 2 seconds
>Facet counts execu
>
> - we are extending FacetResultsHandler to change the order of the facet
> results (i.e. date facets ordered by date instead of count). How can I
> achieve this now?
>
Now everything is a Facets. In your case, since you use the taxonomy, it's
TaxonomyFacets. You can check the class-hierarchy, w
I think lucene itself has a MergeIterator in o.a.l.util package.
>
> A MergePolicy can wrap a simple MergeIterator for iterating docs across
> different AtomicReaders in correct sort-order for a given field/term
>
> That should be fine right?
>
> --
> Ravi
>
> --
> Ravi
&
sorted.
>
> I find this "loadSortTerm(compositeReader)" to be a bit heavy where it
> tries to all load the doc-to-term mappings eagerly...
>
> Are there some alternatives for this?
>
> --
> Ravi
>
>
> On Tue, Jun 17, 2014 at 10:58 AM, Shai Erera wrote:
>
I'm not sure that I follow ... where do you see DocMap being loaded up
front? Specifically, Sorter.sort may return null if the readers are already
sorted ... I think we already optimized for the case where the readers are
sorted.
Shai
On Tue, Jun 17, 2014 at 4:04 AM, Ravikumar Govindarajan <
rav
#x27;ll help as much as I can with that
too.
Shai
On Mon, Jun 16, 2014 at 7:15 PM, Nicola Buso wrote:
> Hi Shai,
>
> I'm going to update from 4.6.1 to 4.8.1 :-(
>
> On Wed, 2014-06-11 at 14:05 +0300, Shai Erera wrote:
> > Hi
> >
> > We remove
rstand it, the
> state is persisted to the disk. But this time, there are additional file
> extensions like doc/pos/tim/tip/dvd/dvm, etc. I am not sure about this
> difference and its cause.
>
> 5.] Does the RAMBufferSizeMB() control the commit intervals, so that when
> the limit i
Err ... are you sure there's an index in the directory that you point Luke
at? I see that the exception points to "." which suggests the local
directory from where Luke was run.
There's nothing special about the taxonomy index, as far as Luke is
concerned. However, note that I do not recommend t
use case?
>
> Please let me know. And, thanks!
>
> ---
> Thanks n Regards,
> Sandeep Ramesh Khanzode
>
>
> On Friday, June 13, 2014 9:51 PM, Shai Erera wrote:
>
>
>
> Hi
>
> You can check the demo code here:
>
> https://svn.apache.org
Hi
You can check the demo code here:
https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_8/lucene/demo/src/java/org/apache/lucene/demo/facet/.
This code is updated with each release, so you always get working code
examples, even when the API changes.
If you don't mind managing th
Hi
We removed the userguide a long time ago, and replaced it with better
documentation on the classes and package.html, as well as demo code that
you can find here:
https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_8/lucene/demo/src/java/org/apache/lucene/demo/facet/
You can also l
You don't need to commit from each thread; you can definitely commit when
all threads are done. In general, you should commit only when you want to
ensure the data is "safe" on disk.
Shai
On Wed, May 21, 2014 at 2:58 PM, andi rexha wrote:
> Hi!
> I have a question about multi-thread indexing.
Well, first make sure that you set ramBufferSizeMB to well below the max
Java heap size; otherwise you could run into OOMs.
While a larger RAM buffer may speed up indexing (since it flushes less
often to disk), it's not the only factor that affects indexing speed.
For instance, if a big portion o
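For example (the buffer size is illustrative, and analyzer is assumed; keep the value well below -Xmx):

import org.apache.lucene.index.IndexWriterConfig;

IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
iwc.setRAMBufferSizeMB(256); // flush when the in-memory buffer reaches ~256 MB
iwc.setMaxBufferedDocs(IndexWriterConfig.DISABLE_AUTO_FLUSH); // flush by RAM usage only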
You don't need to do that in parallel to all indexes, unless
it's more convenient for you.
Shai
On Fri, May 2, 2014 at 9:28 AM, Olivier Binda wrote:
> On 05/02/2014 06:05 AM, Shai Erera wrote:
>
>> If you're always rebuilding, let alone forceMerge, you shouldn'
n May 2, 2014 1:57 AM, "Olivier Binda" wrote:
> On 05/01/2014 10:28 AM, Shai Erera wrote:
>
>> I'm glad it helped you. Good luck with the implementation.
>>
>
> Thanks. First I started looking at the lucene internal code. To understand
> when/where and why
index.
Or, if rebuilding all indexes won't take long, you can always rebuild all
of them.
Shai
On Thu, May 1, 2014 at 12:00 AM, Olivier Binda wrote:
> On 04/30/2014 10:48 AM, Shai Erera wrote:
>
>> I hope I got all the details right, if I didn't then please clarify. A
I hope I got all the details right, if I didn't then please clarify. Also,
I haven't read the entire thread, so if someone already suggested this ...
well, it probably means it's the right solution :)
It sounds like you could use Lucene's ParallelCompositeReader, which
already handles multiple Ind
NoMP means no merges, and indeed it seems silly that NoMP distinguishes
between compound/non-compound settings. Perhaps it's rooted somewhere in
the past, I don't remember.
I checked and IndexWriter.addIndexes consults
MP.useCompoundFile(segmentInfo) when it adds the segments. But maybe
NoMP.useCo
The problem is that compound files settings are split between MergePolicy
and IndexWriterConfig. As documented on IWC.setUseCompoundFile, this
setting controls how new segments are flushed, while the MP setting
controls how merged segments are written.
If we only offer NoMP.INSTANCE, what would it
s. I think the best way to solve this is to encode
> the number of values as first entry in the BDV. This is not that hard so I
> will take this road.
>
> -Rob
>
>
> > Op 27 apr. 2014 om 21:27 heeft Shai Erera het
> volgende geschreven:
> >
> > Hi Rob,
> &g
2014 at 1:20 PM, Shai Erera wrote:
> I don't think that you should use the facet module. If all you want is to
> encode a bunch of numbers under a 'foo' field, you can encode them into a
> byte[] and index them as a BDV. Then at search time you get the BDV and
> deco
etSum*Associations
> would need to do this for all fields that I need facet counts/sums for.
>
> What do you think?
>
> -Rob
>
>
> On Wed, Apr 23, 2014 at 5:13 PM, Shai Erera wrote:
>
> > A NumericDocValues field can only hold one value. Have you thought about
>
ache.LongParser. These parsers only seem te parse one field.
>
> Is there an efficient way to get -all- of the (numeric) values for a field
> in a document?
>
>
> On Wed, Apr 23, 2014 at 4:38 PM, Shai Erera wrote:
>
> > You can do that by writing a Filter which returns matchin
You can do that by writing a Filter which returns matching documents based
on a sum of the field's value. However I suspect that is going to be slow,
unless you know that you will need several such filters and can cache them.
Another approach would be to write a Collector which serves as a Filter,
>> (LUCENE-5438), InfosRefCounts (weird name), whose purpose is to do
>> what IndexFileDeleter does for IndexWriter, ie keep track of which
>> files are still referenced, delete them when they are done, etc. This
>> could used on the client side to hold a lease for another client.
>>
Hi
I am not sure how more than one client_no field ends up w/ a document, and
I'm not sure it's related to the taxonomy at all.
However, looking at the code example you pasted above, and since you
mention that you index+commit in one thread, while another thread does the
reopen, I wonder if that'
IndexRevision uses the IndexWriter for deleting unused files when the
revision is released, as well as to obtain the SnapshotDeletionPolicy.
I think that you will need to implement two things on the "client" side:
* Revision, which doesn't use IndexWriter.
* Replicator which keeps track of how ma
, then close() should
not create a new commit point. Do you see that it does?
Shai
On Wed, Mar 19, 2014 at 11:09 PM, Roberto Franchini wrote:
> On Sat, Mar 15, 2014 at 12:56 PM, Roberto Franchini
> wrote:
> > On Sat, Mar 15, 2014 at 12:47 PM, Shai Erera wrote:
> >> If you
If you use LocalReplicator on both sides, you have to use the same instance
on both sides. Otherwise the replicas will never see the published
revisions, which are done in a separate instance. Can you try that?
Shai
On Mar 15, 2014 1:10 PM, "Roberto Franchini" wrote:
> On Sat, Mar 15, 2014 at
Double fields can be implemented today over NumericDVField and therefore
already support updates.
Strings can be implemented on Sorted/SortedSetDVField, but updates are not
supported for them yet. I hope that once I'm done w/ LUCENE-5513, adding update support
for Sorted/SortedSet will be even easier.
Shai
On
Hi
1. Is it possible to provide updateNumericDocValue(Term term,
> Map), incase I wish to update multiple-fields and it's
> doc-values?
>
For now you can call updateNDV multiple times, each time w/ a new field.
Under the covers, we currently process each update separately anyway.
I think in order
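For example (field names, values and the "id" term are illustrative; writer is your IndexWriter):

import org.apache.lucene.index.Term;

Term id = new Term("id", "doc-7");
writer.updateNumericDocValue(id, "price", 1999L);      // one call per field
writer.updateNumericDocValue(id, "popularity", 42L);
writer.commit(); // make the updates visible to newly (re)opened readers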
I often prefer to manage such weights outside the index. Usually managing
them inside the index leads to problems in the future when e.g the weights
change. If they are encoded in the index, it means re-indexing. Also, if
the weight changes then in some segments the weight will be different than
ot
"adjacency" by "size", whereas it would be better
> if "timestamp" is used in my case
>
> Sure, I need to wrap this in an SMP to make sure that the newly-created
> segment is also in sorted-order
>
> --
> Ravi
>
>
>
> On Wed, Feb 12,
Why not use LogByteSizeMP in conjunction w/ SortingMP? LogMP picks adjacent
segments and SortingMP ensures the merged segment is also sorted.
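A minimal sketch of that combination, using the 4.x lucene-misc package and an illustrative sort field; iwc is the IndexWriterConfig being built:

import org.apache.lucene.index.LogByteSizeMergePolicy;
import org.apache.lucene.index.MergePolicy;
import org.apache.lucene.index.sorter.SortingMergePolicy; // lucene-misc in 4.x
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;

Sort sort = new Sort(new SortField("timestamp", SortField.Type.LONG));
MergePolicy mp = new SortingMergePolicy(new LogByteSizeMergePolicy(), sort);
iwc.setMergePolicy(mp); // merged segments come out sorted by "timestamp"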
Shai
On Wed, Feb 12, 2014 at 3:16 PM, Ravikumar Govindarajan <
ravikumar.govindara...@gmail.com> wrote:
> Yes exactly as you have described.
>
> Ex: Cons
not about lack of creativity, I might have not explained you in the
> proper way :)
>
> Thank you for all the support :)
>
>
> On Tue, Feb 11, 2014 at 12:23 AM, Shai Erera wrote:
>
> > What you want sounds like grouping more like faceting?
> >
> > So
documents result first and then category wise,
> suppose 2 documents by the same Author etc
>
> As per my requirement, I am doing DrillDown Search by asking the user to
> provide such as title of the docment, author of the document, etc... as
> advanced search option.
>
> --
e same category
> from the FacetResult Object also.
>
> I hope you will understand my question :)
>
> Thank you :)
>
> --
> Jebarlin
>
>
>
> On Mon, Feb 10, 2014 at 9:09 PM, Shai Erera wrote:
>
> > Hi
> >
> > You will need to build a BooleanQue
indly Guide me :)
>
> Thank you for All your Support.
>
> Regards,
> Jebarlin.R
>
>
> On Mon, Feb 10, 2014 at 1:28 PM, Shai Erera wrote:
>
> > Hi
> >
> > If you want to drill-down on first name only, then you have several
> > options:
> >
>
Hi
If you want to drill-down on first name only, then you have several options:
1) Index Author/First, Author/Last, Author/First_Last as facets on the
document. This is the faster approach, but bloats the index. Also, if you
index the author Author/Jebarlin, Author/Robertson and
Author/Jebarlin_R
r.java:2034)
> > 02-07 12:38:11.006: W/System.err(5411): at
> >
> com.example.lucene.threads.AsyncIndexWriter.addDocumentSynchronous(AsyncIndexWriter.java:343)
> > 02-07 12:38:11.006: W/System.err(5411): at
> >
> com.example.lucene.threads.AsyncIndexWriter.addDocume
It looks like something's wrong with the index indeed. Are you sure you
committed both the IndexWriter and TaxoWriter?
Do you have some sort of testcase / short program which demonstrates the
problem?
I know there were a few issues running Lucene on Android, so I cannot
guarantee it works fully .. w
Note that Lucene doesn't support general in-place document updates, and
updating a document means first deleting it and adding it back.
Therefore if you only intend to add/change a few categories of an existing
document, you have to fully re-index the document. This is not specific to
categories but
ave
> reproduces it very quickly, Only have to index ~330K docs.
>
>
> On Fri, Jan 17, 2014 at 3:27 PM, Shai Erera wrote:
>
> > Do you have a test which reproduces the error? Are you adding categories
> > with very deep hierarchies?
> >
> > Shai
> >
> &
Do you have a test which reproduces the error? Are you adding categories
with very deep hierarchies?
Shai
On Fri, Jan 17, 2014 at 11:59 PM, Matthew Petersen wrote:
> I've confirmed that using the LruTaxonomyWriterCache solves the issue for
> me. It would appear there is in fact a bug in the Cl
Opened https://issues.apache.org/jira/browse/LUCENE-5320.
Shai
On Fri, Nov 1, 2013 at 4:59 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> On Fri, Nov 1, 2013 at 3:12 AM, Shai Erera wrote:
>
> > Maybe we should offer such a ReferenceManager (ma
SearcherTaxonomyManager can be used only for NRT, as it only takes an
IndexWriter and DirectoryTaxonomyWriter. And I don't think you want to keep
those writers open on the slaves side.
I think that a ReferenceManager, which returns a SearcherAndTaxonomy, is
the right thing to do. The reason why we
is that SortingMergePolicy performs sorting after
> wrapping the 2 segments, correct?
>
> As I mentioned in my original email I would like to avoid the re-sorting
> and exploit the fact that the input segments are already sorted.
>
>
>
> On Wed, Oct 23, 2013 at 11:02
Hi
You can use SortingMergePolicy and SortingAtomicReader to achieve that. You
can read more about index sorting here:
http://shaierera.blogspot.com/2013/04/index-sorting-with-lucene.html
Shai
On Wed, Oct 23, 2013 at 8:13 PM, Arvind Kalyan wrote:
> Hi there, I'm looking for pointers, suggesti
>
> The codec intercepts merges in order to clean up files that are no longer
> referenced
>
What happens if a document is deleted while there's a reader open on the
index, and the segments are merged? Maybe I misunderstand what you meant by
this statement, but if the external file is deleted, sin
Oops, you're right - it was committed in LUCENE-4985, which will be released
in Lucene 4.5.
Shai
On Wed, Aug 28, 2013 at 6:16 PM, Krishnamurthy, Kannan <
kannan.krishnamur...@contractor.cengage.com> wrote:
> Thanks for the response. I double checked that
> SortedSetDocValuesAccumulator doesn't tak