ields share the same FS
blocks, then the two hot fields' values will be too scattered across the FS
blocks, rendering the OS cache useless and degrading performance back to I/O bound.
Which is the case with Lucene 3.6?
Thanks.
Gili Nachum.
and rows are documents).
But for stored fields, or term vectors, which are "row stride", you won't
see efficient use of the OS's IO cache.
Mike McCandless
http://blog.mikemccandless.com
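The column-stride vs. row-stride contrast above can be sketched with back-of-envelope arithmetic (all numbers below are made up for illustration, and the class and method names are mine, not Lucene's):

```java
public class LayoutCacheSketch {
    // Column-stride: each field's values are stored contiguously, so only
    // the hot fields' own bytes need to stay resident in the OS cache.
    public static long columnStrideBytes(long docs, int hotFields, int fieldBytes) {
        return docs * hotFields * (long) fieldBytes;
    }

    // Row-stride (stored fields / term vectors): the hot values are
    // scattered across whole-document records, so effectively every
    // document's record gets touched.
    public static long rowStrideBytes(long docs, int docBytes) {
        return docs * (long) docBytes;
    }

    public static void main(String[] args) {
        long docs = 10_000_000L; // hypothetical index size
        System.out.println(columnStrideBytes(docs, 2, 8));   // ~160 MB to cache
        System.out.println(rowStrideBytes(docs, 1024));      // ~10 GB to cache
    }
}
```

With two 8-byte hot fields over 10M docs, the column layout needs roughly 160 MB cached versus ~10 GB for 1 KB row records, which is the gap between staying cache-resident and falling back to I/O bound.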
On Wed, Jan 23, 2013 at 7:59 AM, Gili Nachum wrote:
Hi,
I have a search workload
I am out of the office until 20/02/2013.
For Search/CCM - Noga Tor
For AS-Search/Social People Typeahead - Sharon Krisher
Or my manager Eitan Shapiro.
Note: This is an automated response to your message "What is equivalent to
Document.setBoost() from Lucene 3.6 in Lucene 4.1?" sent on 18/02/20
Answering myself for next generations' sake.
Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS does the job.
Example:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import junit.framework.Assert;
import org.junit.Test;

public class DetectCJK {
    // The archive cut off the definition of this set; CJK_UNIFIED_IDEOGRAPHS
    // is the block named above, and more CJK blocks could be added as needed.
    private static final Set<Character.UnicodeBlock> cjkUnicodeBlocks =
            new HashSet<Character.UnicodeBlock>(Arrays.asList(
                    Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS));

    @Test
    public void test1() {
        Assert.assertEquals(Character.UnicodeBlock.BASIC_LATIN,
                Character.UnicodeBlock.of('a'));
        assertCJK('漢', "expected a CJK character");
    }

    // The helper's name was truncated in the archive; "assertCJK" is a guess.
    private void assertCJK(Character character, String message) {
        Character.UnicodeBlock unicodeBlock = Character.UnicodeBlock.of(character);
        Assert.assertTrue(message, cjkUnicodeBlocks.contains(unicodeBlock));
    }
}
On Mon, Mar 11, 2013 at 12:10 AM, Trejkaz wrote:
> On Sun, Mar 10, 2013 at 8:19 PM, Gili Nachu
y still be applicable.
>
> 512MB for a 70GB index sounds very conservative.
>
>
>
> --
> Ian.
>
>
> On Mon, Mar 11, 2013 at 9:08 AM, Gili Nachum wrote:
> > Hello.
> >
> > I'm getting an OOME with a heap size of 512MB while trying to open an
> >
Hi. I would like hits that contain the search terms in proximity to
each other to be ranked higher than hits in which the terms are scattered
across the doc.
Wondering if there's a best practice to achieve that?
I also want all hits to contain all of the search terms (implicit
AND).
s on the corpora your index
> represents, your queries and your needs.
> >
> > Given your question it looks like you're using the query parser. Try
> something like "your proximity query"~20, but consider the cost of a large
> slop.
> >
> >
>
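The `"your proximity query"~20` suggestion above can be illustrated with a toy, Lucene-free position check. The class, method, and window rule below are my simplification for illustration; Lucene's actual sloppy PhraseQuery uses an edit-distance-style match and scores smaller slops higher:

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {
    // Returns true if some occurrence of term a and some occurrence of
    // term b sit within 'slop' intervening positions of each other.
    // A toy approximation of a sloppy phrase match, not Lucene's scoring.
    public static boolean withinSlop(List<String> tokens, String a, String b, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) continue;
            for (int j = 0; j < tokens.size(); j++) {
                if (tokens.get(j).equals(b) && Math.abs(j - i) - 1 <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList(
                "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog");
        System.out.println(withinSlop(doc, "quick", "fox", 1));  // one word between: true
        System.out.println(withinSlop(doc, "quick", "dog", 2));  // six words between: false
    }
}
```

Ranking proximate hits higher then falls out naturally: a matcher like this can feed a score that decays as the positional gap grows, which is roughly what a large slop trades away.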
ndering if anyone has tested # of segments against search time
performance?
I should add I have ~10 indexes, at a total size of 50GB, and I use a
multi-index searcher to search over them (Lucene 3.0.3 - yeah it's old, I
know). The index is updated every 15min.
Gili Nachum.
Hello,
I got an index corruption in production, and was wondering if it might be a
known bug (still with Lucene 3.1), or if my code is doing something wrong.
It's a local disk index. No known machine power loss. Not supposed to even
happen, right?
This index that got corrupted is updated every 30sec; a
AM, Gili Nachum wrote:
> Hello,
> I got an index corruption in production, and was wondering if it might be
> a known bug (still with Lucene 3.1), or if my code is doing something wrong.
> It's a local disk index. No known machine power loss. Not supposed to even
> happen, right?
Thanks Mike and Uwe.
I already reindexed in production; my goal is to get to the root cause to
make sure it doesn't happen again.
Will remove the flush(). No idea why it's there.
Attaching CheckIndex.main() output (why did I bother writing my own output
:#)
*Output:*
Opening index @ C:\\customers\
Hello! I've implemented a type-ahead search by indexing all possible terms'
prefixes as fields on the docs.
The resulting index is about 1GB in size and fits in the filesystem cache.
Would implementing this differently, over FSTs instead of prefixes, bear
any performance/size/features advantage?
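The "index all prefixes" approach described above can be sketched as follows (class and method names are mine, for illustration). The size question comes down to the fact that an FST shares common prefixes structurally instead of materializing each one as a separate term:

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixExpander {
    // Expands one term into every prefix of itself, the way the post
    // describes indexing "all possible terms' prefixes" for type-ahead.
    // An FST-based suggester would store these shared-prefix strings once
    // in its transition graph rather than as N separate index terms.
    public static List<String> prefixes(String term) {
        List<String> out = new ArrayList<String>();
        for (int len = 1; len <= term.length(); len++) {
            out.add(term.substring(0, len));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(prefixes("lucene")); // [l, lu, luc, luce, lucen, lucene]
    }
}
```

Note the blow-up: a term of length n produces n indexed prefixes, which is exactly the redundancy an FST representation would collapse.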
Hi, what FS block size should I use?
I have a RAID-5 of SSD drives currently configured with a 128KB block
size.
Can I expect better indexing/query-time performance with a smaller block
size (say 8K)?
Considering my documents are almost always smaller than 8K.
I assume all stored fields would fit into