Re: Lucene vs SQL.

2012-07-27 Thread Denis Bazhenov
't work in lucene or perform better in MySQL/postgres. >>> >>> - >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Facet Support

2012-07-27 Thread Denis Bazhenov
print, retain, copy, disseminate, distribute, or use this message or > any part thereof. If you receive this message > in error, please notify the sender immediately and delete all copies of this > message. > --- Denis Bazhenov

Re: RAM or SSD...

2012-07-27 Thread Denis Bazhenov
mall low-latency bucket. SSDs speed up almost everything, save >>>>>> RAM and spare a lot of work hours optimizing I/O speed. >>>>>> >>>>>> Regards, >>>>>> Toke Eskildsen

Re: getting the token position

2013-01-10 Thread Denis Bazhenov
--- Denis Bazhenov

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Denis Bazhenov
--- Denis Bazhenov

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Denis Bazhenov
--- Denis Bazhenov

Re: Reg Lucene Naive Bayesian classifier.

2013-01-14 Thread Denis Bazhenov
ignesh Srinivasan > 9739135640 > --- Denis Bazhenov

Re: Is LogByteSizeMergePolicy deterministic?

2013-01-21 Thread Denis Bazhenov
s? >> >> -- >> Sincerely yours, >> Apostolis Xekoukoulotakis

Re: FacetedSearch and MultiReader

2013-01-21 Thread Denis Bazhenov
he faceted framework to obtain >>>> FacetResults starting from a MultiReader? All the examples I see are using a >>>> "single" IndexReader. >>>> >>>> Nicola.

Re: IndexWriter.optimize() is removed in 4.0?

2013-01-23 Thread Denis Bazhenov

Re: posting list traversal code

2013-06-12 Thread Denis Bazhenov
cs(term); while (termDocs.next()) { int docId = termDocs.doc(); // work with the document... } On Jun 13, 2013, at 1:56 PM, Sriram Sankar wrote: > Can someone point me to the code that traverses the posting lists? I'm > trying to understand how it works.

Re: posting list traversal code

2013-06-12 Thread Denis Bazhenov
omewhere in the code where the magic happens. > > Thanks, > > Sriram. > > > > > On Wed, Jun 12, 2013 at 9:33 PM, Denis Bazhenov wrote: > >> I'm not quite sure, what you really need. But as far as I understand, you >> want to get al

Query serialization/deserialization

2013-07-27 Thread Denis Bazhenov
'm trying to avoid it, is that serialization is used to communicate in a distributed system, and we can't guarantee that all nodes run the same Lucene version at any particular point in time. Is there some way to perform query serialization in "lucene version independent

Re: Query serialization/deserialization

2013-07-28 Thread Denis Bazhenov
in > the "parsedquery" section of debugQuery output for a Solr query response. > > -- Jack Krupansky > > -Original Message- From: Denis Bazhenov > Sent: Sunday, July 28, 2013 1:59 AM > To: java-user@lucene.apache.org > Subject: Query serialization/deseri

WeakIdentityMap high memory usage

2013-08-07 Thread Denis Bazhenov
even retained size). Here is a screenshot of the JProfiler output: https://dl.dropboxusercontent.com/u/16254496/Screen%20Shot%202013-08-07%20at%205.35.22%20PM.png. The keys of the map are MMapIndexInput. What is this map for, and how can I reduce its memory usage? --- Denis Bazhenov FarPost

Re: WeakIdentityMap high memory usage

2013-08-07 Thread Denis Bazhenov
somehow it's causing "real life" problems, please report back! But a >> simple >> workaround is to call MMapDirectory.setUseUnmap(false) to turn off this >> tracking; this means you rely on GC to (eventually) unmap. >> >> Mike McCandless

Re: WeakIdentityMap high memory usage

2013-08-09 Thread Denis Bazhenov
o Lucene, and instead let the OS take up the slack of any spare RAM > for IO caching. --- Denis Bazhenov

WordBoundTokenFilter

2011-06-13 Thread Denis Bazhenov
is there any interest in this solution for the community, and does it make sense to contribute it back? --- Denis Bazhenov

Re: WordBoundTokenFilter

2011-06-13 Thread Denis Bazhenov
itive (which is not a good thing in a >> general search), but works well in specialized indexes. >> >> So I developed such a token filter. My question is: is there any interest in >> this solution for the community, and does it make sense to contribute it >> back?

Re: WordBoundTokenFilter

2011-06-13 Thread Denis Bazhenov
was no documentation in the > API >> - at least when I searched for it the last time. >> >> Regards, >> Em >> >> On 13.06.2011 12:56, Denis Bazhenov wrote: >>> It seems so. Interestingly I can't find any mentions of >> WordDelimiterTokenF

Re: Index size and performance degradation

2011-06-16 Thread Denis Bazhenov
ke the power law: Heavy performance >>> degradation in the beginning, less later. It makes sense when we look at >>> caching and it means that if you do not require stellar performance, you >>> can have very large indexes on few machines (cue Hathi Trust).

Re: how to do something like sql in () clause

2011-06-20 Thread Denis Bazhenov
s immediately by e-mail and delete the message and any > attachments from your system. > --- Denis Bazhenov

Re: how to do something like sql in () clause

2011-06-20 Thread Denis Bazhenov
Also, how do you write that in the query syntax? > > Thanks, > Dean > > -Original Message- > From: Denis Bazhenov [mailto:dot...@gmail.com] > Sent: Monday, June 20, 2011 5:50 PM > To: java-user@lucene.apache.org > Subject: Re: how to do something like sql in () clau

Does {Filter}ing is faster than {Query}ing in Lucene?

2011-06-23 Thread Denis Bazhenov
es exactly the same as NRQ but without document scoring. Does this mean that if I do not need scoring, or if I sort documents by a document field value, I should prefer filtering over querying from a performance point of view? --- Denis Bazhenov

Lucene sort performance roots?

2011-06-23 Thread Denis Bazhenov
sorting faster than relational database. Let me put it another way -- I have no explanation why SQL databases should not do it as fast as Lucene. Is there any explanation for that? --- Denis Bazhenov

Re: Lucene sort performance roots?

2011-06-23 Thread Denis Bazhenov
Weiss wrote: > Can you describe the kind of sorting you're doing? Maybe the data is > already sorted (and in RAM) and you're only getting it out? > > Dawid > > On Fri, Jun 24, 2011 at 3:32 AM, Denis Bazhenov wrote: >> Well, maybe it's a bit controversial que

IndexReader#reopen() on externally changed index

2011-10-16 Thread Denis Bazhenov
ement efficient index reopen (only new segments should be read) when index is changed externally? --- Denis Bazhenov

Re: IndexReader#reopen() on externally changed index

2011-10-30 Thread Denis Bazhenov
en possible (ie, open > the IndexReader from the IndexWriter), but it sounds like in your case > this is not possible since writer and reader on different > JVMs/machines across a network. > > Mike McCandless > > http://blog.mikemccandless.com > > On Sun, Oct 16, 2011 at

Re: Never close IndexWriter/Reader?

2014-05-04 Thread Denis Bazhenov
nly close it, > when the server is shutting down. Is this a good idea? Any suggestions or > comments? Would be nice :) > > Greetings > > Sascha > --- Denis Bazhenov

Re: Top 10 words

2015-02-15 Thread Denis Bazhenov
using Lucene RAMDirectory or memory indexing and find >>> the most occurring top 10 words. >>> 2. Is this the right approach? Indexing and writing to the disk would be >>> almost overkill and a user can search any number of times. >>> >>> Thanks in advance. >>> >> --- Denis Bazhenov

Re: Using Lucene to model ownership of documents

2016-06-16 Thread Denis Bazhenov
engthened the time needed for indexing & index size. > We are also thinking about using a custom filter but we are concerned about > the memory requirements. > > Any ideas/suggestions would be really appreciated. --- Denis Bazhenov

Re: CPU usage 100% during search

2017-01-08 Thread Denis Bazhenov
is spent. It might have been GC. > On Jan 4, 2017, at 18:05, Adrien Grand wrote: > > Well, you could but that would not make sense, 100% CPU usage is really the > best you can get. Why would you like to make things worse artificially? --- Denis Bazhenov

Document serializable representation

2017-03-30 Thread Denis Bazhenov
Hi. We have an in-house distributed Lucene setup: 40 dual-socket servers with approximately 700 cores divided into 7 partitions. Those machines are doing index search only. Indexes are prepared on several isolated machines (so-called Index Masters) and distributed over the cluster with plain rsync.

Re: Document serializable representation

2017-03-30 Thread Denis Bazhenov
cated and transfer the plain > fieldname:value pairs over the network. Each node then creates Lucene > IndexableDocuments out of it and passes to their own IndexWriter. --- Denis Bazhenov
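
The suggestion quoted above (ship plain fieldname:value pairs and let each node build its own documents) could be sketched roughly as follows. This is only an illustration using the Java standard library: the class name FieldPairCodec, the nested FieldPair type, and the length-prefixed UTF-8 layout are all assumptions made here, not anything from the thread or from Lucene. The point is that nothing Lucene-specific crosses the wire, so the payload does not depend on the Lucene version a node runs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wire format for plain fieldname:value pairs; not part of Lucene. */
final class FieldPairCodec {

    /** One raw field of a document, before any analysis happens on the receiving node. */
    static final class FieldPair {
        final String name;
        final String value;

        FieldPair(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Layout (assumed): int pair count, then for each pair a length-prefixed UTF-8 name and value. */
    static byte[] encode(List<FieldPair> fields) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(fields.size());
            for (FieldPair field : fields) {
                writeString(out, field.name);
                writeString(out, field.value);
            }
        }
        return buffer.toByteArray();
    }

    /** Reads back exactly what encode() wrote, preserving field order and repeated field names. */
    static List<FieldPair> decode(byte[] payload) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            int count = in.readInt();
            List<FieldPair> fields = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                String name = readString(in);
                String value = readString(in);
                fields.add(new FieldPair(name, value));
            }
            return fields;
        }
    }

    private static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    private static String readString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

On the receiving side, each node would turn the decoded pairs into documents for its own IndexWriter; that step is left out here because it depends on the Lucene version in use.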

Re: Document serializable representation

2017-03-30 Thread Denis Bazhenov
hen merge (and not > literally, but just by addIndexes() or so ) them to smaller number for > search. Transferring indices is more efficient (scp -C) than separate > tokens and their attributes over the wire. > > On Thu, Mar 30, 2017 at 12:02 PM, Denis Bazhenov wrote: > >>

Re: Document serializable representation

2017-03-30 Thread Denis Bazhenov
Yeah, I definitely will look into PreAnalyzedField as you and Michail suggest. Thank you. > On Mar 30, 2017, at 19:15, Uwe Schindler wrote: > > But that's hard to implement. I'd go for Solr instead of doing that on your > own! --- Denis Bazhenov

Re: Document serializable representation

2017-03-30 Thread Denis Bazhenov
properly. "Properly" here means > that the hash ranges of the merged shards exactly span the > ranges of the merged segments. > > And if you're merging them all down to one segment the ranges > don't matter. > > Best, > Erick > > On Thu, Mar 30, 2

Re: Decision on Number of shards and collection

2018-04-11 Thread Denis Bazhenov
> Neo > -- > Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html > --- Denis