Re: DV limited to 32766 ?

2013-08-12 Thread Nicolas Guyot
the whole thread was interesting to read, glad to know a patch is on its way :) thanks! On Fri, Aug 9, 2013 at 4:00 PM, Jack Krupansky wrote: > Check out the discussion on: > > https://issues.apache.org/jira/browse/LUCENE-4583 > "StraightBy

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Shai Erera
ok that makes sense. Shai On Mon, Aug 12, 2013 at 9:18 PM, Robert Muir wrote: > On Mon, Aug 12, 2013 at 11:06 AM, Shai Erera wrote: > > > > Or, you'd like to keep FieldCache API for sort of back-compat with > existing > > features, and let the app control the "caching" by using an explicit >

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Robert Muir
On Mon, Aug 12, 2013 at 11:06 AM, Shai Erera wrote: > > Or, you'd like to keep FieldCache API for sort of back-compat with existing > features, and let the app control the "caching" by using an explicit > RamDVFormat? > Yes. In the future ideally fieldcache goes away and is a UninvertingFilterRea

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Shai Erera
Rob, when DiskDV becomes the default DVFormat, would it not make sense to load the values into the cache if someone uses FieldCache API? Vs. if someone calls DV API directly, he uses whatever is the default Codec, or the one that he plugs. That's what I would expect from a 'cache'. So it's ok that

Re: How to get hits coordinates in Lucene 4.4.0

2013-08-12 Thread Lingviston
I think that's OK for me. I just need to know the right way to get them. Notice that queries must support boolean operators, *, ? and quotes. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-get-hits-coordinates-in-Lucene-4-4-0-tp4083913p4084046.html Sent from the Luce

Re: Creating Indexes when data inside the file is being written.

2013-08-12 Thread Michael McCandless
You'll have to periodically re-index that document, if its content is constantly changing. Alternatively, it's possible to index sub-documents so that each new "chunk" of content added becomes a new document, and then you join or group the results back into a single document ... Mike McCandless
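
The "group the results back into a single document" step can be pictured with a small plain-Java sketch. It does not use any Lucene API; the `ChunkHit` class, its field names, and the choice to keep the best chunk score per parent are all assumptions made for illustration, not something stated in the thread.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkGroupingSketch {

    /** One search hit on an indexed chunk. All names here are hypothetical. */
    public static final class ChunkHit {
        public final String parentId; // id of the file the chunk came from
        public final int chunkNo;     // position of the chunk within that file
        public final float score;     // relevance score of this chunk

        public ChunkHit(String parentId, int chunkNo, float score) {
            this.parentId = parentId;
            this.chunkNo = chunkNo;
            this.score = score;
        }
    }

    /**
     * Collapses chunk-level hits to one entry per parent document,
     * keeping the highest chunk score (an assumed grouping policy).
     * Parents appear in the order they are first seen.
     */
    public static Map<String, Float> groupByParent(List<ChunkHit> hits) {
        Map<String, Float> best = new LinkedHashMap<>();
        for (ChunkHit h : hits) {
            best.merge(h.parentId, h.score, Float::max);
        }
        return best;
    }

    public static void main(String[] args) {
        List<ChunkHit> hits = Arrays.asList(
                new ChunkHit("log-1", 0, 0.5f),
                new ChunkHit("log-2", 0, 0.9f),
                new ChunkHit("log-1", 1, 1.5f));
        // One line per parent document instead of one per chunk.
        groupByParent(hits).forEach((id, score) -> System.out.println(id + " " + score));
    }
}
```

With this layout only the newly written chunk has to be indexed as the file grows; the cost is the extra grouping pass at search time.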

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Ross Woolf
Yes, I will open an issue. On Mon, Aug 12, 2013 at 10:02 AM, Robert Muir wrote: > On Mon, Aug 12, 2013 at 8:48 AM, Ross Woolf wrote: > > Okay, just for clarity's sake, what you are saying is that if I make the > > FieldCache call it won't actually create and impose the loading time of > the > >

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Robert Muir
On Mon, Aug 12, 2013 at 8:48 AM, Ross Woolf wrote: > Okay, just for clarity's sake, what you are saying is that if I make the > FieldCache call it won't actually create and impose the loading time of the > FieldCache, but rather just use the NumericDocValuesField instead. Is this > correct? Yes, e

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Ross Woolf
Okay, just for clarity's sake, what you are saying is that if I make the FieldCache call it won't actually create and impose the loading time of the FieldCache, but rather just use the NumericDocValuesField instead. Is this correct? Also, my similarity was extending SimilarityBase, and I can't see

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Robert Muir
Hello: This call just "passes thru" to docvalues: FieldCache.DEFAULT.getFloats(context.reader(), boostField, false) if you want to call context.reader().getNumericDocValues... you could do that too, but that's all it's doing in this case. On Mon, Aug 12, 2013 at 11:09 AM, Ross Woolf wrote: >

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Ross Woolf
That example shows using fieldcache, I am not wanting to use the fieldcache. I want to use the newer NumericDocValuesField. Any direction or examples of how to retrieve a value from the created NumericDocValuesField in most efficient way would be appreciated. On Mon, Aug 12, 2013 at 8:54 AM, Ro

Re: How to get hits coordinates in Lucene 4.4.0

2013-08-12 Thread Michael McCandless
OK. But, the offsets refer to the plain text after you filtered the PDF document, not e.g. to offset in the original PDF content. Mike McCandless http://blog.mikemccandless.com On Mon, Aug 12, 2013 at 9:58 AM, Lingviston wrote: > Like I said I will work with pdf files. So I will draw highlig

Re: Merging 2 indexes into a single existing file.

2013-08-12 Thread Michael McCandless
Use IndexWriter.addIndexes? Mike McCandless http://blog.mikemccandless.com On Mon, Aug 12, 2013 at 10:51 AM, Jugal Kolariya wrote: > Hello All, > I have created Indexes for 100 files on my local server. These > indexes are now being used for Search operation. > > Now, from the back

Re: How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Robert Muir
There is a unit test demonstrating this at a very basic level here: http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/search/TestDocValuesScoring.java On Mon, Aug 12, 2013 at 10:43 AM, Ross Woolf wrote: > The JavaDocs for NumericDocValuesField i

Merging 2 indexes into a single existing file.

2013-08-12 Thread Jugal Kolariya
Hello All, I have created Indexes for 100 files on my local server. These indexes are now being used for Search operation. Now, from the backend, new files are created on FTP server. I have to now create indexes for newly created files on FTP server. This can also be done without

How to retrieve value of NumericDocValuesField in similarity

2013-08-12 Thread Ross Woolf
The JavaDocs for NumericDocValuesField indicate that this field value can be used for scoring. The example shows how to store the field, but I am unclear as to how to retrieve the value of the field while in a similarity to use it when scoring a document. Can someone point me to an example or gi

Re: How to get hits coordinates in Lucene 4.4.0

2013-08-12 Thread Lingviston
Like I said, I will work with pdf files. So I will draw highlights by myself over the rendered pdf file (as far as I know lucene can't work with pdf by default). Yes, offsets are what I'm looking for.

Creating Indexes when data inside the file is being written.

2013-08-12 Thread Jugal Kolariya
Hello, I have a potential use case for which I am not sure whether using lucene will help me or not. In my case, I am creating a new file and writing data to that file. Now, when the file writing is in progress, I would like to create Lucene Indexes. Once indexes are created, I

Re: How to get hits coordinates in Lucene 4.4.0

2013-08-12 Thread Michael McCandless
I think you're asking for what Lucene calls "offsets", i.e. the character indices into the original indexed text, telling you where each hit occurred. All highlighters use offsets to find the matches in the original indexed text. One option, which both Highlighter and FastVectorHighlighter use, i
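
To make the notion of "offsets" concrete, here is a minimal plain-Java sketch. It is not the Lucene highlighter API, and the class and method names are hypothetical. It reports each hit as a half-open character range `[start, end)` into the text that was indexed; for a PDF that would be the extracted plain text, not the PDF file itself. As a simplification it does case-insensitive substring matching, whereas Lucene records offsets per analyzed token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class OffsetSketch {

    /** A hit expressed as a half-open character range [start, end). */
    public static final class Offset {
        public final int start;
        public final int end;

        public Offset(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Finds non-overlapping, case-insensitive occurrences of term in text.
     * Assumes lower-casing does not change string length, which holds for
     * ASCII input but not for every Unicode character.
     */
    public static List<Offset> findOffsets(String text, String term) {
        List<Offset> hits = new ArrayList<>();
        if (term.isEmpty()) {
            return hits;
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        int from = 0;
        while (true) {
            int start = haystack.indexOf(needle, from);
            if (start < 0) {
                break;
            }
            hits.add(new Offset(start, start + needle.length()));
            from = start + needle.length(); // continue after this hit
        }
        return hits;
    }

    public static void main(String[] args) {
        String extracted = "Lucene stores offsets; offsets point into the extracted text.";
        for (Offset o : findOffsets(extracted, "offsets")) {
            // substring(start, end) recovers exactly the matched characters.
            System.out.println(o.start + "-" + o.end + ": " + extracted.substring(o.start, o.end));
        }
    }
}
```

Drawing highlights over a rendered PDF would additionally need a mapping from these character ranges back to page coordinates, which has to come from whatever tool extracted the text.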

How to get hits coordinates in Lucene 4.4.0

2013-08-12 Thread Lingviston
Hi, I'm trying to use Lucene in my Android project. To start with I've created a small demo app. It works with .txt files but I need to work with .pdf. So analyzing my code I understand that it will have some issues with .pdfs due to memory management. However the question I want to ask here is not