Hits Max # of documents?

2008-12-01 Thread Ian Vink
(I'm using Lucene.NET, but the APIs are close enough.) I'd like the search to always return all documents. I notice that it 'seems' to return only a percentage of them. Hits myHits = searcher.search(query); is what I use. Is there a way to force the searcher to give me everything? Ian

Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
Is there a simple example on how to query for "contents:Hello" in all documents using MatchAllDocsQuery ? I want 100% of the docs with "Hello" Ian

Re: Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
But when I search (50,000 documents) I don't get all documents with "Hello" in them. I get a lot, but not all. Ian On Mon, Dec 1, 2008 at 9:33 AM, Erik Hatcher <[EMAIL PROTECTED]>wrote: > > On Dec 1, 2008, at 8:30 AM, Ian Vink wrote: > >> Is there

ID field - hundreds?

2008-12-01 Thread Ian Vink
Each document has a field "DocID" with a unique int in the index. I want to search the documents with DocID of 1 or 2 or 5 or 8 etc (hundreds long list) When I specify a query like: [ contents:Hello DocID:1 DocID:2 ] etc it is slow. Is there a more efficient way to limit my search to books in

Re: Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
enough hits? > > e.g. if you are using method TopDocs search(Query query, int n) are > you setting n high enough? > > -- > Ian. > > On Mon, Dec 1, 2008 at 1:48 PM, Ian Vink <[EMAIL PROTECTED]> wrote: > > But when I search (50,000 documents) I don't get all do

Design guidance - search strategy

2008-12-04 Thread Ian Vink
I have documents with this simple schema in Lucene, which I cannot change: docid: (int) contents: (text) The user is given a list of 10,000 documents in a tree, from which they select the ones to search; usually they select 5000 or so. I only want to search those 5000 documents. I have the 'id' fields. That is

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
see > several of the Searcher.search variants. > > Second suggestion, use one of the collector classes rather than > Hits, e.g. TopDoc*, TopFieldDoc*, whichever suits. > > > Best > Erick > > On Thu, Dec 4, 2008 at 7:59 AM, Ian Vink <[EMAIL PROTECTED]> wrote:

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
escape my aging memory. > > Hope that helps > Erick > > On Thu, Dec 4, 2008 at 4:20 PM, Ian Vink <[EMAIL PROTECTED]> wrote: > > > So, let me get this straight. :) > > > > A Query tells Lucene what to search for. Then a Filter tells lucene what? > >

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
It works. For those using Lucene.NET here is an example of a Filter that takes a list of IDs for books: public class BookFilter: Filter { private readonly List bookIDs; public BookFilter(List bookIDsToSearch) { bookIDs = bookIDsToSearch; }
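
The idea behind such an ID filter can be sketched without Lucene at all. The following is a minimal sketch in plain Java, not the Lucene or Lucene.NET API: the class name BookIdFilterSketch, the docIdField array (standing in for the stored "DocID" value of each internal document number) and the apply helper are all hypothetical names introduced for illustration. It only shows the bit-set approach: mark every document whose ID is in the selected list, then intersect that with the query matches.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an ID filter; it does not use the Lucene API.
final class BookIdFilterSketch {
    private final Set<Integer> bookIds;

    BookIdFilterSketch(List<Integer> bookIdsToSearch) {
        // A hash set makes each membership test O(1), even for thousands of IDs.
        this.bookIds = new HashSet<>(bookIdsToSearch);
    }

    // docIdField[i] is assumed to hold the "DocID" value of internal document i.
    BitSet bits(int[] docIdField) {
        BitSet allowed = new BitSet(docIdField.length);
        for (int doc = 0; doc < docIdField.length; doc++) {
            if (bookIds.contains(docIdField[doc])) {
                allowed.set(doc);
            }
        }
        return allowed;
    }

    // Keep only the query matches that the filter allows.
    static BitSet apply(BitSet queryMatches, BitSet allowed) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(allowed);
        return result;
    }
}
```

Because the allowed set is computed once and then combined with the matches by a bitwise AND, the cost does not grow with the number of OR clauses in the query, which is presumably why this approach is faster than a long list of DocID terms.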

TopDocs

2008-12-04 Thread Ian Vink
I have this search which returns TopDocs TopDocs topDocs = searcher.Search(query, bookFilter, maxDocsToFind); How do I get the document object for the ScoreDoc? foreach (ScoreDoc scoreDoc in topDocs.scoreDocs) { ??Document myDoc = GetTheDocument(scoreDoc.doc); ?? }

TopDocs - Get all docs?

2008-12-05 Thread Ian Vink
Is there an easy way to get all the documents in the index? Kinda like this: TopDocs everything = ???.GetAllDocuments();

Fragment Highlighter Phrase?

2008-12-07 Thread Ian Vink
Is there a way to get phrases counted in the list of fragments that come back from Highlighter.GetBestFragments() in general. It seems to only take words into account. Ian

How to add an Arabic and Farsi language analyzer to Lucene

2008-12-12 Thread Ian Vink
Anyone heard of one for Lucene.NET ? Ian

.NET list?

2008-12-12 Thread Ian Vink
I am using java-user@lucene.apache.org for help, but sometimes I'd like Lucene.net specific help. Is there a mailing list for Lucene.NET on apache? Ian

All Terms Unique

2008-12-14 Thread Ian Vink
I have an index with these terms defined for each document: language author religion Is there a simple way to get from the index a unique list of all the authors ? How about all the authors that also have "english" and "baha'i" set? I'm creating the UI and need 'pickers' for these items. Thanks

Persian (Farsi) Language Analyzer

2008-12-17 Thread Ian Vink
I have ported the Java version of the Arabic analyzer recently committed to Lucene.Net. Has any work been done on a Farsi (Persian language) analyzer? Thanks, Ian

Showcase - What I made with Lucene

2008-12-19 Thread Ian Vink
A freeware, open-source Windows PC and web-based application: http://BahaiResearch.com It allows speakers of 14 languages to investigate the religious texts of other religions. The goal is to foster better understanding between peoples of many religions and many languages. A many-to-many relations

Get DocID after Document insert?

2008-12-24 Thread Ian Vink
I am building up an index with documents that are hierarchical in their relationship to each other. After I insert a Document into the index, how do I know its document ID? I need that to pass to the next document as the "ParentID" Ian

Fast string access - Best Practise?

2008-12-25 Thread Ian Vink
Which of these is the better practice: myTitle = luceneDocument.GetField("title").StringValue(); or myTitle = luceneDocument.Get("title"); Thanks in advance. Ian

Re: Get DocID after Document insert?

2008-12-25 Thread Ian Vink
ue > IDs into a map along with their current Lucene ID if you really care > that much about speed. This could just be a simple counter and > you could find the maximum one currently in your index when you > needed to add more documents to the index using the methods > I mentioned. >

Re: Fragment Highlighter Phrase?

2009-02-14 Thread Ian Vink
s that are part of the phrase in the Query? Ian On Mon, Dec 8, 2008 at 8:28 AM, Mark Miller wrote: > Ian Vink wrote: > >> Is there a way to get phrases counted in the list of fragments that come >> back from Highlighter.GetBestFragments() in general. >> It seems to only

Re: Fragment Highlighter Phrase?

2009-02-15 Thread Ian Vink
s on that still, though > it may be dated now. First check the highlighter contrib > and see if its there though. > > - Mark > > > Ian Vink wrote: > >> I use the Lucene.NET implementation. (2.3) >> There is a Lucene.Net.Search.Spans.SpanScorer class, but it'

"Near" force in query server side?

2009-02-19 Thread Ian Vink
know about positions of the terms when it indexes? Thanks, Ian Vink http://BahaiResearch.com

Distinct terms values? (like in Luke)

2009-05-10 Thread Ian Vink
I have tagged each of my documents with a term "religion" and values like "Baha'i, Christian, Jewish, Islam" etc. In Luke it shows me that I have a term count of 8 for the term "religion" How do I get a list of the 8 distinct values for the term religion from an index? Ian

IndexReader.Terms - internals

2009-05-11 Thread Ian Vink
IndexReader rdr = IndexReader.Open(myFolder); TermEnum terms = rdr.Terms((new Term(myTermName, ""))); (from .NET land, but it's all the same) This code works great, I can loop thru the terms nicely, but after it returns all the myTermName terms, it goes into all other term
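
A minimal sketch of why the enumeration runs on into other fields, assuming the term dictionary is ordered by field name first and term text second. This is plain Java that simulates the dictionary with a sorted set; FieldTermsSketch, key and valuesOf are hypothetical names, not Lucene calls. Seeking to (field, "") only positions the cursor at the first term of that field, so the loop itself has to stop once the field name changes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Simulated term dictionary; it does not use the Lucene API.
final class FieldTermsSketch {
    // Encode a term as field + NUL + text so that natural String order
    // sorts by field first and by term text second.
    static String key(String field, String text) {
        return field + '\0' + text;
    }

    // Collect the distinct values of one field from the sorted dictionary.
    static List<String> valuesOf(TreeSet<String> dictionary, String field) {
        List<String> values = new ArrayList<>();
        String prefix = key(field, "");
        // tailSet(prefix) plays the role of seeking to the term (field, "").
        for (String entry : dictionary.tailSet(prefix)) {
            if (!entry.startsWith(prefix)) {
                break; // the enumeration has moved on to another field
            }
            values.add(entry.substring(prefix.length()));
        }
        return values;
    }
}
```

The loop ends either when the set is exhausted or when the first entry of a different field appears, whichever comes first.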

Re: IndexReader.Terms - internals

2009-05-11 Thread Ian Vink
an application level object it is designed to match complex > > word. So we loop on the TermEnum until we consider we reached the end of > > interesting information. > > To summarize: you stop the loop when > > 1. there is no more data in TermEnum >

Lucene for the Mac

2009-06-08 Thread Ian Vink
Is there a Mac port of the Lucene engine?

Re: Lucene for the Mac

2009-06-08 Thread Ian Vink
Yes, if there is an Objective-C version. Ian On Mon, Jun 8, 2009 at 6:57 PM, Paul Libbrecht wrote: > > On 8 Jun 2009 at 23:55, Ian Vink wrote: > >> Is there a Mac port of the Lucene engine? >> > > I don't get it, are you asking whether Lucene java works on Ma

Re: Lucene for the Mac

2009-06-08 Thread Ian Vink
esuggests > there is, although I don't know the state of it. > > > On Jun 8, 2009, at 6:05 PM, Ian Vink wrote: > > Yes, if there an Objective-C version. >> Ian >> >> >> On Mon, Jun 8, 2009 at 6:57 PM, Paul Libbrecht >> wrote: >> >>

Newbie: Luke and fields

2009-09-04 Thread Ian Vink
I have created an index and each document has a contents field and a language field. contents has the flags: Indexed Tokenized Stored Vector language has the flags: Indexed Stored In luke I can search contents fine, but when I try to search the field language, I never ever get results. Every doc

Re: Newbie: Luke and fields

2009-09-04 Thread Ian Vink
igure it out. > I speak from experience here, I've given myself lumps on my forehead > when someone pointed out *perfectly obvious* problems that I couldn't > find for hours. > > Best > Erick > > On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink wrote: > > > I hav

field with single quote being split

2009-09-12 Thread Ian Vink
My index has a field with the source of the document. In luke I can see that religion has baha'i or islam or Tao etc The problem is that when I construct a query in luke with "religion:baha'i" luke thinks it's 2 terms "baha" and "i" Is there a way to construct a query to make it search with

Re: field with single quote being split

2009-09-12 Thread Ian Vink
I'm using Snowball as I have a dozen languages. ian On Sat, Sep 12, 2009 at 4:56 PM, AHMET ARSLAN wrote: > > The problem is that when I construct a query in luke with > > "religion:baha'i" > > luke thinks it's 2 terms "baha" and "i" > > Which analyzer is used in query parsing? LetterTokenizer
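
The mismatch described in this thread can be illustrated with a small sketch in plain Java. It is not Lucene's analyzer code: ApostropheTokenSketch and both methods are hypothetical, and it assumes the religion field was indexed as a single untokenized value while the query text is split on non-letter characters.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; these methods are not Lucene tokenizers.
final class ApostropheTokenSketch {
    // Split on every character that is not a letter, roughly what a
    // letter-based tokenizer does to the query text.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Keep the whole value as one token, as an untokenized field would.
    static List<String> wholeValueToken(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    }
}
```

Under these assumptions the index holds the single term baha'i while the parsed query looks for baha and i, so neither query term can match the indexed one.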