Re: field with single quote being split

2009-09-12 Thread Ian Vink
I'm using Snowball as I have a dozen languages. ian On Sat, Sep 12, 2009 at 4:56 PM, AHMET ARSLAN wrote: > > The problem is that when I construct a query in luke with > > "religion:baha'i" > > luke thinks it's 2 terms "baha" and "i" > > Which analyzer is used in query parsing? LetterTokenizer
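A minimal sketch of why the apostrophe can split the term, assuming the query-side analyzer breaks tokens at every non-letter character (which is what a letter-based tokenizer does). The class and method names are hypothetical and nothing here calls Lucene itself; it only imitates the splitting rule in plain Java.

```java
import java.util.ArrayList;
import java.util.List;

class LetterSplitSketch {
    // Hypothetical helper: emits maximal runs of letters, dropping everything else.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-letter (such as the apostrophe) ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(letterTokens("baha'i")); // prints [baha, i]
    }
}
```

If the field was indexed as a single untokenized value, a query analyzed this way would look for "baha" and "i" and never match the stored "baha'i".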

field with single quote being split

2009-09-12 Thread Ian Vink
My index has a field with the source of the document. In Luke I can see that religion has baha'i or islam or Tao etc. The problem is that when I construct a query in Luke with "religion:baha'i", Luke thinks it's 2 terms, "baha" and "i". Is there a way to construct a query to make it search with

Re: Newbie: Luke and fields

2009-09-04 Thread Ian Vink
igure it out. > I speak from experience here, I've given myself lumps on my forehead > when someone pointed out *perfectly obvious* problems that I couldn't > find for hours. > > Best > Erick > > On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink wrote: > > > I hav

Newbie: Luke and fields

2009-09-04 Thread Ian Vink
I have created an index and each document has a contents field and a language field.
contents has the flags: Indexed, Tokenized, Stored, Vector
language has the flags: Indexed, Stored
In Luke I can search contents fine, but when I try to search the field language, I never ever get results. Every doc

Re: Lucene for the Mac

2009-06-08 Thread Ian Vink
esuggests > there is, although I don't know the state of it. > > > On Jun 8, 2009, at 6:05 PM, Ian Vink wrote: > > Yes, if there an Objective-C version. >> Ian >> >> >> On Mon, Jun 8, 2009 at 6:57 PM, Paul Libbrecht >> wrote: >> >>

Re: Lucene for the Mac

2009-06-08 Thread Ian Vink
Yes, if there is an Objective-C version. Ian On Mon, Jun 8, 2009 at 6:57 PM, Paul Libbrecht wrote: > > On 08-Jun-09 at 23:55, Ian Vink wrote: > >> Is there a Mac port of the Lucene engine? >> > > I don't get it, are you asking whether Lucene java works on Ma

Lucene for the Mac

2009-06-08 Thread Ian Vink
Is there a Mac port of the Lucene engine?

Re: IndexReader.Terms - internals

2009-05-11 Thread Ian Vink
an application level object it is designed to match complex > > word. So we loop on the TermEnum until we consider we reached the end of > > interesting information. > > To summarize: you stop the loop when > > 1. there is no more data in TermEnum >

IndexReader.Terms - internals

2009-05-11 Thread Ian Vink
    IndexReader rdr = IndexReader.Open(myFolder);
    TermEnum terms = rdr.Terms(new Term(myTermName, ""));
(from .NET land, but it's all the same)
This code works great; I can loop through the terms nicely, but after it returns all the myTermName terms, it goes into all other term
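A rough sketch of the usual way to stop such a loop, assuming the term dictionary is ordered by field name first and term text second, so that the terms of one field sit next to each other. The sorted set below is a hypothetical stand-in for the index's term enumeration and the helper names are invented for illustration; no Lucene calls are made.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class FieldTermWalk {
    // Separator that sorts below every ordinary character, so all terms of a
    // field stay contiguous in the sorted set.
    static final char SEP = '\u0000';

    static String key(String field, String text) {
        return field + SEP + text;
    }

    // Collects the distinct term texts of one field.
    static List<String> valuesForField(TreeSet<String> dict, String field) {
        List<String> values = new ArrayList<>();
        // Seek to the first entry at or after (field, ""), then walk forward.
        for (String k : dict.tailSet(key(field, ""), true)) {
            int sep = k.indexOf(SEP);
            // Stop as soon as the walk runs into a different field.
            if (!k.substring(0, sep).equals(field)) {
                break;
            }
            values.add(k.substring(sep + 1));
        }
        return values;
    }

    public static void main(String[] args) {
        TreeSet<String> dict = new TreeSet<>();
        dict.add(key("author", "Smith"));
        dict.add(key("religion", "Islam"));
        dict.add(key("religion", "Christian"));
        dict.add(key("title", "Some Book"));
        System.out.println(valuesForField(dict, "religion")); // prints [Christian, Islam]
    }
}
```

The same two stop conditions apply to the real enumeration: the data runs out, or the current term's field is no longer the one that was asked for.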

Distinct terms values? (like in Luke)

2009-05-10 Thread Ian Vink
I have tagged each of my documents with a term "religion" and values like "Baha'i, Christian, Jewish, Islam" etc. In Luke it shows me that I have a term count of 8 for the term "religion". How do I get a list of the 8 distinct values for the term religion from an index? Ian

"Near" force in query server side?

2009-02-19 Thread Ian Vink
know about positions of the terms when it indexes? Thanks, Ian Vink http://BahaiResearch.com

Re: Fragment Highlighter Phrase?

2009-02-15 Thread Ian Vink
s on that still, though > it may be dated now. First check the highlighter contrib > and see if its there though. > > - Mark > > > Ian Vink wrote: > >> I use the Lucene.NET implementation. (2.3) >> There is a Lucene.Net.Search.Spans.SpanScorer class, but it'

Re: Fragment Highlighter Phrase?

2009-02-14 Thread Ian Vink
s that are part of the phrase in the Query? Ian On Mon, Dec 8, 2008 at 8:28 AM, Mark Miller wrote: > Ian Vink wrote: > >> Is there a way to get phrases counted in the list of fragments that come >> back from Highlighter.GetBestFragments() in general. >> It seems to only

Re: Get DocID after Document insert?

2008-12-25 Thread Ian Vink
ue > IDs into a map along with their current Lucene ID if you really care > that much about speed. This could just be a simple counter and > you could find the maximum one currently in your index when you > needed to add more documents to the index using the methods > I mentioned. >

Fast string access - Best Practise?

2008-12-25 Thread Ian Vink
Which of these is the better practice:
    myTitle = luceneDocument.GetField("title").StringValue();
or
    myTitle = luceneDocument.Get("title");
Thanks in advance. Ian

Get DocID after Document insert?

2008-12-24 Thread Ian Vink
I am building up an index with documents that are hierarchical in their relationship to each other. After I insert a Document into the index, how do I know its document ID? I need that to pass to the next document as the "ParentID". Ian

Showcase - What I made with Lucene

2008-12-19 Thread Ian Vink
A freeware, open-source Windows PC and Web-based application: http://BahaiResearch.com It allows speakers of 14 languages to investigate the religious texts of other religions. The goal is to foster better understanding between peoples of many religions and many languages. A many-to-many relations

Persian (Farsi) Language Analyzer

2008-12-17 Thread Ian Vink
I have ported the Java version of the Arabic analyzer recently committed to Lucene.Net. Has any work been done on a Farsi analyzer (Persian language)? Thanks, Ian

All Terms Unique

2008-12-14 Thread Ian Vink
I have an index with these terms defined for each document: language, author, religion. Is there a simple way to get from the index a unique list of all the authors? How about all the authors that also have "english" and "baha'i" set? I'm creating the UI and need 'pickers' for these items. Thanks

.NET list?

2008-12-12 Thread Ian Vink
I am using java-user@lucene.apache.org for help, but sometimes I'd like Lucene.net specific help. Is there a mailing list for Lucene.NET on apache? Ian

How to add an Arabic and Farsi language analyzer to Lucene

2008-12-12 Thread Ian Vink
Has anyone heard of one for Lucene.NET? Ian

Fragment Highlighter Phrase?

2008-12-07 Thread Ian Vink
Is there a way to get phrases counted in the list of fragments that come back from Highlighter.GetBestFragments() in general? It seems to only take words into account. Ian

TopDocs - Get all docs?

2008-12-05 Thread Ian Vink
Is there an easy way to get all the documents in the index? Kinda like this:
    TopDocs everything = ???.GetAllDocuments();

TopDocs

2008-12-04 Thread Ian Vink
I have this search, which returns TopDocs:
    TopDocs topDocs = searcher.Search(query, bookFilter, maxDocsToFind);
How do I get the document object for the ScoreDoc?
    foreach (ScoreDoc scoreDoc in topDocs.scoreDocs)
    {
        ??Document myDoc = GetTheDocument(scoreDoc.doc); ??
    }

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
It works. For those using Lucene.NET, here is an example of a Filter that takes a list of IDs for books:
    public class BookFilter : Filter
    {
        private readonly List bookIDs;
        public BookFilter(List bookIDsToSearch)
        {
            bookIDs = bookIDsToSearch;
        }
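The idea behind such a filter can be sketched without Lucene, assuming a filter boils down to one bit per document position, where a set bit means the document may appear in the results. Everything below is hypothetical illustration: the array of stored IDs, the helper names, and the hit list are stand-ins, and the rest of the BookFilter class above is not reproduced because the excerpt cuts it off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AllowedDocsSketch {
    // idAtPosition[p] is the application-level book ID stored with the
    // document at index position p (an assumed layout for this sketch).
    static BitSet allowedBits(int[] idAtPosition, Set<Integer> selectedIds) {
        BitSet bits = new BitSet(idAtPosition.length);
        for (int pos = 0; pos < idAtPosition.length; pos++) {
            if (selectedIds.contains(idAtPosition[pos])) {
                bits.set(pos);
            }
        }
        return bits;
    }

    // Keeps only the query hits (given as positions) whose bit is set.
    static List<Integer> restrict(int[] hitPositions, BitSet allowed) {
        List<Integer> kept = new ArrayList<>();
        for (int pos : hitPositions) {
            if (allowed.get(pos)) {
                kept.add(pos);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] idAtPosition = {10, 11, 12, 13, 14};
        Set<Integer> selected = new HashSet<>(Arrays.asList(11, 14));
        BitSet allowed = allowedBits(idAtPosition, selected);
        int[] hits = {0, 1, 3, 4}; // positions matched by the text query
        System.out.println(restrict(hits, allowed)); // prints [1, 4]
    }
}
```

The bit set is built once per selection and each hit is checked with a single lookup, which is why this tends to scale better than adding hundreds of ID clauses to the query itself.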

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
escape my aging memory. > > Hope that helps > Erick > > On Thu, Dec 4, 2008 at 4:20 PM, Ian Vink <[EMAIL PROTECTED]> wrote: > > > So, let me get this straight. :) > > > > A Query tells Lucene what to search for. Then a Filter tells lucene what? > > >

Re: Design guidance - search strategy

2008-12-04 Thread Ian Vink
see > several of the Searcher.search variants. > > Second suggestion, use one of the collector classes rather than > Hits, e.g. TopDoc*, TopFieldDoc*, whichever suits. > > > Best > Erick > > On Thu, Dec 4, 2008 at 7:59 AM, Ian Vink <[EMAIL PROTECTED]> wrote:

Design guidance - search strategy

2008-12-04 Thread Ian Vink
I have documents with this simple schema in Lucene, which I cannot change:
docid: (int)
contents: (text)
The user is given a list of 10,000 documents in a tree, from which they select the ones to search; usually they select 5000 or so. I only want to search those 5000 documents. I have the 'id' fields. That is

Re: Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
enough hits? > > e.g. if you are using method TopDocs search(Query query, int n) are > you setting n high enough? > > -- > Ian. > > On Mon, Dec 1, 2008 at 1:48 PM, Ian Vink <[EMAIL PROTECTED]> wrote: > > But when I search (50,000 documents) I don't get all do
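To illustrate the point about n, here is a small sketch of how a top-n collection behaves, assuming the search keeps only the n best-scoring hits and discards the rest. The scores and the helper name are made up for the example, and a plain priority queue stands in for whatever the searcher uses internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {
    // Returns the n highest scores, best first; anything beyond n is dropped.
    static List<Double> topN(double[] scores, int n) {
        PriorityQueue<Double> heap = new PriorityQueue<>(); // min-heap: lowest kept score on top
        for (double s : scores) {
            heap.add(s);
            if (heap.size() > n) {
                heap.poll(); // evict the current lowest score
            }
        }
        List<Double> best = new ArrayList<>(heap);
        Collections.sort(best, Collections.reverseOrder());
        return best;
    }

    public static void main(String[] args) {
        double[] scores = {0.2, 0.9, 0.5, 0.7};
        System.out.println(topN(scores, 2));  // prints [0.9, 0.7]
        System.out.println(topN(scores, 10)); // prints [0.9, 0.7, 0.5, 0.2]
    }
}
```

With n smaller than the number of matching documents, some matches are silently left out, which would look like "a lot, but not all".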

ID field - hundreds?

2008-12-01 Thread Ian Vink
Each document has a field "DocID" with a unique int in the index. I want to search the documents with DocID of 1 or 2 or 5 or 8 etc. (a list hundreds long). When I specify a query like [ contents:Hello DocID:1 DocID:2 ] etc., it is slow. Is there a more efficient way to limit my search to books in

Re: Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
But when I search (50,000 documents) I don't get all documents with "Hello" in them. I get a lot, but not all. Ian On Mon, Dec 1, 2008 at 9:33 AM, Erik Hatcher <[EMAIL PROTECTED]>wrote: > > On Dec 1, 2008, at 8:30 AM, Ian Vink wrote: > >> Is there

Newbie: MatchAllDocsQuery sample?

2008-12-01 Thread Ian Vink
Is there a simple example of how to query for "contents:Hello" in all documents using MatchAllDocsQuery? I want 100% of the docs with "Hello". Ian

Hits Max # of documents?

2008-12-01 Thread Ian Vink
(I'm using Lucene.NET but the APIs are close enough) I'd like the search to always return all documents. I notice that it 'seems' to return a percentage of them. This is what I use:
    Hits myHits = searcher.search(query);
Is there a way to force the searcher to give me everything? Ian