date:20050511

Re: I need 100 most frequently used words in different languages.

2005-05-11 Thread Ahmet Aksoy

Hi David, Thanks for your suggestion. I'll give it a try. David Spencer wrote: You could try downloading a copy of Wikipedia and processing the entries yourself. I don't know how well represented other languages are but there's a lot of English. Ahmet Aksoy wrote: Hi, I have a project which will

Re: I need 100 most frequently used words in different languages.

2005-05-11 Thread David Spencer

You could try downloading a copy of Wikipedia and processing the entries yourself. I don't know how well represented other languages are but there's a lot of English. Ahmet Aksoy wrote: Hi, I have a project which will be used in order to supply automatic dictionary help in different language
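A minimal sketch of the "processing the entries yourself" step, assuming the text has already been extracted into a plain string. The class and method names are hypothetical and nothing here uses Lucene or a Wikipedia dump format. It counts word occurrences and returns the most frequent ones:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts words in already-extracted text.
class TopWords {
    /** Returns the n most frequent words, most frequent first; ties are broken alphabetically. */
    static List<String> topWords(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on anything that is not a Unicode letter, so non-English text is handled too.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

Calling `TopWords.topWords(text, 100)` once per language corpus would give the kind of list asked for. Note that lower-casing with `Locale.ROOT` is a simplification; some languages (Turkish, for example) have locale-specific case rules that may call for a different locale.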

RE: problem while merging two indexes

2005-05-11 Thread Omar Didi

Thanks Otis, I copied the index and I am playing around with the copy. I first had to change the code to force the unlock of the directory. And from what you just said, the index doesn't know about any of the new segments in my directory, so deleting them shouldn't hurt. -Origina

Re: problem while merging two indexes

2005-05-11 Thread Otis Gospodnetic

You should be able to re-try the merge (from the beginning - there is no way to restart it at any point other than the beginning). The merge and the new index is "finalized" at the very end of the merge, so if it failed before that, your Lucene index (the segments file) still doesn't know about th

Re: end of line in queries

2005-05-11 Thread Chris Hostetter

It is, as long as you use an Analyzer (both when indexing and when parsing your query strings) that doesn't strip or convert whatever characters you consider an "end of line" (newline? linefeed?) during tokenization. : Date: Wed, 11 May 2005 12:41:52 -0400 : From: "Govoni, Darren" <[EMAIL PROTECTED]> :

problem while merging two indexes

2005-05-11 Thread Omar Didi

Hey guys, my application died while I was merging two indexes. According to my understanding, if I just delete the new files that have been created since I started merging, the index won't be affected. Is this true? What will happen if I just restart the merging from where the application died?

I need 100 most frequently used words in different languages.

2005-05-11 Thread Ahmet Aksoy

Hi, I have a project which will be used in order to supply automatic dictionary help in different languages. I'm using Lucene for indexing and searching the words in it. It is an open source project in Java at http://belletmen.dev.java.net Now, I will prepare a function to find the natu

Zilverline Search Engine version 1.3.0 released

2005-05-11 Thread Zilverline info

All, I've just released Zilverline version 1.3.0. This version has a webservice for indexing, and is localized for the Chinese language. This version is fully web-based: all settings, collections, and preferences can be set via the web interface. You don't need to edit any config files anymore. Also I'm

RE: Strange results using QueryParser (?)

2005-05-11 Thread Chris Hostetter

In your query parser, you'll need to use an Analyzer that knows that "documenttype" should not be tokenized, and that the raw string entered by the user should be treated as the query Term value. You can make your own analyzer that subclasses StandardAnalyzer and only does the special behavior for t

Re: categorized search

2005-05-11 Thread Chris Hostetter

Well ... once you have the list of all "category" names that are in docs which match your original query, you can either redo the original query with "and category:" to get the counts, or you can pre-compute (and save) a BitSet for each category in your index (easy to build using a HitCollec
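The second option above (one pre-computed BitSet per category) can be sketched with plain `java.util.BitSet` objects. This is an illustration under assumptions: the class name is hypothetical, the bit positions stand in for document numbers, and in a real application the sets would be filled from search results rather than by hand.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: per-category hit counts via bit-set intersection.
class CategoryCounts {
    /** For each category, counts the documents that are in the category AND match the query. */
    static Map<String, Integer> count(BitSet queryMatches, Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : categoryBits.entrySet()) {
            // Clone first so the cached per-category set is left unmodified.
            BitSet both = (BitSet) e.getValue().clone();
            both.and(queryMatches);
            counts.put(e.getKey(), both.cardinality());
        }
        return counts;
    }
}
```

The intersection avoids re-running the query once per category; the cost is keeping one bit per document per category in memory, and rebuilding the sets when the index changes.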

Re: MultiFieldQueryParser Problems about how to give the fields weight

2005-05-11 Thread Otis Gospodnetic

If you think the content field is more important, you could boost it at indexing time. If you want to boost at search time, and you are using QueryParser, you could just use the term^float syntax. I think what you have down there is ok, too, but I suppose you'd need an if/else so you boost only the c
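A small sketch of the search-time option, assuming the query is a single word and that the resulting string is then handed to QueryParser. The class and method names are hypothetical; the only Lucene-specific part is the `field:term^boost` query syntax itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: builds a query string with per-field boosts.
class BoostedQuery {
    /** Builds e.g. "content:lucene^2.0 summary:lucene" from a term and per-field boosts. */
    static String build(String term, Map<String, Float> fieldBoosts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Float> e : fieldBoosts.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append(':').append(term);
            float boost = e.getValue();
            if (boost != 1.0f) { // 1.0 is the default boost, so it is left out
                sb.append('^').append(boost);
            }
        }
        return sb.toString();
    }
}
```

The term is not escaped here, so input containing query-syntax characters or whitespace would need extra handling before being passed to a parser.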

Re: AW: only getting Hits with score >= threshold

2005-05-11 Thread Otis Gospodnetic

In that case just look at the first N hits and don't even mention the rest. Otis --- Kai Gülzau <[EMAIL PROTECTED]> wrote: > >Note that it may not make sense filtering by an arbitrary score > >(normalized or not). > > I don't like the gooogle effect > with an endless amo

RE: Real time indexing with RAMDirectory

2005-05-11 Thread Otis Gospodnetic

What happens if you swap these 2 lines? System.out.println("Docs number : " + ir.numDocs()); ir.close(); If I were you, I'd try using minMergeDocs instead of RAMDirectory. It makes things much simpler. You shouldn't need to optimize the index. Otis --- Rifflar

Re: a few basic questions

2005-05-11 Thread Otis Gospodnetic

Hello, It sounds like you missed the Index Format page: http://lucene.apache.org/java/docs/fileformats.html That's the best index format documentation currently available. Otis --- Sujatha Das <[EMAIL PROTECTED]> wrote: > > Hi, > I couldn't find documentation on these issues, > so a url as

RE: Getting subpart of Lucene Query

2005-05-11 Thread Yagnesh Shah

Hi! Seema, Change your document.java so that content field is added for example: doc.add(Field.Text("contents", "some dummy text")); -Original Message- From: Seema Jain [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 11, 2005 6:20 AM To: java-user@lucene.apache.org Subject: Getting subpar

RE: indexing relational table(s)

2005-05-11 Thread Govoni, Darren

You can also leverage the 'fields' capability in Lucene and perhaps match them against columns to do field-based searching. -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Wed 5/11/2005 12:50 PM To: java-user@lucene.apache.org Subject: Re: indexing relational ta

Re: indexing relational table(s)

2005-05-11 Thread Andrzej Bialecki

Dick Hollenbeck wrote: As sources of indexable text we always see HTML, XML, PDF, etc. but I have not seen much mention of relational tables as a source. Anybody know why? I think no specific reason - Lucene is able to index just pure text, anything else must go through format converters first

end of line in queries

2005-05-11 Thread Govoni, Darren

Hi, I'm trying to perform a query and need to specify a string pattern occurring at the end of a line. Is this possible? Thanks. Darren

indexing relational table(s)

2005-05-11 Thread Dick Hollenbeck

As sources of indexable text we always see HTML, XML, PDF, etc. but I have not seen much mention of relational tables as a source. Anybody know why? We have a database with 60,000 records in 6 tables and approximately 15 *text* fields per table. Can we use Lucene to index this with JDBC being

MultiFieldQueryParser Problems about how to give the fields weight

2005-05-11 Thread luqun lou

Now suppose there are two fields, "content" and "summary", but I think a query match in the content field should have a higher weight than one in the summary field. How can I do it? I overloaded the parse function and added weights, which store every field's weight. public static Query parse(String query,String[] fie

Re: sanity check - large, long running index updates and concurrent read-only service

2005-05-11 Thread Yonik Seeley

When created, an IndexReader opens all the segment files and hangs onto them. Any updates to the index through an IndexWriter (including commit and optimize) will not affect already open IndexReaders. -Yonik On 5/11/05, Naomi Dushay <[EMAIL PROTECTED]> wrote: > It's my impression that with optimi

RE: sanity check - large, long running index updates and concurrent read-only service

2005-05-11 Thread Naomi Dushay

It's my impression that with optimize running so long, there will be a significant period of time (many minutes) when the old IndexReader will not be able to find the segment/documents it needs. Am I wrong about that? - Naomi > Could you explain why you need to copy the index? It doesn't seem

RE: Strange results using QueryParser (?)

2005-05-11 Thread Lilja, Bjorn

Hi, Daniel's suggestion was quite correct. Is the "/" supposed to be turned into a whitespace? In that case, how do I stop it? I do wish to search for the entire exact word "Blankett/Mall". Regards, Björn _ Björn Lilja | Technology S

Getting subpart of Lucene Query

2005-05-11 Thread Seema Jain

Hi, I am using the Lucene API for text indexing, searching and highlighting. I am using the Lucene sandbox API for highlighting keywords. My requirement is to get a subpart of a Lucene query, which is made up of field-value pairs. How can I get the value of a particular field?

Re: proximity search in lucene (fwd)

2005-05-11 Thread Sujatha Das

Consider a situation in which I have indexed the terms under two different fields (say FIELD_TEXT and FIELD_SYNONYM). What if I wanted to support queries like "jaguar NEAR london", when I have indexed a document with "panthers in zoos around London". So given that Lucene doesn't support cross-fie

Re: proximity search in lucene (fwd)

2005-05-11 Thread Sujatha Das

-- Forwarded message -- Date: Fri, 1 Apr 2005 15:34:10 -0500 From: Erik Hatcher <[EMAIL PROTECTED]> Reply-To: java-user@lucene.apache.org To: java-user@lucene.apache.org Subject: Re: proximity search in lucene On Apr 1, 2005, at 2:29 PM, Sujatha Das wrote: Hi, Does Lucene support "

a few basic questions

2005-05-11 Thread Sujatha Das

Hi, I couldn't find documentation on these issues, so a url as response should be just fine. The inverted index must look like FIELD-1 term -> (doc,offset)pairs Is this correct? Say I am trying to index the documents in a corpus under two different fields. For instance, I want to store with every w

AW: only getting Hits with score >= threshold

2005-05-11 Thread Kai Gülzau

>Note that it may not make sense filtering by an arbitrary score >(normalized or not). I don't like the gooogle effect with an endless amount of paging links. ;) The user should get only the top percentage of docs/products he can reasonably handle. Regards, Kai Gü

28 matches
