Re: Hungarian notation analyzer and phrase queries

2005-04-13 Thread Paul Smith
Thanks for your help guys! If you put the term query at position 2 then you need slop to find "Use PowerQuery for advanced searches", which is the exact text in the document. I think I'd rather have that phrase query work without any slop, and require some slop for "use power query for advanced

SearchBlox J2EE Search Expands Language Support in Version 2.2

2005-04-13 Thread Robert Selvaraj
SearchBlox is a J2EE Search Component that delivers out-of-the-box search functionality for fast and easy implementation in your websites, applications, intranets and portals. SearchBlox uses the Lucene Search API and incorporates integrated HTTP/HTTPS and File System crawlers, support for various

Re: Strange sort error

2005-04-13 Thread Daniel Naber
On Tuesday 12 April 2005 20:04, Bill Tschumy wrote: > Here is a small program that will manifest the error. Hopefully > someone can explain the problem. It happens with Lucene 1.4.2 and > 1.4.3. This is the code that throws the exception (from FieldCacheImpl.java): TermEnum termEnum = re

Re: lucene - nutch to rss?

2005-04-13 Thread Michael Wechner
Erik Hatcher wrote: That's really more appropriate to the Nutch list, not the Lucene user list. right. I realised at the very last moment (when it was too late ;-) that the address was the lucene list instead of the nutch list Re the XSLT one then could offer various XSLTs in order to cover the

Re: lucene - nutch to rss?

2005-04-13 Thread Erik Hatcher
On Apr 13, 2005, at 3:06 PM, Michael Wechner wrote: Michael Giles wrote: Zak, Doing such a thing is pretty simple. If you take the Nutch sample, there is a JSP page which handles queries and renders the search results. I think it would make sense to enhance the current JSP by introducing a "fo

ArrayIndexOutOfBounds exception

2005-04-13 Thread Bill Tschumy
I am using the MoreLikeThis class that is available in the "similarity" package in the contributed software area. It has been working fine, but I just received the following exception: java.lang.ArrayIndexOutOfBoundsException: -1 at org.apache.lucene.index.MultiReader.getTermFreqVector(Multi

Re: lucene - nutch to rss?

2005-04-13 Thread Michael Wechner
Michael Giles wrote: Zak, Doing such a thing is pretty simple. If you take the Nutch sample, there is a JSP page which handles queries and renders the search results. I think it would make sense to enhance the current JSP by introducing a "format" parameter, e.g. nutch?format=xml whereas defau

Re: Hungarian notation analyzer and phrase queries

2005-04-13 Thread Chris Hostetter
: Another approach would be to index this as: : : token: use power query for advanced searches : powerquery : position: 0 1 2 3 4 5 : : Then use phrase queries with slop=1, to permit a one-token gap when : someone searches for "use powe
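The effect of slop on a layout like the one quoted above can be sketched without Lucene. This is a simplified model, not Lucene's scorer: the class `SloppyPhraseSketch` and its methods are hypothetical, the check only covers in-order terms, and it assumes 'powerquery' shares position 1 with 'power' (the flattened table above does not make the exact position recoverable).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a positional index; not a Lucene API.
final class SloppyPhraseSketch {

    // tokensAtPosition[p] lists every term indexed at position p.
    static Map<String, List<Integer>> index(String[][] tokensAtPosition) {
        Map<String, List<Integer>> idx = new HashMap<>();
        for (int pos = 0; pos < tokensAtPosition.length; pos++) {
            for (String term : tokensAtPosition[pos]) {
                idx.computeIfAbsent(term, k -> new ArrayList<>()).add(pos);
            }
        }
        return idx;
    }

    // Simplified rule: shift each term's position by its offset in the phrase;
    // the phrase matches if the shifted positions differ by at most `slop`.
    static boolean matches(Map<String, List<Integer>> idx, int slop, String... phrase) {
        return search(idx, phrase, 0, Integer.MAX_VALUE, Integer.MIN_VALUE, slop);
    }

    private static boolean search(Map<String, List<Integer>> idx, String[] phrase,
                                  int i, int min, int max, int slop) {
        if (i == phrase.length) {
            return true; // the spread was checked at every step
        }
        List<Integer> positions = idx.getOrDefault(phrase[i], Collections.<Integer>emptyList());
        for (int pos : positions) {
            int shifted = pos - i;
            int newMin = Math.min(min, shifted);
            int newMax = Math.max(max, shifted);
            if (newMax - newMin <= slop && search(idx, phrase, i + 1, newMin, newMax, slop)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed layout: 'power' and 'powerquery' stacked at position 1.
        Map<String, List<Integer>> idx = index(new String[][] {
            {"use"}, {"power", "powerquery"}, {"query"}, {"for"}, {"advanced"}, {"searches"}
        });
        System.out.println(matches(idx, 0, "use", "power", "query", "for", "advanced", "searches"));
        System.out.println(matches(idx, 0, "use", "powerquery", "for", "advanced", "searches"));
        System.out.println(matches(idx, 1, "use", "powerquery", "for", "advanced", "searches"));
    }
}
```

Under these assumptions the split form matches exactly, while the single-word form leaves a one-position gap before 'for' and so needs slop=1.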

Re: Hungarian notation analyzer and phrase queries

2005-04-13 Thread Doug Cutting
Paul Smith wrote: I have written a custom analyzer to tokenize PowerQuery into 'power', 'query', and 'powerquery' and change the position increment to 0, but I don't quite get the desired behavior. The phrase query "use power query for advanced searches" does not match, but "use query for advanced
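The behavior described here follows from how position increments stack tokens. Below is a minimal sketch in plain Java, assuming 'query' and 'powerquery' are emitted with increment 0 after 'power'; the class `StackedTokenSketch` is hypothetical and only imitates exact (slop 0) phrase matching, it is not the poster's analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of token positions; not the analyzer from the thread.
final class StackedTokenSketch {

    // Positions are the running sum of increments, starting from -1,
    // so a first token with increment 1 lands on position 0.
    static Map<String, Set<Integer>> index(String[] terms, int[] increments) {
        Map<String, Set<Integer>> idx = new HashMap<>();
        int pos = -1;
        for (int i = 0; i < terms.length; i++) {
            pos += increments[i];
            idx.computeIfAbsent(terms[i], k -> new HashSet<>()).add(pos);
        }
        return idx;
    }

    // Exact phrase: term i must sit exactly i positions after the first term.
    static boolean exactPhrase(Map<String, Set<Integer>> idx, String... phrase) {
        if (phrase.length == 0) {
            return false;
        }
        for (int start : idx.getOrDefault(phrase[0], Collections.<Integer>emptySet())) {
            boolean ok = true;
            for (int i = 1; i < phrase.length && ok; i++) {
                ok = idx.getOrDefault(phrase[i], Collections.<Integer>emptySet()).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] terms = {"use", "power", "query", "powerquery", "for", "advanced", "searches"};
        int[] increments = {1, 1, 0, 0, 1, 1, 1}; // 'query' and 'powerquery' stacked on 'power'
        Map<String, Set<Integer>> idx = index(terms, increments);
        System.out.println(exactPhrase(idx, "use", "power", "query", "for", "advanced", "searches"));
        System.out.println(exactPhrase(idx, "use", "query", "for", "advanced", "searches"));
    }
}
```

Because all three tokens share one position, 'for' directly follows each of them, so the phrase with both 'power' and 'query' expects one position more than the index contains.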

Re: lucene - nutch to rss?

2005-04-13 Thread Erik Hatcher
On Apr 13, 2005, at 11:54 AM, jazdrv wrote: hello, this is my first posting to this thread, and i haven't played with the libraries as of yet. i'm curious whether people have been using lucene/nutch to convert results to rss and what would be architectural considerations/time period in doing somet

RE: Searching an NTFS File Server

2005-04-13 Thread Maher Martin
Does the following concept sound reasonable? * The Lucene index generator would run under a windows account that has full read access to all files stored on the NTFS file server. * For each file, the following information would have to be extracted: - the contents of the file and, - the ac
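One way to picture the concept is to keep, next to each indexed file, the set of principals allowed to read it, and to drop hits the current user cannot see. The sketch below is an assumption-laden illustration: `AclFilterSketch`, `IndexedFile`, and `visibleTo` are hypothetical names, principals are plain strings, and NTFS details such as deny entries or inherited permissions are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical post-filter over search hits; not a Lucene or NTFS API.
final class AclFilterSketch {

    static final class IndexedFile {
        final String path;
        final Set<String> readers; // users or groups recorded at index time

        IndexedFile(String path, String... readers) {
            this.path = path;
            this.readers = new HashSet<>(Arrays.asList(readers));
        }
    }

    // Keep a hit only if the user, or one of the user's groups, may read it.
    static List<String> visibleTo(Set<String> userPrincipals, List<IndexedFile> hits) {
        List<String> visible = new ArrayList<>();
        for (IndexedFile hit : hits) {
            if (!Collections.disjoint(hit.readers, userPrincipals)) {
                visible.add(hit.path);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<IndexedFile> hits = Arrays.asList(
            new IndexedFile("budget.xls", "finance"),
            new IndexedFile("handbook.pdf", "staff", "finance"));
        Set<String> alice = new HashSet<>(Arrays.asList("alice", "staff"));
        System.out.println(visibleTo(alice, hits));
    }
}
```

A permission snapshot taken at index time goes stale when file permissions change, so a design like this would need re-indexing or a live check before showing a result.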

Re: lucene - nutch to rss?

2005-04-13 Thread Michael Giles
I think you are over-complicating it. There is no need to turn the HTML into RSS. Once you're in HTML, you've gone too far. You just need to change the code in the JSP page that currently renders HTML so that it renders XML/RSS instead. Then pass requests to your new JSP page when folks wan
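Rendering the result list directly as RSS could look roughly like the sketch below. It is not the Nutch JSP: `RssRenderSketch`, `Hit`, and `toRss` are hypothetical names, the hits are plain value objects, and only the basic RSS 2.0 channel and item elements are written.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical renderer; a real page would take its hits from the search API.
final class RssRenderSketch {

    static final class Hit {
        final String title;
        final String url;
        final String summary;

        Hit(String title, String url, String summary) {
            this.title = title;
            this.url = url;
            this.summary = summary;
        }
    }

    // Escape the five XML special characters; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    static String toRss(String channelTitle, String channelLink, List<Hit> hits) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<rss version=\"2.0\"><channel>\n");
        sb.append("<title>").append(escape(channelTitle)).append("</title>\n");
        sb.append("<link>").append(escape(channelLink)).append("</link>\n");
        sb.append("<description>Search results</description>\n");
        for (Hit hit : hits) {
            sb.append("<item>");
            sb.append("<title>").append(escape(hit.title)).append("</title>");
            sb.append("<link>").append(escape(hit.url)).append("</link>");
            sb.append("<description>").append(escape(hit.summary)).append("</description>");
            sb.append("</item>\n");
        }
        sb.append("</channel></rss>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("Tom & Jerry", "http://example.com/1?x=1&y=2", "<b>hi</b>"));
        System.out.print(toRss("Results for: cats", "http://example.com/search", hits));
    }
}
```

The only real difference from an HTML page is the output markup and the escaping, which matches the point that no HTML-to-RSS conversion step is needed.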

RE: lucene - nutch to rss?

2005-04-13 Thread jazdrv
ok, that's comforting to hear. i've been trying to figure out the particular piece that does that html to rss conversion. what have other people been using? Something like informa or am I over-complicating myself? ~z -Original Message- From: Michael Giles [mailto:[EMAIL PROTECTED] Sent:

Re: lucene - nutch to rss?

2005-04-13 Thread Michael Giles
Zak, Doing such a thing is pretty simple. If you take the Nutch sample, there is a JSP page which handles queries and renders the search results. You can easily copy and alter that page to render the results in RSS format instead of HTML. -Mike jazdrv wrote: hello, this is my first posting to

lucene - nutch to rss?

2005-04-13 Thread jazdrv
hello, this is my first posting to this thread, and i haven't played with the libraries as of yet. i'm curious whether people have been using lucene/nutch to convert results to rss and what would be architectural considerations/time period in doing something like that. zak

Re: Dynamic index building is expensive

2005-04-13 Thread Erik Hatcher
On Apr 13, 2005, at 6:34 AM, Ranjan K. Baisak wrote: Hello, My application used swing and a data base application. For searching mechanism, I'm using Lucene. I used to build the index during application startup and any change to the DB also makes change the index. So I have a thread which looks if

RE: Searching an NTFS File Server

2005-04-13 Thread Peter Veentjer - Anchor Men
-Original Message- From: Maher Martin [mailto:[EMAIL PROTECTED] Sent: Wednesday, 13 April 2005 12:39 To: java-user@lucene.apache.org Subject: Searching an NTFS File Server -How does one generate the list of results, so that the list contains only entries that the use

Re: TR : information

2005-04-13 Thread Chris Lamprecht
Hi Arnaud, > First, I have to index different things; documents files and bdd tables > to make a search on all these elements. Is it possible? How can I index > both elements? For the search, I have to use a MultiSearcher ? Yes, you can store different types of documents (i.e., they have differe

Re: Searching an NTFS File Server

2005-04-13 Thread mark harwood
I have used JCIFS before (http://jcifs.samba.org/) to handle single-sign-on of Windows clients to my web apps and it works very well. There is a whole bunch of file-access stuff in this package too which could possibly help with identifying who can see what.

Searching an NTFS File Server

2005-04-13 Thread Maher Martin
Hello all, We're currently evaluating search tools to cover the following requirement: We have an NTFS file server with 2 TB of files (word, excel, pdf, txt, etc). We would like to index all these files and integrate a simple web application into our intranet which will allow users to login

RE: Dynamic index building is expensive

2005-04-13 Thread Peter Veentjer - Anchor Men
You could create a change table in the database where all changes are registered (you could add triggers to the database that register changes). Your search engine then only has to check the change table and retrieve the changed data (and index it). -Original Message- From: Ranjan K. B
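The change-table idea can be sketched as an indexer that remembers the last change it processed and applies only newer rows. Everything here is illustrative: `ChangeTableSketch` and `Change` are hypothetical, the table is an in-memory list assumed to be ordered by ascending change id, and a map stands in for the Lucene index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical incremental indexer; a map stands in for the real index.
final class ChangeTableSketch {

    static final class Change {
        final long id;      // monotonically increasing, e.g. written by a trigger
        final String key;   // primary key of the changed row
        final String text;  // new content, or null if the row was deleted

        Change(long id, String key, String text) {
            this.id = id;
            this.key = key;
            this.text = text;
        }
    }

    final Map<String, String> index = new HashMap<>();
    private long lastSeenId = 0;

    // Apply only rows newer than the last processed id; returns how many were applied.
    int applyNewChanges(List<Change> changeTable) {
        int applied = 0;
        for (Change change : changeTable) {
            if (change.id <= lastSeenId) {
                continue; // already indexed in an earlier pass
            }
            if (change.text == null) {
                index.remove(change.key);
            } else {
                index.put(change.key, change.text);
            }
            lastSeenId = change.id;
            applied++;
        }
        return applied;
    }

    public static void main(String[] args) {
        ChangeTableSketch indexer = new ChangeTableSketch();
        List<Change> table = new ArrayList<>();
        table.add(new Change(1, "row-1", "first version"));
        table.add(new Change(2, "row-2", "another row"));
        System.out.println(indexer.applyNewChanges(table)); // both rows are new
        table.add(new Change(3, "row-1", null));            // row-1 deleted later
        System.out.println(indexer.applyNewChanges(table)); // only the new row is applied
    }
}
```

Each pass then costs time proportional to the number of new changes instead of the size of the whole database.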

TR : information

2005-04-13 Thread arnaudbuffet
Hello, Today I need a few explanations about the possibilities of lucene in order to implement development on an application. First, I have to index different things; documents files and bdd tables to make a search on all these elements. Is it possible? How can I index both elements? For the sea

Dynamic index building is expensive

2005-04-13 Thread Ranjan K. Baisak
Hello, My application used swing and a data base application. For searching mechanism, I'm using Lucene. I used to build the index during application startup and any change to the DB also makes change the index. So I have a thread which looks if any change has occurred in DB and if so then updates

Re: Hungarian notation analyzer and phrase queries

2005-04-13 Thread Peter Hotm. Nørregaard
Chris wrote ...As Erik points out in that thread, when dealing with a dictionary of "singleword" => ["multi" "word"], and ["multi" "word"] => "singleword" synonyms a very good/simple approach is to use an analyzer that always normalizes down to the single word version (as a single token). This allo
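Normalizing down to the single-word version can be sketched as a token pass that collapses known two-word sequences, applied identically at index time and query time. The class `SynonymNormalizeSketch` is hypothetical, handles only two-word entries, and is not the analyzer discussed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical token normalizer; limited to two-word dictionary entries.
final class SynonymNormalizeSketch {

    // Replace each adjacent pair found in the dictionary with its single-word form.
    static List<String> normalize(List<String> tokens, Map<String, String> multiToSingle) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (i + 1 < tokens.size()) {
                String single = multiToSingle.get(tokens.get(i) + " " + tokens.get(i + 1));
                if (single != null) {
                    out.add(single);
                    i++; // skip the second word of the pair
                    continue;
                }
            }
            out.add(tokens.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> dict = Collections.singletonMap("power query", "powerquery");
        System.out.println(normalize(Arrays.asList("use", "power", "query", "for"), dict));
        System.out.println(normalize(Arrays.asList("use", "powerquery", "for"), dict));
    }
}
```

Since both spellings reduce to the same token sequence, an exact phrase query would match either form without slop, provided the same normalization runs on documents and queries.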

RE: Lucene on Linux problem...

2005-04-13 Thread Kristian Ottosen
> I remember strange problems (with mkdir in my case) when using java > servlets within a tomcat that was started by apache daemon. > The problem occurred on some linux installations only. As a fact this is a servlet running under Tomcat - so you may have a good point there. But I don't think it is

RE: Lucene on Linux problem...

2005-04-13 Thread Karthik N S
Hi Guys Apologies. We had similar problems with Tomcat 5.0.3 on Linux Gentoo and Lucene 1.4.3... every first search was not able to work properly, so we finally switched JVMs from 1.4.2 to 1.4.3 SDKs and solved the problem for the day. We also updated the Linux Machines

Re: zero boost / zero score

2005-04-13 Thread Paul Elschot
On Tuesday 12 April 2005 23:41, Yonik Seeley wrote: > It seems like different search methods treat zero scoring docs a > little differently. Is this OK? > > Some search methods on IndexSearcher check for score > 0.0f : > scorer.score(new HitCollector() { > public final void collect(in