Lucene Queries Over User-Editable Dynamic Categories of Documents

2007-10-23 Thread lucene user
Folks! We are building a web-based multi-user system. Users of our system are able to categorize items that they have found into groups of related documents. We would like users to be able to search these document groups and rapidly find matches. Each user might have ten of these categories and mi

Re: Making Highlighter.mergeContiguousFragments() public

2007-10-23 Thread Mark Miller
Uh...ignore that last email...hit reply on the wrong one obviously...sorry. Dave Golombek wrote: I was wondering if people thought that making Highlighter.mergeContiguousFragments() public (and non-final) would be acceptable. In my application, I want to strip all fragments with score == 0 befor

Re: Meta- search descriptions

2007-10-23 Thread Chris Lu
Since you are only trying to index your clients' pages, I think it should be doable to use regular expressions or similar to extract the meta info. Or you can ask your clients to expose some XML or RSS that you can process more easily. But still, accessing the database directly will save you tons of time to

Re: Making Highlighter.mergeContiguousFragments() public

2007-10-23 Thread Mark Miller
Ahhh...the reason the second Shapes is not highlighted is that the Highlighter highlights based on what caused the hit in Lucene...and Lucene does not look for every shape within 4 paragraphs of distribution...after it finds one such occurrence it says "sweet, a match" and moves on...it does n

Re: Meta- search descriptions

2007-10-23 Thread Cool Coder
> Why not index their database directly? I should have explained this in my first mail. Anyway, clients are ready to allow indexing of their DB, but they have some confidential data as well as information about their clients, and all the data is so tightly coupled that it is difficult for them

Re: Meta- search descriptions

2007-10-23 Thread Chris Lu
Why not index their database directly? -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Databa

Meta- search descriptions

2007-10-23 Thread Cool Coder
I was just looking into a couple of search engines like indeed.com or bixee.com, and I was really surprised by the accuracy of the information they have built into their indexes and provide in their search results. I have the same sort of requirement: to build indexes for all my clients' sites and provide

Re: Is there bug in CJKAnalyzer?

2007-10-23 Thread Steven Rowe
Hi Ivan, Ivan Vasilev wrote: > But how to understand the meaning of this: “To overcome this, you > have to index chinese characters as single tokens (this will increase > recall, but decrease precision).” > > I understand it so: To increase the results I have to use instead of > the Chinese anot

Making Highlighter.mergeContiguousFragments() public

2007-10-23 Thread Dave Golombek
I was wondering if people thought that making Highlighter.mergeContiguousFragments() public (and non-final) would be acceptable. In my application, I want to strip all fragments with score == 0 before merging the fragments (to get the minimal matching section, but still in order), and the easiest w
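
The pre-filtering step described in this message can be sketched outside Lucene. This is a minimal illustration only, not the Highlighter's implementation: the `Fragment` tuple and the `strip_and_merge` helper are hypothetical names, fragments are assumed to carry character offsets and a score, and taking the higher score on a merge is an assumption of the sketch.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    # Hypothetical stand-in for a highlighter text fragment.
    start: int    # offset of the first character
    end: int      # offset just past the last character
    score: float  # relevance score assigned to the fragment


def strip_and_merge(frags: List[Fragment]) -> List[Fragment]:
    """Drop fragments with score == 0, then merge the survivors that touch."""
    # Keep only scoring fragments, in document order.
    kept = sorted((f for f in frags if f.score > 0), key=lambda f: f.start)
    merged: List[Fragment] = []
    for f in kept:
        if merged and merged[-1].end == f.start:
            # Contiguous with the previous fragment: extend it and keep
            # the higher score (an assumption of this sketch).
            prev = merged[-1]
            merged[-1] = Fragment(prev.start, f.end, max(prev.score, f.score))
        else:
            merged.append(f)
    return merged


frags = [
    Fragment(0, 10, 1.0),
    Fragment(10, 20, 0.0),  # non-matching fragment, removed before merging
    Fragment(20, 30, 0.5),
    Fragment(30, 40, 2.0),
]
result = strip_and_merge(frags)
```

Filtering first means the zero-score fragment no longer bridges its neighbours, so only fragments that were adjacent matches to begin with end up merged.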

OSSummit Asia / ApacheCon Atlanta

2007-10-23 Thread Erik Hatcher
A bit of self-promotion, sorry but also just want to in general make Solr and Lucene users aware of upcoming training sessions at OSSummit Asia (and ApacheCon Atlanta). It's a struggle for the conference organizers to put on training sessions because of the upfront expense and risk in

Re: Sort by date with Lucene 2.2.0 ...

2007-10-23 Thread Daniel Naber
On Tuesday 23 October 2007 15:57, Dragon Fly wrote: > I tried specifying the field type using a SortField object but I got the > same result.  I'll be glad to write a stand-alone test case.  Should I > post the code to this thread when I'm done or should I submit some sort > of bug report? Thanks.

RE: Sort by date with Lucene 2.2.0 ...

2007-10-23 Thread Dragon Fly
I tried specifying the field type using a SortField object but I got the same result. I'll be glad to write a stand-alone test case. Should I post the code to this thread when I'm done or should I submit some sort of bug report? Thanks. > From: [EMAIL PROTECTED] > To: java-user@lucene.apache.

Re: Is there bug in CJKAnalyzer?

2007-10-23 Thread Ivan Vasilev
Thanks Samir :) Your info was really helpful for us. I looked at the index with Luke, and there the Chinese signs were split into pairs as you said: AB BC CD etc. Also, when querying for ABC, the query is split into AB BC. But how to understand the meaning of this: “To overcome this, you have to index chi
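
The pair splitting described above (AB BC CD) can be illustrated with a small sketch. This is not CJKAnalyzer's code. It only imitates the overlapping-bigram idea on plain strings, and the helper names `cjk_bigrams`, `unigrams` and `matches` are hypothetical.

```python
from typing import Callable, List


def cjk_bigrams(text: str) -> List[str]:
    """Split a run of characters into overlapping two-character tokens."""
    if len(text) < 2:
        # A lone character is kept as a single token.
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def unigrams(text: str) -> List[str]:
    """Alternative scheme: index every character as its own token."""
    return list(text)


def matches(doc: str, query: str, tokenize: Callable[[str], List[str]]) -> bool:
    """True if every query token appears among the document's tokens."""
    doc_tokens = set(tokenize(doc))
    return all(tok in doc_tokens for tok in tokenize(query))


index_tokens = cjk_bigrams("ABCD")  # tokens stored for the document
query_tokens = cjk_bigrams("ABC")   # tokens produced for the query

# A single-character query finds nothing in the bigram tokens but does
# match when each character is indexed separately.
single_char_bigram = matches("ABCD", "B", cjk_bigrams)
single_char_unigram = matches("ABCD", "B", unigrams)
```

Under these assumptions, single-character tokens match more documents (higher recall), but the tokens no longer record which characters were adjacent, so unrelated documents can also match (lower precision).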