Re: Possible bug in QueryParser when using CJKAnalyzer (lucene 2.4.1)

2009-06-01 Thread Koji Sekiguchi
I'm not sure this is the same case, but there is a report and patch for CJKTokenizer in JIRA: https://issues.apache.org/jira/browse/LUCENE-973 Koji Zhang, Lisheng wrote: Hi, When I use lucene 2.4.1 QueryParser with CJKAnalyzer, somehow it always generates an extra space, for example, if the

Possible bug in QueryParser when using CJKAnalyzer (lucene 2.4.1)

2009-06-01 Thread Zhang, Lisheng
Hi, When I use lucene 2.4.1 QueryParser with CJKAnalyzer, somehow it always generates an extra space, for example, if the input is "ABC", the query would be: myfield:"AB BC " // should be myfield:"AB BC" If I create PhraseQuery directly it does work. From Luke I know indexing works OK. In lucene
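
For readers unfamiliar with the tokenization described above, here is a minimal, library-free sketch of overlapping bigram tokenization, assuming "A", "B" and "C" stand in for CJK characters. The function names `cjk_bigrams` and `phrase_query_string` are hypothetical and are not part of Lucene; the sketch only shows the expected query string, not how QueryParser builds it.

```python
def cjk_bigrams(text):
    """Return overlapping two-character tokens, e.g. "ABC" -> ["AB", "BC"]."""
    if len(text) < 2:
        # A single character (or empty input) cannot form a bigram.
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def phrase_query_string(field, text):
    """Format a phrase query; joining with " " never leaves a trailing space."""
    return '{}:"{}"'.format(field, " ".join(cjk_bigrams(text)))


print(phrase_query_string("myfield", "ABC"))
```

Joining the tokens rather than appending a space after each one is what avoids the trailing space reported in the message.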

Re: Searching documents that contain a field (text of field is irrelevant)

2009-06-01 Thread balasubramanian sudaakeran
Couple of approaches. (But not very sure if there are other better approaches) 1. Add a separate field which is set to 1 or 0 depending upon if the self description is present or not. Then you can search by this new field. 2. Along with each self-description add a common identifier word. Then you ca
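
The first approach can be sketched without any search library, as a rough illustration only. Documents are modelled as plain dictionaries; the flag field name `has_desc`, the helper `add_presence_flag` and the sample records are assumptions made for this example, while `id`, `name` and `desc` follow the field names used in the thread.

```python
def add_presence_flag(doc):
    """Return a copy of doc with a has_desc field set to "1" or "0"."""
    flagged = dict(doc)
    # A missing or empty description both count as "not present".
    flagged["has_desc"] = "1" if doc.get("desc") else "0"
    return flagged


# Hypothetical sample documents: only the first has a self-description.
students = [
    {"id": "1", "name": "Ann", "desc": "Likes chess"},
    {"id": "2", "name": "Bob"},
]

indexed = [add_presence_flag(s) for s in students]

# "Searching" by the flag field is then an exact match on a single term.
with_desc = [d["id"] for d in indexed if d["has_desc"] == "1"]
print(with_desc)
```

In a real index the flag would be written at indexing time, so the presence check becomes an ordinary single-term query on the new field.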

Searching documents that contain a field (text of field is irrelevant)

2009-06-01 Thread mattspitz
Hey! Consider a bunch of documents that represent, say, students. These students have the following attributes: 1) Student IDs 2) Name 3) Self-description (optional) So, all documents have id: and name:, but only some of the documents have an added desc: Assuming all of the fields are indexed,

RE: Distributed Lucene Questions

2009-06-01 Thread Angel, Eric
Has anyone used Katta in production? It looks very interesting and feature-rich, but I'm wondering how stable it is and whether or not it can support fine-grained queries - for example, constant score queries, MultiSearcher, etc. -Original Message- From: Ken Krugler [mailto:kkrugler_li...

Question on Efficient field updates in the Lucene index in Nutch

2009-06-01 Thread Vijay
Hi all, I have a question regarding field updates to the lucene index in nutch. Suppose I am indexing webpages along with tags as an extra field. I want to add an extra tag to a webpage. Is there a clean way for me to do this without having to re-index the page with the updated tags

Re: Sorting fields while searching!

2009-06-01 Thread Erick Erickson
It's really unclear to me what PhysicianFieldInfo.FIRST_NAME_EXACT.toString() returns. I assume the intent is to return a field name, but how that relates to FIRST_NAME_EXACT(Field.Store.YES, Field.Index.UN_TOKENIZED) doesn't mean anything to me. Could you provide some details? Note that if you s

Re: Distributed Lucene Questions

2009-06-01 Thread Ken Krugler
Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene https://issues.apache.org/jira/browse/HADOOP-3394 and have a couple of questions. It will be really helpful if someon

Sorting fields while searching!

2009-06-01 Thread vanshi
I have two fields initialized the following way for Index writing: FIRST_NAME_EXACT(Field.Store.YES, Field.Index.UN_TOKENIZED), LAST_NAME_EXACT(Field.Store.YES, Field.Index.UN_TOKENIZED), I have a prefix query to look for any name starting with the name entered by the user. Let's say user enters 'kar

Re: No hits while searching!

2009-06-01 Thread Matthew Hall
Just build your own. Here's exactly what you are looking for: (Mind you I just whipped this out, and didn't compile it... so there could be minor syntax errors here.) You will also obviously have to make your own package declaration, and your own imports. So anyhow, the really neat thing a

Re: No hits while searching!

2009-06-01 Thread vanshi
Thanks Matt & Sithu. Yes, it was due to the stop word analyzer...now I'm using a simple analyzer temporarily, as I know even the simple analyzer cannot handle quotes in names. However, can somebody please direct me towards how to handle quotes with the name in a query using the lowercase analyzer? thanks, Vanshi

Lucene on NFS/iSCSI

2009-06-01 Thread Jordon Saardchit
So I've read a lot about nightmares with lucene over shared indices using NFS, and was curious if anyone had any experience running Lucene over iSCSI? Specifically if the same sort of lock failure issues occur as they do with NFS. I'm specifically looking into multiple machines mounted to a SAN v

Distributed Lucene Questions

2009-06-01 Thread Tarandeep Singh
Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene https://issues.apache.org/jira/browse/HADOOP-3394 and have a couple of questions. It will be really helpful if someone

Re: No hits while searching!

2009-06-01 Thread Matthew Hall
Yeah, he's gotta be. You might be better off using something like a lowercase analyzer here, since punctuation in a name is likely important. Matt Sudarsan, Sithu D. wrote: Do you use stopword filtering? Sincerely, Sithu D Sudarsan -Original Message- From: vanshi [mailto:nilu.tha

Re: Searching index problems with tomcat

2009-06-01 Thread Marco Lazzara
In order to let you know about my problem I decided to use swt for the standalone app and to use Google Web Toolkit for web app :):) bye ML 2009/5/27 N Hira > > Cool! > > 1. So you are creating a parser with { name, synonyms, propIn }, correct? > > 2. Sorry -- I meant the output of "que

RE: No hits while searching!

2009-06-01 Thread Sudarsan, Sithu D.
Do you use stopword filtering? Sincerely, Sithu D Sudarsan -Original Message- From: vanshi [mailto:nilu.tha...@gmail.com] Sent: Monday, June 01, 2009 11:39 AM To: java-user@lucene.apache.org Subject: Re: No hits while searching! Thanks Erick, I was able to get this work...as you sai

Re: No hits while searching!

2009-06-01 Thread vanshi
Thanks Erick, I was able to get this to work...as you said, Luke is a great tool to look into what gets stored in the index, though in my case I was searching before the indexes were created, so I was getting zero hits. On a side note, I'm running into strange output with prefix query...it only works when

How to post date encoded in NCR(decimal) to lucene indexer?

2009-06-01 Thread KK
Hi All, I'm trying to index data to lucene index in unicode utf-8 format. All my search queries are of the form \u and it's working fine. But the problem is in some cases, when the document [actually a webpage content] contains Numeric Character Reference [decimal], these are getting indexed as su
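
One possible way to handle this, sketched here as an assumption rather than as the resolution reached in the thread, is to decode numeric character references into Unicode text before the content is handed to the indexer. The sketch uses Python's standard `html.unescape`, which decodes decimal and hexadecimal references; the helper name `decode_ncr` and the sample string are made up for illustration.

```python
import html


def decode_ncr(text):
    """Replace numeric character references such as &#2325; with the
    Unicode characters they denote. Named entities like &amp; are
    decoded as well, since html.unescape handles both."""
    return html.unescape(text)


# Hypothetical page fragment with two decimal references (Devanagari letters).
raw = "&#2325;&#2350; text"
decoded = decode_ncr(raw)

# The decoded string can then be encoded as UTF-8 and sent to the indexer.
print(decoded.encode("utf-8"))
```

Decoding before indexing means the index and the \u-style queries both work on the same Unicode characters.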