Re: new to lucene- some questions regarding internals

2015-08-11 Thread Erick Erickson
1-3 are really answered by the same explanation: When you open a searcher, Lucene "knows" what all the closed segments are (i.e., the last commit point). And you can't commit when only part of a document has been written to the current segment. You can think of commits as atomic at the document le

new to lucene- some questions regarding internals

2015-08-11 Thread Yechiel Feffer
Hi. 1. As I understand it, Lucene prepares the documents of the search result lazily, using the docId in the ScoreDoc. What happens if the document "pointed" to by the ScoreDoc is deleted in the meantime, i.e. the docId is no longer relevant (maybe assigned to a different document)? 2. when a docu

Re: new to Lucene

2015-08-07 Thread Erick Erickson
search too. "Information Technology in Education" as in your question can be searched as phrase query. Regards, Modassar On Fri, Aug 7, 2015 at 1:07 PM, Nantha Kumar Subramaniam <nanthaku...@oum.edu.my> wrote: Good day

Re: new to Lucene

2015-08-07 Thread Modassar Ather
f single term search and phrase search too. "Information Technology in Education" as in your question can be searched as phrase query. Regards, Modassar On Fri, Aug 7, 2015 at 1:07 PM, Nantha Kumar Subramaniam < nanthaku...@oum.edu.my> wrote: > Good day > I am new to Lucene

new to Lucene

2015-08-07 Thread Nantha Kumar Subramaniam
Good day I am new to Lucene and have started to explore Lucene. I have questions: I have a book in which all the chapters are in pdf. I plan to index all these individual chapters in Lucene using Tika for the text extraction. 1. For the indexing of these chapters, how many fields that need to

Re: NEW TO LUCENE

2012-03-05 Thread Saurabh Gokhale
this message in context: http://lucene.472066.n3.nabble.com/NEW-TO-LUCENE-tp3794428p3794428.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. To unsubscribe, e-mail: java-user-unsubscr...

Re: NEW TO LUCENE

2012-03-05 Thread Shashi Kant
have worked on it. So, if anyone could relate it and give any start. -- View this message in context: http://lucene.472066.n3.nabble.com/NEW-TO-LUCENE-tp3794428p3794428.html Sent from the Lucene - Java

NEW TO LUCENE

2012-03-05 Thread rahul reddy
Hi, I'm new to Lucene. Can anyone tell me how I can start learning about it with the code base? I have knowledge of the Endeca search engine and have worked on it. So, if anyone could relate it and give any start. -- View this message in context: http://lucene.472066.n3.nabble.com/N

Re: new to lucene, non standard index

2011-05-06 Thread Michael Sokolov
I believe creating a large number of fields is not a good match w/the underlying architecture, and you'd be better off w/a large number of documents/small number of fields, where the same field occurs in every document. There is some discussion here: http://markmail.org/message/hcmt5syca7zdeac

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
ts that I indexed for the given query keyword. private static final QueryParser parser = new QueryParser(Version.LUCENE_30, "keywords", new StandardAnalyzer(Version.LUCENE_30));

Re: new to lucene, non standard index

2011-05-05 Thread Mike Sokolov
for(ScoreDoc scoreDoc : hits.scoreDocs) { Document doc = this.is.doc(scoreDoc.doc); String hash = doc.get("hash"); System.out.println(hash + " " + doc.get(q+"sortby") + " " + hash); } } I am p

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
; System.out.println("Found " + hits.totalHits + " document(s) (in " + (end - start) + " milliseconds) that matched query '" + q + "

Re: new to lucene, non standard index

2011-05-05 Thread Chris Schilling
long end = System.currentTimeMillis(); System.out.println("Found " + hits.totalHits + " document(s) (in " + (end - start) + " milliseconds) that matched query '" +

Re: new to lucene, non standard index

2011-05-05 Thread Mike Sokolov
oc : hits.scoreDocs) { Document doc = this.is.doc(scoreDoc.doc); String hash = doc.get("hash"); System.out.println(hash + " " + doc.get(q+"sortby") + " " + hash); } } I am pretty new

new to lucene, non standard index

2011-05-05 Thread Chris Schilling
ry '" + q + "':"); for(ScoreDoc scoreDoc : hits.scoreDocs) { Document doc = this.is.doc(scoreDoc.doc); String hash = doc.get("hash"); System.out.println(hash

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Matthew, Ok, thanks for the clarifications. When I have some quiet time, I'll try to re-do the tests I did earlier and post back if any questions. Thanks again, Jim Matthew Hall wrote: > Oh.. no. > > If you specifically include a fieldname: blah in your clause, you don't > need a Mult

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Oh.. no. If you specifically include a fieldname: blah in your clause, you don't need a MultiFieldQueryParser. The purpose of the MFQP is to turn queries like this "blah" automatically into this "field1: blah" AND "field2: blah" AND "field3: blah" (Or OR if you set it up properly) When you
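
The expansion described above can be modeled in a few lines of plain Java. This is only a toy sketch of the idea, not Lucene's MultiFieldQueryParser: the class name `MultiFieldExpansion` and the method `expand` are hypothetical, and it handles a single bare term rather than a full query syntax.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MultiFieldExpansion {

    // Hypothetical helper: rewrites a bare term into one "field:term" clause
    // per field, joined by the given operator ("AND" or "OR").
    public static String expand(String term, List<String> fields, String operator) {
        return fields.stream()
                .map(field -> field + ":" + term)
                .collect(Collectors.joining(" " + operator + " "));
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("field1", "field2", "field3");
        // prints: field1:blah OR field2:blah OR field3:blah
        System.out.println(expand("blah", fields, "OR"));
    }
}
```

A query that already names its field (for example `summary:foofoo`) would bypass this step entirely, which is why no multi-field parser is needed in that case.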

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Matthew, I'll keep your comments in mind, but I'm still confused about something. I currently haven't changed much in the demo, other than adding that doc.add for "summary". With JUST that doc.add, having done my reading, I kind of expected NOT to be able to search on the "summary" at all, but

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
You can choose to do either. Having items in multiple fields allows you to apply field-specific boosts, thus making matches in certain fields more important than others. But, if that's not something that you care about, the second technique is useful in that it vastly simplifies your index str

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Hi Matthew and Ian, Thanks, I'll try that, but, in the meantime, I've been doing some reading (Lucene in Action), and on pg. 159, section 5.3, it discusses "Querying on multiple fields". I was just about to try what's described in that section, i.e., using MultiFieldQueryParser.parse(), o

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Yeah, Ian has it nailed on the head here. Can't believe I missed it in the initial writeup. Matt Ian Lea wrote: Jim Glancing at SearchFiles.java I can see Analyzer analyzer = new StandardAnalyzer(); ... QueryParser parser = new QueryParser(field, analyzer); ... Query query = parser.parse(li

Re: New to Lucene - some questions about demo

2009-07-28 Thread Ian Lea
Jim Glancing at SearchFiles.java I can see Analyzer analyzer = new StandardAnalyzer(); ... QueryParser parser = new QueryParser(field, analyzer); ... Query query = parser.parse(line); so any query term you enter will be run through StandardAnalyzer which will, amongst other things, convert it t

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Ian and Matthew, I've tried "foofoo", "summary:foofoo", "FooFoo", and "summary:FooFoo". No results returned for any of those :(. Also, Matthew, I bounced Tomcat after running IndexFiles, so I don't think that's the problem either :(... I looked at the SearchFiles.java code, and it looks like

Re: New to Lucene - some questions about demo

2009-07-28 Thread Ian Lea
Hi Field.Index.NOT_ANALYZED means it will be stored as is i.e. "FooFoo" in your example, and if you search for "foofoo" it won't match. A search for "FooFoo" would, assuming that your search terms are not being lowercased. -- Ian. On Tue, Jul 28, 2009 at 1:56 PM, Ohaya wrote: > Hi, > > I'm
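
The case mismatch Ian describes can be illustrated with a toy model in plain Java. This is a sketch under stated assumptions, not Lucene code: `ExactTermLookup`, `addNotAnalyzed`, and `matches` are hypothetical names, the "index" is just a set of terms, and the `lowercaseQuery` flag stands in for a query-side analyzer that lowercases input.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ExactTermLookup {

    // Toy stand-in for the terms of one field.
    private final Set<String> terms = new HashSet<>();

    // Models a not-analyzed field: the value is kept as one unmodified term.
    public void addNotAnalyzed(String value) {
        terms.add(value);
    }

    // lowercaseQuery models a query analyzer that lowercases the search term.
    public boolean matches(String queryTerm, boolean lowercaseQuery) {
        String term = lowercaseQuery ? queryTerm.toLowerCase(Locale.ROOT) : queryTerm;
        return terms.contains(term);
    }

    public static void main(String[] args) {
        ExactTermLookup index = new ExactTermLookup();
        index.addNotAnalyzed("FooFoo");
        System.out.println(index.matches("foofoo", false)); // stored term differs in case
        System.out.println(index.matches("FooFoo", true));  // query is lowercased first
        System.out.println(index.matches("FooFoo", false)); // exact term, untouched query
    }
}
```

In this model only the last lookup succeeds, which matches the behaviour reported in the thread: an as-is term can only be found by a query term that reaches the index with identical casing.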

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Oh, also check to see which Analyzer the demo webapp/indexer is using. It's entirely possible the analyzer that has been chosen isn't lowercasing input, which could also cause you issues. I'd be willing to bet your issue lies in one of these two problems I've mentioned ^^ Matt Matthew Hall

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Restart tomcat. When the indexes are read in at initialization time they are a snapshot of what the indexes contained at that moment. Unless the demo specifically either closes its IndexReader and creates a new one, or calls IndexReader.reopen periodically (Which I don't remember it doing) y
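
The snapshot behaviour described above can be sketched with a toy model in plain Java. None of this is Lucene API: `ToyIndex`, `ToyReader`, `openReader`, and `numDocs` are hypothetical names chosen for illustration, and the reader simply copies the document list when it is opened.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SnapshotReaderDemo {

    // Toy index: a growing list of documents.
    public static class ToyIndex {
        private final List<String> docs = new ArrayList<>();

        public void add(String doc) {
            docs.add(doc);
        }

        // Each reader sees the index as it was at the moment it was opened.
        public ToyReader openReader() {
            return new ToyReader(docs);
        }
    }

    // Toy reader: holds an immutable copy taken at open time.
    public static class ToyReader {
        private final List<String> snapshot;

        ToyReader(List<String> docs) {
            this.snapshot = Collections.unmodifiableList(new ArrayList<>(docs));
        }

        public int numDocs() {
            return snapshot.size();
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("doc indexed before the webapp started");
        ToyReader oldReader = index.openReader();

        index.add("doc indexed after the webapp started");
        ToyReader newReader = index.openReader();

        System.out.println("old reader sees " + oldReader.numDocs() + " doc(s)");
        System.out.println("new reader sees " + newReader.numDocs() + " doc(s)");
    }
}
```

In this model, restarting the webapp corresponds to discarding the old reader and opening a new one, which is the only way the later document becomes visible.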

New to Lucene - some questions about demo

2009-07-28 Thread Ohaya
Hi, I'm just starting to work with Lucene, and I guess that I learn best by working with code, so I've started with the demos in the Lucene distribution. I got the IndexFiles.java and IndexHTML.java working, and also the luceneweb.war is deployed to Tomcat. I used IndexFiles.java to index

RE: search a subdirectory (New to Lucene)

2006-02-23 Thread John Hamilton
EMAIL PROTECTED] Sent: Wednesday, February 22, 2006 3:18 PM To: java-user@lucene.apache.org Subject: Re: search a subdirectory (New to Lucene) I presume by saying "subdirectory" you're referring to filesystem directories and you're indexing a directory tree of files. If you

Re: search a subdirectory (New to Lucene)

2006-02-22 Thread Erik Hatcher
or each entire file. Slicing the granularity of a domain into Documents is a fascinating topic :) Erik On Feb 22, 2006, at 1:00 PM, John Hamilton wrote: I'm new to Lucene and was wondering what is the best way to perform a search on a subdirectory or subdirectories within t

search a subdirectory (New to Lucene)

2006-02-22 Thread John Hamilton
I'm new to Lucene and was wondering what is the best way to perform a search on a subdirectory or subdirectories within the index? My thought at this point is to build a query to first search for files in the required directory(ies) and then use that query to make a QueryFilter and use