Re:Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr
Thank you so much, Steve. Your reply is very helpful. At 2016-06-16 23:01:18, "Steve Rowe" wrote: >Hi dr, > >Unicode’s character property model is described here: >. > >Wikipedia has a description of Unicode character properties: >

Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread Steve Rowe
Hi dr, Unicode’s character property model is described here: . Wikipedia has a description of Unicode character properties: JFlex allows you to refer to the set of characters that have a given Unicode

Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr
Hi guys, Currently, I'm looking into the rules of StandardTokenizer, but have run into some problems. As the docs say, StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. Also it is generated by JFlex, a lexer/sc

Re: new to lucene- some questions regarding internals

2015-08-11 Thread Erick Erickson
1-3 are really answered by the same explanation: When you open a searcher, Lucene "knows" what all the closed segments are (i.e., the last commit point). And you can't commit when only part of a document has been written to the current segment. You can think of commits as atomic at the document le

new to lucene- some questions regarding internals

2015-08-11 Thread Yechiel Feffer
Hi 1. As I understand it, Lucene prepares the documents of the search result lazily, using the docId in the ScoreDoc. What happens if the document "pointed" to by the ScoreDoc is deleted in the meantime, i.e. the docId is no longer relevant (maybe assigned to a different document)? 2. when a docu

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Matthew, Ok, thanks for the clarifications. When I have some quiet time, I'll try to re-do the tests I did earlier and post back if any questions. Thanks again, Jim Matthew Hall wrote: > Oh.. no. > > If you specifically include a fieldname: blah in your clause, you don't > need a Mult

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Oh.. no. If you specifically include a fieldname: blah in your clause, you don't need a MultiFieldQueryParser. The purpose of the MFQP is to turn queries like this "blah" automatically into this "field1: blah" AND "field2: blah" AND "field3: blah" (Or OR if you set it up properly) When you
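The expansion Matthew describes can be pictured with plain strings. The following is only a rough sketch: it does not call Lucene at all, the class name MultiFieldExpansion and its expand method are made up for illustration, and the real MultiFieldQueryParser builds Query objects rather than text.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: mimics the described expansion with plain strings.
// MultiFieldExpansion is a hypothetical name, not a Lucene class.
class MultiFieldExpansion {

    // Expand a bare term into one "field:term" clause per field,
    // joined by the given operator ("AND" or "OR").
    static String expand(String term, List<String> fields, String operator) {
        return fields.stream()
                .map(field -> field + ":" + term)
                .collect(Collectors.joining(" " + operator + " "));
    }

    public static void main(String[] args) {
        List<String> fields = List.of("field1", "field2", "field3");
        System.out.println(expand("blah", fields, "AND"));
        System.out.println(expand("blah", fields, "OR"));
    }
}
```

A query that already names its field, such as summary:foofoo, needs no such expansion, which is the point made at the start of the message.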

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Matthew, I'll keep your comments in mind, but I'm still confused about something. I currently haven't changed much in the demo, other than adding that doc.add for "summary". With JUST that doc.add, having done my reading, I kind of expected NOT to be able to search on the "summary" at all, but

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
You can choose to do either. Having items in multiple fields allows you to apply field-specific boosts, thus making matches in certain fields more important than others. But if that's not something that you care about, the second technique is useful in that it vastly simplifies your index str

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Hi Matthew and Ian, Thanks, I'll try that, but, in the meantime, I've been doing some reading (Lucene in Action), and on pg. 159, section 5.3, it discusses "Querying on multiple fields". I was just about to try what's described in that section, i.e., using MultiFieldQueryParser.parse(), o

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Yeah, Ian has it nailed on the head here. Can't believe I missed it in the initial writeup. Matt Ian Lea wrote: Jim Glancing at SearchFiles.java I can see Analyzer analyzer = new StandardAnalyzer(); ... QueryParser parser = new QueryParser(field, analyzer); ... Query query = parser.parse(li

Re: New to Lucene - some questions about demo

2009-07-28 Thread Ian Lea
Jim Glancing at SearchFiles.java I can see Analyzer analyzer = new StandardAnalyzer(); ... QueryParser parser = new QueryParser(field, analyzer); ... Query query = parser.parse(line); so any query term you enter will be run through StandardAnalyzer which will, amongst other things, convert it t

Re: New to Lucene - some questions about demo

2009-07-28 Thread ohaya
Ian and Matthew, I've tried "foofoo", "summary:foofoo", "FooFoo", and "summary:FooFoo". No results returned for any of those :(. Also, Matthew, I bounced Tomcat after running IndexFiles, so I don't think that's the problem either :(... I looked at the SearchFiles.java code, and it looks like

Re: New to Lucene - some questions about demo

2009-07-28 Thread Ian Lea
Hi Field.Index.NOT_ANALYZED means it will be stored as is i.e. "FooFoo" in your example, and if you search for "foofoo" it won't match. A search for "FooFoo" would, assuming that your search terms are not being lowercased. -- Ian. On Tue, Jul 28, 2009 at 1:56 PM, Ohaya wrote: > Hi, > > I'm
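The case mismatch Ian describes can be reproduced without Lucene. This is a minimal simulation, assuming only that a non-analyzed value is stored verbatim and that the query side lowercases its terms. ExactTermIndex and its methods are hypothetical names, and a plain HashSet stands in for the index.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Simulation with a plain Set, not Lucene code; ExactTermIndex is a hypothetical name.
class ExactTermIndex {
    private final Set<String> terms = new HashSet<>();

    // Store the value untouched, as a non-analyzed field would.
    void addNotAnalyzed(String value) {
        terms.add(value);
    }

    // Exact, case-sensitive lookup of a single term.
    boolean matches(String queryTerm) {
        return terms.contains(queryTerm);
    }

    // Stand-in for an analyzer that lowercases query terms.
    static String lowercaseLikeAnalyzer(String queryTerm) {
        return queryTerm.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        ExactTermIndex index = new ExactTermIndex();
        index.addNotAnalyzed("FooFoo");

        System.out.println(index.matches("FooFoo"));                        // true
        System.out.println(index.matches("foofoo"));                        // false
        // A lowercased query term can no longer match the stored value.
        System.out.println(index.matches(lowercaseLikeAnalyzer("FooFoo"))); // false
    }
}
```

The third lookup shows why every variant of the search can fail: if the query side lowercases "FooFoo" before looking it up, the verbatim stored term is unreachable.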

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Oh, also check to see which Analyzer the demo webapp/indexer is using. It's entirely possible the analyzer that has been chosen isn't lowercasing input, which could also cause you issues. I'd be willing to bet your issue lies in one of these two problems I've mentioned ^^ Matt Matthew Hall

Re: New to Lucene - some questions about demo

2009-07-28 Thread Matthew Hall
Restart tomcat. When the indexes are read in at initialization time they are a snapshot of what the indexes contained at that moment. Unless the demo specifically either closes its IndexReader and creates a new one, or calls IndexReader.reopen periodically (Which I don't remember it doing) y

New to Lucene - some questions about demo

2009-07-28 Thread Ohaya
Hi, I'm just starting to work with Lucene, and I guess that I learn best by working with code, so I've started with the demos in the Lucene distribution. I got the IndexFiles.java and IndexHTML.java working, and also the luceneweb.war is deployed to Tomcat. I used IndexFiles.java to index

Re: Some questions...

2007-10-01 Thread Karl Wettin
On 1 Oct 2007 at 14:41, sandeep chawla wrote: > 2- Is there a way I can get the term.docFrq() for a particular set of documents.. Using TermDocs or the TermFreqVector. -- karl

Some questions...

2007-10-01 Thread sandeep chawla
Hi, I want to ask two questions here: 1- Does Lucene provide a tokenizer which can use a string as a delimiter? If not, could someone please give me some gyan :) about how to do it. 2- Is there a way I can get the term.docFrq() for a particular set of documents.. I mean if I have 100 documents and
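For question 1, the splitting itself can be sketched in plain Java. This is not a Lucene Tokenizer and says nothing about what Lucene ships. StringDelimiterSplit is a hypothetical helper that only shows how a multi-character string can be treated as a literal delimiter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java sketch; StringDelimiterSplit is a hypothetical helper, not a Lucene class.
class StringDelimiterSplit {

    // Split text on a literal delimiter string.
    // Pattern.quote keeps regex metacharacters in the delimiter from being interpreted.
    // Note: String.split drops trailing empty tokens.
    static List<String> split(String text, String delimiter) {
        return Arrays.asList(text.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        System.out.println(split("red||green||blue", "||")); // [red, green, blue]
    }
}
```

To use this inside an analysis chain, the same splitting logic would have to be wrapped in a custom tokenizer, which is beyond this sketch.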

Re: Some questions on transactions

2007-09-12 Thread Michael McCandless
"Simon Wistow" <[EMAIL PROTECTED]> wrote: > I'm looking at doing a system which looks something like this - I > have an IndexSearcher open with an on-disk index but all writes go to a > RAM-based IndexWriter. Periodically I do > > 1. Close IndexSearcher > 2. Open new IndexWriter i

Some questions on transactions

2007-09-12 Thread Simon Wistow
I'm looking at doing a system which looks something like this - I have an IndexSearcher open with an on-disk index but all writes go to a RAM-based IndexWriter. Periodically I do 1. Close IndexSearcher 2. Open new IndexWriter in same location 3. Use addIndexes with old