Re: Using Lucene to model ownership of documents

2016-06-16 Thread Denis Bazhenov
The speed for a) and b) should be the same, at least from a conceptual point of view. The number of terms generated for each scenario is equal. Therefore, index size and vocabulary size should be the same. I’m wondering why there is a difference. It seems like there is some penalty for writing/readi

RE: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Hi Mike, Yes, we are getting indexReader instance from the active Directory. We are using MultiReader to obtain instance of indexSearcher. Thanks, Mukul From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Friday, June 17, 2016 12:56 AM To: Mukul Ranjan Cc: Lucene Users Subject:

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Michael McCandless
But do you open any near-real-time readers from this writer? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 16, 2016 at 1:01 PM, Mukul Ranjan wrote: > Hi Michael, > > > > Thanks for your reply. > > I’m running it on windows. I have checked my code, I’m closing IndexWriter > after a

RE: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Hi Michael, Thanks for your reply. I’m running it on Windows. I have checked my code; I’m closing IndexWriter after adding a document to it. We don’t always get this issue, but its frequency is high in our application. Can you please provide your suggestion? Thanks, Mukul From: Michael McC

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Michael McCandless
Are you running on Windows? This is not a LockFactory issue ... it's likely because you closed IndexWriter and then opened a new one before closing the NRT readers you had opened from the first writer? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 16, 2016 at 6:19 AM, Mukul Ra

Re: Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr
Thank you so much, Steve. Your reply is very helpful. At 2016-06-16 23:01:18, "Steve Rowe" wrote: >Hi dr, > >Unicode’s character property model is described here: >. > >Wikipedia has a description of Unicode character properties: >

Re: Using Lucene to model ownership of documents

2016-06-16 Thread Geebee Coder
Thank you all. Michael, do you mean grouping customers by categories (e.g. customer A has premium access and so does customer B, so they will have access to the same set of documents)? If that's the case, unfortunately we don't have such categories of customers; their access rights are over specific do

Re: Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread Steve Rowe
Hi dr, Unicode’s character property model is described here: . Wikipedia has a description of Unicode character properties: JFlex allows you to refer to the set of characters that have a given Unicode

Re: Using Lucene to model ownership of documents

2016-06-16 Thread Michael Wilkowski
Definitely b). I would also suggest groups and expanding user groups at user sign-in time. MW On Thu, Jun 16, 2016 at 12:36 PM, Ian Lea wrote: > I'd definitely go for b). The index will of course be larger for every > extra bit of data you store but it doesn't sound like this would make much >

IndexWriterConfig.readerPooling option...

2016-06-16 Thread Ravikumar Govindarajan
Came across a JIRA filed for pooling IndexReaders: https://issues.apache.org/jira/browse/LUCENE-2297 For every commit/delete/update cycle, IndexWriter opens a bunch of SegmentReaders, does the job, and closes them. Does the JIRA aim to re-use the SegmentReaders for all commit-cycles till they are fina

Some questions about StandardTokenizer and UNICODE Regular Expressions

2016-06-16 Thread dr
Hi guys, Currently, I'm looking into the rules of StandardTokenizer, but ran into some problems. As the docs say, StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. Also it is generated by JFlex, a lexer/sc

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Ian Lea
Sounds to me like it's related to the index not having been closed properly or still being updated or something. I'd worry about that. -- Ian. On Thu, Jun 16, 2016 at 11:19 AM, Mukul Ranjan wrote: > Hi, > > I'm observing below exception while getting instance of indexWriter- > > java.lang.Ill

Re: Using Lucene to model ownership of documents

2016-06-16 Thread Ian Lea
I'd definitely go for b). The index will of course be larger for every extra bit of data you store but it doesn't sound like this would make much difference. Likewise for speed of indexing. -- Ian. On Wed, Jun 15, 2016 at 2:25 PM, Geebee Coder wrote: > Hi there, > I would like to use Lucene

LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Hi, I'm observing the exception below while getting an instance of IndexWriter: java.lang.IllegalArgumentException: Directory MMapDirectory@"directoryName" lockFactory=org.apache.lucene.store.NativeFSLockFactory@1ec79746 still has pending deleted files; cannot initialize IndexWriter Is it related to