Sanne,

That error looks suspiciously similar to an old Lucene error they had. Could they have regressed?

John Griffin

On Sep 27, 2009 2:00pm, Łukasz Moreń <lukasz.mo...@gmail.com> wrote:
You can try to increase the TURNS_NUM (I've tried with 1000) and THREADS_NUM (200) fields in InfinispanDirectoryTest to make it more probable. The same problem also appears in InfinispanDirectoryProviderTest.

An example stacktrace is:


21:22:44,441 ERROR InfinispanDirectoryTest:142 - Error
java.io.IOException: File [ segments_nl ] for index [ indexName ] was not found
    at org.hibernate.search.store.infinispan.InfinispanIndexIO$InfinispanIndexInput.<init>(InfinispanIndexIO.java:79)
    at org.hibernate.search.store.infinispan.InfinispanDirectory.openInput(InfinispanDirectory.java:201)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:214)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:227)
    at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:55)
    at org.hibernate.search.test.directoryProvider.infinispan.CacheTestSupport.doReadOperation(CacheTestSupport.java:106)
    at org.hibernate.search.test.directoryProvider.infinispan.InfinispanDirectoryTest$InfinispanDirectoryThread.run(InfinispanDirectoryTest.java:130)

Cheers,
Lukasz

2009/9/27 Sanne Grinovero <sanne.grinov...@gmail.com>

Hi Łukasz,

I'm unable to reproduce the problem. You said it happens randomly;
I've tried several times and I'm not getting errors. Is there
something I could do to make it happen?
Could you share a stacktrace?



Anyway, if you are confident it's about the segments getting lost while
they are still being read, you could introduce a per-segment usage
counter: it starts at 1 to mark the segment as "most current", gets +1
for each reader opening it, -1 for each reader closing it, and -1 when
the segment is deleted.
Each decrementing method should check whether the value has reached 0
and only then really delete the segment. This counting would be easy
to add inside the Directory.
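
To make the counting rule concrete, here is a minimal single-JVM sketch in
Java. The class SegmentUsageCounter and its method names are hypothetical
(they are not part of Lucene, Infinispan or Hibernate Search), and an
AtomicInteger stands in for wherever the Directory would really keep the
count.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-segment usage counter; an illustration only. */
final class SegmentUsageCounter {

    // Starts at 1: the segment is "most current" and referenced by the index.
    private final AtomicInteger votes = new AtomicInteger(1);

    /** A reader opens the segment (+1). Refused once the count has reached 0. */
    boolean acquire() {
        while (true) {
            int current = votes.get();
            if (current <= 0) {
                return false; // already deleted: the caller must re-read the segments list
            }
            if (votes.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /**
     * A reader closes the segment, or the writer deletes it (-1).
     * Returns true when the count reached 0, i.e. the caller should
     * now really delete the segment data.
     */
    boolean release() {
        while (true) {
            int current = votes.get();
            if (current <= 0) {
                throw new IllegalStateException("released more often than acquired");
            }
            if (votes.compareAndSet(current, current - 1)) {
                return current == 1;
            }
        }
    }
}
```

With this rule, a delete issued by the writer while a reader still holds the
segment only drops the count to 1; the data goes away when that reader
closes.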

When opening a new IndexReader, you
1) get the SegmentsInfo
2) increment all counters (eager lock; verify each is > 0, otherwise set
   the changed counters back, get a new SegmentsInfo and restart from 1)
3) get the needed segments
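
The three steps could look roughly like the following Java sketch. All names
here (ReaderOpenSketch, register, tryIncrement, decrement,
pinCurrentSegments) are hypothetical, the segments list is simply a list of
file names handed in by a Supplier, and synchronized methods stand in for the
Infinispan eager lock; it is an illustration of the retry logic, not a
Directory implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of "pin all listed segments or retry". */
final class ReaderOpenSketch {

    // segment name -> usage count; a missing key means the segment is deleted
    private final Map<String, Integer> counters = new HashMap<>();

    /** A newly written segment starts with a count of 1 ("most current"). */
    synchronized void register(String segment) {
        counters.put(segment, 1);
    }

    /** +1, but only if the segment still exists with a count above 0. */
    synchronized boolean tryIncrement(String segment) {
        Integer count = counters.get(segment);
        if (count == null || count <= 0) {
            return false;
        }
        counters.put(segment, count + 1);
        return true;
    }

    /** -1; when the count reaches 0 the entry is removed (the real delete). */
    synchronized void decrement(String segment) {
        Integer count = counters.get(segment);
        if (count == null) {
            return;
        }
        if (count <= 1) {
            counters.remove(segment);
        } else {
            counters.put(segment, count - 1);
        }
    }

    synchronized int count(String segment) {
        return counters.getOrDefault(segment, 0);
    }

    /** Steps 1 to 3: returns the segments that are now safe to fetch. */
    List<String> pinCurrentSegments(Supplier<List<String>> segmentsInfo, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<String> listed = segmentsInfo.get();   // 1) get the segments list
            List<String> pinned = new ArrayList<>();
            boolean allPinned = true;
            for (String segment : listed) {             // 2) increment all counters
                if (tryIncrement(segment)) {
                    pinned.add(segment);
                } else {
                    allPinned = false;
                    break;
                }
            }
            if (allPinned) {
                return pinned;                          // 3) caller fetches these segments
            }
            for (String segment : pinned) {             // set the changed counters back
                decrement(segment);
            }
        }
        throw new IllegalStateException("no consistent segments list after " + maxAttempts + " attempts");
    }
}
```

A reader would call decrement for each pinned segment when it closes, which
is the "-1 closing" vote from the description above.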



Getting a counter should be much faster than getting a segment in case
the data is downloaded from another node, so we can store the counter
under a different key that still relates to the segment.



Sanne



2009/9/23 Łukasz Moreń <lukasz.mo...@gmail.com>:


> I agree that the Infinispan case is not much different from RAMDirectory.
> The major difference is that in RD (and also FileDirectory) changes are not
> batched like in ID. If I do not wrap changes in InfinispanDirectory (simply
> remove tx.begin() from the obtain() method and tx.commit() from release() in
> InfinispanLock) and immediately commit every change made by IW, it works
> well. However, it makes indexing much slower because of the frequent
> replication to other nodes.
> Sanne, it's a good remark that the IW commit is a kind of flush.

>

> I've attached a patch with InfinispanDirectory; the failing test is
> testDirectoryWithMultipleThreads in the InfinispanDirectoryTest class. It
> fails randomly. I think the problem is that the Infinispan commit on
> lockRelease() in org.apache.lucene.index.IndexWriter (line 1658) comes
> after the IW commit() (line 1654).

>

>> Is it because the IndexWriter only cleans files if no IndexReaders are
>> reading them (how would that be detected)?

>

> It can happen if the IndexWriter cleans a file and an IndexReader then
> tries to access that cleaned file.

>

> 2009/9/23 Sanne Grinovero <sanne.grinov...@gmail.com>

>>

>> I agree, it should work the same way. The IndexWriter cleans files
>> whenever it likes to; it doesn't try to detect readers, and this
>> shouldn't have any effect on the working of readers.
>> The IndexReader opens the "SegmentsInfo" first, and immediately
>> after** gets a reference to the segments listed in this SegmentsInfo.
>> No IndexWriter will ever change an existing segment, only add new
>> files or eventually delete old ones (segment merge, optimize).
>> The deletion of segments is the interesting subject: when using files
>> it uses "delete at last close", which works because the IRs needing it
>> have it opened already**; when using the RAMDirectory they have a
>> reference preventing garbage collection.

>>

>> ( the two "**" are assuming the same event occurred correctly,

>> otherwise an exception is thrown at opening)

>>

>> When using Infinispan it shouldn't be much different from the
>> RAMDirectory, should it? Even if the needed segment is deleted, the IR
>> holds a reference to the Java object locally since it was opened.

>>

>> Łukasz, do you have a failing test?

>>

>> Sanne

>>

>> 2009/9/23 Emmanuel Bernard <emman...@hibernate.org>:

>> > Conceptually I don't understand why it works in a pure file system
>> > directory (i.e. an IndexReader can go and process queries while the
>> > IndexWriter goes about its business) and not when using Infinispan.
>> > Is it because the IndexWriter only cleans files if no IndexReaders are
>> > reading them (how would that be detected)?

>> > On 22 sept. 09, at 20:46, Łukasz Moreń wrote:

>> >

>> > I need to provide the same lifecycle for the IndexWriter as for the
>> > Infinispan tx: when the IW is created the tx is started, when the IW is
>> > committed the tx is committed. This ensures that an IndexReader doesn't
>> > read old data from the directory.
>> > The Infinispan transaction can be started when the IW acquires the lock,
>> > but committing it on IW lock release, as is done so far, causes a problem:

>> >

>> > index writer close {
>> >   index writer commit(); // changes are visible to IndexReaders
>> >
>> >   // IndexReader starts reading here, i.e. tries to access file "A"
>> >
>> >   index writer lockRelease(); // changes in the Infinispan directory are
>> >                               // committed, file "A" is removed, the
>> >                               // IndexReader cannot find it and crashes
>> > }

>> >

>> > I think the Infinispan tx has to be committed just before the IW commit,
>> > and the problem is where to put that in the code.

>> >

>> > On 22 September 2009 18:24, Emmanuel Bernard <emman...@hibernate.org> wrote:

>> >>

>> >> Can you explain in more detail what is going on?
>> >> Aside from that, Workspace has been Sanne's baby lately, so he will be
>> >> the best placed to see what design will work in HSearch. That being
>> >> said, I don't like the idea of subclassing / overriding very much. In my
>> >> experience, it has led to more bad and unmaintainable code than anything
>> >> else.

>> >> On 22 sept. 09, at 02:16, Łukasz Moreń wrote:

>> >>

>> >> Hi,

>> >>

>> >> Thanks for the explanation.
>> >> Maybe I had better concentrate on the first release and postpone
>> >> distributed writing.
>> >>
>> >> There is already a LockStrategy that uses Infinispan. When using it I
>> >> was wrapping the changes made by the IndexWriter in an Infinispan
>> >> transaction, for performance reasons: the transaction was started on
>> >> lock obtain and committed on lock release. However, committing the Ispn
>> >> transaction on lock release is not a good idea, since the IndexWriter
>> >> calls the index commit before the lock is released (and the Ispn
>> >> transaction is committed).
>> >> I was thinking of overriding the Workspace class and its getIndexWriter
>> >> (start Infinispan tx) and commitIndexWriter (commit tx) methods to wrap
>> >> the IndexWriter lifecycle, but this needs a few other changes. Any other
>> >> ideas?

>> >>

>> >> Cheers,

>> >> Lukasz

>> >>

>> >> 2009/9/21 Sanne Grinovero <sanne.grinov...@gmail.com>

>> >>>

>> >>> Hi Łukasz,

>> >>> you have rightful concerns, because the way the IndexWriter tries to
>> >>> acquire the lock will bring some trouble. As far as I remember we
>> >>> decided to avoid multiple writer nodes in this first release for this
>> >>> reason (is that written in your docs?)

>> >>>

>> >>> Actually it shouldn't be very hard to do, as the LockStrategy is

>> >>> pluggable (see changes from HSEARCH-345)

>> >>> and you could implement one delegating to an Infinispan eager lock on

>> >>> some key,

>> >>> like the default LockStrategy takes a file lock in the index

>> >>> directory.

>> >>>

>> >>> Maybe it's simpler to support this distributed writing instead of
>> >>> sending the queue to some single (elected) node? It would be cool, as
>> >>> the Document Analysis effort would be distributed, but I have no idea
>> >>> whether this would be more or less efficient than a single node
>> >>> writing; it could bring some huge data transfers along the wire during
>> >>> segment merging (basically fetching the whole index data at each node
>> >>> performing a segment merge); maybe you'll need to play with the
>> >>> IndexWriter settings
>> >>> (http://docs.jboss.org/hibernate/stable/search/reference/en/html_single/#lucene-indexing-performance)
>> >>> and probably need to find the sweet spot for "merge_factor".
>> >>> I just saw that MergePolicy is now re-implementable, but I hope
>> >>> that won't be needed.

>> >>>

>> >>> Sanne

>> >>>

>> >>> 2009/9/21 Łukasz Moreń <lukasz.mo...@gmail.com>:

>> >>> > Hi,

>> >>> >

>> >>> > I'm wondering if it is reasonable to have multiple threads/nodes that
>> >>> > modify indexes in a Lucene Directory based on Infinispan. Let's
>> >>> > assume that two nodes try to update the index at the same time. The
>> >>> > first one creates an IndexWriter and obtains the write lock. There is
>> >>> > a high probability that the second node throws
>> >>> > LockObtainFailedException (as only one IndexWriter is allowed on a
>> >>> > single index) and the index is not modified. How should that be
>> >>> > handled? Should there always be only one node that makes changes in
>> >>> > the index?

>> >>> >

>> >>> > Cheers,

>> >>> > Lukasz

>> >>> >

>> >>> > On 15 September 2009 01:39, Łukasz Moreń <lukasz.mo...@gmail.com> wrote:

>> >>> >>

>> >>> >> Hi,

>> >>> >>

>> >>> >> Using JMeter, I wanted to check that the Infinispan directory does
>> >>> >> not crash under heavy load in "real" use, and to compare its
>> >>> >> performance with none/other directories.
>> >>> >> However, a problem appeared when multiple IndexWriters try to modify
>> >>> >> the index (test InfinispanDirectoryTest): random deadlocks and
>> >>> >> Lucene exceptions. An IndexWriter tries to access files in the index
>> >>> >> that were removed before. I'm looking into it, but I don't have a
>> >>> >> good idea yet.

>> >>> >>

>> >>> >> Concerning the last part, I think a similar thing is done in
>> >>> >> InfinispanDirectoryProviderTest. Many threads are making changes and
>> >>> >> searching (not checking if the db is in sync with the index).
>> >>> >> When the threads finish their work, I check with a Lucene query
>> >>> >> whether the index contains as many results as expected. Maybe you
>> >>> >> meant something else?
>> >>> >> It would be good to run each node in a different VM.

>> >>> >>

>> >>> >>> Great ! Looking forward to it. What state are things in at the

>> >>> >>> moment

>> >>> >>> if I want to play around with it ?

>> >>> >>

>> >>> >> It should work with one master (updates the index) and one or many
>> >>> >> slave nodes (send changes to the master). I tried with one master
>> >>> >> and one slave (with both the jms and jgroups backends) and it worked
>> >>> >> OK. It still fails if multiple nodes want to modify the index.

>> >>> >>

>> >>> >> I've attached a patch with the current version.

>> >>> >>

>> >>> >> Cheers,

>> >>> >> Łukasz

>> >>> >>

>> >>> >> 2009/9/13 Michael Neale <michael.ne...@gmail.com>

>> >>> >>>

>> >>> >>> Great ! Looking forward to it. What state are things in at the

>> >>> >>> moment

>> >>> >>> if I want to play around with it ?

>> >>> >>>

>> >>> >>> Sent from my phone.

>> >>> >>>

>> >>> >>> On 13/09/2009, at 7:26 PM, Sanne Grinovero <sanne.grinov...@gmail.com> wrote:

>> >>> >>>

>> >>> >>> > 2009/9/12 Michael Neale <michael.ne...@gmail.com>:

>> >>> >>> >> That does sound pretty cool. Would be nice if the lucene

>> >>> >>> >> indexes

>> >>> >>> >> could scale along with how people will want to use infinispan.

>> >>> >>> >> Probably worth playing with.

>> >>> >>> >

>> >>> >>> > Sure, this is the goal of Łukasz's work; We know compass has

>> >>> >>> > some good Directories, but we're building our own as one based

>> >>> >>> > on Infinispan is not yet available.

>> >>> >>> >

>> >>> >>> >>

>> >>> >>> >> Sent from my phone.

>> >>> >>> >>

>> >>> >>> >> On 13/09/2009, at 8:37 AM, Jeff Ramsdale <jeff.ramsd...@gmail.com> wrote:

>> >>> >>> >>

>> >>> >>> >>> I'm afraid I haven't followed the Infinispan-Lucene implementation
>> >>> >>> >>> closely, but have you looked at the Compass Project?
>> >>> >>> >>> (http://www.compass-project.org/overview.html) It provides a
>> >>> >>> >>> simplified interface to Lucene (optional) as well as Directory
>> >>> >>> >>> implementations built on Terracotta, Gigaspaces and Coherence. The
>> >>> >>> >>> latter, in particular, might be a useful guide for the Infinispan
>> >>> >>> >>> implementation. I believe it's mature enough to have solved many of
>> >>> >>> >>> the most difficult problems of implementing Directory on a
>> >>> >>> >>> distributed Map.

>> >>> >>> >>>

>> >>> >>> >>> If someone has any experience with Compass (particularly its
>> >>> >>> >>> Directory implementations) I'd be interested in hearing about it...
>> >>> >>> >>> It's Apache 2.0 licensed, btw.

>> >>> >>> >>>

>> >>> >>> >>> -jeff

>> >>> >>> >>> _______________________________________________

>> >>> >>> >>> infinispan-dev mailing list

>> >>> >>> >>> infinispan-...@lists.jboss.org

>> >>> >>> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
