Yes, using SerialMergeScheduler helped. Thanks! Should the restriction on the merge scheduler be set in the IndexWriterSetting enum, similarly to e.g. the max_buffered_delete_terms or max_buffered_docs properties?
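For anyone following the thread, here is a minimal sketch of why the serial strategy avoids the merge error, assuming uncommitted writes are visible only to the thread that owns the transaction (as described for DummyTransactionManager in the quoted messages). All class and method names are hypothetical. It uses no Lucene or Infinispan API and only models visibility with a ThreadLocal.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model, not Lucene or Infinispan code.
final class MergeVisibilitySketch {
    // Committed segment names, visible to every thread (stands in for the shared cache).
    private static final Set<String> COMMITTED = ConcurrentHashMap.newKeySet();
    // Uncommitted segment names, visible only to the thread that wrote them.
    private static final ThreadLocal<Set<String>> PENDING =
            ThreadLocal.withInitial(HashSet::new);

    // Record a write inside the calling thread's "transaction".
    public static void write(String segment) {
        PENDING.get().add(segment);
    }

    // A thread sees committed data plus its own uncommitted writes.
    public static boolean canSee(String segment) {
        return COMMITTED.contains(segment) || PENDING.get().contains(segment);
    }

    // Publish the calling thread's pending writes to all threads.
    public static void commit() {
        COMMITTED.addAll(PENDING.get());
        PENDING.get().clear();
    }

    // Serial strategy: the merge runs in the calling (index writer) thread.
    public static boolean mergeSerially(String segment) {
        return canSee(segment);
    }

    // Concurrent strategy: the merge runs in a new thread with no transaction.
    public static boolean mergeInNewThread(String segment) {
        boolean[] seen = new boolean[1];
        Thread merger = new Thread(() -> seen[0] = canSee(segment));
        merger.start();
        try {
            merger.join(); // join() makes the merger's result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        write("_0.cfs");
        System.out.println("serial merge sees segment:   " + mergeSerially("_0.cfs"));
        System.out.println("threaded merge sees segment: " + mergeInNewThread("_0.cfs"));
        commit();
        System.out.println("threaded merge after commit: " + mergeInNewThread("_0.cfs"));
    }
}
```

In this model the serial merge finds the uncommitted segment, the threaded merge does not until commit, which matches the behaviour reported in the thread. The cost is that merges block the indexing thread.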
2009/8/14 Emmanuel Bernard <emman...@hibernate.org>
> OK, so let's try something. Lukasz, can you try to use the
> SerialMergeScheduler policy on the index writer and see what happens?
> It will make indexing slower for Infinispan, but it seems we can't do much
> better in the short term.
>
> If that works, then we will put some restriction in place when reading the
> config. The index writer for this given index will be forced to use the
> serial strategy.
>
> On 14 Aug 09, at 06:23, Łukasz Moreń wrote:
>
> What is expensive is replication to all Infinispan nodes. IndexWriter creates
> segment files, merges them into one compound segment, and deletes descriptor
> files that are no longer needed: many files that must be replicated. Some of
> them don't even need to be replicated, because they are inserted into the
> directory at the beginning of the index commit process and removed at the
> end. Batching helps with performance here. Yes, I think IndexWriter works as
> you wrote.
>
> 2009/8/14 Manik Surtani <ma...@jboss.org>
>
>>
>> On 14 Aug 2009, at 10:17, Łukasz Moreń wrote:
>>
>> Yes, but e.g. FSDirectory flushes changes whenever any file descriptor is
>> created/updated, which can happen many times in one IndexWriter's life.
>> In the Infinispan implementation, I want to commit changes only when the
>> IndexWriter is closing, batching all modifications.
>> If I switch to a transaction per descriptor modification, similar to how
>> it's done in FSDirectory, it works well, but it is not efficient.
>>
>>
>> So what's expensive here? Writing to Infinispan, or the indexing itself?
>> Correct me if I am wrong, I assume that the IndexWriter creates multiple
>> threads, and each thread does: {
>> // some indexing work
>> // write these indexes to Infinispan
>> }
>>
>> Is that correct?
>>
>>
>> 2009/8/14 Sanne Grinovero <sanne.grinov...@gmail.com>
>>
>>> I am not an expert on this part of Lucene, but it looks to me like
>>> the IndexWriter is the "driver/coordinator", and its decisions
>>> are affected by a pluggable MergeScheduler; they do stuff on the
>>> internal buffers of the IndexWriter (dequeue the pending segments to
>>> be written to the index), but it shouldn't matter what exactly they do,
>>> as the internal state of these classes is unaffected by our
>>> transactions.
>>> They make decisions about writing segments to the Directory and
>>> committing changes ("sync()"): as you implement this Directory, you
>>> should only have to take care of this class. I don't think the
>>> MergeScheduler(s) are relevant: it just happens that the thread that
>>> applies changes to the index might be a different one than the one
>>> pushing changes to the IndexWriter.
>>>
>>> In the Directory implementation you should use transactions to push
>>> state changes to the "underlying storage": as FSDirectory is playing
>>> with file descriptors and flushes, you do the same with Infinispan
>>> transactions.
>>>
>>> 2009/8/14 Łukasz Moreń <lukasz.mo...@gmail.com>:
>>> > Yes, right, MergeSchedulers.
>>> >
>>> > 2009/8/14 Sanne Grinovero <sanne.grinov...@gmail.com>
>>> >>
>>> >> what are these "other" threads? Are you speaking about the
>>> >> MergeSchedulers?
>>> >>
>>> >> 2009/8/13 Łukasz Moreń <lukasz.mo...@gmail.com>:
>>> >> > IndexWriter processes the index update, delegates some jobs to other
>>> >> > threads, and waits until they finish. These "other" threads work on
>>> >> > data modified in the IndexWriter transaction. So I think if I use a
>>> >> > transaction per thread, the "others" would not see data modified by
>>> >> > IndexWriter until commit.
>>> >> >
>>> >> > 2009/8/13, Emmanuel Bernard <emman...@hibernate.org>:
>>> >> >> Ah, I thought it was using multiple threads because of your mass
>>> >> >> indexing. I did not know some threads were spawned specifically for
>>> >> >> the Infinispan directory.
>>> >> >>
>>> >> >> On 13 Aug 09, at 17:34, Sanne Grinovero wrote:
>>> >> >>
>>> >> >>> Hi Łukasz,
>>> >> >>> what is your usage of these threads? Did you consider using one
>>> >> >>> transaction per thread?
>>> >> >>>
>>> >> >>> Sanne
>>> >> >>>
>>> >> >>> 2009/8/13 Łukasz Moreń <lukasz.mo...@gmail.com>:
>>> >> >>>> Newly created threads were not associated with any transaction, so I
>>> >> >>>> suppose that was the problem. Sharing a transaction between threads
>>> >> >>>> seems to be a good solution.
>>> >> >>>> Thanks for the help!
>>> >> >>>>
>>> >> >>>> 2009/8/13, Jason T. Greene <jason.gre...@redhat.com>:
>>> >> >>>>> Correct. Also, there could be read races as well, so if you are
>>> >> >>>>> going to share a tx between threads, I would use some shared lock
>>> >> >>>>> to guarantee that only one thread can use it at a time. BTW, this
>>> >> >>>>> means you have to properly suspend/resume the TX via the TM API
>>> >> >>>>> as well.
>>> >> >>>>>
>>> >> >>>>> Emmanuel Bernard wrote:
>>> >> >>>>>> Modifying a transaction means applying mutations (like SQL INSERT /
>>> >> >>>>>> UPDATE / DELETE) to the transactional resource?
>>> >> >>>>>>
>>> >> >>>>>> On 13 Aug 09, at 15:07, Jason T. Greene wrote:
>>> >> >>>>>>
>>> >> >>>>>>> When using transactions, the context is bound to the
>>> >> >>>>>>> transaction, and you can move a transaction between threads.
>>> >> >>>>>>> However, you should only be modifying a transaction with one
>>> >> >>>>>>> thread at a time.
>>> >> >>>>>>>
>>> >> >>>>>>> Emmanuel Bernard wrote:
>>> >> >>>>>>>> Could it be that you are not using the same transaction between
>>> >> >>>>>>>> different threads (i.e. you physically start different ones or
>>> >> >>>>>>>> different "Infinispan contexts")?
>>> >> >>>>>>>> Infini guys, do you support transactional operations spanning
>>> >> >>>>>>>> several concurrent threads?
>>> >> >>>>>>>> On 13 Aug 09, at 14:04, Łukasz Moreń wrote:
>>> >> >>>>>>>>> I've tried with the JBoss AS transaction manager and
>>> >> >>>>>>>>> JBossStandaloneTM.
>>> >> >>>>>>>>> The result is the same in all cases: an error during merge.
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> 2009/8/12, Emmanuel Bernard <emman...@hibernate.org>:
>>> >> >>>>>>>>>> Ok, I understand better now.
>>> >> >>>>>>>>>> Do your tests in JBoss AS with its decent transaction manager
>>> >> >>>>>>>>>> (Infinispan should have a config for it).
>>> >> >>>>>>>>>> For unit testing, force the indexing process in Hibernate to
>>> >> >>>>>>>>>> use a single thread (I think it's possible; ask Sanne if you
>>> >> >>>>>>>>>> don't know how).
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> Exposing some configuration to Infinispan makes sense. Can you
>>> >> >>>>>>>>>> start a thread explaining what is configurable and which
>>> >> >>>>>>>>>> options you think we should expose to hsearch users? Ideally I
>>> >> >>>>>>>>>> would like to offer one or two default config scenarios and
>>> >> >>>>>>>>>> allow falling back to a custom config.
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> Emmanuel
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>> On 12 Aug 2009, at 11:58, Łukasz Moreń
>>> >> >>>>>>>>>> <lukasz.mo...@gmail.com> wrote:
>>> >> >>>>>>>>>>
>>> >> >>>>>>>>>>> Sorry, but my wifi does not work well today. I will try to
>>> >> >>>>>>>>>>> explain it more clearly.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> I'm using the DummyTransactionManager available for Infinispan.
>>> >> >>>>>>>>>>> It associates a transaction with the calling thread.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Steps to update the index:
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> 1. The index writer acquires the lock: the transaction begins.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> 2. If necessary, the index writer delegates merge work to new
>>> >> >>>>>>>>>>> threads.
>>> >> >>>>>>>>>>> Those merge threads do not see the changes made since the
>>> >> >>>>>>>>>>> beginning of the transaction, and look for segments which are
>>> >> >>>>>>>>>>> not yet in the index.
>>> >> >>>>>>>>>>> The changes will be visible when step 3 is completed.
>>> >> >>>>>>>>>>> For tests I tried committing the transaction when the merge
>>> >> >>>>>>>>>>> starts, and then everything worked well. But then I need to
>>> >> >>>>>>>>>>> start it again.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> 3. The index writer releases the lock: the transaction is
>>> >> >>>>>>>>>>> committed, and all changes made in this transaction become
>>> >> >>>>>>>>>>> visible to other threads.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Maybe using some other transaction manager could help?
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> What about the Infinispan cache configuration? Should some
>>> >> >>>>>>>>>>> configuration mechanism be exposed to the user, or is a
>>> >> >>>>>>>>>>> hardcoded one in InfinispanDirectoryProvider enough?
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> 2009/8/12 Emmanuel Bernard <emman...@hibernate.org>
>>> >> >>>>>>>>>>> why?
>>> >> >>>>>>>>>>> Emmanuel Bernard
>>> >> >>>>>>>>>>> Pending
>>> >> >>>>>>>>>>> you there?
>>> >> >>>>>>>>>>> Emmanuel Bernard
>>> >> >>>>>>>>>>> Pending
>>> >> >>>>>>>>>>> Ok, please describe in detail what is going on. From what
>>> >> >>>>>>>>>>> you are describing, the tx cannot see all segments, which
>>> >> >>>>>>>>>>> looks like an Infinispan bug to me.
>>> >> >>>>>>>>>>> Pending
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> As a backup you can try without a transaction and see if that
>>> >> >>>>>>>>>>> works
>>> >> >>>>>>>>>>> Emmanuel Bernard
>>> >> >>>>>>>>>>> Pending
>>> >> >>>>>>>>>>> technically the Lucene index should cope with that
>>> >> >>>>>>>>>>> Emmanuel Bernard
>>> >> >>>>>>>>>>> 11:16
>>> >> >>>>>>>>>>> but I like this approach less
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>>>>> Let's try to chat by email if I'm not online; I need to run
>>> >> >>>>>>>>>>> some errands today.
>>> >> >>>>>>>>>>>
>>> >> >>>>>>>> _______________________________________________
>>> >> >>>>>>>> infinispan-dev mailing list
>>> >> >>>>>>>> infinispan-...@lists.jboss.org
>>> >> >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> >> >>>>>>>
>>> >> >>>>>>> --
>>> >> >>>>>>> Jason T. Greene
>>> >> >>>>>>> JBoss, a division of Red Hat
>>> >> >>>>>>
>>> >> >>>>>
>>> >> >>>>> --
>>> >> >>>>> Jason T. Greene
>>> >> >>>>> JBoss, a division of Red Hat
>>> >> >>>>>
>>> >> >>>
>>> >> >>> _______________________________________________
>>> >> >>> hibernate-dev mailing list
>>> >> >>> hibernate-dev@lists.jboss.org
>>> >> >>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>
>> --
>> Manik Surtani
>> ma...@jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>> http://www.infinispan.org
>> http://www.jbosscache.org