Stan,

Thanks for starting "Starting with missing PDS pieces", which promises to embed usability changes into the source code. In the meantime, could you propose a TODO list for recovering from index corruption and similar scenarios? I know that you're experienced in that, and it would be great to document the procedures until the code is modified.
-
Denis

On Wed, Jan 30, 2019 at 1:02 PM Denis Magda <dma...@apache.org> wrote:

> Dmitry,
>
> Thanks, the FAQ section might make sense but, as practice shows, it's
> hard to get recommendations even for questions like this one :)
>
> Ignite experts, please chime in. The project fails with data corruption
> periodically, and we have to explain how to work around it until an issue
> is resolved.
>
> -
> Denis
>
>
> On Wed, Jan 30, 2019 at 11:55 AM Dmitriy Pavlov <dpav...@apache.org>
> wrote:
>
>> Denis,
>>
>> BTW one case of corruption is fixed here,
>> https://issues.apache.org/jira/browse/IGNITE-11030
>>
>> I still need a review from Ignite Native Persistence Experts. I feel it is
>> really important to apply such fixes.
>>
>> Sincerely,
>> Dmitriy Pavlov
>>
>> On Thu, Jan 24, 2019 at 16:29, Dmitriy Pavlov <dpav...@apache.org> wrote:
>>
>> > Denis, what do you think about a more general idea of creating FAQs for
>> > Ignite users?
>> >
>> > What if experts place their answers on a wiki page once and then
>> > develop answers for frequent problems?
>> >
>> > And before diving into researching each problem, experienced community
>> > members will ask users to check the FAQ first?
>> >
>> > Sincerely,
>> > Dmitriy Pavlov
>> >
>> > P.S. Here is an article that the Apache guides refer to:
>> > http://www.catb.org/~esr/faqs/smart-questions.html - one of the
>> > actions required of users is to search for information.
>> >
>> > On Thu, Jan 24, 2019 at 01:55, Denis Magda <dma...@gridgain.com> wrote:
>> >
>> >> Another data/index corruption issue:
>> >>
>> >> https://stackoverflow.com/questions/54295401/ignite-transaction-failure-not-recoverable-with-persistance
>> >>
>> >> It's suggested to clean index.bin to be able to recover the cluster.
>> >> Folks,
>> >> let's prepare a list of actions to take if a cluster becomes
>> >> unrecoverable due to a data or index corruption issue. What should we
>> >> do depending on the exception:
>> >>
>> >> - Remove index.bin if X or Y or Z
>> >> - etc
>> >>
>> >>
>> >> --
>> >> Denis Magda
>> >>
>> >>
>> >> On Sun, Dec 30, 2018 at 10:06 AM Denis Magda <dma...@gridgain.com>
>> >> wrote:
>> >>
>> >> > Ignite SQL and memory experts,
>> >> >
>> >> > The following issue was reported on SO:
>> >> >
>> >> > https://stackoverflow.com/questions/53979106/ignite-corruptedtreeexception-leads-to-cluster-failure
>> >> >
>> >> > The stack trace starts with the message below, more details are in
>> >> > that forum:
>> >> >
>> >> > [SEVERE][data-streamer-stripe-2-#15][GridDhtAtomicCache] <MyCache>
>> >> > Unexpected exception during cache update
>> >> > org.h2.message.DbException: General error: "class
>> >> > org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
>> >> > Runtime failure on row: Row@75ab6623[ key: CacheKey [idHash=242632156,
>> >> > hash=-841684964, parentId=-8607237606486310912, hour=9,
>> >> > id=-8607237528489033728, date=2018-09-09 00:00:00.0], val: CacheValue
>> >> > [idHash=843227122, hash=-801894604, ....
>> >> >
>> >> > Let's see if it's addressed in the latest release. Also, the user
>> >> > asked a reasonable question - how to recover? Yes, it's possible to
>> >> > use GridGain snapshots if they were created beforehand, but I
>> >> > remember some discussions around a recovery tool.
>> >> >
>> >> > --
>> >> > Denis