Re: Ignite index corruption issue -> unrecoverable cluster

Denis Magda Wed, 06 Feb 2019 16:12:58 -0800

Stan,

Thanks for starting "Starting with missing PDS pieces", which promises to
embed usability changes into the source code. In the meantime, could you
propose a TODO list for recovering from index corruption and similar
scenarios? I know that you're experienced in this, and it would be great to
document the procedures until the code is modified.


-
Denis


On Wed, Jan 30, 2019 at 1:02 PM Denis Magda <dma...@apache.org> wrote:

> Dmitry,
>
> Thanks, the FAQ section might make sense but, as practice shows, it's
> hard to get recommendations even for questions like this one :)
>
> Ignite experts, please chime in. The project periodically fails with data
> corruption, and we have to explain how to work around it until the issue is
> resolved.
>
> -
> Denis
>
>
> On Wed, Jan 30, 2019 at 11:55 AM Dmitriy Pavlov <dpav...@apache.org>
> wrote:
>
>> Denis,
>>
>> BTW one case of corruption is fixed here,
>> https://issues.apache.org/jira/browse/IGNITE-11030
>>
>> I still need a review from Ignite Native Persistence Experts. I feel it is
>> really important to apply such fixes.
>>
>> Sincerely,
>> Dmitriy Pavlov
>>
>> On Thu, Jan 24, 2019 at 4:29 PM Dmitriy Pavlov <dpav...@apache.org> wrote:
>>
>> > Denis, what do you think about a more general idea of creating FAQs for
>> > Ignite users?
>> >
>> > What if experts post their answers on a wiki page once and then build up
>> > answers for frequent problems?
>> >
>> > Then, before diving into researching each problem, experienced community
>> > members would ask users to check the FAQ first.
>> >
>> > Sincerely,
>> > Dmitriy Pavlov
>> >
>> > P.S. Here is an article that the Apache guides reference:
>> > http://www.catb.org/~esr/faqs/smart-questions.html - one of the actions
>> > required of users is to search for information.
>> >
>> > On Thu, Jan 24, 2019 at 1:55 AM Denis Magda <dma...@gridgain.com> wrote:
>> >
>> >> Another data/index corruption issue:
>> >>
>> >> https://stackoverflow.com/questions/54295401/ignite-transaction-failure-not-recoverable-with-persistance
>> >>
>> >> It's suggested to clean index.bin to be able to recover the cluster.
>> >> Folks, let's prepare a list of actions to take if a cluster becomes
>> >> unrecoverable due to a data or index corruption issue. What should we
>> >> do depending on the exception:
>> >>
>> >>    - Remove index.bin if X or Y or Z
>> >>    - etc
>> >>
>> >>
>> >> --
>> >> Denis Magda
>> >>
>> >>
>> >> On Sun, Dec 30, 2018 at 10:06 AM Denis Magda <dma...@gridgain.com> wrote:
>> >>
>> >> > Ignite SQL and memory experts,
>> >> >
>> >> > The following issue was reported on SO:
>> >> >
>> >> > https://stackoverflow.com/questions/53979106/ignite-corruptedtreeexception-leads-to-cluster-failure
>> >> >
>> >> > The stack trace starts with the message below; more details are in
>> >> > that forum post:
>> >> >
>> >> > [SEVERE][data-streamer-stripe-2-#15][GridDhtAtomicCache] <MyCache>
>> >> > Unexpected exception during cache update
>> >> > org.h2.message.DbException: General error: "class
>> >> > org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
>> >> > Runtime failure on row: Row@75ab6623[ key: CacheKey [idHash=242632156,
>> >> > hash=-841684964, parentId=-8607237606486310912, hour=9,
>> >> > id=-8607237528489033728, date=2018-09-09 00:00:00.0], val: CacheValue
>> >> > [idHash=843227122, hash=-801894604, ....
>> >> >
>> >> > Let's see if it's addressed in the latest release. Also, the user asked
>> >> > a reasonable question - how to recover? Yes, it's possible to use
>> >> > GridGain snapshots if they were created beforehand, but I remember some
>> >> > discussions around a recovery tool.
>> >> >
>> >> > --
>> >> > Denis
>> >> >
>> >>
>> >
>>
>
