I've attached benchmark results to the JIRA ticket.
We observe a ~7% performance drop in "fair" LOG_ONLY_SAFE mode, regardless of
whether WAL compaction is enabled. That is a pretty significant drop: WAL
compaction itself gives only a ~3% drop.
I see two options here:
1) Change the LOG_ONLY behavior. That implies we are ready to release
AI 2.5 with a 7% drop.
2) Introduce LOG_ONLY_SAFE, make it the default, and add a release note to
AI 2.5 saying that we added power-loss durability in the default mode, but
users may fall back to the previous LOG_ONLY in order to retain performance.
Thoughts?
Best Regards,
Ivan Rakov
On 20.03.2018 16:00, Ivan Rakov wrote:
Val,
> If a storage is in
> corrupted state, does it mean that it needs to be completely removed and
> cluster needs to be restarted without data?
Yes, there's a chance that in LOG_ONLY all local data will be lost,
but only in the *power loss / OS crash* case.
kill -9, a JVM crash, the death of a critical system thread and all other
cases that usually take place are variations of *process crash*. All
WAL modes (except NONE, of course) ensure corruption safety in case of
a process crash.
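For readers skimming the thread, the guarantees stated in this discussion can be summarized as a small lookup table. This is only a sketch: `WalGuaranteeSketch`, `Failure` and `corruptionSafe` are hypothetical names, not Ignite API, and LOG_ONLY_SAFE is a proposal from this thread rather than an existing mode. BACKGROUND is left out because its power-loss behavior is not spelled out here.

```java
/** Summary of guarantees as stated in this thread; NOT part of the Ignite API. */
final class WalGuaranteeSketch {
    /** The two failure classes distinguished in the discussion. */
    enum Failure { PROCESS_CRASH, POWER_LOSS }

    /** Modes as discussed here; LOG_ONLY_SAFE is only a proposal. */
    enum Mode {
        NONE(false, false),
        LOG_ONLY(true, false),      // local data may be lost on power loss / OS crash
        LOG_ONLY_SAFE(true, true),  // proposed: no corruption, last updates may be lost
        FSYNC(true, true);          // full protection, at a high performance price

        private final boolean safeOnProcessCrash;
        private final boolean safeOnPowerLoss;

        Mode(boolean safeOnProcessCrash, boolean safeOnPowerLoss) {
            this.safeOnProcessCrash = safeOnProcessCrash;
            this.safeOnPowerLoss = safeOnPowerLoss;
        }

        /** True if local storage cannot end up corrupted after the given failure. */
        boolean corruptionSafe(Failure failure) {
            return failure == Failure.PROCESS_CRASH ? safeOnProcessCrash : safeOnPowerLoss;
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode
                + ": process crash safe=" + mode.corruptionSafe(Failure.PROCESS_CRASH)
                + ", power loss safe=" + mode.corruptionSafe(Failure.POWER_LOSS));
        }
    }
}
```

Note that "corruption safe" here says nothing about losing the most recent updates, which is exactly the difference between the proposed LOG_ONLY_SAFE and FSYNC.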
> If so, I'm not sure any mode
> that allows corruption makes much sense to me.
It depends on the performance impact of enforcing power-loss corruption
safety. The price of full protection from power loss is high: FSYNC is
way slower (2-10 times) than the other WAL modes. The question is whether
ensuring weaker guarantees (corruption can't happen, but loss of the last
updates can) will affect performance as badly as strong guarantees.
I'll share benchmark results soon.
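To illustrate why a sync on every write is expensive, here is a minimal, self-contained sketch using plain `java.nio`. It is not Ignite's WAL implementation: `WalSyncSketch`, `SyncPolicy` and `appendRecords` are hypothetical names, and the two policies only loosely mirror the trade-off discussed here. A write without `force()` typically lands in the OS page cache, which survives a process crash but not necessarily a power loss; `force()` asks the OS to push the data to the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: NOT Ignite's WAL implementation. */
final class WalSyncSketch {
    /** Hypothetical sync policies for this sketch. */
    enum SyncPolicy { EVERY_RECORD, NO_EXPLICIT_SYNC }

    /** Fixed record body used for every append. */
    static final byte[] PAYLOAD = "record-payload\n".getBytes(StandardCharsets.UTF_8);

    /** Appends {@code count} records to the log and returns elapsed nanoseconds. */
    static long appendRecords(Path log, int count, SyncPolicy policy) throws IOException {
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(PAYLOAD);
                while (buf.hasRemaining())
                    ch.write(buf); // hands data to the OS; may stay in the page cache

                if (policy == SyncPolicy.EVERY_RECORD)
                    ch.force(false); // request that file content reach the storage device
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000;
        for (SyncPolicy policy : SyncPolicy.values()) {
            Path log = Files.createTempFile("wal-sketch", ".log");
            long nanos = appendRecords(log, count, policy);
            System.out.printf("%s: %d records in %.1f ms%n", policy, count, nanos / 1e6);
            Files.delete(log);
        }
    }
}
```

The absolute numbers depend heavily on the disk and file system, so treat any ratio this prints as an illustration, not a benchmark.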
Best Regards,
Ivan Rakov
On 20.03.2018 5:09, Valentin Kulichenko wrote:
Guys,
What do we mean by "data corruption" here? If a storage is in
corrupted state, does it mean that it needs to be completely removed and
cluster needs to be restarted without data? If so, I'm not sure any mode
that allows corruption makes much sense to me. How am I supposed to use a
database, if virtually any failure can end with complete loss of data?
In any case, this definitely should not be a default behavior. If a user ever
switches to a corruption-unsafe mode, there should be a clear warning about
this.
-Val
On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glu...@gmail.com>
wrote:
Ticket to track changes:
https://issues.apache.org/jira/browse/IGNITE-7754
Best Regards,
Ivan Rakov
On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glu...@gmail.com>
wrote:
Vladimir,
Unlike BACKGROUND, LOG_ONLY provides strict write guarantees unless power
loss has happened.
Seems like we need to measure the performance difference to decide whether
we need a separate WAL mode. If it is invisible, we'll just fix these
bugs without introducing a new mode; if it is perceptible, we'll
continue the discussion about introducing LOG_ONLY_SAFE.
Makes sense?
Yes, this sounds like the right approach.