Hey everyone,
I attended the Alpha 2 update yesterday and was quite pleased to see the
progress on things so far. So first, congratulations to everyone on the
work being put in and thank you to Val and Kseniya for running yesterday's
event.
I asked a few questions after the webinar. Val had answers to some of them
but suggested posting here, as some are things that haven't been thought
about yet or have no plans around them at this point.
I'll put all of them here and if necessary we can break into different
threads after.
1. Schema change - does that include the ability to change the types of
fields/columns?
1. Val's answer was yes, with some limitations that are not well defined
yet. He mentioned that some kind of transformer could be provided for doing
the conversion, and I would second this: even for common conversions like
int to long, being able to supply a custom conversion would be immensely
valuable.
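To make the ask concrete, here's a rough sketch of the kind of hook I have in mind. None of these names are real Ignite APIs - `ColumnTransformer` and `migrate` are purely illustrative, showing a per-row conversion applied during a type-changing schema migration:

```java
// Hypothetical sketch: nothing here is an actual Ignite 3 API.
import java.util.List;
import java.util.stream.Collectors;

public class SchemaTransformerSketch {
    /** Hypothetical hook invoked per value during a type-changing schema migration. */
    interface ColumnTransformer<F, T> {
        T convert(F oldValue);
    }

    // A custom int -> long conversion; a real one might also rescale,
    // re-encode or validate values rather than just widen them.
    static final ColumnTransformer<Integer, Long> INT_TO_LONG = v -> v.longValue();

    /** Applies the transformer across an existing column's values. */
    static List<Long> migrate(List<Integer> oldColumn) {
        return oldColumn.stream().map(INT_TO_LONG::convert).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(migrate(List.of(1, 2, Integer.MAX_VALUE))); // [1, 2, 2147483647]
    }
}
```

The key point is that the conversion is user-supplied, so the domain can decide how values are widened, re-encoded or validated rather than being limited to a fixed set of built-in casts.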
2. Will the new guaranteed consistency between APIs also mean SQL will
gain transaction support?
1. I believe the answer here was yes, but perhaps someone else may want
to weigh in to confirm.
3. Has there been any decision about how much of Calcite will be exposed
to the client? When using thick clients, it'll be hugely beneficial to be
able to work with the Calcite APIs directly to provide custom rules and
optimisations that better suit organisation needs.
1. We currently use Calcite ourselves, have a lot of custom rules and
optimisations, and have slowly pushed more of our queries to Calcite, which
we then push down to Ignite.
2. We index into Solr and use the Solr indices and others to fulfil
overall queries, with Ignite being just one of the possible storage targets
Calcite pushes down to. If we could get to the Calcite API from an Ignite
thick client, it would let us remove a layer of abstraction and complexity
and make Ignite our primary store, which we then link with Solr and others
to fulfil queries.
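To illustrate the kind of rule we'd want to register, here's a toy sketch. It deliberately does not use Calcite's real `RelOptRule`/`RelNode` classes (so it runs standalone); `ScanNode` and `RoutingRule` are stand-ins showing the shape of a planner rule that routes a scan to the store best suited to answer it:

```java
// Hypothetical sketch -- not Calcite's actual API. It illustrates why we want
// planner-rule access from a thick client: a rule inspects a relational node
// and decides which backing store (Solr vs Ignite) should answer it.
public class PlannerRuleSketch {
    // Stand-in for a Calcite RelNode describing a table scan with a filter.
    record ScanNode(String table, String filterColumn) {}

    // Stand-in for a Calcite RelOptRule: match a node, pick a target store.
    interface RoutingRule {
        boolean matches(ScanNode node);
        String target(ScanNode node);
    }

    // Example policy: route full-text-style predicates to Solr, the rest to Ignite.
    static final RoutingRule TEXT_TO_SOLR = new RoutingRule() {
        public boolean matches(ScanNode n) { return n.filterColumn().endsWith("_text"); }
        public String target(ScanNode n) { return matches(n) ? "solr" : "ignite"; }
    };

    public static void main(String[] args) {
        System.out.println(TEXT_TO_SOLR.target(new ScanNode("docs", "body_text"))); // solr
        System.out.println(TEXT_TO_SOLR.target(new ScanNode("docs", "price")));     // ignite
    }
}
```

With the real Calcite APIs exposed from the client, this sort of routing decision would live in a proper planner rule instead of a bespoke layer on top of Ignite.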
4. Will the unified storage model enable different versions of Ignite to
be in the cluster when persistence is enabled, so that rolling restarts can
be done?
1. We currently have to do a strange dance to perform Ignite upgrades
without downtime, because pods/nodes will fail to start on a version
mismatch, and if we get that dance wrong we corrupt a node's data. It would
make administration and upgrades far less brittle and error-prone if this
were possible.
5. Will it still be possible to provide a custom cache store, and will
these changes enable custom cache stores to be queryable from SQL?
1. Our Ignite usage is wide and complex because we use KV, SQL and other
APIs. The inconsistency of what can and can't be used from one API to
another is a real challenge and has forced us over time to stick to one API
and write alternative solutions outside of Ignite. It would drastically
simplify things if any CacheStore (or some new equivalent) could be plugged
in and made accessible to SQL (and in fact all other APIs) without having
to first load all the data from the underlying CacheStore into memory.
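A rough sketch of the plug-in shape I mean follows - `QueryableStore` is not an Ignite interface, it's a hypothetical equivalent of a CacheStore that the SQL engine could iterate lazily instead of requiring a full preload:

```java
// Hypothetical sketch -- QueryableStore is not an Ignite API.
import java.util.Map;
import java.util.stream.Stream;

public class QueryableStoreSketch {
    /** Hypothetical CacheStore equivalent that can stream rows on demand. */
    interface QueryableStore<K, V> {
        V load(K key);                  // classic read-through lookup
        Stream<Map.Entry<K, V>> scan(); // what SQL needs: lazy iteration, no preload
    }

    // A trivial in-memory backing store standing in for an external system.
    static QueryableStore<Integer, String> inMemoryExample(Map<Integer, String> backing) {
        return new QueryableStore<>() {
            public String load(Integer key) { return backing.get(key); }
            public Stream<Map.Entry<Integer, String>> scan() { return backing.entrySet().stream(); }
        };
    }

    public static void main(String[] args) {
        QueryableStore<Integer, String> store = inMemoryExample(Map.of(1, "a", 2, "b"));
        // A SQL WHERE clause becomes a filter over scan(), evaluated lazily.
        long hits = store.scan().filter(e -> e.getValue().equals("b")).count();
        System.out.println(hits); // 1
    }
}
```

The important property is that `scan()` is a stream, so a SQL engine could push predicates into it or consume it incrementally rather than materialising the whole store in memory first.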
6. This question wasn't mine, but I was going to ask it as well: what
will happen to the Indexing API since H2 is being removed?
1. As I mentioned above, we index into Solr. In earlier versions of our
product we used the Indexing SPI to index into Lucene on the Ignite nodes,
but this presented so many challenges that we ultimately abandoned it and
replaced it with the current Solr solution.
2. Lucene indexing was ideal because it meant we didn't have to re-invent
Solr or Elasticsearch's sharding capabilities; that was almost automatic,
with Ignite only giving you the data that was meant for the current node.
3. The Lucene API enabled more flexibility and removed a network round
trip from our queries.
4. Given Calcite's ability to support custom SQL functions, I'd love to
be able to define custom functions that Lucene answers.
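Calcite can expose a plain static method as a SQL function, so the kind of thing we'd register is sketched below. A query like `SELECT * FROM docs WHERE lucene_match(body, 'quick fox')` would then be answered by Lucene. The trivial matcher here is only a stand-in for a real Lucene query; the registration shape is the point, not the matching logic:

```java
// Hypothetical sketch: the method body stands in for a real Lucene lookup.
public class LuceneUdfSketch {
    /** Candidate SQL function body; a real one would delegate to a Lucene index. */
    public static boolean luceneMatch(String text, String query) {
        // Stand-in semantics: every whitespace-separated term must occur in the text.
        for (String term : query.toLowerCase().split("\\s+")) {
            if (!text.toLowerCase().contains(term)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(luceneMatch("The quick brown fox", "quick fox")); // true
        System.out.println(luceneMatch("The quick brown fox", "lazy dog"));  // false
    }
}
```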
7. What impact does Raft now have on conflict resolution? Off the top of
my head there are two cases:
1. On startup after a split brain, Ignite currently takes an "exercise
for the reader" approach and dumps a log along the lines of:
> BaselineTopology of joining node is not compatible with BaselineTopology
> in the cluster. Branching history of cluster BlT doesn't contain
> branching point hash of joining node BlT. Consider cleaning persistent
> storage of the node and adding it to the cluster again.
1. This leaves you with no choice except to take one half, manually copy
its data back over to the other half, then destroy the bad one.
2. The second case is conflicts on keys. I believe
CacheVersionConflictResolver and its manager are used by GridCacheMapEntry,
which essentially just chooses between the old value and the new one.
Ideally this will be exposed in the new API so that one can override the
behaviour. The last-writer-wins approach isn't always ideal, and the
semantics of the domain can mean that what is considered "correct" in a
conflict for one domain is not so for another.
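The override point I'm after looks roughly like this. `ConflictResolver` is not an Ignite interface, just an illustration of a pluggable policy applied when two replicas disagree on a key, contrasted with hard-coded last-writer-wins:

```java
// Hypothetical sketch -- ConflictResolver here is not an Ignite interface.
public class ConflictResolverSketch {
    record Versioned<V>(V value, long timestamp) {}

    /** Hypothetical pluggable policy applied when replicas disagree on a key. */
    interface ConflictResolver<V> {
        Versioned<V> resolve(Versioned<V> ours, Versioned<V> theirs);
    }

    // Default behaviour today: the most recent write wins.
    static <V> ConflictResolver<V> lastWriterWins() {
        return (ours, theirs) -> ours.timestamp() >= theirs.timestamp() ? ours : theirs;
    }

    // A domain-specific alternative: for, say, a credit balance, keep the
    // larger value regardless of which write arrived last.
    static final ConflictResolver<Long> MAX_BALANCE =
        (ours, theirs) -> ours.value() >= theirs.value() ? ours : theirs;

    public static void main(String[] args) {
        Versioned<Long> a = new Versioned<>(100L, 1L);
        Versioned<Long> b = new Versioned<>(40L, 2L);
        ConflictResolver<Long> lww = lastWriterWins();
        System.out.println(lww.resolve(a, b).value());         // 40 (newer write wins)
        System.out.println(MAX_BALANCE.resolve(a, b).value()); // 100 (domain rule wins)
    }
}
```

The two policies disagree on the same pair of writes, which is exactly why we'd like the resolver to be overridable per domain rather than fixed.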
8. This is last on the list but is actually the most important for us
right now, as it is an impending and growing risk. We allow customers to
create their own tables on demand. We're already reusing the same cache
group etc. across data structures, but now that we're getting to thousands
of tables/caches our startup times are sometimes unpredictably long. At
present it seems to depend on the state of the cache/table before the
restart, but we're into the order of 5-7 minutes and steadily increasing
with the growth of tables. Are there any provisions in Ignite 3 for
ensuring startup time isn't proportional to the number of tables/caches
available?
Those are the key things I can think of at the moment. Val and others, I'd
love to open a conversation around these.
Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: +44 208 123 2413 (GMT+0) <https://hypi.io>