Hey everyone, I attended the Alpha 2 update webinar yesterday and was quite pleased to see the progress on things so far. So first, congratulations to everyone on the work being put in, and thank you to Val and Kseniya for running yesterday's event.
I asked a few questions after the webinar which Val had some answers to, but he suggested posting here since some of them are things that haven't been thought about yet, or no plans exist around them at this point. I'll put all of them here and, if necessary, we can break them into different threads afterwards.

1. Schema change - does that include the ability to change the types of fields/columns?
   1. Val's answer was yes, with some limitations, but those are not well defined yet. He did mention that some kind of transformer could be provided for doing the conversion, and I would second this: even for common types like int to long, being able to do a custom conversion will be immensely valuable.

2. Will the new guaranteed consistency between APIs also mean SQL will gain transaction support?
   1. I believe the answer here was yes, but perhaps someone else may want to weigh in to confirm.

3. Has there been any decision about how much of Calcite will be exposed to the client? When using thick clients, it'll be hugely beneficial to be able to work with the Calcite APIs directly to provide custom rules and optimisations that better suit an organisation's needs.
   1. We currently use Calcite ourselves, have a lot of custom rules and optimisations, and have slowly pushed more of our queries to Calcite, which we then push down to Ignite.
   2. We index into Solr and use the Solr indices (and others) to fulfill overall queries, with Ignite being just one of the possible storage targets Calcite pushes down to. If we could get to the Calcite API from an Ignite thick client, it would enable us to remove a layer of abstraction and complexity and make Ignite our primary store, which we then link with Solr and others to fulfill queries.

4. Will the unified storage model enable different versions of Ignite to be in the cluster when persistence is enabled, so that rolling restarts can be done?
   1. We currently have to do a strange dance to perform Ignite upgrades without downtime, because pods/nodes will fail to start on a version mismatch, and if we get that dance wrong we will corrupt a node's data. Admin/upgrades would be far less brittle and error prone if this were possible.

5. Will it still be possible to provide a custom cache store, and will these changes enable custom cache stores to be queryable from SQL?
   1. Our Ignite usage is wide and complex because we use KV, SQL and other APIs. The inconsistency of what can and can't be used from one API to another is a real challenge and has forced us, over time, to stick to one API and write alternative solutions outside of Ignite. It would drastically simplify things if any CacheStore (or some new equivalent) could be plugged in and made accessible to SQL (and in fact all other APIs) without having to first load all the data from the underlying CacheStore into memory.

6. This question wasn't mine, but I was going to ask it as well: what will happen to the Indexing API since H2 is being removed?
   1. As I mentioned above, we index into Solr. In earlier versions of our product we used the indexing SPI to index into Lucene on the Ignite nodes, but this presented so many challenges that we ultimately abandoned it and replaced it with the current Solr solution.
   2. Lucene indexing was ideal because it meant we didn't have to re-invent Solr or Elasticsearch's sharding capabilities; that was almost automatic, with Ignite only giving you the data that was meant for the current node.
   3. The Lucene API enabled more flexibility and removed a network round trip from our queries.
   4. Given Calcite's ability to support custom SQL functions, I'd love the ability to define custom functions that Lucene was answering.

7. What impact does Raft now have on conflict resolution? Off the top of my head there are two cases:
   1. On startup after a split brain, Ignite currently takes an "exercise for the reader" approach and dumps a log along the lines of:

      > BaselineTopology of joining node is not compatible with
      > BaselineTopology in the cluster.
      > Branching history of cluster BlT doesn't contain branching point
      > hash of joining node BlT. Consider cleaning persistent storage of the
      > node and adding it to the cluster again.

      This leaves you with no choice except to take one half, manually copy/write the data back over to the other half, then destroy the bad one.
   2. The second case is conflicts on keys. I believe CacheVersionConflictResolver and its manager are used by GridCacheMapEntry, which just says "if use old value, do this; otherwise use newVal". Ideally this will be exposed in the new API so that one can override this behaviour. The last-writer-wins approach isn't always ideal, and the semantics of the domain can mean that what is considered "correct" in a conflict for one domain is not so for another.

8. This is last on the list but is actually the most important for us right now, as it is an impending and growing risk. We allow customers to create their own tables on demand. We're already using the same cache group etc. so that data structures are re-used, but now that we're getting to thousands of tables/caches our startup times are sometimes unpredictably long. At present it seems to depend on the state of the cache/table before the restart, but we're into the order of 5-7 mins and steadily increasing with the growth of tables. Are there any provisions in Ignite 3 for ensuring startup time isn't proportional to the number of tables/caches available?

Those are the key things I can think of at the moment. Val and others, I'd love to open a conversation around these.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: +44 208 123 2413 (GMT+0)
https://hypi.io
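P.S. To make question 1's transformer idea concrete, here's a rough plain-Java sketch of what I'd like to be able to plug in during a schema change. TypeTransformer and everything around it are my invention for illustration, not a proposed Ignite 3 API:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of a per-column transformer applied during a schema type change.
 * All names here are hypothetical - this is the shape of what I'd like to
 * plug in, not an existing Ignite 3 API.
 */
public class SchemaTransformSketch {

    /** Converts a column value from the old type to the new type. */
    public interface TypeTransformer<O, N> {
        N transform(O oldValue);
    }

    /** The trivial widening case: int -> long. */
    public static final TypeTransformer<Integer, Long> INT_TO_LONG = i -> (long) i;

    /**
     * A custom conversion the engine couldn't guess on its own:
     * int epoch-seconds -> long epoch-milliseconds.
     */
    public static final TypeTransformer<Integer, Long> SECONDS_TO_MILLIS = s -> s * 1000L;

    /** Applies a transformer to every value in a column. */
    public static <O, N> List<N> migrateColumn(List<O> column, TypeTransformer<O, N> t) {
        List<N> out = new ArrayList<>(column.size());
        for (O v : column) out.add(t.transform(v));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> created = List.of(1_600_000_000, 1_600_000_001);
        // A built-in int->long widening would lose the semantic change; a
        // user-supplied transformer can do the real conversion in one pass.
        System.out.println(migrateColumn(created, SECONDS_TO_MILLIS));
    }
}
```

The point being: the widening cases are easy for the engine to do automatically, but only the user knows that a particular int column was actually epoch-seconds.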
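On question 7's second case: here's a plain-Java sketch of what a pluggable per-key conflict resolver could look like, with last-writer-wins as the default and a domain-aware override. The names are hypothetical (CacheVersionConflictResolver is internal, not a public API), so this is just the shape I'm hoping the new API exposes:

```java
/**
 * Sketch of a pluggable per-key conflict resolver. All names are
 * hypothetical; this is not Ignite's actual interface.
 */
public class ConflictResolverSketch {

    /** A versioned value: the payload plus a logical timestamp/version. */
    public record Versioned<V>(V value, long version) {}

    /** Resolves a conflict between the locally stored entry and an incoming one. */
    public interface ConflictResolver<V> {
        Versioned<V> resolve(Versioned<V> existing, Versioned<V> incoming);
    }

    /** The default behaviour: whichever write carries the newer version wins. */
    public static <V> ConflictResolver<V> lastWriterWins() {
        return (existing, incoming) ->
                incoming.version() >= existing.version() ? incoming : existing;
    }

    /**
     * A domain-specific resolver: instead of dropping one side, merge the two
     * values - here by taking the max, e.g. a monotonically increasing counter.
     */
    public static ConflictResolver<Long> mergeByMax() {
        return (existing, incoming) -> new Versioned<>(
                Math.max(existing.value(), incoming.value()),
                Math.max(existing.version(), incoming.version()));
    }

    public static void main(String[] args) {
        Versioned<Long> local = new Versioned<>(10L, 5);
        Versioned<Long> remote = new Versioned<>(7L, 6);

        // Last-writer-wins keeps the remote value (newer version), silently
        // losing the larger counter - "correct" for some domains, wrong here.
        System.out.println(ConflictResolverSketch.<Long>lastWriterWins().resolve(local, remote).value()); // 7

        // The domain-aware resolver keeps the larger counter instead.
        System.out.println(mergeByMax().resolve(local, remote).value()); // 10
    }
}
```

If something along these lines were overridable per table/cache, last-writer-wins could stay the default while letting each domain define what "correct" means in a conflict.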