On September 2 Courtney will tell more about Hypi use case, join us at Virtual Meetup: https://www.meetup.com/Apache-Ignite-Virtual-Meetup/events/280030600/
Below I quote Courtney's email to the user list: The talk will look at the general architecture, playing nice with other > technologies and the future reactive-streams architecture Hypi's team > started work on with a view for being in production in 2022. We'll look at why we're transitioning to reactive streams and how Ignite is > being used to accelerate other technologies in the stack. Just like Ignite > 3, the next-gen of Hypi itself is using Apache Calcite as a data > virtualisation layer. The talk will explore a little of how this is being > done and finally will discuss some challenges and pitfalls to avoid. вт, 3 авг. 2021 г. в 12:38, Ivan Pavlukhin <vololo...@gmail.com>: > Hi Courtney, > > Statistics for query planning is a rather complex subject. > Unfortunately I am now aware of technical details how is it expected > to be implemented in Ignite 3. > > Folks driving SQL please step in. > > 2021-07-31 20:18 GMT+03:00, Courtney Robinson <courtney.robin...@hypi.io>: > > Hi Ivan, > > Atri's description of the query plan being cached is what I was thinking > of > > with my description. > > > > I lack the knowledge on how the statistics are maintained to really > comment > > constructively Atri but my first question about the problem you raise > with > > statistics would be: > > > > How/where are the stats maintained and if a query plan is cached based on > > some stats, is it not possible to invalidate the cached plan periodically > > or based on statistics changes? > > > > Regards, > > Courtney Robinson > > Founder and CEO, Hypi > > Tel: ++44 208 123 2413 (GMT+0) <https://hypi.io> > > > > <https://hypi.io> > > https://hypi.io > > > > > > On Sat, Jul 31, 2021 at 8:54 AM Atri Sharma <a...@apache.org> wrote: > > > >> Query caching works on three levels - caching results, caching blocks > and > >> caching query plans. > >> > >> Prepared queries work by caching a plan for a query and reusing that > plan > >> by changing the parameters for the incoming query. So the query remains > >> the > >> same, but input values keep changing. > >> > >> The problem with prepared queries is that query execution can go bad > very > >> fast if the underlying data distribution changes and the cached plan is > >> no > >> longer optimal for the given statistics. > >> > >> On Sat, 31 Jul 2021, 12:54 Ivan Pavlukhin, <vololo...@gmail.com> wrote: > >> > >> > Hi Courtney, > >> > > >> > Please clarify what do you mean by prepared queries and query caching? > >> > Do you mean caching query results? If so, in my mind material views > >> > are the best approach here (Ignite 2 does not support them). Do you > >> > have other good approaches in your mind? E.g. implemented in other > >> > databases. > >> > > >> > 2021-07-26 21:27 GMT+03:00, Valentin Kulichenko < > >> > valentin.kuliche...@gmail.com>: > >> > > Hi Courtney, > >> > > > >> > > Generally speaking, query caching certainly makes sense. As far as I > >> > know, > >> > > Ignite 2.x actually does that, but most likely there might be room > >> > > for > >> > > improvement as well. We will look into this. > >> > > > >> > > As for the SQL API - the answer is yes. The requirement for a dummy > >> cache > >> > > is an artifact of the current architecture. This is 100% wrong and > >> > > will > >> > be > >> > > changed in 3.0. > >> > > > >> > > -Val > >> > > > >> > > On Sun, Jul 25, 2021 at 2:51 PM Courtney Robinson > >> > > <courtney.robin...@hypi.io> > >> > > wrote: > >> > > > >> > >> Something else came to mind, are there plans to support prepared > >> > queries? > >> > >> > >> > >> I recall someone saying before that Ignite does internally cache > >> queries > >> > >> but it's not at all clear if or how it does do that. I assume a > >> > >> simple > >> > >> hash > >> > >> of the query isn't enough. > >> > >> > >> > >> We generate SQL queries based on user runtime settings and they can > >> get > >> > >> to > >> > >> hundreds of lines long, I imagine this means most of our queries > are > >> not > >> > >> being cached but there are patterns so we could generate and manage > >> > >> prepared queries ourselves. > >> > >> > >> > >> Also, will there be a dedicated API for doing SQL queries rather > >> > >> than > >> > >> having to pass a SqlFieldsQuery to a cache that has nothing to do > >> > >> with > >> > >> the > >> > >> cache being queried? When I first started with Ignite years ago, > >> > >> this > >> > was > >> > >> beyond confusing for me. I'm trying to run select x from B but I > >> > >> pass > >> > >> this > >> > >> to a cache called DUMMY or whatever arbitrary name... > >> > >> > >> > >> On Fri, Jul 23, 2021 at 4:05 PM Courtney Robinson < > >> > >> courtney.robin...@hypi.io> > >> > >> wrote: > >> > >> > >> > >> > Andrey, > >> > >> > Thanks for the response - see my comments inline. > >> > >> > > >> > >> > > >> > >> >> I've gone through the questions and have no the whole picture of > >> your > >> > >> use > >> > >> >> case. > >> > >> > > >> > >> > Would you please clarify how you exactly use the Ignite? what are > >> the > >> > >> >> integration points? > >> > >> >> > >> > >> > > >> > >> > I'll try to clarify - we have a low/no code platform. A user > >> designs a > >> > >> > model for their application and we map this model to Ignite > tables > >> and > >> > >> > other data sources. The model I'll describe is what we're > building > >> now > >> > >> and > >> > >> > expected to be in alpha some time in Q4 21. Our current > production > >> > >> > architecture is different and isn't as generic, it is heavily > tied > >> to > >> > >> > Ignite and we've redesigned to get some flexibility where Ignite > >> > >> > doesn't > >> > >> > provide what we want. Things like window functions and other > >> > >> > SQL-99 > >> > >> limits. > >> > >> > > >> > >> > In the next gen version we're working on you can create a model > >> > >> > for > >> a > >> > >> > Tweet(content, to) and we will create an Ignite table with > content > >> and > >> > >> > to > >> > >> > columns using the type the user selects. This is the simplest > >> > >> > case. > >> > >> > We are adding generic support for sources and sinks and using > >> Calcite > >> > >> > as > >> > >> a > >> > >> > data virtualisation layer. Ignite is one of the available > >> > source/sinks. > >> > >> > > >> > >> > When a user creates a model for Tweet, we also allow them to > >> > >> > specify > >> > >> > how > >> > >> > they want to index the data. We have a copy of the calcite > >> > >> > Elasticsearch > >> > >> > adapter modified for Solr. > >> > >> > > >> > >> > When a source is queried (Ignite or any other that we support), > we > >> > >> > generate SQL that Calcite executes. Calcite will push down the > >> > >> > generated > >> > >> > queries to Solr and Solr produces a list of IDs (in case of > >> > >> > Ignite) > >> > and > >> > >> we > >> > >> > do a multi-get from Ignite to produce the actual results. > >> > >> > > >> > >> > Obviously there's a lot more to this but that should give you a > >> > general > >> > >> > idea. > >> > >> > > >> > >> > and maybe share some experience with using Ignite SPIs? > >> > >> >> > >> > >> > Our evolution with Ignite started from the key value + compute > >> > >> > APIs. > >> > We > >> > >> > used the SPIs then but have since moved to using only the Ignite > >> > >> > SQL > >> > >> > API > >> > >> > (we gave up transactions for this). > >> > >> > > >> > >> > We originally used the indexing SPI to keep our own lucene index > >> > >> > of > >> > >> > data > >> > >> > in a cache. We did not use the Ignite FTS as it is very limited > >> > >> > compared > >> > >> to > >> > >> > what we allow customers to do. If I remember correctly, we were > >> using > >> > >> > an > >> > >> > affinity compute job to send queries to the right Ignite node and > >> > >> > then doing a multi-get to pull the data from caches. > >> > >> > I think we used one or two other SPIs and we found them very > >> > >> > useful > >> to > >> > >> > be > >> > >> > able to extend and customise Ignite without having to fork/change > >> > >> upstream > >> > >> > classes. We only stopped using them because we eventually > >> > >> > concluded > >> > >> > that > >> > >> > using the SQL only API was better for numerous reasons. > >> > >> > > >> > >> > > >> > >> >> We'll keep the information in mind while developing the Ignite, > >> > >> >> because this may help us to make a better product. > >> > >> >> > >> > >> >> By the way, I'll try to answer the questions. > >> > >> >> > >> > >> >> > 1. Schema change - does that include the ability to change > >> > >> >> > the > >> > >> >> > types > >> > >> >> of > >> > >> >> > fields/columns? > >> > >> >> Yes, we plan to support transparent conversion to a wider type > >> on-fly > >> > >> >> (e.g. > >> > >> >> 'int' to 'long'). > >> > >> >> This is a major point of our Live-schema concept. > >> > >> >> In fact, there is no need to convert data on all the nodes in a > >> > >> >> synchronous > >> > >> >> way as old SQL databases do (if one supports though), > >> > >> >> we are going to support multiple schema versions and convert > data > >> > >> >> on-demand > >> > >> >> on a per-row basis to the latest version, > >> > >> >> then write-back the row. > >> > >> >> > >> > >> > > >> > >> > I can understand. The auto conversion to wider type makes sense. > >> > >> > > >> > >> >> > >> > >> >> More complex things like 'String' -> 'int' are out of scope for > >> > >> >> now > >> > >> >> because > >> > >> >> it requires the execution of a user code on the critical path. > >> > >> >> > >> > >> > > >> > >> > I would argue though that executing user code on the critical > path > >> > >> > shouldn't be a blocker for custom conversions. I feel if a user > is > >> > >> > making > >> > >> > an advance enough integration to provide custom conversions they > >> would > >> > >> > be > >> > >> > aware that it impacts the system as a whole. > >> > >> > > >> > >> > The limitation here is column MUST NOT be indexed, because an > >> > >> > index > >> > >> > over > >> > >> >> the data of different kinds is impossible. > >> > >> >> > >> > >> > Understood - I'd make the case that indexing should be > pluggable. > >> > >> > I > >> > >> would > >> > >> > love for us to be able to take indexing away from Ignite in our > >> impl. > >> > - > >> > >> > I > >> > >> > think in Calcite, the Postgres adapter does this by having a > table > >> > >> > whose > >> > >> > type is "Index". The implementor would be left with the freedom > to > >> > >> > choose > >> > >> > how that table answers index lookups. From Ignite's perspective > it > >> > >> wouldn't > >> > >> > care so long as the interface's contract is met, I could use an > >> index > >> > >> that > >> > >> > does a lucene, ES, Solr or Redis lookup and the end result would > >> > >> > be > >> > the > >> > >> > same but as the implementor I'm choosing the tradeoff I want to > >> > >> > meet > >> > >> > the > >> > >> > organisation's goals. > >> > >> > > >> > >> > > >> > >> >> > >> > >> >> > 2. Will the new guaranteed consistency between APIs also > mean > >> SQL > >> > >> will > >> > >> >> > gain transaction support? > >> > >> >> Yes, we plan to have Transactional SQL. > >> > >> >> DDL will be non-transactional though, and I wonder if the one > >> > supports > >> > >> >> this. > >> > >> >> > >> > >> > I'm not sure I know of any thing that supports transactional DDL > >> > >> > so > >> > >> > don't > >> > >> > think this is an issue but I would say that a DDL statement in a > >> > >> > transaction shouldn't fail the transaction. I believe in Ignite 2 > >> > there > >> > >> is > >> > >> > a flag to turn this on or off, we should definitely keep this. > In > >> our > >> > >> > case, it's an issue with the nature of the platform we provide, > at > >> > >> > development time only about 10% of schema or other DB info is > >> > >> > known > >> - > >> > >> > we > >> > >> > generate the other 90% on the fly based on whatever customers > >> > >> > decide > >> > to > >> > >> > design from our UI. > >> > >> > > >> > >> >> > >> > >> >> Ignite 3 will operate with Rows underneath, but classic Table > API > >> and > >> > >> >> Key-value will be available to a user > >> > >> >> at the same time and with all consistency guarantees. > >> > >> > > >> > >> > Excellent! > >> > >> > > >> > >> >> > >> > >> >> > >> > >> > > >> > >> >> > 3. Has there been any decision about how much of Calcite will > >> > >> >> > be > >> > >> >> exposed > >> > >> >> > to the client? When using thick clients, it'll be hugely > >> > >> >> > beneficial > >> > >> to > >> > >> >> be > >> > >> >> > able to work with Calcite APIs directly to provide custom > >> > >> >> > rules > >> > >> >> > and > >> > >> >> > optimizations to better suit organization needs > >> > >> >> As of now, we have no plans to expose any Calcite API to a user. > >> > >> >> AFAIK, we have our custom Calcite convention, custom rules that > >> > >> >> are > >> > >> aware > >> > >> >> of distributed environment, > >> > >> >> and additional AST nodes. The rules MUST correctly propagate > >> internal > >> > >> >> information about data distribution, > >> > >> >> so I'm not sure want to give low-level access to them. > >> > >> >> > >> > >> > > >> > >> > Maybe we're an edge case but for us access to the Calcite APIs > >> > >> > would > >> > be > >> > >> > shift our current development somewhat. For us, we're treating > >> Ignite > >> > >> > as > >> > >> a > >> > >> > library that provides a good foundation and we extend and > >> > >> > customise > >> > it. > >> > >> > Again, we may be an edge case and maybe most people just want a > >> > >> > database > >> > >> to > >> > >> > put data into and get it back out without controlling some of how > >> > >> > it > >> > >> > does > >> > >> > those things. > >> > >> > > >> > >> > > >> > >> >> > We Index into Solr and use the Solr indices > >> > >> >> Ignite 1-2 has poor support for TEXT queries, which is totally > >> > >> >> unconfigurable. > >> > >> >> Also, Lucene indices underneath are NOT persistent that requires > >> too > >> > >> much > >> > >> >> effort to fix it. > >> > >> >> GeoSpatial index has the same issues, we decided to drop them > >> > >> >> along > >> > >> >> with > >> > >> >> Indexing SPI at all. > >> > >> >> > >> > >> >> However, you can find the activity on dev-list on the Index > Query > >> > >> >> topic. > >> > >> >> Guys are going to add IndexQuery (a scan query over the sorted > >> index > >> > >> which > >> > >> >> can use simple conditions) in Ignite 2. > >> > >> >> We also plan to have the same functionality, maybe it is > possible > >> to > >> > >> >> add > >> > >> >> full-text search support here. > >> > >> >> Will it work for you, what do you think? > >> > >> >> > >> > >> > Yes, we originally looked at text queries and almost immediately > >> said > >> > >> > no. > >> > >> > Nothing about it was useful for us other than the lucene > >> > >> > dependency > >> in > >> > >> > Java. In the end that also became an issue because we wanted a > >> > >> > newer > >> > >> lucene > >> > >> > version. > >> > >> > IndexQuery will be useful - we'll certainly use it but it's not > >> > enough. > >> > >> > I > >> > >> > think we customise and depend on Solr too much for IndexQuery to > >> > >> > compare > >> > >> > but it will help in some cases for simpler queries. > >> > >> > > >> > >> >> > >> > >> >> > >> > >> >> > 4. Will the unified storage model enable different versions > >> > >> >> > of > >> > >> Ignite > >> > >> >> to > >> > >> >> > be in the cluster when persistence is enabled so that > rolling > >> > >> restarts > >> > >> >> can > >> > >> >> > be done? > >> > >> >> I'm not sure a rolling upgrade (RU) will be available because > too > >> > much > >> > >> >> compatibility issues should be resolved > >> > >> >> to make RU possible under the load without downtime. > >> > >> >> > >> > >> >> Maybe it makes sense to provide some grid mode (maintenance > mode) > >> for > >> > >> >> RU > >> > >> >> purposes that will block all the user load > >> > >> >> but allow upgrade the grid. E.g. for the pure in-memory case. > >> > >> >> > >> > >> >> Persistence compatibility should be preserved as it works for > >> Ignite > >> > >> >> 2. > >> > >> >> > >> > >> > My ideal situation would be that we start a newer Ignite version, > >> > >> > it > >> > >> comes > >> > >> > online, joins the cluster and is treated as some kind of > >> > >> > maintenance > >> > >> > mode > >> > >> > as you suggested. In maintenance mode, the other nodes re-balance > >> > >> > or > >> > >> > some > >> > >> > other process to send all the data this new node will handle over > >> > >> > to > >> > >> > it. > >> > >> > The existing nodes continue serving this data until the new node > >> > >> > is > >> no > >> > >> > longer in maintenance mode and then it becomes the primary for > the > >> > data > >> > >> > that was rebalanced to it. > >> > >> > > >> > >> > The second case is if an existing node is restarted with a newer > >> > Ignite > >> > >> > version. No re-balance is needed, it joins in maintenance mode, > >> > >> > runs > >> > >> > any > >> > >> > upgrade/conversion or other task it needs to and then starts > >> accepting > >> > >> > reads and writes. Communication with lower version nodes can be > >> > >> > limited, > >> > >> > they are aware of it and sends it data and queries for which it > is > >> the > >> > >> > primary assuming they will also be upgraded. > >> > >> > > >> > >> > I guess I'm not aware of the compatibility issues this presents > >> > >> > and > >> so > >> > >> > my > >> > >> > view is narrow and perhaps naive here. > >> > >> > > >> > >> >> > >> > >> >> > >> > >> >> > 5. Will it be possible to provide a custom cache store > still > >> and > >> > >> will > >> > >> >> > these changes enable custom cache stores to be queryable > from > >> > SQL? > >> > >> >> I'm not sure I fully understand this. > >> > >> >> 1. Usually, SQL is about indices. Ignite can't perform a query > >> > >> >> over > >> > >> >> the > >> > >> >> unindexed data. > >> > >> >> > >> > >> > Yes understood > >> > >> > > >> > >> >> > >> > >> >> 2. Fullscan over the cache that contains only part of data + > scan > >> the > >> > >> >> CacheStore, then merging the results is a pain. > >> > >> >> Most likely, running a query over CacheStore directly will be a > >> > >> >> simpler > >> > >> >> way, and even more performant. > >> > >> >> Shared CacheStore (same for all nodes) will definitely kill the > >> > >> >> performance > >> > >> >> in that case. > >> > >> >> So, the preliminary loadCache() call looks like a good > >> > >> >> compromise. > >> > >> >> > >> > >> > I think the problem is largely that the CacheStore interface is > >> > >> > not > >> > >> > sufficient for being able to do this. If it had a richer > interface > >> > >> > which > >> > >> > allowed the cache store to answer index queries basically hooking > >> into > >> > >> > whatever Ignite's doing for its B+tree then this would be viable. > >> > >> > A > >> > >> > CacheStore that only implements KV API doesn't take part in SQL > >> > >> > queries. > >> > >> > > >> > >> >> > >> > >> >> 3. Splitting query into 2 parts to run on Ignite and to run on > >> > >> CacheStore > >> > >> >> looks possible with Calcite, > >> > >> >> but I think it impractical because in general, neither > CacheStore > >> nor > >> > >> >> database structure are aware of the data partitioning. > >> > >> >> > >> > >> > hmmm, maybe I missed the point but as the implementor of the > >> > CacheStore > >> > >> > you should have knowledge of the structure and partition info. or > >> have > >> > >> some > >> > >> > way of retrieving it. Again, I think the current CacheStore > >> interface > >> > >> > is > >> > >> > the problem and if it was extended to provide this information > >> > >> > then > >> > its > >> > >> up > >> > >> > to the implementation to do this whilst Ignite knows that any > >> > >> > implementation of these interfaces will meet the contract > >> > >> > necessary. > >> > >> > > >> > >> > > >> > >> >> > >> > >> >> 4. Transactions can't be supported in case of direct CacheStore > >> > >> >> access, > >> > >> >> because even if the underlying database supports 2-phase commit, > >> > which > >> > >> is > >> > >> >> a > >> > >> >> rare case, the recovery protocol looks hard. > >> > >> >> Just looks like this feature doesn't worth it. > >> > >> >> > >> > >> > I'd completely agree with this. It will be incredibly hard to get > >> this > >> > >> > done reliably > >> > >> > > >> > >> >> > >> > >> >> > >> > >> >> > 6. This question wasn't mine but I was going to ask it as > >> > >> >> > well: > >> > >> >> > What > >> > >> >> > will happen to the Indexing API since H2 is being removed? > >> > >> >> As I wrote above, Indexing SPI will be dropped, but IndexQuery > >> > >> >> will > >> > be > >> > >> >> added. > >> > >> >> > >> > >> >> > 1. As I mentioned above, we Index into Solr, in earlier > >> > >> >> > versions > >> > of > >> > >> >> > our product we used the indexing SPI to index into Lucene > >> > >> >> > on > >> > >> >> > the > >> > >> >> Ignite > >> > >> >> > nodes but this presented so many challenges we ultimately > >> > >> abandoned > >> > >> >> it and > >> > >> >> > replaced it with the current Solr solution. > >> > >> >> AFAIK, some guys developed and sell a plugin for Ignite-2 with > >> > >> persistent > >> > >> >> Lucene and Geo indices. > >> > >> >> I don't know about the capabilities and limitations of their > >> > solution, > >> > >> >> because of closed code. > >> > >> >> You can easily google it. > >> > >> >> > >> > >> >> I saw few encouraged guys who want to improve TEXT queries, > >> > >> >> but unfortunately, things weren't moved far enough. For now, > they > >> are > >> > >> >> in > >> > >> >> the middle of fixing the merging TEXT query results. > >> > >> >> So far so good. > >> > >> >> > >> > >> >> I think it is a good chance to master the skill developing of a > >> > >> >> distributed > >> > >> >> system for the one > >> > >> >> who will take a lead over the full-text search feature and add > >> native > >> > >> >> FullText index support into Ignite-3. > >> > >> >> > >> > >> > I've seen the other thread from Atri I believe about this. > >> > >> > > >> > >> >> > >> > >> >> > >> > >> >> > 7. What impact does RAFT now have on conflict resolution? > >> > >> >> RAFT is a state machine replication protocol. It guarantees all > >> > >> >> the > >> > >> nodes > >> > >> >> will see the updates in the same order. > >> > >> >> So, seems no conflicts are possible. Recovery from split-brain > is > >> > >> >> impossible in common-case. > >> > >> >> > >> > >> >> However, I think we have a conflict resolver analog in Ignite-3 > >> > >> >> as > >> it > >> > >> >> is > >> > >> >> very useful in some cases > >> > >> >> e.g datacenter replication, incremental data load from 3-rd > party > >> > >> source, > >> > >> >> recovery from 3-rd party source. > >> > >> >> > >> > >> >> > >> > >> >> > 8. CacheGroups. > >> > >> >> AFAIK, CacheGroup will be eliminated, actually, we'll keep this > >> > >> mechanic, > >> > >> >> but it will be configured in a different way, > >> > >> >> which makes Ignite configuring a bit simpler. > >> > >> >> Sorry, for now, I have no answer on your performance concerns, > >> > >> >> this > >> > >> >> part > >> > >> >> of > >> > >> >> Ignite-3 slipped from my radar. > >> > >> >> > >> > >> > No worries. I'll wait and see if anyone else suggests something. > >> > >> > Its > >> > >> > getting a lot worse, a node took 1hr to start yesterday after a > >> > >> deployment > >> > >> > and its in prod with very little visibility into what it is > doing, > >> it > >> > >> > was > >> > >> > just stopped, no logging or anything and then resumed. > >> > >> > > >> > >> > 2021-07-22 13:40:15.997 INFO [ArcOS,,,] 9 --- [orker-#40%hypi%] > >> > >> > o.a.i.i.p.cache.GridCacheProcessor [285] : Finished > recovery > >> for > >> > >> > cache [cache=hypi_01F8ZC3DGT66RNYCDZH3XNVY2E_Hue, grp=hypi, > >> > >> > startVer=AffinityTopologyVersion [topVer=79, minorTopVer=0]] > >> > >> > > >> > >> > One hour later it printed the next cache recovery message and > >> started > >> > >> > 30 > >> > >> > seconds after going through other tables. > >> > >> > > >> > >> > > >> > >> > > >> > >> >> > >> > >> >> Let's wait if someone will clarify what we could expect in > >> Ignite-3. > >> > >> >> Guys, can someone chime in and give more light on 3,4,7,8 > >> questions? > >> > >> >> > >> > >> >> > >> > >> >> On Thu, Jul 22, 2021 at 4:15 AM Courtney Robinson < > >> > >> >> courtney.robin...@hypi.io> > >> > >> >> wrote: > >> > >> >> > >> > >> >> > Hey everyone, > >> > >> >> > I attended the Alpha 2 update yesterday and was quite pleased > >> > >> >> > to > >> > see > >> > >> the > >> > >> >> > progress on things so far. So first, congratulations to > >> > >> >> > everyone > >> on > >> > >> the > >> > >> >> > work being put in and thank you to Val and Kseniya for running > >> > >> >> yesterday's > >> > >> >> > event. > >> > >> >> > > >> > >> >> > I asked a few questions after the webinar which Val had some > >> > answers > >> > >> to > >> > >> >> but > >> > >> >> > suggested posting here as some of them are not things that > have > >> > been > >> > >> >> > thought about yet or no plans exist around it at this point. > >> > >> >> > > >> > >> >> > I'll put all of them here and if necessary we can break into > >> > >> >> > different > >> > >> >> > threads after. > >> > >> >> > > >> > >> >> > 1. Schema change - does that include the ability to change > >> > >> >> > the > >> > >> types > >> > >> >> of > >> > >> >> > fields/columns? > >> > >> >> > 1. Val's answer was yes with some limitations but those > >> > >> >> > are > >> > >> >> > not > >> > >> >> well > >> > >> >> > defined yet. He did mention that something like some > kind > >> of > >> > >> >> > transformer > >> > >> >> > could be provided for doing the conversion and I would > >> second > >> > >> >> this, > >> > >> >> > even > >> > >> >> > for common types like int to long being able to do a > >> > >> >> > custom > >> > >> >> > conversion will > >> > >> >> > be immensely valuable. > >> > >> >> > 2. Will the new guaranteed consistency between APIs also > >> > >> >> > mean > >> > SQL > >> > >> >> will > >> > >> >> > gain transaction support? > >> > >> >> > 1. I believe the answer here was yes but perhaps someone > >> else > >> > >> may > >> > >> >> > want to weigh in to confirm > >> > >> >> > 3. Has there been any decision about how much of Calcite > >> > >> >> > will > >> be > >> > >> >> exposed > >> > >> >> > to the client? When using thick clients, it'll be hugely > >> > >> >> > beneficial > >> > >> >> to > >> > >> >> > be > >> > >> >> > able to work with Calcite APIs directly to provide custom > >> rules > >> > >> >> > and > >> > >> >> > optimisations to better suit organisation needs > >> > >> >> > 1. We currently use Calcite ourselves and have a lot of > >> > >> >> > custom > >> > >> rules > >> > >> >> and > >> > >> >> > optimisations and have slowly pushed more of our queries > >> > >> >> > to > >> > >> >> > Calcite that we > >> > >> >> > then push down to Ignite. > >> > >> >> > 2. We Index into Solr and use the Solr indices and > others > >> to > >> > >> >> > fulfill over all queries with Ignite just being one of > >> > >> >> > the > >> > >> >> > possible storage > >> > >> >> > targets Calcite pushes down to. If we could get to the > >> > calcite > >> > >> >> > API from an > >> > >> >> > Ignite thick client, it would enable us to remove a > layer > >> of > >> > >> >> > abstraction > >> > >> >> > and complexity and make Ignite our primary that we then > >> link > >> > >> >> > with Solr and > >> > >> >> > others to fulfill queries. > >> > >> >> > 4. Will the unified storage model enable different versions > >> > >> >> > of > >> > >> >> Ignite to > >> > >> >> > be in the cluster when persistence is enabled so that > >> > >> >> > rolling > >> > >> >> restarts > >> > >> >> > can > >> > >> >> > be done? > >> > >> >> > 1. We have to do a strange dance to perform Ignite upgrades > >> > >> >> > without > >> > >> >> > downtime because pods/nodes will fail to start on > version > >> > >> mismatch > >> > >> >> > and if > >> > >> >> > we get that dance wrong, we will corrupt a node's data. > >> > >> >> > It > >> > >> >> > will > >> > >> >> make > >> > >> >> > admin/upgrades far less brittle and error prone if this > >> > >> >> > was > >> > >> >> possible. > >> > >> >> > 5. Will it be possible to provide a custom cache store > still > >> and > >> > >> will > >> > >> >> > these changes enable custom cache stores to be queryable > >> > >> >> > from > >> > >> >> > SQL? > >> > >> >> > 1. Our Ignite usage is wide and complex because we use KV, > >> > >> >> > SQL > >> > >> >> > and > >> > >> >> other > >> > >> >> > APIs. The inconsistency of what can and can't be used > >> > >> >> > from > >> > one > >> > >> >> API to > >> > >> >> > another is a real challenge and has forced us over time > >> > >> >> > to > >> > >> >> > stick > >> > >> >> > to one API > >> > >> >> > and write alternative solutions outside of Ignite. It > >> > >> >> > will > >> > >> >> > drastically > >> > >> >> > simplify things if any CacheStore (or some new > >> > >> >> > equivalent) > >> > >> >> > could > >> > >> >> > be plugged > >> > >> >> > in and be made accessible to SQL (and in fact all other > >> APIs) > >> > >> >> without > >> > >> >> > having to load all the data from the underlying > >> > >> >> > CacheStore > >> > >> >> > first > >> > >> >> > into memory > >> > >> >> > 6. This question wasn't mine but I was going to ask it as > >> well: > >> > >> What > >> > >> >> > will happen to the Indexing API since H2 is being removed? > >> > >> >> > 1. As I mentioned above, we Index into Solr, in earlier > >> > >> >> > versions > >> > >> >> of > >> > >> >> > our product we used the indexing SPI to index into > Lucene > >> on > >> > >> >> > the > >> > >> >> > Ignite > >> > >> >> > nodes but this presented so many challenges we > ultimately > >> > >> >> > abandoned it and > >> > >> >> > replaced it with the current Solr solution. > >> > >> >> > 2. Lucene indexing was ideal because it meant we didn't > >> have > >> > >> >> > to > >> > >> >> > re-invent Solr or Elasticsearch's sharding capabilities, > >> that > >> > >> was > >> > >> >> > almost > >> > >> >> > automatic with Ignite only giving you the data that was > >> meant > >> > >> for > >> > >> >> the > >> > >> >> > current node. > >> > >> >> > 3. The Lucene API enabled more flexibility and removed a > >> > >> >> > network > >> > >> >> > round trip from our queries. > >> > >> >> > 4. Given Calcite's ability to support custom SQL > >> > >> >> > functions, > >> > >> >> > I'd > >> > >> >> love > >> > >> >> > to have the ability to define custom functions that > >> > >> >> > Lucene > >> > was > >> > >> >> > answering > >> > >> >> > 7. What impact does RAFT now have on conflict resolution, > >> > >> >> > off > >> > the > >> > >> >> top of > >> > >> >> > my head there are two cases > >> > >> >> > 1. On startup after a split brain Ignite currently takes > >> > >> >> > an > >> > >> >> "exercise > >> > >> >> > for the reader" approach and dumps a log along the lines > >> > >> >> > of > >> > >> >> > > >> > >> >> > > 1. BaselineTopology of joining node is not compatible > with > >> > >> >> > > BaselineTopology in the cluster. > >> > >> >> > > 1. Branching history of cluster BlT doesn't contain > >> branching > >> > >> point > >> > >> >> > > hash of joining node BlT. Consider cleaning persistent > >> > >> >> > > storage > >> > >> >> of > >> > >> >> > the node > >> > >> >> > > and adding it to the cluster again. > >> > >> >> > > > >> > >> >> > 1. This leaves you with no choice except to take one half > >> > >> >> > and > >> > >> >> manually > >> > >> >> > copy, write data back over to the other half then > destroy > >> the > >> > >> bad > >> > >> >> > one. > >> > >> >> > 2. The second case is conflicts on keys, I > >> > >> >> > beleive CacheVersionConflictResolver and manager are > used > >> > >> >> > by GridCacheMapEntry which just says if use old value do > >> this > >> > >> >> > otherwise use > >> > >> >> > newVal. Ideally this will be exposed in the new API so > >> > >> >> > that > >> > >> >> > one > >> > >> >> can > >> > >> >> > override this behaviour. The last writer wins approach > >> isn't > >> > >> >> always > >> > >> >> > ideal > >> > >> >> > and the semantics of the domain can mean that what is > >> > consider > >> > >> >> > "correct" in > >> > >> >> > a conflict is not so for a different domain. > >> > >> >> > 8. This is last on the list but is actually the most > >> > >> >> > important > >> > >> >> > for > >> > >> us > >> > >> >> > right now as it is an impending and growing risk. We allow > >> > >> customers > >> > >> >> to > >> > >> >> > create their own tables on demand. We're already using the > >> same > >> > >> cache > >> > >> >> > group > >> > >> >> > etc for data structures to be re-used but now that we're > >> getting > >> > >> >> > to > >> > >> >> > thousands of tables/caches our startup times are sometimes > >> > >> >> unpredictably > >> > >> >> > long - at present it seems to depend on the state of the > >> > >> cache/table > >> > >> >> > before > >> > >> >> > the restart but we're into the order of 5 - 7 mins and > >> steadily > >> > >> >> > increasing > >> > >> >> > with the growth of tables. Are there any provisions in > >> > >> >> > Ignite > >> 3 > >> > >> >> > for > >> > >> >> > ensuring startup time isn't proportional to the number of > >> > >> >> tables/caches > >> > >> >> > available? > >> > >> >> > > >> > >> >> > > >> > >> >> > Those are the key things I can think of at the moment. Val and > >> > >> >> > others > >> > >> >> I'd > >> > >> >> > love to open a conversation around these. > >> > >> >> > > >> > >> >> > Regards, > >> > >> >> > Courtney Robinson > >> > >> >> > Founder and CEO, Hypi > >> > >> >> > Tel: ++44 208 123 2413 (GMT+0) <https://hypi.io> > >> > >> >> > > >> > >> >> > <https://hypi.io> > >> > >> >> > https://hypi.io > >> > >> >> > > >> > >> >> > >> > >> >> > >> > >> >> -- > >> > >> >> Best regards, > >> > >> >> Andrey V. Mashenkov > >> > >> >> > >> > >> > > >> > >> > >> > > > >> > > >> > > >> > -- > >> > > >> > Best regards, > >> > Ivan Pavlukhin > >> > > >> > > > > > -- > > Best regards, > Ivan Pavlukhin >