Hi Courtney, Statistics for query planning is a rather complex subject. Unfortunately I am not aware of the technical details of how it is expected to be implemented in Ignite 3.
Folks driving SQL please step in. 2021-07-31 20:18 GMT+03:00, Courtney Robinson <courtney.robin...@hypi.io>: > Hi Ivan, > Atri's description of the query plan being cached is what I was thinking of > with my description. > > I lack the knowledge on how the statistics are maintained to really comment > constructively Atri but my first question about the problem you raise with > statistics would be: > > How/where are the stats maintained and if a query plan is cached based on > some stats, is it not possible to invalidate the cached plan periodically > or based on statistics changes? > > Regards, > Courtney Robinson > Founder and CEO, Hypi > Tel: ++44 208 123 2413 (GMT+0) <https://hypi.io> > > <https://hypi.io> > https://hypi.io > > > On Sat, Jul 31, 2021 at 8:54 AM Atri Sharma <a...@apache.org> wrote: > >> Query caching works on three levels - caching results, caching blocks and >> caching query plans. >> >> Prepared queries work by caching a plan for a query and reusing that plan >> by changing the parameters for the incoming query. So the query remains >> the >> same, but input values keep changing. >> >> The problem with prepared queries is that query execution can go bad very >> fast if the underlying data distribution changes and the cached plan is >> no >> longer optimal for the given statistics. >> >> On Sat, 31 Jul 2021, 12:54 Ivan Pavlukhin, <vololo...@gmail.com> wrote: >> >> > Hi Courtney, >> > >> > Please clarify what do you mean by prepared queries and query caching? >> > Do you mean caching query results? If so, in my mind material views >> > are the best approach here (Ignite 2 does not support them). Do you >> > have other good approaches in your mind? E.g. implemented in other >> > databases. >> > >> > 2021-07-26 21:27 GMT+03:00, Valentin Kulichenko < >> > valentin.kuliche...@gmail.com>: >> > > Hi Courtney, >> > > >> > > Generally speaking, query caching certainly makes sense. 
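[Editor's illustration of the invalidation idea raised above: cache a plan, but tie its validity to the statistics it was planned against, so a statistics refresh forces a replan. All names here are hypothetical; this is not an Ignite API.]

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a cached plan remembers the statistics version
// it was built against; a lookup discards it (forcing a replan) once
// table statistics have moved on.
public class PlanCache {
    record CachedPlan(String plan, long statsVersion) {}

    private final Map<String, CachedPlan> cache = new ConcurrentHashMap<>();

    // Monotonically increasing counter, bumped whenever statistics are
    // refreshed (e.g. after a significant data-distribution change).
    private volatile long currentStatsVersion = 0;

    public void statsRefreshed() { currentStatsVersion++; }

    public String getPlan(String sql) {
        CachedPlan p = cache.get(sql);
        // Stale plan: statistics changed since it was built, so replan.
        if (p == null || p.statsVersion() != currentStatsVersion) {
            String fresh = optimize(sql);
            cache.put(sql, new CachedPlan(fresh, currentStatsVersion));
            return fresh;
        }
        return p.plan();
    }

    // Stand-in for a real optimizer invocation.
    private String optimize(String sql) {
        return "PLAN[" + sql + "@stats" + currentStatsVersion + "]";
    }
}
```

Real engines are more sophisticated (per-table statistics versions, gradual invalidation, periodic re-planning), but the contract is the same: a cached plan must carry enough provenance to detect that the statistics it was built from are stale.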
As far as I >> > know, >> > > Ignite 2.x actually does that, but most likely there might be room >> > > for >> > > improvement as well. We will look into this. >> > > >> > > As for the SQL API - the answer is yes. The requirement for a dummy >> cache >> > > is an artifact of the current architecture. This is 100% wrong and >> > > will >> > be >> > > changed in 3.0. >> > > >> > > -Val >> > > >> > > On Sun, Jul 25, 2021 at 2:51 PM Courtney Robinson >> > > <courtney.robin...@hypi.io> >> > > wrote: >> > > >> > >> Something else came to mind, are there plans to support prepared >> > queries? >> > >> >> > >> I recall someone saying before that Ignite does internally cache >> queries >> > >> but it's not at all clear if or how it does do that. I assume a >> > >> simple >> > >> hash >> > >> of the query isn't enough. >> > >> >> > >> We generate SQL queries based on user runtime settings and they can >> get >> > >> to >> > >> hundreds of lines long, I imagine this means most of our queries are >> not >> > >> being cached but there are patterns so we could generate and manage >> > >> prepared queries ourselves. >> > >> >> > >> Also, will there be a dedicated API for doing SQL queries rather >> > >> than >> > >> having to pass a SqlFieldsQuery to a cache that has nothing to do >> > >> with >> > >> the >> > >> cache being queried? When I first started with Ignite years ago, >> > >> this >> > was >> > >> beyond confusing for me. I'm trying to run select x from B but I >> > >> pass >> > >> this >> > >> to a cache called DUMMY or whatever arbitrary name... >> > >> >> > >> On Fri, Jul 23, 2021 at 4:05 PM Courtney Robinson < >> > >> courtney.robin...@hypi.io> >> > >> wrote: >> > >> >> > >> > Andrey, >> > >> > Thanks for the response - see my comments inline. >> > >> > >> > >> > >> > >> >> I've gone through the questions and have no the whole picture of >> your >> > >> use >> > >> >> case. >> > >> > >> > >> > Would you please clarify how you exactly use the Ignite? 
what are >> the >> > >> >> integration points? >> > >> >> >> > >> > >> > >> > I'll try to clarify - we have a low/no code platform. A user >> designs a >> > >> > model for their application and we map this model to Ignite tables >> and >> > >> > other data sources. The model I'll describe is what we're building >> now >> > >> and >> > >> > expected to be in alpha some time in Q4 21. Our current production >> > >> > architecture is different and isn't as generic, it is heavily tied >> to >> > >> > Ignite and we've redesigned to get some flexibility where Ignite >> > >> > doesn't >> > >> > provide what we want. Things like window functions and other >> > >> > SQL-99 >> > >> limits. >> > >> > >> > >> > In the next gen version we're working on you can create a model >> > >> > for >> a >> > >> > Tweet(content, to) and we will create an Ignite table with content >> and >> > >> > to >> > >> > columns using the type the user selects. This is the simplest >> > >> > case. >> > >> > We are adding generic support for sources and sinks and using >> Calcite >> > >> > as >> > >> a >> > >> > data virtualisation layer. Ignite is one of the available >> > source/sinks. >> > >> > >> > >> > When a user creates a model for Tweet, we also allow them to >> > >> > specify >> > >> > how >> > >> > they want to index the data. We have a copy of the calcite >> > >> > Elasticsearch >> > >> > adapter modified for Solr. >> > >> > >> > >> > When a source is queried (Ignite or any other that we support), we >> > >> > generate SQL that Calcite executes. Calcite will push down the >> > >> > generated >> > >> > queries to Solr and Solr produces a list of IDs (in case of >> > >> > Ignite) >> > and >> > >> we >> > >> > do a multi-get from Ignite to produce the actual results. >> > >> > >> > >> > Obviously there's a lot more to this but that should give you a >> > general >> > >> > idea. >> > >> > >> > >> > and maybe share some experience with using Ignite SPIs? 
>> > >> >> >> > >> > Our evolution with Ignite started from the key value + compute >> > >> > APIs. >> > We >> > >> > used the SPIs then but have since moved to using only the Ignite >> > >> > SQL >> > >> > API >> > >> > (we gave up transactions for this). >> > >> > >> > >> > We originally used the indexing SPI to keep our own lucene index >> > >> > of >> > >> > data >> > >> > in a cache. We did not use the Ignite FTS as it is very limited >> > >> > compared >> > >> to >> > >> > what we allow customers to do. If I remember correctly, we were >> using >> > >> > an >> > >> > affinity compute job to send queries to the right Ignite node and >> > >> > then doing a multi-get to pull the data from caches. >> > >> > I think we used one or two other SPIs and we found them very >> > >> > useful >> to >> > >> > be >> > >> > able to extend and customise Ignite without having to fork/change >> > >> upstream >> > >> > classes. We only stopped using them because we eventually >> > >> > concluded >> > >> > that >> > >> > using the SQL only API was better for numerous reasons. >> > >> > >> > >> > >> > >> >> We'll keep the information in mind while developing the Ignite, >> > >> >> because this may help us to make a better product. >> > >> >> >> > >> >> By the way, I'll try to answer the questions. >> > >> >> >> > >> >> > 1. Schema change - does that include the ability to change >> > >> >> > the >> > >> >> > types >> > >> >> of >> > >> >> > fields/columns? >> > >> >> Yes, we plan to support transparent conversion to a wider type >> on-fly >> > >> >> (e.g. >> > >> >> 'int' to 'long'). >> > >> >> This is a major point of our Live-schema concept. 
>> > >> >> In fact, there is no need to convert data on all the nodes in a >> > >> >> synchronous >> > >> >> way as old SQL databases do (if one supports though), >> > >> >> we are going to support multiple schema versions and convert data >> > >> >> on-demand >> > >> >> on a per-row basis to the latest version, >> > >> >> then write-back the row. >> > >> >> >> > >> > >> > >> > I can understand. The auto conversion to wider type makes sense. >> > >> > >> > >> >> >> > >> >> More complex things like 'String' -> 'int' are out of scope for >> > >> >> now >> > >> >> because >> > >> >> it requires the execution of a user code on the critical path. >> > >> >> >> > >> > >> > >> > I would argue though that executing user code on the critical path >> > >> > shouldn't be a blocker for custom conversions. I feel if a user is >> > >> > making >> > >> > an advance enough integration to provide custom conversions they >> would >> > >> > be >> > >> > aware that it impacts the system as a whole. >> > >> > >> > >> > The limitation here is column MUST NOT be indexed, because an >> > >> > index >> > >> > over >> > >> >> the data of different kinds is impossible. >> > >> >> >> > >> > Understood - I'd make the case that indexing should be pluggable. >> > >> > I >> > >> would >> > >> > love for us to be able to take indexing away from Ignite in our >> impl. >> > - >> > >> > I >> > >> > think in Calcite, the Postgres adapter does this by having a table >> > >> > whose >> > >> > type is "Index". The implementor would be left with the freedom to >> > >> > choose >> > >> > how that table answers index lookups. From Ignite's perspective it >> > >> wouldn't >> > >> > care so long as the interface's contract is met, I could use an >> index >> > >> that >> > >> > does a lucene, ES, Solr or Redis lookup and the end result would >> > >> > be >> > the >> > >> > same but as the implementor I'm choosing the tradeoff I want to >> > >> > meet >> > >> > the >> > >> > organisation's goals. 
>> > >> > >> > >> > >> > >> >> >> > >> >> > 2. Will the new guaranteed consistency between APIs also mean >> SQL >> > >> will >> > >> >> > gain transaction support? >> > >> >> Yes, we plan to have Transactional SQL. >> > >> >> DDL will be non-transactional though, and I wonder if the one >> > supports >> > >> >> this. >> > >> >> >> > >> > I'm not sure I know of any thing that supports transactional DDL >> > >> > so >> > >> > don't >> > >> > think this is an issue but I would say that a DDL statement in a >> > >> > transaction shouldn't fail the transaction. I believe in Ignite 2 >> > there >> > >> is >> > >> > a flag to turn this on or off, we should definitely keep this. In >> our >> > >> > case, it's an issue with the nature of the platform we provide, at >> > >> > development time only about 10% of schema or other DB info is >> > >> > known >> - >> > >> > we >> > >> > generate the other 90% on the fly based on whatever customers >> > >> > decide >> > to >> > >> > design from our UI. >> > >> > >> > >> >> >> > >> >> Ignite 3 will operate with Rows underneath, but classic Table API >> and >> > >> >> Key-value will be available to a user >> > >> >> at the same time and with all consistency guarantees. >> > >> > >> > >> > Excellent! >> > >> > >> > >> >> >> > >> >> >> > >> > >> > >> >> > 3. Has there been any decision about how much of Calcite will >> > >> >> > be >> > >> >> exposed >> > >> >> > to the client? When using thick clients, it'll be hugely >> > >> >> > beneficial >> > >> to >> > >> >> be >> > >> >> > able to work with Calcite APIs directly to provide custom >> > >> >> > rules >> > >> >> > and >> > >> >> > optimizations to better suit organization needs >> > >> >> As of now, we have no plans to expose any Calcite API to a user. >> > >> >> AFAIK, we have our custom Calcite convention, custom rules that >> > >> >> are >> > >> aware >> > >> >> of distributed environment, >> > >> >> and additional AST nodes. 
The rules MUST correctly propagate >> internal >> > >> >> information about data distribution, >> > >> >> so I'm not sure want to give low-level access to them. >> > >> >> >> > >> > >> > >> > Maybe we're an edge case but for us access to the Calcite APIs >> > >> > would >> > be >> > >> > shift our current development somewhat. For us, we're treating >> Ignite >> > >> > as >> > >> a >> > >> > library that provides a good foundation and we extend and >> > >> > customise >> > it. >> > >> > Again, we may be an edge case and maybe most people just want a >> > >> > database >> > >> to >> > >> > put data into and get it back out without controlling some of how >> > >> > it >> > >> > does >> > >> > those things. >> > >> > >> > >> > >> > >> >> > We Index into Solr and use the Solr indices >> > >> >> Ignite 1-2 has poor support for TEXT queries, which is totally >> > >> >> unconfigurable. >> > >> >> Also, Lucene indices underneath are NOT persistent that requires >> too >> > >> much >> > >> >> effort to fix it. >> > >> >> GeoSpatial index has the same issues, we decided to drop them >> > >> >> along >> > >> >> with >> > >> >> Indexing SPI at all. >> > >> >> >> > >> >> However, you can find the activity on dev-list on the Index Query >> > >> >> topic. >> > >> >> Guys are going to add IndexQuery (a scan query over the sorted >> index >> > >> which >> > >> >> can use simple conditions) in Ignite 2. >> > >> >> We also plan to have the same functionality, maybe it is possible >> to >> > >> >> add >> > >> >> full-text search support here. >> > >> >> Will it work for you, what do you think? >> > >> >> >> > >> > Yes, we originally looked at text queries and almost immediately >> said >> > >> > no. >> > >> > Nothing about it was useful for us other than the lucene >> > >> > dependency >> in >> > >> > Java. In the end that also became an issue because we wanted a >> > >> > newer >> > >> lucene >> > >> > version. 
>> > >> > IndexQuery will be useful - we'll certainly use it but it's not >> > enough. >> > >> > I >> > >> > think we customise and depend on Solr too much for IndexQuery to >> > >> > compare >> > >> > but it will help in some cases for simpler queries. >> > >> > >> > >> >> >> > >> >> >> > >> >> > 4. Will the unified storage model enable different versions >> > >> >> > of >> > >> Ignite >> > >> >> to >> > >> >> > be in the cluster when persistence is enabled so that rolling >> > >> restarts >> > >> >> can >> > >> >> > be done? >> > >> >> I'm not sure a rolling upgrade (RU) will be available because too >> > much >> > >> >> compatibility issues should be resolved >> > >> >> to make RU possible under the load without downtime. >> > >> >> >> > >> >> Maybe it makes sense to provide some grid mode (maintenance mode) >> for >> > >> >> RU >> > >> >> purposes that will block all the user load >> > >> >> but allow upgrade the grid. E.g. for the pure in-memory case. >> > >> >> >> > >> >> Persistence compatibility should be preserved as it works for >> Ignite >> > >> >> 2. >> > >> >> >> > >> > My ideal situation would be that we start a newer Ignite version, >> > >> > it >> > >> comes >> > >> > online, joins the cluster and is treated as some kind of >> > >> > maintenance >> > >> > mode >> > >> > as you suggested. In maintenance mode, the other nodes re-balance >> > >> > or >> > >> > some >> > >> > other process to send all the data this new node will handle over >> > >> > to >> > >> > it. >> > >> > The existing nodes continue serving this data until the new node >> > >> > is >> no >> > >> > longer in maintenance mode and then it becomes the primary for the >> > data >> > >> > that was rebalanced to it. >> > >> > >> > >> > The second case is if an existing node is restarted with a newer >> > Ignite >> > >> > version. 
No re-balance is needed, it joins in maintenance mode, >> > >> > runs >> > >> > any >> > >> > upgrade/conversion or other task it needs to and then starts >> accepting >> > >> > reads and writes. Communication with lower version nodes can be >> > >> > limited, >> > >> > they are aware of it and sends it data and queries for which it is >> the >> > >> > primary assuming they will also be upgraded. >> > >> > >> > >> > I guess I'm not aware of the compatibility issues this presents >> > >> > and >> so >> > >> > my >> > >> > view is narrow and perhaps naive here. >> > >> > >> > >> >> >> > >> >> >> > >> >> > 5. Will it be possible to provide a custom cache store still >> and >> > >> will >> > >> >> > these changes enable custom cache stores to be queryable from >> > SQL? >> > >> >> I'm not sure I fully understand this. >> > >> >> 1. Usually, SQL is about indices. Ignite can't perform a query >> > >> >> over >> > >> >> the >> > >> >> unindexed data. >> > >> >> >> > >> > Yes understood >> > >> > >> > >> >> >> > >> >> 2. Fullscan over the cache that contains only part of data + scan >> the >> > >> >> CacheStore, then merging the results is a pain. >> > >> >> Most likely, running a query over CacheStore directly will be a >> > >> >> simpler >> > >> >> way, and even more performant. >> > >> >> Shared CacheStore (same for all nodes) will definitely kill the >> > >> >> performance >> > >> >> in that case. >> > >> >> So, the preliminary loadCache() call looks like a good >> > >> >> compromise. >> > >> >> >> > >> > I think the problem is largely that the CacheStore interface is >> > >> > not >> > >> > sufficient for being able to do this. If it had a richer interface >> > >> > which >> > >> > allowed the cache store to answer index queries basically hooking >> into >> > >> > whatever Ignite's doing for its B+tree then this would be viable. >> > >> > A >> > >> > CacheStore that only implements KV API doesn't take part in SQL >> > >> > queries. >> > >> > >> > >> >> >> > >> >> 3. 
Splitting query into 2 parts to run on Ignite and to run on >> > >> CacheStore >> > >> >> looks possible with Calcite, >> > >> >> but I think it impractical because in general, neither CacheStore >> nor >> > >> >> database structure are aware of the data partitioning. >> > >> >> >> > >> > hmmm, maybe I missed the point but as the implementor of the >> > CacheStore >> > >> > you should have knowledge of the structure and partition info. or >> have >> > >> some >> > >> > way of retrieving it. Again, I think the current CacheStore >> interface >> > >> > is >> > >> > the problem and if it was extended to provide this information >> > >> > then >> > its >> > >> up >> > >> > to the implementation to do this whilst Ignite knows that any >> > >> > implementation of these interfaces will meet the contract >> > >> > necessary. >> > >> > >> > >> > >> > >> >> >> > >> >> 4. Transactions can't be supported in case of direct CacheStore >> > >> >> access, >> > >> >> because even if the underlying database supports 2-phase commit, >> > which >> > >> is >> > >> >> a >> > >> >> rare case, the recovery protocol looks hard. >> > >> >> Just looks like this feature doesn't worth it. >> > >> >> >> > >> > I'd completely agree with this. It will be incredibly hard to get >> this >> > >> > done reliably >> > >> > >> > >> >> >> > >> >> >> > >> >> > 6. This question wasn't mine but I was going to ask it as >> > >> >> > well: >> > >> >> > What >> > >> >> > will happen to the Indexing API since H2 is being removed? >> > >> >> As I wrote above, Indexing SPI will be dropped, but IndexQuery >> > >> >> will >> > be >> > >> >> added. >> > >> >> >> > >> >> > 1. 
As I mentioned above, we Index into Solr, in earlier >> > >> >> > versions >> > of >> > >> >> > our product we used the indexing SPI to index into Lucene >> > >> >> > on >> > >> >> > the >> > >> >> Ignite >> > >> >> > nodes but this presented so many challenges we ultimately >> > >> abandoned >> > >> >> it and >> > >> >> > replaced it with the current Solr solution. >> > >> >> AFAIK, some guys developed and sell a plugin for Ignite-2 with >> > >> persistent >> > >> >> Lucene and Geo indices. >> > >> >> I don't know about the capabilities and limitations of their >> > solution, >> > >> >> because of closed code. >> > >> >> You can easily google it. >> > >> >> >> > >> >> I saw few encouraged guys who want to improve TEXT queries, >> > >> >> but unfortunately, things weren't moved far enough. For now, they >> are >> > >> >> in >> > >> >> the middle of fixing the merging TEXT query results. >> > >> >> So far so good. >> > >> >> >> > >> >> I think it is a good chance to master the skill developing of a >> > >> >> distributed >> > >> >> system for the one >> > >> >> who will take a lead over the full-text search feature and add >> native >> > >> >> FullText index support into Ignite-3. >> > >> >> >> > >> > I've seen the other thread from Atri I believe about this. >> > >> > >> > >> >> >> > >> >> >> > >> >> > 7. What impact does RAFT now have on conflict resolution? >> > >> >> RAFT is a state machine replication protocol. It guarantees all >> > >> >> the >> > >> nodes >> > >> >> will see the updates in the same order. >> > >> >> So, seems no conflicts are possible. Recovery from split-brain is >> > >> >> impossible in common-case. >> > >> >> >> > >> >> However, I think we have a conflict resolver analog in Ignite-3 >> > >> >> as >> it >> > >> >> is >> > >> >> very useful in some cases >> > >> >> e.g datacenter replication, incremental data load from 3-rd party >> > >> source, >> > >> >> recovery from 3-rd party source. >> > >> >> >> > >> >> >> > >> >> > 8. CacheGroups. 
>> > >> >> AFAIK, CacheGroup will be eliminated, actually, we'll keep this >> > >> mechanic, >> > >> >> but it will be configured in a different way, >> > >> >> which makes Ignite configuring a bit simpler. >> > >> >> Sorry, for now, I have no answer on your performance concerns, >> > >> >> this >> > >> >> part >> > >> >> of >> > >> >> Ignite-3 slipped from my radar. >> > >> >> >> > >> > No worries. I'll wait and see if anyone else suggests something. >> > >> > Its >> > >> > getting a lot worse, a node took 1hr to start yesterday after a >> > >> deployment >> > >> > and its in prod with very little visibility into what it is doing, >> it >> > >> > was >> > >> > just stopped, no logging or anything and then resumed. >> > >> > >> > >> > 2021-07-22 13:40:15.997 INFO [ArcOS,,,] 9 --- [orker-#40%hypi%] >> > >> > o.a.i.i.p.cache.GridCacheProcessor [285] : Finished recovery >> for >> > >> > cache [cache=hypi_01F8ZC3DGT66RNYCDZH3XNVY2E_Hue, grp=hypi, >> > >> > startVer=AffinityTopologyVersion [topVer=79, minorTopVer=0]] >> > >> > >> > >> > One hour later it printed the next cache recovery message and >> started >> > >> > 30 >> > >> > seconds after going through other tables. >> > >> > >> > >> > >> > >> > >> > >> >> >> > >> >> Let's wait if someone will clarify what we could expect in >> Ignite-3. >> > >> >> Guys, can someone chime in and give more light on 3,4,7,8 >> questions? >> > >> >> >> > >> >> >> > >> >> On Thu, Jul 22, 2021 at 4:15 AM Courtney Robinson < >> > >> >> courtney.robin...@hypi.io> >> > >> >> wrote: >> > >> >> >> > >> >> > Hey everyone, >> > >> >> > I attended the Alpha 2 update yesterday and was quite pleased >> > >> >> > to >> > see >> > >> the >> > >> >> > progress on things so far. So first, congratulations to >> > >> >> > everyone >> on >> > >> the >> > >> >> > work being put in and thank you to Val and Kseniya for running >> > >> >> yesterday's >> > >> >> > event. 
>> > >> >> > >> > >> >> > I asked a few questions after the webinar which Val had some >> > answers >> > >> to >> > >> >> but >> > >> >> > suggested posting here as some of them are not things that have >> > been >> > >> >> > thought about yet or no plans exist around it at this point. >> > >> >> > >> > >> >> > I'll put all of them here and if necessary we can break into >> > >> >> > different >> > >> >> > threads after. >> > >> >> > >> > >> >> > 1. Schema change - does that include the ability to change >> > >> >> > the >> > >> types >> > >> >> of >> > >> >> > fields/columns? >> > >> >> > 1. Val's answer was yes with some limitations but those >> > >> >> > are >> > >> >> > not >> > >> >> well >> > >> >> > defined yet. He did mention that something like some kind >> of >> > >> >> > transformer >> > >> >> > could be provided for doing the conversion and I would >> second >> > >> >> this, >> > >> >> > even >> > >> >> > for common types like int to long being able to do a >> > >> >> > custom >> > >> >> > conversion will >> > >> >> > be immensely valuable. >> > >> >> > 2. Will the new guaranteed consistency between APIs also >> > >> >> > mean >> > SQL >> > >> >> will >> > >> >> > gain transaction support? >> > >> >> > 1. I believe the answer here was yes but perhaps someone >> else >> > >> may >> > >> >> > want to weigh in to confirm >> > >> >> > 3. Has there been any decision about how much of Calcite >> > >> >> > will >> be >> > >> >> exposed >> > >> >> > to the client? When using thick clients, it'll be hugely >> > >> >> > beneficial >> > >> >> to >> > >> >> > be >> > >> >> > able to work with Calcite APIs directly to provide custom >> rules >> > >> >> > and >> > >> >> > optimisations to better suit organisation needs >> > >> >> > 1. 
We currently use Calcite ourselves and have a lot of >> > >> >> > custom >> > >> rules >> > >> >> and >> > >> >> > optimisations and have slowly pushed more of our queries >> > >> >> > to >> > >> >> > Calcite that we >> > >> >> > then push down to Ignite. >> > >> >> > 2. We Index into Solr and use the Solr indices and others >> to >> > >> >> > fulfill over all queries with Ignite just being one of >> > >> >> > the >> > >> >> > possible storage >> > >> >> > targets Calcite pushes down to. If we could get to the >> > calcite >> > >> >> > API from an >> > >> >> > Ignite thick client, it would enable us to remove a layer >> of >> > >> >> > abstraction >> > >> >> > and complexity and make Ignite our primary that we then >> link >> > >> >> > with Solr and >> > >> >> > others to fulfill queries. >> > >> >> > 4. Will the unified storage model enable different versions >> > >> >> > of >> > >> >> Ignite to >> > >> >> > be in the cluster when persistence is enabled so that >> > >> >> > rolling >> > >> >> restarts >> > >> >> > can >> > >> >> > be done? >> > >> >> > 1. We have to do a strange dance to perform Ignite upgrades >> > >> >> > without >> > >> >> > downtime because pods/nodes will fail to start on version >> > >> mismatch >> > >> >> > and if >> > >> >> > we get that dance wrong, we will corrupt a node's data. >> > >> >> > It >> > >> >> > will >> > >> >> make >> > >> >> > admin/upgrades far less brittle and error prone if this >> > >> >> > was >> > >> >> possible. >> > >> >> > 5. Will it be possible to provide a custom cache store still >> and >> > >> will >> > >> >> > these changes enable custom cache stores to be queryable >> > >> >> > from >> > >> >> > SQL? >> > >> >> > 1. Our Ignite usage is wide and complex because we use KV, >> > >> >> > SQL >> > >> >> > and >> > >> >> other >> > >> >> > APIs. 
The inconsistency of what can and can't be used >> > >> >> > from >> > one >> > >> >> API to >> > >> >> > another is a real challenge and has forced us over time >> > >> >> > to >> > >> >> > stick >> > >> >> > to one API >> > >> >> > and write alternative solutions outside of Ignite. It >> > >> >> > will >> > >> >> > drastically >> > >> >> > simplify things if any CacheStore (or some new >> > >> >> > equivalent) >> > >> >> > could >> > >> >> > be plugged >> > >> >> > in and be made accessible to SQL (and in fact all other >> APIs) >> > >> >> without >> > >> >> > having to load all the data from the underlying >> > >> >> > CacheStore >> > >> >> > first >> > >> >> > into memory >> > >> >> > 6. This question wasn't mine but I was going to ask it as >> well: >> > >> What >> > >> >> > will happen to the Indexing API since H2 is being removed? >> > >> >> > 1. As I mentioned above, we Index into Solr, in earlier >> > >> >> > versions >> > >> >> of >> > >> >> > our product we used the indexing SPI to index into Lucene >> on >> > >> >> > the >> > >> >> > Ignite >> > >> >> > nodes but this presented so many challenges we ultimately >> > >> >> > abandoned it and >> > >> >> > replaced it with the current Solr solution. >> > >> >> > 2. Lucene indexing was ideal because it meant we didn't >> have >> > >> >> > to >> > >> >> > re-invent Solr or Elasticsearch's sharding capabilities, >> that >> > >> was >> > >> >> > almost >> > >> >> > automatic with Ignite only giving you the data that was >> meant >> > >> for >> > >> >> the >> > >> >> > current node. >> > >> >> > 3. The Lucene API enabled more flexibility and removed a >> > >> >> > network >> > >> >> > round trip from our queries. >> > >> >> > 4. Given Calcite's ability to support custom SQL >> > >> >> > functions, >> > >> >> > I'd >> > >> >> love >> > >> >> > to have the ability to define custom functions that >> > >> >> > Lucene >> > was >> > >> >> > answering >> > >> >> > 7. 
What impact does RAFT now have on conflict resolution, >> > >> >> > off >> > the >> > >> >> top of >> > >> >> > my head there are two cases >> > >> >> > 1. On startup after a split brain Ignite currently takes >> > >> >> > an >> > >> >> "exercise >> > >> >> > for the reader" approach and dumps a log along the lines >> > >> >> > of >> > >> >> > >> > >> >> > > 1. BaselineTopology of joining node is not compatible with >> > >> >> > > BaselineTopology in the cluster. >> > >> >> > > 1. Branching history of cluster BlT doesn't contain >> branching >> > >> point >> > >> >> > > hash of joining node BlT. Consider cleaning persistent >> > >> >> > > storage >> > >> >> of >> > >> >> > the node >> > >> >> > > and adding it to the cluster again. >> > >> >> > > >> > >> >> > 1. This leaves you with no choice except to take one half >> > >> >> > and >> > >> >> manually >> > >> >> > copy, write data back over to the other half then destroy >> the >> > >> bad >> > >> >> > one. >> > >> >> > 2. The second case is conflicts on keys, I >> > >> >> > beleive CacheVersionConflictResolver and manager are used >> > >> >> > by GridCacheMapEntry which just says if use old value do >> this >> > >> >> > otherwise use >> > >> >> > newVal. Ideally this will be exposed in the new API so >> > >> >> > that >> > >> >> > one >> > >> >> can >> > >> >> > override this behaviour. The last writer wins approach >> isn't >> > >> >> always >> > >> >> > ideal >> > >> >> > and the semantics of the domain can mean that what is >> > consider >> > >> >> > "correct" in >> > >> >> > a conflict is not so for a different domain. >> > >> >> > 8. This is last on the list but is actually the most >> > >> >> > important >> > >> >> > for >> > >> us >> > >> >> > right now as it is an impending and growing risk. We allow >> > >> customers >> > >> >> to >> > >> >> > create their own tables on demand. 
We're already using the >> same >> > >> cache >> > >> >> > group >> > >> >> > etc for data structures to be re-used but now that we're >> getting >> > >> >> > to >> > >> >> > thousands of tables/caches our startup times are sometimes >> > >> >> unpredictably >> > >> >> > long - at present it seems to depend on the state of the >> > >> cache/table >> > >> >> > before >> > >> >> > the restart but we're into the order of 5 - 7 mins and >> steadily >> > >> >> > increasing >> > >> >> > with the growth of tables. Are there any provisions in >> > >> >> > Ignite >> 3 >> > >> >> > for >> > >> >> > ensuring startup time isn't proportional to the number of >> > >> >> tables/caches >> > >> >> > available? >> > >> >> > >> > >> >> > >> > >> >> > Those are the key things I can think of at the moment. Val and >> > >> >> > others >> > >> >> I'd >> > >> >> > love to open a conversation around these. >> > >> >> > >> > >> >> > Regards, >> > >> >> > Courtney Robinson >> > >> >> > Founder and CEO, Hypi >> > >> >> > Tel: ++44 208 123 2413 (GMT+0) <https://hypi.io> >> > >> >> > >> > >> >> > <https://hypi.io> >> > >> >> > https://hypi.io >> > >> >> > >> > >> >> >> > >> >> >> > >> >> -- >> > >> >> Best regards, >> > >> >> Andrey V. Mashenkov >> > >> >> >> > >> > >> > >> >> > > >> > >> > >> > -- >> > >> > Best regards, >> > Ivan Pavlukhin >> > >> > -- Best regards, Ivan Pavlukhin