I've created IEP-13 [1] to cover all cases. Feel free to create issues.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
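For illustration, a query against the naming proposed below might look like
this. This is only a sketch: the IGNITE schema, the LOCAL_ prefix and the
TRANSACTIONS / LOCAL_TRANSACTIONS views are assumptions taken from this thread,
nothing is merged yet.

-- Hypothetical node-local view: transactions started on the node we are connected to.
SELECT XID, START_TIME
FROM IGNITE.LOCAL_TRANSACTIONS
ORDER BY START_TIME

-- Hypothetical cluster-wide view (a later phase): transactions across the
-- whole cluster, filterable by node.
SELECT NODE_ID, COUNT(*) AS ACTIVE_TX
FROM IGNITE.TRANSACTIONS
GROUP BY NODE_ID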
On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:

> Let's start with a single, most simple view, e.g. LOCAL_TRANSACTIONS. We will
> review and merge it along with the necessary infrastructure. Then we will
> handle the rest of the views in separate tickets and separate focused
> discussions.
>
> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
>
> > 1) It's not a point of principle, I can change the schema. INFORMATION_SCHEMA
> > was used because it already exists and is usually used for metadata tables
> > and views. Your proposal is to use the schema "IGNITE", do I understand you
> > right? BTW, for now we can't query the other (H2) meta tables from
> > INFORMATION_SCHEMA, so the Ignite system views are the only views available
> > for querying from this schema.
> > 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to determine
> > which node we are connected to.
> > 3) For the first phase, in my opinion, local views will be enough.
> > Performance and caching of distributed views should be discussed in later
> > phases, when the distributed views implementation is planned. In the current
> > implementation I tried to use indexing for local views wherever possible.
> > 4) I don't think that JVM info is more critical information than, for
> > example, cache or node information. When are authorization capabilities
> > planned to be implemented?
> >
> > About local data: yes, we can rename all currently implemented views for
> > local node data to LOCAL_..., and (someday) create new whole-cluster views
> > (which use distributed requests) without a prefix or, for example, with a
> > CLUSTER_ prefix. But some views can show whole-cluster information using
> > only local node data, without distributed requests (for example
> > IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION,
> > IGNITE_NODES, etc.). Are they local or cluster views in this concept? Which
> > prefix should be used? And what about caches? Are they local or cluster? On
> > the local node we can see cluster-wide caches (replicated and partitioned)
> > and caches for the current node only. The local cache list may differ from
> > node to node. Which prefix should be used for this view? And one more thing:
> > for some views it makes no sense to be cluster-wide (for example
> > IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without creating an
> > INSTANCE view?
> >
> > So, next steps: split the PR, change the schema name (IGNITE?), change the
> > view name for caches (CACHES, LOCAL_CACHES?).
> >
> > 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
> >
> > > Hi Alex,
> > >
> > > System views could be an extremely valuable addition to Ignite. Ideally, a
> > > user should be able to monitor and manage the state of the whole cluster
> > > from a single SQL command line. We have had plans to implement this for a
> > > very long time. However, this is a very sensitive task which has to take a
> > > lot of moving pieces into account, such as usability, consistency,
> > > performance, security, etc.
> > >
> > > Let me point out several major concerns I see at the moment:
> > >
> > > 1) Usability: INFORMATION_SCHEMA
> > > This schema is part of the ANSI SQL standard. When creating system views,
> > > some vendors prefer to store them in a completely different predefined
> > > schema (Oracle, MS SQL). Others prefer to keep them in INFORMATION_SCHEMA
> > > directly. Both approaches could work.
> > > However, the latter breaks separation of concerns - we store typical
> > > metadata next to possibly sensitive system data. It also makes security
> > > management more complex - system data is very sensitive, and we can no
> > > longer simply grant a user access to INFORMATION_SCHEMA. Instead, we have
> > > to grant that access on a per-view basis. For this reason my preference is
> > > to store system tables in a separate schema, not in INFORMATION_SCHEMA.
> > >
> > > 2) Consistency: local data
> > > One of the implemented views is GridH2SysViewImplInstance. Normally SQL
> > > users communicate with Ignite through JDBC/ODBC drivers. These drivers are
> > > connected to a single node, typically a client node. Moreover, we will
> > > introduce a high-availability feature where drivers are able to connect to
> > > any address from a predefined list. That renders this view useless, as you
> > > do not know which node you are connected to. Also, local-only data cannot
> > > be joined in the general case - you will receive different results on
> > > different nodes. The same goes for transactions, JVM info, etc.
> > >
> > > 3) Performance
> > > Suppose we fixed the consistency of transactions and now this view shows
> > > transactions in the whole cluster with the possibility to filter them by
> > > node - this is what a user would expect out of the box. Another problem
> > > appears then - performance. How would we collect the necessary data? How
> > > would we handle joins, when a particular view could be scanned multiple
> > > times during query execution? How would we achieve sensible consistency?
> > > Most probably we would collect remote data once when the query is started,
> > > cache it somehow at the query session level, and then reuse it during
> > > joins. But again, this should be discussed separately.
> > >
> > > 4) Security: JVM info
> > > We should define clear boundaries for what info is exposed. JVM data along
> > > with running threads is critically sensitive information. We should not
> > > expose it until we have authorization capabilities.
> > >
> > > In order to start moving this code from prototype to production state, we
> > > should start with the most simple and consistent views, e.g. IGNITE_CACHES.
> > > Let's move it to a separate PR, review the infrastructure code, review the
> > > view implementation, agree on proper naming and placement, and merge it.
> > > Then each and every view (or group of related views) should be discussed
> > > and reviewed separately.
> > >
> > > As for node-local stuff, maybe we should move it to a separate schema, or
> > > mark it with a special prefix. E.g. "IGNITE.TRANSACTIONS" - all
> > > transactions in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on
> > > the local node. In this case we will be able to merge the "local" stuff
> > > shortly, and implement the more complex but at the same time much more
> > > useful distributed stuff later on.
> > >
> > > Makes sense?
> > >
> > > Vladimir.
> > >
> > > On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > >
> > > > Hello, Igniters!
> > > >
> > > > For Ignite diagnostics it's usually helpful to get some Ignite internals
> > > > information. But currently, in my opinion, there are no convenient tools
> > > > for this purpose:
> > > >
> > > > · Some issues can be solved by analyzing log files. Log files are useful
> > > > for dumps, but sometimes they are difficult to read.
> > > > Also, interesting metrics can't be retrieved at runtime on request; we
> > > > need to wait until Ignite writes these metrics on a timeout or other
> > > > events.
> > > >
> > > > · JMX is useful for scalar metrics. Complex and tabular data can also be
> > > > retrieved, but it's difficult to read, filter and sort it without
> > > > processing by specialized external tools. For the most frequently used
> > > > cases, almost-duplicate metrics are created to show data in an
> > > > easy-to-read form.
> > > >
> > > > · Web-console is able to show tabular and complex data. Perhaps someday
> > > > web-console will contain all the dashboards necessary for most problem
> > > > investigations, but some non-trivial queries will not be covered anyway.
> > > > Also, web-console needs additional infrastructure to work.
> > > >
> > > > · External "home-made" tools can be used for non-trivial cases. They
> > > > cover highly specialized cases and usually can't be used as general
> > > > purpose tools.
> > > >
> > > > Sometimes we are forced to use more than one tool and join data by hand
> > > > (for example, a current thread dump and data from logs).
> > > >
> > > > For diagnostic purposes an RDBMS often provides system views (for
> > > > example, DBA_% and V$% in Oracle), which can be queried with SQL. This
> > > > approach makes all internal diagnostic information available in a
> > > > readable form (with all possible filters and projections) without using
> > > > any other internal or external tools. My proposal is to create similar
> > > > system views in Ignite.
> > > >
> > > > I have implemented a working prototype (PR: [1]). It contains the
> > > > following views:
> > > >
> > > > IGNITE_SYSTEM_VIEWS - Registered system views
> > > > IGNITE_INSTANCE - Ignite instance
> > > > IGNITE_JVM_THREADS - JVM threads
> > > > IGNITE_JVM_RUNTIME - JVM runtime
> > > > IGNITE_JVM_OS - JVM operating system
> > > > IGNITE_CACHES - Ignite caches
> > > > IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
> > > > IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
> > > > IGNITE_CACHE_GROUPS - Cache groups
> > > > IGNITE_NODES - Nodes in topology
> > > > IGNITE_NODE_HOSTS - Node hosts
> > > > IGNITE_NODE_ADDRESSES - Node addresses
> > > > IGNITE_NODE_ATTRIBUTES - Node attributes
> > > > IGNITE_NODE_METRICS - Node metrics
> > > > IGNITE_TRANSACTIONS - Active transactions
> > > > IGNITE_TRANSACTION_ENTRIES - Cache entries used by transactions
> > > > IGNITE_TASKS - Active tasks
> > > > IGNITE_PART_ASSIGNMENT - Partition assignment map
> > > > IGNITE_PART_ALLOCATION - Partition allocation map
> > > >
> > > > Many more useful views can be implemented (executor diagnostics, SPI
> > > > diagnostics, etc.).
> > > >
> > > > Some usage examples:
> > > >
> > > > Cache groups and their partitions which are used by transactions running
> > > > longer than 5 minutes:
> > > >
> > > > SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
> > > > FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > > > JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
> > > > JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
> > > > JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
> > > > WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > > > GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > > >
> > > > Average CPU load on server nodes grouped by operating system:
> > > >
> > > > SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> > > > FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
> > > >   AND na.NAME = 'os.name'
> > > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
> > > > WHERE n.IS_CLIENT = false
> > > > GROUP BY na.VALUE
> > > >
> > > > Top 5 nodes by puts to the cache 'cache':
> > > >
> > > > SELECT cm.NODE_ID, cm.CACHE_PUTS
> > > > FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > > > WHERE cm.CACHE_NAME = 'cache'
> > > > ORDER BY cm.CACHE_PUTS DESC
> > > > LIMIT 5
> > > >
> > > > Is this implementation interesting to anyone else? Are any of the views
> > > > redundant? Which additional first-priority views should be implemented?
> > > > Any other thoughts or proposals?
> > > >
> > > > [1] https://github.com/apache/ignite/pull/3413
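If the next steps discussed above are taken (a separate IGNITE schema instead
of INFORMATION_SCHEMA and a LOCAL_ prefix for node-local views), the first
usage example above would translate roughly as follows. This is only a sketch
under those assumptions; the final schema and view names are still open.

-- Hypothetical naming: the IGNITE schema and LOCAL_ prefix are not final.
SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
FROM IGNITE.LOCAL_TRANSACTIONS t
JOIN IGNITE.LOCAL_TRANSACTION_ENTRIES te ON t.XID = te.XID
JOIN IGNITE.CACHES c ON te.CACHE_NAME = c.NAME
JOIN IGNITE.CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION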