Let's start with a single, simplest view, e.g. LOCAL_TRANSACTIONS. We will
review and merge it along with the necessary infrastructure. Then we will
handle the remaining views in separate tickets and separate focused
discussions.
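Once that is merged, querying should look roughly like this (the IGNITE
schema and the view name follow the proposal below; the column set is
illustrative and still to be agreed on during review):

SELECT XID, START_TIME, STATE
FROM IGNITE.LOCAL_TRANSACTIONS
WHERE STATE = 'ACTIVE'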
On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com>
wrote:

> 1) It's not a matter of principle, I can change the schema.
> INFORMATION_SCHEMA was used because it already exists and is usually used
> for metadata tables and views. Your proposal is to use the schema
> "IGNITE", do I understand you right? BTW, for now we can't query the
> other (H2) meta tables from INFORMATION_SCHEMA, so the Ignite system
> views are the only views available to query from this schema.
> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> determine which node we are connected to.
> 3) As the first phase, in my opinion, local views will be enough.
> Performance and caching of distributed views should be discussed in the
> next phases, when the distributed views implementation is planned. In the
> current implementation I tried to use indexing for local views wherever
> possible.
> 4) I don't think that JVM info is more critical information than, for
> example, cache or node information. When are authorization capabilities
> planned to be implemented?
>
> About local data: yes, we can rename all currently implemented views for
> local node data as LOCAL_..., and create (someday) new whole-cluster
> views (which use distributed requests) without a prefix or, for example,
> with a CLUSTER_ prefix. But some views can show whole-cluster information
> using only local node data, without distributed requests (for example
> IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION,
> IGNITE_NODES, etc.). Are they local or cluster views in this concept?
> Which prefix should be used? And what about caches? Are they local or
> cluster? On a local node we can see cluster-wide caches (replicated and
> partitioned) as well as caches for the current node only. The local cache
> list may differ from node to node. Which prefix should be used for this
> view? And one more thing: for some views there is no sense in making them
> cluster-wide (for example IGNITE_INSTANCE). Should we name it
> LOCAL_INSTANCE without creating an INSTANCE view?
>
> So, next steps: split the PR, change the schema name (IGNITE?), change
> the view name for caches (CACHES, LOCAL_CACHES?)
>
>
> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
>
> > Hi Alex,
> >
> > System views could be an extremely valuable addition to Ignite.
> > Ideally, a user should be able to monitor and manage the state of the
> > whole cluster with a single SQL command line. We have had plans to
> > implement this for a very long time. However, this is a very sensitive
> > task which should take a lot of moving pieces into account, such as
> > usability, consistency, performance, security, etc.
> >
> > Let me point out several major concerns I see at the moment:
> >
> > 1) Usability: INFORMATION_SCHEMA
> > This schema is part of the ANSI SQL standard. When creating system
> > views, some vendors prefer to store them in a completely different
> > predefined schema (Oracle, MS SQL). Others prefer to keep them in
> > INFORMATION_SCHEMA directly. Both approaches could work. However, the
> > latter breaks separation of concerns - we store typical metadata next
> > to possibly sensitive system data. It also makes security management
> > more complex - system data is very sensitive, and now we cannot simply
> > grant a user access to INFORMATION_SCHEMA. Instead, we have to grant
> > that access on a per-view basis.
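> > For illustration, in a system with SQL permissions the difference
> > would look roughly like this (hypothetical syntax - Ignite has no
> > GRANT statement today):
> >
> > -- Dedicated schema: one rule covers every system view.
> > GRANT SELECT ON SCHEMA IGNITE TO monitoring_role;
> >
> > -- System views inside INFORMATION_SCHEMA: every sensitive view needs
> > -- its own rule, next to the ordinary metadata views.
> > GRANT SELECT ON INFORMATION_SCHEMA.IGNITE_CACHES TO monitoring_role;
> > GRANT SELECT ON INFORMATION_SCHEMA.IGNITE_NODES TO monitoring_role;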
> > For this reason my preference is to store system tables in a separate
> > schema, not in INFORMATION_SCHEMA.
> >
> > 2) Consistency: local data
> > One of the implemented views is GridH2SysViewImplInstance. Normally SQL
> > users communicate with Ignite through JDBC/ODBC drivers. These drivers
> > are connected to a single node, typically a client node. Moreover, we
> > will introduce a high-availability feature where drivers are able to
> > connect to any address from a predefined list. This renders the view
> > useless, as you do not know which node you are connected to. Also,
> > local-only data cannot be joined in the general case - you will receive
> > different results on different nodes. The same goes for transactions,
> > JVM info, etc.
> >
> > 3) Performance
> > Suppose we fixed the consistency of transactions, and this view now
> > shows transactions in the whole cluster with the possibility to filter
> > them by node - this is what a user would expect out of the box. Then
> > another problem appears - performance. How would we collect the
> > necessary data? How would we handle joins, when a particular view could
> > be scanned multiple times during query execution? How would we achieve
> > sensible consistency? Most probably we would collect remote data once
> > when the query is started, cache it somehow at the query session level,
> > and then re-use it during joins. But again, this should be discussed
> > separately.
> >
> > 4) Security: JVM info
> > We should define clear boundaries of what info is exposed. JVM data,
> > along with running threads, is critically sensitive information. We
> > should not expose it until we have authorization capabilities.
> >
> > In order to start moving this code from prototype to production state
> > we should start with the most simple and consistent views, e.g.
> > IGNITE_CACHES. Let's move it to a separate PR, review the
> > infrastructure code, review the view implementation, agree on proper
> > naming and placement, and merge it. Then each and every view (or group
> > of related views) should be discussed and reviewed separately.
> >
> > As far as node-local stuff goes, maybe we should move it to a separate
> > schema, or mark it with a special prefix. E.g. "IGNITE.TRANSACTIONS" -
> > all transactions in the cluster, "IGNITE.LOCAL_TRANSACTIONS" -
> > transactions on the local node. In this case we will be able to merge
> > the "local" stuff shortly, and implement the more complex but at the
> > same time much more useful distributed stuff later on.
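> > The two flavors would then be queried like this (a sketch only -
> > neither view exists yet):
> >
> > -- All transactions in the cluster (requires distributed requests).
> > SELECT * FROM IGNITE.TRANSACTIONS;
> >
> > -- Transactions on the node we are connected to (local data only).
> > SELECT * FROM IGNITE.LOCAL_TRANSACTIONS;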
> >
> > Makes sense?
> >
> > Vladimir.
> >
> >
> > On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov
> > <plehanov.a...@gmail.com> wrote:
> >
> > > Hello, Igniters!
> > >
> > > For Ignite diagnostics it is usually helpful to get some Ignite
> > > internals information. But currently, in my opinion, there are no
> > > convenient tools for this purpose:
> > >
> > > · Some issues can be solved by analyzing log files. Log files are
> > > useful for dumps, but sometimes they are difficult to read. Also,
> > > interesting metrics can't be requested at runtime; we need to wait
> > > until Ignite writes these metrics on a timeout or other event.
> > >
> > > · JMX is useful for scalar metrics. Complex and tabular data can
> > > also be received, but it is difficult to read, filter and sort
> > > without processing by specialized external tools. For the most
> > > frequently used cases, almost-duplicate metrics are created just to
> > > show data in an easy-to-read form.
> > >
> > > · Web-console is able to show tabular and complex data. Perhaps
> > > someday web-console will contain all the necessary dashboards for
> > > most problem investigations, but some non-trivial queries will not
> > > be covered anyway. Also, web-console needs additional infrastructure
> > > to work.
> > >
> > > · External "home-made" tools can be used for non-trivial cases. They
> > > cover highly specialized cases and usually can't be used as
> > > general-purpose tools.
> > >
> > > Sometimes we are forced to use more than one tool and join the data
> > > by hand (for example, a current thread dump and data from logs).
> > >
> > > For diagnostic purposes an RDBMS often provides system views (for
> > > example, DBA_% and V$% in Oracle), which can be queried with SQL.
> > > This solution makes all internal diagnostic information available in
> > > a readable form (with all possible filters and projections) without
> > > using any other internal or external tools. My proposal is to create
> > > similar system views in Ignite.
> > >
> > > I have implemented a working prototype (PR: [1]). It contains these
> > > views:
> > >
> > > IGNITE_SYSTEM_VIEWS - Registered system views
> > > IGNITE_INSTANCE - Ignite instance
> > > IGNITE_JVM_THREADS - JVM threads
> > > IGNITE_JVM_RUNTIME - JVM runtime
> > > IGNITE_JVM_OS - JVM operating system
> > > IGNITE_CACHES - Ignite caches
> > > IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
> > > IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
> > > IGNITE_CACHE_GROUPS - Cache groups
> > > IGNITE_NODES - Nodes in topology
> > > IGNITE_NODE_HOSTS - Node hosts
> > > IGNITE_NODE_ADDRESSES - Node addresses
> > > IGNITE_NODE_ATTRIBUTES - Node attributes
> > > IGNITE_NODE_METRICS - Node metrics
> > > IGNITE_TRANSACTIONS - Active transactions
> > > IGNITE_TRANSACTION_ENTRIES - Cache entries used by transactions
> > > IGNITE_TASKS - Active tasks
> > > IGNITE_PART_ASSIGNMENT - Partition assignment map
> > > IGNITE_PART_ALLOCATION - Partition allocation map
> > >
> > > Many more useful views can be implemented (executors diagnostics,
> > > SPIs diagnostics, etc.).
> > >
> > > Some usage examples:
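> > > List the registered system views themselves (column list is
> > > illustrative):
> > >
> > > SELECT * FROM INFORMATION_SCHEMA.IGNITE_SYSTEM_VIEWS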
> > >
> > > Cache groups and their partitions that are used by a transaction
> > > running for more than 5 minutes:
> > >
> > > SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
> > > FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > > JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
> > > JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
> > > JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
> > > WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > > GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > >
> > > Average CPU load on server nodes, grouped by operating system:
> > >
> > > SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> > > FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
> > >   AND na.NAME = 'os.name'
> > > JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
> > > WHERE n.IS_CLIENT = false
> > > GROUP BY na.VALUE
> > >
> > > Top 5 nodes by puts to the cache 'cache':
> > >
> > > SELECT cm.NODE_ID, cm.CACHE_PUTS
> > > FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > > WHERE cm.CACHE_NAME = 'cache'
> > > ORDER BY cm.CACHE_PUTS DESC
> > > LIMIT 5
> > >
> > > Is this implementation interesting to anyone else? Maybe some views
> > > are redundant? Which additional first-priority views should be
> > > implemented? Any other thoughts or proposals?
> > >
> > > [1] https://github.com/apache/ignite/pull/3413