Vova, could you confirm that https://issues.apache.org/jira/browse/IGNITE-7527 is ready to be merged?
On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:

I would start with NODES and NODE_ATTRIBUTES as the simplest ones.

On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <dma...@apache.org> wrote:

Alex P, sounds like a good plan to me.

Vladimir, do you have any suggestions or corrections?

— Denis

On Feb 12, 2018, at 4:57 AM, Alex Plehanov <plehanov.a...@gmail.com> wrote:

The views engine and the first view are almost ready to merge (review comments are resolved). Which views should we take next? My proposal is NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES, since these views are clear and all topology data is available on each node. Any objections?

2018-01-25 16:27 GMT+03:00 Alex Plehanov <plehanov.a...@gmail.com>:

Anton, Vladimir, I've made some fixes. There is only one view left, and it has been renamed to 'IGNITE.LOCAL_TRANSACTIONS'.

High-level design of the solution: when IgniteH2Indexing starts, it creates and starts a new GridH2SysViewProcessor, which creates and registers in H2 (via its own table engine) all implementations of system views. Each system view implementation extends the base abstract class GridH2SysView. A view implementation describes its columns, their types and indexes in the constructor and must override the getRows method for data retrieval (this method is called by the H2-compatible table and index implementations for Ignite system views). Almost no changes to the existing parsing engine were needed, except in a few places where a GridH2Table instance was expected but system views use a different class.

New PR: [1]. Please have a look.

[1] https://github.com/apache/ignite/pull/3433
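As a rough illustration of the design described above, here is a simplified, self-contained sketch: a view declares its columns in the constructor and overrides a single method that produces rows. The names SysView, NodesView and SysViewSketch are hypothetical stand-ins; the real base class in the PR is GridH2SysView, also declares column types and indexes, and plugs into H2 via a table engine, so its actual signatures may differ.

import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the base abstract system view class.
abstract class SysView {
    private final String name;
    private final List<String> columns;

    protected SysView(String name, String... columns) {
        this.name = name;
        this.columns = Arrays.asList(columns);
    }

    // Returns the view rows; called when the view is scanned.
    abstract Iterable<Object[]> getRows();

    String name() { return name; }
    List<String> columns() { return columns; }
}

// Hypothetical view exposing locally known topology nodes.
class NodesView extends SysView {
    NodesView() {
        super("NODES", "ID", "CONSISTENT_ID", "IS_CLIENT");
    }

    @Override Iterable<Object[]> getRows() {
        // In the real implementation the rows would come from the node's
        // discovery/topology data; hard-coded here for illustration only.
        return Arrays.asList(
            new Object[] {"node-1-id", "node-1", false},
            new Object[] {"node-2-id", "node-2", true});
    }
}

public class SysViewSketch {
    public static void main(String[] args) {
        SysView view = new NodesView();
        System.out.println(view.name() + " " + view.columns());
        for (Object[] row : view.getRows())
            System.out.println(Arrays.toString(row));
    }
}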
2018-01-24 19:12 GMT+03:00 Anton Vinogradov <avinogra...@gridgain.com>:

I've created IEP-13 [1] to cover all the cases. Feel free to create issues.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769

On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:

Let's start with a single and simplest view, e.g. LOCAL_TRANSACTIONS. We will review and merge it along with the necessary infrastructure, then handle the remaining views in separate tickets and separate focused discussions.

On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:

1) It's not a principal point, I can change the schema. INFORMATION_SCHEMA was used because it already exists and is usually used for metadata tables and views. Your proposal is to use the schema "IGNITE", do I understand you right? BTW, for now we can't query other (H2) meta tables from INFORMATION_SCHEMA, so the Ignite system views are the only views available for querying from this schema.

2) Exactly for this reason the IGNITE_INSTANCE view is useful: to determine which node we are connected to.

3) As the first phase, in my opinion, local views will be enough. Performance and caching of distributed views should be discussed in the next phases, when the distributed views implementation is planned. In the current implementation I tried to use indexing for local views wherever possible.

4) I don't think that JVM info is more critical information than, for example, cache or node information. When are authorization capabilities planned to be implemented?

About local data: yes, we can rename all currently implemented views over local node data to LOCAL_..., and (someday) create new whole-cluster views (which use distributed requests) without a prefix or, for example, with a CLUSTER_ prefix. But some views can show whole-cluster information using only local node data, without distributed requests (for example IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION, IGNITE_NODES, etc.). Are they local or cluster views in this concept? Which prefix should be used? And what about caches, are they local or cluster? On a local node we can see cluster-wide caches (replicated and distributed) as well as caches for the current node only. The local cache list may differ from node to node. Which prefix should be used for this view? And one more thing: for some views it makes no sense to make them cluster-wide (for example IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without creating an INSTANCE view?

So, next steps: split the PR, change the schema name (IGNITE?), change the view name for caches (CACHES, LOCAL_CACHES?).
2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:

Hi Alex,

System views could be an extremely valuable addition to Ignite. Ideally, a user should be able to monitor and manage the state of the whole cluster from a single SQL command line. We have had plans to implement this for a very long time. However, it is a very sensitive task which has to take a lot of moving pieces into account, such as usability, consistency, performance, security, etc.

Let me point out several major concerns I see at the moment:

1) Usability: INFORMATION_SCHEMA
This schema is part of the ANSI SQL standard. When creating system views, some vendors prefer to store them in a completely different predefined schema (Oracle, MS SQL). Others keep them in INFORMATION_SCHEMA directly. Both approaches could work. However, the latter breaks separation of concerns: we store typical metadata next to possibly sensitive system data. It also makes security management more complex: system data is very sensitive, and we can no longer simply grant a user access to INFORMATION_SCHEMA; instead, we have to grant access on a per-view basis. For this reason my preference is to store system tables in a separate schema, not in INFORMATION_SCHEMA.

2) Consistency: local data
One of the implemented views is GridH2SysViewImplInstance. Normally SQL users communicate with Ignite through the JDBC/ODBC drivers. These drivers are connected to a single node, typically a client node. Moreover, we will introduce a high-availability feature where drivers can connect to any address from a predefined list. That renders this view useless, as you do not know which node you are connected to. Also, local-only data cannot be joined in the general case: you will receive different results on different nodes. The same goes for transactions, JVM info, etc.

3) Performance
Suppose we fixed the consistency of transactions and this view now shows transactions in the whole cluster with the ability to filter them by node; this is what a user would expect out of the box. Then another problem appears: performance. How would we collect the necessary data? How would we handle joins, when a particular view could be scanned multiple times during query execution? How do we achieve sensible consistency? Most probably we would collect remote data once when the query is started, cache it somehow at the query session level, and then re-use it during joins. But again, this should be discussed separately.

4) Security: JVM info
We should define clear boundaries of what information is exposed. JVM data along with running threads is critically sensitive information. We should not expose it until we have authorization capabilities.

In order to start moving this code from prototype to production state we should start with the most simple and consistent views, e.g. IGNITE_CACHES. Let's move it to a separate PR, review the infrastructure code, review the view implementation, agree on proper naming and placement, and merge it. Then each and every view (or group of related views) should be discussed and reviewed separately.

As far as node-local stuff is concerned, maybe we should move it to a separate schema, or mark it with a special prefix, e.g. "IGNITE.TRANSACTIONS" for all transactions in the cluster and "IGNITE.LOCAL_TRANSACTIONS" for transactions on the local node. In this case we will be able to merge the "local" stuff shortly, and implement the more complex but at the same time much more useful distributed stuff later on.

Makes sense?

Vladimir.
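To make point 2 above concrete, here is a minimal sketch of how a client typically reaches such views through the Ignite thin JDBC driver: the connection is pinned to whichever node the URL points at, so a node-local view describes only that node. The view name follows the IGNITE.LOCAL_TRANSACTIONS naming proposed in this thread and is not an existing API until the PR is merged; the address and column access are illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LocalViewJdbcSketch {
    public static void main(String[] args) throws Exception {
        // Ignite thin JDBC driver (ignite-core must be on the classpath).
        Class.forName("org.apache.ignite.IgniteJdbcThinDriver");

        // The thin driver talks to exactly one node (here 127.0.0.1),
        // so any "local" view queried below describes that node only.
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1");
             Statement stmt = conn.createStatement();
             // LOCAL_TRANSACTIONS in the IGNITE schema is the naming proposed
             // in this thread, not something that exists today.
             ResultSet rs = stmt.executeQuery("SELECT * FROM IGNITE.LOCAL_TRANSACTIONS")) {
            while (rs.next())
                System.out.println(rs.getString("XID"));
        }
    }
}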
On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <plehanov.a...@gmail.com> wrote:

Hello, Igniters!

For Ignite diagnostics it is usually helpful to get some information about Ignite internals. But currently, in my opinion, there are no convenient tools for this purpose:

· Some issues can be solved by analyzing log files. Log files are useful for dumps, but sometimes they are difficult to read. Also, interesting metrics can't be obtained at runtime on request; we have to wait until Ignite writes them out on a timeout or some other event.

· JMX is useful for scalar metrics. Complex and table data can also be obtained, but it is difficult to read, filter and sort without processing by specialized external tools. For the most frequently used cases, almost duplicate metrics are created just to show data in an easy-to-read form.

· Web-console is able to show table and complex data. Perhaps someday web-console will contain all the necessary dashboards for investigating most problems, but some non-trivial queries will not be covered anyway. Also, web-console needs additional infrastructure to work.

· External "home-made" tools can be used for non-trivial cases. They cover highly specialized cases and usually can't be used as general-purpose tools.

Sometimes we are forced to use more than one tool and join the data by hand (for example, a current thread dump and data from logs).

RDBMSs often provide system views for diagnostic purposes (for example, DBA_% and V$% in Oracle), which can be queried with SQL. This approach makes all internal diagnostic information available in a readable form (with all possible filters and projections) without any other internal or external tools. My proposal is to create similar system views in Ignite.

I have implemented a working prototype (PR: [1]). It contains the following views:

IGNITE_SYSTEM_VIEWS - registered system views
IGNITE_INSTANCE - Ignite instance
IGNITE_JVM_THREADS - JVM threads
IGNITE_JVM_RUNTIME - JVM runtime
IGNITE_JVM_OS - JVM operating system
IGNITE_CACHES - Ignite caches
IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
IGNITE_CACHE_GROUPS - cache groups
IGNITE_NODES - nodes in topology
IGNITE_NODE_HOSTS - node hosts
IGNITE_NODE_ADDRESSES - node addresses
IGNITE_NODE_ATTRIBUTES - node attributes
IGNITE_NODE_METRICS - node metrics
IGNITE_TRANSACTIONS - active transactions
IGNITE_TRANSACTION_ENTRIES - cache entries used by a transaction
IGNITE_TASKS - active tasks
IGNITE_PART_ASSIGNMENT - partition assignment map
IGNITE_PART_ALLOCATION - partition allocation map

Many more useful views could be implemented (executor diagnostics, SPI diagnostics, etc.).
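As a minimal sketch of how these prototype views can also be reached from the embedded Java API (assuming the prototype's INFORMATION_SCHEMA placement, which is still under discussion above; the cache name "dummy" is an arbitrary entry point for running schema-qualified SQL):

import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class NodesViewQuerySketch {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Any cache works as an entry point for schema-qualified SQL;
            // "dummy" is created only to have something to run the query on.
            IgniteCache<Integer, Integer> cache = ignite.getOrCreateCache("dummy");

            // Query one of the prototype views; the schema placement
            // (INFORMATION_SCHEMA vs. a dedicated IGNITE schema) is exactly
            // what the discussion above is about.
            List<List<?>> rows = cache.query(new SqlFieldsQuery(
                "SELECT ID, IS_CLIENT FROM INFORMATION_SCHEMA.IGNITE_NODES")).getAll();

            for (List<?> row : rows)
                System.out.println(row);
        }
    }
}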
Some usage examples:

Cache groups and their partitions used by transactions running for more than 5 minutes:

SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, COUNT(*) AS ENTITIES_CNT
FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION

Average CPU load on server nodes grouped by operating system:

SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
FROM INFORMATION_SCHEMA.IGNITE_NODES n
JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID AND na.NAME = 'os.name'
JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
WHERE n.IS_CLIENT = false
GROUP BY na.VALUE

Top 5 nodes by puts to the cache 'cache':

SELECT cm.NODE_ID, cm.CACHE_PUTS
FROM INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
WHERE cm.CACHE_NAME = 'cache'
ORDER BY cm.CACHE_PUTS DESC
LIMIT 5

Is this implementation interesting to anyone else? Are any of the views redundant? Which additional views should be implemented with first priority? Any other thoughts or proposals?

[1] https://github.com/apache/ignite/pull/3413