2017-08-01 18:35 GMT+02:00 Remi Colinet <remi.coli...@gmail.com>:

> I did it in version 2 of the patch.
> The patch can yield TEXT, JSON, and XML output.
>
> For the query below, it gives:
>
> => Terminal 1
> test=# select * from t_10m, t_1m where t_10m.md5 like '%cb%';
>
> => Terminal 2
> test=# \watch PROGRESS 9546;
> Wed 10 May 2017 06:29:59 PM CEST (every 1s)
>
>                                 PLAN PROGRESS
> ---------------------------------------------------------------------------------------------------------
>  status: <query running>
>  query: select * from t_10m, t_1m where t_10m.md5 like '%cb%';
>  time used (s): 10
>  Nested Loop
>    -> Seq Scan on t_1m => rows 7/1000000 0% => blks 8334/8334 100%
>    -> Materialize => file read readptrcount=1 rows write=1189285 read=6584854 disk use (bytes) 53842965
>      -> Seq Scan on t_10m => rows 1145596/738172 155%
>         Filter: (md5 ~~ '%cb%'::text)
>  total disk space used (MB) 51
> (9 rows)
>
> => Terminal 2
> test=# PROGRESS (FORMAT JSON) 9546;
>                             PLAN PROGRESS
> ----------------------------------------------------------------------
>  [ +
>  "status": "<query running>", +
>  "query": "select * from t_10m, t_1m where t_10m.md5 like '%cb%';", +
>  "time used (s)": 0, +
>  "single worker": { +
>    "Node Type": "Nested Loop", +
>    "Partial Mode": "", +
>    "Operation": "single worker", +
>    "Parent Relationship": "single worker", +
>    "Custom Plan Provider": "(@\u0004\u0001", +
>    "Parallel Aware": false, +
>    "Outer": { +
>      "Node Type": "Seq Scan", +
>      "Strategy": "", +
>      "Partial Mode": "single worker", +
>      "Operation": "Outer", +
>      "Parent Relationship": "Outer", +
>      "Custom Plan Provider": "(@\u0004\u0001", +
>      "Parallel Aware": false, +
>      "relation": "t_1m", +
>      "rows fetched": 1, +
>      "rows total": 1000000, +
>      "rows percent": 0, +
>      "blocks fetched": 8334, +
>      "blocks total": 8334, +
>      "blocks percent": 100 +
>    }, +
>    "Inner": { +
>      "Node Type": "Materialize", +
>      "Strategy": "", +
>      "Partial Mode": "single worker", +
>      "Operation": "Inner", +
>      "Parent Relationship": "Inner", +
>      "Custom Plan Provider": "(@\u0004\u0001", +
>      "Parallel Aware": false, +
>      "file store": "write", +
>      "readptrcount": 1, +
>      "rows write": 297256, +
>      "rows read": 0, +
>      "disk use (bytes)": 11911168, +
>      "Outer": { +
>        "Node Type": "Seq Scan", +
>        "Strategy": "", +
>        "Partial Mode": "Inner", +
>        "Operation": "Outer", +
>        "Parent Relationship": "Outer", +
>        "Custom Plan Provider": "HtFH\b[]\u000f\u001f", +
>        "Parallel Aware": false, +
>        "relation": "t_10m", +
>        "rows fetched": 253566, +
>        "rows total": 738172, +
>        "rows percent": 34, +
>        "blocks fetched": 18436, +
>        "blocks total": 83334, +
>        "blocks percent": 22, +
>        "Filter": "(md5 ~~ '%cb%'::text)" +
>      } +
>    } +
>  }, +
>  "unit": "MB", +
>  "total disk space used": 11 +
>  ]
> (1 row)
>
> test=#
>
> I'm skeptical about the use of JSON, XML, and other such formats in this
> output.
>
> Does anyone use these formats (XML, JSON, YAML) for EXPLAIN output?
> I suspect only the TEXT format is being used.
>
> This looks better with a SQL table made of name/value fields, usable
> both by administrators and by utilities.
>
I disagree. It should be consistent with EXPLAIN - there is zero benefit
in introducing a new format. I know a few utilities that use the JSON
format.

Regards

Pavel

> And this is much more in line with SQL. No need to parse JSON, XML or
> any other format. Output can be controlled with a WHERE clause.
>
> Rgds
> Remi
>
> 2017-07-31 12:10 GMT+02:00 Pavel Stehule <pavel.steh...@gmail.com>:
>
>> 2017-07-31 11:09 GMT+02:00 Remi Colinet <remi.coli...@gmail.com>:
>>
>>> 2017-07-26 15:27 GMT+02:00 Robert Haas <robertmh...@gmail.com>:
>>>
>>>> On Wed, Jun 21, 2017 at 10:01 AM, Remi Colinet
>>>> <remi.coli...@gmail.com> wrote:
>>>> > test=# SELECT pid, ppid, bid, concat(repeat(' ', 3 * indent), name),
>>>> > value, unit FROM pg_progress(0,0);
>>>> >   pid  | ppid | bid |      concat      |      value       |  unit
>>>> > -------+------+-----+------------------+------------------+---------
>>>> >  14106 |    0 |   4 | status           | query running    |
>>>> >  14106 |    0 |   4 | relationship     | progression      |
>>>> >  14106 |    0 |   4 | node name        | Sort             |
>>>> >  14106 |    0 |   4 | sort status      | on tapes writing |
>>>> >  14106 |    0 |   4 | completion       | 0                | percent
>>>> >  14106 |    0 |   4 | relationship     | Outer            |
>>>> >  14106 |    0 |   4 | node name        | Seq Scan         |
>>>> >  14106 |    0 |   4 | scan on          | t_10m            |
>>>> >  14106 |    0 |   4 | fetched          | 25049            | block
>>>> >  14106 |    0 |   4 | total            | 83334            | block
>>>> >  14106 |    0 |   4 | completion       | 30               | percent
>>>> > (11 rows)
>>>> >
>>>> > test=#
>>>>
>>>> Somehow I imagined that the output would look more like what EXPLAIN
>>>> produces.
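To make the machine-consumer argument concrete: a utility reading the JSON form never has to parse text, it just walks the tree. A minimal sketch in Python, assuming a report shaped like the sample above (only the field names "Node Type", "Outer", "Inner", "rows fetched", "rows total" are taken from the patch's sample output; the report literal here is hand-made for illustration, not real server output):

```python
# Sketch: walk a JSON progress report shaped like the patch's sample
# output and print per-node progress. The document below is a hand-made
# stand-in; only the key names come from the sample in this thread.
import json

def walk(node, depth=0):
    """Yield (depth, node type, rows fetched, rows total) per plan node."""
    yield (depth, node.get("Node Type"),
           node.get("rows fetched"), node.get("rows total"))
    for child_key in ("Outer", "Inner"):
        child = node.get(child_key)
        if child is not None:
            yield from walk(child, depth + 1)

report = json.loads("""
{
  "Node Type": "Nested Loop",
  "Outer": {"Node Type": "Seq Scan", "rows fetched": 1,
            "rows total": 1000000},
  "Inner": {"Node Type": "Materialize",
            "Outer": {"Node Type": "Seq Scan",
                      "rows fetched": 253566, "rows total": 738172}}
}
""")

for depth, ntype, fetched, total in walk(report):
    pct = "" if total is None else f" {100 * fetched // total}%"
    print("  " * depth + f"{ntype}{pct}")
```

The same walk works for any nesting depth, which is exactly what a name/value table forces the consumer to reconstruct by hand from indent levels.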
>>>>
>>>
>>> I had initially used the same output as for the ANALYZE command:
>>>
>>> test=# PROGRESS 14611;
>>>                                PLAN PROGRESS
>>> -----------------------------------------------------------------------------------------
>>>  Gather Merge
>>>    -> Sort => dumping tuples to tapes
>>>       rows r/w merge 0/0 rows r/w effective 0/1464520 0%
>>>       Sort Key: md5
>>>       -> Parallel Seq Scan on t_10m => rows 1464520/4166700 35% blks 36011/83334 43%
>>> (5 rows)
>>>
>>> test=#
>>>
>>> But this restricts the use to "human consumers". Using a table output
>>> with name/value pairs allows use by utilities, for instance, without
>>> parsing. This is less handy for administrators, but far better for
>>> 3rd-party utilities. One solution is otherwise to create a PL/SQL
>>> command on top of the pg_progress() SQL function to produce an output
>>> similar to the one of the ANALYZE command.
>>
>> You can support XML and JSON output formats like EXPLAIN does:
>>
>> https://www.postgresql.org/docs/current/static/sql-explain.html
>>
>> Regards
>>
>> Pavel
>>
>>>> > If the one shared memory page is not enough for the whole progress
>>>> > report, the transfer of the progress report between the two backends
>>>> > is done with a series of requests/responses. Before setting the
>>>> > latch, the monitored backend writes the size of the data dumped into
>>>> > shared memory and sets a status to indicate that more data is to be
>>>> > sent through the shared memory page. The monitoring backend gets the
>>>> > result, sends another signal, and then waits for the latch again.
>>>> > The monitored backend does not collect a new progress report but
>>>> > continues to dump the already collected report. The exchange goes on
>>>> > until the full progress report has been dumped.
>>>>
>>>> This is basically what shm_mq does.  We probably don't want to
>>>> reinvent that code, as it has taken a surprising amount of debugging
>>>> to get it fully working.
>>>>
>>>
>>> Yes, I had once considered this solution but then moved away as I was
>>> unsure of the exact need for the transfer of the progress report
>>> between the monitored and the monitoring backends.
>>> I'm going to switch to shm_mq.
>>>
>>> Thx & Rgds
>>>
>>>> --
>>>> Robert Haas
>>>> EnterpriseDB: http://www.enterprisedb.com
>>>> The Enterprise PostgreSQL Company
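For reference, the page-at-a-time request/response exchange described upthread can be sketched outside the server. This is a plain Python simulation under assumed names (PAGE_SIZE, dump_in_pages, collect are all illustrative; a fixed-size buffer and a "more" flag stand in for the shared memory page, the latch, and the signal — real code would use shm_mq, as Robert suggests):

```python
# Simulation of the protocol described upthread: the monitored side dumps
# an oversized report through a fixed-size "page", flagging whether more
# data follows; the monitoring side keeps requesting pages until the flag
# clears. All names here are illustrative, not taken from the patch.

PAGE_SIZE = 8192  # one shared memory page

def dump_in_pages(report: bytes):
    """Monitored-backend side: yield (chunk, more_to_come) pairs."""
    for offset in range(0, len(report), PAGE_SIZE):
        chunk = report[offset:offset + PAGE_SIZE]
        yield chunk, offset + PAGE_SIZE < len(report)

def collect(report: bytes) -> bytes:
    """Monitoring-backend side: request pages until 'more' clears."""
    received = bytearray()
    for chunk, more in dump_in_pages(report):
        received += chunk   # read the page contents
        if not more:        # monitored side signalled completion
            break
        # otherwise: send another signal and wait for the latch again
    return bytes(received)

full_report = b"x" * 20000  # larger than one page, forcing three exchanges
assert collect(full_report) == full_report
```

The point of the simulation is only to show the shape of the exchange; shm_mq already handles the ring-buffer bookkeeping, partial reads, and detach cases that this sketch glosses over.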