Re: [PR] Implement `SHOW FUNCTIONS` [datafusion]

via GitHub Thu, 19 Dec 2024 08:27:13 -0800


goldmedal commented on PR #13799:
URL: https://github.com/apache/datafusion/pull/13799#issuecomment-2554924707


   > One thing I noticed is that including all the information in the output results in a pretty wide output schema
   > 
   > [Screenshot 2024-12-19 at 6 14 52 AM]
   > However, I don't really have any good suggestion to avoid this
   
   Thanks for the review, @alamb. Indeed, I see two directions for improvement:
   - How the CLI displays wide columns.
   - `SHOW FUNCTIONS` can't select only specific columns.
   
   About showing the wide column, our behavior is similar to Postgres `psql`.
   ```
   test=# select '11111111111111111111111111111199999999999999999999999999999991111111111111111111111111111119999999999999999999999999999999',
   '1111111111111111111111111111119999999999999999999999999999999';
                                                             ?column?                                                          |                           ?column?
   ----------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------
    11111111111111111111111111111199999999999999999999999999999991111111111111111111111111111119999999999999999999999999999999 | 1111111111111111111111111111119999999999999999999999999999999
   (1 row)
   ```
   However, DuckDB's output is more readable:
   ```
   D select '11111111111111111111111111111199999999999999999999999999999991111111111111111111111111111119999999999999999999999999999999',
     '1111111111111111111111111111119999999999999999999999999999999';
   ┌──────────────────────────────────────────────────────────────────────────────────────────┬─────────────────────────────────────────────────────────────────┐
   │ '1111111111111111111111111111119999999999999999999999999999999111111111111111111111111…  │ '1111111111111111111111111111119999999999999999999999999999999' │
   │                                         varchar                                          │                             varchar                             │
   ├──────────────────────────────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────┤
   │ 11111111111111111111111111111199999999999999999999999999999991111111111111111111111111…  │ 1111111111111111111111111111119999999999999999999999999999999   │
   └──────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────────────────────┘
   ```
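The truncation behavior in the DuckDB output above can be sketched roughly as follows. This is only an illustration in Python, not DuckDB's or DataFusion's actual code; `truncate_cell`, `render_row`, and the `max_width` parameter are hypothetical names chosen for this example.

```python
def truncate_cell(value: str, max_width: int) -> str:
    """Shorten value to at most max_width characters, ending in '…' if it was cut."""
    if max_width < 1:
        raise ValueError("max_width must be at least 1")
    if len(value) <= max_width:
        return value
    # Reserve one character for the ellipsis marker.
    return value[: max_width - 1] + "…"


def render_row(cells: list[str], max_width: int) -> str:
    """Render one row, capping every cell at max_width characters (hypothetical layout)."""
    shown = [truncate_cell(cell, max_width) for cell in cells]
    return "| " + " | ".join(shown) + " |"


if __name__ == "__main__":
    print(render_row(["1" * 30 + "9" * 31, "short"], max_width=20))
```

A real CLI would presumably derive the width limit from the terminal size rather than take a fixed value.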
   
   
   About the second issue, I have no idea how to improve it 🤔. It's a limitation
of the `SHOW FUNCTIONS` syntax. DuckDB provides the `duckdb_functions()` table
function instead of a `SHOW FUNCTIONS` syntax, so the user can select only the
columns they want.
   If the number of columns is large, DuckDB folds the result like:
   ```
   D select * from duckdb_functions();
   ┌───────────────┬──────────────┬─────────────┬──────────────────────┬───────────────┬───┬──────────────────┬──────────┬──────────────┬─────────┬───────────┐
   │ database_name │ database_oid │ schema_name │    function_name     │ function_type │ … │ has_side_effects │ internal │ function_oid │ example │ stability │
   │    varchar    │   varchar    │   varchar   │       varchar        │    varchar    │   │     boolean      │ boolean  │    int64     │ varchar │  varchar  │
   ├───────────────┼──────────────┼─────────────┼──────────────────────┼───────────────┼───┼──────────────────┼──────────┼──────────────┼─────────┼───────────┤
   │ system        │ 0            │ main        │ read_csv_auto        │ table         │ … │                  │ true     │           70 │         │           │
   │ system        │ 0            │ main        │ read_csv_auto        │ table         │ … │                  │ true     │           70 │         │           │
   │ system        │ 0            │ main        │ arrow_scan           │ table         │ … │                  │ true     │           96 │         │           │
   │ system        │ 0            │ main        │ arrow_scan_dumb      │ table         │ … │                  │ true     │           98 │         │           │
   │ system        │ 0            │ main        │ checkpoint           │ table         │ … │                  │ true     │           72 │         │           │
   ```
   or
   ```
   D select function_name, tags, parameters from duckdb_functions();
   ┌──────────────────────┬──────────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   │    function_name     │         tags         │                                                  parameters                                                  │
   │       varchar        │ map(varchar, varch…  │                                                  varchar[]                                                   │
   ├──────────────────────┼──────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
   │ read_csv_auto        │ {}                   │ [col0, hive_types_autocast, hive_types, union_by_name, filename, dtypes, null_padding, parallel, decimal_s…  │
   │ read_csv_auto        │ {}                   │ [col0, delim, dateformat, column_names, sep, hive_partitioning, header, escape, allow_quoted_nulls, maximu…  │
   │ arrow_scan           │ {}                   │ [col0, col1, col2]                                                                                           │
   │ arrow_scan_dumb      │ {}                   │ [col0, col1, col2]                                                                                           │
   ```
   Maybe we can consider improving the UX of the CLI. 
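As a rough illustration of the column folding seen in the `select * from duckdb_functions()` output, here is a minimal Python sketch. It assumes a fixed limit on the number of displayed columns; DuckDB's real logic may well depend on the terminal width instead, and `fold_columns` and `max_columns` are hypothetical names used only for this example.

```python
def fold_columns(columns: list[str], max_columns: int) -> list[str]:
    """Keep leading and trailing columns, replacing the hidden middle with one '…' marker."""
    if max_columns < 3:
        raise ValueError("max_columns must be at least 3 (head, marker, tail)")
    if len(columns) <= max_columns:
        return list(columns)
    keep = max_columns - 1      # one slot is taken by the '…' marker
    head = (keep + 1) // 2      # the leading side gets the extra column when keep is odd
    tail = keep - head
    return columns[:head] + ["…"] + columns[len(columns) - tail:]


if __name__ == "__main__":
    names = ["database_name", "schema_name", "function_name", "function_type",
             "description", "parameters", "internal", "stability"]
    print(fold_columns(names, max_columns=5))
```

The same marker column would then be emitted as `…` in every data row, as in the DuckDB output above.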


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

