clflushopt commented on issue #14608:
URL: https://github.com/apache/datafusion/issues/14608#issuecomment-2708663987

   Hey @alamb @lmwnshn, I've actually been following the CMU 15-799 course 
(nights and weekends mostly) and started working on a Rust port of the 
BenchBase Java implementation after seeing this discussion. I am also 
looking at Trino's TPC-H generator and DuckDB's generator.
   
   I have ported most of the randomness logic and tried to keep it compatible 
(bug for bug). I am currently working on sanity checks for the RNG code, but I 
am seeing some discrepancies.
   
   Example (my implementation):
   
   ```
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | r_regionkey | r_name       | r_comment                                                                         |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | 0           | AFRICA       | e. blithely special packages boost finally bold, quiet pains. furiously regular   |
   |             |              | instructions cajole furiously! fina                                               |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | 1           | AMERICA      | counts. ironic, even ideas use                                                    |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | 2           | ASIA         | , ironic platelets. regular, qu                                                   |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | 3           | EUROPE       | . slyly express dolphins use carefully. even                                      |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   | 4           | MIDDLE EAST  | n foxes. slowly unusual deposits might cajole blithely special theodolites.       |
   |             |              | evenly express deposits sleep ca                                                  |
   +-------------+--------------+-----------------------------------------------------------------------------------+
   ```
   
   DuckDB's output after running 
`INSTALL tpch; LOAD tpch; CALL dbgen(sf = 1); select * from region`:
   
   ```
   ┌─────────────┬─────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
   │ r_regionkey │   r_name    │                                                      r_comment                                                      │
   │    int32    │   varchar   │                                                       varchar                                                       │
   ├─────────────┼─────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
   │           0 │ AFRICA      │ ar packages. regular excuses among the ironic requests cajole fluffily blithely final requests. furiously express p │
   │           1 │ AMERICA     │ s are. furiously even pinto bea                                                                                     │
   │           2 │ ASIA        │ c, special dependencies around                                                                                      │
   │           3 │ EUROPE      │ e dolphins are furiously about the carefully                                                                        │
   │           4 │ MIDDLE EAST │  foxes boost furiously along the carefully dogged tithes. slyly regular orbits according to the special epit        │
   └─────────────┴─────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
   ```
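
   A sanity check for this kind of discrepancy could compare the two 
generators' `r_comment` values row by row and report where they first 
diverge. The following is only a minimal sketch: `first_mismatch` is a 
hypothetical helper name, not part of the port, DuckDB, or any existing crate, 
and it assumes both outputs have already been collected as string slices.

   ```rust
   /// Hypothetical helper (an assumption for illustration, not an existing API).
   /// Returns `(row, byte_offset)` of the first difference between two generators'
   /// outputs, or `None` when every row matches and the row counts are equal.
   pub fn first_mismatch(ours: &[&str], reference: &[&str]) -> Option<(usize, usize)> {
       for (row, (a, b)) in ours.iter().zip(reference.iter()).enumerate() {
           if a != b {
               // Length of the common byte prefix, i.e. where the rows diverge.
               let offset = a
                   .bytes()
                   .zip(b.bytes())
                   .take_while(|(x, y)| x == y)
                   .count();
               return Some((row, offset));
           }
       }
       // All shared rows match; a differing row count is still a mismatch,
       // reported at the first row that exists on only one side.
       if ours.len() != reference.len() {
           return Some((ours.len().min(reference.len()), 0));
       }
       None
   }
   ```

   For the AMERICA row shown above (`counts. ironic, even ideas use` versus 
`s are. furiously even pinto bea`), the rows differ from the first byte, so the 
helper would return `Some((0, 0))` when given just that pair.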
   
   I am debugging this issue before adding support for the remaining tables, 
which shouldn't take too long. My implementation is currently a lib crate, 
and I'll also add a CLI crate. My goal is to potentially donate it to the 
[datafusion-contrib](https://github.com/datafusion-contrib) organization and 
stay on as the maintainer, so that we can coordinate how to integrate it 
into DataFusion as an extension. I am keeping the repo private until I 
finish a 0.1.0 release, but I can invite you and others.

