timsaucer commented on PR #21030:
URL: https://github.com/apache/datafusion/pull/21030#issuecomment-4092124577

   One thing I've been thinking about here is performance testing at scale, 
which I haven't really done on the FFI crate. I was thinking we could use 
https://github.com/datafusion-contrib/datafusion-tpch to generate table 
providers at different scale factors. Then we could have a series of tests:
   
   1. Pure Rust with no FFI work.
   2. Pure Rust, but using two modules and passing the table provider via FFI.
   3. Expose the table provider to Python and test with datafusion-python.
   
   What I like about this is that we would be able to see the impact of each 
layer between the code, ideally with going from 2->3 having near-zero 
impact.
   
   For such a test I would think about setting up a stream, reading in and 
dumping the data as fast as possible.
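
   A rough sketch of what such a drain-and-time harness could look like, 
using only the Python standard library. Everything here is hypothetical: 
`drain`, `fake_stream`, and `compare` are illustrative names, and 
`fake_stream` is a stand-in for the real stream a table provider would 
produce in each of the three setups.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def drain(batches: Iterable[List[int]]) -> Tuple[int, float]:
    """Consume every batch and discard it; return (row count, elapsed seconds)."""
    start = time.perf_counter()
    rows = 0
    for batch in batches:
        rows += len(batch)  # touch the batch, then drop it
    return rows, time.perf_counter() - start


def fake_stream(num_batches: int, batch_size: int) -> Iterator[List[int]]:
    """Hypothetical stand-in for a record batch stream from a table provider."""
    for _ in range(num_batches):
        yield [0] * batch_size


def compare(
    scenarios: Dict[str, Callable[[], Iterable[List[int]]]]
) -> Dict[str, Tuple[int, float]]:
    """Drain each named scenario's stream once and collect (rows, seconds)."""
    return {name: drain(make_stream()) for name, make_stream in scenarios.items()}


if __name__ == "__main__":
    # Scenario names are placeholders for the three setups listed above.
    results = compare(
        {
            "no_ffi": lambda: fake_stream(100, 8192),
            "ffi_two_modules": lambda: fake_stream(100, 8192),
            "ffi_python": lambda: fake_stream(100, 8192),
        }
    )
    for name, (rows, seconds) in results.items():
        print(f"{name}: {rows} rows in {seconds:.6f}s")
```

   Taking a factory per scenario rather than a stream keeps stream 
construction out of the timed region, so only the read path is measured.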
   
   Since this is orthogonal to the actual FFI work you're proposing, I might 
try setting this up in a test repo.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

