Please join the DataSource V2 meetings; the next one is tomorrow, and we are discussing these very topics right now. DataSource V1 cannot provide this information, but any source that just generates RDDs can specify a partitioner. That is only useful if you are working purely with RDDs, though; for DataFrames, DSV2 is the place to look.
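For the RDD-only case, here is a rough sketch of what I mean by a source specifying a partitioner. The names `PrePartitionedRDD`, `SimplePartition`, and `PrePartitionedExample`, and the data they generate, are made up for illustration; the only real pieces are Spark's `RDD`, `Partition`, `Partitioner`, and `HashPartitioner`. It assumes the source can guarantee that each partition only holds keys the declared partitioner would place there.

```scala
import org.apache.spark.{HashPartitioner, Partition, Partitioner, SparkConf, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical partition descriptor; it only carries the partition index.
class SimplePartition(override val index: Int) extends Partition

// Hypothetical source RDD that generates its own data, already laid out by key hash.
class PrePartitionedRDD(sc: SparkContext, numParts: Int)
    extends RDD[(Int, Int)](sc, Nil) {

  // Declare the existing layout so downstream operations can reuse it.
  override val partitioner: Option[Partitioner] = Some(new HashPartitioner(numParts))

  override protected def getPartitions: Array[Partition] =
    Array.tabulate[Partition](numParts)(i => new SimplePartition(i))

  // Each partition must emit only keys that the declared partitioner maps to it.
  // For Int keys, HashPartitioner uses the non-negative key modulo numParts.
  override def compute(split: Partition, context: TaskContext): Iterator[(Int, Int)] =
    (0 until 100).iterator
      .filter(k => Math.floorMod(k, numParts) == split.index)
      .map(k => (k, 1))
}

object PrePartitionedExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("pre-partitioned")
    val sc = new SparkContext(conf)
    try {
      val source = new PrePartitionedRDD(sc, 4)
      // Reusing the source's own partitioner lets Spark combine values
      // inside the existing partitions.
      val counts = source.reduceByKey(source.partitioner.get, _ + _)
      // Print the lineage to inspect whether a shuffle stage was added.
      println(counts.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```

Since the partitioner passed to `reduceByKey` equals the one the RDD reports, Spark should be able to skip the shuffle here. None of this carries over to DataFrames, which is why DSV2 is where that discussion belongs.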
https://calendar.google.com/event?action=TEMPLATE&tmeid=NzhmcGRka3JscjNiZWFkYnRwNnQ0ZzZlajcgcnVzc2VsbC5zcGl0emVyQG0&tmsrc=russell.spitzer%40gmail.com

On Tue, Apr 16, 2019 at 5:31 PM Long, Andrew <loand...@amazon.com.invalid> wrote:

> Hey Friends,
>
> Is it possible to specify the sort order or bucketing in a way that can be
> used by the optimizer in spark?
>
> Cheers Andrew