On Tue, 7 Feb 2023 at 19:32, Quentin Lhoest <quen...@huggingface.co> wrote:
>
> Hi,
>
> If I remember correctly one can already pass `types_mapper`
> to `pa.Table.to_pandas`, to allow Ray or HF Datasets to define
> their own pandas extension types associated to the arrow
> extension types. I guess this could also be used until there is a decision
> to include those types in Arrow or not ?
>
Yes, that's correct (although we should verify that this also works to override the default conversion for extension types, i.e. that `types_mapper` takes priority in deciding the resulting pandas extension dtype).

For packages like Ray or HF Datasets, that might be a good enough solution. For end users it is less convenient, because you need to specify it every time you convert from Arrow to pandas, whereas the `to_pandas_dtype` mechanism is used by default.

Joris