Hey folks, please let me know if this is more of a user@ post!

I'm building a Spark connector for my company's data-lake-ish product, and
it looks like there's very little documentation on how to go about it.

I found ExternalCatalog a few days ago and have been implementing one, but
DataSourceRegister / SupportsCatalogOptions seems to be another popular
approach. I'm not sure offhand how the two overlap or intersect yet.
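
For concreteness, here's roughly the shape I have in mind for the second
approach (a minimal Spark 3.x DSv2 sketch; MyLakeDataSource, the "mylake"
short name, and the "namespace"/"table" option keys are all placeholders of
mine):

    import java.util

    import org.apache.spark.sql.connector.catalog.{Identifier, SupportsCatalogOptions, Table}
    import org.apache.spark.sql.connector.expressions.Transform
    import org.apache.spark.sql.sources.DataSourceRegister
    import org.apache.spark.sql.types.StructType
    import org.apache.spark.sql.util.CaseInsensitiveStringMap

    // Placeholder connector; the ??? bodies would talk to our metadata service.
    class MyLakeDataSource extends DataSourceRegister with SupportsCatalogOptions {

      // The name users pass to spark.read.format(...)
      override def shortName(): String = "mylake"

      // TableProvider: derive the table's schema from the read options.
      override def inferSchema(options: CaseInsensitiveStringMap): StructType =
        ??? // look up the schema in the lake's metadata

      override def getTable(
          schema: StructType,
          partitioning: Array[Transform],
          properties: util.Map[String, String]): Table =
        ??? // return a Table implementing SupportsRead / SupportsWrite

      // SupportsCatalogOptions: map the options to a catalog identifier so
      // saveAsTable-style paths resolve through a catalog plugin.
      override def extractIdentifier(options: CaseInsensitiveStringMap): Identifier =
        Identifier.of(Array(options.get("namespace")), options.get("table"))
    }

As I understand it, the class also has to be listed in a
META-INF/services/org.apache.spark.sql.sources.DataSourceRegister file for
format("mylake") to resolve, and a catalog implementation gets registered
separately via the spark.sql.catalog.<name> config, but please correct me if
I'm off base there.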

I've also noticed a few implementations that put some of their code in
org.apache.spark.* packages in addition to their own; presumably this isn't
by accident. Is this practice necessary to get around package-private
visibility or something?
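
For reference, the pattern I'm seeing looks roughly like this (my own
sketch of my reading of it; MySqlShims is a made-up name, and
SparkSession.sessionState is just one example of a private[sql] member):

    // Ships in the connector's jar, but is declared under org.apache.spark.sql,
    // apparently so it can see private[sql] members.
    package org.apache.spark.sql

    import org.apache.spark.sql.internal.SessionState

    object MySqlShims {
      // SparkSession.sessionState is private[sql]; re-expose it to code
      // living outside the org.apache.spark.sql package.
      def sessionState(spark: SparkSession): SessionState = spark.sessionState
    }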

Thanks!

-0xe1a
