I already went ahead with this one. Everything is pretty self-explanatory, and the previous emails seem pretty helpful about how to test things. I don't need answers to my previous questions anymore.

On Fri, May 22, 2020 at 10:12 AM Edgar Klerks <edgar.kle...@gmail.com> wrote:

> Hi there,
>
> I am a potentially new contributor, so don't spend too much time on me.
> However, I would like to give this a try. The reason is that the connection
> between Glue and Spark would be a nice-to-have at my work. We run our own
> Spark clusters and don't use EMR, and right now our Spark jobs can't benefit
> from the Glue metastore. This is not a huge problem, because we keep strict
> naming conventions and use ORC, but it would still be nice for our user
> base.
>
> As you can guess, our cluster runs on AWS. I have a good amount of
> experience with the AWS SDKs and a reasonable amount with Scala. I am,
> however, a beginner with Spark and have never contributed before.
>
> As far as I can see, I need to implement ExternalCatalog for Glue, and Glue
> seems to support all operations specified in the trait. Even user-defined
> functions, which surprised me, because Athena doesn't support them.
>
> I can see some obstacles, e.g. how to deal with permissions. Therefore I
> will study the Hive ExternalCatalog. Can I take that as a leading example?
>
> I also saw there was prior work on the mailing list (
> http://apache-spark-developers-list.1001551.n3.nabble.com/A-new-external-catalog-td23394.html),
> but unfortunately there is no code.
>
> Would this be a suitable project to pick up? I thought it might be,
> because it is kind of on the edge of Spark.
>
> Thanks for your time in advance!
>
> Greets,
>
> Edgar Klerks