Apache Spark Architecture and GenevaERS Open Source Community

Kip M Twitchell Fri, 24 Sep 2021 10:55:33 -0700

Spark Development Community:

I lead an open-source project called GenevaERS, which has been in continuous development since the 1990s and shares many characteristics with Spark. Our project has been experimenting with Apache Spark for a year now to see whether the two projects have complementary areas. I wondered if someone from the Spark team would be interested in discussing that.

GenevaERS runs on z/OS, a mainframe operating system, and is an Active Project of the Linux Foundation’s Open Mainframe Project. It has Extract and Format Phases, similar in some respects to Map-Reduce. It is a parallel processing engine that generates and executes highly efficient machine code, designed to resolve all processes (“queries” if you will) in one scan of the source data. It is often used to scan billions of rows of data in a few minutes, at times performing billions of joins or look-ups along the way.

The project team thinks the Spark community may find architectural benefits in learning about the GenevaERS extract engine. Specifically, we think Spark might benefit from the idea of automatically performing multiple functions in one pass through a source, perhaps in the map phase.
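
To make the one-pass idea concrete, here is a minimal Python sketch under stated assumptions. It is neither GenevaERS nor Spark code: the names `Query` and `single_pass`, the sample rows, and the two sample queries are all hypothetical, chosen only to show several independent "queries" being resolved while the source is read once.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, NamedTuple

Row = Dict[str, object]


class Query(NamedTuple):
    """Hypothetical description of one 'query': filter, group key, summed value."""
    name: str
    predicate: Callable[[Row], bool]
    key: Callable[[Row], str]
    value: Callable[[Row], float]


def single_pass(rows: Iterable[Row], queries: List[Query]) -> Dict[str, Dict[str, float]]:
    """Resolve every query while iterating over the source exactly once."""
    results = {q.name: defaultdict(float) for q in queries}
    for row in rows:  # the only scan of the source
        for q in queries:  # every query sees the row while it is in hand
            if q.predicate(row):
                results[q.name][q.key(row)] += q.value(row)
    return {name: dict(totals) for name, totals in results.items()}


# Illustrative data and queries (assumptions, not taken from either project).
rows = [
    {"region": "east", "product": "a", "amount": 10.0},
    {"region": "west", "product": "a", "amount": 5.0},
    {"region": "east", "product": "b", "amount": 7.5},
]
queries = [
    Query("by_region", lambda r: True, lambda r: r["region"], lambda r: r["amount"]),
    Query("large_by_product", lambda r: r["amount"] >= 7.0,
          lambda r: r["product"], lambda r: r["amount"]),
]
output = single_pass(rows, queries)
```

In this sketch the cost of reading the source is paid once regardless of how many queries are registered; only the per-row work grows with the number of queries.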

We typically hold an Open R&D Hour on Fridays at noon ET at the Webex link below. If that time is not convenient, we would be willing to set up another session. https://ibm.webex.com/meet/kip.twitchell

Kip Twitchell

Technical Steering Committee Chair

GenevaERS Project
IBM Global Business Services
[email protected]
630-248-0443 (cell)
