Re: Types in the Python API

Chesnay Schepler Thu, 30 Jul 2015 13:22:54 -0700

To be perfectly honest, I never really managed to work my way through Spark's Python API. It's a whole bunch of magic to me; not even the general structure is understandable.

With "pure python", do you mean doing everything in Python, as in just having serialized data on the Java side?


I believe the way to do this with Flink is to add a switch that
a) disables all type checks
b) creates serializers dynamically at runtime.

a) should be fairly straightforward; b), on the other hand...

By the way, the Python API itself doesn't require the type information; it already does the b) part.
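
To illustrate what b) could mean in practice, here is a minimal sketch of picking a serializer from the runtime type of each value instead of from declared types. This is not Flink's actual implementation; the function names `serialize` and `deserialize` and the one-byte type tags are made up for the example, and only a handful of types are covered.

```python
import struct


def serialize(value):
    """Encode a value by inspecting its runtime type (no declared types)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return b"B" + struct.pack(">?", value)
    if isinstance(value, int):
        return b"I" + struct.pack(">q", value)  # 64-bit signed integer
    if isinstance(value, float):
        return b"F" + struct.pack(">d", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"S" + struct.pack(">i", len(data)) + data
    if isinstance(value, tuple):
        body = b"".join(serialize(item) for item in value)
        return b"T" + struct.pack(">i", len(value)) + body
    raise TypeError("no serializer for type %s" % type(value).__name__)


def deserialize(buf, pos=0):
    """Decode one value starting at pos; return (value, next position)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"B":
        return struct.unpack_from(">?", buf, pos)[0], pos + 1
    if tag == b"I":
        return struct.unpack_from(">q", buf, pos)[0], pos + 8
    if tag == b"F":
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if tag == b"S":
        (length,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if tag == b"T":
        (count,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = deserialize(buf, pos)
            items.append(item)
        return tuple(items), pos
    raise ValueError("unknown type tag %r" % tag)
```

Because every value carries its own type tag, the receiving side can rebuild it without any type information declared up front; the price is a type check per value and one extra byte per element.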


On 30.07.2015 22:11, Gyula Fóra wrote:

That I understand, but could you please tell me how this is done
differently in Spark, for instance?

What would we need to change to make this work with pure Python (as it
seems to be possible)? This probably has large performance implications
though.

Gyula

Chesnay Schepler <[email protected]> wrote (on Thu, 30 Jul 2015,
22:04):

Because it still goes through the Java API, which requires some kind of
type information. Imagine a Java API program where you omit all generic
types; it just wouldn't work as of now.

On 30.07.2015 21:17, Gyula Fóra wrote:

Hey!

Could anyone briefly tell me what exactly is the reason why we force the
users in the Python API to declare types for operators?

I don't really understand how this works in different systems but I am
just curious why Flink has types and why Spark doesn't for instance.

If you give me some pointers to read that would also be fine :)

Thank you,
Gyula
