I'm a Python guru; if it doesn't have a Python API, I'll likely help
make one :)
Work is bad this week but I'm planning to get started on this next week!
Shannon
On 3/14/16 5:37 AM, Robert Metzger wrote:
Hi Shannon,
I'm happy to see some community engagement on our Python APIs!
On Fri, Mar 11, 2016 at 6:32 PM, Chesnay Schepler <ches...@apache.org>
wrote:
The subtaskIndex is not currently exposed to the python operator.
Fortunately this can be changed very easily:
On the java side, within PythonStreamer.startPython() the python process
is started and several parameters are transferred (L.129++) using
stdin/-out.
These parameters are received on the python side in Environment.execute()
(L.168++).
So the transfer is rather straight-forward, after that you only have to
modify the operator.configure() method to
also take a subtaskIndex argument, modify the RuntimeContext constructor,
add a getIndexOfThisSubtask() method and you're set.
Feel free to open a JIRA for this.
On 11.03.2016 18:15, Shannon Quinn wrote:
Hi all,
I'm interested in getting involved the Python API development. The first
use-case I've encountered in my work is that of zipWithIndex, so I started
looking into how to go about implementing that. It looks like the core of
it involves being able to uniquely identify what worker you're currently in
between distributed calls; the Scala end has
getRuntimeContext().getIndexOfThisSubtask(), but Python's runtime context
is more or less limited to the broadcast variables.
Happy to hear any hints as to how I should get started with this. Thanks.
Regards,
Shannon