Ok - that makes sense. My specific workaround was to remove the with_output_types for now, so advising the user on this in the error message would be nice. I was just worried about silently passing.
As for the formalization: 1. I am a little confused on how this is different than passing multiple tagged inputs to a PTransform that does a CoGroupBy*. In this case, with_input_types seems to expect a union of all the types for the keyed values. Why would the same not work for output types? 2. What is the process for proposing a formalized solution? Should I start a document, or does one already exist? Or does this kind of thing get tracked via Jira issues? Best, Joshua On Tue, Mar 31, 2020 at 11:07 AM Udi Meiri <[email protected]> wrote: > Hi Joshua, > I've been working on type hints recently. > Your issue is similar to this: > https://issues.apache.org/jira/browse/BEAM-8782 (exactly the same if tags > are passed to with_outputs() in the example). > There's also this related bug about type inference: > https://issues.apache.org/jira/browse/BEAM-4132 > > I agree with Luke that it would be helpful to point to a workaround in the > error message (such as removing with_output_types). > > From what I remember, we'll need to formalize how multi-output type hints > are provided to Beam. > For example, by passing keywords to with_output_types: main=type, > TAG=type, etc. > > On Tue, Mar 31, 2020 at 9:55 AM Luke Cwik <[email protected]> wrote: > >> I can see that argument but what does a user need to do in this case if >> we raise NotImplementedError? Would the need to disable type checking >> everywhere? >> >> Over the long term users will need to deal with improvements to type >> checking and will need to fix typing errors when they change Apache Beam >> versions. >> >> >> On Tue, Mar 31, 2020 at 9:34 AM Joshua B. Harrison < >> [email protected]> wrote: >> >>> The current code errors out with a cryptic message around tag types in >>> the multi-output. Adding a NotImplementedError was just an attempt to make >>> the failure reason more clear. >>> >>> I would be worried about trivially passing because then the user might >>> think they have type checking safety when they don't, which could cause >>> failures at later stages and might be hard to debug. Do you agree? >>> >>> Best, >>> Joshua >>> >>> On Tue, Mar 31, 2020 at 10:16 AM Luke Cwik <[email protected]> wrote: >>> >>>> Would the NotImplementedError cause users pipeline errors or is that a >>>> signal to the type checking mechanism to ignore it? >>>> If this would cause failures I would rather make the unsupported case >>>> return something that would be trivially true. >>>> >>>> On Mon, Mar 30, 2020 at 12:01 PM Joshua B. Harrison < >>>> [email protected]> wrote: >>>> >>>>> Hey all, >>>>> >>>>> I brought up an issue recently on the user forums noting issues around >>>>> type hints and multi-output PTransforms: >>>>> https://lists.apache.org/thread.html/r94bf2e43f09a290dbe87d5a8d7eedb34ea215e0bea861521cbdb0c1c%40%3Cuser.beam.apache.org%3E >>>>> >>>>> As mentioned there, I think that a NotImplementedError should be >>>>> raised when attaching type hints to multi-output PTransforms while the >>>>> correct implementation is figured out. And that a 'correct' implementation >>>>> would look something like the Union typehints that are expected on >>>>> multi-input PTransforms. >>>>> >>>>> I am happy to help out and wanted to get the discussion started around >>>>> what the community would like to see here. Thank you all for a great >>>>> product. >>>>> >>>>> Best, >>>>> Joshua >>>>> >>>>> -- >>>>> Joshua Harrison | Software Engineer | [email protected] >>>>> <[email protected]> | 404-433-0242 <(404)%20433-0242> >>>>> >>>> >>> >>> -- >>> Joshua Harrison | Software Engineer | [email protected] >>> <[email protected]> | 404-433-0242 <(404)%20433-0242> >>> >> -- Joshua Harrison | Software Engineer | [email protected] <[email protected]> | 404-433-0242
