Hello Beam community. I'm currently trying out the Spark Runner, and while going through the code I noticed that when it evaluates a ParDo operation, it applies more filter operations than seem necessary (from line 467 in TransformTranslator.java).
The original intent of this code seems to be to apply filters because a ParDo can have multiple outputs. In other words, applying the filter makes sense when there are multiple outputs, but I believe that applying it when there is only one output actually degrades pipeline performance, because an equals comparison has to be run on every element. So I changed the code so that the filter is only applied when there are multiple outputs, and tested it. I need to do more testing, but the change didn't affect the output and the results weren't bad.

If this is OK, would it be all right to open a PR? Also, if I'm missing anything, I'd be grateful if you could let me know.

Cheers.
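
P.S. In case it helps the discussion, here is a rough sketch of the idea in plain Java. It is not the actual Spark Runner code: `TagFilterSketch`, `Tagged`, and `splitByTag` are hypothetical names, plain lists stand in for RDDs, and it assumes that when only one output is declared, every element carries that output's tag.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class TagFilterSketch {

    /** Hypothetical stand-in for a (tag, value) pair produced by a ParDo. */
    public static final class Tagged {
        final String tag;
        final Object value;

        public Tagged(String tag, Object value) {
            this.tag = tag;
            this.value = value;
        }
    }

    /**
     * Splits the combined ParDo result into one collection per output tag.
     * Assumption: with a single declared output, all elements carry that tag,
     * so the per-element equals() check can be skipped.
     */
    public static Map<String, List<Object>> splitByTag(List<Tagged> all, Set<String> outputTags) {
        Map<String, List<Object>> result = new LinkedHashMap<>();

        if (outputTags.size() == 1) {
            // Single output: no filter, just unwrap the values.
            String onlyTag = outputTags.iterator().next();
            result.put(onlyTag, all.stream().map(t -> t.value).collect(Collectors.toList()));
            return result;
        }

        // Multiple outputs: one filter pass per tag, comparing every element's tag.
        for (String tag : outputTags) {
            result.put(tag, all.stream()
                    .filter(t -> t.tag.equals(tag))
                    .map(t -> t.value)
                    .collect(Collectors.toList()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tagged> elements = Arrays.asList(
                new Tagged("main", 1), new Tagged("side", "a"), new Tagged("main", 2));
        Set<String> tags = new LinkedHashSet<>(Arrays.asList("main", "side"));
        System.out.println(splitByTag(elements, tags));
    }
}
```

The multi-output branch does one equals comparison per element per tag, which is the cost I'd like to avoid in the single-output case. Whether the single-tag assumption always holds in the runner is exactly the kind of thing I may be missing.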