FYI - this should be fixed by https://github.com/apache/beam/pull/14960
On Thu, Jun 3, 2021 at 10:00 AM Reuven Lax wrote:
> Correct.
>
> On Thu, Jun 3, 2021 at 9:51 AM Kenneth Knowles wrote:
>
>> I still don't quite grok the details of how this succeeds or fails in
>> different situations. The
Correct.
On Thu, Jun 3, 2021 at 9:51 AM Kenneth Knowles wrote:
> I still don't quite grok the details of how this succeeds or fails in
> different situations. The invalid row succeeds in serialization because the
> coder is not sensitive to the way in which it is invalid?
>
> Kenn
>
> On Wed, Ju
I still don't quite grok the details of how this succeeds or fails in
different situations. The invalid row succeeds in serialization because the
coder is not sensitive to the way in which it is invalid?
Kenn
On Wed, Jun 2, 2021 at 2:54 PM Brian Hulette wrote:
> > One thing that's been on the b
> One thing that's been on the back burner for a long time is making
CoderProperties into a CoderTester like Guava's EqualityTester.
Reuven's point still applies here though. This issue is not due to a bug in
SchemaCoder, it's a problem with the Row we gave SchemaCoder to encode. I'm
assuming a Co
Mutability checking might catch that.
I meant to suggest not putting the check in the pipeline, but offering a
testing discipline that will catch such issues. One thing that's been on
the back burner for a long time is making CoderProperties into a
CoderTester like Guava's EqualityTester. Then it
Could the DirectRunner just do an equality check whenever it does an
encode/decode? It sounds like it's already effectively performing
a CoderProperties.coderDecodeEncodeEqual for every element, just omitting
the equality check.
On Wed, Jun 2, 2021 at 12:04 PM Reuven Lax wrote:
> There is no bug
There is no bug in the Coder itself, so that wouldn't catch it. We could
insert CoderProperties.coderDecodeEncodeEqual in a subsequent ParDo, but if
the Direct runner already does an encode/decode before that ParDo, then
that would have fixed the problem before we could see it.
On Wed, Jun 2, 2021
Would it be caught by CoderProperties?
Kenn
On Wed, Jun 2, 2021 at 8:16 AM Reuven Lax wrote:
> I don't think this bug is schema specific - we created a Java object that
> is inconsistent with its encoded form, which could happen to any transform.
>
> This does seem to be a gap in DirectRunner t
I don't think this bug is schema specific - we created a Java object that
is inconsistent with its encoded form, which could happen to any transform.
This does seem to be a gap in DirectRunner testing though. It also makes it
hard to test using PAssert, as I believe that puts everything in a side
+dev
> I bet the DirectRunner is encoding and decoding in between, which fixes
the object.
Do we need better testing of schema-aware (and potentially other built-in)
transforms in the face of fusion to root out issues like this?
Brian
On Wed, Jun 2, 2021 at 5:13 AM Matthew Ouyang
wrote:
> I
I have some other work-related things I need to do this week, so I will
likely report back on this over the weekend. Thank you for the
explanation. It makes perfect sense now.
On Tue, Jun 1, 2021 at 11:18 PM Reuven Lax wrote:
> Some more context - the problem is that RenameFields outputs (in t
Some more context - the problem is that RenameFields outputs (in this case)
Java Row objects that are inconsistent with the actual schema. For example
if you have the following schema:
Row {
field1: Row {
field2: string
}
}
And rename field1.field2 -> renamed, you'll get the followin
Aha, yes this indeed another bug in the transform. The schema is set on the
top-level Row but not on any nested rows.
On Tue, Jun 1, 2021 at 6:37 PM Matthew Ouyang
wrote:
> Thank you everyone for your input. I believe it will be easiest to
> respond to all feedback in a single message rather th
Thank you everyone for your input. I believe it will be easiest to respond
to all feedback in a single message rather than messages per person.
- NeedsRunner - The tests are run eventually, so obviously all good on
my end. I was trying to run the smallest subset of test cases possible and
This transform is the same across all runners. A few comments on the test:
- Using attachValues directly is error prone (per the comment on the
method). I recommend using the withFieldValue builders instead.
- I recommend capturing the RenameFields PCollection into a local
variable of type PCo
On Tue, Jun 1, 2021 at 12:42 PM Brian Hulette wrote:
> Hi Matthew,
>
> > The unit tests also seem to be disabled for this as well and so I don’t
> know if the PTransform behaves as expected.
>
> The exclusion for NeedsRunner tests is just a quirk in our testing
> framework. NeedsRunner indicates
Hi Matthew,
> The unit tests also seem to be disabled for this as well and so I don’t
know if the PTransform behaves as expected.
The exclusion for NeedsRunner tests is just a quirk in our testing
framework. NeedsRunner indicates that a test suite can't be executed with
the SDK alone, it needs a
17 matches
Mail list logo