[
https://issues.apache.org/jira/browse/BEAM-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784108#comment-16784108
]
Valentyn Tymofieiev commented on BEAM-6748:
-------------------------------------------
Avro read has the same behavior as FastAvro read:
On Python 3, (offset, size) tuples are:
273 16029
16302 16029
32331 16029
48360 16029
64389 16029
80418 16029
96447 16029
112476 16029
128505 16029
144534 16029
160563 13941
On Python 2 they are:
273 64025
64298 64024
128322 46014
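
A quick sanity check of the numbers above, as a sketch rather than anything taken from the Beam code base. It confirms that each list describes contiguous blocks and counts them: 11 on Python 3 versus 3 on Python 2. The helper names are hypothetical. The mapping from block count to the reported split points is an assumption: it treats each tuple in the failing assertion as (blocks already consumed, blocks remaining) while the last block is being read.

```python
# (offset, size) pairs copied from the comment above.
PY3_BLOCKS = [
    (273, 16029), (16302, 16029), (32331, 16029), (48360, 16029),
    (64389, 16029), (80418, 16029), (96447, 16029), (112476, 16029),
    (128505, 16029), (144534, 16029), (160563, 13941),
]
PY2_BLOCKS = [(273, 64025), (64298, 64024), (128322, 46014)]


def is_contiguous(blocks):
    """Return True if every block starts exactly where the previous one ends."""
    return all(
        offset + size == next_offset
        for (offset, size), (next_offset, _) in zip(blocks, blocks[1:])
    )


def end_offset(blocks):
    """Return the offset just past the last block."""
    offset, size = blocks[-1]
    return offset + size


def last_block_split_points(blocks):
    # Assumption (not confirmed by the issue): while the final block is read,
    # the report is (blocks fully consumed, blocks remaining).
    return (len(blocks) - 1, 1)


if __name__ == "__main__":
    for name, blocks in (("Python 3", PY3_BLOCKS), ("Python 2", PY2_BLOCKS)):
        print(name, len(blocks), is_contiguous(blocks),
              end_offset(blocks), last_block_split_points(blocks))
```

Under that assumption, three blocks give (2, 1), which is what the test expects, and eleven blocks give (10, 1), which is what Python 3 reports. This points to the block size used when the test file is written, not to the reader.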
> Splitting logic in Avro IO tests behaves unexpectedly in Python 3
> -----------------------------------------------------------------
>
> Key: BEAM-6748
> URL: https://issues.apache.org/jira/browse/BEAM-6748
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Valentyn Tymofieiev
> Priority: Major
>
> *apache_beam.io.avroio_test.TestAvro.test_split_points*
> *apache_beam.io.avroio_test.TestFastAvro.test_split_points*
> fail with:
>
> {code:java}
> Traceback (most recent call last):
>   File "/home/robbe/workspace/beam/sdks/python/apache_beam/io/avroio_test.py", line 308, in test_split_points
>     self.assertEquals(split_points_report[-10:], [(2, 1)] * 10)
> AssertionError: Lists differ: [(10, 1), (10, 1), (10, 1), (10, 1), (10, 1[42 chars], 1)] != [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2[32 chars], 1)]
> First differing element 0:
> (10, 1)
> (2, 1)
> + [(2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1), (2, 1)]
> - [(10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1),
> - (10, 1)] {code}