I agree with independent validation.
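
To illustrate the difference, here is a minimal Python sketch. It is not
Arrow or Flight code: batches are modeled as plain lists of rows, and the
names expected, received, concatenated_equal, and batches_equal are
hypothetical. It shows that comparing the concatenated data hides a dropped
trailing empty batch, while a per-batch comparison catches it.

```python
from itertools import chain

# Hypothetical stand-ins for record batches: each batch is a list of rows.
expected = [[1, 2], [], [3], []]  # batches as read from the JSON file
received = [[1, 2], [], [3]]      # stream with the trailing empty batch dropped


def concatenated_equal(a, b):
    """Compare content only, as when both sides are read into one table."""
    return list(chain.from_iterable(a)) == list(chain.from_iterable(b))


def batches_equal(a, b):
    """Compare batch by batch, so batch boundaries and empty batches count."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


print(concatenated_equal(expected, received))  # True
print(batches_equal(expected, received))       # False
```

Only the per-batch check notices that the number of batches differs.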

On Tue, Feb 25, 2020 at 2:55 PM David Li <[email protected]> wrote:
>
> Hey Bryan,
>
> Thanks for looking into this issue. I would vote that we should
> validate each batch independently, so we can catch issues related to
> the structure of the data and not just the content. C++ doesn't do any
> detection of empty batches per se, but on both ends it reads all the
> data into a table, which would eliminate any empty batches.
>
> It also wouldn't be reasonable to stop sending batches that are empty,
> because Flight lets you attach metadata to batches, and so an empty
> batch might still have metadata that the client or server wants.
>
> Best,
> David
>
> On 2/24/20, Bryan Cutler <[email protected]> wrote:
> > While looking into Null type testing for ARROW-7899, a couple small issues
> > came up regarding Flight integration testing with empty batches (row count
> > == 0) that could be worked out with a quick discussion. It seems there is a
> > small difference between the C++ and Java Flight servers when there are
> > empty record batches at the end of a stream; more details are in PR
> > https://github.com/apache/arrow/pull/6476.
> >
> > The Java server sends all record batches, even the empty ones, and the test
> > client verifies each of these batches matches the batches read from a JSON
> > file. The C++ server seems to recognize if the end of the stream is only
> > empty batches (please correct me if I'm wrong) and will not serve them.
> > This seems reasonable, as there is no more actual data left in the stream.
> > The C++ test client reads all batches into a table, does the same for the
> > JSON file, and compares final Tables. I also noticed that empty batches in
> > the middle of the stream will be served.  My questions are:
> >
> > 1) What is the expected behavior of a Flight server for empty record
> > batches? Can they be ignored and not sent to the client?
> >
> > 2) Is it good enough to test against a final concatenation of all batches
> > in the stream or should each batch be verified individually to ensure the
> > server is sending out correctly batched data?
> >
> > Thanks,
> > Bryan
> >
