Pietro Cerutti created AVRO-3710:
------------------------------------

             Summary: C++ - don't take ownership of I/OStream in DataFile
                 Key: AVRO-3710
                 URL: https://issues.apache.org/jira/browse/AVRO-3710
             Project: Apache Avro
          Issue Type: Improvement
          Components: c++
            Reporter: Pietro Cerutti


In AVRO-2014, I raised concerns regarding how DataFile(Reader|Writer)[Base] 
take their streams by unique_ptr.

Here, I would like to propose a fix.

The problem is that, because streams are taken by unique_ptr, caller code 
doesn't have access to the streams after the construction of the DataFile* 
object. This makes it impossible to use custom streams, or even the built-in 
memory streams:
{code:cpp}
auto schema{ giveMeMySchema() };
auto os{ avro::memoryOutputStream() };
DataFileWriter w{ std::move(os), schema }; // the unique_ptr must be moved in
auto is{ memoryInputStream(*os) }; // ouch: os is gone (null after the move)
doSomethingWithTheAvroStream(is);
{code}
 
I am proposing an almost backwards-compatible change, which is to change the 
DataFile classes to take and hold the streams by shared_ptr.
The semantics for client code don't change: you can still move a 
unique_ptr<Stream> into the DataFile constructor, and in that case the DataFile 
will be the only owner.
But this enables a client to do something like this:

{code:cpp}
auto schema{ giveMeMySchema() };
auto os{ avro::memoryOutputStream() };
std::shared_ptr<OutputStream> shared_os{ os.get(), boost::null_deleter{} };
DataFileWriter w{ shared_os, schema };
auto is{ memoryInputStream(*os) }; // good: os is alive
doSomethingWithTheAvroStream(is); 
{code}
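
To illustrate the ownership semantics in isolation, here is a minimal sketch that does not depend on Avro. `Stream` and `Writer` are hypothetical stand-ins (not the real `OutputStream` or `DataFileWriter`), and a no-op lambda is assumed as a standard-library substitute for `boost::null_deleter`. It shows that a constructor taking a shared_ptr still accepts a moved unique_ptr, and that a caller can alternatively keep access to the stream:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a stream; not an Avro class.
struct Stream {
    std::string data;
};

// Hypothetical writer that holds its stream by shared_ptr, as proposed.
class Writer {
public:
    explicit Writer(std::shared_ptr<Stream> s) : stream_(std::move(s)) {}
    void write(const std::string &bytes) { stream_->data += bytes; }

private:
    std::shared_ptr<Stream> stream_;
};

// Existing call style: a unique_ptr is moved in and converts implicitly to
// shared_ptr, so the writer becomes the only owner.
inline bool moveLeavesCallerEmpty() {
    auto os = std::make_unique<Stream>();
    Writer w(std::move(os));
    w.write("avro");
    return os == nullptr; // a moved-from unique_ptr is null
}

// New call style: the caller shares ownership and can still read the stream.
inline std::string sharedRoundTrip() {
    auto os = std::make_shared<Stream>();
    Writer w(os);
    w.write("avro");
    return os->data;
}

// Non-owning alias: the unique_ptr keeps ownership, and the shared_ptr
// handed to the writer uses a no-op deleter.
inline std::string borrowedRoundTrip() {
    auto os = std::make_unique<Stream>();
    std::shared_ptr<Stream> alias(os.get(), [](Stream *) {});
    {
        Writer w(alias); // the writer must not outlive os
        w.write("avro");
    }
    return os->data;
}
```

With the non-owning alias, the caller becomes responsible for keeping the stream alive for as long as the writer uses it; sharing ownership through a regular shared_ptr avoids that lifetime constraint.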

I will submit a PR if this is accepted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
