On Sun, Jul 12, 2020 at 2:44 PM <anthony.ab...@gmail.com> wrote:
>
> Wes,
>
> I thought Arrow was (or at least includes) an open standard for
> interoperability? There are even specific 'implementation guidelines'
> regarding supporting parts or all of the specification.

That's true, but at the moment there is not any C# library available
that has been demonstrated (by passing the integration tests) to
correctly implement the columnar specification. This idea of
"instability" or "precariousness" is a non-issue if the C# development
community will follow the example of the other reference
implementations and implement the integration tests. We discussed this
in a JIRA issue last fall:

https://issues.apache.org/jira/browse/ARROW-7156

To summarize, without integration tests, an Arrow reference
implementation can't be considered seaworthy. For example, recently
the Rust library found only through integration testing that some
parts of the format aren't implemented correctly.

In any case, I'd like to do what I can to help the C# ecosystem have a
trustworthy reference implementation of Arrow that can be used to
build production applications.

> It appears that fragmentation is already a problem (i.e. private forks)

Private forks are only a problem if there is a permanent divergence
with no intent to upstream patches. With any open source project you
see organizations apply patches to upstream for reasons of business
expediency and then work to upstream those patches.

> Where I work, we don't trust the C# library to do anything other than what
> we know works: writing certain large files with only a subset of the
> supported column types. We had even considered switching to C++, but I
> was able to get something stable. To give you an idea of how precarious it
> is though, we can't even read the files we just created (but we know they
> work since they open fine in R). We decided having a write-only library
> was 'good enough' since we don't need to consume the files ourselves.
>
> I decided some time ago that to get the features I want / need out of an
> Arrow library, it was easier to build an independent implementation
> directly from the spec/.fbs, rather than try to apply bandages to what
> already existed.  Aside from the numerous bugs, the current library is just
> not designed for parallelism and speed.

I think it's fine to enumerate your criticisms of the current codebase
and make constructive recommendations about what you would like to see
change. I find that people have many reasons for not contributing to
an existing open source project, so I want to make sure I understand
yours, whether it is one of:

* Not wanting to refactor and work within an existing codebase
* Belief that there is resistance or difficulty having patches accepted
* For reasons of business expediency, not wanting to collaborate (e.g.
in code reviews) with developers outside of your organization or
participate in a process where one does not have unilateral control
over when commits are merged to master

I haven't seen any resistance to C# PRs. If anyone is ever concerned
about this, please raise it on the mailing list.

Thanks,
Wes

>
> On Sun, Jul 12, 2020 at 1:36 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Anthony,
> >
> > On Sun, Jul 12, 2020 at 12:13 PM <anthony.ab...@gmail.com> wrote:
> > >
> > > I am in the same position as Adam - We don't use the official apache
> > > arrow library any more either and have been using an old fork with our
> > > own (probably the same) bug fixes.
> > >
> > > Personally, I have somewhat given up on the Apache .NET library... I have
> > > an alternative C# Arrow library that I have written (from the FlatBuffers
> > > spec) that has C# features I need / want (async/await with Tasks,
> > > IAsyncEnumerable, multi-threading, high performance, serialization
> > > plugins, etc.). I am considering releasing it since I think many others
> > > could benefit from it over the current library.
> >
> > I'm a bit confused by this; it seems like avoiding fragmentation and
> > having a canonical library that is well-supported by the community is
> > the goal we are all working toward. Why would the current library not
> > evolve to have the features you need? I don't think there is any
> > barrier (aside from having to respond to code review comments) to
> > having patches accepted.
> >
> > From my perspective, we accepted an initial C# code donation from
> > Feyen Zylstra but then there were not many further contributions from
> > this organization. Eric from Microsoft has done some development work,
> > but otherwise it seems like we are still in "community bootstrapping"
> > mode. If there are individuals who are invested in having a good
> > standard Arrow library for C#, you are as free as any other open
> > source contributor to take up a de facto leadership role in the
> > project.
> >
> > To have an Arrow library that can be trusted for mission critical work
> > (i.e. that passes the integration test suite, in particular) is a
> > significant amount of work, so I'm concerned that if the C# community
> > does not pool efforts on this, the most likely outcome is that Arrow as
> > a technology will simply fail to get traction in the .NET world.
> >
> > > -Anthony
> > >
> > >
> > > On Fri, Jul 10, 2020 at 2:23 PM Eric Erhardt
> > > <eric.erha...@microsoft.com.invalid> wrote:
> > >
> > > > I agree with Adam, the more usage and feedback we can get the better on
> > > > the .NET Library.
> > > >
> > > > > However there is no library for C# listed anywhere else in the
> > > > > documentation.
> > > >
> > > > We have some XML style doc comments in the code. It would be great if we
> > > > could generate a website/markdown from those XML files produced by the
> > > > build. And then get it shown under the Documentation tab on
> > > > https://arrow.apache.org/.  I've opened
> > > > https://issues.apache.org/jira/browse/ARROW-9406 for this.
> > > >
> > > > Eric
> > > >
> > > > -----Original Message-----
> > > > From: Adam Szmigin <adam.szmi...@xsco.net>
> > > > Sent: Friday, July 10, 2020 6:28 AM
> > > > To: dev@arrow.apache.org
> > > > Subject: [EXTERNAL] Re: .NET support for Arrow
> > > >
> > > > Hi Yash,
> > > >
> > > > My organisation is using the C# library for a product we are working on.
> > > > However, we are using a fork which includes a number of bug-fixes for
> > > > issues that would have otherwise blocked us. I've raised a few PRs to fix
> > > > these upstream.
> > > >
> > > > I think it's fair to say that the C# library is at an early stage of
> > > > development at the moment.  The more people who are able to test and
> > > > contribute back, the better.
> > > >
> > > > Kind regards,
> > > >
> > > >
> > > > --
> > > > Adam Szmigin
> > > >
> > > > On 10/07/2020 04:05, Yash Ganthe wrote:
> > > > > Hi,
> > > > >
> > > > > The first paragraph of docs at https://arrow.apache.org/ says it
> > > > > supports C#.
> > > > > However there is no library for C# listed anywhere else in the
> > > > > documentation. Is .NET supported at all?
> > > > >
> > > > > Regards,
> > > > > Yash
> > > > >
> > > >
> >
