+1
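
As a rough illustration of step 2 in the plan quoted below, here is a
minimal sketch, not a tested configuration. The repository URL is the
hypothetical one from the quoted install example (the repository is not
published yet), and the target name and GIT_TAG value are placeholders
chosen only for illustration:

----
include(ExternalProject)

# Hypothetical target name. The URL and tag are assumptions until the
# object dispatch layer repository actually exists.
ExternalProject_Add(
  object_dispatch_layer
  GIT_REPOSITORY https://github.com/mathworks/object-dispatch-layer.git
  GIT_TAG main
  # <INSTALL_DIR> is expanded by ExternalProject to its install directory.
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
)
----

Pinning GIT_TAG to a specific commit rather than a branch would
probably keep builds reproducible while the layer changes frequently.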

In 
<mn2pr05mb6496c7164ff07547514ffaf7ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
  "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
using MEX" on Fri, 10 Jun 2022 18:22:47 +0000,
  Kevin Gurney <kgur...@mathworks.com> wrote:

> Hi Kou,
> 
> Thank you for helping to clear up our confusion.
> 
>> How do we install the object dispatch layer to use it in
>> apache/arrow? I assumed that something like the following:
>>
>> ----
>> $ git clone https://github.com/mathworks/object-dispatch-layer.git
>> $ cd object-dispatch-layer
>> $ cmake -S . -B build ...
>> $ cmake --build build
>> $ cmake --install build
>> $ git clone https://github.com/apache/arrow.git
>> $ cd arrow/matlab
>> $ cmake -S . -B build # This finds the installed object-dispatch-layer
>> $ cmake --build build
>> $ cmake --install build
>> ----
>>
>> Is my assumption right?
> 
> Your understanding is correct. Thanks for checking.
> 
>> BTW, why do you want to use "git submodule" to use the
>> object dispatch layer? Why don't you install it separately
>> or build by externalproject_add() in CMake?
>> https://cmake.org/cmake/help/latest/module/ExternalProject.html
> 
> After reflecting on your response, we realize that using a git submodule 
> seems like a less than ideal solution.
> 
> Initially, we were thinking that if code within the apache/arrow repository 
> were to "depend" on some MATLAB files from the object dispatch layer, we 
> would need to "physically" (via vendoring or IP Clearance) or "virtually" 
> (via git submodule) integrate this code into the apache/arrow source tree. 
> However, since these files are only needed at build time / run time, the 
> object dispatch layer code does not necessarily need to be redistributable 
> along with the rest of the code in the apache/arrow repository.
> 
> It seems much clearer now that the object dispatch layer can be treated as a 
> "pure" external "library" dependency, and thus, the code should not need to 
> be present in the apache/arrow repository. Per your suggestion, at build / 
> install time, it should be possible to copy any required MATLAB files or C++ 
> header files to appropriate locations, so that they can be used by CMake and 
> MATLAB.
> 
> Using externalproject_add() is a great idea and seems more "in-model" than 
> repeatedly bumping the version of a vendored copy of the object dispatch 
> layer source or using a git submodule.
> 
> To summarize, it sounds like a reasonable path forward would be to:
> 
> 1. Develop the object dispatch layer in an external repository underneath the 
> MathWorks GitHub organization, with a 2-Clause BSD license.
> 2. Use externalproject_add() to fetch and build the source code dynamically.
> 
> Once the object dispatch layer is available on GitHub, I will follow up on 
> this email thread with a link to the repository so that anyone in the 
> community can track development progress, as well as contribute to the 
> framework, if they are interested.
> 
> If anyone has any objections to this approach, please let us know.
> 
> Thank you!
> 
> Kevin Gurney
> 
> ________________________________
> From: Sutou Kouhei <k...@clear-code.com>
> Sent: Friday, June 10, 2022 4:13 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Cc: Kevin Gurney <kgur...@mathworks.com>; Fiona La <fion...@mathworks.com>; 
> Jeremy Hughes <jhug...@mathworks.com>; Nick Haddad <nhad...@mathworks.com>
> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
> objects using MEX
> 
> Hi,
> 
>> 1. A developer would author a custom MEX function that
>> uses C++ "building blocks" (i.e. classes and header files)
>> from the object dispatch layer "framework". They would
>> link their custom MEX function against a helper shared
>> library that is built from the source code of the object
>> dispatch layer and provides the symbols/implementation for
>> the aforementioned C++ "building blocks".
> 
> How do we install the object dispatch layer to use it in
> apache/arrow? I assumed that something like the following:
> 
> ----
> $ git clone https://github.com/mathworks/object-dispatch-layer.git
> $ cd object-dispatch-layer
> $ cmake -S . -B build ...
> $ cmake --build build
> $ cmake --install build
> $ git clone https://github.com/apache/arrow.git
> $ cd arrow/matlab
> $ cmake -S . -B build # This finds the installed object-dispatch-layer
> $ cmake --build build
> $ cmake --install build
> ----
> 
> Is my assumption right?
> 
>> Essentially, for a developer to use the object dispatch
>> layer, they will need to author a fair amount of custom
>> code which makes use of both MATLAB and C++ "building
>> blocks" from the "framework".
> 
> I think that this is a normal library usage. For example,
> our S3 filesystem module implementation in C++ has about
> 2500 lines and uses classes provided by the AWS SDK for C++:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/s3fs.cc
> 
> 
>> If we had to go through the IP Clearance Process, would
>> that mean we would need to repeatedly clear the code every
>> time we wanted to sync up the git submodule with the
>> latest source code from the external repository? It seems
>> like this would quickly become impractical since we
>> anticipate the need to iterate frequently on the object
>> dispatch layer early on.
> 
> If the object dispatch layer doesn't depend on Apache Arrow
> and is a general purpose framework, we can vendor it without
> IP clearance.
> e.g.: https://github.com/apache/arrow/tree/master/cpp/src/arrow/vendored/xxhash
> 
> BTW, why do you want to use "git submodule" to use the
> object dispatch layer? Why don't you install it separately
> or build by externalproject_add() in CMake?
> https://cmake.org/cmake/help/latest/module/ExternalProject.html
> 
> 
> Thanks,
> --
> kou
> 
> In 
> <byapr05mb648755bee2e0caccb0f407d9ae...@byapr05mb6487.namprd05.prod.outlook.com>
> "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
> using MEX" on Wed, 8 Jun 2022 15:59:13 +0000,
> Kevin Gurney <kgur...@mathworks.com> wrote:
> 
>> Hi Kou,
>>
>> ---
>>
>> Note: I am replying to your email as a forward from Fiona (Cc'd), since your 
>> original email was accidentally blocked by my email client.
>>
>> ---
>>
>> The way that we expected the object dispatch layer to be used by client code 
>> is as follows:
>>
>> 1. A developer would author a custom MEX function that uses C++ "building 
>> blocks" (i.e. classes and header files) from the object dispatch layer 
>> "framework". They would link their custom MEX function against a helper 
>> shared library that is built from the source code of the object dispatch 
>> layer and provides the symbols/implementation for the aforementioned C++ 
>> "building blocks".
>>
>> 2. The object dispatch layer expects the compiled MEX function to have a 
>> specific name and be available on the MATLAB Search Path [1] so that it can 
>> be used by the MATLAB side of the object dispatch layer.
>>
>> 3. Once the MEX function is available on the MATLAB Search Path, client 
>> MATLAB code can use a set of MATLAB "building blocks" (i.e. classes), which 
>> are part of the object dispatch layer "framework", to connect a MATLAB class 
>> with a corresponding C++ class.
>>
>> Essentially, for a developer to use the object dispatch layer, they will 
>> need to author a fair amount of custom code which makes use of both MATLAB 
>> and C++ "building blocks" from the "framework".
>>
>> It's not clear to me whether the steps described above classify as "library 
>> usage" with regard to the IP Clearance Process.
>>
>> If we had to go through the IP Clearance Process, would that mean we would 
>> need to repeatedly clear the code every time we wanted to sync up the git 
>> submodule with the latest source code from the external repository? It seems 
>> like this would quickly become impractical since we anticipate the need to 
>> iterate frequently on the object dispatch layer early on.
>>
>> It's quite possible that I am not answering your questions completely, so 
>> please let me know if anything is unclear. My apologies in advance for any 
>> confusion.
>>
>> [1] https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html
>>
>> Best,
>>
>> Kevin Gurney
>>
>> ________________________________
>> From: Fiona La <fion...@mathworks.com>
>> Sent: Wednesday, June 8, 2022 11:24 AM
>> To: Kevin Gurney <kgur...@mathworks.com>
>> Subject: FW: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
>> objects using MEX
>>
>> From: Sutou Kouhei <k...@clear-code.com>
>> Date: Tuesday, June 7, 2022 at 8:36 PM
>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>> Cc: Fiona La <fion...@mathworks.com>, Jeremy Hughes <jhug...@mathworks.com>, 
>> Nick Haddad <nhad...@mathworks.com>
>> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ 
>> objects using MEX
>>
>> Hi,
>>
>> Can we use the object dispatch layer as a library? Or should
>> we copy (or submodule) the object dispatch layer to
>> apache/arrow?
>>
>> If we can use the object dispatch layer as a library, we can
>> just use it as an external library like GoogleTest. We don't
>> need IP clearance. You can use any Apache License 2.0
>> compatible license for the object dispatch layer.
>>
>> Thanks,
>> --
>> kou
>>
>> In 
>> <mn2pr05mb6496cff053c60e54c133c93cae...@mn2pr05mb6496.namprd05.prod.outlook.com>
>> "[MATLAB] Integrating a framework for connecting MATLAB and C++ objects 
>> using MEX" on Tue, 7 Jun 2022 18:10:43 +0000,
>> Kevin Gurney <kgur...@mathworks.com> wrote:
>>
>>> Hi All,
>>>
>>> I am reaching out to seek guidance from the community regarding a code 
>>> integration puzzle.
>>>
>>> The architecture that we are currently pursuing for the MATLAB interface to 
>>> Arrow [1] involves dispatching to the Arrow C++ libraries using MEX (a 
>>> MATLAB facility for calling C/C++ code [2]). A major challenge with this 
>>> approach has been keeping Arrow C++ objects (e.g. arrow::Array) alive in 
>>> memory for the appropriate amount of time and making it easy to interface 
>>> with them from MATLAB.
>>>
>>> MATLAB has a recommended solution for this problem [3]. However, we've been 
>>> pursuing a MEX-based solution due to the pervasiveness of MEX and its 
>>> familiarity to MATLAB users. Our hope is that using MEX will make it easy 
>>> for others to contribute to the MATLAB interface.
>>>
>>> To help maintain the connection between MATLAB objects and C++, we've been 
>>> experimenting with a MEX-based object dispatch layer. The primary goal of 
>>> this work is to unblock development of the MATLAB interface to Arrow. 
>>> However, this object dispatch layer is non-trivial and ultimately unrelated 
>>> to the Arrow project's core mission. Therefore, submitting this code to the 
>>> Arrow project doesn't seem like the optimal code integration strategy.
>>>
>>> We’ve been considering the possibility of creating a new open-source 
>>> repository under the MathWorks GitHub organization [4] to host the object 
>>> dispatch layer (a side effect of this approach is that it may help 
>>> encourage reuse of this infrastructure in future open-source MATLAB 
>>> projects).
>>>
>>> However, this approach would come with notable tradeoffs:
>>>
>>> 1. We would need to follow the ASF IP Clearance Process [5] to integrate 
>>> this code into the Arrow project (it's possible we are mistaken about this).
>>>
>>> 2. It's not obvious how we should keep the code in sync. Would it be 
>>> possible to use a git submodule [6] to "symlink" to the external repo?
>>>
>>> 3. What about licensing? Does the code need to be Apache licensed, or would 
>>> it be possible to use another Apache-compatible license [7], like BSD? BSD 
>>> is the default choice for new projects hosted under the MathWorks GitHub 
>>> organization.
>>>
>>> Admittedly, we aren't sure what the best path forward is, so we appreciate 
>>> the community's guidance. We welcome any suggestions.
>>>
>>> [1] https://github.com/apache/arrow/tree/master/matlab
>>> [2] https://www.mathworks.com/help/matlab/call-mex-files-1.html
>>> [3] https://www.mathworks.com/help/matlab/build-matlab-interface-to-c-library.html
>>> [4] https://github.com/mathworks
>>> [5] https://incubator.apache.org/ip-clearance/
>>> [6] https://github.blog/2016-02-01-working-with-submodules/
>>> [7] https://www.apache.org/legal/resolved.html#category-a
>>>
>>> Thank you,
>>>
>>> Kevin Gurney
