+1

In <mn2pr05mb6496c7164ff07547514ffaf7ae...@mn2pr05mb6496.namprd05.prod.outlook.com>
  "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX" on Fri, 10 Jun 2022 18:22:47 +0000,
  Kevin Gurney <kgur...@mathworks.com> wrote:
> Hi Kou,
>
> Thank you for helping to clear up our confusion.
>
>> How do we install the object dispatch layer to use it in
>> apache/arrow? I assumed something like the following:
>>
>> ----
>> $ git clone https://github.com/mathworks/object-dispatch-layer.git
>> $ cd object-dispatch-layer
>> $ cmake -S . -B build ...
>> $ cmake --build build
>> $ cmake --install build
>> $ git clone https://github.com/apache/arrow.git
>> $ cd arrow/matlab
>> $ cmake -S . -B build # This finds the installed object-dispatch-layer
>> $ cmake --build build
>> $ cmake --install build
>> ----
>>
>> Is my assumption right?
>
> Your understanding is correct. Thanks for checking.
>
>> BTW, why do you want to use "git submodule" to use the
>> object dispatch layer? Why don't you install it separately
>> or build it with externalproject_add() in CMake?
>> https://cmake.org/cmake/help/latest/module/ExternalProject.html
>
> After reflecting on your response, we realize that using a git submodule
> seems like a less than ideal solution.
>
> Initially, we were thinking that if code within the apache/arrow repository
> were to "depend" on some MATLAB files from the object dispatch layer, we
> would need to "physically" (via vendoring or IP Clearance) or "virtually"
> (via git submodule) integrate this code into the apache/arrow source tree.
> However, since these files are only needed at build time / run time, this
> means that the object dispatch layer code does not necessarily need to be
> redistributable along with the rest of the code in the apache/arrow
> repository.
>
> It seems much clearer now that the object dispatch layer can be treated as a
> "pure" external "library" dependency, and thus, the code should not need to
> be present in the apache/arrow repository.
> Per your suggestion, at build /
> install time, it should be possible to copy any required MATLAB files or C++
> header files to appropriate locations, so that they can be used by CMake and
> MATLAB.
>
> Using externalproject_add() is a great idea and seems more "in-model" than
> repeatedly bumping the version of a vendored copy of the object dispatch
> layer source or using a git submodule.
>
> To summarize, it sounds like a reasonable path forward would be to:
>
> 1. Develop the object dispatch layer in an external repository underneath the
>    MathWorks GitHub organization, with a 2-Clause BSD license.
> 2. Use externalproject_add() to fetch and build the source code dynamically.
>
> Once the object dispatch layer is available on GitHub, I will follow up on
> this email thread with a link to the repository so that anyone in the
> community can track development progress, as well as contribute to the
> framework, if they are interested.
>
> If anyone has any objections to this approach, please let us know.
>
> Thank you!
>
> Kevin Gurney
>
> ________________________________
> From: Sutou Kouhei <k...@clear-code.com>
> Sent: Friday, June 10, 2022 4:13 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Cc: Kevin Gurney <kgur...@mathworks.com>; Fiona La <fion...@mathworks.com>;
> Jeremy Hughes <jhug...@mathworks.com>; Nick Haddad <nhad...@mathworks.com>
> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++
> objects using MEX
>
> Hi,
>
>> 1. A developer would author a custom MEX function that
>> uses C++ "building blocks" (i.e. classes and header files)
>> from the object dispatch layer "framework". They would
>> link their custom MEX function against a helper shared
>> library that is built from the source code of the object
>> dispatch layer and provides the symbols/implementation for
>> the aforementioned C++ "building blocks".
>
> How do we install the object dispatch layer to use it in
> apache/arrow?
> I assumed something like the following:
>
> ----
> $ git clone https://github.com/mathworks/object-dispatch-layer.git
> $ cd object-dispatch-layer
> $ cmake -S . -B build ...
> $ cmake --build build
> $ cmake --install build
> $ git clone https://github.com/apache/arrow.git
> $ cd arrow/matlab
> $ cmake -S . -B build # This finds the installed object-dispatch-layer
> $ cmake --build build
> $ cmake --install build
> ----
>
> Is my assumption right?
>
>> Essentially, for a developer to use the object dispatch
>> layer, they will need to author a fair amount of custom
>> code which makes use of both MATLAB and C++ "building
>> blocks" from the "framework".
>
> I think that this is normal library usage. For example,
> our S3 filesystem module implementation in C++ has about
> 2500 lines and uses classes provided by AWS SDK C++:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/s3fs.cc
>
>> If we had to go through the IP Clearance Process, would
>> that mean we would need to repeatedly clear the code every
>> time we wanted to sync up the git submodule with the
>> latest source code from the external repository? It seems
>> like this would quickly become impractical since we
>> anticipate the need to iterate frequently on the object
>> dispatch layer early on.
>
> If the object dispatch layer doesn't depend on Apache Arrow
> and is a general purpose framework, we can vendor it without
> IP clearance.
> e.g.:
> https://github.com/apache/arrow/tree/master/cpp/src/arrow/vendored/xxhash
>
> BTW, why do you want to use "git submodule" to use the
> object dispatch layer? Why don't you install it separately
> or build it with externalproject_add() in CMake?
> https://cmake.org/cmake/help/latest/module/ExternalProject.html
>
> Thanks,
> --
> kou
>
> In <byapr05mb648755bee2e0caccb0f407d9ae...@byapr05mb6487.namprd05.prod.outlook.com>
>   "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects
>   using MEX" on Wed, 8 Jun 2022 15:59:13 +0000,
>   Kevin Gurney <kgur...@mathworks.com> wrote:
>
>> Hi Kou,
>>
>> ---
>>
>> Note: I am replying to your email as a forward from Fiona (Cc'd) since your
>> original email was accidentally blocked by my email client.
>>
>> ---
>>
>> The way that we expected the object dispatch layer to be used by client code
>> is as follows:
>>
>> 1. A developer would author a custom MEX function that uses C++ "building
>> blocks" (i.e. classes and header files) from the object dispatch layer
>> "framework". They would link their custom MEX function against a helper
>> shared library that is built from the source code of the object dispatch
>> layer and provides the symbols/implementation for the aforementioned C++
>> "building blocks".
>>
>> 2. The object dispatch layer expects the compiled MEX function to have a
>> specific name and be available on the MATLAB Search Path [1] so that it can
>> be used by the MATLAB side of the object dispatch layer.
>>
>> 3. Once the MEX function is available on the MATLAB Search Path, client
>> MATLAB code can use a set of MATLAB "building blocks" (i.e. classes), which
>> are part of the object dispatch layer "framework", to connect a MATLAB class
>> with a corresponding C++ class.
>>
>> Essentially, for a developer to use the object dispatch layer, they will
>> need to author a fair amount of custom code which makes use of both MATLAB
>> and C++ "building blocks" from the "framework".
>>
>> It's not clear to me whether the steps described above qualify as "library
>> usage" with regard to the IP Clearance Process.
>>
>> If we had to go through the IP Clearance Process, would that mean we would
>> need to repeatedly clear the code every time we wanted to sync up the git
>> submodule with the latest source code from the external repository? It seems
>> like this would quickly become impractical since we anticipate the need to
>> iterate frequently on the object dispatch layer early on.
>>
>> It's quite possible that I am not answering your questions completely, so
>> please let me know if anything is unclear. My apologies in advance for any
>> confusion.
>>
>> [1] https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html
>>
>> Best,
>>
>> Kevin Gurney
>>
>> ________________________________
>> From: Fiona La <fion...@mathworks.com>
>> Sent: Wednesday, June 8, 2022 11:24 AM
>> To: Kevin Gurney <kgur...@mathworks.com>
>> Subject: FW: [MATLAB] Integrating a framework for connecting MATLAB and C++
>> objects using MEX
>>
>> From: Sutou Kouhei <k...@clear-code.com>
>> Date: Tuesday, June 7, 2022 at 8:36 PM
>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>> Cc: Fiona La <fion...@mathworks.com>, Jeremy Hughes <jhug...@mathworks.com>,
>> Nick Haddad <nhad...@mathworks.com>
>> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++
>> objects using MEX
>>
>> Hi,
>>
>> Can we use the object dispatch layer as a library? Or should
>> we copy (or submodule) the object dispatch layer to
>> apache/arrow?
>>
>> If we can use the object dispatch layer as a library, we can
>> just use it as an external library like GoogleTest. We don't
>> need IP clearance. You can use any Apache License 2.0
>> compatible license for the object dispatch layer.
>>
>> Thanks,
>> --
>> kou
>>
>> In <mn2pr05mb6496cff053c60e54c133c93cae...@mn2pr05mb6496.namprd05.prod.outlook.com>
>>   "[MATLAB] Integrating a framework for connecting MATLAB and C++ objects
>>   using MEX" on Tue, 7 Jun 2022 18:10:43 +0000,
>>   Kevin Gurney <kgur...@mathworks.com> wrote:
>>
>>> Hi All,
>>>
>>> I am reaching out to seek guidance from the community regarding a code
>>> integration puzzle.
>>>
>>> The architecture that we are currently pursuing for the MATLAB interface to
>>> Arrow [1] involves dispatching to the Arrow C++ libraries using MEX (a
>>> MATLAB facility for calling C/C++ code [2]). A major challenge with this
>>> approach has been keeping Arrow C++ objects (e.g. arrow::Array) alive in
>>> memory for the appropriate amount of time and making it easy to interface
>>> with them from MATLAB.
>>>
>>> MATLAB has a recommended solution for this problem [3]. However, we've been
>>> pursuing a MEX-based solution due to the pervasiveness of MEX and its
>>> familiarity to MATLAB users. Our hope is that using MEX will make it easy
>>> for others to contribute to the MATLAB interface.
>>>
>>> To help maintain the connection between MATLAB objects and C++, we've been
>>> experimenting with a MEX-based object dispatch layer. The primary goal of
>>> this work is to unblock development of the MATLAB interface to Arrow.
>>> However, this object dispatch layer is non-trivial and ultimately unrelated
>>> to the Arrow project's core mission. Therefore, submitting this code to the
>>> Arrow project doesn't seem like the optimal code integration strategy.
>>>
>>> We’ve been considering the possibility of creating a new open-source
>>> repository under the MathWorks GitHub organization [4] to host the object
>>> dispatch layer (a side effect of this approach is that it may help
>>> encourage reuse of this infrastructure in future open-source MATLAB
>>> projects).
>>>
>>> However, this approach would come with notable tradeoffs:
>>>
>>> 1.
>>>    We would need to follow the ASF IP Clearance Process [5] to integrate
>>> this code into the Arrow project (it's possible we are mistaken about this).
>>>
>>> 2. It's not obvious how we should keep the code in sync. Would it be
>>> possible to use a git submodule [6] to "symlink" to the external repo?
>>>
>>> 3. What about licensing? Does the code need to be Apache licensed, or would
>>> it be possible to use another Apache-compatible license [7], like BSD? BSD
>>> is the default choice for new projects hosted under the MathWorks GitHub
>>> organization.
>>>
>>> Admittedly, we aren't sure what the best path forward is, so we appreciate
>>> the community's guidance. We welcome any suggestions.
>>>
>>> [1] https://github.com/apache/arrow/tree/master/matlab
>>> [2] https://www.mathworks.com/help/matlab/call-mex-files-1.html
>>> [3] https://www.mathworks.com/help/matlab/build-matlab-interface-to-c-library.html
>>> [4] https://github.com/mathworks
>>> [5] https://incubator.apache.org/ip-clearance/
>>> [6] https://github.blog/2016-02-01-working-with-submodules/
>>> [7] https://www.apache.org/legal/resolved.html#category-a
>>>
>>> Thank you,
>>>
>>> Kevin Gurney
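
For readers new to the problem described at the start of this thread (keeping C++ objects such as arrow::Array alive for as long as a MATLAB object refers to them), here is a minimal C++ sketch of one common way to do it: a registry that owns the C++ objects and hands out integer IDs. This is an illustration only and not the actual object dispatch layer, whose design is not shown in this thread. Every name in it (Proxy, ProxyRegistry, LengthProxy, invoke) is hypothetical, and it deliberately uses no MEX or Arrow headers so that it compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical base class for any C++ object exposed to MATLAB.
class Proxy {
public:
    virtual ~Proxy() = default;
    // Dispatch a method call by name. A real layer would pass MATLAB
    // arrays in and out; a double keeps this sketch self-contained.
    virtual double invoke(const std::string& method) = 0;
};

// Hypothetical registry: owns the proxies and hands out integer IDs.
// An object stays alive until its ID is explicitly removed.
class ProxyRegistry {
public:
    std::uint64_t add(std::shared_ptr<Proxy> proxy) {
        const std::uint64_t id = next_id_++;
        proxies_.emplace(id, std::move(proxy));
        return id;
    }

    double invoke(std::uint64_t id, const std::string& method) {
        auto it = proxies_.find(id);
        if (it == proxies_.end()) {
            throw std::out_of_range("unknown proxy ID");
        }
        return it->second->invoke(method);
    }

    // Returns true if an object with this ID existed and was released.
    bool remove(std::uint64_t id) { return proxies_.erase(id) > 0; }

    std::size_t size() const { return proxies_.size(); }

private:
    std::uint64_t next_id_ = 1;
    std::unordered_map<std::uint64_t, std::shared_ptr<Proxy>> proxies_;
};

// Hypothetical proxy standing in for a wrapped object such as an array.
class LengthProxy : public Proxy {
public:
    explicit LengthProxy(std::size_t length) : length_(length) {}

    double invoke(const std::string& method) override {
        if (method == "length") {
            return static_cast<double>(length_);
        }
        throw std::invalid_argument("unknown method: " + method);
    }

private:
    std::size_t length_;
};
```

In a design along these lines, the MATLAB class would store only the ID, pass the ID and a method name to the MEX function on each call, and ask the registry to remove the ID from its destructor. Whether the object dispatch layer works this way is an assumption on my part.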