Hi Kou,

Thank you for helping to clear up our confusion.

> How do we install the object dispatch layer to use it in
> apache/arrow? I assumed that something like the following:
>
> ----
> $ git clone https://github.com/mathworks/object-dispatch-layer.git
> $ cd object-dispatch-layer
> $ cmake -S . -B build ...
> $ cmake --build build
> $ cmake --install build
> $ git clone https://github.com/apache/arrow.git
> $ cd apache/matlab
> $ cmake -S . -B build # This find installed object-dispatch-layer
> $ cmake --build build
> $ cmake --install build
> ----
>
> My assumption is right?

Your understanding is correct. Thanks for checking.

> BTW, why do you want to use "git submodule" to use the
> object dispatch layer? Why don't you install it separately
> or build by externalproject_add() in CMake?
> https://cmake.org/cmake/help/latest/module/ExternalProject.html

After reflecting on your response, we realize that using a git submodule seems like a less than ideal solution.

Initially, we were thinking that if code within the apache/arrow repository were to "depend" on some MATLAB files from the object dispatch layer, we would need to integrate this code into the apache/arrow source tree either "physically" (via vendoring or IP Clearance) or "virtually" (via git submodule). However, since these files are only needed at build time and run time, the object dispatch layer code does not necessarily need to be redistributable along with the rest of the code in the apache/arrow repository.

It seems much clearer now that the object dispatch layer can be treated as a "pure" external "library" dependency, and thus the code should not need to be present in the apache/arrow repository. Per your suggestion, at build / install time, it should be possible to copy any required MATLAB files or C++ header files to appropriate locations, so that they can be used by CMake and MATLAB.

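
As a rough illustration of the externalproject_add() approach, here is a minimal CMake sketch. It rests on several assumptions: the repository URL is the placeholder from the example earlier in this thread (the repository is not published yet), and the target name, the "main" tag, and the install prefix are hypothetical names chosen for illustration. It also assumes the object dispatch layer itself builds and installs with CMake.

```cmake
cmake_minimum_required(VERSION 3.16)
project(arrow_matlab_odl_sketch LANGUAGES NONE)

include(ExternalProject)

# Assumption: install the dependency into a private prefix inside the build tree.
set(ODL_INSTALL_DIR "${CMAKE_BINARY_DIR}/object_dispatch_layer_install")

ExternalProject_Add(
  object_dispatch_layer   # hypothetical target name
  # Placeholder URL from the example in this thread; not yet published.
  GIT_REPOSITORY https://github.com/mathworks/object-dispatch-layer.git
  # Assumption: in practice, pin a release tag or commit instead of a branch.
  GIT_TAG main
  # Assumes the upstream project is CMake-based and honors the install prefix.
  CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${ODL_INSTALL_DIR}
)

# A MEX target defined elsewhere could then wait for the dependency and pick up
# its headers (target name and include layout are hypothetical):
#   add_dependencies(arrow_matlab_mex object_dispatch_layer)
#   target_include_directories(arrow_matlab_mex PRIVATE "${ODL_INSTALL_DIR}/include")
```

Note that ExternalProject_Add() performs its download, build, and install steps at build time rather than at configure time, so anything that consumes the installed files needs an explicit dependency on the external project target.
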
Using externalproject_add() is a great idea and seems more "in-model" than repeatedly bumping the version of a vendored copy of the object dispatch layer source or using a git submodule.

To summarize, it sounds like a reasonable path forward would be to:

1. Develop the object dispatch layer in an external repository underneath the MathWorks GitHub organization, with a 2-Clause BSD license.
2. Use externalproject_add() to fetch and build the source code dynamically.

Once the object dispatch layer is available on GitHub, I will follow up on this email thread with a link to the repository so that anyone in the community can track development progress, as well as contribute to the framework, if they are interested.

If anyone has any objections to this approach, please let us know.

Thank you!

Kevin Gurney

________________________________
From: Sutou Kouhei <k...@clear-code.com>
Sent: Friday, June 10, 2022 4:13 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Cc: Kevin Gurney <kgur...@mathworks.com>; Fiona La <fion...@mathworks.com>; Jeremy Hughes <jhug...@mathworks.com>; Nick Haddad <nhad...@mathworks.com>
Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX

Hi,

> 1. A developer would author a custom MEX function that
> uses C++ "building blocks" (i.e. classes and header files)
> from the object dispatch layer "framework". They would
> link their custom MEX function against a helper shared
> library that is built from the source code of the object
> dispatch layer and provides the symbols/implementation for
> the aforementioned C++ "building blocks".

How do we install the object dispatch layer to use it in
apache/arrow? I assumed that something like the following:

----
$ git clone https://github.com/mathworks/object-dispatch-layer.git
$ cd object-dispatch-layer
$ cmake -S . -B build ...
$ cmake --build build
$ cmake --install build
$ git clone https://github.com/apache/arrow.git
$ cd apache/matlab
$ cmake -S . -B build # This find installed object-dispatch-layer
$ cmake --build build
$ cmake --install build
----

My assumption is right?

> Essentially, for a developer to use the object dispatch
> layer, they will need to author a fair amount of custom
> code which makes use of both MATLAB and C++ "building
> blocks" from the "framework".

I think that this is a normal library usage. For example,
our S3 filesystem module implementation in C++ has about
2500 lines and uses classes provided by AWS SDK C++:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/s3fs.cc

> If we had to go through the IP Clearance Process, would
> that mean we would need to repeatedly clear the code every
> time we wanted to sync up the git submodule with the
> latest source code from the external repository? It seems
> like this would quickly become impractical since we
> anticipate the need to iterate frequently on the object
> dispatch layer early on.

If the object dispatch layer doesn't depend on Apache Arrow
and is a general purpose framework, we can vendor it without
IP clearance. e.g.:

https://github.com/apache/arrow/tree/master/cpp/src/arrow/vendored/xxhash

BTW, why do you want to use "git submodule" to use the
object dispatch layer? Why don't you install it separately
or build by externalproject_add() in CMake?
https://cmake.org/cmake/help/latest/module/ExternalProject.html

Thanks,
--
kou

In <byapr05mb648755bee2e0caccb0f407d9ae...@byapr05mb6487.namprd05.prod.outlook.com>
  "Re: [MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX" on Wed, 8 Jun 2022 15:59:13 +0000,
  Kevin Gurney <kgur...@mathworks.com> wrote:

> Hi Kou,
>
> ---
>
> Note: I am replying to your email as a forward from Fiona (Cc'd) since your
> original email was accidentally blocked by my email client).
>
> ---
>
> The way that we expected the object dispatch layer to be used by client code
> is as follows:
>
> 1. A developer would author a custom MEX function that uses C++ "building
> blocks" (i.e. classes and header files) from the object dispatch layer
> "framework". They would link their custom MEX function against a helper
> shared library that is built from the source code of the object dispatch
> layer and provides the symbols/implementation for the aforementioned C++
> "building blocks".
>
> 2. The object dispatch layer expects the compiled MEX function to have a
> specific name and be available on the MATLAB Search Path [1] so that it can
> be used by the MATLAB side of the object dispatch layer.
>
> 3. Once the MEX function is available on the MATLAB Search Path, client
> MATLAB code can use a set of MATLAB "building blocks" (i.e. classes), which
> are part of the object dispatch layer "framework", to connect a MATLAB class
> with a corresponding C++ class.
>
> Essentially, for a developer to use the object dispatch layer, they will need
> to author a fair amount of custom code which makes use of both MATLAB and C++
> "building blocks" from the "framework".
>
> It's not clear to me whether the steps described above classify as "library
> usage" with regard to the IP Clearance Process.
>
> If we had to go through the IP Clearance Process, would that mean we would
> need to repeatedly clear the code every time we wanted to sync up the git
> submodule with the latest source code from the external repository? It seems
> like this would quickly become impractical since we anticipate the need to
> iterate frequently on the object dispatch layer early on.
>
> It's quite possible that I am not answering your questions completely, so
> please let me know if anything is unclear. My apologies in advance for any
> confusion.
>
> [1] https://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html
>
> Best,
>
> Kevin Gurney
>
> ________________________________
> From: Fiona La <fion...@mathworks.com>
> Sent: Wednesday, June 8, 2022 11:24 AM
> To: Kevin Gurney <kgur...@mathworks.com>
> Subject: FW: [MATLAB] Integrating a framework for connecting MATLAB and C++
> objects using MEX
>
> From: Sutou Kouhei <k...@clear-code.com>
> Date: Tuesday, June 7, 2022 at 8:36 PM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Cc: Fiona La <fion...@mathworks.com>, Jeremy Hughes <jhug...@mathworks.com>,
> Nick Haddad <nhad...@mathworks.com>
> Subject: Re: [MATLAB] Integrating a framework for connecting MATLAB and C++
> objects using MEX
>
> Hi,
>
> Can we use the object dispatch layer as a library? Or should
> we copy (or submodule) the object dispatch layer to
> apache/arrow?
>
> If we can use the object dispatch layer as a library, we can
> just use it as an external library like GoogleTest. We don't
> need IP clearance. You can use any Apache License 2.0
> compatible license for the object dispatch layer.
>
> Thanks,
> --
> kou
>
> In <mn2pr05mb6496cff053c60e54c133c93cae...@mn2pr05mb6496.namprd05.prod.outlook.com>
>   "[MATLAB] Integrating a framework for connecting MATLAB and C++ objects using MEX" on Tue, 7 Jun 2022 18:10:43 +0000,
>   Kevin Gurney <kgur...@mathworks.com> wrote:
>
>> Hi All,
>>
>> I am reaching out to seek guidance from the community regarding a code
>> integration puzzle.
>>
>> The architecture that we are currently pursuing for the MATLAB interface to
>> Arrow [1] involves dispatching to the Arrow C++ libraries using MEX (a
>> MATLAB facility for calling C/C++ code [2]). A major challenge with this
>> approach has been keeping Arrow C++ objects (e.g. arrow::Array) alive in
>> memory for the appropriate amount of time and making it easy to interface
>> with them from MATLAB.
>>
>> MATLAB has a recommended solution for this problem [3]. However, we've been
>> pursuing a MEX-based solution due to the pervasiveness of MEX and its
>> familiarity to MATLAB users. Our hope is that using MEX will make it easy
>> for others to contribute to the MATLAB interface.
>>
>> To help maintain the connection between MATLAB objects and C++, we've been
>> experimenting with a MEX-based object dispatch layer. The primary goal of
>> this work is to unblock development of the MATLAB interface to Arrow.
>> However, this object dispatch layer is non-trivial and ultimately unrelated
>> to the Arrow project's core mission. Therefore, submitting this code to the
>> Arrow project doesn't seem like the optimal code integration strategy.
>>
>> We’ve been considering the possibility of creating a new open-source
>> repository under the MathWorks GitHub organization [4] to host the object
>> dispatch layer (a side effect of this approach is that it may help encourage
>> reuse of this infrastructure in future open-source MATLAB projects).
>>
>> However, this approach would come with notable tradeoffs:
>>
>> 1. We would need to follow the ASF IP Clearance Process [5] to integrate
>> this code into the Arrow project (it's possible we are mistaken about this).
>>
>> 2. It's not obvious how we should keep the code in sync. Would it be
>> possible to use a git submodule [6] to "symlink" to the external repo?
>>
>> 3. What about licensing? Does the code need to be Apache licensed, or would
>> it be possible to use another Apache-compatible license [7], like BSD? BSD
>> is the default choice for new projects hosted under the MathWorks GitHub
>> organization.
>>
>> Admittedly, we aren't sure what the best path forward is, so we appreciate
>> the community's guidance. We welcome any suggestions.
>>
>> [1] https://github.com/apache/arrow/tree/master/matlab
>> [2] https://www.mathworks.com/help/matlab/call-mex-files-1.html
>> [3] https://www.mathworks.com/help/matlab/build-matlab-interface-to-c-library.html
>> [4] https://github.com/mathworks
>> [5] https://incubator.apache.org/ip-clearance/
>> [6] https://github.blog/2016-02-01-working-with-submodules/
>> [7] https://www.apache.org/legal/resolved.html#category-a
>>
>> Thank you,
>>
>> Kevin Gurney