On 11/3/19 1:05 PM, Viktor Gal wrote:
ah yeah i forgot to mention. it’s the same, i.e. it’s not the byte code 
compilation that causes this behaviour but the preparation for lazy loading.

R is not optimized for these cases (generated code, source file with >100,000 lines of code), but R has bindings for a large number of external libraries - it should be possible to make the bindings several orders of magnitude smaller and then they'd likely work well.

Making R work well on files like shogun.R would probably require a large amount of non-trivial work on R internals. I would be surprised if it were just, say, a memory leak we could fix to solve the issue quickly; it may well be that some data structures and algorithms simply won't scale to this extent. If you want to find out, you can debug using the usual R means (the R profiler, see ?Rprof; running the script using ?source, perhaps with source references disabled), but to interpret the results you may have to go deep into the implementation of R and, in this case, of S4.
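
A minimal sketch of that profiling approach, assuming a small stand-in script instead of the real shogun.R (the Demo class names and the temporary file paths are made up for illustration; point `script` at the generated file to profile the real thing):

```r
# Stand-in for the generated file: a few hundred S4 class definitions.
# Hypothetical content; replace `script` with the path to shogun.R.
script <- tempfile(fileext = ".R")
writeLines(
  sprintf('setClass("Demo%d", representation(ref = "character"))', 1:500),
  script
)

prof <- tempfile(fileext = ".out")

Rprof(prof)                          # start the sampling profiler
source(script, keep.source = FALSE)  # source without keeping source references
Rprof(NULL)                          # stop profiling

# Functions ranked by total time, including time spent in their callees
res <- summaryRprof(prof)
print(head(res$by.total, 20))
```

On a file dominated by setClass/setMethod calls, the top of `by.total` would be expected to show functions from the methods package. ?Rprof also documents a memory.profiling argument, which may be more relevant here since the symptom is memory use rather than time.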

Preparation for lazy loading starts by sourcing the file; you can find the details in the source code of the installation tools and in the documentation. I tried quickly and saw a lot of time spent in S4, which is not surprising, as the generated file stresses S4 well beyond what is normally the case with R. But I would not be surprised if there were other bottlenecks to be seen later, and even if you managed to prepare the package for lazy loading, there would probably be significant overheads at runtime. Still, you could experiment with modifying the code generator to avoid the bottlenecks you identify.

If your primary goal is to create R bindings for an external library, I'd recommend having a look at how other packages do it to see what is scalable (there should be a way to make the code way smaller, and easily written by hand in most cases, even though some interfaces are generated, too).

Best
Tomas
cheers,
viktor

On 3 Nov 2019, at 06:53, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:

What happens if you disable byte code compilation?

Best,
Uwe Ligges

On 02.11.2019 19:37, Viktor Gal wrote:
Hi Dirk,
no worries, thnx for the feedback!
cheers,
viktor

On 2 Nov 2019, at 13:58, Viktor Gal <wik...@maeth.com> wrote:

Hi Dirk,

so the project is open source, you can reproduce the error yourself (but note 
it’ll take a long time to actually compile it). steps for reproducing:
git clone https://github.com/shogun-toolbox/shogun.git
cd shogun
git checkout feature/shared_ptr
mkdir build
cd build
cmake -DINTERFACE_R=ON ..
make
make install

(it requires tons of dependencies… if you have docker you can run
docker pull shogun/shogun-dev and run things inside the container)

the make install part runs the R CMD INSTALL so that’ll cause the problem.

but i’ve just uploaded the generated R code that causes the problem here. note
the script is 7 MB, i.e. 175k LoC, so you better wget/curl it:
http://maeth.com/shogun.R

cheers,
viktor

On 2 Nov 2019, at 13:52, Dirk Eddelbuettel <e...@debian.org> wrote:


Hi Viktor,

On 2 November 2019 at 13:09, Viktor Gal wrote:
| I’m developing an ML library that has R bindings… when installing the library
| with R CMD INSTALL the R process is running out of memory (50G+ ram) when doing:
| ** byte-compile and prepare package for lazy loading
|
| any ideas how i could debug this part of code, to figure out what is actually
| happening and why is there a memory leak?

Easiest for us to help if we can see code -- so if you have a public repo
somewhere please share the link.

I suspect you have some sort of recursion or circular dependency
somewhere. It would be very hard for R to use up 50 GB otherwise. But we
cannot say more without seeing the code.

So maybe triage. In a situation like this when a (supposedly complete)
package draft of mine fails "top-down" I often re-validate the toolchain
"bottom-up" with a minimal package. If that works, keep adding pieces step by
step from the 'not-working large package' to the 'small working' package
while continuously ensuring that it still builds.

Hope this helps, Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel