I agree with Ivan here. And more generally, R is a fully featured programming language. You don't need just this one "exploit" (though, it really does feel like a feature to some degree lol!) to be a bad guy with R.
You can link to a pre-compiled binary (like my team makes for an R package that contains proprietary code https://github.com/R-ArcGIS/r-bridge/tree/master/libs/x64) and call fully compiled functions that have bad side effects. You can initialize a logger in `.onLoad()` or have a function that quietly sends your data to someone using httr while doing something actually useful.

There are also fairly widely used R packages that exist only on GitHub/Lab, r-universe, or elsewhere. You'd be taking on a Sisyphean task trying to root out all the evil code from the R world. There's also likely little to none of it (shouts out to CRAN maintainers for being really good at what they do even if it does grind my gears sometimes 😬💞).

On Fri, May 3, 2024 at 4:57 PM Ivan Krylov via R-package-devel <r-package-devel@r-project.org> wrote:

> On Fri, 3 May 2024 18:17:52 +0200
> Maciej Nasinski <nasinski.mac...@gmail.com> wrote:
>
> > I found the https://github.com/hrbrmstr/rdaradar solution and ran it
> > on the 100 most downloaded R packages.
> > Happily, all data/inst rda files are safe/non-exposed to RDS exploit
> > (using the linked solution).
>
> This is a bit useful - knowing that there are no obvious exploits in
> the 100 most downloaded CRAN packages is better than not knowing that -
> but it is important to keep the big picture in mind. Bob himself said
> that the script is "super basic". Currently, it only checks whether an
> *.rda file, when loaded in the global environment, would shadow certain
> important functions. This is not an attack a package author would
> perform; this is something one would send directly to the victim.
>
> In order to defeat an attacker, you must think like an attacker.
>
> Here's someone jokingly describing how they would trojan the world's
> online shop checkout systems if they wanted to commit financial crimes:
> https://archive.ph/FCdBu
> (With kindness and pull requests.)
>
> Here's someone spending two years to plant a fake maintainer with a
> backdoor in a key free software project:
> https://lwn.net/Articles/967192/
> (The backdoor was assembled from obfuscated "test files for the
> decompressor".)
>
> Here's the 2015 Underhanded C Contest, where people competed in writing
> the most harmless-looking code that would instead do something
> nefarious: http://www.underhanded-c.org/
>
> On the one hand, hiding the bad functions in a data file (which is
> compressed and binary) instead of the R files (which are plain text and
> indexed everywhere) would be the obvious first step, so it may be
> useful to flag data files with functions in them for human review.
>
> On the other hand, an evil package author has so many tools at their
> disposal that they may not need this one in particular. There are CRAN
> packages with tens of megabytes of compiled code inside. Sneaking a
> little extra something in a file starting with "// This is generated
> grammar parser. Do not edit!" followed by an impenetrable wall of C
> could be easier and stay undetected for longer. How many packages use
> Java? You don't even have to ship the Java source together with an R
> package, so one of your *.jars could have a poisoned dependency with
> nobody being the wiser.
>
> Attackers are very cunning, and we don't even know what exactly we are
> looking for. We can automate some of it, but the kind of code review
> that will spot an evil function tucked 50 layers inside a giant
> auxiliary data object is a lot of effort, hours to days per package.
>
> > It will be great to run it on all CRAN packages, but I imagine we
> > should be sure that the check is decent enough to not overload the
> > servers without a need.
>
> This probably counts as creating an unofficial CRAN mirror:
> https://cran.r-project.org/mirror-howto.html
>
> (I remember someone sending too many requests to download packages one
> by one and losing access from a university address to CRAN as a result.)
>
> You'll need 12.7 Gb for the current versions of the packages or >400 Gb
> for the whole archive.
>
> --
> Best regards,
> Ivan
>
> ______________________________________________
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
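
For what it's worth, Ivan's suggestion above to "flag data files with functions in them for human review" could be sketched roughly as below. This is only an illustration, not what rdaradar does: the helper name `find_functions_in_rda()` and the example objects are made up for this message. It only walks nested lists, so functions hidden in environments, attributes, or compiled code would slip past it. Note that `load()` itself deserializes the file, so this belongs in a throwaway R session.

```r
# Hypothetical helper (not part of rdaradar): report where functions sit
# inside the objects stored in an .rda file, including nested lists.
find_functions_in_rda <- function(path) {
  env <- new.env(parent = emptyenv())
  load(path, envir = env)  # restores the objects into env; nothing is called
  hits <- character()

  walk <- function(x, where, depth = 0L) {
    if (is.function(x)) {
      hits <<- c(hits, where)
    } else if (is.list(x) && depth < 100L) {  # arbitrary depth cap
      nms <- names(x)
      for (i in seq_along(x)) {
        label <- if (!is.null(nms) && nzchar(nms[i])) {
          nms[i]
        } else {
          paste0("[[", i, "]]")
        }
        walk(x[[i]], paste0(where, "$", label), depth + 1L)
      }
    }
  }

  for (nm in ls(env, all.names = TRUE)) {
    walk(get(nm, envir = env), nm)
  }
  hits
}

# Usage with made-up objects: one plain data frame, one list hiding a function.
tmp <- tempfile(fileext = ".rda")
clean <- data.frame(x = 1:3)
sneaky <- list(meta = list(note = "hi", hook = function() message("surprise")))
save(clean, sneaky, file = tmp)

find_functions_in_rda(tmp)  # should flag "sneaky$meta$hook"
```

A hit here only means "a human should look at this"; plenty of legitimate packages store model objects that contain functions.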