Hi,
Base R 4.2.0 introduced a change (see "[Rd] R 4.2.0 is released"): "Calling
if() or while() with a condition of length greater than one gives an error
rather than a warning."
The code below is a reproducible example of the issue. Run in R >= 4.2.0, it
raises an error; in earlier versions it only produces a warning. Sys.time()
returns an object with two classes ("POSIXct" and "POSIXt"), and throughout the
SparkR repository the 'if' statement is used as 'if(class(x) == "Column")'. For
such an object the condition has length greater than one, which causes the
error in the latest R version. Note that R allows an object to have multiple
'class' names as a character vector (R: Object Classes); hence this type of
check was not a good idea in the first place.
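    # Assumes SparkR is installed; starts a Spark session if none is active.
    SparkR::sparkR.session()
    t <- Sys.time()
    sdf <- SparkR::createDataFrame(data.frame(xx = t + c(-1, 1, -1, 1, -1)))
    SparkR::collect(SparkR::filter(sdf, SparkR::column("xx") > t))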
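The failing condition can also be seen without Spark. The following is a
minimal base R sketch, assuming only that the check has the form
'class(x) == "Column"' as described above; the variable names are illustrative.

```r
# Sys.time() returns an object with two classes.
t <- Sys.time()
cls <- class(t)          # c("POSIXct", "POSIXt")

# Comparing a length-2 character vector gives a length-2 logical vector.
cond <- cls == "Column"

# Using that vector directly as an if() condition is an error in
# R >= 4.2.0 and a warning in earlier versions; tryCatch records which.
outcome <- tryCatch(
  {
    if (cond) "is a Column" else "not a Column"
  },
  error = function(e) "error",
  warning = function(w) "warning"
)

print(length(cond))
print(outcome)
```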
The suggested change is to wrap the comparison in the 'all' function when
checking whether class(.) is Column: 'if(all(class(x) == "Column"))'.
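A minimal sketch of the suggested check in base R follows. The "Column" object
here is a hypothetical stand-in built with structure(), not a real SparkR
Column, and is_column_all is an illustrative helper name. inherits() is shown
only as an alternative that base R provides for class checks, not as something
the SparkR code is known to use.

```r
# Hypothetical stand-in with the single class "Column" (not a SparkR object).
fake_col <- structure(list(), class = "Column")
t <- Sys.time()

# The suggested check: collapse the comparison to a single logical value.
is_column_all <- function(x) all(class(x) == "Column")

r1 <- is_column_all(fake_col)  # single class "Column"
r2 <- is_column_all(t)         # classes "POSIXct" and "POSIXt"

# if() now always receives a condition of length one.
label <- if (is_column_all(t)) "Column" else "other"

# Alternative from base R: test class membership directly.
r3 <- inherits(fake_col, "Column")
r4 <- inherits(t, "Column")

print(c(r1, r2, r3, r4))
print(label)
```

Note that all() returns FALSE for an object whose class vector contains
"Column" alongside other names, whereas inherits() returns TRUE in that case.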
The process for creating an issue in JIRA is not clear to me, so I am mailing
this to the 'user' list.
Vivek