Thanks Shane .. the URL I linked somehow didn't work in other people's
browsers. Hope this link works:
https://issues.apache.org/jira/browse/SPARK-23492?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion
I'm trying to re-read, however I'm getting cached data (which is a bit
confusing). For the re-read I'm issuing:
spark.read.format("delta").load("/data").groupBy(col("event_hour")).count
The cache seems to be global, influencing new dataframes as well.
So the question is: how should I re-read without losing the cache?
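One way out (a sketch, assuming Spark 2.2+, where Catalog.refreshByPath is available): refreshByPath invalidates only the cached entries derived from the given path, so the rest of the cache survives; spark.catalog.clearCache() is the blunt alternative that drops everything.

// Hedged sketch, assuming Spark 2.2+ and the same "/data" path as above.
import org.apache.spark.sql.functions.col

// Invalidate only cache entries that depend on this path, then re-read.
spark.catalog.refreshByPath("/data")
val fresh = spark.read.format("delta").load("/data")
fresh.groupBy(col("event_hour")).count().show()

// Blunt alternative: drops every cached dataset, not just this one.
// spark.catalog.clearCache()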
Greetings! I am looking into the possibility of JRuby support for Spark, and
could use some pointers (references?) to orient myself a bit better within the
codebase.
JRuby fat jars load just fine in Spark, but where things start to get
predictably dicey is with object serialization for RDDs.
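For orientation, the serialization seam in Spark's own codebase is the pluggable org.apache.spark.serializer.Serializer (JavaSerializer by default, KryoSerializer as the common alternative); PySpark sidesteps it by shipping opaque byte arrays through org.apache.spark.api.python.PythonRDD. A minimal sketch of the configuration knob, as a starting point rather than a JRuby recipe (the app name and local master are placeholders):

// Hedged sketch: picking Spark's serializer via SparkConf. A JRuby
// integration would either need Ruby objects to survive this layer
// or, like PySpark, keep them as byte arrays on the JVM side.
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .setMaster("local[*]")        // placeholder for illustration
  .setAppName("jruby-probe")    // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
val session = SparkSession.builder.config(conf).getOrCreate()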
I will add one more condition on "updated". So, it will additionally avoid
closing things that were updated within the past year but are left open
against EOL releases.
project = SPARK
AND status in (Open, "In Progress", Reopened)
AND (
affectedVersion = EMPTY OR
NOT (affectedVersion in versionMatch("^3.*")
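A possible completed form of that query, assuming the one-year cutoff is written with JIRA's relative-date syntax; the closing parenthesis and the updated clause are my reconstruction, not the original text:

project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*"))
  )
  AND updated <= -52w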
I'd only tweak this to perhaps not close JIRAs that have been updated
recently -- even just avoiding things updated in the last month. For
example, this would close
https://issues.apache.org/jira/browse/SPARK-27758, which
was opened on Friday (though, for other reasons, it should probably be closed).
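If that guard were adopted, one way to express it in JQL (taking four weeks as a stand-in for "the last month") would be a clause like the following, which restricts the sweep to issues untouched for at least a month:

AND updated <= -4w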