Frankly, I also love having pure Java types in the Java API and pure Scala
types in the Scala API. :-)
If we don't treat Java as a "FRIEND" of Scala, but rather like Python, maybe
we can adopt option 1, the specific Java classes. (But I don't like the
`Java` prefix, which is redundant when I'm coding in Java.)
The con is much more than just the extra effort of maintaining a parallel
API. It puts the burden on all libraries and library developers to maintain
a parallel API as well. That’s one of the primary reasons we moved away from
the RDD vs JavaRDD approach in the old RDD API.
Spark has aimed to have a unified API set rather than separate Java classes,
to reduce the maintenance cost, e.g. JavaRDD <> RDD vs. DataFrame. The
JavaXXX classes are mostly legacy.
I think it's best to stick to approach 4 in general cases. Other options
might have to be considered.
I have 3 external functions. I want to apply an external function to
streaming data; I use a sliding window to get the streaming data, but I
cannot apply the external function to it. I write my code in Python.
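
For reference, a minimal sketch of the sliding-window part, assuming Spark
Structured Streaming with the built-in rate source for testing; the column
names come from that source, and the window sizes are placeholders:

from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, window

spark = SparkSession.builder.appName("sliding-window").getOrCreate()

# Test stream; the rate source emits (timestamp, value) rows.
stream = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# A 30-second window sliding every 10 seconds.
windowed = (stream
    .groupBy(window(stream.timestamp, "30 seconds", "10 seconds"))
    .agg(avg("value").alias("avg_value")))

query = windowed.writeStream.outputMode("update").format("console").start()
query.awaitTermination()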
Can anyone help me, please?
I have an external function:
from pandas import Series
from sklearn.preprocessing import MinMaxScaler

# Scale values to the [-1, 1] range.
scaler1 = MinMaxScaler(feature_range=(-1, 1))

def difference(dataset, interval=1):
    # Difference each value against the one `interval` steps earlier.
    diff = list()
    for i in range(interval, len(dataset)):
        value = dataset[i] - dataset[i - interval]
        diff.append(value)
    return Series(diff)
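
One way to apply a plain-Python function such as difference() to a stream is
foreachBatch, which hands every micro-batch to arbitrary Python code as a
regular DataFrame. A minimal sketch, assuming Spark 2.4+ Structured Streaming
and the built-in rate source for testing; collecting each batch to the driver
with toPandas() is only reasonable for small batches:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("apply-external-fn").getOrCreate()

# Test stream producing (timestamp, value) rows.
stream = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

def apply_difference(batch_df, batch_id):
    # Each micro-batch arrives as a normal DataFrame, so any plain
    # Python function can be called on its collected values.
    values = batch_df.toPandas()["value"].tolist()
    print(batch_id, difference(values).tolist())

query = stream.writeStream.foreachBatch(apply_difference).start()
query.awaitTermination()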