Dear All,
I have the following question:
I am using Spark SQL 2.0 and, in particular, I am doing some joins in a
pipeline of the following pattern (d3 = d1 join d2, d4 = d5 join d6, d7 = d3 join
d4).
When running my code, I realised that the building of d7 generates an issue, as
reported below.
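To make the pattern concrete, here is a minimal Java sketch of such a pipeline (the join key "id" and the dataset variables are hypothetical placeholders, not the actual code from this thread):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical three-stage join pipeline: d3 and d4 feed into d7.
Dataset<Row> d3 = d1.join(d2, d1.col("id").equalTo(d2.col("id")));
Dataset<Row> d4 = d5.join(d6, d5.col("id").equalTo(d6.col("id")));
Dataset<Row> d7 = d3.join(d4, d3.col("id").equalTo(d4.col("id")));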
Hi All,
I am using Spark 2.0 and I have the following issue:
I am able to run steps 1-5 (see below) but not step 6, which uses a UDF.
Steps 1-5 take a few seconds, whereas step 6 looks like it never
ends.
Is there anything wrong? How should I address it?
Any suggestion would be appreciated.
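For reference, a minimal sketch of defining and applying a UDF in Spark 2.0's Java API (the UDF name, logic, and column names are hypothetical, not the step 6 from this thread):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.DataTypes;
import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

// Register a hypothetical string-length UDF, then apply it to a column.
spark.udf().register("strLen",
    (String s) -> s == null ? 0 : s.length(), DataTypes.IntegerType);
Dataset<Row> withLen = df.withColumn("nameLen", callUDF("strLen", col("name")));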
Hi All,
I am running Spark locally, and when running d3 = join(d1, d2) and d5 = join(d3, d4) I am
getting the following exception: "org.apache.spark.SparkException: Exception
thrown in awaitResult".
Googling for it, I found that the closest answer is the one reported at
https://issues.apache.org/jira/browse/SPARK
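For later readers: this exception during local joins is often (an assumption here, not confirmed by this thread) a broadcast-join timeout. Two commonly suggested mitigations are raising the timeout or disabling automatic broadcast joins:

// Give broadcast exchanges more time (in seconds), or set the threshold
// to -1 so Spark falls back to a sort-merge join instead of broadcasting.
spark.conf().set("spark.sql.broadcastTimeout", "600");
spark.conf().set("spark.sql.autoBroadcastJoinThreshold", "-1");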
Hi Rui,
Thanks for the prompt reply.
No, I am not using Mesos.
OK. I am writing code to build a suitable dataset for my needs, as in the
following:
== Session configuration:
SparkSession spark = SparkSession
.builder()
.master("local[6]")
…
Do you have any suggestion/recommendation?
Many thanks.
Carlo
On 28 Jul 2016, at 11:06, carlo allocca <ca6...@open.ac.uk> wrote:
Hi Rui,
Thanks for the prompt reply.
No, I am not using Mesos.
OK. I am writing code to build a suitable dataset for my needs, as in the
following:
== Session configuration:
SparkSession spark = SparkSession…
1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433
which both explain why it happens but say nothing about how to solve it.
Do you have any suggestion/recommendation?
Many thanks.
Carlo
On 28 Jul 2016, at 11:06, carlo allocca <ca6...@open.ac.uk> wrote:
Hi Rui,
Thanks for the prompt reply.
1:14, Carlo.Allocca <carlo.allo...@open.ac.uk> wrote:
I have also found the following two related links:
1)
https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433
which both explain why it happens but say nothing about how to solve it.
Solved!!
The solution is to use date_format with the "u" option.
Thank you very much.
Best,
Carlo
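For anyone hitting the same question later, a minimal sketch of the fix in the Java API ("u" is the day-of-week number pattern; the column names here are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.date_format;

// "u" renders a date/timestamp as the day-of-week number (1 = Monday ... 7 = Sunday).
Dataset<Row> withDow = df.withColumn("dayOfWeek", date_format(col("eventTime"), "u"));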
On 28 Jul 2016, at 18:59, carlo allocca <ca6...@open.ac.uk> wrote:
Hi Mark,
Thanks for the suggestion.
I changed the Maven entries as follows:
spark-core_2.10
2.0.0
and
Hi All,
I am trying to convert a Dataset<Row> into a JavaRDD in order to
apply a linear regression.
I am using spark-core_2.10, version 2.0.0, with Java 1.8.
My current approach is:
== Step 1: convert the Dataset into JavaRDD
JavaRDD<Row> dataPoints = modelDS.toJavaRDD();
== Step 2: convert the JavaRDD<Row> into a JavaRDD<LabeledPoint>
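A minimal sketch of what the two steps can look like, assuming (purely for illustration) that the label sits in column 0 and a single numeric feature in column 1:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.sql.Row;

// Step 1: Dataset<Row> -> JavaRDD<Row>
JavaRDD<Row> dataPoints = modelDS.toJavaRDD();
// Step 2: JavaRDD<Row> -> JavaRDD<LabeledPoint>
JavaRDD<LabeledPoint> labeled = dataPoints.map(
    row -> new LabeledPoint(row.getDouble(0), Vectors.dense(row.getDouble(1))));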
Problem solved.
The import of org.apache.spark.api.java.function.Function was missing.
Thanks.
Carlo
On 3 Aug 2016, at 12:14, Carlo.Allocca <carlo.allo...@open.ac.uk> wrote:
Hi All,
I am trying to convert a Dataset<Row> into a JavaRDD in order to
apply a linear regression.
I am using spark-core_2.10, version 2.0.0, with Java 1.8.
Hi All,
I would like to apply a regression to my data. One of the workflow steps is to
prepare my data as a JavaRDD, starting from a Dataset<Row> with
its header. So, what I did was the following:
== Step 1: transform the Dataset<Row> into a JavaRDD<Row>
JavaRDD<Row> dataPointsWithHeader = modelDS.toJavaRDD();
Hi Aseem,
Thank you very much for your help.
Please, allow me to be more specific about my case (to some extent I already do
what you suggested):
Let us imagine that I have two CSV datasets, d1 and d2. I generate the Dataset<Row>
as in the following:
== Reading d1:
sparkSession = spark;
options = …
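For comparison, a minimal sketch of how d1 can be read with an explicit schema (reusing the schema and path names that appear later in this thread; the "header" option tells Spark to skip the header line rather than return it as a data row):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Read a CSV with an explicit schema, skipping the header line.
Dataset<Row> d1 = spark.read()
    .option("header", "true")
    .schema(categoryRankSchema)
    .csv(categoryrankFilePath);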
Thanks Mich.
Yes, I know both headers (categoryRankSchema, categorySchema), as expressed
below:
this.dataset1 =
d1_DFR.schema(categoryRankSchema).csv(categoryrankFilePath);
this.dataset2 = d2_DFR.schema(categorySchema).csv(categoryFilePath);
Can you use filter to get rid of the header row?
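A minimal sketch of that suggestion (the column name "id" and the header literal are hypothetical):

import static org.apache.spark.sql.functions.col;

// Drop any row whose first column still carries the literal header text.
Dataset<Row> noHeader = dataset1.filter(col("id").notEqual("id"));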
One more:
It seems that the steps
== Step 1: transform the Dataset<Row> into a JavaRDD<Row>
JavaRDD<Row> dataPointsWithHeader = dataset1_Join_dataset2.toJavaRDD();
and
List<Row> someRows = dataPointsWithHeader.collect();
someRows.forEach(System.out::println);
do not print the header.
So, could I assume
Hi Mich,
Thanks again.
My issue is not when I read the CSV from a file;
it is when I have a Dataset<Row> that is the output of some join operations.
Any help on that?
Many Thanks,
Best,
Carlo
On 3 Aug 2016, at 21:43, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
Hm, odd.
Otherwise you ca
On 3 Aug 2016, at 22:01, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
OK, in other words the result set of joining two datasets ends up with
inconsistent results, as a header from one DS is joined with a row from
another DS?
I am not 100% sure I got this point. Let me check if I
Dear All,
I would like to ask for your help about the following issue:
java.lang.ClassNotFoundException: org.apache.spark.Logging
I checked and the class Logging is not present.
Moreover, the line of code where the exception is thrown is:
final org.apache.spark.mllib.regression.LinearRegressionModel…
Hi Ted,
Thanks for the prompt answer.
It is not yet clear to me what I should do.
How to fix it?
Many thanks,
Carlo
On 5 Aug 2016, at 17:58, Ted Yu <yuzhih...@gmail.com> wrote:
private[spark] trait Logging {
Please, Sean, could you detail the version mismatch?
Many thanks,
Carlo
On 5 Aug 2016, at 18:11, Sean Owen <so...@cloudera.com> wrote:
You also seem to have a
version mismatch here.
I have also executed:
mvn dependency:tree |grep log
[INFO] | | +- com.esotericsoftware:minlog:jar:1.3.0:compile
[INFO] +- log4j:log4j:jar:1.2.17:compile
[INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.16:compile
[INFO] | | +- commons-logging:commons-logging:jar:1.1.3:compile
and the POM reports
Thanks Marcelo.
Problem solved.
Best,
Carlo
Hi Marcelo,
Thank you for your help.
Problem solved as you suggested.
Best Regards,
Carlo
> On 5 Aug 2016, at 18:34, Marcelo Vanzin wrote:
>
> On Fri, Aug 5, 2016 at 9:53 AM, Carlo.Allocca
> wrote:
>>
>>org.apache.spark
>>
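For later readers: as Ted's snippet shows, org.apache.spark.Logging became private[spark] in Spark 2.0, so this ClassNotFoundException typically means a pre-2.0 Spark artifact compiled against the old public Logging trait is mixed with 2.0 artifacts. The usual fix (stated here as a general rule, not as the exact change Marcelo proposed) is to align every Spark entry in the POM, e.g.:
spark-core_2.10
2.0.0
spark-mllib_2.10
2.0.0
i.e. the same version and the same Scala suffix for all Spark artifacts.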
Hi All,
I am using Spark and, in particular, the MLlib library.
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;
For my problem I am using LinearRegressionWithSGD.
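A minimal sketch of training with the classes imported above (the JavaRDD<LabeledPoint> named "labeled", the iteration count, and the step size are placeholders):

// Train a linear regression model with SGD on an existing JavaRDD<LabeledPoint>.
int numIterations = 100;
double stepSize = 0.0001;
final LinearRegressionModel model =
    LinearRegressionWithSGD.train(labeled.rdd(), numIterations, stepSize);

// Predict on a single point's feature vector.
double prediction = model.predict(labeled.first().features());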
Hi Mohit,
Thank you for your reply.
OK, it means coefficients with high scores are more important than others with
low scores…
Many Thanks,
Best Regards,
Carlo
> On 3 Nov 2016, at 20:41, Mohit Jaggi wrote:
>
> For linear regression, it should be fairly easy. Just sort the co-efficients
> :)
>
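A sketch of Mohit's suggestion, assuming a trained LinearRegressionModel named "model" and reading "sort the coefficients" as ranking by absolute weight:

import java.util.Arrays;

// Rank feature indices by the absolute value of their fitted weights.
final double[] weights = model.weights().toArray();
Integer[] order = new Integer[weights.length];
for (int i = 0; i < weights.length; i++) order[i] = i;
Arrays.sort(order, (a, b) -> Double.compare(Math.abs(weights[b]), Math.abs(weights[a])));
// order[0] now holds the index of the most influential feature.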
Hi Robin,
On 4 Nov 2016, at 09:19, Robin East <robin.e...@xense.co.uk> wrote:
Hi
Do you mean the test of significance that you usually get with R output?
Yes, exactly.
I don't think there is anything implemented in the standard MLlib libraries;
however, I believe that the SparkR version
Hi Masood,
Thank you very much for your insight.
I am going to scale all my features as you described.
As I am a beginner, is there any paper/book that would explain the suggested
approaches? I would love to read it.
Many Thanks,
Best Regards,
Carlo
On 7 Nov 2016, at 16:27, Masood Krohy <masood.kr...@intact.net> wrote:
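A minimal sketch of the scaling step with MLlib's StandardScaler (assuming an existing JavaRDD<Vector> named "features"; the two boolean flags request zero mean and unit variance):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.feature.StandardScaler;
import org.apache.spark.mllib.feature.StandardScalerModel;
import org.apache.spark.mllib.linalg.Vector;

// Fit the scaler on the feature vectors, then transform each vector.
StandardScalerModel scaler = new StandardScaler(true, true).fit(features.rdd());
JavaRDD<Vector> scaled = features.map(scaler::transform);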
I found it just by googling:
http://sebastianraschka.com/Articles/2014_about_feature_scaling.html
Thanks.
Carlo
On 7 Nov 2016, at 17:12, carlo allocca <ca6...@open.ac.uk> wrote:
Hi Masood,
Thank you very much for your insight.
I am going to scale all my features as you described.
A
Thanks in advance.
Best Regards,
Carlo
On 7 Nov 2016, at 17:14, carlo allocca <ca6...@open.ac.uk> wrote:
I found it just by googling:
http://sebastianraschka.com/Articles/2014_about_feature_scaling.html
Thanks.
Carlo
On 7 Nov 2016, at 17:12, carlo allocca <ca6...@open.ac.uk> wrote:
Hi Masood,
Thanks for the answer.
Sure. I will do as suggested.
Many Thanks,
Best Regards,
Carlo
On 8 Nov 2016, at 17:19, Masood Krohy <masood.kr...@intact.net> wrote:
labels
Dear All,
I am using spark-xml_2.10 to parse and extract some data from XML files.
I have the issue of getting null values, whereas the XML file actually contains
values.
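For context, a minimal sketch of how such a file is read with spark-xml (the path variable is a placeholder; rowTag picks the element that becomes one row):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Treat each <xocs:doc> element as one row.
Dataset<Row> docs = spark.read()
    .format("com.databricks.spark.xml")
    .option("rowTag", "xocs:doc")
    .load(xmlFilePath);
docs.printSchema();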
Dear All,
I would like to ask for your help about the following issue when using
spark-xml_2.10:
Given an XML file with the following structure:
xocs:doc
 |-- xocs:item: struct (nullable = true)
 |    |-- bibrecord: struct (nullable = true)
 |    |    |-- head: struct (nullable = true)
 |    |    |    …
…How can I get it right to use String rowTag = "xocs:doc"; and get
the right values for ….abstract.ce:para, etc.? What am I doing wrong?
Many Thanks in advance.
Best Regards,
Carlo
On 14 Feb 2017, at 17:35, carlo allocca <ca6...@open.ac.uk> wrote:
Dear All,
I would like to ask for your help about the following…
My question is: how can I get it right to use String rowTag = "xocs:doc"; and get
the right values for ….abstract.ce:para, etc.? What am I doing wrong?
Many Thanks in advance.
Best Regards,
Carlo
On 14 Feb 2017, at 17:35, carlo allocca <ca6...@open.ac.uk> wrote:
Dear All,
I would
Dear All,
I need to apply a dataset transformation that replaces null values with the
previous non-null value.
As an example, I report the following:
from:

id | col1
---------
 1 | null
 1 | null
 2 | 4
 2 | null
 2 | null
 3 | 5
 3 | null
 3 | null

to:

id | col1
---------
 1 | null
 1 | null
 2 | 4
 2 | 4
 2 | 4
 3 | 5
 3 | 5
 3 | 5

(the rows with id = 1 stay null because they have no preceding non-null value)
Dear All,
I am trying to propagate the last valid observation (i.e. non-null) to the null
values in a dataset.
Below I report my partial solution:
Dataset<Row> tmp800 = tmp700.select("uuid", "eventTime", "Washer_rinseCycles");
WindowSpec wspec =
Window.partitionBy(tmp800.col("uuid")).orderBy(tmp800.col("eventTime"))…
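For completeness, a hedged sketch of one way to finish this in the Java API, using last(..., ignoreNulls = true) over a window bounded at the current row; the column names follow the snippet above, but this is an assumption, not the solution confirmed in the thread:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.expressions.WindowSpec;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.last;

// Look back from each row within a uuid partition and take the most
// recent non-null value of Washer_rinseCycles.
WindowSpec wspec = Window.partitionBy(col("uuid"))
    .orderBy(col("eventTime"))
    .rowsBetween(Long.MIN_VALUE, 0);
Dataset<Row> filled = tmp800.withColumn("Washer_rinseCycles",
    last(col("Washer_rinseCycles"), true).over(wspec));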