expect to get array elements of type decimal(38,18) and no error when
reading in this case.
Should this be considered a bug? Is there a workaround other than changing the
column array type definition to include explicit precision and scale?
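For reference, a minimal sketch of what an array-of-decimal column with explicit precision and scale looks like in a Spark schema, matching the decimal(38,18) mentioned above (field names are illustrative, not taken from the actual report):

import org.apache.spark.sql.types._

// Illustrative schema: the array elements carry an explicit precision and scale.
val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("values", ArrayType(DecimalType(38, 18)), nullable = true)
))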
Best regards,
Alexey
Hi,
I also filed a JIRA yesterday:
https://issues.apache.org/jira/browse/SPARK-26538
Looks like one of them needs to be closed as a duplicate. Sorry for the late update.
Best regards
---
clean package
Is it only me who can’t build Spark 1.3?
And is there any site where I can download Spark prebuilt for Hadoop 2.5 and Hive?
Thank you for any help.
Alexey
when view
expires the length of the sliding window….
So my question: does anybody know of, or can share, a piece of code or know-how on
how to implement a “sliding Top N window” better?
If nothing is offered, I will share what I end up doing myself.
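A rough sketch of one possible approach, just to illustrate the idea (names and durations are made up, and this is untested; it assumes a DStream of keys and Spark Streaming's reduceByKeyAndWindow):

import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.dstream.DStream

// Count keys over a sliding window and keep only the top N entries of each window.
def slidingTopN(events: DStream[String], n: Int): DStream[(String, Long)] = {
  events
    .map(key => (key, 1L))
    // 5-minute window, recomputed every 10 seconds (both durations are assumptions)
    .reduceByKeyAndWindow(_ + _, Seconds(300), Seconds(10))
    // top() is an action executed once per window; fine when N is small
    .transform(rdd => rdd.sparkContext.parallelize(
      rdd.top(n)(Ordering.by[(String, Long), Long](_._2)).toSeq, 1))
}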
Thank you
Alexey
What's the reason for your first cache call? It looks like you use the data
only once, to transform it, without reusing it, so there's no reason for the
first cache call; you need only the second one (and even that depends on the
rest of your code).
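A tiny sketch of the pattern being described (hypothetical names, not the original code):

val raw = sc.textFile("input.txt")              // caching `raw` here would be wasted:
val parsed = raw.map(_.split(",")).cache()      // `raw` is read only once, on this line
// Only `parsed` is reused, so only `parsed` benefits from being cached.
parsed.count()
parsed.take(10)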
On Thu, Jun 16, 2016 at 3:17 PM, ps
From my personal experience: we read the metadata of the features
column in the DataFrame to extract the mapping of feature indices to the
original feature names, and use this mapping to translate the model
coefficients into a JSON string that maps the original feature names to
their weights.
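A rough sketch of that approach (illustrative names; it assumes the features column was assembled by ML transformers that attach attribute metadata, and uses a LogisticRegressionModel purely for concreteness):

import org.apache.spark.ml.attribute.AttributeGroup
import org.apache.spark.ml.classification.LogisticRegressionModel
import org.apache.spark.sql.DataFrame

// Map each coefficient back to the original feature name using the ML attribute
// metadata stored on the assembled "features" column; the resulting map can then
// be serialized to a JSON string.
def coefficientsByName(df: DataFrame, model: LogisticRegressionModel): Map[String, Double] = {
  val attrs = AttributeGroup.fromStructField(df.schema("features")).attributes.get
  attrs.map { attr =>
    val idx = attr.index.get
    attr.name.getOrElse(s"f$idx") -> model.coefficients(idx)
  }.toMap
}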
Hi Yanbo,
Thanks for your reply. I will keep an eye on that pull request.
For now, I decided to just put my code inside org.apache.spark.ml to be
able to access private classes.
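A hedged sketch of that workaround (assuming the private class in question is org.apache.spark.ml.linalg.VectorUDT from the newer ML package; fragile by design, since it relies on Spark internals that can change between releases):

// The file is declared inside Spark's own package tree, so package-private
// members such as VectorUDT become visible to it.
package org.apache.spark.ml

import org.apache.spark.ml.linalg.VectorUDT

object VectorUdtAccess {
  // Accessible here only because this object lives under Spark's package namespace.
  def udt: VectorUDT = new VectorUDT
}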
Thanks,
Alexey
On Tue, Aug 16, 2016 at 11:13 PM, Yanbo Liang wrote:
> It seems that VectorUDT is private and
Hi
I have a simple Spark Streaming job (8 executors with 1 core each, on an 8-node cluster)
that reads from a Kafka topic (3 brokers with 8 partitions) and saves to Cassandra.
The problem is that when I increase the number of incoming messages in the topic, the
job starts to fail with kafka.common.OffsetOutOfRangeExcepti
ion in thread "main" java.lang.UnsupportedClassVersionError:
org/apache/maven/cli/MavenCli : Unsupported major.minor version 51.0
Please advise how to build this.
Thanks
Alexey
>> Or is there any way to access the array if I store the max and min values in an
>> array inside the Spark transformation class?
>>
>> Thanks.
--
Best regards, Alexey Grishchenko
phone: +353 (87) 262-2154
email: programme...@gmail.com
web: http://0x0fff.com
Boolean:false rows. And hopefully, in the final result, the
>> negative ones could be 10 times more numerous than the positive ones.
>>
>>
>> What would be the most efficient way to do this?
>>
>> Thanks,
>>
>>
>>
>>
--
Best regards, Alexey Grishchenko
phone: +353 (87) 262-2154
email: programme...@gmail.com
web: http://0x0fff.com
n I figure out at run time how many machines
>> there are so I know how many DStreams to create?
>>
>
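A hedged sketch of one common way to estimate this at runtime (an approximation, especially with dynamic allocation, and not an API designed for this purpose):

import org.apache.spark.SparkContext

// getExecutorMemoryStatus lists block managers, which includes the driver,
// hence the "- 1".
def numExecutors(sc: SparkContext): Int =
  math.max(sc.getExecutorMemoryStatus.size - 1, 1)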
--
Best regards, Alexey Grishchenko
phone: +353 (87) 262-2154
email: programme...@gmail.com
web: http://0x0fff.com
, red)
>
> Can you please explain what is going on?
>
> Thanks,
>
>
--
Alexey Grishchenko, http://0x0fff.com
Hi,
I have the following code
object MyJob extends org.apache.spark.Logging {
  ...
  val source: DStream[SomeType] = ...

  source.foreachRDD { rdd =>
    // runs on the driver, once per batch interval
    logInfo(s"""+++ForEachRDD+++""")
    rdd.foreachPartition { partitionOfRecords =>
      // runs on the executors, once per partition of the batch's RDD
      logInfo(s"""+++ForEachPartition+++""")
    }
  }
I
Hi,
I have an application with 2 streams, which are joined together.
Stream1 is a simple DStream (relatively small batch chunks).
Stream2 is a windowed DStream (with a window duration of, for example, 60 seconds).
Stream1 and Stream2 are Kafka direct streams.
The problem is that, according to the logs, the window oper
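A minimal sketch of the setup described above (illustrative types and names; it assumes both streams are keyed pair DStreams so that they can be joined):

import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.dstream.DStream

// Keep the last 60 seconds of stream2 and join each stream1 batch against it.
def joinWithWindow(stream1: DStream[(String, String)],
                   stream2: DStream[(String, String)]): DStream[(String, (String, String))] = {
  stream1.join(stream2.window(Seconds(60)))
}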
"Cody Koeninger":
> Can you provide more info (what version of spark, code example)?
>
> On Tue, Sep 8, 2015 at 8:18 AM, Alexey Ponkin wrote:
>> Hi,
>>
>> I have an application with 2 streams, which are joined together.
>> Stream1 - is simple DStream(
Hello!
I would like to avoid data checkpointing when processing a DStream. Basically,
we do not care if the intermediate data are lost.
Is there a way to achieve that? Is there an extension point or a class that
encapsulates all the associated activities?
Thanks!
Sincerely yours,
—
Alexey Kharlamov
I have tried "select ceil(2/3)", but got "key not found: floor"
On Tue, Jan 27, 2015 at 11:05 AM, Ted Yu wrote:
> Have you tried floor() or ceil() functions ?
>
> According to http://spark.apache.org/sql/, Spark SQL is compatible with
> Hive SQL.
>
> Cheers
>
> On Mon, Jan 26, 2015 at 8:29 PM, 1
= new SparkContext(conf)
println(s"json4s version: ${org.json4s.BuildInfo.version.toString}")
}
}
sbt 0.13.7, sbt-assembly 0.13.0, Scala 2.10.4
Is it possible to force usage of version 3.2.11?
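For context, a minimal sketch of the sbt side of pinning a json4s version on the application's own classpath (assuming json4s-jackson is the relevant module; as the replies below discuss, this alone does not replace the json4s that ships inside Spark itself):

// build.sbt (sketch)
libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.2.11"
dependencyOverrides += "org.json4s" %% "json4s-jackson" % "3.2.11"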
Thanks,
Alexey
is hard coded.
>
> You can rebuild Spark 1.3.0 with json4s 3.2.11
>
> Cheers
>
> On Mon, Mar 23, 2015 at 2:12 PM, Alexey Zinoviev <
> alexey.zinov...@gmail.com> wrote:
>
>> Spark has a dependency on json4s 3.2.10, but this version has several
>> bugs and I nee
hat's wrong with Logging?
PS: I'm running it with spark-1.3.0/bin/spark-submit --class App1 --conf
spark.driver.userClassPathFirst=true --conf
spark.executor.userClassPathFirst=true
$HOME/projects/sparkapp/target/scala-2.10/sparkapp-assembly-1.0.jar
Thanks,
Alexey
On Tue, Mar 24, 2015
how. Can you double check that and remove the Scala
> classes from your app if they're there?
>
> On Mon, Mar 23, 2015 at 10:07 PM, Alexey Zinoviev
> wrote:
>> Thanks Marcelo, this options solved the problem (I'm using 1.3.0), but it
>> works only if I remove &q
Hello again spark users and developers!
I have a standalone Spark cluster (1.1.0) with Spark SQL running on it. My
cluster consists of 4 datanodes, and the replication factor of the files is 3.
I use the Thrift server to access Spark SQL and have 1 table with 30+
partitions. When I run a query on the whole table (some
- https://gist.github.com/13h3r/6e5053cf0dbe33f2
Do you have any idea where to look?
Thanks!
On Fri, Sep 26, 2014 at 10:35 AM, Andrew Ash wrote:
> Hi Alexey,
>
> You should see in the logs a locality measure like NODE_LOCAL,
> PROCESS_LOCAL, ANY, etc. If your Spark workers
Hello spark users and developers!
I am using HDFS + Spark SQL + a Hive schema + Parquet as the storage format. I
have a lot of Parquet files; one file fits one HDFS block for one day. The
strange thing is the very slow first query in Spark SQL.
To reproduce the situation I use only one core and I have 97sec f
show
> whether the upfront compilation really helps. I doubt it.
>
> However, is this almost surely due to caching somewhere, in Spark SQL
> or HDFS? I really doubt HotSpot makes a difference compared to these
> much larger factors.
>
> On Fri, Oct 10, 2014 at 8:49 AM, Alexe
Hello spark users!
I found lots of strange messages in the driver log. Here they are:
2014-12-01 11:54:23,849 [sparkDriver-akka.actor.default-dispatcher-25]
ERROR
akka.remote.EndpointWriter[akka://sparkDriver/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkExecutor%40data1.hadoop%3A1
Any ideas? Anyone got the same error?
On Mon, Dec 1, 2014 at 2:37 PM, Alexey Romanchuk wrote:
> Hello spark users!
>
> I found lots of strange messages in driver log. Here it is:
>
> 2014-12-01 11:54:23,849 [sparkDriver-akka.actor.default-dispatcher-25]
> ERROR
> akka.remot