> In Spark 1.6.x I think this may work with spark-csv
> <https://github.com/databricks/spark-csv>:
>
> spark.read.format("com.databricks.spark.csv").option("header", "false")
>   .schema(custom_schema)
>   .option('delimiter', '\t')
>   .option('mode', 'DROPMALFORMED')
>   .load(paths.split(','))
However, even though it mentions that this approach would work in Spark 2.x, I don't
find an implementation of load that accepts an Array[String] as input.
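For what it's worth, in Spark 2.x the Scala DataFrameReader.load is declared as a varargs
method, load(paths: String*), so an Array[String] can be expanded into it with ": _*".
A rough sketch, assuming paths is a comma-separated string and custom_schema is defined
elsewhere (Spark 2.x also ships a built-in csv source, so spark-csv is no longer needed):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("csv-multi-path").getOrCreate()

// load(paths: String*) is varargs, so the Array[String] is expanded with ": _*"
val df = spark.read
  .format("csv")
  .schema(custom_schema)              // assumed to be defined elsewhere
  .option("header", "false")
  .option("delimiter", "\t")
  .option("mode", "DROPMALFORMED")
  .load(paths.split(","): _*)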
> …4, Inf_period#1039,
> infectedFamily#1355L, infectedWorker#1385L]
>
> +- Aggregate [S_ID#1903L], [S_ID#1903L, count(1) AS infectedStreet#1415L]
>
> Does anyone have a clue about it?
> Thanks,
>
Didac Gil de la Iglesia
PhD in Computer Science
didacg...@gmail.com
Spain: +34 696 285 544
Sweden: +46 (0)730229737
Skype: didac.gil.de.la.iglesia
> .load()
> val ds1 = ds.select($"value")
> val query = ds1.writeStream.outputMode("append").format("console").start()
> query.awaitTermination()
> There are no errors when I execute this code, but I don't see any data
> being printed to the console. When I run m…
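For comparison, here is a minimal self-contained sketch of the same pipeline, assuming a
socket source on localhost:9999 fed by something like nc -lk 9999; with data arriving on
the socket, each micro-batch should be printed by the console sink:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("console-sink").getOrCreate()
import spark.implicits._

// Each row read from the socket has a single string column named "value"
val ds = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

val ds1 = ds.select($"value")

// The console sink prints every micro-batch; awaitTermination() blocks the driver
val query = ds1.writeStream
  .outputMode("append")
  .format("console")
  .start()

query.awaitTermination()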
> … as follows:
> user_id1 feature1 feature2 feature3 feature4 feature5...feature100
>
> Is there a more efficient way other than a join?
>
> Thanks!
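Hard to say without the full message, but if the per-user features live in a second
DataFrame keyed by user_id (an assumption on my part, as are the names below), one option
is to broadcast the smaller side so the join avoids a shuffle. A rough sketch:

import org.apache.spark.sql.functions.broadcast

// users:    user_id, ...
// features: user_id, feature1 ... feature100
// Broadcasting the smaller DataFrame turns the shuffle join into a map-side join
val joined = users.join(broadcast(features), Seq("user_id"), "left")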
Didac Gil de la Iglesia
PhD in Computer Science
didacg...@gmail.com
Spain: +34 696 285 544
Sweden: +46 (0)730229737
Skype: didac.gil.de.la.iglesia
> … coalesce in a SQL expression, but I'm not having any luck here either.
>
> Obviously, I can do a null check on the fields downstream, but it is not
> in the spirit of Scala to pass around nulls, so I wanted to see if I was
> missing another approach first.
>
> Thanks,
>
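In case it helps, two things I have used for this (column and class names below are made
up): coalesce in the DataFrame API to substitute a default, or a typed Dataset with
Option fields so nulls never cross the DataFrame boundary. A rough sketch:

import org.apache.spark.sql.functions.{coalesce, col, lit}

// Replace nulls with a default value at the DataFrame level
val withDefaults = df.withColumn("amount", coalesce(col("amount"), lit(0.0)))

// Or keep the "maybe missing" semantics explicit with Option in a typed Dataset
case class Record(id: Long, amount: Option[Double])
import spark.implicits._
val ds = df.select("id", "amount").as[Record]   // nullable columns map to Option fields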
Spark can be both a consumer and a producer from the Kafka point of view.
You can create a Kafka client in Spark that subscribes to a topic and reads the feed, and
you can process data in Spark and produce the results back into another topic.
So Spark sits next to Kafka, and you can use Kafka…
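A rough Structured Streaming sketch of both sides (broker address, topic names and
checkpoint path are placeholders), assuming the spark-sql-kafka-0-10 connector is on the
classpath:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka-in-out").getOrCreate()

// Consume: subscribe to an input topic
val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "input-topic")
  .load()
  .selectExpr("CAST(value AS STRING) AS value")

// ... transform input here ...

// Produce: write the (transformed) stream back to another topic
val query = input.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "output-topic")
  .option("checkpointLocation", "/tmp/kafka-checkpoint")
  .start()

query.awaitTermination()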
Is 1570 the value of Col1?
If so, you have ordered by that column and selected only the first item. It seems that
both results have the same Col1 value, so either of them would be a correct answer to
return. Right?
> On 2 Feb 2017, at 11:03, Alex wrote:
>
> Hi, as shown below, the same query when…
Are you sure that “age” is a numeric field?
Even if it is numeric, you could try passing the “44” between quotes:
INSERT INTO your_table ("user","age","state") VALUES ('user3','44','CT')
Are you sure there are no more fields that are specified as NOT NULL and for which you
did not provide a value (besides user, a…
Any suggestions for using something like OneHotEncoder and StringIndexer on
an InputDStream?
I could try to build an indexer from a static parquet, but I want to use the
OneHotEncoder approach on streaming data coming from a socket.
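One approach I have been considering, though I have not tested it: fit the
StringIndexer/OneHotEncoder pipeline once on the static parquet, then apply the fitted
PipelineModel to every micro-batch of the socket DStream. A rough sketch (paths, host,
port and column names are placeholders):

import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}

val spark = SparkSession.builder().appName("streaming-ohe").getOrCreate()
import spark.implicits._

// 1) Fit the indexer/encoder once, on the static reference data
val static = spark.read.parquet("/data/reference.parquet")
val indexer = new StringIndexer()
  .setInputCol("category").setOutputCol("categoryIndex")
  .setHandleInvalid("skip")                 // drop labels the indexer has never seen
val encoder = new OneHotEncoder()
  .setInputCol("categoryIndex").setOutputCol("categoryVec")
val model = new Pipeline().setStages(Array(indexer, encoder)).fit(static)

// 2) Apply the fitted model to every micro-batch coming from the socket
val ssc = new StreamingContext(spark.sparkContext, Seconds(5))
val lines = ssc.socketTextStream("localhost", 9999)
lines.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    val df = rdd.toDF("category")           // assumes one categorical value per line
    model.transform(df).show()
  }
}

ssc.start()
ssc.awaitTermination()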
Thanks!
Dídac Gil de la Iglesia