Is there any way to map pyspark.sql.Row columns to JDBC table columns, or do
I have to just put them in the right order before saving?
I'm using code like this:
```
rdd = rdd.map(lambda i: Row(name=i.name, value=i.value))
sqlCtx.createDataFrame(rdd).write.jdbc(dbconn_string, tablename,
                                       mode='append')
```
Thanks
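A sketch of one workaround, not a confirmed answer: in the 1.x Python API a Row built from keyword arguments sorts its fields alphabetically, so the Row itself doesn't control ordering; if the writer inserts by position, selecting the columns explicitly in the order the table expects should make it deterministic.

```
# Sketch: pick the column order explicitly before writing, so it matches
# the table regardless of how Row ordered its fields.
df = sqlCtx.createDataFrame(rdd)
df.select("name", "value").write.jdbc(dbconn_string, tablename, mode='append')
```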
Has anyone successfully built this? I'm trying to determine if there is a
defect in the source package or something strange about my environment. I
get a FileNotFound exception on MQTTUtils.class during the build of the
MQTT module. The only workaround I've found is to remove the MQTT modules
from the build.
I'm running a Spark cluster and I'd like to access the Spark UI from
outside the LAN. The problem is that all the links are to internal IP
addresses. Is there any way to configure hostnames for each of the hosts in
the cluster and use those for the links?
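A hedged note rather than a confirmed fix: the standalone daemons read the SPARK_PUBLIC_DNS environment variable and advertise that hostname in their UI links, so setting it per host may do what you want. The hostname below is a placeholder.

```
import os

# Sketch: must be set before the process starts; for master/worker UIs this
# normally goes in conf/spark-env.sh on each host (one hostname per machine).
os.environ["SPARK_PUBLIC_DNS"] = "node1.example.com"
```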
> df.select("name", (df.age)**2).show()
> TypeError: unsupported operand type(s) for ** or pow(): 'Column' and 'int'
>
> Moreover, testing the functions individually, they work fine:
>
> pow(2, 4)
> 16
>
> 2**4
> 16
>
I'm having trouble using "select pow(col) from table". It seems the function
is not registered with Spark SQL. Is this on purpose or an oversight? I'm
using pyspark.
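A hedged sketch of a workaround, assuming a PySpark release where pyspark.sql.functions exposes pow (1.4 and later): build the expression with the column function instead of Python's ** operator, which is what raises the TypeError quoted above.

```
from pyspark.sql import functions as F

# Sketch: F.pow returns a Column expression, sidestepping Python's builtin
# **/pow on Column objects; "name" and "age" are the columns from the thread.
df.select("name", F.pow(df.age, 2).alias("age_squared")).show()
```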
I've only tried it in Python.
On Tue, Jun 23, 2015 at 12:16 PM Ignacio Blasco wrote:
> Does that issue happen only in the Python DSL?
> On 6/23/2015 5:05 p.m., "Bob Corsaro" wrote:
>
>> Thanks! The solution:
>>
>> https://gist.github.com/dokipen/018a1deeab
"inner") \
> .select(numbers.name, numbers.value, numbers2.other) \
> .collect()
>
> On Mon, Jun 22, 2015 at 12:53 PM, Ignacio Blasco
> wrote:
> > Sorry thought it was scala/spark
> >
> > El 22/6/2015 9:49 p. m., "Bob Corsaro&qu
That's invalid syntax. I'm pretty sure pyspark is using a DSL to create a
query here and not actually doing an equality operation.
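To make that concrete, a minimal sketch using the frame and column names from this thread: in the Python API, == on a Column is overloaded to build a query expression, so the Scala-style === is neither available nor needed.

```
# Sketch: __eq__ on a Column returns a Column expression (the join
# condition), not a Python boolean; PySpark builds the query from it lazily.
cond = numbers.name == numbers2.name
numbers.join(numbers2, cond, "inner") \
       .select(numbers.name, numbers.value, numbers2.other) \
       .collect()
```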
On Mon, Jun 22, 2015 at 3:43 PM Ignacio Blasco wrote:
> Probably you should use === instead of == and !== instead of !=
Can anyone explain why the dataframe API doesn't work as I expect it to
here? It seems like the column identifiers are getting confused.
https://gist.github.com/dokipen/4b324a7365ae87b7b0e5
I'm setting PYTHONPATH before calling pyspark, but the worker nodes aren't
inheriting it. I've looked through the code, and it appears that they
should, but I can't find the bug. Here's an example; what am I doing wrong?
https://gist.github.com/dokipen/84c4e4a89fddf702fdf1
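A hedged sketch of one way around it, using the documented config route rather than environment inheritance: spark.executorEnv.<name> settings are exported into each executor's environment. The path below is a placeholder.

```
from pyspark import SparkConf, SparkContext

# Sketch: forward PYTHONPATH to executors explicitly; "/opt/mylibs"
# is a placeholder for the real library path.
conf = SparkConf().set("spark.executorEnv.PYTHONPATH", "/opt/mylibs")
sc = SparkContext(conf=conf)
```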
wrote:
> like this?
>
> myDStream.foreachRDD(rdd => rdd.saveAsTextFile("/sigmoid/", codec))
>
>
> Thanks
> Best Regards
It looks like saveAsTextFiles doesn't support the compression parameter of
RDD.saveAsTextFile. Is there a way to add the functionality in my client
code without patching Spark? I tried making my own saveFunc function and
calling DStream.foreachRDD but ran into trouble with invoking rddToFileName
an
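A hedged sketch of the client-code route, assuming the PySpark streaming API (the reply above is Scala, where rddToFileName is private); the helper name is hypothetical and the file-name logic is inlined to keep it self-contained.

```
import time

# Sketch (hypothetical helper): mimic DStream.saveAsTextFiles but pass a
# compression codec through to RDD.saveAsTextFile. File names follow the
# rddToFileName convention: prefix-<batch time in ms>[.suffix].
def save_as_compressed_text_files(dstream, prefix, suffix=None,
        codec="org.apache.hadoop.io.compress.GzipCodec"):
    def save(t, rdd):  # foreachRDD hands (batch time, rdd) to 2-arg functions
        ms = int(time.mktime(t.timetuple()) * 1000) + t.microsecond // 1000
        path = "%s-%d%s" % (prefix, ms, "." + suffix if suffix else "")
        rdd.saveAsTextFile(path, compressionCodecClass=codec)
    dstream.foreachRDD(save)
```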