"org.apache.hadoop.io.Writable"}
>>>>
>>>>
>>>> keyConv =
>>>>
>>>> "org.apache.spark.examples.pythonconverters.StringToImmutableBytesWritableConverter"
>>>> valueConv =
>>>> "org.apache.s
s = FlumeUtils.createStream(ssc, 'ubuntu3', 9997)
>>> words = lines.map(lambda line: line[1])
>>> rowid = datetime.now().strftime("%Y%m%d%H%M%S")
>>> outrdd= words.map(lambda x: (str(1),[rowid,"cf1desc","col1",x]))
>>> print
owid,"cf1desc","col1",x]))
>> print("ok 1")
>> outrdd.pprint()
>>
>> outrdd.foreachRDD(lambda x:
>>
>> x.saveAsNewAPIHadoopDataset(conf=conf,keyConverter=keyConv,valueConverter=valueConv))
>>
>> ssc.start()
>> ssc.a
> print("ok 1")
> outrdd.pprint()
>
> outrdd.foreachRDD(lambda x:
>
> x.saveAsNewAPIHadoopDataset(conf=conf,keyConverter=keyConv,valueConverter=valueConv))
>
> ssc.start()
> ssc.awaitTermination()*
>
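For context, conf is the usual New Hadoop API output configuration for HBase, essentially what the hbase_outputformat.py example that ships with Spark uses, and valueConv should be the StringListToPutConverter from the same examples package; host and table below are placeholders for my actual ZooKeeper quorum and table name, so take this as a sketch rather than an exact copy:

host = "ubuntu3"        # placeholder: ZooKeeper quorum host
table = "stream_table"  # placeholder: target HBase table
conf = {"hbase.zookeeper.quorum": host,
        "hbase.mapred.outputtable": table,
        "mapreduce.outputformat.class": "org.apache.hadoop.hbase.mapreduce.TableOutputFormat",
        "mapreduce.job.output.key.class": "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "mapreduce.job.output.value.class": "org.apache.hadoop.io.Writable"}
valueConv = "org.apache.spark.examples.pythonconverters.StringListToPutConverter"  # presumed full string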
The issue is that the rowid variable always stays at the value it had when the
streaming began.
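As far as I can tell, the rowid line runs exactly once on the driver, before ssc.start(), and the lambda I pass to map() just carries that single value into every batch; nothing ever re-runs strftime at each batch interval. A stripped-down sketch of what I mean, outside of Spark and with made-up names:

from datetime import datetime
import time

rowid = datetime.now().strftime("%Y%m%d%H%M%S")               # evaluated once, right here
make_row = lambda x: (str(1), [rowid, "cf1desc", "col1", x])  # closure just reads the stored value

time.sleep(2)
print(make_row("first"))    # uses the rowid computed above
time.sleep(2)
print(make_row("second"))   # still the same rowid; datetime.now() is never called again

So, if I understand it right, a fresh value would have to come from code that Spark actually runs for every batch.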
How can I work around this? I tried a function, an application; nothing worked.
Thank you.
jp