Hi,
I am trying to fetch data from Oracle DB using a subquery and am experiencing
a lot of performance issues.
Below is the query I am using,
Using Spark 2.0.2
val df = spark_session.read.format("jdbc")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("url", jdbc_url)
  .o
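In case it helps a future reader: a sketch of how such a read is usually set up (the subquery, fetch size, and partitioning values below are illustrative, not from the original post). Pushing the subquery down via dbtable and parallelizing the scan is the usual first step for performance:

val df = spark_session.read.format("jdbc")
  .option("driver", "oracle.jdbc.OracleDriver")
  .option("url", jdbc_url)
  // Parenthesized subquery with an alias, so Oracle executes it as-is.
  .option("dbtable", "(SELECT id, name FROM my_table WHERE flag = 'Y') t")
  // Larger fetch size means fewer JDBC round trips per partition.
  .option("fetchsize", "10000")
  // Split the read across executors on a numeric column.
  .option("partitionColumn", "id")
  .option("lowerBound", "1")
  .option("upperBound", "1000000")
  .option("numPartitions", "8")
  .load()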
as it has built-in HDFS log
> rolling capabilities
>
> On Mon, Jun 26, 2017 at 1:09 PM, Naveen Madhire
> wrote:
>
>> Hi,
>>
>> I am using spark streaming with 1 minute duration to read data from kafka
>> topic, apply transformations and persist into HDF
Hi,
I am using Spark Streaming with a 1-minute batch duration to read data from a
Kafka topic, apply transformations, and persist into HDFS.
The application is creating a new directory every minute with many
partition files (= number of partitions). Which parameter should I
change/configure to persist
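For anyone searching later: the number of files per batch directory equals the number of partitions of the RDD being saved, so there is no single config knob; coalescing each batch before the save is the usual approach. A rough sketch, with the stream name assumed:

// dstream is the transformed DStream; coalesce each batch's RDD before saving.
dstream.foreachRDD { (rdd, time) =>
  rdd.coalesce(1) // one output file per batch; trades write parallelism for fewer files
    .saveAsTextFile("hdfs:///output/batch-" + time.milliseconds)
}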
Hi All,
I am running the Wikipedia parsing example from the "Advanced
Analytics with Spark" book.
https://github.com/sryza/aas/blob/d3f62ef3ed43a59140f4ae8afbe2ef81fc643ef2/ch06-lsa/src/main/scala/com/cloudera/datascience/lsa/ParseWikipedia.scala#l112
The partitions of the RDD returned by
Hi,
I am running PySpark on Windows and I am seeing an error while adding
pyFiles to the SparkContext. Below is the example:
sc = SparkContext("local", "Sample", pyFiles="C:/sample/yattag.zip")
This fails with a file-not-found error for "C".
The logic below is treating the path as individual files l
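For the record: pyFiles expects a list of paths, so a bare string is iterated character by character, which is why the error names "C". A minimal fix, assuming the zip path itself is valid:

sc = SparkContext("local", "Sample", pyFiles=["C:/sample/yattag.zip"])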
You can use IntelliJ IDEA for Scala. There are many articles online that you
can refer to for setting up IntelliJ and the Scala plugin.
Thanks
On Friday, July 24, 2015, Siva Reddy wrote:
> I want to program in scala for spark.
>
> --
> View this message in context:
> http://apache-spark-user-list.10
I had a similar issue with Spark 1.3.
After migrating to Spark 1.4 and using sqlContext.read.json, it worked well.
I think you can look at the DataFrame select and explode options to read the
nested JSON elements, arrays, etc.
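For example, a quick sketch (the file name and field names here are made up):
import org.apache.spark.sql.functions.explode
val df = sqlContext.read.json("events.json")
// One output row per element of the nested "items" array.
val items = df.select(explode(df("items")).as("item"))
items.select("item.id", "item.name").show()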
Thanks.
On Mon, Jul 20, 2015 at 11:07 AM, Davies Liu wrote:
> Could you tr
I am facing the same issue. I tried this but got a compilation error for
the "$" in the explode function, so I had to modify it as below to make it work:
df.select(explode(new Column("entities.user_mentions")).as("mention"))
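For reference, the $"..." shorthand compiles once the SQLContext implicits are in scope; the new Column(...) form above just avoids the import. For example:
import sqlContext.implicits._
df.select(explode($"entities.user_mentions").as("mention"))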
On Wed, Jun 24, 2015 at 2:48 PM, Michael Armbrust
wrote:
> Star
Yes, I did this recently. You need to copy the Cloudera cluster's conf
files to the local machine
and set HADOOP_CONF_DIR or YARN_CONF_DIR.
The local machine should also be able to ssh to the Cloudera cluster.
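Roughly like this, for example (the paths and jar name are illustrative):
export HADOOP_CONF_DIR=/path/to/copied/cluster-conf   # or YARN_CONF_DIR
spark-submit --master yarn-client your-app.jar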
On Wed, Jul 15, 2015 at 8:51 AM, ayan guha wrote:
> Assuming you run spark lo
also use spark-testing-base from
>> spark-packages.org as a basis for your unittests.
>>
>> On Fri, Jul 10, 2015 at 12:03 PM, Daniel Siegmann <
>> daniel.siegm...@teamaol.com> wrote:
>>
>>> On Fri, Jul 10, 2015 at 1:41 PM, Naveen Madhire
>>>
Hi,
I want to write JUnit test cases in Scala for testing a Spark application.
Is there a guide or link I can refer to?
Thank you very much.
-Naveen
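For future readers, a minimal sketch using plain ScalaTest with a local SparkContext (spark-testing-base, mentioned in the reply above, packages this setup for you):

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FunSuite}

class WordCountSuite extends FunSuite with BeforeAndAfterAll {
  private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("test"))
  }

  override def afterAll(): Unit = {
    sc.stop() // always stop the context so suites don't leak it
  }

  test("word counts are aggregated") {
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
    assert(counts("a") === 2)
  }
}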
Hi All,
I am working with DataFrames and have been struggling with this; any
pointers would be helpful.
I've a JSON file with a schema like this:
 |-- links: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- desc: string (nullable = true)
 |    |    |-- id
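A sketch of one way to flatten it, using only the fields visible above (the DataFrame name df is assumed):
import org.apache.spark.sql.functions.explode
val links = df.select(explode(df("links")).as("link"))
links.select("link.desc", "link.id").show()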
Hi Marcelo, quick question.
I am using Spark 1.3 with YARN client mode. It is working well, provided
I manually pip-install all the 3rd-party libraries like numpy on the
executor nodes.
So does the SPARK-5479 fix in 1.5 which you mentioned address this as well?
Thanks.
On Thu, Jun 25,
The Cloudera blog has some details.
Please check if it is helpful to you.
http://blog.cloudera.com/blog/2014/12/new-in-cloudera-labs-sparkonhbase/
Thanks.
On Wed, May 20, 2015 at 4:21 AM, donhoff_h <165612...@qq.com> wrote:
> Hi, all
>
> I wrote a program to get HBaseConfiguration object in Spar
Lines with a: 24, Lines with b: 15
>
> The exception seems to be happening with Spark cleanup after executing
> your code. Try adding sc.stop() at the end of your program to see if the
> exception goes away.
>
> On Wednesday, December 31, 2014 6:40 AM, Naveen Madh
Hi All,
I am trying to run a sample Spark program using Scala and SBT.
Below is the program:
def main(args: Array[String]) {
  val logFile = "E:/ApacheSpark/usb/usb/spark/bin/README.md" // Should be some file on your system
  val sc = new SparkContext("local", "Simple App",
    "E:/ApacheSpark/