> If you need multiple MySQL tables to be accessed in a single Spark
> job, you can create a list of tables and run a map over that list. Something
> like:
>
> def getTable(tableName: String): DataFrame
> def saveTable(df: DataFrame): Unit
>
> val tables = sc.parallelize(tableNames)
> tables.map(getTable).foreach(saveTable)
>
> On Wed, Mar 22, 2017 at 9:34 AM, Shashank Mandil <
> mandil.shash...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am using Spark in YARN cluster mode.
>> When I run a YARN application it creates multiple executors on the Hadoop
>> datanodes for processing.
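
The getTable/saveTable fragment in the reply above is cut off in the digest. A fuller, self-contained sketch of reading several MySQL tables in one Spark job might look like the following; the JDBC URL, credentials, table names, and output paths are placeholders rather than anything from the thread, and the loop over table names runs on the driver instead of inside an RDD map, since a SparkSession cannot be used from executor-side tasks.

import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().appName("MultiTableCopy").getOrCreate()

// placeholder connection details -- not from the original thread
// (the MySQL JDBC driver must be on the classpath)
val jdbcUrl = "jdbc:mysql://dbhost:3306/mydb"
val props = new java.util.Properties()
props.setProperty("user", "dbuser")
props.setProperty("password", "dbpass")

// read one MySQL table into a DataFrame over JDBC
def getTable(tableName: String): DataFrame =
  spark.read.jdbc(jdbcUrl, tableName, props)

// write the DataFrame out; the format and path are illustrative
def saveTable(df: DataFrame, name: String): Unit =
  df.write.mode("overwrite").parquet(s"/tmp/output/$name")

// plain Scala collection on the driver
val tableNames = Seq("table1", "table2", "table3")
tableNames.foreach(name => saveTable(getTable(name), name))

Each spark.read.jdbc and write call is still executed as a distributed job; only the iteration over table names happens on the driver.
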
Hi All,
I am using Spark in YARN cluster mode.
When I run a YARN application it creates multiple executors on the Hadoop
datanodes for processing.
Is it possible for me to create a local Spark context (master=local) on
these executors to be able to get a Spark context?
Theoretically since eac
Hi All,
I have a Spark DataFrame which has 992 rows in it.
When I run a map on this DataFrame I expect the map to run for
all 992 rows.
Since a mapper runs on an executor in the cluster, I did a distributed count of
the number of rows the mapper actually runs on.
dataframe.map(r =>
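
The map snippet above is cut off in the digest. A minimal sketch of one way to do such a distributed row count, using a LongAccumulator, is below; the accumulator approach and every name in it are illustrative rather than taken from the original message.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("RowCountCheck").getOrCreate()
import spark.implicits._

// accumulator that each executor adds to as its tasks see rows
val rowsSeen = spark.sparkContext.longAccumulator("rowsSeen")

// stand-in for the real 992-row DataFrame from the message
val dataframe = spark.range(992).toDF("id")

dataframe.map { r =>
  rowsSeen.add(1)    // count the row on whichever executor processes it
  r.getLong(0)
}.count()            // an action is needed, otherwise the map never runs

println(s"rows processed by the mappers: ${rowsSeen.value}")

Accumulator updates made inside a transformation only become visible after an action runs the stage, and they can over-count if tasks are retried, which is worth remembering when comparing the total against the expected 992.
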
> > returned exit code 1, the use case touches Spark very little.
> >
> > What version is that? Do you see "There is an exception in the script
> > exiting with status 1" printed out to stdout?
> >
> > Regards,
> > Jacek Laskowski
> >
> > https://medium.com
Hi All,
I wrote a test script which always throws an exception as below:

import org.apache.spark.SparkConf

object Test {
  def main(args: Array[String]) {
    try {
      val conf = new SparkConf().setAppName("Test")
      throw new RuntimeException("Some Exception")
      println("all done!")
    } catch {
      // catch body reconstructed from the message quoted in the reply above
      case e: Exception =>
        println("There is an exception in the script exiting with status 1")
        System.exit(1)
    }
  }
}
Hi Aakash,
I think what it generally means is that you have to use the general Spark
DataFrame APIs to bring in the data and crunch the numbers; however, you
cannot use the KMeans clustering algorithm that is already present in
Spark's MLlib library.
I think a good place to start would be unders