Spark Data Frame. PreSorted partitions

2017-12-03 Thread Николай Ижиков
Cross-posting from @user. Hello, guys! I am working on an implementation of a custom DataSource for the Spark Data Frame API and have a question: if I have a `SELECT * FROM table1 ORDER BY some_column` query, I can sort the data inside a partition in my data source. Is there a built-in option to tell Spark tha

Spark Accumulators

2017-12-03 Thread Tejeshwar J1
Hi all, I would like to read a directory containing 100 files and increment the accumulator value by 1 whenever a file is read. Can anybody please help me out? Thanks, Tejeshwar