Hi all,

I have implemented a simple Scala object using Flink to play with the join
operator. After the join, to show my results, I decided to sort the output
by the first field (.sortPartition(0, Order.ASCENDING)). It seems that the
output is only ordered within groups: it shows two separate groups of
"Fyodor Dostoyevsky". Why is this happening? How do I sort the complete
DataSet?

Kind Regards,
Felipe

import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala.{ExecutionEnvironment, _}

object JoinBooksAndAuthors {
  val AUTHOR_ID_FIELD: Int = 0
  val AUTHOR_NAME_FIELD: Int = 1

  val BOOK_AUTHORID_FIELD: Int = 0
  val BOOK_YEAR_FIELD: Int = 1
  val BOOK_NAME_FIELD: Int = 2

  def main(args: Array[String]): Unit = {

    val env = ExecutionEnvironment.getExecutionEnvironment

    val authors = env.readCsvFile[(Int, String)](
      "downloads/authors.tsv",
      fieldDelimiter = "\t",
      lineDelimiter = "\n",
      includedFields = Array(0, 1)
    )

    val books = env.readCsvFile[(Int, Short, String)](
      "downloads/books.tsv",
      fieldDelimiter = "\t",
      lineDelimiter = "\n",
      includedFields = Array(0, 1, 2)
    )

    authors
      .join(books)
      .where(AUTHOR_ID_FIELD)
      .equalTo(BOOK_AUTHORID_FIELD)
      .map(tuple => (tuple._1._2, tuple._2._3))
      .sortPartition(0, Order.ASCENDING)
      .print()
  }
}

output

(Charles Bukowski,Women)
(Charles Bukowski,The Most Beautiful Woman in Town)
(Charles Bukowski,Hot Water Music)
(Charles Bukowski,Barfly)
(Charles Bukowski,Notes of a Dirty Old Man)
(Charles Bukowski,Ham on Rye)
(Fyodor Dostoyevsky,The Brothers Karamazov)
(Fyodor Dostoyevsky,The Double: A Petersburg Poem)
(Fyodor Dostoyevsky,Poor Folk)
(George Orwell,Coming Up for Air)
(George Orwell,Burmese Days)
(George Orwell,A Clergyman's Daughter)
(George Orwell,Down and Out in Paris and London)
(Albert Camus,The Plague)
(Fyodor Dostoyevsky,The Eternal Husband)
(Fyodor Dostoyevsky,The Gambler)
(Fyodor Dostoyevsky,The House of the Dead)
(Fyodor Dostoyevsky,Crime and Punishment)
(Fyodor Dostoyevsky,Netochka Nezvanova)
.....
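
A possible explanation, sketched in plain Scala without Flink: sortPartition
orders the records inside each parallel partition, not across partitions, so
printing the partitions one after another can show the same author in more
than one run. The object name, the two-partition split and the sample rows
below are hypothetical and only illustrate the effect. In Flink itself the
usual options would presumably be to run the sort with a parallelism of 1
(.sortPartition(0, Order.ASCENDING).setParallelism(1)) or to range-partition
on the sort field before sorting; I have not verified either against this job.

```scala
// Plain-Scala simulation (no Flink): sorting each partition separately
// versus sorting all rows together in a single partition.
object SortPartitionSketch {
  type Row = (String, String)

  // Hypothetical split of the joined rows across two parallel partitions.
  val partitions: Seq[Seq[Row]] = Seq(
    Seq(("George Orwell", "Burmese Days"), ("Fyodor Dostoyevsky", "Poor Folk")),
    Seq(("Fyodor Dostoyevsky", "The Gambler"), ("Albert Camus", "The Plague"))
  )

  // Each partition is sorted on field 0 independently, then the
  // partitions are concatenated in order (as when printing them).
  def sortedPerPartition: Seq[Row] = partitions.flatMap(_.sortBy(_._1))

  // All rows are gathered into one partition before sorting on field 0.
  def sortedGlobally: Seq[Row] = partitions.flatten.sortBy(_._1)

  def main(args: Array[String]): Unit = {
    println(sortedPerPartition.map(_._1).mkString(", "))
    println(sortedGlobally.map(_._1).mkString(", "))
  }
}
```

In the per-partition result "Fyodor Dostoyevsky" appears in two separate
places, which matches the shape of the output above; only the single-partition
sort yields one contiguous group per author.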






-- 
Felipe Oliveira Gutierrez
skype: felipe.o.gutierrez
https://felipeogutierrez.blogspot.com
