Hi. I want to parse a file and return key-value pairs with PySpark, but the result is strange to me. test.sql is a big file in which each line holds a username and a password separated by #. I use the mapper2 below to map the data, and in my understanding each i in words.take(10) should be a tuple. Instead, i is a single username or password string, which I don't understand. Thanks for your help.

def mapper2(line):
    words = line.split('#')
    return (words[0].strip(), words[1].strip())

def main2(sc):
    lines = sc.textFile("hdfs://master:9000/spark/test.sql")
    words = lines.flatMap(mapper2)
    for i in words.take(10):
        msg = i + ":" + "\n"

--
Rejoice,I Desire!
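
A likely explanation for the behaviour described above, sketched in plain Python so it runs without a cluster: flatMap iterates over whatever the mapper returns and emits each element as its own record, so a returned 2-tuple becomes two separate strings, whereas map keeps each tuple as one record. The helpers plain_map and flat_map and the sample lines are hypothetical stand-ins for the RDD methods and for test.sql; this only illustrates the semantics and is not Spark's implementation.

```python
from itertools import chain


def mapper2(line):
    # Split "username#password" into a (username, password) tuple.
    words = line.split('#')
    return (words[0].strip(), words[1].strip())


def plain_map(func, records):
    # Stand-in for RDD.map: exactly one output record per input record.
    return [func(r) for r in records]


def flat_map(func, records):
    # Stand-in for RDD.flatMap: each result is iterated and its
    # elements become separate output records.
    return list(chain.from_iterable(func(r) for r in records))


# Hypothetical sample data in the "username#password" layout.
lines = ["alice # secret1", "bob # secret2"]

mapped = plain_map(mapper2, lines)     # list of (username, password) tuples
flattened = flat_map(mapper2, lines)   # flat list of individual strings

for user, password in mapped:
    print(user + ":" + password)
```

If this is indeed the cause, replacing lines.flatMap(mapper2) with lines.map(mapper2) in main2 should make each i a (username, password) tuple that can be unpacked as in the loop above.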