Hi.
I want to parse a file and return key-value pairs with PySpark, but the
result is strange to me.
test.sql is a big file in which each line holds a username and a
password separated by #. I use mapper2 below to map the data. As I
understand it, each i in words.take(10) should be a tuple, but instead
i is a single username or a single password. I don't understand why.
Thanks for your help.

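def mapper2(line):
    # Split one "username#password" line into a (username, password) tuple.
    words = line.split('#')
    return (words[0].strip(), words[1].strip())

def main2(sc):
    # Read the file from HDFS, one record per line.
    lines = sc.textFile("hdfs://master:9000/spark/test.sql")
    words = lines.flatMap(mapper2)

    # Expected each i to be a tuple here; it is a plain string instead.
    for i in words.take(10):
        msg = i + ":" + "\n"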
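
A note on the likely cause, sketched in plain Python rather than Spark: flatMap flattens whatever iterable the mapped function returns, and a tuple is iterable, so each username and each password becomes its own record, whereas map keeps one tuple per line. The sample lines and the list-based stand-ins for map and flatMap below are assumptions for illustration only, not the real file or the Spark API.

```python
from itertools import chain

def mapper2(line):
    # Same mapper as in the question: returns a (username, password) tuple.
    words = line.split('#')
    return (words[0].strip(), words[1].strip())

# Hypothetical sample lines standing in for the contents of test.sql.
lines = ["alice # secret1", "bob # secret2"]

# map-like behaviour: one output record per input line, so tuples survive.
mapped = [mapper2(line) for line in lines]

# flatMap-like behaviour: each returned tuple is iterated and its elements
# become separate records, so the pairing is lost.
flat_mapped = list(chain.from_iterable(mapper2(line) for line in lines))

print(mapped)
print(flat_mapped)
```

If this is indeed what is happening, using lines.map(mapper2) in place of lines.flatMap(mapper2) in main2 should give tuples from words.take(10).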


-- 
Rejoice,I Desire!
