Re: [PySpark DataFrame] When a Row is not a Row

2015-07-12 Thread Davies Liu

Re: [PySpark DataFrame] When a Row is not a Row

2015-07-11 Thread Jerry Lam
?

Best Regards,
Jerry

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/PySpark-DataFrame-When-a-Row-is-not-a-Row-tp12210p13153.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-13 Thread Nicholas Chammas
Is there some way around this? For example, can Row just be an implementation of namedtuple throughout?

    from collections import namedtuple

    class Row(namedtuple):
        ...

From a user perspective, it’s confusing that there are 2 different implementations of the Row class with the same name. In my
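One way to read the suggestion above is a single public base class that every generated row class also inherits from, so that isinstance checks pass no matter how the row was built. The sketch below is plain Python 3 and only illustrates the idea; the names Row and make_row_class are hypothetical and this is not how pyspark implements Row.

```python
from collections import namedtuple


class Row(tuple):
    """Hypothetical shared base class for all rows (not pyspark's Row)."""
    __slots__ = ()


def make_row_class(*field_names):
    """Build a namedtuple-backed row class that also inherits from Row.

    Hypothetical helper, used here only for illustration.
    """
    fields = namedtuple("Row", field_names)
    # The namedtuple supplies field access; Row supplies the common type.
    return type("Row", (fields, Row), {"__slots__": ()})


PersonRow = make_row_class("name", "age")
person = PersonRow("Alice", 30)

by_name = person.name                  # access by field name
by_index = person[1]                   # access by position
passes_check = isinstance(person, Row)
```

With this layout, rows created on the fly and rows created by users would share one type for isinstance purposes, although type(person) == Row would still be False because PersonRow is a subclass.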

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-12 Thread Davies Liu
The class (called Row) for rows from Spark SQL is created on the fly and is different from pyspark.sql.Row (which is a public API for users to create Rows). The reason we did it this way is that we want better performance when accessing the columns. Basically, the rows are just named tuples
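The effect described here can be reproduced without Spark. The following is a minimal sketch in plain Python 3, assuming a stand-in public class named Row and a hypothetical helper create_row_class that builds a namedtuple on the fly; neither is taken from the pyspark source.

```python
from collections import namedtuple


class Row(tuple):
    """Stand-in for a public, user-facing Row class (hypothetical)."""


def create_row_class(field_names):
    """Create a row class on the fly, also named "Row" (hypothetical helper)."""
    return namedtuple("Row", field_names)


GeneratedRow = create_row_class(["age", "name"])
a = GeneratedRow(age=30, name="Alice")   # row built from the generated class
b = Row((30, "Alice"))                   # row built from the public class

same_name = type(a).__name__ == Row.__name__   # both classes are called "Row"
same_type = type(a) == Row                     # but they are distinct types
is_instance = isinstance(a, Row)               # and unrelated by inheritance
same_values = tuple(a) == tuple(b)             # the contents still match
```

Two classes that share a name are still separate objects to Python, which is why both the type comparison and the isinstance check come out False in a setup like this.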

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Ted Yu
In Row#equals():

    while (i < len) {
      if (apply(i) != that.apply(i)) {

'!=' should be !apply(i).equals(that.apply(i)) ?

Cheers

On Mon, May 11, 2015 at 1:49 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> This is really strange.
>
> >>> # Spark 1.3.1
> >>> print type(resu

[PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Nicholas Chammas
This is really strange.

>>> # Spark 1.3.1
>>> print type(results)
>>> a = results.take(1)[0]
>>> print type(a)
>>> print pyspark.sql.types.Row
>>> print type(a) == pyspark.sql.types.Row
False
>>> print isinstance(a, pyspark.sql.types.Row)
False

If I set a as follows, then the type checks p