Re: [PySpark DataFrame] When a Row is not a Row

2015-07-12 Thread Davies Liu
We finally fixed this in 1.5 (the next release); see https://github.com/apache/spark/pull/7301

On Sat, Jul 11, 2015 at 10:32 PM, Jerry Lam wrote:
> Hi guys,
>
> I just hit the same problem. It is very confusing when Row is not the same
> Row type at runtime. The worst thing is that when I use Spark in

Re: [PySpark DataFrame] When a Row is not a Row

2015-07-11 Thread Jerry Lam
Hi guys,

I just hit the same problem. It is very confusing when Row is not the same Row type at runtime. The worst thing is that when I use Spark in local mode, the Row is the same Row type! So it passes the test cases but fails when I deploy the application.

Can someone suggest a workaround?

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-13 Thread Nicholas Chammas
Is there some way around this? For example, can Row just be an implementation of namedtuple throughout?

    from collections import namedtuple
    class Row(namedtuple): ...

From a user perspective, it’s confusing that there are 2 different implementations of the Row class with the same name. In my
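
For readers following the thread, the namedtuple idea in this message might look roughly like the sketch below. This is only an illustration, not PySpark's actual implementation: `make_row_class` and `as_dict` are hypothetical names invented here. Because `namedtuple` is a factory function rather than a base class, the sketch subclasses the class it returns.

```python
from collections import namedtuple


def make_row_class(*field_names):
    """Hypothetical helper: build a Row class for a fixed set of field names."""
    # namedtuple() returns a new tuple subclass; we subclass that result
    # instead of namedtuple itself, which is a function and not a class.
    base = namedtuple("Row", field_names)

    class Row(base):
        __slots__ = ()  # keep instances as lightweight as plain tuples

        def as_dict(self):
            # Hypothetical convenience method: map field names to values.
            return dict(zip(self._fields, self))

    return Row


# Usage: every row built from the same class passes the same isinstance check.
Person = make_row_class("name", "age")
row = Person(name="Alice", age=1)
assert isinstance(row, Person)
assert isinstance(row, tuple)
```

With a single class per schema, code that tests `isinstance(row, Person)` sees the same type wherever the row was created, which is the property the message is asking for.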

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Ted Yu
In Row#equals():

    while (i < len) {
      if (apply(i) != that.apply(i)) {

'!=' should be !apply(i).equals(that.apply(i)) ?

Cheers

On Mon, May 11, 2015 at 1:49 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
> This is really strange.
>
> >>> # Spark 1.3.1
> >>> print type(resu