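Hi Gil,

You would also need to prune each resulting Row based on the requested columns. With PrunedScan, buildScan has to return Rows that contain only the requiredColumns, in the order they were requested. Returning the full rows, as the TableScan version does, leads to the empty or misordered columns you saw.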
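Roughly, the pruning could look like the sketch below. It is untested against spark-csv, and the names `Pruning`, `columnIndices`, `prune`, and `fullRows` are placeholders of mine rather than anything in CsvRelation. The idea is to resolve each requested column name to its position in the full schema once, then project every row through those positions.

```scala
// Sketch only: Pruning, columnIndices and prune are hypothetical helpers,
// not part of spark-csv or Spark.
object Pruning {
  /** Positions of the requested columns in the full schema, in the requested order. */
  def columnIndices(schemaFields: Seq[String], requiredColumns: Array[String]): Array[Int] =
    requiredColumns.map { name =>
      val idx = schemaFields.indexOf(name)
      require(idx >= 0, s"Unknown column: $name")
      idx
    }

  /** Keep only the requested values, reordered to match requiredColumns. */
  def prune(fullRow: Seq[Any], indices: Array[Int]): Seq[Any] =
    indices.toSeq.map(i => fullRow(i))
}

// Assumed usage inside CsvRelation, where `fullRows` stands for the RDD[Row]
// that the unmodified TableScan version already builds:
//
//   def buildScan(requiredColumns: Array[String]): RDD[Row] = {
//     val indices = Pruning.columnIndices(schema.fieldNames, requiredColumns)
//     fullRows.map(row => Row.fromSeq(Pruning.prune(row.toSeq, indices)))
//   }
```

Computing the indices once outside the map keeps the per-row work to a simple lookup. A real implementation would probably avoid parsing the unneeded fields at all, but that is a separate optimization.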
Ram

> On Jul 7, 2015, at 3:12 AM, Gil Vernik <g...@il.ibm.com> wrote:
>
> Hi All,
>
> I wanted to experiment a little with TableScan and PrunedScan.
> My first test was to print the columns requested by various SQL queries.
> To make this test easier, I took spark-csv and replaced TableScan with
> PrunedScan.
> I then changed the buildScan method of CsvRelation from
>
> def buildScan = {
>
> to
>
> def buildScan(requiredColumns: Array[String]) = {…
>
> This was the only modification I made to CsvRelation.scala. I also added a
> print of requiredColumns to the log.
>
> I then took the same CSV file and ran a very simple SELECT query on it.
> I noticed that when CsvRelation used TableScan, everything worked correctly.
> But when I used PrunedScan, it did not work and returned empty columns or
> columns in the wrong order.
>
> Why does this happen? Is it a bug? I thought that PrunedScan is
> supposed to work exactly the same as TableScan, so that I can freely change
> TableScan to PrunedScan. I thought the only difference is that buildScan
> of PrunedScan takes requiredColumns as a parameter.
>
> Can someone explain the behavior I saw?
>
> I am using Spark 1.5 from trunk.
> Thanks a lot
> Gil.