Re: TableScan vs PrunedScan

Ram Sriharsha Tue, 07 Jul 2015 06:50:41 -0700

Hi Gil

With PrunedScan you also need to prune each resulting Row so that it contains 
only the requested columns, in the order given by requiredColumns.
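
A minimal sketch of that pruning step, written in plain Scala so it runs without Spark on the classpath. The names `PruneSketch`, `pruneRow`, `schemaFields` and the sample values are hypothetical and do not come from spark-csv; in a real relation the projected values would presumably be wrapped with `Row.fromSeq` before being returned from buildScan.

```scala
object PruneSketch {
  // Hypothetical schema: column names in the order they appear in the file.
  val schemaFields: Array[String] = Array("year", "make", "model")

  // Project one parsed line onto the requested columns, keeping the
  // order of requiredColumns rather than the order of the file.
  def pruneRow(fields: Array[String],
               tokens: Array[String],
               requiredColumns: Array[String]): Seq[Any] = {
    // Look up the position of every requested column in the full schema.
    val indices = requiredColumns.map { name =>
      val i = fields.indexOf(name)
      require(i >= 0, s"unknown column: $name")
      i
    }
    // Keep only the requested values, in the requested order.
    indices.map(i => tokens(i)).toSeq
  }

  def main(args: Array[String]): Unit = {
    val tokens = Array("2012", "Tesla", "S")
    // Requesting (model, year) yields two values, model first.
    println(pruneRow(schemaFields, tokens, Array("model", "year")))
  }
}
```

Returning the full, unpruned line while the query planner expects only requiredColumns would be consistent with the shifted or empty columns described below.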


Ram


> On Jul 7, 2015, at 3:12 AM, Gil Vernik <g...@il.ibm.com> wrote:
> 
> Hi All, 
> 
> I wanted to experiment a little with TableScan and PrunedScan. 
> My first test was to print the columns from various SQL queries. 
> To make this test easier, I took spark-csv and replaced TableScan with 
> PrunedScan. 
> I then changed buildScan method of CsvRelation from 
> 
> def buildScan = { 
> 
> to  
> 
> def buildScan(requiredColumns: Array[String]) = {… 
> 
> This was the only modification I made to CsvRelation.scala. I also added a 
> print of requiredColumns to the log. 
> 
> I then took the same CSV file and ran a very simple SELECT query on it. 
> I noticed that when CsvRelation used TableScan, everything worked correctly. 
> But when I used PrunedScan, it didn’t work: it returned empty columns or 
> columns in the wrong order. 
> 
> Why does this happen? Is it a bug? I thought that PrunedScan is 
> supposed to work exactly the same as TableScan, so that I can freely change 
> TableScan to PrunedScan. I thought that the only difference is that buildScan 
> of PrunedScan takes requiredColumns as a parameter. 
> 
> Can someone explain the behavior I saw? 
> 
> I am using Spark 1.5 from trunk. 
> Thanks a lot 
> Gil.
