> > val newSchemaRDD = sqlContext.applySchema(existingSchemaRDD, existingSchemaRDD.schema)
This line throws away the logical plan information about existingSchemaRDD, so Spark SQL can't push projections or predicates down past this operator. Can you describe in more detail the problems you see if you don't reapply the schema?