I have a data frame into which I load data from a Hive table, and my issue is that the data frame appears to be missing the columns I need to query.
For example:

    val newdataset = dataset.where(dataset("label") === 1)

gives me an error like the following:

    ERROR yarn.ApplicationMaster: User class threw exception: resolved attributes label missing from label, user_id, ... (the rest of the fields of my table)
    org.apache.spark.sql.AnalysisException: resolved attributes label missing from label, user_id, ... (the rest of the fields of my table)

where we can see that the label field actually exists. I managed to solve this issue by updating my syntax to:

    val newdataset = dataset.where($"label" === 1)

which works. However, I cannot use this trick in all my queries. For example, when I try to do a unionAll of two subsets of the same data frame, the error I get is that all my fields are missing.

Can someone tell me if I need to do some post-processing after loading from Hive in order to avoid this kind of error?

Thanks
--
Cesar Flores
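For reference, here is a minimal, self-contained sketch of the two filter forms described above. It uses a small in-memory DataFrame as a hypothetical stand-in for the Hive table (the column names `label` and `user_id` are just placeholders taken from the error message), and it assumes a Spark version with `SparkSession`; on older versions a `SQLContext`/`HiveContext` plays the same role:

```scala
import org.apache.spark.sql.SparkSession

object ResolvedAttributeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("resolved-attribute-sketch")
      .master("local[*]")
      .getOrCreate()
    // Needed for the $"colName" column syntax below.
    import spark.implicits._

    // Hypothetical stand-in for the Hive table: two columns, label and user_id.
    val dataset = Seq((1, "a"), (0, "b"), (1, "c")).toDF("label", "user_id")

    // Form that triggered the AnalysisException in the question
    // (references a column through the original DataFrame):
    // val filtered = dataset.where(dataset("label") === 1)

    // Workaround form from the question, using an unresolved column
    // reference that the analyzer binds to the current plan:
    val filtered = dataset.where($"label" === 1)
    filtered.show()

    spark.stop()
  }
}
```

The difference between the two forms is only in how the column is referenced: `dataset("label")` carries a resolved attribute tied to one specific plan, while `$"label"` is resolved fresh against whatever DataFrame it is applied to.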