I have a data frame into which I load data from a Hive table, and my issue is
that Spark reports the columns that I need to query as missing.

For example:

val newdataset = dataset.where(dataset("label") === 1)

gives me an error like the following:

ERROR yarn.ApplicationMaster: User class threw exception: resolved
attributes label missing from label, user_id, ... (the rest of the fields of
my table)
org.apache.spark.sql.AnalysisException: resolved attributes label missing
from label, user_id, ... (the rest of the fields of my table)

where we can see that the label field actually exists. I managed to work
around this issue by updating my syntax to:

val newdataset = dataset.where($"label" === 1)

which works. However, I cannot apply this trick in all my queries. For
example, when I try to do a unionAll of two subsets of the same data
frame, the error I get is that all my fields are missing.
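For reference, a minimal sketch of the kind of query that fails for me
(`dataset` is the data frame loaded from Hive as above; the exact column
values are illustrative):

    // Two subsets of the same data frame, filtered with the $"..." syntax
    val positives = dataset.where($"label" === 1)
    val negatives = dataset.where($"label" === 0)

    // The union of the two subsets is where the analyzer reports that
    // all of the fields are missing
    val combined = positives.unionAll(negatives)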

Can someone tell me if I need to do some post-processing after loading from
Hive in order to avoid this kind of error?


Thanks
-- 
Cesar Flores
