I don't think an inner join will solve my problem.
*For each row in* paramsDataset, I need to filter myDataset. Then I need
to run a bunch of calculations on the filtered myDataset.
Say, for example, paramsDataset has three employee age ranges, e.g.
20-30, 30-50, 50-60, and regions USA, Canada.
myDataset
What columns do you want to filter myDataset on? What are the corresponding
columns in paramsDataset?
You can easily do what you want using an inner join. For example, if tempview
and paramsview both have a column, say employeeID, you can do this with the SQL
sparkSession.sql("Select * from tem
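To make the join idea concrete for the age-range and region example from this thread, here is a minimal sketch. It is not the query from the message above (that one is cut off). It uses Python's built-in sqlite3 as a stand-in for Spark SQL so it runs anywhere. The table names tempview and paramsview come from the thread. The columns age, region, salary, min_age and max_age, the sample employee rows, and the half-open treatment of the ranges (20 <= age < 30) are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Spark SQL here; the join itself is plain SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tempview (employeeID INTEGER, age INTEGER, region TEXT, salary REAL);
    CREATE TABLE paramsview (min_age INTEGER, max_age INTEGER, region TEXT);
""")

# Hypothetical employee rows, made up for illustration only.
conn.executemany("INSERT INTO tempview VALUES (?, ?, ?, ?)", [
    (1, 25, "USA", 50000.0),
    (2, 28, "USA", 60000.0),
    (3, 35, "Canada", 70000.0),
    (4, 55, "USA", 90000.0),
])

# One parameter row per (age range, region) pair from the example.
age_ranges = [(20, 30), (30, 50), (50, 60)]
regions = ["USA", "Canada"]
conn.executemany(
    "INSERT INTO paramsview VALUES (?, ?, ?)",
    [(lo, hi, r) for (lo, hi) in age_ranges for r in regions],
)

# The join condition plays the role of the per-parameter-row filter;
# GROUP BY then runs the calculations once per parameter row.
# Ranges are treated as half-open (assumption): min_age <= age < max_age.
query = """
    SELECT p.min_age, p.max_age, p.region,
           COUNT(*) AS employees, AVG(t.salary) AS avg_salary
    FROM paramsview p
    JOIN tempview t
      ON t.region = p.region
     AND t.age >= p.min_age
     AND t.age < p.max_age
    GROUP BY p.min_age, p.max_age, p.region
    ORDER BY p.min_age, p.region
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

Because this is an inner join, parameter rows that match no employees simply drop out of the result. In Spark, the same query text could presumably be passed to sparkSession.sql once both datasets are registered as temp views (for example with createOrReplaceTempView), though whether a single grouped query covers your "bunch of calculations" depends on what those calculations are.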
Hi,
I have one dataset with parameters and another with data that needs to
be filtered based on the first dataset (the parameter dataset).
*Scenario is as follows:*
For each row in the parameter dataset, I need to apply the parameter row to
the second dataset. I will end up having multiple datase