Hi All,

I was wondering whether there is an established experimental design for tuning
the parameters of the ALS algorithm in Mahout, so that we can compare its
recommendations with those from another algorithm.

My datasets contain implicit feedback, and I would like to use the following
design for tuning the ALS parameters (alpha, lambda, numFeatures).

1. Split the data such that, for each user, 50% of the clicks go to the
training set, 25% to validation, and 25% to test.

2. Compute the user and item feature matrices by running ALS on the
training data, and evaluate on the validation set. (We can pick the parameters
that minimize the RMSE; for implicit data, the error is Pui - XY'.)
3. Once we find the parameters that give the best RMSE on the
validation set, use the user and item matrices generated with those parameters
to predict the top-k items, and evaluate them against the items in the test set
(compute mean average precision).
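In case it helps to make the design concrete, here is a minimal NumPy sketch of steps 1 and 2 (illustrative Python, not Mahout code; the ALS update is the implicit-feedback formulation with confidence weights that Mahout's implicit ALS follows, and all function and parameter names here are my own, not Mahout's):

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(42)

def split_per_user(clicks):
    """Step 1: per user, ~50% of clicks to train, 25% to validation, 25% to test."""
    by_user = defaultdict(list)
    for user, item in clicks:
        by_user[user].append(item)
    train, valid, test = [], [], []
    for user, items in by_user.items():
        rng.shuffle(items)
        n = len(items)
        train += [(user, i) for i in items[: n // 2]]
        valid += [(user, i) for i in items[n // 2 : 3 * n // 4]]
        test  += [(user, i) for i in items[3 * n // 4 :]]
    return train, valid, test

def als_implicit(train, n_users, n_items, num_features, lam, alpha, iters=5):
    """Step 2: implicit-feedback ALS.  Preference P_ui = 1 for observed clicks,
    confidence C_ui = 1 + alpha * count (dense matrices purely for readability)."""
    C = np.ones((n_users, n_items))
    P = np.zeros((n_users, n_items))
    for u, i in train:
        C[u, i] += alpha
        P[u, i] = 1.0
    X = 0.1 * rng.standard_normal((n_users, num_features))
    Y = 0.1 * rng.standard_normal((n_items, num_features))
    reg = lam * np.eye(num_features)
    for _ in range(iters):
        for u in range(n_users):                      # re-solve every user vector
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + reg, Y.T @ Cu @ P[u])
        for i in range(n_items):                      # re-solve every item vector
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + reg, X.T @ Ci @ P[:, i])
    return X, Y

def validation_rmse(pairs, X, Y):
    """Error on held-out clicks: P_ui - x_u . y_i with P_ui = 1."""
    errs = [1.0 - X[u] @ Y[i] for u, i in pairs]
    return float(np.sqrt(np.mean(np.square(errs))))
```

The parameter search in step 2 would then loop over a grid of (alpha, lambda, numFeatures) triples, keeping the one with the lowest validation_rmse, and step 3 would rank items for each user by X @ Y.T.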

Although the above setup looks reasonable, I have a few questions:

1. Do we have to follow this setup to compare algorithms? Can't we simply
report the parameter combination that gives the highest mean average
precision on the test data when trained on the training set, without any
validation set?
2. Do we have to tune the "similarityclass" parameter in item-based CF? If
so, do we compare the mean average precision values on the validation
data, and then report the result on the test set?
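Whichever knob is being tuned (similarityclass in item-based CF, or alpha/lambda/numFeatures in ALS), the comparison on validation and the final report on test would both use the same mean-average-precision computation. A minimal sketch of MAP@k (again my own code, not a Mahout API):

```python
def average_precision_at_k(recommended, relevant, k=10):
    """AP@k for one user: precision at each rank where a relevant item
    appears, averaged over min(len(relevant), k)."""
    relevant = set(relevant)
    if not relevant:
        return None  # user has no held-out items; skip this user
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k)

def mean_average_precision(recs_by_user, held_out_by_user, k=10):
    """MAP@k, averaged over users that have held-out items."""
    aps = [average_precision_at_k(recs_by_user.get(u, []), items, k)
           for u, items in held_out_by_user.items()]
    aps = [a for a in aps if a is not None]
    return sum(aps) / len(aps) if aps else 0.0
```

For model selection one would call this with validation items as the held-out set, and for the final report with test items, so the test set is touched only once per algorithm.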

My ultimate objective is to compare different algorithms, but I am confused
about how to compare the best results (after parameter tuning) across
algorithms. Are there any publications that explain this in detail? Any
help/comments on the design of the experiments would be much appreciated.

Thanks,
Rohit
