I have a dataset which has several predictor variables and a dependent
variable, "score" (which is numeric). The score for each row is calculated
using a formula which uses some of the predictor variables. However, the
"score" values themselves are not given in the dataset. The rows are only
ordered by score, and the resulting ranks are given (1, 2, 3, 4, etc.; rank 1
means that the particular row had the highest score, 2 means it had the second
highest score, and so on). So, if the data has 100 rows, the output has ranks
from 1 to 100.
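
To make the structure concrete, here is a minimal simulated sketch of what the data looks like. The variable names (x1, x2, x3) and the scoring formula are made up for illustration only; they are not my real variables or my real formula.

```r
# Hypothetical toy data: x1, x2, x3 and the scoring formula are invented
# purely to illustrate the structure of the problem.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Unobserved score, computed from only some of the predictors
score <- 2 * dat$x1 - dat$x2^2

# What is actually observed: rank 1 = highest score, rank n = lowest
dat$rank <- rank(-score, ties.method = "first")

# Show the top-ranked rows; note that 'score' is not a column of 'dat'
head(dat[order(dat$rank), ])
```
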
I don't think it would be proper to treat the output column as a numeric one,
since it is an ordinal variable, and the distance (difference in scores)
between ranks 1 and 2 may not be the same as that between ranks 2 and 3.
However, most R regression models for ordinal regression are made for output
such as (high, medium, low), where each level of the output does not
necessarily correspond to a unique row. In my case, each output (rank)
corresponds to a unique row.
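
Both concerns can be seen in a small simulated sketch. The scores below are hypothetical (drawn from an exponential distribution just to get uneven gaps), since my real scores are not observed.

```r
# Hypothetical skewed scores, for illustration only
set.seed(1)
score <- rexp(6)
rnk <- rank(-score, ties.method = "first")  # rank 1 = highest score

# Gaps in score between consecutive ranks (1 vs 2, 2 vs 3, ...):
# these are generally unequal, although the ranks always differ by 1
ord <- order(rnk)
gaps <- -diff(score[ord])
gaps

# As an ordered factor, every row ends up with its own level
y <- factor(rnk, levels = sort(rnk), ordered = TRUE)
nlevels(y) == length(y)
```
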
Could you please suggest which models I could use for this problem? Would
treating the output as numeric instead of ordinal be a reasonable
approximation? Or would the usual models for ordinal regression work on this
dataset as well?
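
For reference, the numeric approximation I am asking about would look roughly like the sketch below. The data, variable names, and scoring formula are simulated placeholders rather than my real dataset, and using lm() on the ranks with a Spearman correlation as the check is only one way I could imagine doing it, not something I know to be appropriate.

```r
# Hypothetical toy data (same invented setup: not my real variables)
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
score <- 2 * dat$x1 - dat$x2^2
dat$rank <- rank(-score, ties.method = "first")  # rank 1 = highest score

# Treat the rank as a plain numeric response
fit <- lm(rank ~ x1 + x2 + x3, data = dat)

# Convert fitted values back to ranks and compare the orderings
pred_rank <- rank(fitted(fit), ties.method = "first")
rho <- cor(pred_rank, dat$rank, method = "spearman")
rho
```

Since only the ordering is known, I compare predicted and observed ranks with a rank correlation here rather than looking at the residuals on the rank scale.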