On Jun 25, 2011, at 2:13 PM, Håvard Wahl Kongsgård wrote:

Hi, sorry, my question was not really clear.

 |Are you very early in efforts at learning R?
No, I have been a long-term user of R, but I only use R for statistical work.

The heart of the issue is that I have a list of keywords that I want to
analyse with a machine learning algorithm (20,000 keywords with a
response variable). It is much like microarray data, but in my case
the variables are keywords rather than genes. To get it to work in R, I
could create a data frame with multiple columns, each one a factor
holding keywords.
That would look like this:
V1, V2
"Harry", "Kline"
"Brown", "Larry"

If I am not mistaken, using V1 and V2 with the standard glm
function would give a result like

glm( V0 ~ "HARRY"  + "KLINE" + "Brown" + "Larry")

No, you are mistaken. Assuming that V0 is defined and its length equals the number of rows in that data.frame, V1 and V2 would be the terms in the formula, and they would not be quoted.

?glm   # and work through the examples
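
To make the point about unquoted column names concrete, here is a minimal sketch. The response V0, its values, and the binomial family are assumptions made up for illustration; only the V1/V2 keyword columns come from the example above.

```r
# Toy data: V0 is a hypothetical 0/1 response, invented for this sketch.
d <- data.frame(
  V0 = rep(c(0, 1, 1, 0, 1), 8),
  V1 = factor(rep(c("Harry", "Brown"), 20)),
  V2 = factor(rep(c("Kline", "Larry"), each = 20))
)

# The formula names the columns, unquoted; glm() expands each factor
# into indicator (dummy) columns internally.
fit <- glm(V0 ~ V1 + V2, data = d, family = binomial)

# One coefficient per non-reference level, named column + level:
names(coef(fit))
# "(Intercept)" "V1Harry" "V2Larry"
```

The first level of each factor in alphabetical order ("Brown", "Kline") is absorbed into the intercept as the reference level.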


Or I could create a large indicator array where the presence or absence
of each keyword is coded as 1 or 0.
Would I get the same result if I used that with glm?

But is there a better approach?

-Håvard
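
On the 0/1 coding question, a small sketch may help. It assumes the same made-up toy data as before (V0 and the binomial family are hypothetical) and uses model.matrix() to build the indicator columns by hand. Under those assumptions the two fits should agree, since glm() builds the same design matrix from the factors itself.

```r
# Toy data: V0 is a hypothetical 0/1 response, invented for this sketch.
d <- data.frame(
  V0 = rep(c(0, 1, 1, 0, 1), 8),
  V1 = factor(rep(c("Harry", "Brown"), 20)),
  V2 = factor(rep(c("Kline", "Larry"), each = 20))
)

# Build the 0/1 indicator columns explicitly and drop the intercept
# column, since glm() adds its own.
X <- model.matrix(~ V1 + V2, data = d)[, -1]

fit_factor <- glm(V0 ~ V1 + V2, data = d, family = binomial)
fit_dummy  <- glm(d$V0 ~ X, family = binomial)

# Compare the estimates, ignoring the differing coefficient names.
same <- isTRUE(all.equal(unname(coef(fit_factor)), unname(coef(fit_dummy))))
same
```

Coding the indicators by hand changes the bookkeeping, not the model, so for a plain glm the factor version is usually the less error-prone of the two.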

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
