[R] randomly subsample rows from subsets
Hi,

I have a list of 1787 fish from 948 full-sib families and their lengths. My table looks like this:

fish fam length
1    a   71.46
2    a   71.06
3    a   62.94
4    b   79.46
5    b   52.38
6    b   56.78
7    b   92.08
8    c   96.86
9    d   98.09
10   d   17.23
11   d   98.35
12   d   82.43
13   e   83.85
14   e   33.92
15   e   23.16
16   e   31.39
17   e   57.08
18   e   27.05
19   f   62.38
20   f   83.21
21   f   18.72
22   f   84.32
23   g   15.99
24   h   40.33
25   h   92.73
26   h   59.08
27   i   29.05

I want to randomly select 2 fish from each family that has 2 or more individuals, and exclude the families that have just one fish. How can I do that?

Thanks,

--
View this message in context: http://r.789695.n4.nabble.com/randomly-subsample-rows-from-subsets-tp4483477p4483477.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
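One possible way to do this in base R, offered as a minimal sketch rather than a definitive answer: split the row indices by family, drop the families with fewer than 2 fish, and draw 2 rows from each of the rest. The column names fish, fam and length are assumed from the example table, and the names rows_by_fam, big_enough, picked and subsample are made up for illustration.

```r
# Example data copied from the post; the column names are an assumption.
fish <- data.frame(
  fish   = 1:27,
  fam    = rep(c("a", "b", "c", "d", "e", "f", "g", "h", "i"),
               times = c(3, 4, 1, 4, 6, 4, 1, 3, 1)),
  length = c(71.46, 71.06, 62.94, 79.46, 52.38, 56.78, 92.08, 96.86,
             98.09, 17.23, 98.35, 82.43, 83.85, 33.92, 23.16, 31.39,
             57.08, 27.05, 62.38, 83.21, 18.72, 84.32, 15.99, 40.33,
             92.73, 59.08, 29.05)
)

set.seed(1)  # only so that the random draw is repeatable

# Row indices grouped by family
rows_by_fam <- split(seq_len(nrow(fish)), fish$fam)

# Keep only the families with at least 2 fish
big_enough <- rows_by_fam[lengths(rows_by_fam) >= 2]

# Draw 2 row indices per remaining family (sampling positions, not values,
# which avoids the special behaviour of sample() on a single number)
picked <- unlist(lapply(big_enough,
                        function(idx) idx[sample.int(length(idx), 2)]))

subsample <- fish[picked, ]
print(subsample)
```

With the example table this leaves 6 families (a, b, d, e, f, h) and therefore 12 rows; families c, g and i are dropped because they have a single fish.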
[R] Reshape from long to wide
Hi,

I'm a total beginner in R and this question is probably very simple, but I've spent hours reading about it and can't find the answer. I'm trying to reshape a data table from long to wide format. I've tried reshape() and cast(), but I get error messages every time and I can't figure out why.

In my data, I have the lengths of two fish from each family. My data table (called fish) looks like this:

family length
14     18
14     7
15     7
15     21
17     50
17     21
18     36
18     21
20     36
20     42
24     56
24     42
25     43
25     56
27     15
27     42
28     7
28     42
29     56
29     49

I want it to look like this:

family kid1 kid2
14     18   7
15     7    21
17     50   21
18     36   21
20     36   42
24     56   42
25     43   56
27     15   42
28     7    42
29     56   49

I've tried:

> cast(fish, fam~length)

and got the error message:

Using length as value column. Use the value argument to cast to override this choice
Error in `[.data.frame`(data, , variables, drop = FALSE) :
  undefined columns selected

Then I renamed the columns:

> myvars <- c("fam", "length")
> fish <- fish[myvars]

and tried cast() again with no luck (same error).

With reshape() I don't get the result I want:

> reshape(rdm1, timevar="fam", idvar=c("length"), direction="wide")
> head(first)
length 14.2014 14.19 7 15.2521 17.3050 18.3236 20.3642

Can someone help with this? Thanks a lot!

--
View this message in context: http://r.789695.n4.nabble.com/Reshape-from-long-to-wide-tp4486875p4486875.html
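A hedged sketch of one base R route, not necessarily what the list replies suggested: the long table has nothing that says which fish is "kid1" and which is "kid2", so a within-family counter is added first and then used as the timevar in reshape(). The counter column kid and the result name wide are made up for illustration. The cast() call in the post probably failed because the column is called family rather than fam, although the exact cause can't be confirmed from the post alone.

```r
# Example data copied from the post (long format, two fish per family)
fish <- data.frame(
  family = rep(c(14, 15, 17, 18, 20, 24, 25, 27, 28, 29), each = 2),
  length = c(18, 7, 7, 21, 50, 21, 36, 21, 36, 42,
             56, 42, 43, 56, 15, 42, 7, 42, 56, 49)
)

# Number the fish within each family (1, 2, ...); this is the missing
# "which kid is it" variable that the wide format needs.
fish$kid <- ave(seq_along(fish$family), fish$family, FUN = seq_along)

# One row per family, one length column per kid
wide <- reshape(fish, idvar = "family", timevar = "kid",
                v.names = "length", direction = "wide", sep = "")

# reshape() names the new columns length1, length2; rename to kid1, kid2
names(wide) <- sub("^length", "kid", names(wide))
print(wide)
```

The same counter column would also be needed for a cast()-style solution, since any long-to-wide reshape needs a variable that identifies the target column.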
Re: [R] Reshape from long to wide
Thanks a lot, I tried one of the ways you guys showed me and it totally worked. Just for fun, I tried all the others, and with some modifications here and there they work fine too. It was time-consuming but definitely worth it as a learning experience.

Thanks again

--
View this message in context: http://r.789695.n4.nabble.com/Reshape-from-long-to-wide-tp4486875p4494055.html
[R] Randomly select elements based on criteria
Hi,

I want to randomly pick 2 fish born on the same day, but I need those individuals to be from different families. My table includes 1787 fish distributed across 948 families. An example of a subset of fish born on one specific day would look like:

> fish
fam born spawn
25  46   43
25  46   56
26  46   50
43  46   43
131 46   43
133 46   64
136 46   43
136 46   42
136 46   50
136 46   85
137 46   64
142 46   85
144 46   56
144 46   64
144 46   78
144 46   85
145 46   64
146 46   64
147 46   64
148 46   78
149 46   43
149 46   98
149 46   85
150 46   64
150 46   78
150 46   85
151 46   43
152 46   78
153 46   43
156 46   43
157 46   91
158 46   42

Where "fam" is the family the fish belongs to, "born" is the day it was born (in this case day 46), and "spawn" is the day it was spawned. I want to know whether there is a correlation in the day of spawn between fish that were born on the same day but are unrelated (not from the same family).

I want to randomly select two rows, but they have to be from different fam. I got the first part (the random selection) by doing:

> ran <- sample(nrow(fish), size=2); ran
[1]  9 12
> newfish <- fish[ran,]; newfish
    fam born spawn
103 136   46    50
106 142   46    85

In this example I got two individuals from different families (good), but I will repeat the process many times and there's a chance that I get two fish from the same family (bad):

> ran <- sample(nrow(fish), size=2); ran
[1] 26 25
> newfish <- fish[ran,]; newfish
    fam born spawn
127 150   46    85
126 150   46    78

I need a conditional but I have no clue how to include it in the code.

Thanks in advance for any suggestions,
Aly

--
View this message in context: http://r.789695.n4.nabble.com/Randomly-select-elements-based-on-criteria-tp4496483p4496483.html
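One way to add the conditional is a simple rejection loop: keep drawing two rows until their families differ. This is a sketch under assumptions, not a definitive answer; the function name pick_unrelated_pair and its fam_col argument are made up for illustration, and the data are the day-46 example from the post without its original row names.

```r
# Day-46 example from the post
fish <- data.frame(
  fam   = c(25, 25, 26, 43, 131, 133, 136, 136, 136, 136, 137, 142,
            144, 144, 144, 144, 145, 146, 147, 148, 149, 149, 149,
            150, 150, 150, 151, 152, 153, 156, 157, 158),
  born  = 46,
  spawn = c(43, 56, 50, 43, 43, 64, 43, 42, 50, 85, 64, 85,
            56, 64, 78, 85, 64, 64, 64, 78, 43, 98, 85,
            64, 78, 85, 43, 78, 43, 43, 91, 42)
)

# Draw 2 rows at random; redraw whenever both come from the same family.
pick_unrelated_pair <- function(d, fam_col = "fam") {
  # With fewer than 2 families the loop below could never finish
  stopifnot(length(unique(d[[fam_col]])) >= 2)
  repeat {
    ran <- sample(nrow(d), size = 2)
    if (d[[fam_col]][ran[1]] != d[[fam_col]][ran[2]]) {
      return(d[ran, ])
    }
  }
}

set.seed(1)  # only so that the draws are repeatable
newfish <- pick_unrelated_pair(fish)
print(newfish)

# Repeating the draw many times gives a 2 x 1000 matrix of spawn days,
# one column per unrelated pair, which can then be correlated.
pairs <- replicate(1000, pick_unrelated_pair(fish)$spawn)
print(cor(pairs[1, ], pairs[2, ]))
```

Rejecting and redrawing keeps every valid pair of rows equally likely. An alternative would be to sample two families first and then one fish from each, but that gives every family the same weight regardless of how many fish it has, which is a different sampling scheme.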
[R] Logistic Regression Fitting with EM-Algorithm
Hi all,

is there any package that can fit logistic regression coefficients with an EM algorithm, given only the explanatory variables? I tried to implement this using the Design package, but I didn't find a way.

Thanks a lot & kind regards,
Robin Aly
Re: [R] Logistic Regression Fitting with EM-Algorithm
Dear Ted,

sorry for being unclear. Let me try again. I indeed have no knowledge of the value of the response variable for any object. Instead, I have a data frame of explanatory variables for each object. For example,

        x1       x2        x3
1 4.409974 2.348745 1.9845313
2 3.809249 2.281260 1.9170466
3 4.229544 2.610347 0.9127431
4 4.259644 1.866025 1.5982859
5 4.001306 2.225069 1.2551570
...

and I want to fit a regression model of the form y ~ x1 + x2 + x3. From prior information I know that all coefficients are approximately Gaussian distributed around 1, and likewise the intercept around -10.

Now I think there must be a package that estimates the coefficients more precisely by fitting the logistic regression function to the data without knowledge of the response variable (similar to fitting Gaussians in a mixture model where the class labels are unknown). I looked at the flexmix package, but it seems to "only" find dependencies in the data assuming the presence of some training data. I also found some evidence in Magder & Hughes (1997, see below) that such an algorithm exists; however, from the documented math I can't work out how to apply the method to my problem.

Thanks in advance,
Best regards,
Robin

Magder, L. S. & Hughes, J. P. Logistic Regression When the Outcome Is Measured with Uncertainty. American Journal of Epidemiology, 1997, 146, 195-203.

On 01/04/2011 12:36 AM, (Ted Harding) wrote:
> On 03-Jan-11 14:02:21, Robin Aly wrote:
> > Hi all,
> >
> > is there any package which can do an EM algorithm fitting of
> > logistic regression coefficients given only the explanatory
> > variables? I tried to realize this using the Design package,
> > but I didn't find a way.
> >
> > Thanks a lot & Kind regards
> > Robin Aly
>
> As written, this is a strange question! You imply that you do not
> have data on the response (0/1) variable at all, only on the
> explanatory variables. In that case there is no possible estimate,
> because that would require data on at least some of the values of
> the response variable.
>
> I think you should explain more clearly and explicitly what the
> information is that you have for all the variables.
>
> Ted.
>
> E-Mail: (Ted Harding)
> Fax-to-email: +44 (0)870 094 0861
> Date: 03-Jan-11 Time: 23:36:56
> -- XFMail --
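Ted's point that no estimate is possible can be spelled out with a short derivation. This is a sketch under assumptions the thread does not state explicitly: the standard logistic model, explanatory variables whose distribution does not depend on the coefficients, and every response treated as an unobserved 0/1 value. The symbols p_i, beta and L are introduced here for illustration.

```latex
% Success probability for object i under the assumed logistic model
p_i(\beta) = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3})\bigr)}

% Likelihood of the observed data (the x's only), summing out each unobserved y_i
L(\beta)
  = \prod_{i=1}^{n} \sum_{y_i \in \{0,1\}} p_i(\beta)^{y_i}\,\bigl(1 - p_i(\beta)\bigr)^{1 - y_i}
  = \prod_{i=1}^{n} \bigl[\, p_i(\beta) + \bigl(1 - p_i(\beta)\bigr) \bigr]
  = 1
```

Under these assumptions the likelihood is the same for every value of beta, so the explanatory variables alone cannot sharpen the coefficients: combining this flat likelihood with the Gaussian prior described above just returns the prior. The mixture-model analogy differs in that the observed data there depend on the unknown class through the class-specific densities, which is what makes the labels recoverable.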