Hi all,
I've read in a large data frame that has formatting similar to the one
in the small example below:
df <- data.frame(c(1, 2, 3),
                 c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 c("AD=13;BA=49", "AD=1;BA=2", NA))
names(df) <- c("rowNum", "first", "second")
> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>
I'd like to reformat every non-NA entry in the "first" and "second"
columns of df (and so on for any further such columns) so that, for
example, "AD=13;BA=49" is replaced by the string "13_13-49": the AD
value, an underscore, the AD value again, a hyphen, and then the BA value.
Applied to df, the output would be the following:
  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>
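For concreteness, here is a minimal sketch of what an all-R version of this transformation might look like. It assumes every non-NA entry has exactly the form "AD=<digits>;BA=<digits>" and that every column other than rowNum should be converted. The helper name reformat_entry is hypothetical, and stringsAsFactors = FALSE is added so the columns are plain character vectors. It is one possible approach using base R's sub() and lapply(), not necessarily the best one.

```r
# Rebuild the small example, keeping the strings as character (not factor).
df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: rewrite "AD=x;BA=y" as "x_x-y".
# sub() returns NA for NA input, so missing entries stay NA.
# Entries that do not match the assumed pattern are left unchanged.
reformat_entry <- function(x) {
  sub("^AD=([0-9]+);BA=([0-9]+)$", "\\1_\\1-\\2", as.character(x))
}

# Apply the helper to every column except rowNum.
cols <- setdiff(names(df), "rowNum")
df[cols] <- lapply(df[cols], reformat_entry)
df
```

Because sub() is vectorised, each column is processed in a single call, so this should also scale to a large data frame with many such columns.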
I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists (and also to learn how to do this more
generally).
Could someone point out an appropriate paradigm or otherwise point me in
the right direction?
Best,
Jonathan