Hi all,
I've read in a large data frame whose entries are formatted like those in
the small example below:

df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA),
                 stringsAsFactors = FALSE)  # keep the strings as character, not factor

> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>


I'd like to reformat every non-NA entry in the columns after rowNum
("first", "second", and so on), so that, for example, "AD=13;BA=49" is
replaced by the string "13_13-49": the AD value, an underscore, the AD
value again, a hyphen, and then the BA value.

So applied to df, the output would be the following:

  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>
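
To make the intended mapping concrete, here is a rough sketch using a regular-expression substitution with capture groups. It assumes every non-NA entry has exactly the form AD=<digits>;BA=<digits> and that every column other than rowNum should be converted; the helper name reformat_entry is made up for illustration. Entries that do not match the pattern, and NAs, are left unchanged.

```r
# Hypothetical helper: rewrite "AD=x;BA=y" as "x_x-y".
# sub() leaves NA and non-matching strings as they are.
reformat_entry <- function(x) {
  sub("^AD=([0-9]+);BA=([0-9]+)$", "\\1_\\1-\\2", x)
}

df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA),
                 stringsAsFactors = FALSE)

# Assumption: every column except rowNum holds AD/BA strings.
cols <- setdiff(names(df), "rowNum")
df[cols] <- lapply(df[cols], reformat_entry)
```

Because sub() is vectorized, each column is handled in a single call, so no explicit loop over rows is needed.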


I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists (and also to learn how to do this more
generally).

Could someone point out an appropriate paradigm or otherwise point me in
the right direction?

Best,
Jonathan
