Try the code below:
library(readr)

df <- read_delim("C:/Users/lruan1/Desktop/1112.csv", delim = "|",
                 escape_double = FALSE, trim_ws = TRUE)
# Keep only the rows whose IPC matches one of the wanted codes
df_new <- subset(df, IPC == "H04M001/02" | IPC == "C07K016/26")

You can add more conditions with "|" inside the subset() call.

Good luck!

On Wed, Jan 3, 2018 at 2:53 PM, Saptorshee Kanto Chakraborty <chk...@unife.it> wrote:

> Hello,
>
> I have patent data from the OECD in delimited text format, with IPC as
> one column. I want to filter the data by keeping only certain IPC codes
> in that column and deleting the rows that do not have my required IPCs.
> Can anybody guide me in doing this? Note that the IPC codes are string
> variables.
>
> The data looks somewhat like the sample below, but it is a huge dataset
> containing more than 11 million rows.
>
> Appln_id|Prio_Year|App_year|IPC
> 1|1999|2000|H04Q007/32
> 1|1999|2000|G06K019/077
> 1|1999|2000|H01R012/18
> 1|1999|2000|G06K017/00
> 1|1999|2000|H04M001/2745
> 1|1999|2000|G06K007/00
> 1|1999|2000|H04M001/02
> 1|1999|2000|H04M001/275
> 2|1991|1992|C12N015/62
> 2|1991|1992|C12N015/09
> 2|1991|1992|C07K019/00
> 2|1991|1992|C07K016/26
>
> Thanking You
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
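
P.S. If the list of wanted IPC codes grows, chaining "|" conditions gets unwieldy. A minimal sketch of the same filter using %in% follows. It is not from the original reply: the name wanted_ipc is hypothetical, the sample rows are copied from the question, and base R's read.delim() is used only so the example runs without extra packages.

```r
# A few sample rows copied from the original post; the real file is far larger.
sample_text <- "Appln_id|Prio_Year|App_year|IPC
1|1999|2000|H04Q007/32
1|1999|2000|H04M001/02
2|1991|1992|C12N015/62
2|1991|1992|C07K016/26"

# Base-R reader, chosen here only to keep the sketch free of package dependencies.
df <- read.delim(text = sample_text, sep = "|",
                 stringsAsFactors = FALSE, strip.white = TRUE)

# Hypothetical name: the vector of IPC codes to keep.
wanted_ipc <- c("H04M001/02", "C07K016/26")

# %in% checks each IPC value against the whole vector at once,
# so adding another code means adding one element, not one more "|" clause.
df_new <- df[df$IPC %in% wanted_ipc, ]
```

The same condition should also work inside subset(), as in subset(df, IPC %in% wanted_ipc), on a data frame read with read_delim().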