Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Dr Eberhard W Lisse
Andy, you can always open a public Dropbox or Google folder and post the link. el On 29/12/2023 22:37, Andy wrote: > Thanks - I'll have a look at these options too. > > I'm happy to send over a sample document, but wasn't aware if > attachments are allowed. The documents come from Lexis+, so require

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread CALUM POLWART
help(read_docx) says that the function only imports one docx file. In > order to read multiple files, use a for loop or the lapply function. > I told you people will suggest better ways to loop!! > > docx_summary(read_docx("Now they want us to charge our electric cars > from litter bins.docx"))

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Ivan Krylov
On Fri, 29 Dec 2023 20:17:41 +, Andy wrote: > doc_in <- read_docx(files) > > Results in this error: Error in filetype %in% c("docx") && > grepl("^([fh]ttp)", file) : 'length = 9' in coercion to 'logical(1)' help(read_docx) says that the function only imports one docx file. In order to read mul
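The error quoted above comes from passing a vector of nine paths to a reader that accepts one file at a time. A minimal sketch of the lapply approach follows; `read_many` is a hypothetical helper name, and the directory name in the commented usage is an assumption rather than anything from the thread.

```r
# Hypothetical helper: apply a single-file reader to each path in turn
# and return a list named after the files.
read_many <- function(files, reader) {
  out <- lapply(files, reader)
  names(out) <- basename(files)
  out
}

# With textreadr installed, the call might look like this
# ("articles" is an assumed directory name):
# files <- list.files("articles", pattern = "\\.docx$", full.names = TRUE)
# docs  <- read_many(files, textreadr::read_docx)
```

Passing the reader as an argument keeps the loop independent of whichever package ends up doing the parsing.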

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
Thanks - I'll have a look at these options too. I'm happy to send over a sample document, but wasn't aware if attachments are allowed. The documents come from Lexis+, so require user credentials to log in, but I could upload the file somewhere if that would help? Any ideas for a good location to do

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Dr Eberhard W Lisse
I would also perhaps look at https://pandoc.org, which can export to a number of formats... And for spreadsheets https://github.com/jqnatividad/qsv is my go-to weapon. It can also read and write XLSX and others. A sample document or two would always be helpful... el On 29/12/2023 21:01, CALUM POLWART

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
Hi Roy (& others), Many thanks for the advice - well taken. Thanks also to the others who have responded so quickly - I thought I might have to wait days!! :-) I'm on a Linux (Mint) machine. Below, I document three attempts, two using officer and the last now using textreadr. My attempts so far

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread CALUM POLWART
It sounded like he looked at officer, but I would agree. content <- officer::docx_summary(officer::read_docx("filename.docx")) would get the text content into an object called content. That object is a data.frame, so you can then manipulate it. To be more specific, we might need an example of the DF. You can loop thi
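As a rough illustration of manipulating such a data.frame, the sketch below counts keyword hits and appends one row per document to a CSV file. It assumes only that the data.frame has a `text` column; the sample text, the keywords, the file name and the helpers `count_keywords` and `append_row` are all made up for illustration.

```r
# Made-up stand-in for the data.frame returned by officer::docx_summary();
# only a `text` column is assumed here.
content <- data.frame(
  text = c("Electric cars are popular",
           "Charge points near litter bins",
           "electric bikes too"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each keyword, count the rows whose text
# contains it (case-insensitive, literal match).
count_keywords <- function(df, keywords) {
  vapply(keywords,
         function(k) sum(grepl(tolower(k), tolower(df$text), fixed = TRUE)),
         integer(1))
}

# Hypothetical helper: append one row to a CSV file, writing the header
# only when the file does not exist yet.
append_row <- function(row, path) {
  new_file <- !file.exists(path)
  write.table(row, path, sep = ",", row.names = FALSE,
              col.names = new_file, append = !new_file)
}

counts <- count_keywords(content, c("electric", "litter"))
row <- data.frame(file = "example.docx", t(counts))
csv_path <- tempfile(fileext = ".csv")
append_row(row, csv_path)
append_row(row, csv_path)  # a second document would add a second row
```

A CSV file is used here because it opens in any spreadsheet program and needs no extra packages.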

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread CALUM POLWART
textreadr would be the obvious approach. When you say it is deprecated, do you mean it's not available on CRAN? Sometimes maintaining a package on CRAN is just a pain in the ass. devtools::install_github("trinker/textreadr") Should let you install it. In theory docx files are actually just zip
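To illustrate the zip point, here is a base R sketch that reads word/document.xml out of a docx archive and pulls the text runs out with regular expressions. It is a rough fallback under stated assumptions, not a replacement for a real parser: `xml_to_paragraphs` and `read_docx_text` are hypothetical names, XML entities such as `&amp;` are not decoded, and tables, headers and footnotes get no special handling.

```r
# Hypothetical helper: extract paragraph text from the XML of
# word/document.xml by collecting the <w:t> runs in each <w:p>.
xml_to_paragraphs <- function(xml) {
  xml <- paste(xml, collapse = "")
  paras <- strsplit(xml, "</w:p>", fixed = TRUE)[[1]]
  texts <- vapply(paras, function(p) {
    # Match <w:t> or <w:t attr="...">, but not <w:tab/> or <w:tbl>.
    runs <- regmatches(p, gregexpr("<w:t( [^>]*)?>[^<]*</w:t>", p))[[1]]
    paste(gsub("<[^>]+>", "", runs), collapse = "")
  }, character(1), USE.NAMES = FALSE)
  texts[nzchar(texts)]
}

# Hypothetical helper: read the main document part straight from the
# zip archive using a base R unz() connection.
read_docx_text <- function(path) {
  con <- unz(path, "word/document.xml")
  on.exit(close(con))
  xml_to_paragraphs(readLines(con, warn = FALSE))
}

# Usage, given a real file on disk (the name is a placeholder):
# paragraphs <- read_docx_text("filename.docx")
```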

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread jim holtman
Check out the 'officer' package. Thanks Jim Holtman *Data Munger Guru* *What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.* On Fri, Dec 29, 2023 at 10:14 AM Andy wrote: > Hello > > I am trying to work through a problem, but feel like I've

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Roy Mendelssohn - NOAA Federal via R-help
Hi Andy: I don’t have an answer but I do have what I hope is some friendly advice. Generally the more information you can provide, the more likely you will get help that is useful. In your case you say that you tried several packages and they didn’t do what you wanted. Providing that code,

[R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
Hello I am trying to work through a problem, but feel like I've gone down a rabbit hole. I'd very much appreciate any help. The task: I have several directories of multiple (some directories, up to 2,500+) *.docx files (newspaper articles downloaded from Lexis+) that I want to iterate throug