Dear R-Users,

Are there any packages that can modify highlighted areas / annotations in PDF documents?
It seems feasible - I have explored some R code (see below). However, I would rather not reinvent the wheel.

The problem: When highlighting PDF documents with Microsoft Edge, the bounding box is sometimes misplaced, and quite badly so. Edge also lacks the ability to draw lines or arrows. On the other hand, I never got used to Acrobat Reader: it usually takes much more effort to add specific highlights. Lines can be drawn, but are NOT straight!

Are there tools to change the size/position of highlights? Or to add highlights and underline words?

Changing the position/size manually by editing the data in the PDF document is possible. Changing the color is trickier (it is somehow possible in Microsoft Edge, though the direct approach of rewriting the actual stream is better). Maybe there are some tools to do it?

Some R code is below.

Sincerely,

Leonard

#########

library(zip)  # provides inflate()

con = file("_some_pdf_.pdf", "rb")
NL = 0
# - very dirty hack;
# - assumes Annotations are in the last fragment/chunk;
while(TRUE) {
    tmp = readBin(con, "raw", 1024*128 + 515);
    if(length(tmp) == 0) break;
    x = tmp;
    # isNL = (x == 10) | (x == 13);
    # NL = CR (13) immediately followed by LF (10);
    # the shifted copy of x keeps both operands the same length;
    isNL = (x == 13) & (c(x[-1], as.raw(0)) == 10);
    NL = NL + sum(isNL);
}
close(con)

# x and isNL now refer to the last chunk that was read;
idP = which(isNL)
idS = 935; # will vary with pdf and Annotations and ...;
nLast = 4; # usually 2 chunks
idx = idP[seq(idS, length.out = nLast)]

# Check: Right position?
# tmp = x[seq(idx[1] + 2, idx[1 + 2] - 1)]
# intToUtf8(tmp)
tmp = inflate(x[seq(idx[1] + 2, idx[nLast] - 1)])
intToUtf8(tmp$output)

# Output of inflate: an Example
# "/GS gs .56078434 .87058824 .97647059 rg\n
# 337.298 183.836 m 364.322 183.836 l 364.322 171.83 l 337.298 171.83 l h f\n"
# Note: /BBox[ 337.298 171.83 364.322 183.836]

The raw pdf data:

1948 0 obj
<</AP<</N 1949 0 R >>/C[ 0.560784 0.870588 0.976471]/CA 1/F 4/PDFIUM_HasGeneratedAP true/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 364.322 174.6]/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>>
endobj
1949 0 obj
<</BBox[ 337.298 171.83 364.322 183.836]/Filter/FlateDecode/FormType 1/Length 86/Matrix[ 1 0 0 1 0 0]/Resources<</ExtGState<</GS<</AIS false/BM/Multiply/CA 1/Type/ExtGState/ca 1>>>>>>/Subtype/Form/Type/XObject>>stream
[86 bytes of Flate-compressed binary data; not printable]
endstream
endobj

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
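
P.S. For the manual route (editing the position/size and the color directly in the annotation data), a minimal base-R sketch could look like the one below. It is only an illustration under assumptions: the helper names shift.array and highlight.stream are hypothetical (they are not from any package), the annotation object is handled as a plain string, and I assume that memCompress(type = "gzip") yields data that a PDF reader accepts for /FlateDecode, which I have not verified against a reader. The sketch only checks that the compressed bytes decompress back to the same text.

```r
# Hypothetical helper (not from any package): shift the coordinates stored in
# a PDF array such as /Rect[ x1 y1 x2 y2] or /QuadPoints[ x1 y1 ... x4 y4].
shift.array = function(obj, key, dx = 0, dy = 0) {
  pattern = paste0("/", key, "\\[[^]]*\\]")
  m = regexpr(pattern, obj)
  if(m == -1) return(obj)  # key not present: leave the object untouched
  txt = regmatches(obj, m)
  txt = sub("\\]$", "", sub("^/[A-Za-z]+\\[", "", txt))
  num = as.numeric(strsplit(trimws(txt), "[ ]+")[[1]])
  isX = (seq_along(num) %% 2 == 1)  # values alternate x, y, x, y, ...
  num[isX] = num[isX] + dx
  num[!isX] = num[!isX] + dy
  regmatches(obj, m) = paste0("/", key, "[ ",
    paste(sprintf("%.3f", num), collapse = " "), "]")
  obj
}

# Hypothetical helper: text of an appearance stream that fills the rectangle
# bbox = c(x1, y1, x2, y2) with the colour rgb (each component in 0..1).
# The path order mirrors the inflated example from the post.
highlight.stream = function(bbox, rgb) {
  col = paste(sprintf("%.6f", rgb), collapse = " ")
  path = sprintf("%.3f %.3f m %.3f %.3f l %.3f %.3f l %.3f %.3f l h f\n",
    bbox[1], bbox[4], bbox[3], bbox[4], bbox[3], bbox[2], bbox[1], bbox[2])
  paste0("/GS gs ", col, " rg\n", path)
}

# Annotation dictionary from the post (the /AP reference is omitted for brevity)
annot = paste0("<</C[ 0.560784 0.870588 0.976471]",
  "/QuadPoints[ 337.298 186 364.322 186 337.298 174.6 364.322 174.6]",
  "/Rect[ 337.298 174.6 364.322 186]/Subtype/Highlight/Type/Annot>>")

# Move the highlight down by 2 units
moved = shift.array(annot, "Rect", dy = -2)
moved = shift.array(moved, "QuadPoints", dy = -2)

# New appearance stream in yellow, compressed and checked by a round trip
ap = highlight.stream(c(337.298, 171.83, 364.322, 183.836), c(1, 1, 0))
z = memCompress(charToRaw(ap), type = "gzip")
back = memDecompress(z, type = "gzip", asChar = TRUE)
```

Writing the result back into the file is not covered here. Any change in the byte length of an object shifts the offsets of everything after it, so /Length and the cross-reference table would have to be updated as well, unless the reader repairs them on its own.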