[ https://issues.apache.org/jira/browse/TIKA-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18013697#comment-18013697 ]
Tim Allison edited comment on TIKA-4465 at 8/13/25 2:23 PM:
------------------------------------------------------------

Maybe the FDF dictionary according to 32000-2:2020 12.7.8.31 Tables 245 and 246?

Nah, let's do that on a different ticket for parsing FDF, if there's any interest in that.

was (Author: talli...@mitre.org):
Maybe the FDF dictionary according to 32000-2:2020 12.7.8.31 Tables 245 and 246?

> Extract javascript from name dictionary in PDFs
> -----------------------------------------------
>
>                 Key: TIKA-4465
>                 URL: https://issues.apache.org/jira/browse/TIKA-4465
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>
> This blog
> [https://labs.senhasegura.blog/unmasking-the-threat-a-deep-dive-into-the-pdf-malicious-2/]
> mentions this malware file (be careful! dangerous!):
> [https://bazaar.abuse.ch/download/4dc9b0c20ea61d91d6a1b5bdce76fb5365de0762efb8f6c2925113c6a8950cae/]
>
>
> We're currently extracting JavaScript from actions, but not from the name
> tree (document-level JavaScript).
>
> We should add this extraction if "extractActions" is set to true... or
> better, come up with a better name for that variable in trunk.
>
> Related to this, I'd also like to extract JavaScript in TikaCLI by default, as
> we do for extracting inline images and incremental updates.
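
For context on the name tree mentioned in the issue: in a PDF, document-level JavaScript hangs off the catalog's /Names dictionary under the /JavaScript entry, which is a name tree whose nodes hold direct name/value pairs (/Names) and/or child nodes (/Kids). Below is a rough sketch of the traversal such an extraction would need. It is not Tika or PDFBox code: NameTreeNode and NameTreeJsSketch are hypothetical stand-ins, and the values are modelled as plain script strings, whereas a real file stores JavaScript action dictionaries whose /JS entry may be a string or a stream.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of one PDF name-tree node (not a Tika or PDFBox class).
final class NameTreeNode {
    // Stands in for the /Names array: script name -> script source (simplified to a String).
    final Map<String, String> names = new LinkedHashMap<>();
    // Stands in for the /Kids array: child nodes of this node.
    final List<NameTreeNode> kids = new ArrayList<>();
}

final class NameTreeJsSketch {

    // Collects every name -> script pair reachable from the root of the tree.
    static Map<String, String> collectScripts(NameTreeNode root) {
        Map<String, String> scripts = new LinkedHashMap<>();
        // Identity-based visited set: malformed or malicious files may contain
        // cycles in /Kids, so each node object is processed at most once.
        Set<NameTreeNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        collect(root, visited, scripts);
        return scripts;
    }

    private static void collect(NameTreeNode node, Set<NameTreeNode> visited,
                                Map<String, String> scripts) {
        if (node == null || !visited.add(node)) {
            return; // nothing to do, or node already seen
        }
        scripts.putAll(node.names);          // entries stored directly on this node
        for (NameTreeNode kid : node.kids) { // then descend into the children
            collect(kid, visited, scripts);
        }
    }

    public static void main(String[] args) {
        NameTreeNode root = new NameTreeNode();
        NameTreeNode leaf = new NameTreeNode();
        leaf.names.put("init", "app.alert('hello');");
        root.kids.add(leaf);
        collectScripts(root).forEach((name, js) -> System.out.println(name + ": " + js));
    }
}
```

The visited set is there because the input is untrusted; a real implementation would presumably also cap the amount of script text it is willing to emit.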