Tim Allison created TIKA-4465:
---------------------------------

             Summary: Extract javascript from name dictionary in PDFs
                 Key: TIKA-4465
                 URL: https://issues.apache.org/jira/browse/TIKA-4465
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


This blog post
[https://labs.senhasegura.blog/unmasking-the-threat-a-deep-dive-into-the-pdf-malicious-2/]
mentions this malware file (be careful: it is dangerous!):
[https://bazaar.abuse.ch/download/4dc9b0c20ea61d91d6a1b5bdce76fb5365de0762efb8f6c2925113c6a8950cae/]

We currently extract JavaScript from actions, but not from the JavaScript name
tree in the document's name dictionary (document-level JavaScript).
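
For illustration only, here is a minimal sketch of what walking such a name tree could look like. In the PDF format, the catalog's /Names dictionary can hold a /JavaScript name tree whose nodes carry either name/script entries (/Names) or child nodes (/Kids). The Node class, collectScripts and MAX_DEPTH below are hypothetical stand-ins, not Tika or PDFBox API; a real patch would read these entries through the PDF library instead. Because the input may be malicious, the sketch guards against cycles and very deep trees.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NameTreeJavaScriptSketch {

    /** Hypothetical stand-in for one node of the /JavaScript name tree. */
    static final class Node {
        // Leaf entries: script name -> JavaScript source (assumed already decoded).
        final Map<String, String> names = new LinkedHashMap<>();
        // Child nodes, corresponding to the /Kids array.
        final List<Node> kids = new ArrayList<>();
    }

    // Assumed limit; malformed or hostile files may nest arbitrarily deep.
    private static final int MAX_DEPTH = 64;

    /** Collects every name/script pair reachable from the root node. */
    static Map<String, String> collectScripts(Node root) {
        Map<String, String> scripts = new LinkedHashMap<>();
        // Identity-based set so that each node object is visited at most once.
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        walk(root, 0, visited, scripts);
        return scripts;
    }

    private static void walk(Node node, int depth, Set<Node> visited,
                             Map<String, String> scripts) {
        // Stop on missing nodes, excessive depth, or a node seen before (cycle).
        if (node == null || depth > MAX_DEPTH || !visited.add(node)) {
            return;
        }
        scripts.putAll(node.names);
        for (Node kid : node.kids) {
            walk(kid, depth + 1, visited, scripts);
        }
    }

    public static void main(String[] args) {
        Node leaf = new Node();
        leaf.names.put("init", "app.alert('hello');");

        Node root = new Node();
        root.kids.add(leaf);
        root.kids.add(root); // deliberate cycle, as a hostile file might contain

        collectScripts(root).forEach(
                (name, js) -> System.out.println(name + " -> " + js));
    }
}
```

In an actual parser, each collected script would presumably be emitted the same way the action-level JavaScript is today, so that downstream consumers see both sources together.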

We should add this extraction when "extractActions" is set to true or, better,
come up with a more accurate name for that variable in trunk.

Related to this, I'd also like TikaCLI to extract JavaScript by default, as
it already does for inline images and incremental updates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
