[ 
https://issues.apache.org/jira/browse/TIKA-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18014738#comment-18014738
 ] 

Hudson commented on TIKA-4465:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #862 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/862/])
TIKA-4465 -- extract javascript from name tree (#2305) (github: 
[https://github.com/apache/tika/commit/b466c4920c3ab0ae5bb9fd203749c8585c52126f])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-package/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
* (edit) tika-core/src/main/java/org/apache/tika/metadata/PDF.java


> Extract javascript from name tree in PDFs
> -----------------------------------------
>
>                 Key: TIKA-4465
>                 URL: https://issues.apache.org/jira/browse/TIKA-4465
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 4.0.0, 3.2.3
>
>
> This blog 
> [https://labs.senhasegura.blog/unmasking-the-threat-a-deep-dive-into-the-pdf-malicious-2/]
>  mentions this malware file (be careful! dangerous!): 
> [https://bazaar.abuse.ch/download/4dc9b0c20ea61d91d6a1b5bdce76fb5365de0762efb8f6c2925113c6a8950cae/]
>  
>  
> We're currently extracting javascript from actions, but not from the name 
> tree (document-level javascript).
>  
> We should add this extraction if "extractActions" is set to true... or 
> better, come up with a better name for that variable in trunk.
>  
> Related to this, I'd also like to extract javascript in TikaCLI by default as 
> we do for extracting inline images and incremental updates.
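
For readers unfamiliar with the structure mentioned above: in a PDF, document-level javascript is reachable from the catalog's /Names dictionary through the /JavaScript name tree, where each node holds leaf entries (/Names), child nodes (/Kids), or both at the root. The following is only a rough sketch of walking such a tree. NameTreeNode, collectJavaScript and MAX_DEPTH are hypothetical names made up for illustration; this is not the code from the commit above and not the Tika or PDFBox API. The depth limit is an added assumption, on the reasoning that a malicious file could contain a self-referencing /Kids array.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NameTreeJavaScriptSketch {

    // Assumed guard against cyclic or absurdly deep trees in hostile files.
    static final int MAX_DEPTH = 32;

    /** Hypothetical stand-in for a PDF name tree node; not a PDFBox or Tika class. */
    static final class NameTreeNode {
        // Leaf entries (/Names): script name -> javascript source.
        final Map<String, String> names = new LinkedHashMap<>();
        // Child nodes (/Kids).
        final List<NameTreeNode> kids = new ArrayList<>();
    }

    /** Collects every name/script pair reachable from the given root node. */
    static Map<String, String> collectJavaScript(NameTreeNode root) {
        Map<String, String> scripts = new LinkedHashMap<>();
        collect(root, scripts, 0);
        return scripts;
    }

    private static void collect(NameTreeNode node, Map<String, String> out, int depth) {
        if (node == null || depth > MAX_DEPTH) {
            return;
        }
        // Entries stored directly on this node.
        out.putAll(node.names);
        // Recurse into child nodes.
        for (NameTreeNode kid : node.kids) {
            collect(kid, out, depth + 1);
        }
    }

    public static void main(String[] args) {
        NameTreeNode root = new NameTreeNode();
        NameTreeNode leaf = new NameTreeNode();
        leaf.names.put("init", "app.alert('hello');");
        root.kids.add(leaf);

        for (Map.Entry<String, String> e : collectJavaScript(root).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In a real parser the nodes would come from the PDF library's object model rather than being built by hand, and each extracted script would be reported alongside the javascript already extracted from actions.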



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
