[ 
https://issues.apache.org/jira/browse/TIKA-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883291#comment-17883291
 ] 

ASF GitHub Bot commented on TIKA-4309:
--------------------------------------

alexey-pelykh commented on PR #1947:
URL: https://github.com/apache/tika/pull/1947#issuecomment-2363832999

   > Y, that makes sense. If we treat it as a container file, though, we could 
make up/find a mime type for fat Mach-O (application/x-fat-mach-o I just made 
that up...is there an existing mime type?) and then the attachments/embedded 
files would have their own correct mime types.
   
   I've only seen `application/x-mach-binary` and `application/x-mach-o`; 
however, I'd be more than happy to come up with more specific ones, like the 
ELF ones. Mach-O can also be a shared library, an executable, etc., so 
technically we should be using `application/x-mach-o-executable`, 
`application/x-mach-o-sharedlib`, and so on. For fat binaries I've seen 
`application/x-mach-universal`, but following the proposed structure it would 
be `application/x-mach-o-universal`.
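   
   A rough sketch of how that naming scheme could map onto the `filetype` 
field of the Mach-O header. The class and method names are hypothetical, and 
only the `-executable` and `-sharedlib` names come from this comment; the 
`-object` and `-coredump` names are my assumption, modelled on the ELF ones.

```java
import java.util.Map;

final class MachOMimeTypes {
    // `filetype` values from <mach-o/loader.h>.
    static final int MH_OBJECT = 0x1;
    static final int MH_EXECUTE = 0x2;
    static final int MH_CORE = 0x4;
    static final int MH_DYLIB = 0x6;

    // Proposed names only: none of these are registered mime types.
    // "-object" and "-coredump" are hypothetical extrapolations.
    private static final Map<Integer, String> BY_FILETYPE = Map.of(
            MH_OBJECT, "application/x-mach-o-object",
            MH_EXECUTE, "application/x-mach-o-executable",
            MH_CORE, "application/x-mach-o-coredump",
            MH_DYLIB, "application/x-mach-o-sharedlib");

    /** Falls back to the generic type for any filetype not listed above. */
    static String forFiletype(int filetype) {
        return BY_FILETYPE.getOrDefault(filetype, "application/x-mach-o");
    }
}
```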
   
   > If at all possible, we should try to use magic to distinguish fat machO 
from the other cafebabes.
   
   The structure of fat Mach-O is quite 
[vague](https://book.hacktricks.xyz/macos-hardening/macos-security-and-privilege-escalation/macos-files-folders-and-binaries/universal-binaries-and-mach-o-format#fat-header)
 (see also 
[fat.h](https://opensource.apple.com/source/xnu/xnu-123.5/EXTERNAL_HEADERS/mach-o/fat.h)),
 so only deep validation in code can help. Ideally I'd try ExecutableParser 
first and, if it fails, fall back to the other parsers matching the `cafebabe` 
magic.
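   
   To illustrate what such deep validation might look like, here is a minimal 
sketch, not the actual ExecutableParser code. It assumes the 32-bit layout 
from `fat.h` (five big-endian 32-bit fields per entry) and accepts the file 
only if every entry points at a slice that lies inside the file; 
`FatMachOCheck` and `looksLikeFatMachO` are hypothetical names.

```java
import java.nio.ByteBuffer;

final class FatMachOCheck {
    static final int FAT_MAGIC = 0xCAFEBABE;
    static final int FAT_HEADER_SIZE = 8; // magic + nfat_arch
    static final int FAT_ARCH_SIZE = 20;  // cputype, cpusubtype, offset, size, align

    /** Returns true only if every fat_arch entry describes a slice inside the file. */
    static boolean looksLikeFatMachO(byte[] data) {
        if (data.length < FAT_HEADER_SIZE) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(data); // ByteBuffer is big-endian by default
        if (buf.getInt() != FAT_MAGIC) {
            return false;
        }
        long count = Integer.toUnsignedLong(buf.getInt());
        long tableEnd = FAT_HEADER_SIZE + count * FAT_ARCH_SIZE;
        if (count == 0 || tableEnd > data.length) {
            return false; // the entry table itself does not fit in the file
        }
        for (long i = 0; i < count; i++) {
            buf.getInt(); // cputype (not checked in this sketch)
            buf.getInt(); // cpusubtype (not checked in this sketch)
            long offset = Integer.toUnsignedLong(buf.getInt());
            long size = Integer.toUnsignedLong(buf.getInt());
            buf.getInt(); // align (not checked in this sketch)
            if (offset < tableEnd || offset + size > data.length) {
                return false; // slice overlaps the table or runs past end of file
            }
        }
        return true;
    }
}
```

   A caller would fall back to the other `cafebabe` candidates whenever this 
returns false. A Java class file stores its minor and major version in the 
same four bytes as the entry count, so the table-size check alone rejects 
small class files.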
   
   > It is tricky if you're new to Tika.
   
   :) That I've noticed :)
   
   > I can try to help if you can create the skeleton for this file type
   
   I'd gladly do so, the container is quite simple. It's 0xCAFEBABE followed 
by a uint32_t number of headers, and every header just describes one embedded 
binary (CPU type and subtype, plus its offset, size and alignment).
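
   The layout described above could be read roughly like this. It is a sketch 
under the assumption of the 32-bit `fat_arch` layout from the `fat.h` linked 
earlier (all fields big-endian); it ignores the 64-bit variant and does not 
check that the slices lie inside the file. `FatHeaderReader` and `FatArch` are 
hypothetical names, not existing Tika classes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class FatHeaderReader {
    /** One fat_arch entry; every field is a 32-bit big-endian value on disk. */
    static final class FatArch {
        final int cpuType;
        final int cpuSubtype;
        final long offset; // file offset of the embedded Mach-O
        final long size;   // length of the embedded Mach-O in bytes
        final int align;   // alignment as a power of two

        FatArch(int cpuType, int cpuSubtype, long offset, long size, int align) {
            this.cpuType = cpuType;
            this.cpuSubtype = cpuSubtype;
            this.offset = offset;
            this.size = size;
            this.align = align;
        }
    }

    /** Reads the magic, the entry count, and then that many fat_arch entries. */
    static List<FatArch> read(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        if (data.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a fat Mach-O header");
        }
        long count = Integer.toUnsignedLong(buf.getInt());
        if (8 + count * 20 > data.length) {
            throw new IllegalArgumentException("truncated fat_arch table");
        }
        List<FatArch> archs = new ArrayList<>();
        for (long i = 0; i < count; i++) {
            // Java evaluates arguments left to right, matching the on-disk field order.
            archs.add(new FatArch(
                    buf.getInt(),
                    buf.getInt(),
                    Integer.toUnsignedLong(buf.getInt()),
                    Integer.toUnsignedLong(buf.getInt()),
                    buf.getInt()));
        }
        return archs;
    }
}
```

   Each returned entry gives the byte range of one embedded Mach-O, which is 
what a container-style parser would hand on as an embedded document.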




> ExecutableParser: support MachO
> -------------------------------
>
>                 Key: TIKA-4309
>                 URL: https://issues.apache.org/jira/browse/TIKA-4309
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Alexey Pelykh
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
