alexey-pelykh commented on PR #1947:
URL: https://github.com/apache/tika/pull/1947#issuecomment-2363832999

   > Y, that makes sense. If we treat it as a container file, though, we could 
make up/find a mime type for fat Mach-O (application/x-fat-mach-o I just made 
that up...is there an existing mime type?) and then the attachments/embedded 
files would have their own correct mime types.
   
   I've only seen `application/x-mach-binary` and `application/x-mach-o`; 
however, I'd be more than happy to come up with more specific ones, like the 
ELF ones. A Mach-O file can also be a shared lib, an executable, etc., so 
technically we should be using `application/x-mach-o-executable`, 
`application/x-mach-o-sharedlib`, and so on. For fat binaries I've seen 
`application/x-mach-universal`, but following the proposed structure it would 
be `application/x-mach-o-universal`.
   
   > If at all possible, we should try to use magic to distinguish fat machO 
from the other cafebabes.
   
   The structure of the fat Mach-O header is quite 
[vague](https://book.hacktricks.xyz/macos-hardening/macos-security-and-privilege-escalation/macos-files-folders-and-binaries/universal-binaries-and-mach-o-format#fat-header)
 (see also 
[fat.h](https://opensource.apple.com/source/xnu/xnu-123.5/EXTERNAL_HEADERS/mach-o/fat.h)),
 so only deep validation in code can help. Ideally I'd try ExecutableParser 
first and, if it fails, fall back to the other types that match the `cafebabe` magic.
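
   As a rough illustration of what "deep validation" could look like, here is a minimal Java sketch. It is not Tika code: the class name `FatMachOSniffer`, the method `looksLikeFatMachO`, and the threshold constant are all hypothetical. It assumes the whole file is in memory and relies on two facts: the `fat_arch` layout in the linked `fat.h` (five 32-bit big-endian fields), and the Java class file layout, where `minor_version` and `major_version` follow the magic and the major version is at least 45.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Tika: a heuristic check only.
final class FatMachOSniffer {
    static final int FAT_MAGIC = 0xCAFEBABE;
    static final int FAT_HEADER_SIZE = 8;   // magic + nfat_arch
    static final int FAT_ARCH_SIZE = 20;    // cputype, cpusubtype, offset, size, align

    // Assumption: a Java class file stores minor_version (u2) and major_version (u2)
    // right after the magic, and major_version >= 45, so the same 4 bytes read as a
    // big-endian count are never below 45 for a real class file.
    static final int MAX_PLAUSIBLE_ARCHS = 44;

    static boolean looksLikeFatMachO(byte[] data) {
        if (data.length < FAT_HEADER_SIZE) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        if (buf.getInt(0) != FAT_MAGIC) {
            return false;
        }
        long count = buf.getInt(4) & 0xFFFFFFFFL;
        if (count == 0 || count > MAX_PLAUSIBLE_ARCHS) {
            return false;
        }
        long tableEnd = FAT_HEADER_SIZE + count * FAT_ARCH_SIZE;
        if (tableEnd > data.length) {
            return false;
        }
        // Every slice must start after the fat_arch table and end inside the file.
        for (int i = 0; i < count; i++) {
            int base = FAT_HEADER_SIZE + i * FAT_ARCH_SIZE;
            long offset = buf.getInt(base + 8) & 0xFFFFFFFFL;
            long size = buf.getInt(base + 12) & 0xFFFFFFFFL;
            if (offset < tableEnd || offset + size > data.length) {
                return false;
            }
        }
        return true;
    }
}
```

   A stricter check could also confirm that each slice starts with a thin Mach-O magic; that is left out here to keep the sketch short.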
   
   > It is tricky if you're new to Tika.
   
   :) That I've noticed :)
   
   > I can try to help if you can create the skeleton for this file type
   
   I'd gladly do so; the container is quite simple. It's 0xCAFEBABE followed by 
a uint32_t count of headers, and every header just contains cpu/arch/type flags.
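
   To make the container layout concrete, a minimal Java sketch of reading the header table follows. The names `FatHeaderReader`, `FatArch`, and `readArchs` are hypothetical and not Tika API. The field list is taken from the `fat.h` linked above, where each `fat_arch` entry holds `cputype`, `cpusubtype`, `offset`, `size`, and `align`, all stored big-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader, not part of Tika: parses only the fat header table.
final class FatHeaderReader {
    static final int FAT_MAGIC = 0xCAFEBABE;
    static final int FAT_ARCH_SIZE = 20;

    /** One fat_arch entry; field names follow fat.h. */
    static final class FatArch {
        final int cpuType;
        final int cpuSubtype;
        final long offset;  // file offset of the embedded Mach-O slice
        final long size;    // length of the slice in bytes
        final long align;   // alignment as a power of two

        FatArch(int cpuType, int cpuSubtype, long offset, long size, long align) {
            this.cpuType = cpuType;
            this.cpuSubtype = cpuSubtype;
            this.offset = offset;
            this.size = size;
            this.align = align;
        }
    }

    static List<FatArch> readArchs(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        if (buf.remaining() < 8 || buf.getInt() != FAT_MAGIC) {
            throw new IllegalArgumentException("not a fat Mach-O header");
        }
        long count = Integer.toUnsignedLong(buf.getInt());
        if (count * FAT_ARCH_SIZE > buf.remaining()) {
            throw new IllegalArgumentException("truncated fat_arch table");
        }
        List<FatArch> archs = new ArrayList<>();
        for (long i = 0; i < count; i++) {
            int cpuType = buf.getInt();
            int cpuSubtype = buf.getInt();
            long offset = Integer.toUnsignedLong(buf.getInt());
            long size = Integer.toUnsignedLong(buf.getInt());
            long align = Integer.toUnsignedLong(buf.getInt());
            archs.add(new FatArch(cpuType, cpuSubtype, offset, size, align));
        }
        return archs;
    }
}
```

   Each returned entry describes one embedded slice, which is what a container-style parser would hand off as an attachment with its own mime type.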


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
