[ https://issues.apache.org/jira/browse/TIKA-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883291#comment-17883291 ]
ASF GitHub Bot commented on TIKA-4309:
--------------------------------------

alexey-pelykh commented on PR #1947:
URL: https://github.com/apache/tika/pull/1947#issuecomment-2363832999

> Y, that makes sense. If we treat it as a container file, though, we could make up/find a mime type for fatmachO (application/x-fat-mach-o I just made that up...is there an existing mime type?) and then the attachments/embedded files would have their own correct mime types.

I've only seen `application/x-mach-binary` and `application/x-mach-o`; however, I'd be more than happy to come up with more specific ones, like the ELF ones. A Mach-O can also be a shared lib, an executable, etc. So technically we should be using `application/x-mach-o-executable`, `application/x-mach-o-sharedlib` and so on. For Fat I've seen `application/x-mach-universal`, but following the proposed structure it would be `application/x-mach-o-universal`.

> If at all possible, we should try to use magic to distinguish fat machO from the other cafebabes.

The structure of fat Mach-O is quite [vague](https://book.hacktricks.xyz/macos-hardening/macos-security-and-privilege-escalation/macos-files-folders-and-binaries/universal-binaries-and-mach-o-format#fat-header) (and [this](https://opensource.apple.com/source/xnu/xnu-123.5/EXTERNAL_HEADERS/mach-o/fat.h)); only deep validation by code can help. So ideally I'd use ExecutableParser as the priority and, if it fails, try the other parsers that match the `cafebabe` magic.

> It is tricky if you're new to Tika. :)

That I've noticed :)

> I can try to help if you can create the skeleton for this file type

I'd gladly do so, the container is quite simple.
It's 0xCAFEBABE + a uint32_t number of headers, and every header just contains cpu/arch/type flags.

> ExecutableParser: support MachO
> -------------------------------
>
>                 Key: TIKA-4309
>                 URL: https://issues.apache.org/jira/browse/TIKA-4309
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Alexey Pelykh
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
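
For illustration, the "deep validation by code" idea from the comment above could look roughly like the sketch below. It relies on the layout in the linked `fat.h`: a big-endian `fat_header` (`magic`, `nfat_arch`) followed by `nfat_arch` entries of `fat_arch` (`cputype`, `cpusubtype`, `offset`, `size`, `align`). The class name `FatMachOSniffer`, the `Arch` holder and the `MAX_ARCHS` cap are hypothetical and not part of Tika. The cap is an assumed heuristic: in a Java class file the same four bytes hold the minor and major version, and major versions start at 45, so a small count is a reasonable hint that the file is not a class file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not a Tika API): structural check of a fat Mach-O header. */
final class FatMachOSniffer {
    static final int FAT_MAGIC = 0xCAFEBABE; // also the Java class file magic
    static final int FAT_HEADER_SIZE = 8;    // magic + nfat_arch
    static final int FAT_ARCH_SIZE = 20;     // five 32-bit fields per entry
    static final int MAX_ARCHS = 32;         // assumed sanity cap, not from any spec

    /** One architecture slice described by a fat_arch entry. */
    static final class Arch {
        final int cpuType;
        final int cpuSubtype;
        final long offset;
        final long size;

        Arch(int cpuType, int cpuSubtype, long offset, long size) {
            this.cpuType = cpuType;
            this.cpuSubtype = cpuSubtype;
            this.offset = offset;
            this.size = size;
        }
    }

    /** Returns the slices, or an empty list if the bytes do not look like a fat Mach-O. */
    static List<Arch> parse(byte[] data) {
        if (data.length < FAT_HEADER_SIZE) {
            return Collections.emptyList();
        }
        // The fat header and its entries are stored big-endian.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        if (buf.getInt() != FAT_MAGIC) {
            return Collections.emptyList();
        }
        long count = buf.getInt() & 0xFFFFFFFFL;
        if (count == 0 || count > MAX_ARCHS) {
            return Collections.emptyList();
        }
        long tableEnd = FAT_HEADER_SIZE + count * FAT_ARCH_SIZE;
        if (tableEnd > data.length) {
            return Collections.emptyList();
        }
        List<Arch> archs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int cpuType = buf.getInt();
            int cpuSubtype = buf.getInt();
            long offset = buf.getInt() & 0xFFFFFFFFL;
            long size = buf.getInt() & 0xFFFFFFFFL;
            buf.getInt(); // align, not needed for this check
            // Each slice must lie inside the file, after the header table.
            if (size < 4 || offset < tableEnd || offset + size > data.length) {
                return Collections.emptyList();
            }
            // Each slice should itself start with a thin Mach-O magic.
            if (!isThinMachOMagic(buf.getInt((int) offset))) {
                return Collections.emptyList();
            }
            archs.add(new Arch(cpuType, cpuSubtype, offset, size));
        }
        return archs;
    }

    /** 32/64-bit Mach-O magics in both byte orders, as seen by a big-endian read. */
    static boolean isThinMachOMagic(int magic) {
        return magic == 0xFEEDFACE || magic == 0xFEEDFACF
                || magic == 0xCEFAEDFE || magic == 0xCFFAEDFE;
    }
}
```

An empty result would mean "not a fat Mach-O", at which point detection could fall through to the other `cafebabe` candidates. This sketch needs the whole file in memory to check the slice bounds; a real detector would more likely work on a stream prefix and relax or defer that part of the check.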