Sorry to burst your bubble, but docx files really are zip files, with a
predetermined set of files in them. Microsoft even tried to patent the
idea, which I believe originated with Sun's StarOffice. Most
office suites have since adopted the practice, so in order to inspect
such a document you'll first have to extract at least one file from the
archive and determine the type and version of the document.
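
A minimal sketch of that inspection, using only the standard library. The
function name, the labels it returns and the OOXML_MAIN_PARTS table are
made up for illustration. It assumes the usual package layout: ODF files
carry a member named "mimetype", and OOXML files carry
"[Content_Types].xml" plus a main part at a conventional path. A strict
check would follow the package relationships instead of trusting those
paths, and this does not try to determine the document version.

```python
import io
import zipfile

# Hypothetical lookup table: conventional main-part paths, not a complete list.
OOXML_MAIN_PARTS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def sniff_office_zip(data):
    """Return a short type label for a zip-based document, or None if not a zip."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        return None
    try:
        with zipfile.ZipFile(buf) as zf:
            names = set(zf.namelist())
            # ODF packages store their media type in a member called "mimetype".
            if "mimetype" in names:
                return zf.read("mimetype").decode("ascii", "replace").strip()
            # OOXML packages always contain a content-types manifest.
            if "[Content_Types].xml" in names:
                for part, label in OOXML_MAIN_PARTS.items():
                    if part in names:
                        return label
                return "ooxml"
    except zipfile.BadZipFile:
        # The end-of-archive record was found but the archive is damaged.
        return None
    return "zip"
```

Anything that comes back as "zip" or None is simply not one of the office
formats this sketch knows about.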

The question is what your motivations are. From a security point of
view this is wasted CPU cycles. A valid office document can still
have perfectly valid malicious code in it. If you want to protect your
users, simply feed the upload to a malware scanner and be done with it.

For all other cases, this is Python. Use duck typing: if it looks like an
image, open it with PIL. Fail? Ditch.
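
A rough sketch of that "try to open it, ditch on failure" pattern. The
helper names opens_cleanly and open_zip are invented here, and the runnable
part uses zipfile so it needs nothing outside the standard library. For
images, assuming Pillow is installed, the opener would be the one shown in
the comment.

```python
import io
import zipfile


def opens_cleanly(data, opener):
    """Return True if opener(file_like) accepts the data without raising."""
    try:
        opener(io.BytesIO(data))
    except Exception:
        # Any failure means the content is not what it claims to be: ditch it.
        return False
    return True


def open_zip(fileobj):
    """Opener for zip-based uploads; raises if the archive is unreadable."""
    with zipfile.ZipFile(fileobj) as zf:
        bad_member = zf.testzip()  # name of the first corrupt member, or None
        if bad_member is not None:
            raise ValueError("corrupt member: %s" % bad_member)


# With Pillow installed, an image opener could look like this:
#     from PIL import Image
#     def open_image(fileobj):
#         Image.open(fileobj).verify()
```

You would then call opens_cleanly(upload_bytes, open_zip) before saving, and
reject the upload when it returns False.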

On Fri, Sep 22, 2017 at 8:23 AM, Paul <sevenrrain...@gmail.com> wrote:
> I'm trying to validate the MIME types of uploaded files against a
> predefined list of valid MIME types.
>
> I need to check the file in the buffer before saving, even if the
> extension is faked or missing.
>
> 1. Python's own mimetypes package seems to "guess" based only on the
> extension.
>
> 2. magic-python looks OK, but has OS dependencies because it uses the UNIX
> libmagic.
>
>  I had a lot of trouble with it on 64-bit Windows, and even after I fixed
> the dependency errors, other issues appeared and it couldn't identify files.
>
>  This is also an issue because it is hard to install OS-related files on
> predefined hosting.
>
> 3. I found the filetype package, but it only checks "magic numbers" for a
> limited set of file types, and it identifies docx and others as zip files
> (which is what they are as archive technology),
>
>  but I need to identify them as what they really are.
>
> What other non-OS-dependent solutions exist that can check a file whose
> extension is faked or missing? (pdf,doc,docs,csv,xls,xlsx, ods,odt,odm)
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-users+unsubscr...@googlegroups.com.
> To post to this group, send email to django-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/b5280e1b-ef5c-4749-a243-c75b3275c897%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



-- 
Melvyn Sopacua
