On Wed, Nov 18, 2009 at 11:01:51AM +0100, Daniel Näslund wrote:
> On Wed, Nov 18, 2009 at 10:49:15AM +0100, Stefan Sperling wrote:
> > On Wed, Nov 18, 2009 at 10:37:01AM +0100, Daniel Näslund wrote:
> > > Index: subversion/libsvn_subr/stream.c
> > > ===================================================================
> > > --- subversion/libsvn_subr/stream.c       (revision 881392)
> > > > +++ subversion/libsvn_subr/stream.c       (working copy)
> > > @@ -1347,3 +1347,44 @@
> > >  
> > >    return SVN_NO_ERROR;
> > >  }
> > > +
> > > +svn_error_t *
> > > +svn_stream_detect_binary_mimetype(const char **mimetype,
> > > +                                  svn_stream_t *stream)
> > > +{
> > > +  static const char * const generic_binary = "application/octet-stream";
> > > +  char block[1024];
> > > +  apr_size_t amt_read = sizeof(block);
> > > +
> > > +  /* Default return value is NULL. */
> > > +  *mimetype = NULL;
> > > +
> > > +  SVN_ERR(svn_stream_read(stream, block, &amt_read));
> > > +
> > > +  if (amt_read > 0)
> > > +    {
> > > +      apr_size_t i;
> > > +      apr_size_t binary_count = 0;
> > > +
> > > +      for (i = 0; i < amt_read; i++)
> > > +        {
> > > +          if (block[i] == 0)
> > > +            {
> > > +              binary_count = amt_read;
> > > +              break;
> > > +            }
> > > +          if ((block[i] < 0x07)
> > > +              || ((block[i] > 0x0D) && (block[i] < 0x20))
> > > +              || (block[i] > 0x7F))
> > > +            {
> > > +              binary_count++;
> > > +            }
> > 
> > > Unless I'm mistaken, the "greater than 0x7F" check will trigger on *every*
> > > byte of a multi-byte UTF-8 sequence, lead and continuation bytes alike.
> > > See http://tools.ietf.org/html/rfc3629#section-3
> 
> > Yes, it will, and this code is used for all the autoprops stuff! But
> > the strange results have been hidden by the fact that the detection code
> > first checks file extensions. That's my guess at least. A Japanese text
> > would be considered binary!
>
> > As I say further down, I have only duplicated a part of
> > svn_io_detect_mimetype2() and intend to refactor this part into a helper
> > function in libsvn_subr.

Using libmagic is possible, at least from a legal point of view.
It has a very simple 2-clause BSD-style license so we could link to it.

Stefan
