On Tue, Nov 19, 2013 at 10:32 AM, Bill Moseley <[email protected]> wrote:
> Anyone aware of a good, portable way in Perl to encode the filename in a
> Content-Disposition header? I would like to support UTF8 filenames, but
> support in browsers is unclear (if not changing).
>
> Is this complexity something that the Catalyst framework should handle?
> It's one of those areas where it's easy to get wrong (I can see many
> different approaches in our own code).
>
> http://greenbytes.de/tech/tc2231/
>
> http://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http
>

I have no idea what the client can accept or what its OS uses as a
path-separator, and I don't want to go down the client-sniffing path,
anyway.

I have a user-supplied character string that I want to use as the filename,
which I have to assume can contain any Unicode character since it's
user-supplied data.

From my limited tests it seems most modern browsers are supporting the
"filename*" extension. Each browser does some special handling (like
replacing the path-separator, or adding a file extension based on
content-type if no file extension is in the filename).

All I want to do is make valid HTTP headers and let the client decide how to
handle it, but also provide a usable filename (not just underscores, for
example).

So, all I'm after is to make this a valid header:

    $c->res->header(
        content_disposition =>
            qq[attachment; filename="$ascii_file"; filename*=UTF-8''$utf8_file]
    );

The filename* is easy, I'm finding:

    my $utf8_file = uri_escape( Encode::encode( 'UTF-8' => $filename ) );

But the $ascii_file is a bit more work. Percent-encoding doesn't work. So,
have to do a bit of filtering.

See any easier/cleaner/more-correct approach? When I see this much code I
tend to think it's the wrong approach.

    # Convert to ASCII using underscore as replacement
    my $ascii_file = Encode::encode( ascii => $filename, sub { '_' } );

    # Remove quotes as we want to use quoted form of "filename" and
    # preserve whitespace.
    $ascii_file =~ s/"/_/g;

    # Replace non-printable characters with underscore, and collapse dups
    $ascii_file =~ s/[^[:print:]]/_/g;
    $ascii_file =~ s/_{2,}/_/g;

    # Split off the extension so can check length of filename w/o extension.
    # Of course, $ext could end up as dot + underscore.
    my ( $base, $ext ) = split /(\.\w+)$/, $ascii_file;

    # Use default filename if we don't have more than three "meaningful"
    # characters. Very subjective.
    $base = 'your_file' unless ( () = $base =~ /[A-Za-z0-9]/g ) > 3;

    # Stuff the extension back on.
    $ascii_file = $base;
    $ascii_file .= $ext if defined $ext;

Again, "filename*" support is good, and I'm not trying to prevent buggy
clients from doing something stupid (e.g. filename=/etc/passwd), but want to
provide a reasonable fallback to "filename".

Perhaps the simple solution is to always use "filename=your_file" and hope
most clients use the filename* extension.

-- 
Bill Moseley
[email protected]
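For reference, the two parameters discussed above can be built in a single helper. What follows is only a minimal sketch under stated assumptions, not a Catalyst API: the function name content_disposition_for is hypothetical, and the percent-encoding is done by hand against the RFC 5987 attr-char set instead of calling uri_escape, so that characters such as the apostrophe, parentheses and asterisk are always escaped. It also replaces the backslash in the ASCII fallback, since a backslash acts as an escape character inside an HTTP quoted-string. The "your_file" default for short names is left out to keep the sketch short.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (an assumption for illustration, not part of Catalyst):
# returns a Content-Disposition value with an ASCII "filename" fallback and
# an RFC 5987 "filename*" parameter.
sub content_disposition_for {
    my ($filename) = @_;    # expected to be a decoded Perl character string

    # filename*: encode to UTF-8 bytes, then percent-encode every byte that
    # is not in the RFC 5987 attr-char set.
    my $bytes = Encode::encode( 'UTF-8', $filename );
    ( my $ext_value = $bytes )
        =~ s/([^A-Za-z0-9!#\$&+\-.^_`|~])/sprintf '%%%02X', ord $1/ge;

    # filename: ASCII-only fallback, with an underscore for each character
    # that ASCII cannot represent.
    my $ascii = Encode::encode( 'ascii', $filename, sub { '_' } );
    $ascii =~ s/["\\]/_/g;           # both are special inside a quoted-string
    $ascii =~ s/[^\x20-\x7E]/_/g;    # control characters and DEL
    $ascii =~ s/_{2,}/_/g;           # collapse runs of underscores

    return qq[attachment; filename="$ascii"; filename*=UTF-8''$ext_value];
}

print content_disposition_for("r\x{e9}sum\x{e9}.pdf"), "\n";
```

The print line should emit: attachment; filename="r_sum_.pdf"; filename*=UTF-8''r%C3%A9sum%C3%A9.pdf

In a Catalyst action the result could then be passed as $c->res->header( content_disposition => content_disposition_for($filename) ), matching the header call shown in the message.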
