Greetings.

On 2013 Sep 17, at 14:51, Jay McCarthy <jay.mccar...@gmail.com> wrote:

> I think it's an obvious request, but a character flaw of mine is not
> doing things unless they can be done really good. In this case, I see
> a hash table as a "parse" of the headers. It's not obvious to me how
> to parse them. For example...

I don't think that all of these things are necessary, or even obviously 
desirable, for a 'parse of the headers'.  What I would expect from a 
parse-headers function would be something that makes available the collection 
of headers in a convenient way -- that 'lifts it off the wire', if you like -- 
but I wouldn't expect much more.  Specifically, if I want to deal with 
semantics (perhaps involving the implications of repeated headers), or if I 
want to parse the _values_ of headers, I would expect these to be logically 
separate operations.

That doesn't rule out providing added-value functions which do the extra work, 
possibly composed with a core parse-headers function, but a thorough (meaning 
robust and compliant) implementation of the core part of the job should, I think, 
be regarded as 'done well', even if there's a larger bells-and-whistles job 
that one can envisage.

In particular...

> - The same header can appear many times, so (Key -> Value) is
> incorrect, unless you overwrite one. It would be better to have (Key
> -> (Listof Value)) but that feels really ugly since most of the time
> there will just be one

A core function which returned an alist of the headers, combined with an 
alist->hash function which (say) concatenated the values of repeated headers, 
would do the majority of the work in the majority of cases.
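
A minimal sketch of the second half of that combination, assuming the core 
parser has already produced an alist of (symbol . string) pairs.  The name 
headers-alist->hash is hypothetical, not an existing library function, and 
joining repeated values with ", " is just one possible policy (it follows the 
comma-separated-list rule in RFC 2616 section 4.2):

```racket
#lang racket/base

;; Hypothetical helper, not part of any existing Racket library.
;; Folds an alist of (symbol . string) headers into an immutable hash,
;; joining the values of repeated headers with ", " in order of appearance.
(define (headers-alist->hash headers)
  (for/fold ([h (hasheq)])
            ([p (in-list headers)])
    (define name (car p))
    (define value (cdr p))
    (hash-set h
              name
              (if (hash-has-key? h name)
                  (string-append (hash-ref h name) ", " value)
                  value))))
```

Given '((accept . "text/html") (host . "example.org") (accept . "text/plain")), 
this maps 'accept to "text/html, text/plain" and 'host to "example.org".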

I as a user would not grumble at having to do this extra bit of work for those 
headers which needed it.

(incidentally, I don't recall any HTTP headers which can meaningfully be 
repeated, and a quick scan of RFC 2616 doesn't remind me of any -- which ones 
are these?)

> - The spec doesn't mandate case sensitivity on headers, so I would
> need to canonicalize "ACCept-ENCodiNG" to something else. Maybe
> 'Accept-Encoding?

Canonicalising them all to symbols (that is 'accept-encoding and friends) would 
make this nicely apparent to users.
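
As a sketch of what I mean, assuming the header name arrives as a string 
exactly as it appeared on the wire (header-name->symbol is a hypothetical 
name, made up for illustration):

```racket
#lang racket/base

;; Hypothetical helper: canonicalise a header name by lower-casing it
;; and interning the result as a symbol, so names that differ only in
;; case become the same symbol and can be compared with eq?.
(define (header-name->symbol name)
  (string->symbol (string-downcase name)))
```

With this, (header-name->symbol "ACCept-ENCodiNG") and 
(header-name->symbol "Accept-Encoding") both give 'accept-encoding.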

> - The value of many headers is not really a string. For instance,
> Content-Length is a number, Cache-Control is an association list,
> Content-Disposition is complicated, etc. I feel like it is
> disingenuous to only partial parse.

A (Listof (Cons symbol? string?)) would be fine as the return for most cases, 
leaving the parse of the value for some separate function.  For example 
get-content-length with (-> (Listof (Cons symbol? string?)) number?).
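
A rough sketch of such an accessor, under two assumptions of mine: that the 
header names have already been canonicalised to lower-case symbols, and that 
the values have had surrounding whitespace trimmed.  Unlike the signature 
above, this version returns #f rather than a number when the header is absent 
or malformed, which is a design choice and not the only reasonable one:

```racket
#lang racket/base

;; Hypothetical accessor over an alist of (symbol . string) headers.
;; Returns the Content-Length as an exact non-negative integer, or #f
;; if the header is missing or its value is not a run of decimal digits.
(define (get-content-length headers)
  (define entry (assq 'content-length headers))
  (and entry
       (regexp-match? #rx"^[0-9]+$" (cdr entry))
       (string->number (cdr entry))))
```

The regexp check comes first because string->number on its own would also 
accept things like "1e3" or "#x10", which are not valid here.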

That's especially true for values which themselves have complicated syntaxes.  
I've twice had to parse the rather intricate syntax of the Accept header, and I 
think I re-did it the second time, rather than re-use my first attempt at it, 
because the question I wanted to ask of the header was different.  If there had 
been a parse-accept-header-content function, I would doubtless have used it, 
but I'm not sure how much real thought it would have saved me.

Or in other words, there's not necessarily a unique best way to parse the more 
intricate header values.

(I'd be happy to share that Accept parse code if that would be useful).

----

I'm jumping into the middle of this thread, so apologies if this has already 
been covered.

Best wishes,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK


____________________
  Racket Users list:
  http://lists.racket-lang.org/users
