Gerfried Fuchs said:
> * Andrew Shugg <[EMAIL PROTECTED]> [2003-03-23 00:23]:
> > Firstly, I know it was alfie's preference to have only the first paragraph
> > (as bounded by <p>...</p>) of the DSA in the RDF file, but this doesn't
> > always make good sense.  Take for example DSA 265, where there is a </p>
> > right before the actual interesting bit that you'd actually want to read.
>
> The intention of the first paragraph is to get an overview of the DSA.
> If one wants to read more one can always come back to the website with
> the included link.

Yes, I understand the reason for doing it that way ... I would just like
to have the option of being able to read the entire article in RSS.  =)
Maybe we could have 'dsa', 'dsa-long' and 'dsa-full'?  And so to each
their own?

> I like to disagree. RSS was always meant to just get the people
> interested, IMHO. The facility to link to the full article isn't there
> just for fun.

I'm coming at it from the same perspective as reading a blog in an RSS
reader.  It's lighter, faster, and doesn't need a web browser open.  It
can be used as an alternative transport to HTML for textual information,
not just a simple headline fetch method.

> > Secondly, as far as I know the HTML tags being used in the DSA wml files
> > are not valid in RDF.  I've looked through the W3C docs on RDF and can't
> > find anything that says HTML is allowed in <description> containers.  It
> > might work in straw but doesn't in NNWL.  So I think HTML tags should be
> > removed from the RDF format.
>
> *hmm* Interesting. Then many of the other sites that I have taken a
> look at have similar problems, like e.g. advogato.

I've worked this one out, and at least this one's easy: the HTML tags in
the RDF need to be encoded as HTML entities, i.e. <p> -> &lt;p&gt;

> Don't get me wrong, I'm not fully against your suggestions, at least I
> am not unconvincible. If some others speak up that your changes are a
> good thing I'm willing to change my mind about it.

Oh, I expect I'm completely on my own here.  =)  But if you can at least
fix the entity encodings in $moreinfo I would be very grateful: I've
confirmed that s#<(/?\w+)>#&lt;$1&gt;#g on that line results in a 'valid'
RSS feed according to the online RSS validator:

  http://feeds.archive.org/validator/

(And more importantly the document renders properly in my RSS reader!)

The Perl s// pattern I've used above may not be flexible enough (it only
matches bare tags without attributes), or not handle 8-bit stuff
properly, but there are better ways.
You could use the SGML::ISO8859::str2sgml() or
HTML::Entities::encode_entities() functions to achieve this, for example.

Thanks,

Andrew.

-- 
Andrew Shugg <[EMAIL PROTECTED]>                   http://www.neep.com.au/

"Just remember, Mr Fawlty, there's always someone worse off than yourself."
"Is there?  Well I'd like to meet him.  I could do with a good laugh."
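
For illustration, here is a rough sketch of the entity-encoding fix discussed above. It is not the actual DSA-to-RDF script: the sub name escape_markup and the sample string are made up for this example. It uses only core Perl and escapes the three markup-significant characters in one pass, so tags with attributes and stray ampersands are covered too, which the tag-only s// pattern would miss.

```perl
use strict;
use warnings;

# Map each markup-significant character to its entity.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
);

# Hypothetical helper: escape text destined for an RSS <description>.
# A single substitution pass means '&' is never escaped twice.
sub escape_markup {
    my ($text) = @_;
    $text =~ s/([&<>])/$ENTITY{$1}/g;
    return $text;
}

# Made-up sample standing in for the $moreinfo content.
my $moreinfo = '<p>Fixes a <b>buffer overflow</b> & more.</p>';
print escape_markup($moreinfo), "\n";
# prints: &lt;p&gt;Fixes a &lt;b&gt;buffer overflow&lt;/b&gt; &amp; more.&lt;/p&gt;
```

If the non-core HTML::Entities module is installed, encode_entities($moreinfo, '<>&') should give the same result for these three characters; called without the second argument it also encodes 8-bit characters.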