On Wed, Jan 05, 2005 at 04:32:07PM -0500, William Ballard wrote:
> echo '</Long-Description></entry></packages>'
^^^
You should have closed the CDATA section here. The short description tag should probably be wrapped in CDATA too. If any package description contains "]]>", the resulting XML will break.
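One way around the "]]>" problem is to split the sequence across two CDATA sections. This is only a rough sketch and not part of the attached script: escape_cdata is a hypothetical helper name, and it assumes the text is piped through it before being wrapped in <![CDATA[ ... ]]>.

```shell
#!/bin/bash
# Hypothetical helper (not in the attached script): rewrite every "]]>" so
# that the first CDATA section ends after "]]" and a new one starts before ">".
escape_cdata() {
    sed -e 's/]]>/]]]]><![CDATA[>/g'
}

# Usage: filter a description line before wrapping it in a CDATA section.
printf '%s\n' 'a ]]> b' | escape_cdata
# prints: a ]]]]><![CDATA[> b
```

Once wrapped, a parser sees two adjacent CDATA sections ("a ]]" and "> b") and joins them back into the original text.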
I was able to successfully turn the sarge/contrib (i386) Packages file into a valid XML file with the following modified version of your script. It is still definitely a hack, though, especially the way it escapes non-ASCII characters.
Since it contains a few long lines, I attached it. It's under 1k in size.
hth
cu, sven
#!/bin/bash
PACKAGES=$1
CAT=cat

if [[ ! -f ${PACKAGES} ]]; then
    echo "${PACKAGES} not found"
    exit 1
fi

# Use zcat if the Packages file is gzip-compressed
if file "${PACKAGES}" | grep -q gzip; then
    CAT=zcat
fi

echo '<packages><entry>'
# Escape XML metacharacters and a few non-ASCII characters, then wrap
# each field in a tag named after it, with the value in a CDATA section.
${CAT} "${PACKAGES}" \
    | grep-dctrl . \
    | sed -r \
        -e 's/&/\&amp;/g;s/</\&lt;/g;s/>/\&gt;/g;s/ñ/\&#241;/g;s/é/\&#233;/g;s/í/\&#237;/g' \
        -e 's/(Description): (.+)/<\1><Short-Description>\2<\/Short-Description><Long-Description><![CDATA[/' \
        -e 's/^([^: ]+): (.+)/<\1><![CDATA[\2]]><\/\1>/' \
        -e 's/^$/]]><\/Long-Description><\/Description><\/entry><entry>/' \
    | head -n-1
echo ']]></Long-Description></Description></entry></packages>'