[EMAIL PROTECTED]
| When right-clicking a file (for example, a PDF) on Windows, one normally
| gets a dialog with three or four tabs. Clicking on the Summary
| tab, one can get some info like "title", "Author", "category",
| "keyword"
| etc.

[warning: not my area of expertise]

That information's held in NTFS Alternate Data Streams (ADS). If you search around with terms like

  NTFS (ADS OR "Alternate Data Streams")

you'll see a whole raft of info on the subject. In MS Office files (and other OLE documents) the information is held inside the document itself, in what's called Structured Storage. I've got a bit of a wrapper round it in my winshell module, which you could either use directly or simply take as the starting point for what you're after:

  http://timgolden.me.uk/python/winshell.html

or else just search for OLE Structured Storage.

Python can read ADS like ordinary files; simply use the alternate data stream colon syntax when you open a file, in binary mode since the stream isn't text:

  info = open ("temp.pdf:\x05SummaryInformation", "rb").read ()

but you have to know what to do with the data once you've got it.

Sorry, don't have time to play with it right now; hopefully someone more knowledgeable can chip in.

TJG
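
For anyone wanting to try the colon syntax, here is a minimal sketch of reading the summary stream defensively. The helper names stream_path and read_stream are made up for illustration and are not part of winshell or any other library. It assumes Windows with the file on an NTFS volume; anywhere else, or when the file has no such stream, the open simply fails and the helper returns None. What comes back is the raw binary property set, which still has to be decoded separately.

```python
# Name of the summary stream; the leading \x05 byte is part of the name.
SUMMARY_STREAM = "\x05SummaryInformation"


def stream_path(filename, stream_name=SUMMARY_STREAM):
    """Build the 'file:stream' path that NTFS uses for an alternate data stream."""
    return "%s:%s" % (filename, stream_name)


def read_stream(filename, stream_name=SUMMARY_STREAM):
    """Return the raw bytes of the named stream, or None if it can't be opened.

    Opening fails (and None is returned) when the file or stream doesn't
    exist, or when the filesystem has no alternate data streams.
    """
    try:
        # Binary mode: the stream holds a binary property set, not text.
        with open(stream_path(filename, stream_name), "rb") as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    data = read_stream("temp.pdf")
    if data is None:
        print("no summary stream found")
    else:
        print("%d bytes of raw property data" % len(data))
```

Returning None rather than raising keeps the "this file just has no summary info" case, which is the common one, out of the caller's exception handling.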