Return RDFa meta tags via Metadata
----------------------------------

                 Key: TIKA-728
                 URL: https://issues.apache.org/jira/browse/TIKA-728
             Project: Tika
          Issue Type: Improvement
            Reporter: Ken Krugler
            Assignee: Ken Krugler
            Priority: Minor


Open Graph <meta> tags are currently stripped from the emitted XHTML, and they 
also aren't put into the metadata map.

The reason is that Open Graph uses RDFa:

http://stackoverflow.com/questions/2704942/html-validation-error-for-property-attribute/2705090#2705090

Since <meta property="xxx" content="yyy" /> isn't valid for XHTML 1.0, these 
tags can't be emitted.

We could take a tag like:

<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />

and put it into the metadata map as "og:url" => 
"http://www.imdb.com/title/tt0117500/"
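
The mapping proposed above could look roughly like the following. This is 
only a sketch under stated assumptions, not Tika's actual handler code: the 
class name RdfaMetaSketch and the method recordRdfaMeta are hypothetical, and 
a plain java.util.Map stands in for Tika's metadata map. It assumes the 
attributes of each <meta> element arrive as a SAX Attributes object.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch; the names here are illustrative and not part of Tika.
final class RdfaMetaSketch {

    // Copies an RDFa-style <meta property="..." content="..."> pair into the
    // given map (a stand-in for the metadata map). Returns true if an entry
    // was recorded, false if either attribute is missing.
    static boolean recordRdfaMeta(Attributes atts, Map<String, String> metadata) {
        String property = atts.getValue("property");
        String content = atts.getValue("content");
        if (property == null || content == null) {
            return false; // not an RDFa-style meta tag, nothing to record
        }
        metadata.put(property, content);
        return true;
    }

    public static void main(String[] args) {
        // Build the attributes of the example tag from this issue.
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "property", "property", "CDATA", "og:url");
        atts.addAttribute("", "content", "content", "CDATA",
                "http://www.imdb.com/title/tt0117500/");

        Map<String, String> metadata = new LinkedHashMap<>();
        recordRdfaMeta(atts, metadata);
        System.out.println(metadata);
    }
}
```

The tag itself still would not be emitted in the XHTML output; only the 
property/content pair is kept, keyed by the property name.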


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
