Thanks for the responses,

Let me reiterate the Product requirements:

   1. Support for a syntax and vocabulary already in wide use on the web to
   allow the creation of cards for the largest possible volume of existing
   pinnable content
   2. Support for a syntax with a large enough and/or extensible vocabulary
   to allow cards to be created for all the types of pinnable content and
   associated actions we need in Gaia

We need to deliver this by B2G 2.5 FL in September.

*Existing Web Content*
I think we're agreed that Open Graph gives us enough of a minimum viable
product for the first requirement. However, it's not OK to just hard-code
particular og types into Gecko. We need to be able to experiment with cards
for lots of different Open Graph types without having to modify Gecko every
time (imagine system app addons with experimental card packs).

Open Graph is just meta tags and we already have a mechanism for detecting
specific meta tags in Gaia - the metachange event on the Browser API. As a
minimum all we need to do to access Open Graph meta tags is to extend this
event to include all meta tags with a "property" attribute, which in
practice is mainly used by Open Graph. We could go a step further and
extend the event to all
meta tags, which would also give us access to Twitter card markup for
example, but that isn't essential. We do not need an RDFa parser for this;
we can filter/clean up the data in the system app in Gaia where necessary
(the system app is widely regarded to be part of the platform itself).
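
As a rough illustration of the kind of filtering the system app could do, here is a minimal TypeScript sketch. The MetaTag shape and the extractOpenGraph name are assumptions made for this example only; the real metachange event detail may look different.

```typescript
// Hypothetical shape of a meta tag as the system app might receive it.
// This is an assumption for illustration, not the actual event detail.
interface MetaTag {
  name?: string;
  property?: string;
  content: string;
}

// Collect Open Graph properties (those whose "property" starts with "og:")
// into a plain object keyed by the part after the prefix.
// The first value seen for a key wins; later duplicates are ignored.
function extractOpenGraph(tags: MetaTag[]): Record<string, string> {
  const card: Record<string, string> = {};
  for (const tag of tags) {
    const property = tag.property;
    if (property === undefined || property.indexOf("og:") !== 0) {
      continue; // not an Open Graph tag
    }
    const key = property.slice(3);
    if (key.length > 0 && !Object.prototype.hasOwnProperty.call(card, key)) {
      card[key] = tag.content.trim();
    }
  }
  return card;
}
```

Because the filtering lives in Gaia, supporting a new og type would only mean changing code like this, not Gecko.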

*Gaia Content*

Open Graph does not have a large enough vocabulary, or (as Kelly says) the
ability to associate actions with content, needed for the second
requirement. Schema.org has a large existing vocabulary which basically
fulfils these use cases, though some parts are more tested than others,
with examples given in Microdata, RDFa and JSON-LD syntaxes, eg:

   - Contact - http://schema.org/Person
   - Event - http://schema.org/Event
   - Photo - http://schema.org/Photograph
   - Song - http://schema.org/MusicRecording
   - Video - http://schema.org/VideoObject
   - Radio station - http://schema.org/RadioChannel
   - Email - http://schema.org/EmailMessage
   - Message - http://schema.org/Comment

Schema.org also provides existing schemas for actions associated with items
(https://schema.org/docs/actions.html), although examples are only given in
JSON-LD syntax. Schema.org is just a vocabulary and Tantek tells me it's
theoretically possible to express this vocabulary in Microformats syntax
too - it's possible to create new vendor prefixed types, or suggest new
standard types to be added to the Microformats wiki. This would be required
because Microformats does not have a big enough existing vocabulary for
Gaia's needs. Microdata, RDFa and JSON-LD use URL namespaces so are
extensible by design with a non-centralised vocabulary (this is seen as a
strength by some, as a weakness by others).
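
To make the actions point a little more concrete, here is a small TypeScript sketch of reading the schema.org potentialAction property from a JSON-LD item. The function name is hypothetical, and the sketch assumes the item is a single top-level JSON object; it does no @context processing.

```typescript
// List the @type of each action attached to a JSON-LD item via
// schema.org's potentialAction property, which may hold one action
// object or an array of them. Throws if the payload is not valid JSON.
function potentialActionTypes(payload: string): string[] {
  const parsed: unknown = JSON.parse(payload);
  if (typeof parsed !== "object" || parsed === null) {
    return [];
  }
  const raw = (parsed as Record<string, unknown>)["potentialAction"];
  if (raw === undefined || raw === null) {
    return [];
  }
  const actions: unknown[] = Array.isArray(raw) ? raw : [raw];
  const types: string[] = [];
  for (const action of actions) {
    if (typeof action === "object" && action !== null) {
      const actionType = (action as Record<string, unknown>)["@type"];
      if (typeof actionType === "string") {
        types.push(actionType);
      }
    }
  }
  return types;
}
```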

The data we have [1][2][3][4] shows that Microdata, then RDFa (sometimes
considered to include Open Graph), is used by the most pinnable content on
the web, but the data does not include all modern Microformats. We also
don't have any data for JSON-LD usage. However, existing usage is not the
most important criterion for the second requirement. What matters more is
how well a format fits the more complex use cases in Gaia (and how much
work it is to implement).

There is resistance to implementing a full Microdata or RDFa parser in
Gecko due to its complexity. JSON-LD is more self-contained by design (for
better or worse) and could be handed over to the Gaia system app directly
via the Browser API without any parsing in Gecko. Microformats is possibly
less Gecko work to implement than Microdata or RDFa, but more than JSON-LD.

*Conclusions*

My conclusion is that the least required work in Gecko for the highest
return would be:

   1. *Open Graph* (bug 1178484) - Extending the existing metachange
   Browser API event to include all meta tags with a "property" attribute.
   This would allow Gaia to add support for all of the Open Graph types,
   fulfilling requirement 1.
   2. *JSON-LD* (bug 1178491) - Adding a linkeddatachange event to the
   Browser API which is dispatched by Gecko whenever it encounters a script
   tag with a type of "application/ld+json" (as per the W3C recommendation
   [5]), including the JSON content in the payload of the event. This would
   allow the Gaia system app to support existing schema.org schemas
   (including actions) with the least amount of work in Gecko, and the data
   would already be in a JSON format that it can store directly in the
   Places database (DataStore/IndexedDB).

Kan-Ru is the owner of the Browser API module in Gecko and has said he's
happy with this approach and is happy to review the code. Let's go ahead
with that now, unblocking the work on the Gaia side. (Note that I have no
intention of building a full RDF-style parser in Gaia. We'll just extract
the data we need from the JSON, for the good reasons that Marcos refers to.)
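
A minimal sketch of what "just extract the data we need" could mean in practice, written in TypeScript. The CardData shape and the cardFromJsonLd name are assumptions for illustration, and the payload is assumed to be the text content of an "application/ld+json" script holding a single top-level object.

```typescript
// The few fields this sketch cares about; real schema.org items carry more.
interface CardData {
  type: string;
  name: string;
  url?: string;
}

// Parse the JSON-LD text and pull out a handful of top-level fields.
// Returns null for malformed JSON, for non-object payloads, or when
// @type or name is missing. No @context resolution or graph expansion
// is attempted, which is the point: this is not an RDF parser.
function cardFromJsonLd(payload: string): CardData | null {
  let data: unknown;
  try {
    data = JSON.parse(payload);
  } catch (e) {
    return null;
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return null;
  }
  const item = data as Record<string, unknown>;
  const itemType = item["@type"];
  const itemName = item["name"];
  if (typeof itemType !== "string" || typeof itemName !== "string") {
    return null;
  }
  const card: CardData = { type: itemType, name: itemName };
  const itemUrl = item["url"];
  if (typeof itemUrl === "string") {
    card.url = itemUrl;
  }
  return card;
}
```

The resulting object is plain JSON, so it could be stored as-is alongside other Places data.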

It's clear that there's not a consensus amongst everyone that JSON-LD is
the best format for Mozilla to promote for structured data on the web going
forward. For example, some people would prefer to promote Microformats. I
would suggest that they go ahead with implementing Microformats in Gecko
and we can use it in Gaia when it's ready. I would recommend exposing it to
Gaia via a getStructuredData() method on the Browser API (bug 1169634)
which returns a Promise which resolves with the canonical JSON
representation of any Microformats data present in a document. This will
then allow us to add the necessary support in the Gaia system app. (When
implementing this it might also make sense to hook it up to the Open Graph
and JSON-LD support to create a single API with support for multiple
formats).

Then I suggest we leave it up to Gaia developers (who are now web content
authors!) whether they want to use Open Graph, JSON-LD, Microformats, or
any other format which gets supported in Gecko, for their content. We can
benefit from their experience of which formats best support their use cases
in decision making going forward.

In the meantime, given our tight schedule, I would be grateful if we could
avoid blocking the Gaia work on the implementation of Microformats or on
any further discussion of which formats we'd like to promote going forward.

Thanks for everyone's input on this so far. I hope we can now get to work.

Ben


   1. Web Data Commons website based on Common Crawl corpus (2009-2014)
   http://webdatacommons.org/
   2. Web Data Commons Paper based on Common Crawl Corpus (2009-2012)
   http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-2.pdf
   3. Yahoo post based on Yahoo corpus (2011)
   https://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/
   4. Yahoo paper based on Bing corpus (2012)
   http://events.linkeddata.org/ldow2012/papers/ldow2012-inv-paper-1.pdf
   5. http://www.w3.org/TR/json-ld/#embedding-json-ld-in-html-documents





On 29 June 2015 at 11:19, <kda...@mozilla.com> wrote:

> Let me start by saying I don't care which format we use. (Formats come,
> and formats go.) I do care, however, that my use case is supported.
>
> My use case, speech enabling web apps and web pages for Firefox OS's voice
> assistant Vaani, requires that the chosen format support something akin to
> schema.org's actions[1] as well as the ability for anyone to add custom
> actions. This use case is also required by the Taipei team working on the
> Firefox OS TV.
>
> Open Graph[2] does not support such actions. Thus, it is not sufficient
> for our use case. (Facebook extended Open Graph with actions[3]. However,
> the set of valid actions is completely under Facebook's control which makes
> their Open Graph extension a non-starter.)
>
> Microdata[4], RDFa[5], and JSON-LD[6] do support actions. Hence, support
> for at least one of these is sufficient for our use case.
>
> Microformats[7] currently does not support actions. Hence, it is not
> sufficient for our use case.
>
> The Vaani team and the Taipei team working on the Firefox OS TV would love
> to base our work on that being done for pinning the web. (One of the 3
> virtues of a programmer *is* laziness.) However, if neither Microdata,
> RDFa, nor JSON-LD is supported, we will, unfortunately, be forced to go our
> own way.
>
> [1] http://schema.org/Action
> [2] http://ogp.me/
> [3] https://developers.facebook.com/docs/sharing/opengraph/using-actions
> [4] http://www.w3.org/TR/microdata/
> [5] http://www.w3.org/TR/xhtml-rdfa-primer/
> [6] http://www.w3.org/TR/json-ld/
> [7] http://microformats.org/wiki/Main_Page
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>