As far as I understand your proposal, big +1 on this from Berkeley. We could really use this functionality.

Thanks!
Michelle

On 1/30/13 12:51 AM, Tobias Wunden wrote:
Looking at the current list of bugs in Jira, it becomes obvious that a number 
of them are centered around archival. One problem is that the archive service, 
unless correctly configured, will try to archive things that shouldn't be 
archived, such as the files that have been copied to the streaming server 
and feature an rtmp:// URL. But to allow for retraction, some elements--namely 
those that represent distributions--have to be retained in the media package 
even though their content is not subject to archival.

In addition, it is currently very difficult for the archive to figure out whether a 
media package has been distributed, and if so, to which distribution channels. 
In order to move Matterhorn towards a media management system, it is 
indispensable to make this clearly determinable.

Another goal of this proposal is to unify the concept of distribution. In the 
current situation Engage is handled quite differently than e.g. YouTube where 
separate steps for copying and publishing to search have to be done. The goal 
is to have a 1:1 relationship between a distribution channel, its distribution 
service and associated workflow operation handler and the corresponding entry 
in the media package.

PROBLEM

The reason for these problems is that we are currently using a concept called 
"derived tracks" to detect distribution status. For example, if there is a 
track that is derived from the source track (the ingested media) and whose URL starts 
with the download URL, then we know that the media package has been distributed to the 
download server. This is not ideal for many reasons, the most prominent being:

- the download artifact may be used by different presentations of the 
mediapackage, e.g. the Media Module, a video repository connected through 
OAI-PMH metadata harvesting, RSS/ATOM feeds etc. So what does retracting this 
mediapackage really mean, or what does it mean for the representations if the 
media is removed from the download server?

- the admin UI, which wants to provide the administrator with the link to the "final 
product" (i.e. the representation of the media package on the Engage UI or on 
YouTube), needs to have in-depth knowledge about these representations. For example, it 
needs to know that YouTube tracks have a URL starting with youtube.com, so it would 
determine distribution status to YouTube by going through the mediapackage and looking at 
all tracks to find one with a matching URL.
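
The URL-matching heuristic described in these two points can be sketched roughly as follows. This is only an illustration of the idea, not Matterhorn code: the function name `guess_channels`, the prefix table and the channel names are assumptions made up for this example.

```python
# Hypothetical sketch of the "derived tracks" URL-prefix heuristic.
# The prefixes and channel names below are illustrative assumptions,
# not actual Matterhorn configuration.
CHANNEL_URL_PREFIXES = {
    "download": "http://downloads.myinstitution.edu/",
    "youtube": "http://www.youtube.com/",
}


def guess_channels(track_urls):
    """Return the sorted channels whose URL prefix matches any track URL."""
    found = set()
    for url in track_urls:
        for channel, prefix in CHANNEL_URL_PREFIXES.items():
            if url.startswith(prefix):
                found.add(channel)
    return sorted(found)
```

Every consumer of the mediapackage would have to carry a table like `CHANNEL_URL_PREFIXES` and keep it in sync with the distribution services, which is the coupling the proposal wants to remove.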

PROPOSED SOLUTION

We are proposing a solution to all these problems that allows Matterhorn to 
indicate to the administrator which channels a certain MediaPackage has been 
distributed to without the need for the admin UI to have knowledge about 
specific track properties for a given distribution channel.

A new element called "<presentation>" is introduced to the mediapackage. It 
identifies the distribution channel as well as the URL that is used to consume the distributed 
artifact. This URL can point to e.g. a web page with the embedded video in the case of channels 
like Engage/Player or YouTube, or to a feed URL in the case of an RSS feed.

Which elements have actually been used to make up a distribution, and how to keep 
track of them in order to allow for retraction, now lies completely within the 
responsibility of the distribution channel. To support some simple data storage 
right in the media package, the new element features a simple key/value 
dictionary. These key/value pairs may also be used to implement efficient 
storage and retraction strategies.

<mediapackage>
  ...
  <presentations>

    <presentation id="p-1" channel="youtube">
      <uri>http://www.youtube.com/watch?v=D1R-jKKp3NA</uri>
      <mimetype>text/html</mimetype>
      <!-- the dictionary is freely managed by the distribution channel and may
           take arbitrary key/value data -->
      <dict>
        <value key="access-token">D1R-jKKp3NA</value>
      </dict>
    </presentation>

    <presentation id="p-2" channel="engage">
      <uri>http://downloads.myinstitution.edu/engage/ui/watch.html?id=123123s</uri>
      <mimetype>text/html</mimetype>
    </presentation>

    <presentation id="p-3" channel="feeds">
      <uri>http://downloads.myinstitution.edu/feeds/entries/342345</uri>
      <mimetype>application/rss+xml</mimetype>
    </presentation>

  </presentations>
  ...
</mediapackage>

By adding this element to a media package, it would immediately be obvious to 
which channels it has been distributed (and how it can be reached in each 
channel). Rather than creating data structures in the mediapackage and 
using those to derive the distribution status and guess the actual 
representation, the entry points into consumption of the mediapackage are 
clearly defined.
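
As a rough illustration of how a consumer such as the admin UI might read these entry points, here is a minimal sketch. It assumes the element layout from the example above and a mediapackage without an XML namespace (real mediapackages may declare one); the function name `list_presentations` and the shortened `SAMPLE` document are made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the proposal's example, assumed to have no XML namespace.
SAMPLE = """<mediapackage>
  <presentations>
    <presentation id="p-1" channel="youtube">
      <uri>http://www.youtube.com/watch?v=D1R-jKKp3NA</uri>
      <mimetype>text/html</mimetype>
      <dict>
        <value key="access-token">D1R-jKKp3NA</value>
      </dict>
    </presentation>
    <presentation id="p-3" channel="feeds">
      <uri>http://downloads.myinstitution.edu/feeds/entries/342345</uri>
      <mimetype>application/rss+xml</mimetype>
    </presentation>
  </presentations>
</mediapackage>"""


def list_presentations(xml_text):
    """Return one dict per <presentation>, in document order."""
    root = ET.fromstring(xml_text)
    result = []
    for p in root.findall("./presentations/presentation"):
        result.append({
            "id": p.get("id"),
            "channel": p.get("channel"),
            "uri": p.findtext("uri"),
            "mimetype": p.findtext("mimetype"),
            # Channel-managed key/value data; empty if no <dict> is present.
            "dict": {v.get("key"): v.text for v in p.findall("./dict/value")},
        })
    return result


if __name__ == "__main__":
    for entry in list_presentations(SAMPLE):
        print(entry["channel"], entry["uri"])
```

The reader only looks at the `channel` attribute and the `<uri>` element, so it needs no knowledge of track URLs or of how a particular channel stores its artifacts.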

This change doesn't touch existing data structures and will therefore not 
impact existing functionality. But it will allow us to close out all of the 
remaining bugs that are related to archival and retraction.

Looking forward to your +/-1's and/or comments.

Tobias
_______________________________________________
Matterhorn mailing list
Matterhorn@opencastproject.org
http://lists.opencastproject.org/mailman/listinfo/matterhorn


To unsubscribe please email
matterhorn-unsubscr...@opencastproject.org
_______________________________________________

--
Michelle Ziegmann PMP
Technical Project Manager
Educational Technology Services
University of California Berkeley