Re: [Opencast Matterhorn] Mediapackage element #proposal

Rubén Pérez Fri, 08 Feb 2013 05:37:56 -0800

Karen,

Sorry for keeping you waiting for so long. I've been swamped these past two
days and I did want to give you an elaborate answer.


I'd like to briefly state my general opinion here: I don't think this
proposal is unnecessary; quite the contrary, I do think that we lack the
ability to keep track of where our media has been published. What I'm
saying is that we need to be exhaustive about it, not just note down all
the (so to speak) "external references" to our media with no more hierarchy
than "who put those references there", because such references may depend
on each other in ways that "those who put them there" may not understand
(and perhaps should not even have to).

You may find this presentation by Vicente interesting:
<http://www.powershow.com/view/a9d49-YmFhO/PuMuKIT_and_PDP_architecture_powerpoint_ppt_presentation>
(I mean the "PDP Architecture" part, although PuMuKIT is also an interesting
piece of software :P). I think we should take into account the difference
between "distribution" and "publication" if we want to come up with a good
solution that works well in the vast majority of situations.

Let me explain this with your specific example: when you distribute your
files via streaming, you *still* have to *distribute* those files to the
streaming server *AND* *publish them* (announce them, include them in the
search index, create some RSS pointing at the player, etc). Even though the
file is not directly accessible (downloadable), this does not change the
fact that if you retract the files from the streaming server, all the other
stuff is invalid and outdated.

I see two ways to go:

   1. We do not care about this interdependency. If we retract media from
   the streaming server, we must explicitly "unpublish" the reference(s) to
   that media, too. This is a manual process that will become more
   complicated as the number of available channels grows.
   2. We make the distribution services notify the publishing services
   about certain media being removed, so that they can "unpublish" their
   references to the media.

Of course I think the 2nd option is the best. Since we are going to add
information about what is distributed in our system and where, why not
include info about how those items depend on others, so that we can
manage them in a consistent way?
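
To make the 2nd option a bit more concrete, here is a minimal sketch in
Python (not Matterhorn code). The class name, method names, publication ids
and the RTMP URL are all hypothetical, made up only for illustration. It
assumes each publication declares which distributed elements it depends on,
so that retracting an element tells the publishing side what to "unpublish".

```python
from collections import defaultdict


class DistributionRegistry:
    """Hypothetical registry mapping distributed elements to dependent publications."""

    def __init__(self):
        self._dependents = defaultdict(set)  # element URL -> publication ids
        self._listeners = []                 # callbacks owned by publishing services

    def subscribe(self, listener):
        """Register a callback taking (publication_id, element_url)."""
        self._listeners.append(listener)

    def register_publication(self, publication_id, element_urls):
        """Record that a publication depends on the given distributed elements."""
        for url in element_urls:
            self._dependents[url].add(publication_id)

    def retract(self, element_url):
        """Forget an element and notify every publication that depended on it."""
        affected = sorted(self._dependents.pop(element_url, set()))
        for publication_id in affected:
            for listener in self._listeners:
                listener(publication_id, element_url)
        return affected


# Usage with made-up names: two publications depend on one streamed file.
registry = DistributionRegistry()
unpublished = []
registry.subscribe(lambda publication_id, element_url: unpublished.append(publication_id))

stream_url = "rtmp://streaming.example.org/vod/presenter.flv"  # hypothetical URL
registry.register_publication("search-index", [stream_url])
registry.register_publication("rss-feed", [stream_url])

affected = registry.retract(stream_url)
```

With something along these lines, retracting the file from the streaming
server is enough: the search index entry and the RSS item are told to
"unpublish" themselves, instead of relying on someone to remember them.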

Now, to address your issues specifically:

   1. The way this is handled currently is that Matterhorn copies the
   streamable file to some folder that is shared by the streaming server.
   Matterhorn allows configuring your streaming server URL so that you can
   match the route to that file in the folder to a route within your streaming
   server URL.
   I agree that this is not the best method one can think of, but bear in
   mind that any form of media distribution involves, in the end, some sort of
   copying or moving files around.
   However, I don't understand your statement "without Matterhorn trying to
   download the file". If the system is well configured, the files should be
   streamed from your server directly.

   2. For the record, this is already possible, since you know: a) the
   directory where the streaming files are stored in Matterhorn, b) the base
   URL of your streaming server and c) the URL of your track. Removing b) from
   c) and appending the result to a) will give you the actual location of the
   file in your system.
   Of course I think this is a sloppy way to solve the problem, and there
   are many downsides to it. What we *really* should do is keep the
   internal URLs of all the assets in a MediaPackage while we also have the
   URLs from where such assets can be accessed from the outside (if there are
   any), which is all this proposal is about.
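
The a), b), c) derivation in point 2 can be sketched in Python as follows.
This is only an illustration of the string manipulation described above, not
how Matterhorn does it; the function name, the directory and both URLs are
hypothetical.

```python
from pathlib import PurePosixPath


def local_path_for_track(streaming_dir, streaming_base_url, track_url):
    """Remove the base URL (b) from the track URL (c) and append the rest to the directory (a)."""
    base = streaming_base_url.rstrip("/") + "/"
    if not track_url.startswith(base):
        raise ValueError("track URL is not under the streaming base URL")
    relative = track_url[len(base):].lstrip("/")
    return str(PurePosixPath(streaming_dir) / relative)


# Made-up values, for illustration only.
path = local_path_for_track(
    "/opt/matterhorn/streams",                           # a) streaming directory
    "rtmp://streaming.example.org/matterhorn-engage",    # b) streaming server base URL
    "rtmp://streaming.example.org/matterhorn-engage/engage-player/abc/presenter.flv",  # c) track URL
)
```

Note that this only works while the directory layout mirrors the URL layout,
which is one of the downsides of deriving the location instead of storing it.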


Best regards,

Rubén Pérez Vázquez
<http://www.teltek.es>
www.teltek.es



2013/2/6 Karen Dolan <kdo...@dce.harvard.edu>

> Ruben,
>
> While you are still thinking hard about this subject, consider how the
> following situation would benefit from the proposal or a modified proposal.
> In short, we are trying to archive the final product that is distributed,
> but the distribution protocol should not, and cannot, be used by the
> archive process.
>
> Situation:
>
> A Matterhorn ingests raw media input, processes it, and distributes it to
> a streaming server. The distribution stream protocol is RTMP.
>
> The Matterhorn media package now only knows about the raw input files and
> the final distribution. The distribution is accessible by students via a
> streaming server through a flash player via the RTMP protocol. The RTMP
> protocol only streams through something that can play it, no file download
> possible (totally different topic of conversation!).
>
> This (customized) workflow may be missing some critical step, because the
> media package does not remember where it copied the actual transcode files.
> It only knows the transcoded files ended up accessible via RTMP, which
> means they fell off the edge of the Earth for Matterhorn.
>
> So, we have 2 issues:
> 1) How can the final distribution be played from the Matterhorn UI via
> RTMP through our flash server, without Matterhorn trying to download the
> file?
> 2) How do we identify and store (archive) media files that are distributed
> and accessible to others via RTMP, but accessible to Matterhorn as regular
> files from the directory where it saved them, without making lots of extra
> copies?
>
> Cordially,
> Karen
>
> I agree and disagree as we were discussing presentation vs. distribution
>> as well. The reason for sticking with "presentation" in the proposal was
>> that distributing a file (e. g. to the download server) is different from
>> presenting it (e. g. on a video portal or in feeds). In fact, multiple
>> representations may be based on the same distribution.
>
>
> 3) I have just noticed the "channel" argument in the second example
>>> given by Tobias. Shouldn't we make a difference between elements
>>> distributed as "downloads" and as "streaming"? In that case, "engage" would
>>> be an ambiguous term, and we should specify 'channel="engage-download"' or
>>> 'channel="engage-streaming"' (or, at least, 'channel="download"' or
>>> 'channel="streaming"'). I know those are mere examples, and not "real"
>>> pieces of xml, but I think it's good to point this out and make it clear.
>>
>>
>
> Again, the presentations should be differentiated from the file
>> distribution. If this proposal is accepted, it would probably be up to the
>> presentation services (e. g the search service aka engage) to use the
>> distribution services to place the files rather than the workflow randomly
>> throwing files onto the download server. Every presentation instance would
>> need to make sure that the files or streams they are representing are in
>> the right place (and removed if retracted). So as a result, distribution would
>> be a responsibility of each presentation channel.
>
>
> I think the workflows are no more "randomly throwing files onto the
> download server" than the Youtube service is "randomly storing files at
> some external server who-knows-where". The distribution service knows which
> elements are distributed simply by inspecting their URL. That doesn't mean
> we cannot be more explicit about the state of "distribution" of the
> Mediapackage elements, but it doesn't make it "random" either.
>
> That being said, I think the proposal is lacking a clear way of defining
> relationships between *published* elements. If we agree that the URL that
> opens a certain Mediapackage in the engage player is a *presentation*,
> then we *must* agree that the URL from where the media is fetched in the
> download server is *also* a presentation, because any user can download
> the files directly without using the Engage player at all, or maybe stream
> them directly using, say, VLC.
>
> For instance, we should end up with, at least, three *presentations* if
> we publish a Mediapackage at the engage server via download: one entry for
> the presenter file at the download server, another for the slides video at
> the download server, and another one for the URL to the video in the Engage
> server. But this last *presentation* depends on both the others in order
> to be available.
>
> Therefore, no matter whether the publishing services take care of
> triggering the distribution of the relevant media (which involves modifying
> the current services and implementing ways to *decide* which type(s) of
> distribution they will use) or the workflows distribute the media
> explicitly, we must be exhaustive with all the kinds of *presentations* a
> MediaPackage has, or at least state explicitly that the Engage Service (or
> rather the download and streaming services) is an exception.
>
> And, still, in the case we go with the refactoring of the publishing
> services to manage the distribution of the media, we need a way to
> coordinate between different publishing systems using the same distributed
> media (several *publications* accessing the same resources via the
> download server, for instance).
>
>
>
> _______________________________________________
> Matterhorn mailing list
> Matterhorn@opencastproject.org
> http://lists.opencastproject.org/mailman/listinfo/matterhorn
>
>
> To unsubscribe please email
> matterhorn-unsubscr...@opencastproject.org
> _______________________________________________
>

