On 13-02-04 09:24 AM, Karen Dolan wrote:
> Hi all,
> 
> On Friday at the Un-Conference, I mentioned how we had series catalog
> data loss from capture agents that unintentionally modified the series
> catalog. The capture agent parses the series catalog received from
> Matterhorn into a structure that is not able to preserve all the elements.
> The unparsed elements are lost when the series catalog is reconstructed
> for its return trip to Matterhorn via ingest.
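
One way a capture agent could avoid this kind of loss is to edit the parsed XML tree in place rather than copying selected fields into its own structure. The following is a minimal sketch of that idea, not code from any existing agent. The class name, the method name, and the `ext` namespace in the usage note are hypothetical. Only the DCMI terms namespace URI is real.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: changes one known element and leaves the rest of the tree alone.
final class LosslessCatalogEdit {

  static final String DCTERMS_NS = "http://purl.org/dc/terms/";

  /** Replaces the first dcterms:title and serializes the whole document again. */
  static String replaceTitle(String catalogXml, String newTitle) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document doc = factory.newDocumentBuilder()
          .parse(new ByteArrayInputStream(catalogXml.getBytes(StandardCharsets.UTF_8)));

      // Touch only the element the agent understands.
      NodeList titles = doc.getElementsByTagNameNS(DCTERMS_NS, "title");
      if (titles.getLength() > 0) {
        titles.item(0).setTextContent(newTitle);
      }

      // Serialize the full tree, so elements the agent never looked at survive.
      Transformer transformer = TransformerFactory.newInstance().newTransformer();
      transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      StringWriter out = new StringWriter();
      transformer.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      // Wrapped so that callers in this sketch need no checked-exception handling.
      throw new IllegalStateException("Could not process catalog", e);
    }
  }
}
```

Calling `LosslessCatalogEdit.replaceTitle(xml, "New title")` on a catalog that also carries, say, an `ext:room` element from another namespace should return XML in which that element is still present.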

Karen, can you list the CAs that are doing this please?  These files are
passed as base 64 encoded attachments, and any CA worth its salt should
be unpacking them and ingesting them unmolested.  There should be no
modification of the data unless the agent has taken new input (from a
command console, or whatever).  It most certainly should not be losing
data just because it was passed through from scheduler to CA...
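
To make the pass-through expectation concrete, here is a minimal sketch that assumes the catalog arrives as a Base64 string. The class and method names are hypothetical and do not come from any particular CA. The agent decodes the attachment once, keeps the raw bytes, and hands those same bytes to ingest unless it has genuinely new input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch of an agent that never re-serializes a catalog it did not edit.
final class CatalogPassThrough {

  /** Decodes the Base64 attachment; the MIME decoder tolerates line breaks in the input. */
  static byte[] unpack(String base64Attachment) {
    return Base64.getMimeDecoder().decode(base64Attachment);
  }

  /** Returns the received bytes untouched unless the agent supplies an edited catalog. */
  static byte[] catalogForIngest(byte[] received, byte[] editedOrNull) {
    return editedOrNull != null ? editedOrNull : received;
  }

  public static void main(String[] args) {
    String xml = "<dublincore><title>Physics 101</title><unknown>keep me</unknown></dublincore>";
    String attachment = Base64.getEncoder().encodeToString(xml.getBytes(StandardCharsets.UTF_8));

    byte[] received = unpack(attachment);
    byte[] outgoing = catalogForIngest(received, null); // no new input from a console

    // Byte-for-byte comparison: nothing was parsed, so nothing could be dropped.
    System.out.println(Arrays.equals(received, outgoing));
  }
}
```

The design point is that the catalog is treated as opaque bytes by default, and parsing happens only on the path where the agent actually changes something.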

With regard to the rest of your email, are you proposing this for 1.4, or
1.5?  I was unable to attend your session, so I'm a bit out of the loop!

G

> I request feedback from the community on Matterhorn policy for catalog
> updates by the ingest service. My desire is to maintain the ability for
> Matterhorn to create new catalogs on ingest, but not allow updates to
> series or episode catalogs using the ingest service. Episodes and series
> created fresh from the capture agent interfaces would continue to be
> created in Matterhorn. But capture agents that support editing the
> series and episode catalogs would need to use the recording and series
> REST endpoints directly.
> 
> I put a potential patch at http://opencast.jira.com/browse/MH-928 (just 5
> lines of new code: 1 try/catch and 1 extra method call, see the second
> diff below). But it conflicts with the header comment (see the first
> diff below), which says that series data from a "potentially modified
> dublin core document" should be honored by the ingest service. This
> is the policy that needs clarification from the community.
> 
> // "-" signifies a current line proposed to be removed, "+" signifies
> the replacement
> //  The IngestServiceImpl.updateSeries() comment change:
> 
> -   * Updates the persistent representation of a series based on a potentially modified dublin core document.
> +   * Only create a series if it does not exist. This ensures that the series exists for the ingested
> +   * package's reference to isPartOf, yet prevents more current data in the system from being overwritten.
> 
> //  The IngestServiceImpl.updateSeries() code change proposal:
> 
> -          seriesService.updateSeries(dc);
> +          try {
> +            // test if series exists
> +            seriesService.getSeries(id); 
> +            logger.debug("Series id {} already exists. Ignoring series catalog from ingest.", id);
> +          } catch (NotFoundException nf) { 
> +            // safe to create series
> +            seriesService.updateSeries(dc);
> +            isCreated = true;
> +            logger.info("Ingest created new series with id {} ", id);
> +          }
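
The create-only policy in the diff above can be tried out in isolation with a small stand-in. This is a sketch under stated assumptions, not the real IngestServiceImpl: `InMemorySeriesService`, its two-argument `updateSeries`, the local `NotFoundException`, and `createIfAbsent` are hypothetical names that mimic only the shape of the calls shown in the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained model of "create on ingest, never update on ingest".
final class CreateOnlySeriesSketch {

  /** Local stand-in for the exception the patch catches. */
  static final class NotFoundException extends Exception {
    private static final long serialVersionUID = 1L;
  }

  /** In-memory stand-in for the series service; catalogs are plain strings here. */
  static final class InMemorySeriesService {
    private final Map<String, String> catalogs = new HashMap<>();

    String getSeries(String id) throws NotFoundException {
      String catalog = catalogs.get(id);
      if (catalog == null) {
        throw new NotFoundException();
      }
      return catalog;
    }

    void updateSeries(String id, String catalog) {
      catalogs.put(id, catalog);
    }
  }

  /** Stores the ingested catalog only when the series is unknown; returns true if created. */
  static boolean createIfAbsent(InMemorySeriesService service, String id, String catalog) {
    try {
      service.getSeries(id); // series exists: ignore the catalog from ingest
      return false;
    } catch (NotFoundException nf) {
      service.updateSeries(id, catalog); // safe to create
      return true;
    }
  }

  public static void main(String[] args) throws NotFoundException {
    InMemorySeriesService service = new InMemorySeriesService();
    System.out.println(createIfAbsent(service, "series-1", "catalog from first ingest"));
    System.out.println(createIfAbsent(service, "series-1", "stale catalog from capture agent"));
    System.out.println(service.getSeries("series-1"));
  }
}
```

In this model the second ingest leaves the stored catalog as it was, which is the behavior the proposed comment change describes. The check and the create are two separate calls, so two concurrent ingests for the same new series could both try to create it. Whether that matters depends on how the real service handles concurrent updates.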
> 
> FYI - From previous code commit comments, the reason for series
> update/create at this early point in the ingest process is that an ingest
> associated with a series that is not found in Matterhorn is put in
> error. The embedded series catalog cannot be processed during the regular
> ingest zip file processing, because the threads launched there cannot
> flush the new series in time for the package's isPartOf to find the
> series match. The ingest fails if the series is not created prior to
> processing the ingest zip.
> 
> During the Friday Matterhorn Un-conference, there was an idea of
> de-coupling rich metadata from Matterhorn versus enabling Matterhorn to
> accommodate it. It would be helpful to investigate how Matterhorn could
> accommodate at least the transport and integrity of rich metadata so it
> can be harvested as supplemental data to the event media. This would
> remove the requirement of placing Matterhorn in a metadata
> infrastructure and would allow Matterhorn to be more self-sufficient and
> self-standing.
> 
> - Karen
> 
> 
> _______________________________________________
> Matterhorn mailing list
> Matterhorn@opencastproject.org
> http://lists.opencastproject.org/mailman/listinfo/matterhorn
> 
> 
> To unsubscribe please email
> matterhorn-unsubscr...@opencastproject.org
> _______________________________________________
> 


