Rüdiger,

The main reason the job arguments keep growing is that the mediapackage is a common job argument. We discussed this problem internally at Entwine just recently and looked at the existing options for solving it. Unfortunately, a number of issues prevent a simple solution, and we are convinced that this needs a closer look, with more time at hand than is left for 1.4.
One alternative we considered was providing the (large) job arguments as downloads on the working file repository and passing URLs as the arguments instead of the actual data (references instead of values). The same would be necessary for the job's return value (payload). However, this approach has two major drawbacks:

1) All REST endpoints and their remote implementations would need to be changed to accept URLs instead of values. An additional caveat is that you would then need to put your values on an HTTP server (or service) before you could start using the REST docs for manual operation/debugging.

2) Once the job has gone through (succeeded or failed), there has to be proper cleanup of the values (likely a removal of the artifacts from the working file repository), and after that your operations are incomplete (they contain only the references but not the values). This would then be the same as removing the operations from the job arguments table in the database.

All this means that you are likely to run into the same problem with 1.4 as well, while switching to a different database engine may or may not help. Until then, scheduled removal of the jobs is our best bet and could easily be implemented as a configurable option as part of a 1.4.1.

I would welcome thoughts and comments from others on how to solve the problem. One thing that is of interest to me: are you using database indices on the job table (and others)?

Tobias

On 16.01.2013, at 13:47, Ruediger Rolf <rr...@uni-osnabrueck.de> wrote:

> Hi list,
>
> I want to raise a problem with our 1.3 production system and ask if we will
> run into this in 1.4 too.
>
> After nearly a year in production and around 750 recordings our database got
> extremely large (> 2GB). The reason is mainly that the table job_arguments
> got very large (>1.7GB). This results in a mysql database that is running on
> 100% load permanently.
>
> We reported this in MH-9342 [1], and I just noticed that Stephen Marquard has
> reported the same in MH-9031 [2].
> My testing showed that I can delete the job_arguments without doing any harm,
> and Tobias seems to be of the same opinion in his comment on MH-9031.
>
> So my question would be: is this addressed or even fixed in 1.4, so that we
> delete finished jobs in the DB? This bug won't come up in our QA testing, as
> we will not process several hundred jobs there. But it will be a serious
> problem for any production system, although there is an easy fix for this.
>
> Thanks
> Rüdiger
>
> [1] http://opencast.jira.com/browse/MH-9342
> [2] http://opencast.jira.com/browse/MH-9031
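
For illustration, the scheduled removal discussed in this thread could look roughly like the sketch below. It is not the Matterhorn implementation: only the table name job_arguments comes from the thread, while the job table layout, every column name, the status values, and the function name purge_job_arguments are assumptions made up for this example. It uses Python with an in-memory SQLite database so that it runs on its own; a production system would run the equivalent DELETE against MySQL from a scheduled task, with the retention period as the configurable option.

```python
import sqlite3
import time

# Hypothetical schema for illustration only. The real Matterhorn tables,
# columns, and status values may differ.
SCHEMA = """
CREATE TABLE job (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL,
    date_completed INTEGER          -- Unix seconds, NULL while not completed
);
CREATE TABLE job_arguments (
    job_id INTEGER NOT NULL REFERENCES job(id),
    argument_index INTEGER NOT NULL,
    argument TEXT
);
-- An index on the lookup column keeps the cleanup from scanning the table.
CREATE INDEX idx_job_arguments_job_id ON job_arguments (job_id);
"""

SECONDS_PER_DAY = 86400


def purge_job_arguments(conn, retention_days, now=None):
    """Delete the arguments of finished or failed jobs older than the retention period.

    Returns the number of deleted argument rows.
    """
    if now is None:
        now = int(time.time())
    cutoff = now - retention_days * SECONDS_PER_DAY
    cur = conn.execute(
        """
        DELETE FROM job_arguments
        WHERE job_id IN (
            SELECT id FROM job
            WHERE status IN ('FINISHED', 'FAILED')
              AND date_completed IS NOT NULL
              AND date_completed < ?
        )
        """,
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny example database and purge with a 30-day retention period."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    now = 1_000_000_000  # fixed "current time" so the example is reproducible
    conn.executemany(
        "INSERT INTO job (id, status, date_completed) VALUES (?, ?, ?)",
        [
            (1, "FINISHED", now - 90 * SECONDS_PER_DAY),  # old, purged
            (2, "FAILED", now - 40 * SECONDS_PER_DAY),    # old, purged
            (3, "FINISHED", now - 1 * SECONDS_PER_DAY),   # recent, kept
            (4, "RUNNING", None),                         # still running, kept
        ],
    )
    conn.executemany(
        "INSERT INTO job_arguments (job_id, argument_index, argument) VALUES (?, ?, ?)",
        [
            (1, 0, "<mediapackage>...</mediapackage>"),
            (1, 1, "encoding-profile"),
            (2, 0, "<mediapackage>...</mediapackage>"),
            (3, 0, "<mediapackage>...</mediapackage>"),
            (4, 0, "<mediapackage>...</mediapackage>"),
        ],
    )
    deleted = purge_job_arguments(conn, retention_days=30, now=now)
    remaining = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT job_id FROM job_arguments ORDER BY job_id"
        )
    ]
    conn.close()
    return deleted, remaining
```

Calling demo() removes the three argument rows belonging to the two old completed jobs and leaves the arguments of the recent and the running job in place. The sketch deletes only the arguments, which is the variant reported as harmless in the quoted message; removing the job rows themselves would be a second DELETE on the job table with the same condition.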