2012/2/7 Aristedes Maniatis <a...@maniatis.org>

> On 7/02/12 7:07 AM, Daniel Scheibe wrote:
>
>>
>>
>> Am 06.02.2012 00:21, schrieb Aristedes Maniatis:
>>
>>> On 6/02/12 9:48 AM, Daniel Scheibe wrote:
>>>
>>>> It's causing me an out of memory / permgen space exception since (from
>>>> what i can tell) Cayenne is keeping all the records in memory until
>>>> the commitChanges() call.
>>>>
>>>> Am 05.02.2012 23:32, schrieb Aristedes Maniatis:
>>>>
>>>>> On 6/02/12 7:26 AM, Daniel Scheibe wrote:
>>>>>
>>>>>> Thanks Ari,
>>>>>>
>>>>>> i'm trying to store information about the content of archives (ZIP and
>>>>>> RAR) inside a database table. They usually consist of a few hundred up
>>>>>> to 20-30k thousand entries and i would like to keep this inside a
>>>>>> transaction either succeeding or failing as a whole for a single
>>>>>> archive. Of course i could issue commits in between and process them
>>>>>> in chunks, but it's more work and harder to maintain the rollback of the
>>>>>> process if something goes wrong or fails during the import phase. As
>>>>>> previously mentioned i have it working with plain SQL statements but
>>>>>> would love to integrate Cayenne in between to get the full advantage
>>>>>> and convenience of an O/R mapper. Furthermore i'm using XADisk's
>>>>>> support for transactions on the filesystem level to keep it
>>>>>> synchronized with the database (also works out great with larger
>>>>>> transactions).
>>>>>>
>>>>>> I appreciate your help! I will search the list again, maybe i've
>>>>>> missed some stuff that has been discussed already.
>>>>>>
>>>>>> Cheers,
>>>>>> Daniel
>>>>>>
>>>>>> Am 05.02.2012 13:28, schrieb Aristedes Maniatis:
>>>>>>
>>>>>>> On 5/02/12 10:11 PM, Daniel Scheibe wrote:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> i'm trying to figure out how Cayenne is handling transactions in
>>>>>>>> regard to when data is transferred to the database. I am trying to
>>>>>>>> get some large transactions (many small records or a few very large
>>>>>>>> records) working and so far haven't had luck. If i understand
>>>>>>>> correctly, Cayenne is holding my records in a cache and when i
>>>>>>>> execute
>>>>>>>> the commitChanges() on the ObjectContext it starts to transfer all
>>>>>>>> of
>>>>>>>> the data to the database (isolated in a real database
>>>>>>>> transaction). So
>>>>>>>> my question is (and please correct me if my previous assumption is
>>>>>>>> wrong) how can I influence this behaviour since i pretty much
>>>>>>>> need
>>>>>>>> Cayenne to use the database transaction more directly. When i skip
>>>>>>>> Cayenne and issue SQL statements right through the JDBC connection
>>>>>>>> (e.g. BEGIN TRANSACTION, UPDATE, UPDATE, UPDATE..., COMMIT
>>>>>>>> TRANSACTION) it works fine, as with MySQL i'm only restricted by the
>>>>>>>> maximum allowed size of the InnoDB transaction logfile which is
>>>>>>>> quite
>>>>>>>> sufficient for my purposes. On the other hand i
>>>>>>>> completely understand that Cayenne would not know when i've finished
>>>>>>>> working with an Entity as there is no kind of "save" command on the
>>>>>>>> Entity itself.
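
To make the comparison concrete, a rough sketch of the two approaches being
contrasted (ArchiveRecord, ArchiveEntry and the column names are only
placeholders, not the real model):

    // Plain JDBC: every INSERT hits MySQL immediately, so only the InnoDB
    // transaction log limits how large the transaction can grow.
    Connection con = dataSource.getConnection();
    con.setAutoCommit(false);                       // BEGIN TRANSACTION
    PreparedStatement ps = con.prepareStatement(
            "INSERT INTO archive_entry (name, size) VALUES (?, ?)");
    for (ArchiveRecord r : records) {
        ps.setString(1, r.getName());
        ps.setLong(2, r.getSize());
        ps.executeUpdate();
    }
    ps.close();
    con.commit();                                   // COMMIT TRANSACTION

    // Cayenne: new objects stay registered in the ObjectContext (in memory)
    // until commitChanges() flushes them all inside one database transaction.
    ObjectContext context = DataContext.createDataContext();
    for (ArchiveRecord r : records) {
        ArchiveEntry entry = context.newObject(ArchiveEntry.class);
        entry.setName(r.getName());
        entry.setSize(r.getSize());
    }
    context.commitChanges();
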
>>>>>>>>
>>>>>>>> Oh, this of course only goes for storing data inside the database,
>>>>>>>> not
>>>>>>>> for reading.
>>>>>>>>
>>>>>>>> Maybe some of you guys with a bit more insight into the Cayenne
>>>>>>>> internals can help me out on this or point me in the right
>>>>>>>> direction.
>>>>>>>>
>>>>>>>> Thanks in advance and have a refreshing weekend!
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>
>>>>>>> Can you explain a little more what you are trying to do and what
>>>>>>> problem
>>>>>>> you are trying to avoid? Yes, there are ways to control the
>>>>>>> transactions
>>>>>>> through Cayenne (search this list for many previous threads discussing
>>>>>>> this), but it sounds like you just need to commit your context at the
>>>>>>> appropriate places in your code.
>>>>>>>
>>>>>>> Ari
>>>>>>>
>>>>>>>
>>>>> OK, good. Then what is the problem with using a single commit for the
>>>>> whole batch?
>>>>>
>>>>> Ari
>>>>>
>>>>
>>>
>>> Daniel,
>>>
>>> Why do you post at the top of the emails when I've replied to the bottom?
>>>
>>> Now that we actually know what problem you are trying to solve, perhaps
>>> someone can help. I believe the only approach is to take
>>> control of the transactions yourself in order to be able to bypass the
>>> normal behaviour of Cayenne with regard to committing and wrapping
>>> several commits in one transaction. Then you can break up the Cayenne
>>> commits into smaller chunks and wrap the whole thing in a database
>>> transaction you manage yourself. I've never needed to do that myself,
>>> but I believe it is possible.
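
A rough sketch of what that user-managed transaction could look like, based
on the "user transactions" section of the Cayenne 3.0 documentation. It is
untested; the org.apache.cayenne.access.Transaction class and its method
names should be verified against the version in use, and ArchiveRecord,
ArchiveEntry and chunksOf() are placeholders:

    // One database transaction spanning several Cayenne commits.
    DataContext context = DataContext.createDataContext();
    Transaction tx = context.getParentDataDomain().createTransaction();
    Transaction.bindThreadTransaction(tx);

    try {
        for (List<ArchiveRecord> chunk : chunksOf(records, 500)) {
            for (ArchiveRecord r : chunk) {
                ArchiveEntry entry = context.newObject(ArchiveEntry.class);
                entry.setName(r.getName());
            }
            context.commitChanges();   // flushes the chunk, still inside tx
        }
        tx.commit();                   // the real database COMMIT
    }
    catch (Exception ex) {
        tx.setRollbackOnly();
    }
    finally {
        Transaction.bindThreadTransaction(null);
        if (tx.getStatus() == Transaction.STATUS_MARKED_ROLLEDBACK) {
            try {
                tx.rollback();
            }
            catch (Exception ignored) {
            }
        }
    }

One thing to watch: all the committed objects stay registered in the single
context, so memory may still grow; recreating the context per chunk (as
discussed further down) might be needed as well.
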
>>>
>>>  They usually consist of a few hundred up to 20-30k thousand entries
>>>>
>>>
>>> If that means 20,000-30,000 entries, and the objects aren't too big, you
>>> should be able to commit that in one go with a bit more RAM (yes,
>>> Cayenne will keep the objects in memory). If you mean 20,000-30,000
>>> thousand entries (that is, 20-30 million) then I think you are out of
>>> the realm where an ORM is a good solution for your problem.
>>>
>>>
>>> Ari
>>>
>>>
>>>
>> Hi Ari,
>>
>> sorry, my e-mail client is set up to quote from the top :) Thanks for your
>> response, i tried to change my scenario a bit to get it right with Cayenne.
>> Right now i'm inserting one record (Entity) with a byte buffer (mapped to
>> a BLOB) and committing it right after through commitChanges() on the object
>> context. The buffer is allocated once at the beginning of the loop (64kb)
>> and reused for each new entity that is inserted. The problem here is that
>> Cayenne is still eating up a lot of memory when creating let's say 1000
>> objects (1 object, then a commit, etc.) and causes an out of memory
>> exception after some time. Is there any sort of a "write" cache that i
>> could clear after doing a commit or something else? I don't see any reason
>> why the memory consumption is growing and even if it caches something,
>> everything that is committed should get GC'd at some point. I already
>> isolated the problem, if i just remove the newObject/setData/commitChanges
>> calls the process runs just fine. Maybe i
>> can provide further information and some profiling results tomorrow.
>>
>> I'm using Cayenne 3.0.2 and MySQL 5.
>>
>> Btw: Are there any plans to integrate streams instead of byte[] buffers
>> into Cayenne at some point in the future? MS-SQL Server has this FileStream
>> data type which looks promising, but it would require channelling the JDBC
>> driver's stream support all the way up to the Entity level.
>>
>
>
> I can't speak for anyone else, but I'd have no interest in trying to fit
> this approach into an ORM. Databases are not great places to store this
> type of data.
>
> You may need to discard the Cayenne context between each batch that you
> process. That should help with GC.
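
A minimal sketch of what discarding the context per batch could look like,
again assuming a hypothetical ArchiveEntry entity with a BLOB attribute and
a chunksOf() helper:

    // Throwaway DataContext per chunk: once the loop moves on, nothing
    // references the old context, so it and its committed objects can be GC'd.
    for (List<ArchiveRecord> chunk : chunksOf(records, 500)) {
        DataContext context = DataContext.createDataContext();
        for (ArchiveRecord r : chunk) {
            ArchiveEntry entry = context.newObject(ArchiveEntry.class);
            entry.setName(r.getName());
            entry.setData(r.getBytes());   // the BLOB attribute from the example
        }
        context.commitChanges();
    }

The trade-off is that each chunk becomes its own database transaction,
unless this is combined with the user-managed transaction sketched above.
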
>
>
> Ari
>
>
>
> --
> -------------------------->
> Aristedes Maniatis
> GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A
>

Hi,

I completely share your point about storing binary stuff in databases.
There are better places; it's just that today, when everyone is working and
programming against Entity interfaces, it's kind of annoying that there's no
"complete" solution for storing object and binary file content in a convenient,
standardised way that supports transactions. Even the most common approach is
just a workaround. But right, that's not a Cayenne topic. I guess i'll have to
do some more research first. Thanks for the information regarding the
garbage collection!

Cheers,
Daniel
