Re: [hibernate-dev] Hibernate Developer IRC meeting - 5/03

Eric Dalquist Fri, 04 May 2012 05:21:42 -0700

Next time I grab a heap dump from one of our prod boxes I can poke 
around in the org.hibernate classes a bit and let you know what YourKit 
says if you guys want :)


-Eric

On 5/4/12 7:14 AM, Steve Ebersole wrote:
> Apparently this did not go through to the list the first time, sorry...
>
> Completely agree.
>
> I focused on perf in my last comment, but I don't think memory is all that
> much different.  The declaration of all that state already has to be
> accounted for in its current "flattened" parallel-array representation.
> The trade-offs here are:
> 1) X number of array declarations versus 1
> 2) overhead of the class definition; again, its actual state field memory
> footprint is already accounted for, so we really are just talking about
> a small amount of memory here.
>
> Certainly I think it's a great idea to try to actually calculate and
> compare the memory diffs here. I am pretty confident the difference is
> negligible.  But either way, Hardy's point about the higher likelihood of
> bugs is the biggest concern.  In my experience, lack of cohesive
> encapsulation is just a recipe for situations where hard-to-find
> problems creep into the code.
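A minimal sketch of the two layouts being compared, to make the encapsulation argument concrete. All class and field names here (ParallelMetadata, PropertyMetadata, nullable, updatable) are hypothetical and are not taken from the Hibernate code base; the sketch says nothing about which layout uses less memory, which would still have to be measured.

```java
// Hypothetical property metadata; names are illustrative, not Hibernate's.
class ParallelArraysSketch {

    // Layout A: parallel arrays. Index i ties the columns together
    // by convention only; nothing stops the arrays from drifting apart.
    static final class ParallelMetadata {
        final String[] names;
        final boolean[] nullable;
        final boolean[] updatable;

        ParallelMetadata(String[] names, boolean[] nullable, boolean[] updatable) {
            this.names = names;
            this.nullable = nullable;
            this.updatable = updatable;
        }

        int countNullable() {
            int n = 0;
            for (int i = 0; i < names.length; i++) {
                if (nullable[i]) {
                    n++;
                }
            }
            return n;
        }
    }

    // Layout B: one array of small objects, each holding the same state.
    static final class PropertyMetadata {
        final String name;
        final boolean nullable;
        final boolean updatable;

        PropertyMetadata(String name, boolean nullable, boolean updatable) {
            this.name = name;
            this.nullable = nullable;
            this.updatable = updatable;
        }
    }

    static int countNullable(PropertyMetadata[] props) {
        int n = 0;
        for (PropertyMetadata p : props) {
            if (p.nullable) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        ParallelMetadata a = new ParallelMetadata(
                new String[] {"id", "name"},
                new boolean[] {false, true},
                new boolean[] {false, true});
        PropertyMetadata[] b = {
                new PropertyMetadata("id", false, false),
                new PropertyMetadata("name", true, true)
        };
        System.out.println(a.countNullable() + " " + countNullable(b));
    }
}
```

In layout A, adding a property means touching every array in step; in layout B it is a single constructor call, which is the cohesion point made above.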
>
>
> On 05/04/2012 05:39 AM, Hardy Ferentschik wrote:
>> Even at the risk of pouring oil onto the fire, I think a simpler data
>> structure wins in most cases over the parallel arrays. The latter are
>> much harder to use and easier to get wrong, which leads to more bugs
>> and higher maintenance costs.
>>
>> As Sanne is saying, performance questions are tricky. So many things
>> happen to the code these days before it gets executed on the bare metal
>> that it is hard to know what performance impact a certain change has.
>> In the end you just have to measure.
>>
>> Personally I think we should primarily strive for a better and
>> easier-to-use API. Opportunities to optimize then often arise naturally.
>>
>> And now my dear disciples let me close with:
>>    "The First Rule of Program Optimization: Don't do it. The Second Rule of 
>> Program Optimization (for experts only!): Don't do it yet." — Michael A. 
>> Jackson
>>
>> :-)
>>
>> --Hardy
>>
>> On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:
>>
>>> Tricky subject.
>>> I'm confident that there are many cases in which we should have used
>>> arrays rather than maps, especially for temporary objects which aren't
>>> short-lived enough (a HashMap living in the scope of a single method
>>> is going to be cheap). We should have objects allocated either for
>>> very long (like forever in the scope of the SessionFactory), or very
>>> short.
>>>
>>> In the case of how we keep metadata, I think performance would be
>>> dominated not so much by the fact that it's a slightly bigger object as
>>> by prefetching and what is going to be available in the cache lines
>>> you have just filled: obviously cache is way faster than main memory, so
>>> being clever about the sequence in which you lay out your data structure
>>> could speed you up by a couple of orders of magnitude.
>>>
>>> Using primitives and array matrices makes the data smaller, hence more
>>> likely to fit in the cache; but if using an array of objects in which
>>> each object collects the needed fields in one group, that's likely
>>> going to be faster... but I'm making assumptions about how this structure
>>> is going to be read most frequently.
>>>
>>> For example, when declaring a matrix as an [ ][ ], performance will be
>>> very different depending on whether you read by columns or rows (I forget
>>> which one is better now), but in that case, if the common use case is
>>> using the slower path, it's usually a good idea to invert the matrix.
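A small sketch of the row-versus-column point, assuming a rectangular int[][] and hypothetical method names. In Java an int[][] is an array of separate row arrays, so the row-by-row loop walks one contiguous int[] at a time and is typically the cache-friendlier of the two; how large the gap is on a given JVM and machine still has to be measured. The transpose method is one way to "invert" the layout when the common access pattern is by column.

```java
// Hypothetical helper class; not part of Hibernate.
class MatrixTraversalSketch {

    // Row-by-row: the inner loop walks one contiguous int[] at a time.
    static long sumByRows(int[][] m) {
        long sum = 0;
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    // Column-by-column: each step of the inner loop jumps to a different
    // row array. Assumes a rectangular matrix.
    static long sumByColumns(int[][] m) {
        long sum = 0;
        int cols = m.length == 0 ? 0 : m[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < m.length; r++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    // Swaps rows and columns so that former column reads become row reads.
    // Assumes a rectangular matrix.
    static int[][] transpose(int[][] m) {
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        int[][] t = new int[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                t[c][r] = m[r][c];
            }
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] m = { {1, 2, 3}, {4, 5, 6} };
        System.out.println(sumByRows(m));                // 21
        System.out.println(sumByColumns(m));             // 21
        System.out.println(sumByRows(transpose(m)));     // 21
    }
}
```

Both traversals compute the same result; only the memory access order differs.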
>>>
>>> I'd love it if we could enter this space, or even if it's not suited
>>> for it, at least be considered "lite":
>>> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
>>>
>>> Sanne
>>>
>>> On 4 May 2012 10:07, Emmanuel Bernard <emman...@hibernate.org> wrote:
>>>> Performance I don't know, you are probably right. But memory-wise, that
>>>> could be way different.
>>>> Even ignoring the overhead of the object + pointer in memory, the
>>>> alignment of booleans or other small objects would make a significant
>>>> impact.
>>>>
>>>> Of course if we are talking about 20 values, we should not bother. But 
>>>> persisters and the like store more than 20 values and we have more than 
>>>> one persister / loader. It might be inconsequential in the end but that 
>>>> might be worth testing.
>>>>
>>>> On a related note it's up for debate whether or not putting data in a hash 
>>>> map for faster lookup later is worth it in all cases:
>>>>
>>>> - it takes much more space than raw arrays
>>>> - array scan might be as fast or faster for a small enough array. As we 
>>>> have seen in Infinispan and OGM, computing a hash is not a cheap operation.
>>>>
>>>> Again, this requires testing, but I am guilty as charged of using
>>>> collections in AnnotationBinder when doing some computations that would
>>>> be better off written as an array + array scan.
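A sketch of the two lookup styles being weighed, with hypothetical names and data. It only shows that both give the same answer; it makes no claim about which is faster for a given size, since that depends on the key type, the array length and the JVM, and needs a benchmark.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; not part of Hibernate or AnnotationBinder.
class SmallLookupSketch {

    // Array + array scan: no extra structure, O(n) equals() calls.
    static int indexByScan(String[] names, String wanted) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(wanted)) {
                return i;
            }
        }
        return -1;
    }

    // Hash map: extra entry objects and boxed Integers up front,
    // then one hash computation plus equals() per lookup.
    static Map<String, Integer> buildIndex(String[] names) {
        Map<String, Integer> index = new HashMap<String, Integer>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    static int indexByMap(Map<String, Integer> index, String wanted) {
        Integer i = index.get(wanted);
        return i == null ? -1 : i;
    }

    public static void main(String[] args) {
        String[] names = {"id", "name", "version"};
        Map<String, Integer> index = buildIndex(names);
        System.out.println(indexByScan(names, "version")); // 2
        System.out.println(indexByMap(index, "version"));  // 2
        System.out.println(indexByScan(names, "missing")); // -1
    }
}
```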
>>>>
>>>>
>>>> On 3 May 2012, at 19:32, Steve Ebersole wrote:
>>>>
>>>>> I seriously doubt the performance cost of 20 'parallel arrays' versus 1
>>>>> array of Objects holding those 20 values is anything more than
>>>>> negligible.
>>
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev

