Re: [hibernate-dev] Hibernate Developer IRC meeting - 5/03

Steve Ebersole Fri, 04 May 2012 05:14:49 -0700

Apparently this did not go through to the list the first time, sorry...

Completely agree.


I focused on perf in my last comment but I don't think memory is all that 
much different.  The declaration of all that state already has to be 
accounted for in its current "flattened" parallel-array representation. 
The trade-offs here are:
1) X number of array declarations versus 1
2) overhead of the class definition; again, its actual state field memory 
footprint is already accounted for, so we really are just talking about 
a small amount of memory here.

Certainly I think it's a great idea to try to actually calculate and 
compare the memory diffs here. I am pretty confident the difference is 
negligible.  But either way Hardy's point about the higher likelihood of 
bugs is the biggest concern.  In my experience, lack of cohesive 
encapsulation is just a recipe for situations where hard-to-find 
problems creep into the code.
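
To make the two layouts under discussion concrete, here is a minimal Java 
sketch. All class and field names are hypothetical and chosen only for 
illustration; this is not the actual persister code, and it shows three 
fields rather than the twenty or so mentioned in the thread.

```java
// Hypothetical sketch: names are illustrative, not Hibernate's real persister fields.

/** Layout 1: "flattened" parallel arrays. Index i links the pieces only by convention. */
final class ParallelArrayMetadata {
    final String[] propertyNames;
    final boolean[] propertyNullable;
    final boolean[] propertyUpdatable;

    ParallelArrayMetadata(String[] names, boolean[] nullable, boolean[] updatable) {
        // Only this explicit check keeps the arrays from drifting out of sync.
        if (names.length != nullable.length || names.length != updatable.length) {
            throw new IllegalArgumentException("parallel arrays must have equal length");
        }
        this.propertyNames = names;
        this.propertyNullable = nullable;
        this.propertyUpdatable = updatable;
    }

    boolean isNullable(int i) {
        return propertyNullable[i];
    }
}

/** Layout 2: the state for one property grouped into one small object. */
final class PropertyInfo {
    final String name;
    final boolean nullable;
    final boolean updatable;

    PropertyInfo(String name, boolean nullable, boolean updatable) {
        this.name = name;
        this.nullable = nullable;
        this.updatable = updatable;
    }
}

/** A single array of PropertyInfo replaces the parallel arrays. */
final class ObjectArrayMetadata {
    final PropertyInfo[] properties;

    ObjectArrayMetadata(PropertyInfo... properties) {
        this.properties = properties;
    }

    boolean isNullable(int i) {
        return properties[i].nullable;
    }
}
```

Both layouts hold the same state fields. The second adds one object header 
and one array reference per property, which is the overhead worth measuring; 
in exchange, a property's values can no longer end up at mismatched indexes.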


On 05/04/2012 05:39 AM, Hardy Ferentschik wrote:
> Even at the risk of pouring oil onto the fire, I think a simpler data 
> structure wins in most cases over
> the parallel arrays. The latter are much harder to use and easier to get 
> wrong, which leads to more
> bugs and higher maintenance costs.
>
> As Sanne is saying, performance questions are tricky. So many things 
> happen to the code these days
> before it gets executed on the bare metal that it is hard to know 
> what performance impact a certain
> change has. In the end you just have to measure.
>
> Personally I think we should primarily strive for a better and easier to use 
> API. Opportunities to optimize then
> often arise naturally.
>
> And now my dear disciples let me close with:
>   "The First Rule of Program Optimization: Don't do it. The Second Rule of 
> Program Optimization (for experts only!): Don't do it yet." — Michael A. 
> Jackson
>
> :-)
>
> --Hardy
>
> On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:
>
>> Tricky subject.
>> I'm confident that there are many cases in which we should have used
>> arrays rather than maps, especially for temporary objects which aren't
>> short lived enough (a HashMap living in the scope of a single method
>> is going to be cheap). We should either have objects allocated for a
>> very long time (like forever in the scope of the SessionFactory), or
>> for a very short one.
>>
>> In the case of how we keep metadata, I think performance would be
>> dominated not so much by the fact that it's a slightly bigger object but
>> by prefetching and what is going to be available in the cache lines
>> you have just filled: obviously cache is way faster than memory, so
>> being clever about the sequence in which you lay out your data structure
>> could speed you up by a couple of orders of magnitude.
>>
>> Using primitives and array matrices makes the data smaller, hence more
>> likely to fit in the cache; but if using an array of objects in which
>> each object collects the needed fields in one group, that's likely
>> going to be faster... but I'm making assumptions about how this structure
>> is going to be read most frequently.
>>
>> For example when declaring a matrix as an [ ][ ], performance will be
>> very different depending on whether you read by columns or rows - forgot
>> which one is better now - but in that case, if the common use case is using
>> the slower path, it's usually a good idea to invert the matrix.
>>
>> I'd love it if we could enter this space, or even if it's not suited
>> for it, at least be considered "lite":
>> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
>>
>> Sanne
>>
>> On 4 May 2012 10:07, Emmanuel Bernard <emman...@hibernate.org> wrote:
>>> Performance I don't know, you are probably right. But memory-wise, that 
>>> could be way different.
>>> Even ignoring the overhead of the object + pointer in memory, the alignment 
>>> of booleans or other small objects would make a significant impact.
>>>
>>> Of course if we are talking about 20 values, we should not bother. But 
>>> persisters and the like store more than 20 values and we have more than one 
>>> persister / loader. It might be inconsequential in the end but that might 
>>> be worth testing.
>>>
>>> On a related note it's up for debate whether or not putting data in a hash 
>>> map for faster lookup later is worth it in all cases:
>>>
>>> - it takes much more space than raw arrays
>>> - array scan might be as fast or faster for a small enough array. As we 
>>> have seen in Infinispan and OGM, computing a hash is not a cheap operation.
>>>
>>> Again this requires testing but I am guilty as charged of using collections 
>>> in AnnotationBinder when doing some computations that would be better off 
>>> written as an array + array scan.
>>>
>>>
>>> On 3 May 2012, at 19:32, Steve Ebersole wrote:
>>>
>>>> I seriously doubt the performance cost of 20 'parallel arrays' versus 1 
>>>> array of Objects holding those 20 values is anything but negligible at 
>>>> best.
>
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev

-- 
st...@hibernate.org
http://hibernate.org