Apparently this did not go through to the list the first time, sorry... Completely agree.
I focused on perf in my last comment, but I don't think memory is all that much different. The declaration of all that state already has to be accounted for in its current "flattened" parallel-array representation. The trade-offs here are:

1) X array declarations versus 1
2) the overhead of the class definition; again, the memory footprint of its actual state fields is already accounted for, so we are really just talking about a small amount of memory here.

I certainly think it's a great idea to actually calculate and compare the memory differences here. I am pretty confident the difference is negligible.

But either way, Hardy's point about the higher likelihood of bugs is the biggest concern. In my experience, a lack of cohesive encapsulation is just a recipe for situations where hard-to-find problems creep into the code.

On 05/04/2012 05:39 AM, Hardy Ferentschik wrote:
> Even taking the risk of pouring oil onto the fire, I think a simpler data
> structure wins in most cases over the parallel arrays. It is much harder
> to use the latter and easier to make mistakes, which leads to more bugs
> and higher maintenance costs.
>
> As Sanne is saying, performance questions are tricky. So many things
> happen to the code these days before it gets executed on the bare metal
> that it is hard to know what performance impact a certain change has.
> In the end you just have to measure.
>
> Personally I think we should primarily strive for a better and easier to
> use API. Opportunities to optimize then often arise naturally.
>
> And now my dear disciples let me close with:
> "The First Rule of Program Optimization: Don't do it. The Second Rule of
> Program Optimization (for experts only!): Don't do it yet." - Michael A.
> Jackson
>
> :-)
>
> --Hardy
>
> On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:
>
>> tricky subject.
>> I'm confident that there are many cases in which we should have used
>> arrays rather than maps, especially for temporary objects which aren't
>> short-lived enough (a HashMap living in the scope of a single method
>> is going to be cheap). We should have either objects allocated for
>> very long (like forever in the scope of the SessionFactory), or very
>> short.
>>
>> In the case of how we keep metadata, I think performance would be
>> dominated not that much by the fact it's a slightly bigger object but
>> by prefetching and what is going to be available in the cache lines
>> you just have filled in: obviously cache is way faster than memory so
>> being clever in the sequence you lay out your data structure could
>> speed you up by a couple of orders of magnitude.
>>
>> Using primitives and array matrices makes the data smaller, hence more
>> likely to fit in the cache; but if using an array of objects in which
>> each object collects the needed fields in one group, that's likely
>> going to be faster... but I'm making assumptions on how this structure
>> is going to be read more frequently.
>>
>> For example when declaring a matrix as an [ ][ ], performance will be
>> very different depending on whether you read by columns or rows - forgot
>> which one is better now - but in that case if the common use case is
>> using the slower path it's usually a good idea to invert the matrix.
>>
>> I'd love it if we could enter this space, or even if it's not suited
>> for it, at least be considered "lite":
>> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
>>
>> Sanne
>>
>> On 4 May 2012 10:07, Emmanuel Bernard <emman...@hibernate.org> wrote:
>>> Performance I don't know, you are probably right. But memory-wise, that
>>> could be way different.
>>> Even ignoring the overhead of the object + pointer in memory, the alignment
>>> of boolean or other small objects would make a significant impact.
>>>
>>> Of course if we are talking about 20 values, we should not bother. But
>>> persisters and the like store more than 20 values and we have more than one
>>> persister / loader. It might be inconsequential in the end but that might
>>> be worth testing.
>>>
>>> On a related note it's up for debate whether or not putting data in a hash
>>> map for faster lookup later is worth it in all cases:
>>>
>>> - it takes much more space than raw arrays
>>> - array scan might be as fast or faster for a small enough array. As we
>>> have seen in Infinispan and OGM, computing a hash is not a cheap operation.
>>>
>>> Again this requires testing but I am guilty as charged of using collections
>>> in AnnotationBinder when doing some computations that would be better off
>>> written as an array + array scan.
>>>
>>>
>>> On 3 May 2012, at 19:32, Steve Ebersole wrote:
>>>
>>>> I seriously doubt the performance cost of 20 'parallel arrays' versus 1
>>>> array of Objects holding those 20 values is anything but negligible at
>>>> best.
>
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev

--
st...@hibernate.org
http://hibernate.org
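
For readers following the thread, the two layouts being compared can be sketched roughly as below. This is only an illustration: the class ColumnLayoutSketch, the Column type, and its three fields are hypothetical and not taken from the Hibernate code base, and the sketch measures neither memory nor speed. It only shows that the parallel-array version depends on callers keeping every array aligned by index, while the object version keeps related fields together.

```java
public class ColumnLayoutSketch {

    /** Encapsulated layout (hypothetical type): one object groups related fields. */
    public static final class Column {
        final String name;
        final int length;
        final boolean nullable;

        public Column(String name, int length, boolean nullable) {
            this.name = name;
            this.length = length;
            this.nullable = nullable;
        }
    }

    /**
     * Parallel-array layout: index i refers to the same column in all three
     * arrays, and nothing in the types enforces that they stay aligned.
     */
    public static int nullableLengthParallel(String[] names, int[] lengths, boolean[] nullable) {
        int total = 0;
        for (int i = 0; i < names.length; i++) {
            if (nullable[i]) {
                total += lengths[i];
            }
        }
        return total;
    }

    /** Array-of-objects layout: each element carries its own fields. */
    public static int nullableLengthObjects(Column[] columns) {
        int total = 0;
        for (Column column : columns) {
            if (column.nullable) {
                total += column.length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The same three illustrative columns, expressed in both layouts.
        String[] names = {"id", "name", "code"};
        int[] lengths = {19, 255, 10};
        boolean[] nullable = {false, true, true};

        Column[] columns = {
            new Column("id", 19, false),
            new Column("name", 255, true),
            new Column("code", 10, true)
        };

        System.out.println(nullableLengthParallel(names, lengths, nullable));
        System.out.println(nullableLengthObjects(columns));
    }
}
```

Both methods compute the same sum over the same data. Whether either layout is measurably smaller or faster is exactly what the thread says has to be measured rather than assumed.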