Re: [hibernate-dev] Coordinates storage in Lucene index for spatial functionality

2012-05-15 Thread Sanne Grinovero
Hi Nicolas, thanks, this is looking better.
Could you now change it for longer runs? 2K invocations at about 5 ms
each add up to no more than 10 seconds. It took me several minutes to
load all the data needed for the test, then just a few seconds to run
the tests: not worth the load time ;)
To account for effects such as performance loss due to garbage
generation you'd need a steady run of at least 45 minutes, and then look
at the average over the second half of the test run (and avoid calling
System.out every 4 milliseconds).
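
A rough sketch of that measurement approach in plain Java: the query centers come from a Random with a constant seed so every run issues the same requests, timings are stored in an array instead of being printed per call, and only the second half of the samples is averaged. The class name, the runQuery placeholder and the seed value are assumptions made for illustration; they are not taken from the actual benchmark.

```java
import java.util.Random;

final class SteadyStateBench {

    // Hypothetical stand-in for one spatial query; replace with the real call.
    static double runQuery(double lat, double lon) {
        return Math.hypot(lat, lon);
    }

    // Mean of the second half of the samples, skipping the warm-up half.
    static double meanOfSecondHalf(long[] nanos) {
        int from = nanos.length / 2;
        long sum = 0;
        for (int i = from; i < nanos.length; i++) {
            sum += nanos[i];
        }
        return (double) sum / (nanos.length - from);
    }

    // Runs `calls` queries on centers drawn from a seeded generator.
    static long[] run(int calls, long seed) {
        Random random = new Random(seed); // constant seed: reproducible centers
        long[] nanos = new long[calls];
        double sink = 0; // accumulates results so the work is not optimized away
        for (int i = 0; i < calls; i++) {
            double lat = random.nextDouble() * 180 - 90;
            double lon = random.nextDouble() * 360 - 180;
            long start = System.nanoTime();
            sink += runQuery(lat, lon);
            nanos[i] = System.nanoTime() - start;
        }
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = run(2000, 42L);
        // Print once at the end rather than on every call.
        System.out.printf("steady-state mean: %.3f us%n", meanOfSecondHalf(nanos) / 1_000.0);
    }
}
```

For a real run the call count would be chosen so that the loop lasts the 45 minutes or more mentioned above.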

On a side note, what do you need System.exit(0); for? You should
close the SessionFactory instead.

Cheers,
Sanne

On 15 May 2012 14:04, Nicolas Helleringer wrote:
> I did the seed on the random generator.
>
> Here are some results:
>
> Degrees, 2K calls
> Mean time with Grid: 4.769457488717949 ms. Average number of docs fetched: 2524.982564102564
> Mean time with Grid + Distance filter: 6.501712946153845 ms. Average number of docs fetched: 426.1876923076923
> Mean time with DoubleRange: 14.336663392307692 ms. Average number of docs fetched: 543.6035897435897
> Mean time with DoubleRange + Distance filter: 19.7123163574359 ms. Average number of docs fetched: 426.1876923076923
>
> Radians, 2K calls
> Mean time with Grid: 4.430686068205128 ms. Average number of docs fetched: 2524.982564102564
> Mean time with Grid + Distance filter: 6.717519717948718 ms. Average number of docs fetched: 426.1876923076923
> Mean time with DoubleRange: 14.35186034 ms. Average number of docs fetched: 543.6035897435897
> Mean time with DoubleRange + Distance filter: 20.073972284102563 ms. Average number of docs fetched: 426.1876923076923
>
> Radians, 50k calls
> Mean time with Grid: 4.440979528643216 ms. Average number of docs fetched: 2459.169386934673
> Mean time with Grid + Distance filter: 6.722681398331658 ms. Average number of docs fetched: 416.2335879396985
> Mean time with DoubleRange: 14.532376860201005 ms. Average number of docs fetched: 530.2923618090452
> Mean time with DoubleRange + Distance filter: 20.21980649284422 ms. Average number of docs fetched: 416.2335879396985
>
> On the random part, you can see from the average number of docs on the
> 2k-call runs that the seed did its work: the requests are the same.
>
> As you can see, there is not much of a difference between the 2k and 50k
> call runs.
>
> What I have also investigated is the overhead of the distance filter over
> the double range approach. I fear that wrapping the lat,long range query
> in a QueryWrapperFilter is costly, but I cannot prove it yet.
>
> Back to the main question: does radian storage give better performance? I
> cannot say with my test env. It seems pretty close to me.
> Maybe someone can manage to launch the bench on a different environment.
>
> Niko
>
> PS: both branches are up to date on my GitHub:
> https://github.com/nicolashelleringer/hibernate-search/tree/HSEARCH-923
> https://github.com/nicolashelleringer/hibernate-search/tree/HSEARCH-923-RADIANS
>
>  2012/5/14 Nicolas Helleringer 
>>>
>>> maybe even simpler set a constant as the seed of your random
>>> generator: should provide a reproducible sequence of values.
>>
>> /facepalm
>> I should have guessed that :s
>>
>> Niko
>>
>>>
>>> >>
>>> >> On 11 May 2012 08:40, Nicolas Helleringer wrote:
>>> >> > There, back and again ...
>>> >> >
>>> >> > After fixing a bug in grid search, here are some updated results on 2k calls.
>>> >> >
>>> >> > Degrees:
>>> >> > Mean time with Grid: 4.4897266425641025 ms. Average number of docs fetched: 2506.96
>>> >> > Mean time with Grid + Distance filter: 6.4930799487179485 ms. Average number of docs fetched: 425.33435897435896
>>> >> > Mean time with DoubleRange: 14.430638703076923 ms. Average number of docs fetched: 542.0410256410256
>>> >> > Mean time with DoubleRange + Distance filter: 20.483300545128206 ms. Average number of docs fetched: 425.33435897435896
>>> >> >
>>> >> > Radians:
>>> >> > Mean time with Grid: 5.650845744102564 ms. Average number of docs fetched: 5074.830769230769
>>> >> > Mean time with Grid + Distance filter: 8.627138825128204 ms. Average number of docs fetched: 426.7902564102564
>>> >> > Mean time with DoubleRange: 15.337755502564102 ms. Average number of docs fetched: 1087.705641025641
>>> >> > Mean time with DoubleRange + Distance filter: 20.82852138769231 ms. Average number of docs fetched: 426.7902564102564
>>> >> >
>>> >> > The next thing I cannot explain yet is the distance filter overhead
>>> >> > mismatch: it is smaller on grid search, which has more docs to
>>> >> > test, than on DoubleRange.
>>> >> >
>>> >> > Niko

Re: [hibernate-dev] Coordinates storage in Lucene index for spatial functionality

2012-05-15 Thread Nicolas Helleringer
>
> On a side note, what do you need System.exit(0); for ? You should
> close the SessionFactory.
>
Because I'm better with geo/data than with code =)
Thanks for pointing me in the right direction.

The last series of numbers is from a 50k-call run in radian mode that
lasted 45 minutes.

For each center the bench runs the 4 request modes, so each loop
iteration takes ~45 ms.
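
As a quick sanity check on those figures: the four mean times reported for the 50k radian run add up to roughly 46 ms per loop iteration, and 50,000 iterations of that come to about 38 minutes of pure query time, which fits a 45-minute run once per-iteration overhead is included. A minimal Java sketch of the arithmetic (the class and method names are made up for illustration; the four values are the means quoted in the results for the 50k run):

```java
final class LoopTimeCheck {

    // Mean times in ms for the four modes of the 50k radian run.
    static final double[] MEAN_MS = {
        4.440979528643216,   // Grid
        6.722681398331658,   // Grid + Distance filter
        14.532376860201005,  // DoubleRange
        20.21980649284422    // DoubleRange + Distance filter
    };

    // Time spent in queries for one center (all four modes).
    static double msPerLoop() {
        double sum = 0;
        for (double ms : MEAN_MS) {
            sum += ms;
        }
        return sum;
    }

    // Total query time in minutes for the given number of centers.
    static double totalMinutes(int loops) {
        return msPerLoop() * loops / 60_000.0;
    }

    public static void main(String[] args) {
        System.out.printf("per loop: %.1f ms, 50k loops: %.1f min%n",
                msPerLoop(), totalMinutes(50_000));
    }
}
```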

Niko



[hibernate-dev] [OGM] OGM-174 Composite id fail on MongoDB: MongoDBDialect only takes the first id column into account and force _id

2012-05-15 Thread Guillaume SCHEIBEL
Hi,

Just to be sure: there isn't currently any test in the suite covering
composite ids, right?
As I said on GitHub, I'm taking this one and then OGM-179 (duplication
between "id" and "_id"), which also concerns ID field management.
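
One possible direction, sketched here with plain Java maps rather than the MongoDB driver: keep all id columns by nesting them in a sub-document under "_id" instead of using only the first column. This is an assumption about how a fix could look, not the approach actually taken in OGM; the class, method and column names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CompositeIdSketch {

    // Folds every id column into one sub-document stored under "_id",
    // instead of keeping only the first column.
    static Map<String, Object> toDocumentId(String[] columnNames, Object[] columnValues) {
        if (columnNames.length != columnValues.length) {
            throw new IllegalArgumentException("column names and values must match");
        }
        Map<String, Object> id = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            id.put(columnNames[i], columnValues[i]);
        }
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", id);
        return document;
    }

    public static void main(String[] args) {
        // Hypothetical two-column id.
        System.out.println(toDocumentId(
                new String[] {"author", "title"},
                new Object[] {"Guillaume", "OGM-174"}));
    }
}
```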

Do you have any suggestions, metaphysical thoughts, or anything else about
that?

Have a nice day,
Guillaume
___
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev


Re: [hibernate-dev] [OGM] OGM-174 Composite id fail on MongoDB: MongoDBDialect only takes the first id column into account and force _id

2012-05-15 Thread Emmanuel Bernard
No test yet, or they would have failed :)

I did write an email a while ago on the subject of (composite) ids in
MongoDB; you will have to dig it up.



[hibernate-dev] Discriminated Multi-Tenancy and Inheritance

2012-05-15 Thread Steve Ebersole
My current thinking here is that the discrimination would only be 
definable at the root of a mapped inheritance hierarchy.

Thoughts?  Votes?


-- 
st...@hibernate.org
http://hibernate.org


[hibernate-dev] Multi-Tenancy and "shared" data

2012-05-15 Thread Steve Ebersole
Multi-tenant setups sometimes have data that is shared between the 
tenants (codec tables, etc).

I think the first question is do we want to support this mixing?  I 
think it is common enough that it is worthwhile to support it.  And I do 
not think it is complicated enough to be painful to implement.  As long 
as we assume that there is some form of database-level availability 
between shared, non-shared data (even for the DATABASE and SCHEMA 
strategies) I think we will be fine.

Assuming we do support it, there is a decision we need to make about how
we differentiate shared (non-tenant-aware) and non-shared (tenant-aware)
data. This is especially important for the DISCRIMINATOR approach, and it
touches on a more general outstanding decision about supporting
DISCRIMINATOR multi-tenancy: whether entities are inclusively considered
multi-tenant when the user has specified DISCRIMINATOR, or whether we
expect some form of annotation stating the entity is multi-tenant.
Personally, I think the inclusive approach (all entities are assumed
multi-tenant) is probably the better one, in which case we need an
annotation to say "this entity is not multi-tenant".
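
To make the inclusive approach concrete, here is a minimal Java sketch. The annotation name @NonTenantAware, the entity classes and the isMultiTenant helper are all hypothetical and only illustrate the idea; this is not an existing Hibernate API, and the real design may well differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class TenancySketch {

    // Hypothetical opt-out marker: "this entity is not multi-tenant".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface NonTenantAware {
    }

    // Tenant-specific data: no annotation needed under the inclusive approach.
    static class Customer {
    }

    // Shared reference data, visible to every tenant.
    @NonTenantAware
    static class CountryCode {
    }

    // Inclusive rule: every entity is multi-tenant unless it opts out.
    static boolean isMultiTenant(Class<?> entityClass) {
        return !entityClass.isAnnotationPresent(NonTenantAware.class);
    }

    public static void main(String[] args) {
        System.out.println("Customer multi-tenant: " + isMultiTenant(Customer.class));
        System.out.println("CountryCode multi-tenant: " + isMultiTenant(CountryCode.class));
    }
}
```

With the opposite (exclusive) convention the check would simply be inverted, and every tenant-specific entity would need its own marker.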


-- 
st...@hibernate.org
http://hibernate.org


Re: [hibernate-dev] Discriminated Multi-Tenancy and Inheritance

2012-05-15 Thread John Verhaeg
Definitely seems reasonable for now, especially considering this will be our 
first implementation.  I guess I'm also assuming we have no user requests that 
would oppose this decision.

JPAV







Re: [hibernate-dev] Multi-Tenancy and "shared" data

2012-05-15 Thread John Verhaeg
I'd agree.  It would seem strange to assume the lack of inheritance in this 
case and have to annotate multi-tenancy throughout the chain of associations.

JPAV







Re: [hibernate-dev] [OGM] OGM-174 Composite id fail on MongoDB: MongoDBDialect only takes the first id column into account and force _id

2012-05-15 Thread Sanne Grinovero
On 15 May 2012 17:08, Guillaume SCHEIBEL wrote:
> Hi,
>
> Just to be sure: there isn't currently any test in the suite covering
> composite ids, right?
> As I said on GitHub, I'm taking this one and then OGM-179 (duplication
> between "id" and "_id"), which also concerns ID field management.
>
> Do you have any suggestions, metaphysical thoughts, or anything else about
> that?

Yes, you should take the issue on JIRA and assign it to yourself ;-)
Thanks,
Sanne


Re: [hibernate-dev] Memory consumption

2012-05-15 Thread Andrej Golovnin
Hi Steve,

> Take 'temporaryIdTableDDL' as the perfect example.  Hibernate cannot know at 
> start up (building the SessionFactory) that the application will or will not 
> use HQL updates/deletes against "multi-table structures" that would therefore 
> need access to 'temporaryIdTableDDL'?
> 
> And the problem with lazily generating them later is that that would require 
> keeping around the Configuration as part of the SessionFactory.

If we optimize memory consumption of Configuration, I doubt it would be a
big problem to keep Configuration in memory.

> 
> Hm, I had not thought of the UserType ServiceLoader approach.  Thats 
> interesting, though not at all related to the "instance per usage" situation.
> 
> If you want to get around having an instance-per-usage, use the TypeRegistry: 
> http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html_single/#types-registry

Is it possible to use the TypeRegistry in a JEE environment? I mean, the
container creates the Configuration and SessionFactory. Is it possible to
hook into this process and add my own UserTypes?

Configuration could use ServiceLoader to find all application-specific
UserTypes and register them in the TypeRegistry.
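
A rough sketch of that idea, using only java.util.ServiceLoader from the JDK. The CustomTypeContributor interface and the map standing in for the TypeRegistry are hypothetical placeholders; how Hibernate would actually expose such a hook is an open question here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

final class UserTypeDiscovery {

    // Hypothetical SPI that applications would implement and list in a
    // provider-configuration file under META-INF/services/.
    public interface CustomTypeContributor {
        String typeName();
        Object userType();
    }

    // Collects every contributor visible to the class loader into a map,
    // which stands in for Hibernate's TypeRegistry in this sketch.
    static Map<String, Object> discover(ClassLoader loader) {
        Map<String, Object> registry = new LinkedHashMap<>();
        for (CustomTypeContributor c : ServiceLoader.load(CustomTypeContributor.class, loader)) {
            registry.put(c.typeName(), c.userType());
        }
        return registry;
    }

    public static void main(String[] args) {
        Map<String, Object> registry =
                discover(Thread.currentThread().getContextClassLoader());
        System.out.println("discovered user types: " + registry.keySet());
    }
}
```

Since discovery goes through the class path, the application would not need direct access to the Configuration that the container creates.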

> 
> Configuration is not kept around by Hibernate itself once the SessionFactory 
> is built.  Nor is SimpleValue.  And as Hardy already mentioned, that code is 
> undergoing major changes already for 5.0.

As Sanne already mentioned, Configuration is not garbage collected in
Hibernate 4. In memory dumps I see three objects which still reference
Configuration once the SessionFactory is built:

- a direct reference from SessionFactoryServiceRegistryImpl
- an indirect reference from the inner class created in
  Configuration#buildSessionFactory() (lines 1770-1781)
- an indirect reference from the inner class created in the constructor of
  SessionFactoryImpl (lines 230-253)

Best regards
Andrej Golovnin


Re: [hibernate-dev] Memory consumption

2012-05-15 Thread Andrej Golovnin
Hi Sanne,

> Andrej, what's the overall size you would save by dropping the
> Configuration object?

16187776 bytes.

Best regards,
Andrej Golovnin


Re: [hibernate-dev] Memory consumption

2012-05-15 Thread Andrej Golovnin
Hi all,

finally I have found the cause of this problem.

But before I describe what caused it, I would like to make two suggestions:

1. Consider avoiding usage of String#toLowerCase() and String#toUpperCase()
without specifying a Locale. Users running Hibernate on systems with
Turkish as the default locale may see unexpected results. Turkish has two
distinct lowercase letters, \u0069 ‘i’ and \u0131 ‘ı’ (dotless ‘i’), and
they are unrelated to each other. Their uppercase versions are \u0130 ‘İ’
(capital letter ‘I’ with a dot above it) and \u0049 ‘I’ respectively.
So if you convert \u0049 ‘I’ to lower case on a system that uses Turkish as
the default locale, you get \u0131 ‘ı’ (dotless ‘i’) and not 'i'; e.g.
'MY_COLUMN_I' is converted to 'my_column_ı' and not 'my_column_i'. If you
run the Hibernate tests with Turkish as the default locale, some tests fail.
For now I have no idea which locale should be used instead of the default
one. :-(

2. Do not use HashMap#clone(). If the JVM is started with the option
-XX:+AggressiveOpts, the HashMap#clone() may produce memory leaks.
See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7042126.
IMHO the copy-constructor should be as fast as #clone().
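
Both suggestions can be illustrated in a few lines of Java. The sketch below is only an illustration (the class and helper names are made up); using Locale.ROOT for identifiers is one common choice, offered here as an assumption rather than a recommendation from this thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LocaleAndCopyDemo {

    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Suggestion 1: always pass the locale explicitly.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Suggestion 2: shallow copy via the copy constructor instead of clone().
    static <K, V> Map<K, V> copyOf(Map<K, V> source) {
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        String column = "MY_COLUMN_I";
        // Turkish rules map 'I' (U+0049) to the dotless i (U+0131).
        System.out.println(lower(column, TURKISH));
        // Locale.ROOT applies locale-neutral rules, so 'I' becomes 'i'.
        System.out.println(lower(column, Locale.ROOT));

        Map<String, Integer> original = new HashMap<>();
        original.put("a", 1);
        Map<String, Integer> copy = copyOf(original);
        copy.put("b", 2); // the original map is not affected
        System.out.println(original.size() + " vs " + copy.size());
    }
}
```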

Now to the memory consumption problem. The problem was introduced with the fix
for https://hibernate.onjira.com/browse/HHH-4546. The additional LockModes lead
to increased memory consumption. When I comment out the changes made by
HHH-4546 in the method AbstractEntityPersister#createLoaders(), I get nearly
the same size for SessionFactoryImpl as in Hibernate 3.2.7
(see http://goo.gl/UB47c).

I have created a small patch to solve/avoid this problem. It creates the loaders
for some of the LockModes lazily (see http://goo.gl/wUn4w). With these changes
the memory consumption of SessionFactoryImpl right after server startup drops
from ca. 370MB to ca. 132MB (see http://goo.gl/GQw3p).
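
The general shape of such a change can be sketched as follows. This is not the actual patch: the LockMode values, the Loader class and the choice of which modes are created eagerly are simplified placeholders chosen for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

final class LazyLoaders {

    // Simplified placeholder for Hibernate's LockMode enum.
    enum LockMode { NONE, READ, UPGRADE, PESSIMISTIC_READ, PESSIMISTIC_WRITE }

    // Placeholder for an expensive-to-build entity loader.
    static final class Loader {
        final LockMode lockMode;

        Loader(LockMode lockMode) {
            this.lockMode = lockMode;
        }
    }

    private final Map<LockMode, Loader> loaders = new EnumMap<>(LockMode.class);

    LazyLoaders() {
        // Commonly used modes are built up front (an assumption of this sketch).
        loaders.put(LockMode.NONE, new Loader(LockMode.NONE));
        loaders.put(LockMode.READ, new Loader(LockMode.READ));
    }

    // Builds the loader for a rarely used lock mode on first request only.
    synchronized Loader getLoader(LockMode lockMode) {
        Loader loader = loaders.get(lockMode);
        if (loader == null) {
            loader = new Loader(lockMode);
            loaders.put(lockMode, loader);
        }
        return loader;
    }

    // Number of loaders that have actually been built so far.
    synchronized int createdCount() {
        return loaders.size();
    }

    public static void main(String[] args) {
        LazyLoaders persister = new LazyLoaders();
        System.out.println("loaders after startup: " + persister.createdCount());
        persister.getLoader(LockMode.PESSIMISTIC_WRITE);
        System.out.println("loaders after first use: " + persister.createdCount());
    }
}
```

The trade-off is a small synchronization cost on lookup in exchange for not holding loaders that are never used.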

What do you think about these changes?
Btw it passes all tests. :-)

One more thing: could we have a property similar to
"hibernate.listeners.envers.autoRegister" for Validator? I would like to
disable Validator completely, as it produces too much garbage (look at the
pictures) when the application is deployed.

Best regards
Andrej Golovnin