Re: [hibernate-dev] HHH-11042 Count distinct tuples

Christian Beikov Sat, 03 Sep 2016 00:49:16 -0700

Hey Gail,

I guess we have to introduce a configuration parameter then or extend 
the dialect with a method for these cases?
Should I implement a POC? If so, which approach should I use and how 
would I access the parameter if I should do it like that?


Regards,
Christian

Am 01.09.2016 um 23:13 schrieb Gail Badner:
> Hi Christian,
>
> I've been following your discussion on HHH-11042 and thinking about 
> your solution. I am concerned that some dialects do not treat null the 
> way you describe. SQL Server has a property that changes how null is 
> treated:
> SET ANSI_NULLS { ON | OFF } [1]
> [1] mentions that in a future release it will not be possible to set 
> to OFF, but for now, this is a concern. I also found [2] which says: 
> "For the DISTINCT keyword, null values are considered to be duplicates 
> of each other. When DISTINCT is included in a SELECT statement, only 
> one NULL is returned in the results, regardless of how many null 
> values are encountered."
> Sybase may be similar. I'm not sure if there are other dialects that 
> could be affected.
> I agree that the dialect should be able to override the behavior.
> Regards,
> Gail
> [1] https://msdn.microsoft.com/en-us/library/ms188048.aspx [2] 
> https://technet.microsoft.com/en-us/library/ms187831(v=sql.105).aspx 
> <https://technet.microsoft.com/en-us/library/ms187831%28v=sql.105%29.aspx> 
>
>
> On Thu, Sep 1, 2016 at 3:18 AM, Christian Beikov 
> <christian.bei...@gmail.com <mailto:christian.bei...@gmail.com>> wrote:
>
>     Hey,
>
>     I wanted to start a discussion regarding this issue:
>     https://hibernate.atlassian.net/browse/HHH-11042
>     <https://hibernate.atlassian.net/browse/HHH-11042>
>
>     Although the Dialect class contains the method
>     "supportsTupleDistinctCount", it is never used, so when doing a count
>     distinct on a tuple, it just renders the tuple instead of doing a
>     fallback or throwing an error.
>
>     I suggested the OP to override the count function in the dialect to do
>     whatever he thinks is best but then I realized that the count function
>     is not even used as the logic is hard coded in some locations. The
>     problematic location in this case is
>     "org.hibernate.hql.internal.ast.tree.IdentNode.resolveAsAlias" which
>     does not consider the function at all but renders the SQL directly.
>
>     After suggesting him to introduce a custom function instead and some
>     discussion on how count distinct could be reliably implemented I
>     think I
>     found a solution that might work for most databases.
>
>     On stackoverflow and other sites it is often suggested to use a
>     checksum
>     to workaround this limitation which obviously is not a good idea. I
>     proposed to do concatenation with a separator that doesn't appear
>     in the
>     string and apparently the character '\0' is a valid character which
>     makes it a good candidate as that should normally not appear in a
>     string.
>
>     The final solution to the problem looks something like the following
>
>     count(distinct case when col1 is null or col2 is null then null else
>     col1 || '\0' || col2 end) + count(case when col1 is null or col2
>     is null
>     then 1 end)
>
>     The first count does a count distinct on all columns concatenated with
>     '\0' where all values are not null. The second just counts the cases
>     where one of the column values was null. Together that emits the
>     proper
>     count based on the assumption that '\0' does not appear in the
>     columns.
>
>     What do you think about that solution? I would like to implement
>     it that
>     way and do a PR.
>
>     I would also like to make use of the count function registered in the
>     dialect to make this overrideable. Hope that's okay?
>
>
>     Regards,
>     Christian
>     _______________________________________________
>     hibernate-dev mailing list
>     hibernate-dev@lists.jboss.org <mailto:hibernate-dev@lists.jboss.org>
>     https://lists.jboss.org/mailman/listinfo/hibernate-dev
>     <https://lists.jboss.org/mailman/listinfo/hibernate-dev>
>
>

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev

Re: [hibernate-dev] HHH-11042 Count distinct tuples

Reply via email to