I'm not sure there is an easy way to convert Instant and DateTime to a
numeric value. The problem is that the resolution of the temporal types is
nanoseconds, so the following datetime is valid:
    year: -999.999.999
    month: 12
    day: 31
    hour: 23
    minute: 59
    second: 59
    nanos: 999.999.999

It gets more complicated when we need to store Offset or time zone.

> Great point, we should accept the user's domain type exclusively and
> take the conversion burden from the user; especially since we know the
> correct conversion strategy.

This is already supported (for certain types at least); you don't need to
round the dates to execute the queries. I think the documentation is not
up to date.

Davide

On Mon, Aug 10, 2015 at 11:37 AM, Sanne Grinovero <sa...@hibernate.org> wrote:

> On 10 August 2015 at 11:04, Hardy Ferentschik <ha...@hibernate.org> wrote:
> > Hi,
> >
> > sorry, I am late to the game, but here are some more thoughts on this.
> >
> > I think the consensus so far is that
> >
> > # Date/time types which represent an instant in time are treated as usual.
> > They can be string encoded (per default yyyyMMddHHmmssSSS) or numerically
> > in which case the numeric long value equals the epoch time of the
> > represented date.
>
> Correct that's the consensus so far. I'd like to challenge one more
> detail though:
> does it still make sense to allow string-encoded?
>
> I think not, we did allow it primarily because a long time ago that
> was the only way, then it became one of the options - but still the
> default - and more recently it became the non-default way.
>
> With these new types, backwards compatibility is a non-issue. So unless
> someone makes a strong case for needing these as String in the index,
> what about we drop some complexity?
>
> Remember:
>  - Hibernate Search is not an Objects/index mapper so we're not aiming
> at creating any index schema possible, we're aiming at taking
> advantage of the index for practical purposes ("I want it to be a
> string in the index" is not a valid argument - use your own
> fieldbridge in case)
>  - With Projections we have to re-transform things back into their
> Java original type, so how we encode things in the index is irrelevant
> from a semantics point of view; I think the only valid challenge would
> need to come from a performance or storage space perspective, in both
> cases I'm pretty sure the numeric encoding would win.
>
> > # Date/time types which do not represent an instant in time can also be
> > encoded as string or number, but in the latter case the numeric
> > representation is given by interpreting the string representation as
> > number.
> >
> > So far so good. There are a couple more things to think about.
> >
> > # Query time gets interesting and I think we need to improve the DSL in
> > unison with adding support for these new types. Check out this example
> > from DSLTest [1]
> >
> >     query = monthQb
> >             .range()
> >                 .onField( "estimatedCreation" )
> >                 .ignoreFieldBridge()
> >                 .andField( "justfortest" )
> >                 .ignoreFieldBridge().ignoreAnalyzer()
> >                 .from( DateTools.round( from, DateTools.Resolution.MINUTE ) )
> >                 .to( DateTools.round( to, DateTools.Resolution.MINUTE ) )
> >                 .excludeLimit()
> >                 .createQuery();
> >
> > If a date is numerically encoded you need to specify numbers for the
> > from and to values. ATM, we recommend to use the Lucene specific
> > DateTools to get the numeric representation. With the support of the
> > new date types things will get confusing for the user. How does one
> > "create" the numeric representation of a LocalDate (and how does one
> > know what it looks like in the first place and how it differs from the
> > epoch time)?
>
> Great point, we should accept the user's domain type exclusively and
> take the conversion burden from the user; especially since we know the
> correct conversion strategy.
>
> > We have been discussing before whether Hibernate Search needs to offer
> > its own version of DateTools.
> > I think it would be time to do so and include helpers for the new
> > date/time types. This also reduces the exposure to Lucene specific
> > types.
>
> +1 to encapsulate it, but I don't expect people to need it at all in
> the above case? But good for other more advanced needs.
>
> >
> > Even better though would be, if we would be able to support directly
> > the use of date types in the from and to clauses.
> > It would be the responsibility of the DSL to round the specified types
> > to the appropriate level based on the field's configuration/metadata.
> > Even in this scenario though a Search specific DateTools might be
> > necessary for the cases where the date specified in to/from needs to be
> > rounded differently than the field itself.
>
> +1
>
> > Last but not least, the documentation needs to be updated. At the
> > moment, the docs are silent about all the complexity around dates.
> > With the support of the new types, the docs need to be more explicit
> > and describe the subtleties at play.
>
> +1 created HSEARCH-1958
>
> Thanks,
> Sanne
>
> >
> > --Hardy
> >
> >
> > On Wed, Aug 05, 2015 at 05:40:16PM +0100, Sanne Grinovero wrote:
> >> On 5 August 2015 at 17:22, Davide D'Alto <dav...@hibernate.org> wrote:
> >> >> Proposal: use numeric but still - rather than taking the milliseconds
> >> >> from epoch, take the resulting number from YYYYMMDD ?
> >> >
> >> > I don't think I understand what you mean with "the resulting number from
> >> > YYYYMMDD".
> >> > Wouldn't it be similar to getting the number of days from epoch?
> >>
> >> No because epoch is a specific moment *with a timezone*. If you take a
> >> calendar date "here", and take the moment in time which represents
> >> your beginning of the calendar date, the distance from epoch is not a
> >> whole number and you'd have to apply rounding which is timezone
> >> specific.
> >>
> >> By simply encoding the number in the above format, you'd encode today
> >> as the number "20150805".
> >> That's a whole number which avoids the timezone relativity and can be
> >> efficiently encoded in numeric form, and provides the expected sorting
> >> properties.
> >>
> >> >
> >> > But basically, you are saying that I can use different numeric encoding for
> >> > different types. Isn't it?
> >>
> >> Yes, you definitely need different encodings depending on the type and
> >> the used options.
> >>
> >> > So, for example:
> >> >
> >> > java.util.Date, java.util.Calendar and java.time.Instant,
> >> > java.time.LocalDateTime will use number of milliseconds from epoch
> >> > java.time.LocalDate: number of days from epoch
> >>
> >> Except this one ^ I agree with the others.
> >>
> >> > java.time.LocalTime: number of nanos in a day
> >>
> >> Conceptually, yes.. but we don't have "nanoseconds" as an option of
> >> org.hibernate.search.annotations.Resolution. Should we add it?
> >> We would not be able to apply that Resolution on old fashioned
> >> Date/Calendar, so that would need a warning or even an exception when
> >> applied to old style value types.
> >>
> >> >> Ok that works but why write all those zeros in the index, when you can
> >> >> just write the date. I realize storage is cheap, but still we need to
> >> >> be careful as the index size affects performance ;-)
> >> >
> >> > I don't think we need to store the 0s.
> >> > If I know the type of the field I already know the time is 0.
> >>
> >> Exactly
> >>
> >> > Am I missing something?
> >>
> >> I probably just misunderstood your proposal, since previously you
> >> mentioned: "I would just consider a LocalDate the same as a
> >> LocalDateTime with time 00:00:000 (UTC time zone)".
> >> If you have to write the days only you don't need to convert to a time first.
> >> This misunderstanding might be related with the fact that you were
> >> planning to encode as distance from epoch.. see my first comment on
> >> this same email.
> >> Since you don't want to look at distance from epoch for this case, the
> >> time component really is irrelevant and LocalDate has all the
> >> information you need.. simpler ;)
> >>
> >> Sanne
> >>
> >>
> >> >
> >> >
> >> > On Wed, Aug 5, 2015 at 5:00 PM, Sanne Grinovero <sa...@hibernate.org> wrote:
> >> >
> >> >> On 5 August 2015 at 16:27, Gunnar Morling <gun...@hibernate.org> wrote:
> >> >> >> as I'd like us to consider not
> >> >> >> applying DateBridge on the new types as it doesn't seem to add much
> >> >> >> practical value.
> >> >> >
> >> >> > Ok, that may make sense for types such as LocalDate. But there are types in
> >> >> > the new API which - unlike LocalDate - do describe an exact instant on the
> >> >> > time line (e.g. ZonedDateTime, Instant). For those IMO it makes sense for
> >> >> > sure to support both encodings, NUMERIC and STRING (similar to Date/Calendar
> >> >> > so far) and thus apply @DateBridge.
> >> >>
> >> >> +1
> >> >>
> >> >> > Question is whether/how to index/persist TZ information, for Calendar it
> >> >> > seems not to have been persisted in the index so far?
> >> >>
> >> >> It's encoding the Calendar's time as distance from epoch, which is a
> >> >> neutral encoding so you don't need the TZ.
> >> >>
> >> >> For the old style Date/Calendar types we always assumed the value was
> >> >> a point-in-time, unless explicitly opting in for an alternative
> >> >> encoding.
> >> >> For example for the "birthday use case" a reasonable setting would
> >> >> have been String encoding with resolution=DAY, although passing in a
> >> >> Date instance having the right value (as in right timezone) would have
> >> >> been user's responsibility.. we simply take the long it's storing and
> >> >> index that with the requested resolution.
> >> >>
> >> >> Sanne
> >> >>
> >> >> >
> >> >> >
> >> >> > 2015-08-05 17:10 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
> >> >> >>
> >> >> >> Inline:
> >> >> >>
> >> >> >> On 5 August 2015 at 15:42, Davide D'Alto <dav...@hibernate.org> wrote:
> >> >> >> > If a user selects a resolution that does not make much sense we can log a
> >> >> >> > warning.
> >> >> >>
> >> >> >> +1 And update the javadoc to mention that some resolution values don't
> >> >> >> apply
> >> >> >>
> >> >> >> > But I think this might make sense:
> >> >> >> >
> >> >> >> > @DateBridge(resolution=MONTH)
> >> >> >> > LocalDate birthday;
> >> >> >>
> >> >> >> Ok but how often do you think that will be used?
> >> >> >> Sorry playing devil's advocate here, as I'd like us to consider not
> >> >> >> applying DateBridge on the new types as it doesn't seem to add much
> >> >> >> practical value.
> >> >> >>
> >> >> >> I agree it's worth a shot, but while going ahead keep in mind that
> >> >> >> maybe simplifying that is the more elegant solution.
> >> >> >>
> >> >> >> > On Wed, Aug 5, 2015 at 3:37 PM, Davide D'Alto <dav...@hibernate.org>
> >> >> >> > wrote:
> >> >> >> >
> >> >> >> >> > What would you do though in case of the following:
> >> >> >> >> >
> >> >> >> >> > @DateBridge
> >> >> >> >> > LocalDate myDate;
> >> >> >> >> >
> >> >> >> >> > encoding() defaults to NUMERIC, so would you a) raise an error, or b)
> >> >> >> >> > ignore encoding() for LocalDate and friends? Both seem not right to me. I
> >> >> >> >> > think there is nothing wrong with using NUMERIC encoding per-se for these
> >> >> >> >> > types. We may recommend STRING but if NUMERIC really is what a user wants I
> >> >> >> >> > would let them do so.
> >> >> >>
> >> >> >> I'm all for letting the users have the last word, but this is one of
> >> >> >> those cases in which you don't know if they explicitly want that or
> >> >> >> simply went with the defaults.
> >> >> >>
> >> >> >> Not a big problem as of course the important thing of defaults is that
> >> >> >> "they work" but I'd really prefer the default to try to be the most
> >> >> >> appropriate encoding, which is not numeric in this case.
> >> >> >>
> >> >> >> Proposal: use numeric but still - rather than taking the milliseconds
> >> >> >> from epoch, take the resulting number from YYYYMMDD ? It might even be
> >> >> >> the most efficient encoding, as you don't have the drawback of
> >> >> >> clustering which we would have with a numeric encoding working on the
> >> >> >> individual fields, and doesn't have the bloat of string encoding.
> >> >> >>
> >> >> >> >>
> >> >> >> >> +1
> >> >> >> >>
> >> >> >> >> > What do you suggest we do if a user maps the following?
> >> >> >> >> >
> >> >> >> >> > @DateBridge(resolution=MILLISECOND)
> >> >> >> >> > LocalDate birthday;
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> Nothing really,
> >> >> >> >> I would just consider a LocalDate the same as a LocalDateTime with time
> >> >> >> >> 00:00:000 (UTC time zone)
> >> >> >>
> >> >> >> Ok that works but why write all those zeros in the index, when you can
> >> >> >> just write the date. I realize storage is cheap, but still we need to
> >> >> >> be careful as the index size affects performance ;-)
> >> >> >>
> >> >> >> Sanne
> >> >> >>
> >> >> >> >>
> >> >> >> >> It is equivalent to:
> >> >> >> >> ZonedDateTime dateTime = date.atStartOfDay( ZoneOffset.UTC );
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Wed, Aug 5, 2015 at 3:24 PM, Gunnar Morling <gun...@hibernate.org>
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>> 2015-08-05 12:41 GMT+02:00 Sanne Grinovero <sa...@hibernate.org>:
> >> >> >> >>>
> >> >> >> >>>> Our current implementation converts Date in the long "distance from
> >> >> >> >>>> epoch" to allow correct range-queries treating each Date as an instant
> >> >> >> >>>> in time - allowing a universal sorting strategy. But a LocalDate is
> >> >> >> >>>> not an instant-in-time.
> >> >> >> >>>>
> >> >> >> >>>> A LocalDate is intentionally oblivious of the timezone; as the javadoc
> >> >> >> >>>> states, it's useful for birthdays, i.e. symbolic occurrences and
> >> >> >> >>>> potentially legal matters which don't fit into a universal sorting
> >> >> >> >>>> model but rather with the local political scene - we would need the
> >> >> >> >>>> combo {LocalDate, ZoneId} provided to be able to allow sorting across
> >> >> >> >>>> different LocalDate - or simply assume that they are all referring to
> >> >> >> >>>> the same Zone.
> >> >> >> >>>>
> >> >> >> >>>
> >> >> >> >>> Right, I had the latter in mind and would use UTC for that purpose.
> >> >> >> >>>
> >> >> >> >>>>
> >> >> >> >>>> I think that if the user is using a LocalDate type, he's implicitly
> >> >> >> >>>> hinting that the timezone is not relevant for the practical use
> >> >> >> >>>> (possibly even wrong); the most faithful representation would be the
> >> >> >> >>>> string form in ISO standard format or to encode the day, month, year as
> >> >> >> >>>> independent fields? This last detail depends on how it would be more
> >> >> >> >>>> efficient to store & query; probably the String format YYYYMMDD would
> >> >> >> >>>> be the most efficient internal representation to allow also correct
> >> >> >> >>>> sorting.
> >> >> >> >>>>
> >> >> >> >>>> I wouldn't use NumericField(s) in this case, as they are more
> >> >> >> >>>> effective only with larger ranges, while MM and DD are very short; not
> >> >> >> >>>> sure if it's worth splitting the year as a NumericField either, as the
> >> >> >> >>>> values will likely be strongly clustered in the same range of "recent
> >> >> >> >>>> years" - although that might depend on the application but it doesn't
> >> >> >> >>>> seem worth the complexity, so I'd index & store as a String YYYYMMDD.
> >> >> >> >>>>
> >> >> >> >>>
> >> >> >> >>> Agreed that this makes most sense, given the "symbolic" nature of
> >> >> >> >>> LocalDate.
> >> >> >> >>>
> >> >> >> >>> What would you do though in case of the following:
> >> >> >> >>>
> >> >> >> >>> @DateBridge
> >> >> >> >>> LocalDate myDate;
> >> >> >> >>>
> >> >> >> >>> encoding() defaults to NUMERIC, so would you a) raise an error, or b)
> >> >> >> >>> ignore encoding() for LocalDate and friends? Both seem not right to me. I
> >> >> >> >>> think there is nothing wrong with using NUMERIC encoding per-se for these
> >> >> >> >>> types. We may recommend STRING but if NUMERIC really is what a user wants I
> >> >> >> >>> would let them do so.
> >> >> >> >>>
> >> >> >> >>>>
> >> >> >> >>>> -- Sanne
> >> >> >> >>>>
> >> >> >> >>>>
> >> >> >> >>>> On 5 August 2015 at 11:10, Gunnar Morling <gun...@hibernate.org> wrote:
> >> >> >> >>>> > Hi,
> >> >> >> >>>> >
> >> >> >> >>>> > What's the motivation for using a different representation in that case?
> >> >> >> >>>> >
> >> >> >> >>>> > For the sake of consistency, I'd use milliseconds since 1970-01-01 across
> >> >> >> >>>> > the board. Otherwise it'll be more difficult to compare fields created from
> >> >> >> >>>> > properties of different date types.
> >> >> >> >>>> >
> >> >> >> >>>> > --Gunnar
> >> >> >> >>>> >
> >> >> >> >>>> >
> >> >> >> >>>> > 2015-08-04 19:49 GMT+02:00 Davide D'Alto <dav...@hibernate.org>:
> >> >> >> >>>> >
> >> >> >> >>>> >> Hi,
> >> >> >> >>>> >> I started to work on the creation of the bridges for the classes in the
> >> >> >> >>>> >> java.time package.
> >> >> >> >>>> >>
> >> >> >> >>>> >> I was wondering if we want to convert the values to long using the existing
> >> >> >> >>>> >> approach we have now for java.util.Date.
> >> >> >> >>>> >>
> >> >> >> >>>> >> In Hibernate Search a java.util.Date is converted into a long that
> >> >> >> >>>> >> represents the number of milliseconds since January 1, 1970, 00:00:00 GMT
> >> >> >> >>>> >> using getTime().
> >> >> >> >>>> >>
> >> >> >> >>>> >> The same value can be obtained from a java.time.LocalDate via:
> >> >> >> >>>> >>
> >> >> >> >>>> >>     long epochMilli = date.atStartOfDay( ZoneOffset.UTC ).toInstant().toEpochMilli();
> >> >> >> >>>> >>
> >> >> >> >>>> >> LocalDate has a method that returns the same value expressed in number of
> >> >> >> >>>> >> days:
> >> >> >> >>>> >>
> >> >> >> >>>> >>     long epochDay = date.toEpochDay();
> >> >> >> >>>> >>
> >> >> >> >>>> >>
> >> >> >> >>>> >> I would use the second approach
> >> >> >> >>>> >>
> >> >> >> >>>> >> Davide
> >> >> >> >>>> >> _______________________________________________
> >> >> >> >>>> >> hibernate-dev mailing list
> >> >> >> >>>> >> hibernate-dev@lists.jboss.org
> >> >> >> >>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev
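
P.S. A minimal sketch of the nanosecond problem mentioned at the top of this
message, assuming we tried to encode a date-time as "nanoseconds since the
epoch" in a single long. The class and method names (TemporalRangeSketch,
epochNanos, fitsInLong) are made up for illustration and are not part of
Hibernate Search; BigInteger is used only to show that the value for the
extreme date-time does not fit in 64 bits:

```java
import java.math.BigInteger;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only (not a Hibernate Search API).
class TemporalRangeSketch {

    // Nanoseconds since 1970-01-01T00:00:00Z, interpreting the value as UTC.
    // Computed with BigInteger so the result itself cannot overflow.
    static BigInteger epochNanos(LocalDateTime dateTime) {
        long seconds = dateTime.toEpochSecond(ZoneOffset.UTC);
        return BigInteger.valueOf(seconds)
                .multiply(BigInteger.valueOf(1_000_000_000L))
                .add(BigInteger.valueOf(dateTime.getNano()));
    }

    // A value fits in a signed 64-bit long iff it needs at most 63 bits
    // besides the sign bit.
    static boolean fitsInLong(BigInteger value) {
        return value.bitLength() <= 63;
    }

    public static void main(String[] args) {
        // The extreme (but valid) date-time from this message.
        LocalDateTime extreme =
                LocalDateTime.of(-999_999_999, 12, 31, 23, 59, 59, 999_999_999);
        // A present-day value for comparison.
        LocalDateTime recent = LocalDateTime.of(2015, 8, 10, 11, 37, 0, 0);

        System.out.println(fitsInLong(epochNanos(extreme))); // false
        System.out.println(fitsInLong(epochNanos(recent)));  // true
    }
}
```

Present-day values still fit, which is why the problem is easy to overlook;
it only shows up at the edges of the supported range.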