On Sunday, January 19, 2014 12:48:36 PM UTC-6, Brian Craft wrote:
>
> That helps, thanks. It's still unclear to me that this is important enough
> to worry about. What application or service is hindered by string encoding
> a date in JSON? An example would really help. It's not compelling to assert
> or imagine some hypothetical application that benefits from knowing a field
> is a date without having any other knowledge of it. I would guess that
> cases where this matters are vanishingly few.
>
This isn't dates, but it's along the same lines:

I'm working on the central hub of a communications distributor/router/thing that has to deal with a wide range of clients, each speaking whatever "format" was most expedient for whoever wrote that piece. The closest thing we have to a standard is "we mostly want to use JSON."

A huge chunk of our messages include UUIDs. This left me with two real options:

1. Special-case every incoming message based on what I know about the sender, and convert the fields that I know are supposed to represent a UUID (based on extremely informal verbal "specs") as the message is read.
2. Take the generic, weakly coupled approach: pass every incoming message through a parser that converts every string value into a UUID (if that conversion is possible).

The pain here could be alleviated with a formalized schema, but we're working too fast and furious for that. We tossed out the last one of those we had almost 2 months ago.

Dates really present the same challenge, but everything that's using those is tied to a relational database. So there, at least, we're forced to stick to something fairly standardized. (Though I *do* have another conversion function for converting those messages, which tends to change about once a week when some other developer decides to change column names... but that's a different story.)

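A minimal sketch of option 2, written in Python purely for illustration (the thread does not say what language the hub uses). The names `coerce_uuids` and `read_message` are hypothetical, and the sketch assumes UUIDs arrive in the canonical hyphenated 8-4-4-4-12 form:

```python
import json
import re
import uuid

# Assumption: only the canonical hyphenated form counts as a UUID.
# uuid.UUID() alone would also accept braces, "urn:uuid:" prefixes and
# bare 32-digit hex strings, which is looser than we want here.
CANONICAL_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def coerce_uuids(value):
    """Walk a decoded JSON value and turn UUID-shaped strings into uuid.UUID."""
    if isinstance(value, dict):
        # Only values are converted; keys are left as plain strings.
        return {key: coerce_uuids(item) for key, item in value.items()}
    if isinstance(value, list):
        return [coerce_uuids(item) for item in value]
    if isinstance(value, str) and CANONICAL_UUID.fullmatch(value):
        return uuid.UUID(value)
    # Numbers, booleans, None and non-UUID strings pass through untouched.
    return value


def read_message(raw):
    """Decode one JSON message and apply the generic UUID conversion."""
    return coerce_uuids(json.loads(raw))
```

The trade-off is visible in the code: no per-sender knowledge is needed, but any string that merely happens to look like a UUID gets converted too, whatever the field was meant to hold.
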
FWIW,
James

>
> On Sunday, January 19, 2014 9:03:53 AM UTC-8, jonah wrote:
>>
>> I read these self-describing, extensible points in the context of EDN,
>> which has a syntax/wire format for some types (maps, strings, etc.) and
>> also has an extensibility syntax:
>>
>> #myapp/Person {:first "Fred" :last "Mertz"}
>>
>> These tagged elements are "extensions" because they allow values of types
>> not known to EDN to be included in the stream, and are "self-describing" in
>> two senses:
>>
>> * if a wire format reader does know how to create a myapp/Person{}, that
>> blob of data contains all the information needed to do so
>> * if a wire format reader doesn't know how to create a myapp/Person, it
>> can still read past this particular element in the stream, because tags
>> have a defined envelope, so a reader can figure out where the data
>> comprising this element ends
>>
>> The JSON example is mostly about the "extensibility" attribute. JSON's
>> format natively supports some types (like strings) but not others (like
>> dates), and for those others, JSON's format does not include a way to
>> "bucket" or "envelope" data comprising those unknown types. So JSON is not
>> extensible.
>>
>> The google example is mostly about the "self-describing" attribute, and
>> to my mind is more accurately framed as a statement about the Internet as a
>> whole. Hypothetically, if all data exchange occurred using data formats
>> whose details were private arrangements between writers and readers (for
>> instance, all servers only spoke ProtocolBuffers and used a different
>> schema for each client), there would be no Internet at all, much less a
>> google who as a third party is able to broadly read and understand data
>> made available by servers. (Or, to your point, any ability to parse
>> anything useful from a server data stream by clients lacking knowledge of
>> the schema would at best be inferential and heuristic: possible, but
>> infeasible on a large scale.)
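
The two "self-describing" senses jonah lists for tagged elements can be sketched roughly as follows. This is Python for illustration only and is not a real EDN reader: it models just the step after parsing, and `Tagged`, `Person` and `resolve` are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Tagged:
    """A parsed tagged element: the tag plus the value inside its envelope."""
    tag: str
    value: Any


@dataclass(frozen=True)
class Person:
    first: str
    last: str


def resolve(element: Any, handlers: Dict[str, Callable[[Any], Any]]) -> Any:
    """Build domain values for known tags; pass unknown tags through intact."""
    if isinstance(element, Tagged):
        inner = resolve(element.value, handlers)
        handler = handlers.get(element.tag)
        if handler is not None:
            # Sense 1: the element carries everything needed to build the value.
            return handler(inner)
        # Sense 2: the tag is unknown, but the envelope is still well defined,
        # so the element is kept as-is and the rest of the data stays readable.
        return Tagged(element.tag, inner)
    if isinstance(element, dict):
        return {key: resolve(item, handlers) for key, item in element.items()}
    if isinstance(element, list):
        return [resolve(item, handlers) for item in element]
    return element
```

With a handler registered for `"myapp/Person"`, `Tagged("myapp/Person", {"first": "Fred", "last": "Mertz"})` resolves to a `Person`; with no handlers at all, the same element comes back unchanged and its neighbours are still processed.
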
>>
>> With all that said- my read is that Rich bundled those two points
>> together in the JSON date example- JSON doesn't have an extensibility
>> syntax to support dates, but people still have to transmit dates over JSON,
>> so how do they do that? One way is by adopting a "convention", which in
>> some ways is better than an out of band schema, because, as you say, a
>> convention gives a reader additional information to heuristically interpret
>> the stream, but in other ways is worse because it isn't consistent- some
>> people will want date fields to look like "dateModified", others will want
>> "modifiedDate", and others use "modificationDatetime".
>>
>> So in a broad sense, it is not desirable to use a data format that does
>> not include an extensibility capability which itself is self-describing,
>> because a format that lacks extensibility creates a combinatorial explosion
>> in conventions to convey values not known to the format, and extensions
>> that are not self-describing require out of band agreements between readers
>> and writers that can preclude the scalable third-party interoperability
>> that is so important to the Internet.
>>
>> Hope that helps.
>>
>>
>> On Sat, Jan 18, 2014 at 6:08 PM, Brian Craft <craft...@gmail.com> wrote:
>>
>>> Ok, so consider a different system (besides google) that handles the
>>> JSON example. If it has no prior knowledge of the date field, of what use
>>> is it to know that it's a date? What is a situation where a system reading
>>> the JSON needs to know a field is a date, but has no idea what the field is
>>> for?
>>>
>>>
>>> On Saturday, January 18, 2014 1:27:31 PM UTC-8, Jonas wrote:
>>>>
>>>> IIRC in that particular part of the talk he was specifically talking
>>>> about (non-self describing) protocol buffers and not JSON.
>>>>
>>>>> On Saturday, January 18, 2014 10:00:09 PM UTC+2, Brian Craft wrote:
>>>>>
>>>>> Regarding Rich's talk (http://www.youtube.com/watch?v=ROor6_NGIWU),
>>>>> can anyone explain the points he's trying to make about self-describing
>>>>> and extensible data formats, with the JSON and google examples?
>>>>>
>>>>> He argues that google couldn't exist if the web depended on
>>>>> out-of-band schemas. He gives as an example of such a schema a JSON
>>>>> encoding where an out-of-band agreement is made that field names with
>>>>> substring "date" refer to string-encoded dates.
>>>>>
>>>>> However, this is exactly the sort of thing google does. It finds
>>>>> dates, and other data types, heuristically, and not through the formats
>>>>> of the web being self-describing or extensible.
>>>>>
>>>>>
>>>>> --
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to clo...@googlegroups.com
>>> Note that posts from new members are moderated - please be patient with
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> clojure+u...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to clojure+u...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>