On Sunday, January 19, 2014 12:48:36 PM UTC-6, Brian Craft wrote:
>
> That helps, thanks. It's still unclear to me that this is important enough 
> to worry about. What application or service is hindered by string encoding 
> a date in JSON? An example would really help. It's not compelling to assert 
> or imagine some hypothetical application that benefits from knowing a field 
> is a date without having any other knowledge of it. I would guess that 
> cases where this matters are vanishingly few.
>

This isn't dates, but it's along the same lines:

I'm working on the central hub of a communications distributor/router/thing 
which has to deal with a wide range of diverse clients that each speak 
whatever "format" was the most expedient for whoever wrote that piece. The 
closest thing that we have to a standard is "we mostly want to use JSON."

A huge chunk of our messages include UUIDs. This left me with two real 
options:
1. Special-case every incoming message, based on what I know about the 
sender, and convert the fields that I know are supposed to hold a UUID 
(according to extremely informal verbal "specs") as each message is read.
2. Take the generic, weakly coupled approach: pass every incoming message 
through a parser that converts every string value into a UUID (whenever 
that conversion is possible).
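
To make option 2 concrete, here is a minimal sketch in Python. It is not our 
actual hub code: the function names (maybe_uuid, convert_uuids, read_message) 
are made up for illustration, and it assumes UUIDs arrive only in the 
canonical hyphenated 8-4-4-4-12 hex form.

```python
import json
import re
import uuid

# Assumption: only the canonical hyphenated form counts as a UUID.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def maybe_uuid(value):
    """Return a uuid.UUID if value is a UUID-shaped string, else value unchanged."""
    if isinstance(value, str) and UUID_PATTERN.fullmatch(value):
        return uuid.UUID(value)
    return value


def convert_uuids(node):
    """Walk parsed JSON and convert every UUID-shaped string value.

    Dictionary keys are left alone; only values are inspected.
    """
    if isinstance(node, dict):
        return {key: convert_uuids(val) for key, val in node.items()}
    if isinstance(node, list):
        return [convert_uuids(item) for item in node]
    return maybe_uuid(node)


def read_message(text):
    """Parse one JSON message, with no knowledge of who sent it."""
    return convert_uuids(json.loads(text))
```

The trade-off is the heuristic one discussed in this thread: any string that 
merely looks like a UUID gets converted, whether or not the sender meant it 
as one.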

The pain here could be alleviated with a formalized schema, but we're 
working too fast and furious for that. We tossed out the last one of those 
we had almost 2 months ago.

Dates really present the same challenge, but everything that's using those 
is tied to a relational database. So there, at least, we're forced to stick 
to something fairly standardized. (Though I *do* have another conversion 
function for converting those messages, which tends to change about once a 
week when some other developer decides to change column names...but that's 
a different story).

FWIW,
James



>
> On Sunday, January 19, 2014 9:03:53 AM UTC-8, jonah wrote:
>>
>> I read these self-describing, extensible points in the context of EDN, 
>> which has a syntax/wire format for some types- maps, strings, etc- and also 
>> has an extensibility syntax:
>>
>> #myapp/Person {:first "Fred" :last "Mertz"}
>>
>> These tagged elements are "extensions" because they allow values of types 
>> not known to EDN to be included in the stream, and are "self-describing" in 
>> two senses:
>>
>> * if a wire format reader does know how to create a myapp/Person{}, that 
>> blob of data contains all the information needed to do so
>> * if a wire format reader doesn't know how to create a myapp/Person, it 
>> can still read past this particular element in the stream, because tags 
>> have a defined envelope, so a reader can figure out where data comprising 
>> this element ends
>>
>> The JSON example is mostly about the "extensibility" attribute. JSON's 
>> format natively supports some types (like strings) but not others (like 
>> dates), and for those others, JSON's format does not include a way to 
>> "bucket" or "envelope" data comprising those unknown types. So JSON is not 
>> extensible. 
>>
>> The google example is mostly about the "self-describing" attribute, and 
>> to my mind is more accurately framed as a statement about the Internet as a 
>> whole. Hypothetically, if all data exchange occurred using data formats 
>> whose details were private arrangements between writers and readers- for 
>> instance, all servers only spoke ProtocolBuffers and used a different 
>> schema for each client- there would be no Internet at all, much less a 
>> google who as a third party is able to broadly read and understand data 
>> made available by servers. (Or, to your point, any ability to parse 
>> anything useful from a server data stream by clients lacking knowledge of 
>> the schema would at best be inferential and heuristic- possible, but 
>> infeasible on a large scale.)
>>
>> With all that said- my read is that Rich bundled those two points 
>> together in the JSON date example- JSON doesn't have an extensibility 
>> syntax to support dates, but people still have to transmit dates over JSON, 
>> so how do they do that? One way is by adopting a "convention", which in 
>> some ways is better than an out of band schema, because, as you say, a 
>> convention gives a reader additional information to heuristically interpret 
>> the stream, but in other ways is worse because it isn't consistent- some 
>> people will want date fields to look like "dateModified", others will want 
>> "modifiedDate", and others use "modificationDatetime".
>>
>> So in a broad sense, it is not desirable to use a data format that does 
>> not include an extensibility capability which itself is self-describing, 
>> because a format that lacks extensibility creates a combinatorial explosion 
>> in conventions to convey values not known to the format, and extensions 
>> that are not self-describing require out of band agreements between readers 
>> and writers that can preclude the scalable third-party interoperability 
>> that is so important to the Internet. 
>>
>> Hope that helps.
>>
>>
>> On Sat, Jan 18, 2014 at 6:08 PM, Brian Craft <craft...@gmail.com> wrote:
>>
>>> Ok, so consider a different system (besides google) that handles the 
>>> JSON example. If it has no prior knowledge of the date field, of what use 
>>> is it to know that it's a date? What is a situation where a system reading 
>>> the JSON needs to know a field is a date, but has no idea what the field is 
>>> for?
>>>
>>>
>>> On Saturday, January 18, 2014 1:27:31 PM UTC-8, Jonas wrote:
>>>>
>>>> IIRC in that particular part of the talk he was specifically talking 
>>>> about (non-self describing) protocol buffers and not JSON.  
>>>>
>>>> On Saturday, January 18, 2014 10:00:09 PM UTC+2, Brian Craft wrote:
>>>>>
>>>>> Regarding Rich's talk (http://www.youtube.com/watch?v=ROor6_NGIWU), 
>>>>> can anyone explain the points he's trying to make about self-describing 
>>>>> and extensible data formats, with the JSON and google examples?
>>>>>
>>>>> He argues that google couldn't exist if the web depended on 
>>>>> out-of-band schemas. He gives as an example of such a schema a JSON 
>>>>> encoding where an out-of-band agreement is made that field names with 
>>>>> substring "date" refer to string-encoded dates.
>>>>>
>>>>> However, this is exactly the sort of thing google does. It finds 
>>>>> dates, and other data types, heuristically, and not through the formats 
>>>>> of the web being self-describing or extensible.
>>>>>
>>>>>
>>>>> -- 
>>> -- 
>>> You received this message because you are subscribed to the Google
>>> Groups "Clojure" group.
>>> To post to this group, send email to clo...@googlegroups.com
>>> Note that posts from new members are moderated - please be patient with 
>>> your first post.
>>> To unsubscribe from this group, send email to
>>> clojure+u...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/clojure?hl=en
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Clojure" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to clojure+u...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>

