Re: [HACKERS] Protocol buffer support for Postgres
On Tue, Apr 26, 2016 at 2:40:49PM -0700, David Fetter wrote:
> Should we see about making a more flexible serialization
> infrastructure? What we have is mostly /ad hoc/, and has already
> caused real pain to the PostGIS folks, this even after some pretty
> significant and successful efforts were made in this direction.

Hi all. Is anybody working on this right now? I would like to pick up
this task for the summer. First of all, what do you think about what
David said? Should we try to design a generic infrastructure for
similar serialization datatypes? If so, will we need to refactor some
pieces of the JSON/XML implementation? I looked over the code and it
seems nicely decoupled, but I am not sure what this would involve.
I've done this before for MySQL[1] (not yet completed), but I'd love
to try it for PostgreSQL too.

On Tue, Apr 26, 2016 at 11:23:11AM -0700, José Luis Tallón wrote:
> Have you investigated JSONB vs ProtoBuf space usage ?
> (the key being the "B" -- Postgres' own binary JSON
> implementation)

This is something I can investigate further, but another (possibly
major) benefit of Protocol Buffers over JSON is that they *still* have
a schema. I think they combine advantages of both structured and
schemaless data.

My best guess is that we shouldn't focus on abstracting *any*
serialization paradigm, but only the ones that have a schema (like
Thrift or Protocol Buffers).

Speaking of schemas, where is the best place to keep them? For MySQL I
opted for a plain text file similar to .trg files (the ones MySQL uses
for storing triggers).

I'd love to talk more about this.

Thank you.

Flavius Anton

[1] https://github.com/google/mysql-protobuf

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
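To make the space-usage question a bit more concrete, here is a rough
sketch of why a schema helps: the Protocol Buffers wire format stores a
small field number instead of the key name. The schema, record and
helper names below are made up for illustration, only varint and string
fields are handled, and the comparison is against plain JSON text, not
against JSONB (whose on-disk layout is different and would need to be
measured separately).

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (protobuf wire format)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(field_number: int, value) -> bytes:
    """Encode one field as tag + payload; only int and str are supported here."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        # Wire type 0 = varint.
        return encode_varint((field_number << 3) | 0) + encode_varint(value)
    if isinstance(value, str):
        # Wire type 2 = length-delimited.
        payload = value.encode("utf-8")
        return (encode_varint((field_number << 3) | 2)
                + encode_varint(len(payload)) + payload)
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical schema: key name -> field number (assumption, not from the thread).
USER_SCHEMA = {"id": 1, "name": 2, "age": 3}


def encode_message(schema: dict, record: dict) -> bytes:
    """Encode a record by emitting its fields in field-number order."""
    keys = sorted(record, key=lambda k: schema[k])
    return b"".join(encode_field(schema[k], record[k]) for k in keys)


record = {"id": 150, "name": "flavius", "age": 30}
as_protobuf = encode_message(USER_SCHEMA, record)
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

print("protobuf-style bytes:", len(as_protobuf))
print("compact JSON bytes:  ", len(as_json))
```

The key names live only in the schema, so every stored record pays one
tag byte per field here instead of the full quoted key.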
Re: [HACKERS] Protocol buffer support for Postgres
On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark wrote:
> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton wrote:
>>
>> I'd love to talk more about this.
>
> I thought quite a bit about this a few years ago but never really
> picked it up to work on.
>
> Another option would be to allow the output of your select query to be
> read in a protobuf. That might be a feature to add in libpq rather
> than in the server, or perhaps as a new protocol feature that libpq
> would then just switch to which might make it easier to use from other
> languages. That might make it easier to use Postgres as a data store
> for an environment where everything is in protobufs without losing the
> full power of SQL schemas in the database.

I agree with this one; I think it's the most /natural/ way of doing
things.

> As an aside, have you seen Cap'n Proto? It looks more interesting to
> me than protobufs. It fixes a lot of the corner cases where protobufs
> end up with unbounded computational complexity and seems a lot
> cleaner.

I've seen it around for quite some time, but my fear is that it is not
(yet?) widely adopted. Protocol Buffers themselves are not that
popular, let alone Cap'n Proto. By the way, there's also the newer
FlatBuffers[1] project, also from Google. It seems a lot like Cap'n
Proto, though.

I think it's doable to provide some sort of abstraction for these
protocols in PostgreSQL, so that it can eventually support all of
them, with minimal effort needed to add a new one. However, I am
skeptical about the practical advantages of having /all/ of them
supported.

--
Flavius

[1] https://github.com/google/flatbuffers
Re: [HACKERS] Protocol buffer support for Postgres
On Thu, Jun 23, 2016 at 2:54 PM, Flavius Anton wrote:
> On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark wrote:
>> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton wrote:
>>>
>>> I'd love to talk more about this.
>>
>> I thought quite a bit about this a few years ago but never really
>> picked it up to work on.

Any other thoughts on this? My guess is that it might be an important
addition to Postgres that can attract even more users, but I am not
sure if there's enough interest from the community. If I want to pick
up this task, how should I move forward? Do I need to write a design
document or similar, or should I come up with a patch that implements
a draft prototype? I am new to this community and don't know the code
yet, so I'd appreciate some guidance from an older, more experienced
member.

Thanks.

--
Flavius
Re: [HACKERS] Protocol buffer support for Postgres
On Fri, Jun 24, 2016 at 11:35 AM, Álvaro Hernández Tortosa wrote:
>
>
> On 24/06/16 14:23, Flavius Anton wrote:
>>
>> On Thu, Jun 23, 2016 at 2:54 PM, Flavius Anton
>> wrote:
>>>
>>> On Thu, Jun 23, 2016 at 1:50 PM, Greg Stark wrote:
>>>>
>>>> On Thu, Jun 23, 2016 at 8:15 PM, Flavius Anton
>>>> wrote:
>>>>>
>>>>> I'd love to talk more about this.
>>>>
>>>> I thought quite a bit about this a few years ago but never really
>>>> picked it up to work on.
>>
>> Any other thoughts on this? My guess is that it might be an important
>> addition to Postgres that can attract even more users, but I am not
>> sure if there's enough interest from the community. If I want to pick
>> up this task, how should I move forward? Do I need to write a design
>> document or similar, or should I come up with a patch that implements a
>> draft prototype? I am new to this community and don't know the code
>> yet, so I'd appreciate some guidance from an older, more experienced
>> member.
>>
>>
>
> Other than protobuf, there are also other serialization formats that
> might be worth considering. Not too long ago I suggested working
> specifically on serialization formats for the json/jsonb types:
> https://www.postgresql.org/message-id/56CB8A62.40100%408kdata.com I believe
> this effort is in the same boat.

Sure, there are a bunch of these, some of the most popular being:

* Cap'n Proto
* FlatBuffers
* Thrift

A longer list is available here[1].

Meanwhile, I came across another interesting idea. What if, for
starters, we don't introduce a completely new serialization format
like Protocol Buffers, but rather make the JSON support stronger and
more interesting? What I am thinking of is having a JSON schema
(optionally) associated with a JSON column. That way, you don't have
to store the keys on disk anymore, and you'd also have your database
check each document against the schema at INSERT time. I think these
are two big advantages. What do you think about this one?

--
Flavius

[1] https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats
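To illustrate the idea of a schema attached to a JSON column, here is a
toy sketch in Python. It is only a model of the concept, not a proposal
for how PostgreSQL would store anything: the schema representation (an
ordered list of key/type pairs), the function names, and the choice of
a JSON array as the "stored" form are all assumptions made for the
example, and it does not implement the JSON Schema specification.

```python
import json

# Hypothetical column schema: ordered (key, expected Python type) pairs.
SCHEMA = [("id", int), ("name", str), ("active", bool)]


def validate(schema, doc):
    """Raise ValueError unless doc has exactly the schema's keys and types."""
    expected_keys = {key for key, _ in schema}
    if set(doc) != expected_keys:
        raise ValueError(f"keys {sorted(doc)} do not match schema {sorted(expected_keys)}")
    for key, expected_type in schema:
        # Exact type check, because bool is a subclass of int in Python.
        if type(doc[key]) is not expected_type:
            raise ValueError(f"{key!r} must be {expected_type.__name__}")


def pack(schema, text):
    """Model of INSERT: parse, validate, then keep only the values in schema order."""
    doc = json.loads(text)  # raises ValueError (JSONDecodeError) on invalid JSON
    if not isinstance(doc, dict):
        raise ValueError("top-level value must be a JSON object")
    validate(schema, doc)
    return json.dumps([doc[key] for key, _ in schema], separators=(",", ":"))


def unpack(schema, stored):
    """Model of SELECT: re-attach the keys from the schema to the stored values."""
    values = json.loads(stored)
    return {key: value for (key, _), value in zip(schema, values)}


stored = pack(SCHEMA, '{"name": "flavius", "id": 7, "active": true}')
print(stored)                  # the keys are no longer part of the stored form
print(unpack(SCHEMA, stored))  # the original document, rebuilt from the schema
```

A document with a missing key, an extra key, or a value of the wrong
type is rejected in pack(), which is the INSERT-time check; the stored
form carries no key names, which is the space saving.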