pgcrypto question

2019-10-07 Thread Erik Aronesty
Currently, it is my understanding that the pgcrypto library requires
the user to send a password or private key up to the server for
decryption.

Is there a notion of a client-side encrypt/decrypt plugin when doing a
postgres query?

For example, a user could query postgres, get back data of type
"encrypted", and a "libpq" plugin could decode/decrypt those columns
that are of data type "encrypted" in a manner transparent to the
user of the client

Obviously I could write this by intercepting the relevant libpq calls
using LD_PRELOAD or Microsoft's "Detours" ... but it there a better
way to do that?

- Erik




Re: pgcrypto question

2019-10-07 Thread Erik Aronesty
Good idea for "psycopg".  It would be easy for a POC, but I think the
only meaningful layer to operate at would be a libpq drop-in
replacement that intercepts PQgetvalue, PQprepare, PQexecParams,
PQexecPrepared ... etc.   That way odbc, python, node, etc would "just
work" as long as you used LD_PRELOAD appropriately.

I never like building things that way though... it would be super cool
if libpq optionally supported client-side plugins too, but maybe some
day if there are more use cases.

On Mon, Oct 7, 2019 at 2:08 PM Tomas Vondra
 wrote:
>
> On Mon, Oct 07, 2019 at 12:05:16PM -0400, Erik Aronesty wrote:
> >Currently, it is my understanding that the pgcrypto library requires
> >the user to send a password or private key up to the server for
> >decryption.
> >
>
> Correct. In the naive case the key is included in each SQL query, which
> does have various issues. Bruce Momjian has a nice extension that allows
> you to fix that by loading the key into backend memory:
>
>   http://momjian.us/download/pgcryptokey/
>
> >Is there a notion of a client-side encrypt/decrypt plugin when doing a
> >postgres query?
> >
> >For example, a user could query postgres, get back data of type
> >"encrypted", and a "libpq" plugin could decode/decrypt those columns
> >that are of data type "encrypted" in a manner transparent to the
> >user of the client
> >
> >Obviously I could write this by intercepting the relevant libpq calls
> >using LD_PRELOAD or Microsoft's "Detours" ... but it there a better
> >way to do that?
> >
>
> AFAIk that's usually done at the application level, i.e. the application
> is sending/receiving encrypted data, and the database simply sees bytea
> columns. I'm not aware of a driver doing that transparently, but it
> seems like an interesting idea - I wonder if it could be done e.g. in
> psycopg as an extension, or something like that.
>
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgcrypto question

2019-10-07 Thread Erik Aronesty
Actually I found a nice open source product  "Acra" ... seems to do
the whole thing via a proxy.  Now I need to see if I can customize the
encryption enough using a plugin (but at least I can fork it and start
from there).   A proxy encryption system seems to be the right call,
then all my client apps can stay the same.


On Mon, Oct 7, 2019 at 3:49 PM Tomas Vondra
 wrote:
>
> On Mon, Oct 07, 2019 at 02:51:30PM -0400, Erik Aronesty wrote:
> >Good idea for "psycopg".  It would be easy for a POC, but I think the
> >only meaningful layer to operate at would be a libpq drop-in
> >replacement that intercepts PQgetvalue, PQprepare, PQexecParams,
> >PQexecPrepared ... etc.   That way odbc, python, node, etc would "just
> >work" as long as you used LD_PRELOAD appropriately.
> >
>
> It's not clear to me how would that know which columns are encrypted,
> with what key, etc. Because those encrypted columns are essentially just
> regular bytea columns, so there's no easy way to distinguish them.
>
> I'm no psycopg2 expert, but it does have some infrastructure for casting
> PostgreSQL types to Python types, and I guess that could be used for the
> encryption.
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?

2019-12-15 Thread Erik Aronesty
You can always tweak fnv for whatever bite-size or bit size you want.
 Sometimes I know a little information about my data shape and make a
custom fnv that only looks at the first half for the last half of a string,
etc.

On Wed, Dec 11, 2019, 1:02 PM Erwin Brandstetter  wrote:

> Thanks for the suggestion. Seems like a good assumption and I have been
> using hashtext() in the past. But I am uncertain whether it is the best
> option.
>
> Guess Tom's warning in
> https://www.postgresql.org/message-id/9434.1568839...@sss.pgh.pa.us about
> portability only refers to hashtextextended() and friends not being there
> in Postgres 10 or older.
>
> But why are none of these functions documented? Does the project still not
> ...
>
> > want people to rely on them continuing to have exactly the current
> behavior.
>
> I am not complaining, maybe just nobody did the work. But it's also
> mentioned in this old thread, that hastext() changed in the past. Is all of
> that outdated and we are welcome to use those functions for indexing?
>
> https://www.postgresql.org/message-id/flat/24463.1329854466%40sss.pgh.pa.us#c18e62281dc78f6d64c1a4d41ab8569b
> 
>
> Filtering with amprocnum = 2 gets functions producing bigint in Postgres
> 11 or later.  Not sure about the exact meaning of amprocnum, manual says
> "Support function number".
>
> Remaining problem either way: no hash function returning bigint for
> Postgres 10.
>
> Regards
> Erwin
>
> On Tue, Dec 10, 2019 at 11:13 PM Laurenz Albe 
> wrote:
>
>> On Tue, 2019-12-10 at 22:11 +0100, Erwin Brandstetter wrote:
>> > I am looking for stable hash functions producing 8-byte or 4-byte
>> hashes from long text values in Postgres 10 or later.
>> >
>> > [...]
>> >
>> > There is an old post from 2012 by Tom Lane suggesting that hashtext()
>> and friends are not for users:
>> >
>> > https://www.postgresql.org/message-id/24463.1329854466%40sss.pgh.pa.us
>>
>> Changing a hash function would corrupt hash indexes, wouldn't it?
>>
>> So I'd expect these functions to be pretty stable:
>>
>> SELECT amp.amproc
>> FROM pg_amproc AS amp
>>JOIN pg_opfamily AS opf ON amp.amprocfamily = opf.oid
>>JOIN pg_am ON opf.opfmethod = pg_am.oid
>> WHERE pg_am.amname = 'hash'
>>   AND amp.amprocnum = 1;
>>
>> Or at least there would have to be a fat warning in the release notes
>> to reindex hash indexes.
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
>>