Re: Hardware for writing/updating 12,000,000 rows per hour

2019-07-27 Thread Neil
…competing for the one resource.  With 3 
drives, depending on the I/O architecture (not a laptop), you can probably get 3 
times the I/O throughput, or 40+ MB/sec on each HDD in our contrived example.

The other major help is memory. If you can fit the whole database in memory, or 
at least the part that is being used regularly, then slow I/O from an HDD is 
not so important for update operations.  But again that depends on the data 
access patterns: if you are updating the same tuple over and over, then it can 
stay in memory most of the time, which reduces database HDD I/O.  You still 
have WAL I/O.  Let's say this is about 39 GB/hour (10.9 MB/sec * 3600).  So 
holding most of a 39 GB database in memory would probably require 
48 GB or 64 GB of main memory.  Of course there are many reasons why this would 
work with less or need more. Additional memory might allow you to continue to 
use HDDs, depending on access patterns.  Sometimes more memory is cheaper than 
more or faster disks.
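
As a sanity check, the arithmetic above can be sketched in a few lines.  The 
bytes-of-WAL-per-row figure is hypothetical, chosen to match the 10.9 MB/sec 
used in this example; real WAL volume depends heavily on row width, indexes, 
and full-page writes:

```python
# Back-of-the-envelope WAL and memory sizing for ~12M writes/hour.
# wal_bytes_per_row is an assumption, not a measured value.
rows_per_hour = 12_000_000
wal_bytes_per_row = 3_270  # hypothetical; measure your own workload

wal_mb_per_sec = rows_per_hour * wal_bytes_per_row / 3600 / 1e6
wal_gb_per_hour = rows_per_hour * wal_bytes_per_row / 1e9

print(f"WAL: {wal_mb_per_sec:.1f} MB/sec, {wal_gb_per_hour:.1f} GB/hour")
```

Plugging in different per-row estimates quickly shows how sensitive the 
memory and disk sizing is to that one assumption.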

Now about SSDs.  SSDs should easily give you 6 to 10 times the throughput, 
provided the computer is designed for SSDs.  This means that you might, based 
on the back-of-the-envelope calculations here, get by with one SSD.  However, 
from a reliability standpoint I don’t think anyone would recommend that.

The other issue you need to consider is how many CPU cores you should have.  
This won’t affect I/O throughput, but it will determine how many simultaneous 
connections can be processing at one time.  It might be that CPU starvation is 
limiting the data rate of your current hardware.  So with more CPUs the 
sustained or peak data rates might go up, depending on how you calculated the 
’12,000,000 writes/updates per hour’.

None of these ideas considers reliability, which would determine whether the 
3 drives are bare, mirrored, RAID, or JBODs.  Also note that any form of RAID 
can reduce throughput and have other reliability problems if not correctly 
engineered.  Data loss is a completely different set of issues and can be 
handled in many ways.

Raid may provide reliability if your storage fails, but may not provide 
reliability to the customer if the computer itself fails, the power supply 
fails, or the network fails.  If customer reliability is critical, then I would 
expect you to have duplicate hardware with failover.  In this case, for 
example, it is not completely crazy to run the OS drive as a single bare drive, 
no RAID.  If it fails, the system just switches to the failover hardware while
you fix it.  If you cannot afford this type of redundancy, then you might have 
a single system with redundant power supplies, mirrored drives, etc.  It all 
depends on your risk issues, the amount of money you have to spend, and the 
technical knowledge you have to manage the system.  This last part is very 
important when a failure occurs.  You don’t want to have to learn while you are 
in a failure or recovery situation. The more complex your system is to support 
high availability, the more knowledge you need.

If you are unsure of these issues, which are pretty basic system admin and 
hardware design issues, then you should get someone with experience to help you.

Neil
https://www.fairwindsoft.com








Re: How do I create a Backup Operator account ?

2019-07-29 Thread Neil


> On Jul 29, 2019, at 7:49 PM, Marcos Aurelio Nobre  
> wrote:
> 
> Hi all.
> 
> I need to create a PostgreSQL account , that only do a Backup and Restore 
> operations over a PGSQL Database .
> 
> My main difficulty is to only assign Backup and Restore capability to one 
> user account, unlike the ability to perform DQL and DML operations for 
> ordinary user accounts. 

I’m not sure I understand what you are really trying to do.  If this 
Backup/Restore user is going to back up and restore the complete database, then 
they will have to have essentially superuser permissions.  Otherwise they will 
not have access to all of the data.  They will also need permission for DDL 
and DML, as the restore has the potential for dropping and creating a bunch of 
objects, and at least DELETE, TRUNCATE, COPY, and/or INSERT of data.
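
For the narrower case of a dump-only account (no restore), a read-only role is 
possible.  This is just an illustrative sketch, with role, database, and schema 
names invented for the example; it does not cover objects outside the granted 
schema, and a full-cluster restore still effectively requires superuser:

```sql
-- Hypothetical read-only role sufficient for pg_dump of one schema.
CREATE ROLE backup_op LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE mydb TO backup_op;
GRANT USAGE ON SCHEMA public TO backup_op;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO backup_op;
-- Cover tables created after the grant as well:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO backup_op;
```

Sequences, functions, and other schemas would each need their own grants.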

What are you trying to protect against?

> 
> In POSTGRESQL there is no backup / restore privilege because these operations 
> are performed by server specific utilities.
> 
> One possibility that I thought would be:
> 
> I was wondering if it was possible to configure the pgAdmin4 menus to be 
> available for a particular login, only Backup & Restore items. But I'm not 
> sure how to associate a bank login account with a specific menu setting of a 
> client tool.
> 
> Could someone give me any ideas or directions?

Neil
https://www.fairwindsoft.com






Re: CVE-2018-1058

2019-10-16 Thread Neil


> On Oct 16, 2019, at 2:55 PM, Ron  wrote:
> 
> On 10/16/19 2:40 PM, Adrian Klaver wrote:
>> On 10/14/19 3:27 PM, Lizeth Solis Aramayo wrote:
>>> Good afternoon,
>>> 
>>> I am working with postgresql 9.6.15 and I need to restore in a 9.6.5 
>>> version,  I got an error, and  I found this page to install a patch
>> 
>> What commands did you use to dump the 9.6.15 version and restore to the 
>> 9.6.5 version?
>> 
>> Which versions software did you use to do above?
>> 
>> What was the error?
>> 
>> The reason why you can't upgrade the 9.6.5 to 9.6.15?
> 
> There are a thousand and one -- nay, a million and ten -- crazy reasons why 
> software can't be upgraded. (Mostly due to "Process" in large organizations.) 
>  It’s best just to swallow “why can't you upgrade" and answer the question.

Well, I don’t know any organization where applying a one-time patch is safer, 
less bug-prone, and cheaper than doing a well-tested point upgrade of 
Postgres.  So the question seems very relevant to me.

In addition, if the company is not going to keep up to date with the latest 
point releases (meaning they are not applying security and bug fixes), then why 
would they expect free support?  If they want to play with fire by applying 
individual patches, then, from my standpoint, they are on their own.  The 
decision not to do regular maintenance has consequences, and individual patches 
are not guaranteed to be bug-free for the system. While the developers try not 
to miss dependencies, the OP should understand that the Postgres build farm 
will never have run a configuration with only their individual patch applied 
against an older system.  Sounds really risky to me. 

So the reason to ask the question is to make sure the OP understands the high 
level of risk they are undertaking.








Re: GSSAPI Authentication for pgadmin4 macOS client

2020-10-27 Thread Neil

> On Oct 27, 2020, at 12:23 PM, Tom Lane  wrote:
> 
> Sean McDaniel  writes:
>> I'm using pgadmin4 on a macOS system and I'm having trouble connecting to a
>> server of interest. The "Connect to Server" dialog box has a warning:
>> "GSSAPI authentication not supported". I cannot proceed beyond that point.
> 
>> It seems to be a GSSAPI issue. I have postgres installed via homebrew and
>> it supports GSSAPI.  I can connect directly to the server using psql (and
>> pgcli) from the command line without issue.
> 
> A quick look through our source code finds that error string only in
> libpq; it indicates that libpq was built without GSSAPI support.
> 

On my Mac, pgadmin4 seems to use its own libpq.

You can find it at: 
/Applications/pgAdmin\ 4.app/Contents/Frameworks/libpq.5.dylib 

> Since you say that you can connect with psql, it sounds like there is
> more than one copy of libpq.dylib on your machine and not all of them
> support GSSAPI.  You could use "otool -L" on psql to verify which
> libpq it's linked to.  Perhaps the same thing will work on pgadmin,
> but I'm unsure which file to check for that.  Anyway the bottom line
> here is almost certainly that pgadmin isn't using your homebrew
> build of libpq, but some other one.
> 
>> I have tried to get pgadmin4 to use these binaries by putting
>> /usr/local/bin under Preferences > Paths > Binary Paths for "PostgreSQL
>> Binary Path" but that doesn't fix the issue, even after a pgadmin4 restart.
> 
> AFAIK that's only likely to help when pgadmin invokes pg_dump or
> some other tool, not for its own connections to the server.
> 
>   regards, tom lane
> 
> 



DBAs using PostgreSQL sought for paid research study

2021-06-30 Thread Neil Turner
Hi,

Redgate Software<https://www.red-gate.com/>, a leading provider of database 
development and monitoring tools, is carrying out some research to better 
understand how organisations monitor their PostgreSQL databases. If you're a 
DBA (or similar) looking after production PostgreSQL databases, we'd like to 
speak to you. A $150 Amazon voucher (or equivalent in local currency) is 
available for all participants taking part in one of the 60-minute research calls.

If you'd like to take part in this research enter your details at: 
https://redgate.research.net/r/postgresresearch

Please note your details will not be used for any other purpose. There are a 
limited number of research calls so we can't guarantee that everyone will be 
invited to take part in one.

Thanks,

Neil

Neil Turner
Lead product designer, Redgate

Redgate respects your privacy. To find out how we handle your personal data, 
see our privacy policy at https://www.red-gate.com/privacy. Use of our software 
is subject to our license agreement, https://www.red-gate.com/eula.
Our head office is Newnham House, Cambridge Business Park, CB4 0WZ, United 
Kingdom, and we are registered in the UK as company #3857576.


Re: The tragedy of SQL

2021-09-14 Thread FWS Neil


> On Sep 14, 2021, at 11:10 AM, Michael Nolan  wrote:
> 
> I started programming in 1967, and over the last 50+ years I've programmed in 
> more languages than I would want to list.  I spent a decade writing in 
> FORTRAN on a GA 18/30 (essentially a clone of the IBM 1130) with limited 
> memory space, so you had to write EFFICIENT code, something that is a bit of 
> a lost art these days.  I also spent a decade writing in COBOL.
> 
> I've not found many tasks that I couldn't find a way to write in whatever 
> language I had available to write it in.  There may be bad (or at least 
> inefficient) languages, but there are lots of bad programmers.  

Yep, me too.  I would say that SQL has not achieved its design goals yet.  The 
original concept was to write what you want to achieve and have the server 
figure out the best way to get at it.  

What people hate about SQL is that the programmer has to optimize SQL to get 
acceptable performance.  And the optimization is different for every 
implementation.  I think SQL has not hit its stride yet.  When the common $1000 
server has 1024+ CPUs and 1+TB memory, and SQL implementations have adopted 
good multithreading architecture with access to 1024+ CPU dedicated AI engines, 
etc. a lot of the crap associated with performant SQL will go away.

At this point, I think it will be smart to strip out implementation details 
that have made it into the SQL syntax.  There will no longer be a need for it. 
This will make the SQL language simpler and easier to use, understand, and 
reason about.  

Of course, that might not happen until my grandchildren are retired and in a 
nursing home.  But who knows, stranger things have happened.

Neil
www.fairwindsoft.com



Re: The tragedy of SQL

2021-09-15 Thread FWS Neil


> On Sep 15, 2021, at 2:44 PM, Merlin Moncure  wrote:
> 
> On Tue, Sep 14, 2021 at 3:16 PM FWS Neil  <mailto:n...@fairwindsoft.com>> wrote:
>> 
>>> On Sep 14, 2021, at 11:10 AM, Michael Nolan  wrote:
>>> 
>>> I started programming in 1967, and over the last 50+ years I've programmed 
>>> in more languages than I would want to list.  I spent a decade writing in 
>>> FORTRAN on a GA 18/30 (essentially a clone of the IBM 1130) with limited 
>>> memory space, so you had to write EFFICIENT code, something that is a bit 
>>> of a lost art these days.  I also spent a decade writing in COBOL.
>>> 
>>> I've not found many tasks that I couldn't find a way to write in whatever 
>>> language I had available to write it in.  There may be bad (or at least 
>>> inefficient) languages, but there are lots of bad programmers.
>> 
>> Yep, me too.  I would say that SQL has not achieved its design goals yet.  
>> The original concept was to write what you want to achieve and have the 
>> server figure out the best way to get at it.
>> 
>> What people hate about SQL is that the programmer has to optimize SQL to get 
>> acceptable performance.  And the optimization is different for every 
>> implementation.  I think SQL has not hit its stride yet.  When the common 
>> $1000 server has 1024+ CPUs and 1+TB memory, and SQL implementations have 
>> adopted good multithreading architecture with access to 1024+ CPU dedicated 
>> AI engines, etc. a lot of the crap associated with performant SQL will go 
>> away.
>> 
>> At this point, I think it will be smart to strip out implementation details 
>> that have made it into the SQL syntax.  There will no longer be a need for 
>> it. This will make the SQL language simpler and easier to use, understand, 
>> and reason about.
> 
> I think you ought to recognize that many people on this list make
> money directly from managing that complexity :-).
> 

I did not intend to disparage anyone.  People, including myself, make money 
when they provide value and there is certainly value here.

But, I am not sure I understand your inference.  Are you saying (and I am not 
implying you are) that PostgreSQL does not progress in line with the original 
SQL goals of simplifying data access because people are making money off of the 
current complexity?

My only point was that SQL was originally conceived to free the programmer from 
implementation details.  Today, after 47 years, it still has not achieved that 
goal.  As computers get more powerful, of course, they will process more data, 
but they also have the option of moving more of the implementation decision 
making away from the programmer to let the machine figure out the best way to 
handle the request.  Therefore, I do not think SQL has hit its stride yet.

Programmers create dozens of new languages every 10 years or so.  Only a few 
have stood the test of time.  SQL is one of those.  For all its faults, it 
is still amazingly powerful.

Neil
www.fairwindsoft.com



Re: Timestamp with vs without time zone.

2021-09-21 Thread FWS Neil


> On Sep 21, 2021, at 12:34 PM, Dave Cramer  wrote:
> On Tue, 21 Sept 2021 at 13:20, Peter J. Holzer  <mailto:hjp-pg...@hjp.at>> wrote:
> On 2021-09-21 20:50:44 +1200, Tim Uckun wrote:
> > That's all true and I won't argue about the madness that is timezones
> > in the world. I am simply thinking it would be some sort of a struct
> > like thing which would store the numerical value of the time stamp and
> > also the time zone that time was recorded in.  Presumably everything
> > else is an insane calculation from there. What was the offset on that
> > day? I guess it depends on the daylight savings time. What would the
> > conversion to another time zone be? That would depend on the DST
> > settings on that day in both places.
> 
> Yes, but HOW IS THAT TIME ZONE STORED? 
> 
> As a user you can say "I don't care, just make it work somehow".
> 
> But as a developer you have to decide on a specific way. And as a
> database developer in particular you would have to choose a way which
> works for almost everybody.
> 
> And that's the problem because ...
> 
> > Mankind can't agree on what side of the road to drive on, what the
> > electrical voltage should be at the wall, what those plugs should be,
> > how you should charge your phone or anything else for that matter
> 
> ... people have different needs and it would be difficult to satisfy
> them all.
> 
> Simply storing an offset from UTC is simple, fast, doesn't take much
> space - but it would be almost as misleading as the current state. A
> simple offset is not a time zone.
> 
> Storing the IANA timezone names (e.g. 'Europe/Vienna') would store an
> identifier for what most people think of as a time zone - but that takes
> a lot of space, it needs a lookup for almost any operation and worst of
> all, you couldn't index such a column (at least not with a btree index)
> because the comparison functions aren't stable.
> 
> You could use a numeric indentifier instead of the name, that would take
> less space but wouldn't solve the other problems (and add the problem
> that now you have just added another mapping which you need to maintain).
> 
> There are other ways, but I'm sure they all have some pros and some
> cons. None will be perfect.
> 
> So I don't think there is an obvious (or even non-obvious, but clearly
> good) way for the PostgreSQL developers to add a real "timestamp with
> timezone" type.
> 
> As an application developer however, you can define a compound type (or
> just use two or three columns together) which satisfies the needs of
> your specific application.
> 
> > It's just that the phrase "timestamp with time zone" would seem to
> > indicate the time zone is stored somewhere in there.
> 
> I absolutely agree. Calling a type which doesn't include a timezone
> "timestamp with timezone" is - how do I put this? - more than just
> weird. "timestamp without timezone" should be called "local timestamp
> with unspecified timezone" and "timestamp with timezone" should be
> called "global timestamp without timezone". However, those aren't SQL
> names.
> 
> 
> I would say this is a perspective thing. It's a timestamp with a time zone 
> from the client's perspective.

A timestamp cannot have a time zone and be a valid timestamp.

Let me explain.

A timestamp is a single time that exists in the world.  For example March 1, 
2021, 4:15 am is a timestamp.

If you add a time zone (other than UTC), then a timestamp is not always a 
single time that exists in the world.

For example, in the spring, using time zone America/Chicago, on March 14, 2021 
the local time changes at 2am to become 3am.  The time March 14, 2021, 2:30 
am simply does not exist, and therefore cannot be a timestamp.  Apple’s APIs 
will by default automatically change 2:30am to 3:00am.  Is that correct?  Or 
should it change to 3:30am?  Apple has an option for the latter, but the APIs 
don’t work.

In the fall it is even worse.  Using time zone America/Chicago, on November 7, 
2021, 1:30 am occurs twice.  That does not work as a timestamp.  Which one do 
you use, the early one or the late one?  Apple’s APIs give you a choice.

The point being that people do expect to see times in local time, but the only 
real timestamp is in UTC, and I can’t ever imagine a need to store time zone 
information as part of a timestamp.  If you need to store the location that 
data originated from, then store the location or the time zone, but it should 
not be connected to the timestamp.  Location data is completely different from 
time data.
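
Both America/Chicago cases can be demonstrated with Python's standard 
zoneinfo module; the fold attribute (PEP 495) chooses between the two 
occurrences of an ambiguous local time:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

chicago = ZoneInfo("America/Chicago")

# Fall back, 2021-11-07: 1:30 am occurs twice; fold picks which occurrence.
early = datetime(2021, 11, 7, 1, 30, tzinfo=chicago, fold=0)  # CDT, UTC-5
late = datetime(2021, 11, 7, 1, 30, tzinfo=chicago, fold=1)   # CST, UTC-6
print(early.utcoffset(), late.utcoffset())

# Spring forward, 2021-03-14: 2:30 am does not exist as a wall-clock time.
# Round-tripping through UTC shows where that instant actually lands.
gap = datetime(2021, 3, 14, 2, 30, tzinfo=chicago)
print(gap.astimezone(timezone.utc).astimezone(chicago))
```

Note that the same wall-clock tuple maps to two different UTC offsets in the 
fall, and in the spring the nonexistent 2:30 am round-trips to 3:30 am CDT — 
exactly the ambiguity a bare "timestamp with time zone" cannot represent.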

Neil
www.fairwindsoft.com