Re: [GENERAL] Large data and slow queries

2017-04-27 Thread Samuel Williams
). > > *** We blogged yesterday about TimescaleDB's partitioning design choices in > more depth, if you are interested: > https://blog.timescale.com/time-series-data-why-and-how-to-use-a-relational-database-instead-of-nosql-d0cd6975e87c > > > On Thu, Apr 20, 2017 at 5:43

Re: [GENERAL] Large data and slow queries

2017-04-20 Thread Samuel Williams
Scott, Vick, the vast majority of the data is generic. But there are some specific events we need to look up quickly which are probably less than a few 100,000 records. We did evaluate partial indexes vs full indexes. The partial index speeds up our specific queries significantly while only taking

Re: [GENERAL] Large data and slow queries

2017-04-20 Thread Samuel Williams
such that there are no more than about 10 million rows per partition (I've >> done this by using a id % 100 computation). Maybe in your case it makes >> sense to partition it based on the "what" field, because it appears you are >> trying to do that with your partia

[GENERAL] cluster on brin indexes?

2017-04-19 Thread Samuel Williams
I see this, but no follow up: https://www.postgresql.org/message-id/CAEepm%3D2LUCLZ2J4cwPv5DisHqD9BE_AXnqHGqf0Tj-cvtiqVcQ%40mail.gmail.com So, is it possible or not? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresq

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
Okay, so after changing longitude/latitude to float4, and re-organizing the table a bit, I got the query down from about 8 minutes to 40 seconds. The details are in the gist comments: https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121 Now, just need to get performance another 2 ord

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
Ah right, yeah, it's insert only. So, it's never been vacuumed. On 20 April 2017 at 01:25, Stephen Frost wrote: > Greetings, > > * Samuel Williams (space.ship.travel...@gmail.com) wrote: >> Thanks for all the suggestions Stephen. >> >> > That explain a

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
;m using the default 9.6 config, I thought that auto-vacuum was on by default? On 20 April 2017 at 00:48, Stephen Frost wrote: > Greetings, > > * Samuel Williams (space.ship.travel...@gmail.com) wrote: >> We want the following kinds of query to be fast: > > "kinds of q

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
Martijn that is a good question. It's because we are only concerned with a subset of events for this index and this particular query. The query planner can recognise this and use the index correctly. By doing this, we reduce the size of the index significantly. In the best case, where we only wante

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
ing the different indexes explicitly? The correlation between user_id and location... well, it's somewhat temporally related. On 19 April 2017 at 22:50, Alban Hertroys wrote: > >> On 19 Apr 2017, at 6:01, Samuel Williams >> wrote: >> >> Hi. >> >> We hav

Re: [GENERAL] Large data and slow queries

2017-04-19 Thread Samuel Williams
> Did that 50% performance gain come from just the datatype, or that fact that > the index became smaller? How would one measure this? On 19 April 2017 at 19:48, John R Pierce wrote: > On 4/19/2017 12:31 AM, vinny wrote: >> >> Given the number of records, my first thought was either partitionin

Re: [GENERAL] Large data and slow queries

2017-04-18 Thread Samuel Williams
16:42, John R Pierce wrote: > On 4/18/2017 9:01 PM, Samuel Williams wrote: >> >> We want the following kinds of query to be fast: >> >> SELECT ... AND (latitude > -37.03079375089291 AND latitude < >> -36.67086424910709 AND longitude > 174.6307139779924 AND lon

Re: [GENERAL] Large data and slow queries

2017-04-18 Thread Samuel Williams
Oh, I've also tried earth distance and ll_to_earth in a GIST index... it was slower that the BTREE index on a small subset of data in my tests. On 19 April 2017 at 16:01, Samuel Williams wrote: > Hi. > > We have 400,000,000 records in a table (soon to be 800,000,000), here >

[GENERAL] Large data and slow queries

2017-04-18 Thread Samuel Williams
Hi. We have 400,000,000 records in a table (soon to be 800,000,000), here is the schema (\d+) https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121 We want the following kinds of query to be fast: SELECT DISTINCT "user_event"."user_id" FROM "user_event" WHERE "user_event"."what" IN

Re: [GENERAL] Index size

2016-12-09 Thread Samuel Williams
Thanks Kevin, that makes sense. Yeah, I understand the architectural difference a bit more now. I also read that when you change a column which is not index, all the indexes for that row need to be updated anyway. Is that correct? On 7 December 2016 at 05:27, wrote: > Samuel Williams wr

Re: [GENERAL] Index size

2016-12-05 Thread Samuel Williams
Melvin, uh... I'm a software engineer... since when was it a problem to want to know how things work and why they are different? If you have nothing to contribute of a relevant technical nature, please don't reply, I'm really not interested.

Re: [GENERAL] Index size

2016-12-04 Thread Samuel Williams
Melvin, of course there are differences. However, I suspect there are at least SOME tangible differences which can be identified. On 4 December 2016 at 15:53, Melvin Davidson wrote: > > > On Sat, Dec 3, 2016 at 9:32 PM, Steve Atkins wrote: > >> >> > On Dec 3, 2016,

Re: [GENERAL] Index size

2016-12-04 Thread Samuel Williams
So, uh, my main question was, does MySQL add null values to an index, and is this different from Postgres. The schema is irrelevant, except that the column allows null values. I noticed when you create an index you can add a where clause. Could it be I should add WHERE the fields are not null? Her

Re: [GENERAL] Index size

2016-12-03 Thread Samuel Williams
Thanks everyone for your feedback so far. I've done a bit more digging: MySQL in MBytes (about 350 million rows): index_user_event_on_what_category_id_created_at_latlng | 22806.00 index_user_event_for_reporting | 18211.00 index_user_event_on_created_at | 9519.00 index_user_event_on_user_id | 6884

Re: [GENERAL] Index size

2016-11-30 Thread Samuel Williams
Is there any reason why for the same data set, and same indexes, that the data in postgres would be significantly larger than innodb/mariadb? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-gener

[GENERAL] Index size

2016-11-29 Thread Samuel Williams
I'd like to understand a bit more about indexes in PG. When I have a row in a table, and an index, say, for a single column, does that duplicate the entire row on disk? Or is there some kind of id lookup involved? Thanks. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To

Re: [GENERAL] initdb createuser commands

2016-11-10 Thread Samuel Williams
> Really? So naming them pg_initdb and pg_createdb would help to clarify their > use? Yes. > Perhaps you missed: https://www.postgresql.org/docs/9.6/static/app-pg-ctl.html I meant a man page that details the ENTIRE Postgres command line tools. > Command line aliases and other stuff I've been

Re: [GENERAL] initdb createuser commands

2016-10-31 Thread Samuel Williams
On 1 November 2016 at 06:50, Daniel Verite wrote: > Samuel Williams wrote: > >> John - that's an interesting example. If it's that easy, why isn't >> that the approach given in tutorials and other documentation? What was >> the motivation for the c

Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Samuel Williams
Sorry, just to clarify, b "worst" I don't mean functionality, I mean the way the commands are named and organised. On 31 October 2016 at 13:07, Samuel Williams wrote: > Mike, I agree with "the postgres way of doing things". I'm suggesting that > >> th

Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Samuel Williams
#x27;s worthy of some attention. On 31 October 2016 at 12:51, Mike Sofen wrote: > From: Samuel Williams Sent: Sunday, October 30, 2016 3:42 PM > As a community I'd think that having feedback from a new user would be > valuable since as you say, sometimes when you get ingrained i

Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Samuel Williams
x27;d feel more confident to invest the time and effort required to improve things. But so far, I'm getting the opposite. Kind regards, Samuel On 31 October 2016 at 04:30, Adrian Klaver wrote: > On 10/30/2016 12:15 AM, Samuel Williams wrote: >> >> Adrian, I like the idea

Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Samuel Williams
John - that's an interesting example. If it's that easy, why isn't that the approach given in tutorials and other documentation? What was the motivation for the createuser command? On 30 October 2016 at 20:20, John R Pierce wrote: > On 10/30/2016 12:15 AM, Samuel Williams w

Re: [GENERAL] initdb createuser commands

2016-10-30 Thread Samuel Williams
ly just my 2c, I'd be happy to submit a PR but can you confirm intention to work on it to acceptance? Otherwise I'm just wasting everyone's time including my own :) On 30 October 2016 at 13:31, Gavin Flower wrote: > On 30/10/16 11:25, John R Pierce wrote: >> >> On 1

Re: [GENERAL] initdb createuser commands

2016-10-29 Thread Samuel Williams
FYI, https://wiki.archlinux.org/index.php/PostgreSQL mentions initdb, createuser, createdb and several others. I think my suggestion is still relevant and something that would improve the system for new users :) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make change

Re: [GENERAL] initdb createuser commands

2016-10-29 Thread Samuel Williams
l of engineering and thoughtfulness. On 29 October 2016 at 13:46, John R Pierce wrote: > On 10/28/2016 4:55 PM, Samuel Williams wrote: >> >> Even if maintaining the old commands, they could print >> out to stderr "This command is deprecated and now wraps `pg_ctl init`

Re: [GENERAL] initdb createuser commands

2016-10-29 Thread Samuel Williams
rience a lot better. Given how awesome postgres is in other areas, it's disappointing that the command line tools don't reflect the same level of engineering and thoughtfulness. On 29 October 2016 at 16:36, Samuel Williams wrote: > I think there is an opportunity here to make the exp

Re: [GENERAL] initdb createuser commands

2016-10-28 Thread Samuel Williams
we update the Arch Linux documentation to prefer these commands? What is the recommendation here? On 29 October 2016 at 12:39, John R Pierce wrote: > On 10/28/2016 4:31 PM, Samuel Williams wrote: >> >> Just wondering as the naming of these commands seems overly generic >> and for a

[GENERAL] initdb createuser commands

2016-10-28 Thread Samuel Williams
Hello, Perhaps I'm missing something. However, it seems to me that there are several "generically" named commands, e.g. initdb, createuesr which come as part of postgresql. In my mind, these commands are sufficiently generic that they might clash with other commands. It's also not obvious they ar