What is the fan-out (number of child nodes) on each B-tree node in
postgresql? Is it dependent on the size of the keys being indexed? If
so: How?
In B-trees all non-leaf nodes have a bunch of pointers to their child
nodes. What is the size of such a pointer?
Thanks
I doubt this is in the documentation, but you can always read the
source. I'd take a look around:
> src/backend/access/nbtree/
Maybe it's just me being blind, but I couldn't find anything on this
particular issue there. :-(
Any other suggestions?
In any case, I think the answer to your original question is that the
fan-out can be up to several hundred per level, but it's not fixed.
OK, it's beginning to make sense. So the fan-out is given by the key size
and each child node is stored in its own page. Is that correct?
Thanks in advance!
In Postgres, everything is done in pages, so the fan-out is however many
keys fit in a page. Big keys mean fewer; for integers you can fit an
awful lot of keys.
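To put rough numbers on it: with the default 8 kB page and roughly 20
bytes per int4 index entry (an 8-byte index tuple header, the key rounded
up for alignment, and a 4-byte line pointer), you get on the order of 400
entries per page, which matches the "several hundred" above. If you want
to see the real figure, the pageinspect contrib module can report it. A
sketch, assuming pageinspect is installed and using made-up table and
index names:

-- hypothetical example table and index
CREATE TABLE t AS SELECT generate_series(1, 1000000) AS id;
CREATE INDEX t_id_idx ON t (id);

-- block number of the B-tree root page
SELECT root FROM bt_metap('t_id_idx');

-- on the root page, live_items is the number of child pointers,
-- i.e. the fan-out of that page
SELECT live_items, avg_item_size
FROM bt_page_stats('t_id_idx', (SELECT root FROM bt_metap('t_id_idx'))::int);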
OK, interesting. Does that mean that when a node containing only small
values (e.g. integers) is split, it gets an awful lot of child node
In postgresql 8.2 I need a bitmap index for a certain column, but how do
I create it?
(No, a btree is not sufficient. :-) )
When using a function I need to pass each value as an argument, e.g.
myFunction(arg1, arg2, ..., arg999)
However, if I can guarantee that the function will only be used with a
certain table, can I then omit the arguments and let the function
refer to the table columns directly?
In that way
How can I obtain this non-argument-requiring function feature?
I solved it myself: Define the function using composite types
(http://www.postgresql.org/docs/8.2/interactive/xfunc-sql.html)
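For the archives, a minimal sketch of the composite-type trick (the
table, column and function names here are made up): the function takes
the table's row type as its single argument and reads the columns from
that, so the call site passes the whole row instead of 999 scalars.

CREATE TABLE accounts (account_id int, balance int);

-- one argument of the table's composite type instead of many scalars
CREATE FUNCTION is_overdrawn(accounts) RETURNS boolean AS $$
    SELECT $1.balance < 0;
$$ LANGUAGE SQL;

-- pass the whole row:
SELECT is_overdrawn(a) FROM accounts a;
-- or use the equivalent attribute notation:
SELECT a.is_overdrawn FROM accounts a;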
In a web application I would like to use tsearch2 to search for
user-entered keywords. That is, the user provides the keywords in a
space-separated list in some input text field. For that I use
to_tsquery('') but I would like to do this in a safe way
so that the user cannot misuse to_tsquer
I am thinking about how to model tags for a web site in the database.
(Each site entry is assigned a number of tags and a match can then later
be performed between these tags and tags entered by the visitor).
Is tsearch2 the way to do it? Each site entry could then store its tags
in a tsvector
Isn't plainto_tsquery() what you're looking for?
Yes, if plainto_tsquery() is safe enough for user-entered search
keywords, then it looks like it is what I'm looking for. I didn't know it
existed.
Thanks
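For reference, a minimal sketch of the difference, with made-up table and
column names (entries, tags_tsv): plainto_tsquery() treats its input as
plain text, so any tsquery operators the user types are parsed as
ordinary words and the resulting lexemes are simply ANDed.

-- the & | ! characters are not interpreted; the words are ANDed
SELECT plainto_tsquery('apple & banana | !cherry');

-- typical use against a stored tsvector column
SELECT *
FROM entries
WHERE tags_tsv @@ plainto_tsquery('whatever the user typed');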
I have some questions related to tsearch2:
1) Is
...WHERE rank(myTsVector, myTsQuery) > 0 ...
just as fast as
...WHERE myTsVector @@ myTsQuery...
?
2) I will use plainto_tsquery() to parse search keys entered by a
website user into a tsquery. However, if only some of the entered keywords
Does anyone know where I can request an OR-version of plainto_tsquery()?
I don't understand why it doesn't exist already: in most cases, when
searching with user-entered keywords, some rows should be returned even
though not ALL keywords are matched.
Thanks for your response! Let me try to elaborate on what I meant in my
original post.
If R is the set of words in the tsvector for a given table row and S is
the set of keywords to search for (entered by e.g. a website user) I
would like to receive all rows for which the intersection between R and
S is non-empty.
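In the meantime, one workaround is to build the OR query yourself: join
the user's words with | and hand that to to_tsquery(). A sketch with
made-up names (entries, tags_tsv), assuming the input has already been
stripped of tsquery syntax such as &, |, ! and parentheses:

-- 'apple banana cherry'  ->  'apple | banana | cherry'
SELECT *
FROM entries
WHERE tags_tsv @@ to_tsquery(
          regexp_replace(trim('apple banana cherry'), '[[:space:]]+', ' | ', 'g'));

That matches every row whose tsvector contains at least one of the
keywords, i.e. rows where the intersection of R and S is non-empty.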
I need a way to perform a series of money transactions (row inserts)
together with some row updates in such a way that integrity is ensured
and performance is high.
I have two tables:
ACCOUNTS (
account_id int,
balance int
);
TRANSACTIONS (
transaction_id int,
source_
OK, thanks. But what about the second question, in which the UPDATE is
based on a SELECT max(...) statement on another table? How can I ensure
that no other process inserts a row between my SELECT max() and UPDATE -
making my SELECT max() invalid?
A table lock could be an option but I am only in
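Along the lines of the table-lock idea, a sketch (the amount column and
the account id below are made up, since the schema above is incomplete):
fold the SELECT max() into the UPDATE so both run against the same
snapshot, and take a lock that conflicts with INSERT but still lets plain
SELECTs through.

BEGIN;
-- blocks concurrent INSERT/UPDATE/DELETE on transactions (and other
-- holders of this lock mode); plain SELECTs are not blocked
LOCK TABLE transactions IN SHARE ROW EXCLUSIVE MODE;

UPDATE accounts
SET    balance = (SELECT max(amount)
                  FROM   transactions
                  WHERE  transactions.account_id = accounts.account_id)
WHERE  accounts.account_id = 42;   -- made-up column and id
COMMIT;

SHARE ROW EXCLUSIVE conflicts with the ROW EXCLUSIVE lock that INSERT
takes, so no row can sneak in between the max() and the UPDATE; it also
conflicts with itself, so two such transactions queue up instead of
deadlocking.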
It has been suggested [1] that a good way to select K random rows
efficiently from a table is to
1) add a new column, random_number, to the table
and initialize it with random()
2) perform the following query:
SELECT *
FROM mydata
WHERE random_number >= (SELECT RANDO
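Spelled out, with mydata, random_number and a sample size of 1000
standing in as assumed names, the suggested approach looks roughly like
this: give every row a random value once, index it, and then range-scan
from a fresh random starting point.

-- one-time setup
ALTER TABLE mydata ADD COLUMN random_number double precision;
UPDATE mydata SET random_number = random();
CREATE INDEX mydata_random_idx ON mydata (random_number);

-- per query: pick a random starting point and take the next 1000 rows
-- (a starting point near 1.0 can return fewer than 1000 rows)
SELECT *
FROM   mydata
WHERE  random_number >= (SELECT random())
ORDER  BY random_number
LIMIT  1000;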
Another way to look at the problem is: How do I sample a subset of size
K efficiently? A query like
SAMPLE 1000 OF
(SELECT * FROM mydata WHERE )
should return 1000 random rows from the select statement so that two
consecutive evaluations of the query would only with very little
probabil
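As an aside, newer PostgreSQL releases (9.5 and later) have something
close to this in TABLESAMPLE, though it samples a percentage of a base
table rather than an exact K rows from an arbitrary subquery:

-- roughly 1% of mydata, capped at 1000 rows; not an exact-K sample
SELECT *
FROM   mydata TABLESAMPLE BERNOULLI (1)
LIMIT  1000;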
How important is true randomness?
The goal is an even distribution but currently I have not seen any way
to produce any kind of random sampling efficiently. Notice the word
"efficiently". The naive way of taking a random sample of size K:
(SELECT * FROM mydata ORDER BY random() LIMIT K)
is
All you're doing is picking random =subsequences= from the same
permutation of the original data.
You have some good points in your reply. I am very much aware of this
non-random behavior you point out for the "static random-value column"
approach but at least it is fast, which is a requiremen
I really need some input:
In a system that handles money transfers I have a table to store each
money transfer. A number of different events can result in a money
transfer but the events are so different that each event type is stored
in its own table.
So we have a schema of the form:
TRA