Thank you, it's ideal for me :)
2015-02-27 15:21 GMT+03:00 Michael Paquier :
> On Fri, Feb 27, 2015 at 9:13 PM, Vadim Gribanov
> wrote:
> > Hi! Where can I find an explanation of how PostgreSQL works with shared
> > memory?
>
> Perhaps this video of Bruce's pr
Hi! Where can I find an explanation of how PostgreSQL works with shared
memory?
---
Best regards, Vadim Gribanov
Linkedin: https://www.linkedin.com/in/yoihito
Skype: v1mk550
Github: https://github.com/yoihito
Hello!
This is a patch that allows choosing not to dump the data for selected
tables.
The intended usage is to make backups smaller and faster by skipping
unneeded data, while still generating a backup that can be restored to obtain
a fully working application.
I use it to avoi
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I also don't know how to use NEW and OLD in PL/pgSQL,
but in PL/Tcl it is possible to use $NEW($my_field) and $OLD($my_field).
--
Vadim Passynkov
-Original Message-
From: Gaetano Mendola [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 14, 2004 2:35 PM
To: [EMAIL PROTECTED]
Subject: [HACKERS] NEW
e Project was one of the
greatest adventures in my life.
Thanks to everyone!
Good luck on your ways.
And - long live Postgres!!!
Vadim
nd transaction could
see the COMMITTED state but still the old value (subtrans id) in xmin: it's not
guaranteed that changes made on CPU1 (V1 was changed first, then V2 was
changed) will appear in the same order on CPU2 (V2 may come first, then V1).
Vadim
ery (because I have to write two files for every transaction,
> rather than one)
The control file is not updated "for every transaction", only on a few special
events such as a checkpoint.
Vadim
---(end of broadcast)---
TIP 3: if posting/readi
.
> >
> > So if you do this, do you still need to store that information in
> > pg_control at all?
Yes: to "speeds up the recovery process".
Vadim
I'll be on vacation from 12/27/02 till 01/20/03.
Vadim
ement the multi-master hooks.
Is it about periodic Phases 2 & 3 or about using Phase 3's LockTable
in Phase 1? The first one can definitely wait, but the second one should
be resolved before merging the pg-r code with main CVS, imo.
Vadim
t miss something here.) Also it seems that we could
perform Phases 2 & 3 periodically during transaction execution.
This would make WS smaller, and conflicts between long-running
transactions from different sites would be resolved faster.
Comments?
Vadim
for concurrent same-row-update); go to 1.
3. Else (PK exists and no one is changing it right now) -> proceed.
PK transaction does the same:
1. No FK -> proceed.
2. FK inserted/updated/selected for update by concurrent transaction
F -> wait for F commit/abort; go to
> Wouldn't it work for cntxDirty to be set not by LockBuffer, but by
> XLogInsert for each buffer that is included in its argument list?
I thought of adding a separate call to mark the context dirty, but the above
should work if all callers of XLogInsert always pass all
modified buffers - please ch
DATE t SET a = 1 WHERE b = 2;
2nd client executes UPDATE t SET a = 2 WHERE b = 2;
at "the same time", you don't know in what order these
queries will be executed on two different servers (because
you can't control which transaction will lock the record(s)
for update first).
Vadim
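The ordering problem described in the message above can be illustrated with a small sketch. This is only a toy model, not PostgreSQL code: the row is a plain dict, and `apply_in_order`, `update_1` and `update_2` are hypothetical names standing in for the two UPDATE statements. It shows that two servers applying the same pair of updates in different lock order end up with different data.

```python
def apply_in_order(row, updates):
    """Apply (predicate, new_a) updates to a copy of the row, one after another."""
    row = dict(row)
    for predicate, new_a in updates:
        if predicate(row):
            row["a"] = new_a
    return row


# Hypothetical stand-ins for the two statements from the message.
update_1 = (lambda r: r["b"] == 2, 1)  # UPDATE t SET a = 1 WHERE b = 2;
update_2 = (lambda r: r["b"] == 2, 2)  # UPDATE t SET a = 2 WHERE b = 2;

start = {"a": 0, "b": 2}

# Assume server A locks the row for client 1 first, server B for client 2 first.
server_a = apply_in_order(start, [update_1, update_2])
server_b = apply_in_order(start, [update_2, update_1])

# The last writer wins on each server, so the two copies diverge.
print(server_a["a"], server_b["a"])
```

Whichever transaction gets the row lock last determines the final value, which is why the execution order has to be agreed on rather than left to each server.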
ob of both PITR and pg_dump.
As I already said - agreed -:)
Vadim
lternative to
pg_dump/pg_restore, but I agree that it's not a big
feature to worry about.
Vadim
commands from pg_copy required, and couldn't
the backup be accomplished by issuing a single command
ALTER SYSTEM BACKUP
(even from pgsql), so that the backup process would die with the entire system -:)
As for tape changing, maybe we could use some timeout and then just
stop the backup process.
cussed then sorry - I'm not going
to start new discussion).
Vadim
ding all data files through
our shared buffer pool? Sorry, I just don't see the point in this
when tar etc. will work just fine. At least for the first release
tar is SuperOK, because there must be, and will be, other
problems/bugs, unrelated to how to read data files, and so
the sooner we start testing
> So ALTER SYSTEM BEGIN BACKUP would turn on atomic write
> and then checkpoint the database.
> So while the OS copy of the data files is going on the
> atomic write would be enabled. So any read of a partial
> write would be fixed up by the usual crash recovery mechanism.
Yes, s
> > How do you get atomic block copies otherwise?
>
> Eh? The kernel does that for you, as long as you're reading the
> same-size blocks that the backends are writing, no?
Good point.
Vadim
block
and on restart you compare the log record's LSN with the
data block's LSN, they are equal and so you *assume*
that the actual data are in place too, which is not the case?
I always thought that the whole point of PITR is to be
able to restore DB fast (faster than pg_restore) *AND*
up to the la
h surely will get WAL-logged even if we
> persuade the buffer manager not to do it for the data pages. Is that
> a problem? Not sure.
It was not about any problem. I just mean that the local buffer pool
could still be used for temporary relations if someone thinks
that it makes any sense,
for finding what you need to read, too.)
>
> How do you get atomic block copies otherwise?
You don't need it,
as long as the whole block is saved in the log on the first change to the
block after the checkpoint (the one you made before the backup).
Vadim
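The argument in the message above can be sketched as a toy model. It is only an illustration under stated assumptions, not the actual WAL format: the `Wal` class, the record tuples and the `replay` function are all hypothetical. The log keeps a whole-block image for the first change to a block after a checkpoint, so replaying the log repairs a block that was copied non-atomically during the backup.

```python
class Wal:
    """Toy write-ahead log: whole-block image on the first change after a checkpoint."""

    def __init__(self):
        self.records = []
        self.imaged = set()  # blocks already saved whole since the last checkpoint

    def checkpoint(self):
        self.imaged.clear()
        self.records.append(("checkpoint",))

    def change(self, pages, block_no, offset, value):
        pages[block_no][offset] = value  # modify the live page
        if block_no not in self.imaged:
            # First change after the checkpoint: log the whole block.
            self.records.append(("full_page", block_no, bytes(pages[block_no])))
            self.imaged.add(block_no)
        else:
            # Later changes: log only what changed.
            self.records.append(("delta", block_no, offset, value))


def replay(records, pages):
    """Redo the log on top of a (possibly torn) copy of the data files."""
    for rec in records:
        if rec[0] == "full_page":
            pages[rec[1]] = bytearray(rec[2])  # overwrites any torn copy
        elif rec[0] == "delta":
            pages[rec[1]][rec[2]] = rec[3]


live = {0: bytearray(b"aaaa")}
wal = Wal()
wal.checkpoint()                  # checkpoint made before the backup
wal.change(live, 0, 0, ord("X"))  # first change: whole block is logged
wal.change(live, 0, 3, ord("Y"))  # second change: delta only

backup = {0: bytearray(b"X?a?")}  # hypothetical torn (non-atomic) copy
replay(wal.records, backup)
```

After the replay the backup copy of block 0 matches the live page, even though the copied bytes were inconsistent, because the whole-block record replaces them before any delta is applied.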
r temporary) relations to close this issue, yes? I personally
don't see any performance issues if we do this.
Vadim
ms my words were
too short). Above is how to do this. And though I agree that it's not
very convenient/handy/cosy to *take care* and fetch numbers in a
separate committed transaction, it's required only in those special
cases, and I think it's better than doing fsync() per each nextv
> Attached is a patch against current CVS that fixes both of the known
> problems with sequences: failure to flush XLOG after a transaction
Great! Thanks... and sorry for missing these cases a year ago -:)
Vadim
ts that occur from the table
> inserts must not happen in the same xact as the nextval's
> XLogInserts. I can demonstrate the behavior quite easily, and
> Bruce posted results that confirmed it.
Just wait until Tom adds check for system RedoRecPtr in nextval()
and try to reproduce this behavi
d to his/her account)?
---
I agree that if nextval-s were the only "write" actions in a transaction
and they made some XLogInsert-s, then WAL must be flushed at commit
time. But that's it. Was this fixed? Very easy.
Vadim
> >> Um, Vadim? Still of the opinion that elog(STOP) is a good
> >> idea here? That's two people now for whom that decision has
> >> turned localized corruption into complete database failure.
> >> I don't think it's a good tradeoff.
>
>
> Mind you, I'm not actually advocating that we do any of this ;-).
I understand -:)
> I was just sketching a possible implementation approach in
> case someone wants to try it.
And I'm just sketching possible problems -:)
Vadim
pletion report.
But what about BEFORE insert/update triggers which could insert
records too?
Vadim
000
clients current pgbench implementation is very poor.
BTW2 - shouldn't we learn if there are really portability/performance
issues in using POSIX mutex-es (and cond. variables) in place of
TAS (and SysV semaphores)?
Vadim
defaultTblNode and indexTblNode.
If it's too late to do for 7.2 then let's wait till 7.3.
Vadim
w what needs to be
> changed and I will work on it this weekend.
Just change index' dir naming as was already discussed.
Vadim
> that even if performance doesn't increase, this patch has a lot of other
> benefits for admins.
Agreed.
Vadim
n mind. Such tests would be more valuable.
Vadim
to set
relkind in relation structure, but I don't see any reason to
do so when we can just use different tblnode number for indices and
name index dirs just like other dirs under 'base' named - ie
only tblnode number is used for dir names, without any additions
unr
ode & relnode)
must be used to identify a file, not any other logical information
totally unrelated to storage issues.
Vadim
ntax to specify default tablespace for indices.
Unfortunately I removed message with patch, can you send it
to me, Bruce?
Vadim
where we got bad LSN.
Maybe it comes from restored pages or from checkpoint LSN,
due to errors in XLogCtl initialization, but for sure it looks
like bug in WAL code.
> Vadim, what do you think of reducing this elog from STOP to a notice
> on a permanent basis? ISTM we saw cases during 7.1 be
e?
Just set t_data->t_oid = newoid() - this is what backend does
in heapam.c:heap_insert().
Vadim
> So, rather than going over everyone's IANAL opinions about mixing
> licenses, let's just let Massimo know that it'd just be a lot
> easier to PostgreSQL/BSD license the whole thing, if he doesn't
> mind too much.
Yes, it would be better.
Vadim
he source code eventually anyway.
And we think that no one will try to fork and commercialize
the server code - today, when SAP & InterBase have opened their DB
code, it seems as "no-brain".
Vadim
what used in user_lock funcs. So, that licence
is unenforceable to everything... except of func names -:)
Vadim
bout ODBC cross-tx cursors -:(
Anyway, *MSQL*, Oracle, Informix - all have osmgr. Do they
have cross-tx cursors in their ODBC drivers?
Vadim
overwriting smgr,
> there's no system item to detect the change of tuples.
So, is tid ok to use for your purposes?
I think we'll be able to restore old tid along with other tuple
data from rollback segments, so I don't see any problem from
osmgr...
Vadim
Let's take one step back - you wrote:
> My assumption is that once you link that code into the backend,
> the entire backend is GPL'ed and any other application code
> you link into it is also (stored procedures, triggers, etc.)
So, one would have to open-source and GPL
u didn't link in libreadline.
Application would explicitly call user_lock() functions in
queries, so the issue is still not clear to me. And once again -
compare the complexities of contrib/userlock and the backend's userlock
code: what's the reason to cover contrib/userlock by GPL?
Vadim
as binary-only, and actually can't be sold for much because you
> have to make the code available for near-zero cost.
I'm talking not about selling contrib/userlock separately, but
about the ability to sell applications which use contrib/userlock.
Sorry, if it was not clear.
Vadim
OID for XactLock purposes,
// so no problem
tag.objId.xid = _user_key_;
--
- but I like standard solutions more -:)
(BTW, key-locking was requested by others a long ago.)
Vadim
e resides in shared memory.
How is the proposed "key locking" different from the user locks we
have right now? Anyone can try to acquire many-many user locks.
Vadim
becomes a problem I can easily change it, but
> I prefer the GPL if possible.
Actually I don't see why you would cover your contrib module by GPL.
There is not so much IP (intellectual property) there. The really new things
which make the new feature possible are in the lock manager.
Vadim
e - just a few bytes in shmem for
key. Auxiliary table would keep refcounters for keys.
> Why wouldn't it work with serializable isolevel?
Because selects see an old database snapshot and so you
wouldn't see a key inserted+committed by a concurrent tx.
Vadim
tion you would first lock a key in excl mode
(for the duration of the transaction), then try to select and insert unless
found? (Note that this will not work with serializable isolevel.)
Comments?
Vadim
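The "lock the key, then select and insert unless found" pattern from the message above can be sketched with threads standing in for concurrent transactions. Everything here is an assumption made for illustration: `key_locks`, `registry_guard`, `table` and `insert_unless_found` are hypothetical names, and the list lookup models a select that always sees the latest committed data (which is exactly what a serializable snapshot would not give you).

```python
import threading
from collections import defaultdict

table = []                               # stands in for a table with no unique index
key_locks = defaultdict(threading.Lock)  # hypothetical key -> exclusive lock map
registry_guard = threading.Lock()        # protects key_locks itself


def insert_unless_found(key):
    with registry_guard:
        lock = key_locks[key]
    with lock:                 # hold the key exclusively for the "transaction"
        if key in table:       # the "select"
            return False
        table.append(key)      # the "insert"
        return True


results = []
threads = [
    threading.Thread(target=lambda: results.append(insert_unless_found("k1")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only one of the eight attempts inserts the key; the others find it already present once they get the lock. If the select worked from a snapshot taken before the lock was granted, a waiting transaction could miss the freshly committed key and insert a duplicate.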
> --
> Rod Taylor
>
> This message represents the official view of the voices in my head
>
> - Original Message -
> From: "Mikheev, Vadim" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Friday, August 17, 2001 2:48 PM
> Subject: [HAC
allowed to lock objects in table - missed in 1.
One could object that 1. is good because user locks never wait.
I argue that "never waiting" for a lock is just as bad as "always waiting".
Someday we'll have time-wait etc features for general lock method
and everybody will be
hread_cond_wait) and should implement light lmgr,
probably with priority locking.
Vadim
>
> Good question. I know the number of function calls to spinlock stuff
> is huge. Seems real semaphores may be a big win on multi-cpu boxes.
Ok, being tired of endless discussions I'll try to use mutexes instead
of spinlocks and run pgbench on my Solaris WS 10 and E4500 (4 CPU
ormance, right?
No. As long as no one has proved with a test that mutexes are bad for
performance...
Funny, such a test would require ~ 1 day of work.
> Should we be spinning waiting for spinlock on multi-cpu machines? Is
> that the answer?
What do you mean?
Vadim
one cases we need in
light lmgr (when we're going to keep lock long enough, eg for IO)
and in other cases we'd better proceed with POSIX mutex-es
or semaphores instead of spinlocks. Queueing backends waiting
for spinlock sounds like nonsense - how are you going to protect
such q
' data file is updated at checkpoint time,
so - not so much IO. I really think that using sequences for system
tables IDs would be good.
Vadim
erything would work perfectly.
>
> I meant we use them in many cases to link entries, and in
> pg_description for descriptions and lots of other things
> that may use them in the future for system table use.
So, add class' ID (uniq id from pg_class) when linking.
Vadim
y
pg_proc, if we want to find table with oid Y we query
pg_class - we always use oids in the context of the "class"
to which an object belongs. This means that two tuples
from different system tables could have same oid values
and everything would work perfectly.
There is no magic around OIDs.
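A minimal sketch of the point made above, assuming nothing about the real catalog layout: `catalogs`, `lookup` and `descriptions` are hypothetical names, and the dict contents are made up. Because every lookup names the system table it searches, the same oid value can exist in two tables without ambiguity, and a cross-table link only needs to store the class together with the oid.

```python
# Hypothetical miniature catalogs: each system table has its own oid space.
catalogs = {
    "pg_proc": {42: "function f()"},
    "pg_class": {42: "table t"},
}


def lookup(class_name, oid):
    """An oid is always resolved in the context of the table it belongs to."""
    return catalogs[class_name][oid]


# A pg_description-like link stores the class together with the oid.
descriptions = {
    ("pg_proc", 42): "comment on function f()",
    ("pg_class", 42): "comment on table t",
}

print(lookup("pg_proc", 42))
print(lookup("pg_class", 42))
```

The two lookups return different objects for the same oid value, and the two description entries do not collide because their keys differ in the class part.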
Va
that a spinlock per PROC structure would be a better answer,
> either; the overhead of getting and releasing each lock would be
> nontrivial, considering the small number of instructions spent at
> each PROC in these routines.
Isn't spinlock just a few ASM instructions?... on most platform
of our spinlocks under load.
We shouldn't use light locks everywhere. Updating/reading MyProc.xid
is very good place to use simple spinlocks... or even better mutexes.
Vadim
compilation).
Isn't it possible for PL/_ANY_L_ too?
> In the PL/pgSQL case it *might* be possible. But is it worth
> it?
Sure.
Vadim
really good world, so it'll not be
> possible for PL's.
Why is it possible in Oracle' world? -:)
Vadim
.
In a good world, rules (PL functions etc.) should be automatically
marked as dirty (i.e. recompilation required) whenever referenced
objects are changed.
Vadim
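The idea in the message above, marking compiled objects dirty when something they reference changes, can be sketched as follows. This is an illustrative toy and not how any PL handler is implemented: the `CompiledObjects` class and the object names are hypothetical.

```python
class CompiledObjects:
    """Toy registry that recompiles a rule/function after a referenced object changes."""

    def __init__(self):
        self.references = {}     # compiled object name -> set of referenced objects
        self.dirty = set()       # names that need recompilation
        self.compile_count = {}  # how often each name has been compiled

    def compile(self, name, referenced):
        self.references[name] = set(referenced)
        self.dirty.discard(name)
        self.compile_count[name] = self.compile_count.get(name, 0) + 1

    def object_changed(self, obj):
        # Mark every compiled object that references obj as dirty.
        for name, referenced in self.references.items():
            if obj in referenced:
                self.dirty.add(name)

    def call(self, name):
        # Recompile lazily, only when the object is actually used again.
        if name in self.dirty:
            self.compile(name, self.references[name])
        return f"{name} ran (compiled {self.compile_count[name]}x)"


cache = CompiledObjects()
cache.compile("my_func", {"orders"})
cache.compile("other_func", {"customers"})
cache.object_changed("orders")  # e.g. the table definition was altered
```

After the change only `my_func` is dirty, and it is recompiled on its next call, while `other_func` keeps its original compiled form.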
" lock requests. For example,
> if GetSnapshotData could use a shared lock on SInvalLock, it'd
> improve concurrency.
Yes, we already talked about a light lock manager (no deadlock detection etc.).
Vadim
alLockId);
> SpinRelease(XidGenLockId);
>
> which is really necessary if you want to avoid assuming that
> TransactionIds can be fetched and stored atomically.
To avoid that assumption one should add per MyProc spinlock.
Vadim
y query snapshot and only if concurrently updated rows (from that set)
satisfy query qual => a row must satisfy snapshot *and* query qual =
double satisfaction guaranteed -:))
And let's remember that this behaviour is required for current RI
constraints implementation.
Vadim
> I am trying to understand why GetSnapshotData() needs to acquire the
> SInval spinlock before it calls ReadNewTransactionId, rather than after.
> I see that you made it do so --- in the commit at
>
> http://www.ca.postgresql.org/cgi/cvsweb.cgi/pgsql/src/backend/storage/ipc/shmem.c.diff?r1=1.41&r2
bugs.
> lose all in the log, or just those not yet written to the DB?
BAR is not for a "simple crash" but for disk crashes. In this case
one will lose as much as the WAL files that were lost.
Vadim
"
> here, and probably also to invoke HeapTupleSatisfiesNow() via the
> HeapTupleSatisfies() macro so that infomask update is checked for.
> Vadim, what do you think?
Looks like there is no drawback in locking buffer so let's lock it.
Vadim
Baby girl on Jun 27.
Vadim
> Any better ideas out there?
Names were always hard for me -:)
> Where did the existing lock type names
> come from, anyway? (Not SQL92 or SQL99, for sure.)
Oracle. Except for Access Exclusive/Share Locks.
Vadim
use. But
PG is not a locking system, so there is no reason to add key lock overhead, because
PG internals are able to handle dirties and we just need to add the same
abilities to externals.
Vadim
user has no ability to choose what's
right for him.
> some level (when we escalate the lock to a full table lock?)
> we simply forget about single keys, but have a new index
> access function that checks the entire index for uniqueness.
I wou
> > update a set a=a+1 where a>2;
> > ERROR: Cannot insert a duplicate key into unique index a_pkey
>
> This is a known problem with unique constraints, but it's not
> easy to fix it.
Yes, it requires dirty reads.
Vadim
uplicate key into unique index a_pkey
We use uniq index for UK/PK but shouldn't. Jan?
Vadim
cannot be easily
> rolled back or undone.
What do you mean?
Vadim
> I had a baby girl on Tuesday. I am working through my
> backlogged emails
> today.
Congratulations -:)
Vadim
plications were produced.
And my point was that it is needless to talk about rollbacks in a
non-transactional system, and in a transactional system one has to
implement rollback somehow.
> BTW, do you know what strategy is used by BSDDB/SDB for
> rollback/undo ?
AFAIR, they use O-smgr => UNDO is requ
> OTOH it is possible to do without rolling back at all as
> MySQL folks have shown us ;)
Not with SDB tables which support transactions.
Vadim
ed
disk space (and, oh yeh, it's not easy to implement -:)).
So, any other opinions about value of O-smgr?
Vadim
are whole pages stored in rollback segments or just the modified data?
This is implementation dependent. Storing whole pages is much easier to do,
but obviously it's better to store just the modified data.
Vadim
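The trade-off in the answer above can be made concrete with a toy undo log. This is a sketch under stated assumptions, not the design of any real rollback segment: the 8 kB page size, the `update`, `rollback` and `undo_bytes` functions, and the tuple contents are all hypothetical. Both strategies restore the page on rollback, but saving only the overwritten bytes needs far less undo space than saving the whole page before each change.

```python
def update(page, offset, new_bytes, undo, whole_page):
    """Overwrite part of a page, saving a before-image in the undo list first."""
    if whole_page:
        undo.append((0, bytes(page)))  # easy: save the entire page
    else:
        # cheaper: save only the bytes that are about to be overwritten
        undo.append((offset, bytes(page[offset:offset + len(new_bytes)])))
    page[offset:offset + len(new_bytes)] = new_bytes


def rollback(page, undo):
    """Restore before-images in reverse order."""
    while undo:
        offset, before = undo.pop()
        page[offset:offset + len(before)] = before


def undo_bytes(undo):
    return sum(len(before) for _, before in undo)


original = bytes(8192)  # assumed 8 kB page, initially all zero bytes
sizes, restored = {}, {}
for whole_page in (True, False):
    page, undo = bytearray(original), []
    update(page, 100, b"new tuple", undo, whole_page)
    update(page, 500, b"another", undo, whole_page)
    sizes[whole_page] = undo_bytes(undo)
    rollback(page, undo)
    restored[whole_page] = bytes(page) == original
```

In this run the whole-page variant stores two full page images, while the modified-data variant stores only the sixteen overwritten bytes.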
ms overwrite smgr has mainly advantages in terms of speed for operations
> other than rollback.
... And rollback is required for < 5% transactions ...
Vadim
ons will not read from table
with big amount of dead data at all! So - why keep dead data in datafiles
for long time? This obviously affects overall system performance.
Vadim
d allow our rollback
segments to grow without limits as well.
> > Non-overwriting smgr can eat all disk space...
> >
> > > You didn't know that? Vadim did ...
> >
> > Didn't I mention a few times that I was inspired by Oracle? -:)
>
> Looking
of transactions to log storage size.
Don't be kidding - in any system transaction size is limited
by available storage. So we should say that more disk space
is required for UNDO. From my POV, spending $100 to buy a 30Gb
disk is not a big deal, keeping in mind that PGSQL requires
$ZERO to be used.
d records from rollback segments should
be faster than from datafiles.
> > > You didn't know that? Vadim did ...
> >
> > Didn't I mention a few times that I was
> > inspired by Oracle? -:)
>
> How does it do MVCC with an overwriting storage manager ?
1. Syst
ve been discussed.
This would be hell.
Vadim
iting smgr later
and implement UNDO then. For the moment we could just
change checkpointer to use checkpoint.redo instead of
checkpoint.undo when defining what log files should be
deleted - it's a few minutes deal, and so is changing it
back.
Vadim
don't think that Oracle writes entire page as before image - just
tuple data and some control info. As for additional IO - we'll do it
anyway to remove "before image" (deleted tuple data) from data files.
Vadim
> >> Impractical ? Oracle does it.
> >
> >Oracle has MVCC?
>
> With restrictions, yes.
What restrictions? Rollback segments size?
Non-overwriting smgr can eat all disk space...
> You didn't know that? Vadim did ...
Didn't I mention a few times t
euse all tuples.
This is what we're discussing now -:)
If the community does not like UNDO then I'll probably try to implement
dead space collector which will read log files and so on. Easy to
#ifdef it in 7.2 to use in 7.3 (or so) with on-disk FSM. Also, I have
to impleme