Re: Reg: Help to understand the source code

Preethi S Thu, 23 Apr 2020 18:13:32 -0700

Thank you Paul! This certainly helps.

On Thu, Apr 23, 2020 at 12:26 PM Paul Jungwirth <p...@illuminatedcomputing.com>
wrote:


> On 4/23/20 8:44 AM, Preethi S wrote:
> > I am fairly new to postgres and I am trying to understand how the data
> > is processed during the insert from buffer to the disk. Can someone help
> > me with that? Also, I would like to see source code workflow. Can
> > someone help me with finding the source code for the data
> > insertion/modification workflow.
>
> I'm also a Postgres hacker newbie, but I've spent some time adding
> SQL:2011 FOR PORTION OF support to UPDATE/DELETE, so I've gone through
> that learning process. (I should say "going through". :-)
>
> I'd say be prepared to spend a *lot* of time reading the code.
> Personally I use `grep -r` a lot and just read and read. For specifics
> you can use a debugger or insert
> `ereport(NOTICE, (errmsg("something %s", foo)))`
> and run queries (or the test suite). Also many subfolders have an
> extensive README that will guide you. Some of the READMEs may take an
> hour or more to get through and understand, but reading them is worth
> it.
>
> It helped me a lot to spend several years writing occasional Postgres C
> extensions before really doing anything in the core codebase. There are
> lots of basics you learn that way. There are a bunch of articles and
> presentations out there on writing extensions that you might find
> helpful.
>
> Postgres processes queries in several steps:
>
> - parse
> - analyze
> - rewrite
> - plan
> - optimize
> - execute
>
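
To keep the order of the steps listed above in mind, here is a minimal C sketch that only models the sequence. The enum and function names are hypothetical labels made up for illustration; they are not identifiers from the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage labels; these are NOT identifiers from PostgreSQL. */
typedef enum QueryStage {
    STAGE_PARSE,
    STAGE_ANALYZE,
    STAGE_REWRITE,
    STAGE_PLAN,
    STAGE_OPTIMIZE,
    STAGE_EXECUTE,
    STAGE_COUNT     /* number of stages, not a stage itself */
} QueryStage;

static const char *const stage_names[STAGE_COUNT] = {
    "parse", "analyze", "rewrite", "plan", "optimize", "execute"
};

/* Return the printable name of a stage. */
const char *stage_name(QueryStage stage)
{
    assert((int) stage >= 0 && (int) stage < STAGE_COUNT);
    return stage_names[stage];
}

/* Return the stage that follows `stage`, or STAGE_COUNT after the last one. */
QueryStage next_stage(QueryStage stage)
{
    assert((int) stage >= 0 && (int) stage < STAGE_COUNT);
    return (QueryStage) ((int) stage + 1);
}
```

Walking from STAGE_PARSE with next_stage() until STAGE_COUNT visits the steps in the order given in the list.
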
> The parse step is a bison grammar (look for gram.y in
> src/backend/parser). Basically it fills in structs that break what the
> user typed into its parts.
>
> The analyze step starts to make sense of the parse results. Look at
> parser/analyze.c. It maps input strings to database objects---for
> example looking up table/column names (and making sure they really
> exist). Here you're sort of just copying things from the parse structs
> to different structs. You're building up Node trees that later steps can
> use. I think the analyze step is often considered to be still part of
> the parse phase.
>
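
The "map input strings to database objects" idea can be illustrated with a toy lookup. This is only a sketch under stated assumptions: ToyTable and toy_resolve_column are hypothetical names, and the real analyze code works against the system catalogs rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a catalog entry; this is NOT a PostgreSQL struct. */
typedef struct ToyTable {
    const char *name;
    const char *const *columns;   /* column names, in attribute order */
    int ncolumns;
} ToyTable;

/*
 * Resolve a column name the user typed into a 1-based attribute number,
 * or return 0 when the table has no such column (the caller would then
 * report an error to the user).
 */
int toy_resolve_column(const ToyTable *table, const char *colname)
{
    assert(table != NULL && colname != NULL);
    for (int i = 0; i < table->ncolumns; i++) {
        if (strcmp(table->columns[i], colname) == 0)
            return i + 1;
    }
    return 0;
}
```

The point is the shape of the work: a string goes in, and either a reference to a known object or a "does not exist" result comes out.
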
> It seems like each SQL "clause" has its own transformFoo function, so
> probably you'll want to add your own (transformMyAwesomeFeatureClause)
> and then call it from its "parent" (e.g. transformUpdateStmt).
>
> If you add new Node types you'll need to edit nodes/*funcs.c and also
> probably teach some switch statements how to handle them. If you are
> filling in a struct but then later in the pipeline find that what you
> wrote isn't there anymore, you probably forgot to implement a copy
> function.
>
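
The "forgot to implement a copy function" symptom is easier to picture with a toy version of tagged nodes. The sketch below is an illustration only: the Toy* types and toy_copy_node are hypothetical, and the real node support code in nodes/*funcs.c is considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node tags and structs, loosely modeled on tagged Node trees. */
typedef enum ToyNodeTag { T_ToyConst, T_ToyColumnRef } ToyNodeTag;

typedef struct ToyNode { ToyNodeTag tag; } ToyNode;               /* common header */
typedef struct ToyConst { ToyNode node; int value; } ToyConst;
typedef struct ToyColumnRef { ToyNode node; char *name; int location; } ToyColumnRef;

/* Deep-copy a node by switching on its tag. Returns NULL on allocation failure. */
ToyNode *toy_copy_node(const ToyNode *from)
{
    assert(from != NULL);
    switch (from->tag) {
    case T_ToyConst: {
        const ToyConst *src = (const ToyConst *) from;
        ToyConst *dst = calloc(1, sizeof(*dst));   /* zero-filled */
        if (dst == NULL)
            return NULL;
        dst->node.tag = T_ToyConst;
        dst->value = src->value;
        return &dst->node;
    }
    case T_ToyColumnRef: {
        const ToyColumnRef *src = (const ToyColumnRef *) from;
        ToyColumnRef *dst = calloc(1, sizeof(*dst));
        if (dst == NULL)
            return NULL;
        dst->node.tag = T_ToyColumnRef;
        dst->name = malloc(strlen(src->name) + 1);
        if (dst->name == NULL) {
            free(dst);
            return NULL;
        }
        strcpy(dst->name, src->name);
        /* Omit the next line and every copy silently has location == 0. */
        dst->location = src->location;
        return &dst->node;
    }
    }
    return NULL;   /* unknown tag: this switch was never taught about it */
}
```

Because the new node starts out zero-filled, a field the copy function skips does not crash anything; it just comes out empty later in the pipeline, which matches the symptom described above.
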
> The rewrite/plan/optimize steps aren't things you need to worry about
> too much if you're interested in DML, but you can read more about them
> in the source code. Rewrite in particular is pretty niche (views and
> RULEs).
>
> The execute step is the most challenging I think. It has its own Node
> trees and also keeps an execution state. Probably you'll need to look at
> src/backend/executor/nodeModifyTable.c among others. You'll also need to
> learn about TupleTableSlots. (If anyone here has a good learning
> resource for TTS I would also be glad to read it.)
>
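
The split between the executor's node tree and its execution state can be sketched with a toy "plan" that is never modified and a separate state struct that is. The names below (ToyRangePlan, ToyRangeState, and the toy_range_* functions) are hypothetical, and the real executor hands rows around in TupleTableSlots rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Read-only description of the work: produce the integers first..last. */
typedef struct ToyRangePlan {
    int first;
    int last;
} ToyRangePlan;

/* Mutable execution state, kept separate from the plan it runs. */
typedef struct ToyRangeState {
    const ToyRangePlan *plan;   /* never modified during execution */
    int next;                   /* next value to hand out */
} ToyRangeState;

/* Set up the state for a fresh run of the plan. */
void toy_range_init(ToyRangeState *state, const ToyRangePlan *plan)
{
    assert(state != NULL && plan != NULL);
    state->plan = plan;
    state->next = plan->first;
}

/* Pull one value on demand; returns false once the range is exhausted. */
bool toy_range_next(ToyRangeState *state, int *out)
{
    assert(state != NULL && out != NULL);
    if (state->next > state->plan->last)
        return false;
    *out = state->next++;
    return true;
}
```

A caller initializes the state once and then calls toy_range_next() in a loop until it returns false. The plan struct is unchanged afterwards, so it could be run again with a new state.
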
> I'm afraid this description is comically dumbed down, but hopefully it
> can be something like a map. I'd probably just take an UPDATE statement
> and try to trace it through the pipeline, and maybe experiment with
> small changes along the way. You can add things to src/test/regress as
> you go.
>
> And the mailing list is a very friendly place to ask questions.
>
> Yours,
>
> --
> Paul              ~{:-)
> p...@illuminatedcomputing.com
>
>
>
