Re: Where can I find the doxyfile?

2024-02-01 Thread John Morris
Here is the updated patch. It should fix the meson issue when no doxygen is 
present.


doxygen-v3.patch
Description: doxygen-v3.patch


Re: Where can I find the doxyfile?

2024-02-04 Thread John Morris
>> I wish you'd stop proposing the lex filter so that we can discuss the
>> Doxyfile.in file and the accompanying Meson rule.

I see no obstacle to discussing Doxyfile.in and the meson.build file. Your 
suggestions are welcome. If you can find the original Doxyfile, it
would make sense to incorporate it.

>> but having something that purports to display our source code and
>> then show something that is*not* our source tree sounds insane.

The filter enables doxygen to interpret Postgres comments. Any code displayed 
is the original code, not the filtered code.

>> Speaking from my personal point of view, I make very little use
>> of our Doxygen pages,

Same for me. The pages are missing critical information, such as function 
descriptions and descriptions of structure fields. This filter is an attempt at 
fixing those problems.

Of course, nothing will fix doxygen when there are no comments to begin with. 
If we want comprehensive doxygen output, we need to add comments. However, we 
don’t need to add those irritating extra symbols.




Re: Where can I find the doxyfile?

2024-02-05 Thread John Morris
>> It seems something you want to keep for your personal use.
>> I think this filter is the kind of genious idea that it's OK if you
>> can do it at home

I don’t agree with your characterization.

The purpose of the filter is to bring existing Postgres comments into the 
doxygen output. While I haven’t done a full survey, the majority of Postgres 
code has comments describing functions, globals, macros and structure fields.

Currently, those comments are thrown away. They do not appear in the existing 
Doxygen output.

The filter itself is not rocket science. It is a straightforward scanner built 
using the flex tool. I understand many people are more comfortable with perl or 
python, but flex is uniquely appropriate for creating small, efficient scanners.

The resulting output is not perfect. However, it is a major step forward from 
what we are currently producing. Over time, we can gradually update Postgres 
comments to fill in missing information. Eventually, doxygen output could 
become a “first look” tool for looking at code.

Will it be my “first look” tool. Not likely. Like you, I am comfortable with 
Intellij and VsCode. I have my build environments already set up.

I do not see this as a high-risk endeavor. It is “off to the side” of the main 
Postgres code. It takes an existing tool, doxygen, which is mostly useless, and 
starts to make it usable.

As an aside, I have contacted Intellij about allowing custom Doxygen files for 
hover information. No reply yet, but it would allow us to tailor our IDEs to 
display the information we need most.




Re: Where can I find the doxyfile?

2024-02-07 Thread John Morris
>> I think all the explanatory messages in doc/doxygen/meson.build
>> are a bit much.  I think it's enough to just not define the target
>> when the required conditions (dependencies, options) are not there.
>> Maybe something like docs_pdf can serve as an example.
Makes sense, and it is simpler.
Let’s continue after I take a look at docs_pdf.

Thanks,
John



Re: Where can I find the doxyfile?

2024-02-07 Thread John Morris
>> Maybe this is something that can be tweaked on the doxygen side?

I’ll look into that idea. Doxygen is a separate project with its own developers 
to persuade. I could imagine a special “C” or “C++” mode. I wanted to keep 
things simple, and the filter mechanism is how they extend doxygen to support 
other languages.

One alternative is to add the filter to doxygen’s webpage. They currently list 
a number of filters, none of which address this particular issue. The filter 
would become a build dependency like doxygen itself.

In the meantime, I’d like to address Alvaro’s concern about generating 
artifacts. It’s a bit tedious, but a visual survey of the output may add 
confidence or show areas needing improvement.


Re: Where can I find the doxyfile?

2024-02-08 Thread John Morris
>> I think all the explanatory messages in doc/doxygen/meson.build
>> are a bit much.  I think it's enough to just not define the target

Here is a patch with an updated meson.build as you suggested. I agree the 
messages were a bit much.
On the other hand, I would like to see clear error messages when dot or doxygen 
are not installed,
but I’ll leave that for a future discussion.


  *   John
-


doxygen-v4.patch
Description: doxygen-v4.patch


meson.build
Description: meson.build


Re: Where can I find the doxyfile?

2024-02-08 Thread John Morris
>> We also have a handful of .cpp files.
Fortunately, our .cpp files are simple and do not include C++ specific features.
We shouldn’t have any problem adding .cpp to the filter list.

At some point it will be necessary to support general C++, but I wasn’t 
planning to do it yet.

>> This file isn't really following postgres coding style - any reason not to 
>> do so?
I’ll update it to Postgres commenting. The filter started as a standalone 
project.

>> What do you mean with "build time error messages" specifically? Why would we
>> want to output anything at build time (rather than configure time)?
“build time error messages” occur after typing “ninja”.

Documentation is usually produced long after meson setup messages have been 
forgotten.
It is much more useful to see “Please install the doxygen program” after typing 
“ninja doxygen”
than to see “target not supported”, or to have to look through the setup log.

It’s personal preference, but I think the sgml docs should display a similar 
message.

>> I'd add "native: true", it'd not work to find the program in the target
>> environment of a cross build.
Got it. Thanks.


Re: Where can I find the doxyfile?

2024-02-12 Thread John Morris
Update:  another patch with 1) suggested changes, 2) delete old html before 
generating new, and 3) added flex names for the more complex regular 
expressions.

Exercising the “ninja doxygen” command, brought up some issues.

  1.  The generated html directory contained old as well as new pages.
The build script now deletes old pages before building new ones.
  2.  Using a line of dashes to suppress text formatting is not implemented.
Some dashed comments are harder to read, especially around code samples.
  3.  Comments at the end of “#define” are considered part of the define
and not annotated as doxygen comments.

The first issue was very confusing, so it has been fixed.
My preference is to make the other two issues part of a future enhancement. 
Thoughts?

I’ve also been scanning the generated pages looking for anomalies.

  1.  About 1/3 of pages examined.
  2.  In most cases, the filter works as expected.
  3.  Many places are missing information, so the output has blank fields (as 
expected).
  4.  A few places have obvious nonsense. (eg. a group comment applied to one 
element)
  5.  No situations where the output would be misleading.
  6.  In all cases, the source code is “a click away”.

While I had planned to look at *every* page, I’ll probably stop at the 1/3 
sample – unless someone wants to help scan through the pages with me.

I also heard back from Jetbrains about incorporating custom Doxyfiles. They do 
their own rendering and do not invoke the doxygen command. Custom Doxyfiles are 
not going to happen. (It’s probably the same for VS.)

On my Mac M1, generating doxygen takes about 20 seconds. When graphs are added, 
it takes 5 minutes.




doxygen_v5.patch
Description: doxygen_v5.patch


Re: Where can I find the doxyfile?

2024-02-16 Thread John Morris
>> I have found it very strange that a tool like doxygen which can create
>> all sorts of call graphs, just ignores some comments.  The comments
>> above function are very important.

I agree with you . I hated doxygen for decades because of the irritating 
annotations it required.  When I discovered IDEs were creating doxygen-like 
popups from conventional comments, I tried to figure out what they were doing. 
In the process, I discovered filters, and I created a filter to match what I 
thought the IDEs were doing.

(As it turns out, IDEs implement their own rendering independent of doxygen.)

I find it ironic I’ve gone from being a long-time hater of doxygen to being its 
advocate.


  *   John Morris


Tiny tidbit of history. Back in the 70’s, I created a comment extractor for the 
language Ratfor. We used it to maintain an alphabetical index of functions and 
to display pseudo-code. We drew our call graphs by hand and saved them in 
manila folders. Hard to imagine now.



Re: Atomic ops for unlogged LSN

2024-02-29 Thread John Morris
Looks good to me.

  *   John

From: Nathan Bossart 
Date: Thursday, February 29, 2024 at 8:34 AM
To: Andres Freund 
Cc: Stephen Frost , John Morris 
, Bharath Rupireddy 
, Michael Paquier 
, Robert Haas , 
pgsql-hack...@postgresql.org 
Subject: Re: Atomic ops for unlogged LSN
Here is a new version of the patch that uses the new atomic read/write
functions with full barriers that were added in commit bd5132d.  Thoughts?

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


FW: Add the ability to limit the amount of memory that can be allocated to backends.

2023-03-30 Thread John Morris
 Hi Reid!
Some thoughts
I was looking at lmgr/proc.c, and I see a potential integer overflow - both 
max_total_bkend_mem and result are declared as “int”, so the expression 
“max_total_bkend_mem * 1024 * 1024 - result * 1024 * 1024” could have a problem 
when max_total_bkend_mem is set to 2G or more.
/*
* Account for shared memory 
size and initialize
* max_total_bkend_mem_bytes.
*/

pg_atomic_init_u64(&ProcGlobal->max_total_bkend_mem_bytes,

   max_total_bkend_mem * 1024 * 1024 - result * 
1024 * 1024);


As more of a style thing (and definitely not an error), the calling code might 
look smoother if the memory check and error handling were moved into a helper 
function, say “backend_malloc”.  For example, the following calling code

/* Do not exceed maximum allowed memory allocation */
if 
(exceeds_max_total_bkend_mem(Slab_CONTEXT_HDRSZ(chunksPerBlock)))
{
MemoryContextStats(TopMemoryContext);
ereport(ERROR,

(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of 
memory - exceeds max_total_backend_memory"),

errdetail("Failed while creating memory context \"%s\".",

   name)));
}

slab = (SlabContext *) 
malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock));
  if (slab == NULL)
  ….
Could become a single line:
Slab = (SlabContext *) backend_malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock);

Note this is a change in error handling as backend_malloc() throws memory 
errors. I think this change is already happening, as the error throws you’ve 
already added are potentially visible to callers (to be verified). It also 
could result in less informative error messages without the specifics of “while 
creating memory context”.  Still, it pulls repeated code into a single wrapper 
and might be worth considering.

I do appreciate the changes in updating the global memory counter. My 
recollection is the original version updated stats with every allocation, and 
there was something that looked like a spinlock around the update.  (My memory 
may be wrong …). The new approach eliminates contention, both by having fewer 
updates and by using atomic ops.  Excellent.

 I also have some thoughts about simplifying the atomic update logic you are 
currently using. I need to think about it a bit more and will get back to you 
later on that.


  *   John Morris






Re: Where can I find the doxyfile?

2023-10-14 Thread John Morris
Thank you for trying the patch out and commenting on it.

I'm starting to think of it as a project. Here's a quick project statement.

The purpose is to generate improved  Doxygen output while making maximal use of 
how Postgres currently does program comments.

Thinking in terms of specific steps, and adding to what you have mentioned:

  *   Configure Doxyfile so it produces output similar to previous output.
  *   Only build Doxygen output if requested
  *   Don't compile the filter or configure the Doxyfile  if they aren't needed
  *   Include contrib in the sources to document
  *   Provide options for other (non-html) output. (Which ones?)
  *   Update Postgres documentation to include instructions for creating 
Doxygen output.
  *   Mention it in developer guidelines and provide sample code showing a 
"desired" commenting style.

Does that list seem complete? I don't want people to think we're imposing new 
standards or legislating new commenting styles.  It's more a matter of 
describing what we currently do, maybe with some minor suggestions for 
improving.

  *   John Morris


Re: Atomic ops for unlogged LSN

2023-10-26 Thread John Morris
Keeping things up to date.  Here is a rebased patch with no changes from 
previous one.


  *   John Morris


atomic-lsn.patch
Description: atomic-lsn.patch


Re: Atomic ops for unlogged LSN

2023-10-31 Thread John Morris
  *   This patch looks a little different than the last version I see posted 
[0].
That last version of the patch (which appears to be just about committable)

My oops – I was looking at the wrong commit. The newer patch has already been 
committed,  so pretend that last message didn’t happen. Thanks,
   John


Re: Atomic ops for unlogged LSN

2023-11-01 Thread John Morris
Here is what I meant to do earlier. As it turns out, this patch has not been 
merged yet.

This is a rebased version . Even though I labelled it “v3”, there should be no 
changes from “v2”.


atomicLSN-v3-Use-atomic-ops-for-unloggedLSNs.patch
Description: atomicLSN-v3-Use-atomic-ops-for-unloggedLSNs.patch


Re: Including Doxyfile and Meson script for docs into main source tree

2023-11-02 Thread John Morris
>On the patch itself I'm somewhat unconvinced that it is a good idea or
>long-term maintainable to actually have a kinda random copy of the
>configuration file(!) of an external software (doxygen in this)

Rather than copying the Doxyfile, we could just publish the non-default values. 
As a mostly non-Doxygen user, I found it useful to see the config file in its 
entirety, but I agree it is a bit much.

Would only showing modified values make more sense?



From: Stefan Kaltenbrunner 
Date: Thursday, November 2, 2023 at 2:51 AM
To: Bohdan Mart 
Cc: pgsql-hack...@postgresql.org , 'Bruce 
Momjian' , postg...@coyotebush.net , 
John Morris 
Subject: Re: Including Doxyfile and Meson script for docs into main source tree
Hi Bohdan!

I will see about making th ecurrent doxygen configuration available as a
download again in the next few days. Not sure whether there is an
additional actionable item yet as far as the project from John Morris
goes wrt. the postgresql.org sysadmin team.

The patch proposed is an enhancement of the build process in core
postgresql and I think we would look into utilizing that also for the
"official" build once applied.

I dont think we would want to locally maintain a C-based filter and a
custom build procedure otherwise.

On the patch itself I'm somewhat unconvinced that it is a good idea or
long-term maintainable to actually have a kinda random copy of the
configuration file(!) of an external software (doxygen in this) case in
our code that might need continous updating and maintainance to keep up
with new upstream releases.
As it stands the patch right now is against 1.9.4, with 1.9.1 running on
our installation and 1.9.8 being the latest release...


Stefan


On 30.10.23 14:57, Bohdan Mart wrote:
> Hello, Stefan!
>
> What do you think about this? See quoted message from John Morris.
>
> P.S. Also we would like you to provide currently used Doxyfile, as
> Postgres doxigen's home page no longer have link to download said file.
>
> Regards,
>  Bohdan.
>
> On 14.10.23 21:54, John Morris wrote:
>> Thank you for trying the patch out and commenting on it.
>>
>> I'm starting to think of it as a project. Here's a quick project
>> statement.
>>
>> The purpose is to generate improved  Doxygen output while making
>> maximal use of how Postgres currently does program comments.
>>
>> Thinking in terms of specific steps, and adding to what you have
>> mentioned:
>>
>>   * Configure Doxyfile so it produces output similar to previous output.
>>   * Only build Doxygen output if requested
>>   * Don't compile the filter or configure the Doxyfile  if they aren't
>> needed
>>   * Include contrib in the sources to document
>>   * Provide options for other (non-html) output. (Which ones?)
>>   * Update Postgres documentation to include instructions for creating
>> Doxygen output.
>>   * Mention it in developer guidelines and provide sample code showing a
>> "desired" commenting style.
>>
>> Does that list seem complete? I don't want people to think we're
>> imposing new standards or legislating new commenting styles.  It's
>> more a matter of describing what we currently do, maybe with some
>> minor suggestions for improving.
>>
>>   * John Morris
>>


Re: Atomic ops for unlogged LSN

2023-11-06 Thread John Morris
I incorporated your suggestions and added a few more. The changes are mainly 
related to catching potential errors if some basic assumptions aren’t met.

There are basically 3 assumptions. Stating them as conditions we want to avoid.

  *   We should not get an unlogged LSN before reading the control file.
  *   We should not get an unlogged LSN when shutting down.
  *   The unlogged LSN written out during a checkpoint shouldn’t be used.

Your suggestion addressed the first problem, and it took only minor changes to 
address the other two.

The essential idea is, we set a value of zero in each of the 3 situations. Then 
we throw an Assert() If somebody tries to allocate an unlogged LSN with the 
value zero.

I found the comment about cache coherency a bit confusing. We are dealing with 
a single address, so there should be no memory ordering or coherency issues. 
(Did I misunderstand?) I see it more as a race condition. Rather than merely 
explaining why it shouldn’t happen, the new version verifies the assumptions 
and throws an Assert() if something goes wrong.



atomic_lsn_v5.patch
Description: atomic_lsn_v5.patch


Re: Where can I find the doxyfile?

2023-11-06 Thread John Morris
Another update, this time with an abbreviated Doxyfile. Rather than including 
the full Doxyfile, this updated version only includes modified settings. It is 
more compact and more likely to survive across future doxygen versions.


  *   John Morris


doxygen_v2.patch
Description: doxygen_v2.patch


Re: locked reads for atomics

2023-11-10 Thread John Morris
>I wonder if it's worth providing a set of "locked read" functions.

Most out-of-order machines include “read acquire” and “write release” which are 
pretty close to what you’re suggesting. With the current routines, we only have 
“read relaxed” and  “write relaxed”. I think implementing acquire/release 
semantics is a very good idea,

I would also like to clarify the properties of atomics. One very important 
question: Are atomics also volatile?  If so, the compiler has very limited 
ability to move them around. If not, it is difficult to tell when or where they 
will take place unless the surrounding code is peppered with barriers.


Re: Temporary file access API

2022-09-27 Thread John Morris
  *   So I think that code simplification and easy adoption of in-memory data
changes (such as encryption or compression) are two rather distinct goals.
admit that I'm running out of ideas how to develop a framework that'd be
useful for both.

I’m also wondering about code simplification vs a more general 
encryption/compression framework. I started with the current proposal, and I’m 
also looking at David Steele’s approach to encryption/compression in 
pgbackrest. I’m beginning to think we should do a high-level design which 
includes encryption/compression, and then implement it as a “simplification” 
without actually doing the transformations.
















Re: Atomic ops for unlogged LSN

2023-07-19 Thread John Morris
>> Why don't we just use a barrier when around reading the value? It's not like
>> CreateCheckPoint() is frequent?

One reason is that a barrier isn’t needed, and adding unnecessary barriers can 
also be confusing.

With respect to the “debug only” comment in the original code, whichever value 
is written to the structure during a checkpoint will be reset when restarting. 
Nobody will ever see that value. We could just as easily write a zero.

Shutting down is a different story. It isn’t stated, but we require the 
unlogged LSN be quiescent before shutting down. This patch doesn’t change that 
requirement.

We could also argue that memory ordering doesn’t matter when filling in a 
conventional, unlocked structure.  But the fact we have only two cases 1) 
checkpoint where the value is ignored, and 2) shutdown where it is quiescent, 
makes the additional argument unnecessary.

Would you be more comfortable if I added comments describing the two cases? My 
intent was to be minimalistic, but the original code could use better 
explanation.


Re: Atomic ops for unlogged LSN

2023-07-20 Thread John Morris

> what happens if …  reader here stores the old unloggedLSN value
> to control file and the server restarts (clean exit). So, the old
>value is loaded back to unloggedLSN upon restart and the callers of
> GetFakeLSNForUnloggedRel() will see an old/repeated value too. Is it a
> problem?

First, a clarification. The value being saved is the “next” unlogged LSN,
not one which has already been used.
(we are doing “fetch and add”,  not “add and fetch”)

You have a good point about shutdown and startup.  It is vital we
don’t repeat an unlogged LSN. This situation could easily happen
If other readers were active while we were shutting down.

>With an atomic variable, it is guaranteed that the readers
>don't see a torn-value, but no synchronization is provided.

The atomic increment also ensures the sequence
of values is correct, specifically we don’t see
repeated values like we might with a conventional increment.
As a side effect, the instruction enforces a memory barrier, but we are not
relying on a barrier in this case.



Re: Atomic ops for unlogged LSN

2023-07-20 Thread John Morris
Based on your feedback, I’ve updated the patch with additional comments.

  1.  Explain the two cases when writing to the control file, and
  2.  a bit more emphasis on unloggedLSNs not being valid after a crash.

Thanks to y’all.

  *   John


v2-0001-Use-atomic-ops-for-unloggedLSNs.patch
Description: v2-0001-Use-atomic-ops-for-unloggedLSNs.patch


Re: Temporary file access API

2022-09-22 Thread John Morris
I’m a latecomer to the discussion, but as a word of introduction, I’m working 
with Stephen, and I have started looking over the temp file proposal with the 
idea of helping it move along.

I’ll start by summarizing the temp file proposal and its goals.

>From a high level, the proposed code:

  *   Creates an fread/fwrite replacement (BufFileStream) for buffering data to 
a single file.
  *   Updates BufFile, which reads/writes a set of files, to use BufFileStream 
internally.
  *   Does not impact the normal (page cached) database I/O.
  *   Updates all the other places where fread/fwrite and read/write are used.
  *   Creates and removes transient files.
I see the following goals:

  *   Unify all the “other” file accesses into a single, consistent API.
  *   Integrate with VFDs.
  *   Integrate transient files with transactions and tablespaces.
  *   Create a consolidated framework where features like encryption and 
compression can be more easily added.
  *   Maintain good streaming performance.
Is this a fair description of the proposal?

For myself, I’d like to map out how features like compression and encryption 
would fit into the framework, more as a sanity check than anything else, and 
I’d like to look closer at some of the implementation details. But at the 
moment, I want to make sure I have the proper high-level view of the temp file 
proposal.


From: Robert Haas 
Date: Wednesday, September 21, 2022 at 11:54 AM
To: Antonin Houska 
Cc: PostgreSQL Hackers , Peter Eisentraut 
, Stephen Frost 
Subject: Re: Temporary file access API
On Mon, Aug 8, 2022 at 2:26 PM Antonin Houska  wrote:
> > I don't think that (3) is a separate advantage from (1) and (2), so I
> > don't have anything separate to say about it.
>
> I thought that the uncontrollable buffer size is one of the things you
> complaint about in
>
> https://www.postgresql.org/message-id/CA+TgmoYGjN_f=fcerx49bzjhng+gocty+a+xhnrwcvvdy8u...@mail.gmail.com

Well, I think that email is mostly complaining about there being no
buffering at all in a situation where it would be advantageous to do
some buffering. But that particular code I believe is gone now because
of the shared-memory stats collector, and when looking through the
patch set, I didn't see that any of the cases that it touched were
similar to that one.

--
Robert Haas
EDB: http://www.enterprisedb.com





Atomic ops for unlogged LSN

2023-05-23 Thread John Morris
This is a short patch which cleans up code for unlogged LSNs. It replaces the 
existing spinlock with
atomic ops.  It could provide a performance benefit for future uses of
unlogged LSNS,  but for now it is simply a cleaner implementation.


0001-Use-atomic-ops-for-unloggedLSNs.patch
Description: 0001-Use-atomic-ops-for-unloggedLSNs.patch