Re: Where can I find the doxyfile?
Here is the updated patch. It should fix the meson issue when no doxygen is present. doxygen-v3.patch Description: doxygen-v3.patch
Re: Where can I find the doxyfile?
>> I wish you'd stop proposing the lex filter so that we can discuss the >> Doxyfile.in file and the accompanying Meson rule. I see no obstacle to discussing Doxyfile.in and the meson.build file. Your suggestions are welcome. If you can find the original Doxyfile, it would make sense to incorporate it. >> but having something that purports to display our source code and >> then show something that is*not* our source tree sounds insane. The filter enables doxygen to interpret Postgres comments. Any code displayed is the original code, not the filtered code. >> Speaking from my personal point of view, I make very little use >> of our Doxygen pages, Same for me. The pages are missing critical information, such as function descriptions and descriptions of structure fields. This filter is an attempt at fixing those problems. Of course, nothing will fix doxygen when there are no comments to begin with. If we want comprehensive doxygen output, we need to add comments. However, we don’t need to add those irritating extra symbols.
Re: Where can I find the doxyfile?
>> It seems something you want to keep for your personal use. >> I think this filter is the kind of genious idea that it's OK if you >> can do it at home I don’t agree with your characterization. The purpose of the filter is to bring existing Postgres comments into the doxygen output. While I haven’t done a full survey, the majority of Postgres code has comments describing functions, globals, macros and structure fields. Currently, those comments are thrown away. They do not appear in the existing Doxygen output. The filter itself is not rocket science. It is a straightforward scanner built using the flex tool. I understand many people are more comfortable with perl or python, but flex is uniquely appropriate for creating small, efficient scanners. The resulting output is not perfect. However, it is a major step forward from what we are currently producing. Over time, we can gradually update Postgres comments to fill in missing information. Eventually, doxygen output could become a “first look” tool for looking at code. Will it be my “first look” tool. Not likely. Like you, I am comfortable with Intellij and VsCode. I have my build environments already set up. I do not see this as a high-risk endeavor. It is “off to the side” of the main Postgres code. It takes an existing tool, doxygen, which is mostly useless, and starts to make it usable. As an aside, I have contacted Intellij about allowing custom Doxygen files for hover information. No reply yet, but it would allow us to tailor our IDEs to display the information we need most.
Re: Where can I find the doxyfile?
>> I think all the explanatory messages in doc/doxygen/meson.build >> are a bit much. I think it's enough to just not define the target >> when the required conditions (dependencies, options) are not there. >> Maybe something like docs_pdf can serve as an example. Makes sense, and it is simpler. Let’s continue after I take a look at docs_pdf. Thanks, John
Re: Where can I find the doxyfile?
>> Maybe this is something that can be tweaked on the doxygen side? I’ll look into that idea. Doxygen is a separate project with its own developers to persuade. I could imagine a special “C” or “C++” mode. I wanted to keep things simple, and the filter mechanism is how they extend doxygen to support other languages. One alternative is to add the filter to doxygen’s webpage. They currently list a number of filters, none of which address this particular issue. The filter would become a build dependency like doxygen itself. In the meantime, I’d like to address Alvaro’s concern about generating artifacts. It’s a bit tedious, but a visual survey of the output may add confidence or show areas needing improvement.
Re: Where can I find the doxyfile?
>> I think all the explanatory messages in doc/doxygen/meson.build >> are a bit much. I think it's enough to just not define the target Here is a patch with an updated meson.build as you suggested. I agree the messages were a bit much. On the other hand, I would like to see clear error messages when dot or doxygen are not installed, but I’ll leave that for a future discussion. * John - doxygen-v4.patch Description: doxygen-v4.patch meson.build Description: meson.build
Re: Where can I find the doxyfile?
>> We also have a handful of .cpp files. Fortunately, our .cpp files are simple and do not include C++ specific features. We shouldn’t have any problem adding .cpp to the filter list. At some point it will be necessary to support general C++, but I wasn’t planning to do it yet. >> This file isn't really following postgres coding style - any reason not to >> do so? I’ll update it to Postgres commenting. The filter started as a standalone project. >> What do you mean with "build time error messages" specifically? Why would we >> want to output anything at build time (rather than configure time)? “build time error messages” occur after typing “ninja”. Documentation is usually produced long after meson setup messages have been forgotten. It is much more useful to see “Please install the doxygen program” after typing “ninja doxygen” than to see “target not supported”, or to have to look through the setup log. It’s personal preference, but I think the sgml docs should display a similar message. >> I'd add "native: true", it'd not work to find the program in the target >> environment of a cross build. Got it. Thanks.
Re: Where can I find the doxyfile?
Update: another patch with 1) suggested changes, 2) delete old html before generating new, and 3) added flex names for the more complex regular expressions. Exercising the “ninja doxygen” command, brought up some issues. 1. The generated html directory contained old as well as new pages. The build script now deletes old pages before building new ones. 2. Using a line of dashes to suppress text formatting is not implemented. Some dashed comments are harder to read, especially around code samples. 3. Comments at the end of “#define” are considered part of the define and not annotated as doxygen comments. The first issue was very confusing, so it has been fixed. My preference is to make the other two issues part of a future enhancement. Thoughts? I’ve also been scanning the generated pages looking for anomalies. 1. About 1/3 of pages examined. 2. In most cases, the filter works as expected. 3. Many places are missing information, so the output has blank fields (as expected). 4. A few places have obvious nonsense. (eg. a group comment applied to one element) 5. No situations where the output would be misleading. 6. In all cases, the source code is “a click away”. While I had planned to look at *every* page, I’ll probably stop at the 1/3 sample – unless someone wants to help scan through the pages with me. I also heard back from Jetbrains about incorporating custom Doxyfiles. They do their own rendering and do not invoke the doxygen command. Custom Doxyfiles are not going to happen. (It’s probably the same for VS.) On my Mac M1, generating doxygen takes about 20 seconds. When graphs are added, it takes 5 minutes. doxygen_v5.patch Description: doxygen_v5.patch
Re: Where can I find the doxyfile?
>> I have found it very strange that a tool like doxygen which can create >> all sorts of call graphs, just ignores some comments. The comments >> above function are very important. I agree with you . I hated doxygen for decades because of the irritating annotations it required. When I discovered IDEs were creating doxygen-like popups from conventional comments, I tried to figure out what they were doing. In the process, I discovered filters, and I created a filter to match what I thought the IDEs were doing. (As it turns out, IDEs implement their own rendering independent of doxygen.) I find it ironic I’ve gone from being a long-time hater of doxygen to being its advocate. * John Morris Tiny tidbit of history. Back in the 70’s, I created a comment extractor for the language Ratfor. We used it to maintain an alphabetical index of functions and to display pseudo-code. We drew our call graphs by hand and saved them in manila folders. Hard to imagine now.
Re: Atomic ops for unlogged LSN
Looks good to me. * John From: Nathan Bossart Date: Thursday, February 29, 2024 at 8:34 AM To: Andres Freund Cc: Stephen Frost , John Morris , Bharath Rupireddy , Michael Paquier , Robert Haas , pgsql-hack...@postgresql.org Subject: Re: Atomic ops for unlogged LSN Here is a new version of the patch that uses the new atomic read/write functions with full barriers that were added in commit bd5132d. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
FW: Add the ability to limit the amount of memory that can be allocated to backends.
Hi Reid! Some thoughts I was looking at lmgr/proc.c, and I see a potential integer overflow - both max_total_bkend_mem and result are declared as “int”, so the expression “max_total_bkend_mem * 1024 * 1024 - result * 1024 * 1024” could have a problem when max_total_bkend_mem is set to 2G or more. /* * Account for shared memory size and initialize * max_total_bkend_mem_bytes. */ pg_atomic_init_u64(&ProcGlobal->max_total_bkend_mem_bytes, max_total_bkend_mem * 1024 * 1024 - result * 1024 * 1024); As more of a style thing (and definitely not an error), the calling code might look smoother if the memory check and error handling were moved into a helper function, say “backend_malloc”. For example, the following calling code /* Do not exceed maximum allowed memory allocation */ if (exceeds_max_total_bkend_mem(Slab_CONTEXT_HDRSZ(chunksPerBlock))) { MemoryContextStats(TopMemoryContext); ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), errmsg("out of memory - exceeds max_total_backend_memory"), errdetail("Failed while creating memory context \"%s\".", name))); } slab = (SlabContext *) malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock)); if (slab == NULL) …. Could become a single line: Slab = (SlabContext *) backend_malloc(Slab_CONTEXT_HDRSZ(chunksPerBlock); Note this is a change in error handling as backend_malloc() throws memory errors. I think this change is already happening, as the error throws you’ve already added are potentially visible to callers (to be verified). It also could result in less informative error messages without the specifics of “while creating memory context”. Still, it pulls repeated code into a single wrapper and might be worth considering. I do appreciate the changes in updating the global memory counter. My recollection is the original version updated stats with every allocation, and there was something that looked like a spinlock around the update. (My memory may be wrong …). The new approach eliminates contention, both by having fewer updates and by using atomic ops. Excellent. I also have some thoughts about simplifying the atomic update logic you are currently using. I need to think about it a bit more and will get back to you later on that. * John Morris
Re: Where can I find the doxyfile?
Thank you for trying the patch out and commenting on it. I'm starting to think of it as a project. Here's a quick project statement. The purpose is to generate improved Doxygen output while making maximal use of how Postgres currently does program comments. Thinking in terms of specific steps, and adding to what you have mentioned: * Configure Doxyfile so it produces output similar to previous output. * Only build Doxygen output if requested * Don't compile the filter or configure the Doxyfile if they aren't needed * Include contrib in the sources to document * Provide options for other (non-html) output. (Which ones?) * Update Postgres documentation to include instructions for creating Doxygen output. * Mention it in developer guidelines and provide sample code showing a "desired" commenting style. Does that list seem complete? I don't want people to think we're imposing new standards or legislating new commenting styles. It's more a matter of describing what we currently do, maybe with some minor suggestions for improving. * John Morris
Re: Atomic ops for unlogged LSN
Keeping things up to date. Here is a rebased patch with no changes from previous one. * John Morris atomic-lsn.patch Description: atomic-lsn.patch
Re: Atomic ops for unlogged LSN
* This patch looks a little different than the last version I see posted [0]. That last version of the patch (which appears to be just about committable) My oops – I was looking at the wrong commit. The newer patch has already been committed, so pretend that last message didn’t happen. Thanks, John
Re: Atomic ops for unlogged LSN
Here is what I meant to do earlier. As it turns out, this patch has not been merged yet. This is a rebased version . Even though I labelled it “v3”, there should be no changes from “v2”. atomicLSN-v3-Use-atomic-ops-for-unloggedLSNs.patch Description: atomicLSN-v3-Use-atomic-ops-for-unloggedLSNs.patch
Re: Including Doxyfile and Meson script for docs into main source tree
>On the patch itself I'm somewhat unconvinced that it is a good idea or >long-term maintainable to actually have a kinda random copy of the >configuration file(!) of an external software (doxygen in this) Rather than copying the Doxyfile, we could just publish the non-default values. As a mostly non-Doxygen user, I found it useful to see the config file in its entirety, but I agree it is a bit much. Would only showing modified values make more sense? From: Stefan Kaltenbrunner Date: Thursday, November 2, 2023 at 2:51 AM To: Bohdan Mart Cc: pgsql-hack...@postgresql.org , 'Bruce Momjian' , postg...@coyotebush.net , John Morris Subject: Re: Including Doxyfile and Meson script for docs into main source tree Hi Bohdan! I will see about making th ecurrent doxygen configuration available as a download again in the next few days. Not sure whether there is an additional actionable item yet as far as the project from John Morris goes wrt. the postgresql.org sysadmin team. The patch proposed is an enhancement of the build process in core postgresql and I think we would look into utilizing that also for the "official" build once applied. I dont think we would want to locally maintain a C-based filter and a custom build procedure otherwise. On the patch itself I'm somewhat unconvinced that it is a good idea or long-term maintainable to actually have a kinda random copy of the configuration file(!) of an external software (doxygen in this) case in our code that might need continous updating and maintainance to keep up with new upstream releases. As it stands the patch right now is against 1.9.4, with 1.9.1 running on our installation and 1.9.8 being the latest release... Stefan On 30.10.23 14:57, Bohdan Mart wrote: > Hello, Stefan! > > What do you think about this? See quoted message from John Morris. > > P.S. Also we would like you to provide currently used Doxyfile, as > Postgres doxigen's home page no longer have link to download said file. > > Regards, > Bohdan. > > On 14.10.23 21:54, John Morris wrote: >> Thank you for trying the patch out and commenting on it. >> >> I'm starting to think of it as a project. Here's a quick project >> statement. >> >> The purpose is to generate improved Doxygen output while making >> maximal use of how Postgres currently does program comments. >> >> Thinking in terms of specific steps, and adding to what you have >> mentioned: >> >> * Configure Doxyfile so it produces output similar to previous output. >> * Only build Doxygen output if requested >> * Don't compile the filter or configure the Doxyfile if they aren't >> needed >> * Include contrib in the sources to document >> * Provide options for other (non-html) output. (Which ones?) >> * Update Postgres documentation to include instructions for creating >> Doxygen output. >> * Mention it in developer guidelines and provide sample code showing a >> "desired" commenting style. >> >> Does that list seem complete? I don't want people to think we're >> imposing new standards or legislating new commenting styles. It's >> more a matter of describing what we currently do, maybe with some >> minor suggestions for improving. >> >> * John Morris >>
Re: Atomic ops for unlogged LSN
I incorporated your suggestions and added a few more. The changes are mainly related to catching potential errors if some basic assumptions aren’t met. There are basically 3 assumptions. Stating them as conditions we want to avoid. * We should not get an unlogged LSN before reading the control file. * We should not get an unlogged LSN when shutting down. * The unlogged LSN written out during a checkpoint shouldn’t be used. Your suggestion addressed the first problem, and it took only minor changes to address the other two. The essential idea is, we set a value of zero in each of the 3 situations. Then we throw an Assert() If somebody tries to allocate an unlogged LSN with the value zero. I found the comment about cache coherency a bit confusing. We are dealing with a single address, so there should be no memory ordering or coherency issues. (Did I misunderstand?) I see it more as a race condition. Rather than merely explaining why it shouldn’t happen, the new version verifies the assumptions and throws an Assert() if something goes wrong. atomic_lsn_v5.patch Description: atomic_lsn_v5.patch
Re: Where can I find the doxyfile?
Another update, this time with an abbreviated Doxyfile. Rather than including the full Doxyfile, this updated version only includes modified settings. It is more compact and more likely to survive across future doxygen versions. * John Morris doxygen_v2.patch Description: doxygen_v2.patch
Re: locked reads for atomics
>I wonder if it's worth providing a set of "locked read" functions. Most out-of-order machines include “read acquire” and “write release” which are pretty close to what you’re suggesting. With the current routines, we only have “read relaxed” and “write relaxed”. I think implementing acquire/release semantics is a very good idea, I would also like to clarify the properties of atomics. One very important question: Are atomics also volatile? If so, the compiler has very limited ability to move them around. If not, it is difficult to tell when or where they will take place unless the surrounding code is peppered with barriers.
Re: Temporary file access API
* So I think that code simplification and easy adoption of in-memory data changes (such as encryption or compression) are two rather distinct goals. admit that I'm running out of ideas how to develop a framework that'd be useful for both. I’m also wondering about code simplification vs a more general encryption/compression framework. I started with the current proposal, and I’m also looking at David Steele’s approach to encryption/compression in pgbackrest. I’m beginning to think we should do a high-level design which includes encryption/compression, and then implement it as a “simplification” without actually doing the transformations.
Re: Atomic ops for unlogged LSN
>> Why don't we just use a barrier when around reading the value? It's not like >> CreateCheckPoint() is frequent? One reason is that a barrier isn’t needed, and adding unnecessary barriers can also be confusing. With respect to the “debug only” comment in the original code, whichever value is written to the structure during a checkpoint will be reset when restarting. Nobody will ever see that value. We could just as easily write a zero. Shutting down is a different story. It isn’t stated, but we require the unlogged LSN be quiescent before shutting down. This patch doesn’t change that requirement. We could also argue that memory ordering doesn’t matter when filling in a conventional, unlocked structure. But the fact we have only two cases 1) checkpoint where the value is ignored, and 2) shutdown where it is quiescent, makes the additional argument unnecessary. Would you be more comfortable if I added comments describing the two cases? My intent was to be minimalistic, but the original code could use better explanation.
Re: Atomic ops for unlogged LSN
> what happens if … reader here stores the old unloggedLSN value > to control file and the server restarts (clean exit). So, the old >value is loaded back to unloggedLSN upon restart and the callers of > GetFakeLSNForUnloggedRel() will see an old/repeated value too. Is it a > problem? First, a clarification. The value being saved is the “next” unlogged LSN, not one which has already been used. (we are doing “fetch and add”, not “add and fetch”) You have a good point about shutdown and startup. It is vital we don’t repeat an unlogged LSN. This situation could easily happen If other readers were active while we were shutting down. >With an atomic variable, it is guaranteed that the readers >don't see a torn-value, but no synchronization is provided. The atomic increment also ensures the sequence of values is correct, specifically we don’t see repeated values like we might with a conventional increment. As a side effect, the instruction enforces a memory barrier, but we are not relying on a barrier in this case.
Re: Atomic ops for unlogged LSN
Based on your feedback, I’ve updated the patch with additional comments. 1. Explain the two cases when writing to the control file, and 2. a bit more emphasis on unloggedLSNs not being valid after a crash. Thanks to y’all. * John v2-0001-Use-atomic-ops-for-unloggedLSNs.patch Description: v2-0001-Use-atomic-ops-for-unloggedLSNs.patch
Re: Temporary file access API
I’m a latecomer to the discussion, but as a word of introduction, I’m working with Stephen, and I have started looking over the temp file proposal with the idea of helping it move along. I’ll start by summarizing the temp file proposal and its goals. >From a high level, the proposed code: * Creates an fread/fwrite replacement (BufFileStream) for buffering data to a single file. * Updates BufFile, which reads/writes a set of files, to use BufFileStream internally. * Does not impact the normal (page cached) database I/O. * Updates all the other places where fread/fwrite and read/write are used. * Creates and removes transient files. I see the following goals: * Unify all the “other” file accesses into a single, consistent API. * Integrate with VFDs. * Integrate transient files with transactions and tablespaces. * Create a consolidated framework where features like encryption and compression can be more easily added. * Maintain good streaming performance. Is this a fair description of the proposal? For myself, I’d like to map out how features like compression and encryption would fit into the framework, more as a sanity check than anything else, and I’d like to look closer at some of the implementation details. But at the moment, I want to make sure I have the proper high-level view of the temp file proposal. From: Robert Haas Date: Wednesday, September 21, 2022 at 11:54 AM To: Antonin Houska Cc: PostgreSQL Hackers , Peter Eisentraut , Stephen Frost Subject: Re: Temporary file access API On Mon, Aug 8, 2022 at 2:26 PM Antonin Houska wrote: > > I don't think that (3) is a separate advantage from (1) and (2), so I > > don't have anything separate to say about it. > > I thought that the uncontrollable buffer size is one of the things you > complaint about in > > https://www.postgresql.org/message-id/CA+TgmoYGjN_f=fcerx49bzjhng+gocty+a+xhnrwcvvdy8u...@mail.gmail.com Well, I think that email is mostly complaining about there being no buffering at all in a situation where it would be advantageous to do some buffering. But that particular code I believe is gone now because of the shared-memory stats collector, and when looking through the patch set, I didn't see that any of the cases that it touched were similar to that one. -- Robert Haas EDB: http://www.enterprisedb.com
Atomic ops for unlogged LSN
This is a short patch which cleans up code for unlogged LSNs. It replaces the existing spinlock with atomic ops. It could provide a performance benefit for future uses of unlogged LSNS, but for now it is simply a cleaner implementation. 0001-Use-atomic-ops-for-unloggedLSNs.patch Description: 0001-Use-atomic-ops-for-unloggedLSNs.patch