On Wed, Jul 2, 2008 at 8:43 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Ken Camann" <[EMAIL PROTECTED]> writes:
>> Oh I see. Between this and looking again at the warning list, I see
>> that it will probably take a lot more work than I thought. There are
>> about 450 occurrences of the assumption that sizeof(size_t) ==
>> sizeof(int).
>
> [ blink... ] There are *zero* occurrences of the assumption that
> sizeof(size_t) == sizeof(int), unless maybe in some of that grotty
> #ifdef WIN32 code. Postgres has run on 64-bit platforms for many
> years now.

Hi Tom.

I knew about the previous 64-bit platform support, which is why I was so surprised to see the problem. Unless I am missing an important #define that somehow makes this stuff go away (but I don't think so, given how much of it there is), it does happen to be in there. If I haven't done anything wrong, I would assume no one noticed because those architectures define sizeof(long) to be >= sizeof(size_t).

Well, actually, let me be as strict as possible, because I don't know the latest C standards very well (I am a C++ programmer). Am I correct that the standard says that sizeof(size_t) must be sizeof(void*), and that no compiler has ever said otherwise? I think so, given what size_t is supposed to mean. So I tend to use sizeof(void*) and sizeof(size_t) interchangeably. Sorry for the confusion if that is less clear.

A comment in postgres.h (not conditionally defined by anything) states that all the code assumes:

    sizeof(Datum) == sizeof(long) >= sizeof(void *) >= 4

where the first equality is trivially true because Datum is a typedef of long. EM64T/AMD64 is new compared to the older architectures; I would guess the older ones predate the time when it became a somewhat de facto standard to leave "long int" at 4 bytes and make "long long" the new 64-bit type. In fact this definition is so common that it will soon be the de jure C++ standard definition. I assume ISO C still will not fix byte lengths to the declarators, since they've fought it for so long. In any case, if sizeof(long) == 4 while sizeof(void *) == 8, this fails to be true.

This is more interesting still (in c.h):

/*
 * Size
 *      Size of any memory resident object, as returned by sizeof.
 */
typedef size_t Size;

/*
 * Index
 *      Index into any memory resident array.
 *
 * Note:
 *      Indices are non negative.
 */
typedef unsigned int Index;

/*
 * Offset
 *      Offset into any memory resident array.
 *
 * Note:
 *      This differs from an Index in that an Index is always
 *      non negative, whereas Offset may be negative.
 */
typedef signed int Offset;

There seems to be an interesting mix of size_t, long, and int in use for memory. Possibly no one has noticed because the shared buffers per single user have never been bigger than 2 GB for anyone. The Postgres documentation recommends "big" numbers like 20 or 30 MB, and the default is much smaller. In order to have had problems with this, you'd probably need all of the following to happen at once:

1.) A huge enterprise (with lots of money to buy memory, but using Postgres and not Oracle) doing data warehousing on enormous tables
2.) A platform where sizeof(int) = sizeof(long) = 4 but sizeof(void*) = 8
3.) A DBA who wanted the shared buffers > 2 GB
4.) An operating system supporting > 2 GB of memory
5.) An operating system willing to allocate contiguous blocks > 2 GB
6.) A C library implementation of malloc willing to allocate contiguous blocks > 2 GB
7.) Exactly the right query to make it explode.

I think it happens to work out that not all of those have happened simultaneously yet.

Anyway, there are a lot of other sizeof(int) == sizeof(size_t) assumptions in totally unimportant places. Here's one in bootstrap.c:

    int len;
    len = strlen(str);  // possible loss of data

That kind is very common.

-Ken

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
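
For illustration only, here is a minimal sketch in C of the two patterns discussed above: the sizeof(long) >= sizeof(void *) assumption from postgres.h, and the "int len = strlen(str);" narrowing. None of this is PostgreSQL code. The names long_holds_pointer, size_to_int, and checked_strlen are hypothetical helpers invented for this example, and rejecting oversized values with an error code is just one possible way to handle the narrowing.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PostgreSQL): returns 1 when a long
 * is at least as wide as a pointer on this platform, 0 otherwise.
 * On targets where long is 4 bytes and pointers are 8, this returns 0. */
static int long_holds_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}

/* Hypothetical helper: narrows a size_t to int without silent
 * truncation. Returns 0 and stores the value on success, or -1 if the
 * value does not fit in an int (in which case *out is left untouched). */
static int size_to_int(size_t n, int *out)
{
    assert(out != NULL);
    if (n > (size_t) INT_MAX)
        return -1;
    *out = (int) n;
    return 0;
}

/* Checked variant of the "int len = strlen(str);" pattern. */
static int checked_strlen(const char *str, int *len)
{
    return size_to_int(strlen(str), len);
}
```

A caller would test the return value of checked_strlen before using len. Whether an explicit check like this is worth the noise in places where strings are known to be short is a judgment call; the sketch only shows where the implicit conversion happens.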