Apologies for the naff double post, but I meant to add that obviously the figures for the solaris box are bogus after the first run...imagine a file system cache of an entire 2gb file. I tried creating a file of 4gb on this box, but it bombed with a "file too large error". Unfortunately, I can't rip memory out of this box as I don't have exclusive access.
On Thu, 2002-04-18 at 11:49, Mark Pritchard wrote: > I threw together the attached program (compiles fine with gcc 2.95.2 on > Solaris 2.6 and egcs-2.91.66 on RedHat Linux 6.2) and ran it a few > times. Data is below. Usual disclaimers about hastily written code etc > :) > > Machine = ghoul (generic intel, 384mb ram, dual p3-800, ide disk running > dma) > > Sequential > Bytes Read Time Bytes / Sec > 536870912 27.14 19783933.74 > 536870912 27.14 19783990.60 > 536870912 27.11 19801872.14 > 536870912 26.92 19942928.41 > 536870912 27.31 19657408.43 > 19794026.66 (avg) > > Random > Bytes Read Time Bytes / Sec > 1073741824 519.57 2066589.21 > 1073741824 517.78 2073751.44 > 1073741824 516.92 2077193.23 > 1073741824 513.18 2092333.29 > 1073741824 510.68 2102579.88 > 2082489.41 (avg) > > Machine = jedi (Sun E420, 3gb ram, dual 400s, test on single scsi disk) > > Sequential > Bytes Read Time Bytes / Sec > 2097152000 65.19 32167675.28 > 2097152000 65.22 32154114.65 > 2097152000 65.16 32182561.99 > 2097152000 65.12 32206105.12 > 2097152000 64.67 32429463.26 > 32227984.06 (avg) > > Random > Bytes Read Time Bytes / Sec > 4194304000 1522.22 2755394.79 > 4194304000 278.18 15077622.05 > 4194304000 91.43 45874730.07 > 4194304000 61.43 68273795.19 > 4194304000 54.55 76890231.51 > 41774354.72 > > If I interpret Tom's "divide" instruction correctly, is that a factor of > 10 on the linux box? > > On Thu, 2002-04-18 at 01:16, Tom Lane wrote: > > "Luis Alberto Amigo Navarro" <[EMAIL PROTECTED]> writes: > > > On my own few experience I think this could be solved decreasing > > > random_page_cost, if you would prefer to use indexes than seq scans, then > > > you can lower random_page_cost to a point in which postgres works as you > > > want. So the planner would prefer indexes when in standard conditions it > > > would prefer seq scans. > > > > It's entirely possible that the default value of random_page_cost is too > > high, at least for many modern machines. The experiments I did to get > > the 4.0 figure were done a couple years ago, on hardware that wasn't > > exactly new at the time. I have not heard of anyone else trying to > > measure it though. > > > > I don't think I have the source code I used anymore, but the principle > > is simple enough: > > > > 1. Make a large file (several times the size of your machine's RAM, to > > ensure you swamp out kernel disk buffering effects). Fill with random > > data. (NB: do not fill with zeroes, some filesystems optimize this away.) > > > > 2. Time reading the file sequentially, 8K per read request. > > Repeat enough to get a statistically trustworthy number. > > > > 3. Time reading randomly-chosen 8K pages from the file. Repeat > > enough to get a trustworthy number (the total volume of pages read > > should be several times the size of your RAM). > > > > 4. Divide. > > > > The only tricky thing about this is making sure you are measuring disk > > access times and not being fooled by re-accessing pages the kernel still > > has cached from a previous access. (The PG planner does try to account > > for caching effects, but that's a separate estimate; the value of > > random_page_cost isn't supposed to include caching effects.) AFAIK the > > only good way to do that is to use a large test, which means it takes > > awhile to run; and you need enough spare disk space for a big test file. > > > > It'd be interesting to get some numbers for this across a range of > > hardware, filesystems, etc ... > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] > > > > > ---- > > #include <errno.h> > #include <stdio.h> > #include <stdlib.h> > #include <time.h> > #include <sys/stat.h> > #include <sys/time.h> > > /** > * Constants > */ > > #define BLOCK_SIZE (8192) > > /** > * Prototypes > */ > > // Creates the test file filled with random data > void createTestFile(char *testFileName, long long fileSize); > > // Handles runtime errors by displaying the function, activity and error number > void handleError(char *functionName, char *activity); > > // Standard entry point > int main(int argc, char *args[]); > > // Prints correct usage and quits > void printUsageAndQuit(); > > // Tests performance of random reads of the given file > void testRandom(char *testFileName, long long amountToRead); > > // Tests performance of sequential reads of the given file > void testSeq(char *testFileName); > > /** > * Definitions > */ > > /** > * createTestFile() > */ > void createTestFile(char *testFileName, long long fileSize) > { > FILE *testFile; > long long reps, i, j, bufferReps; > time_t timetmp; > long long *buffer; > size_t written; > > // Indicate op > printf("Creating test file %s of %lld mb\n",testFileName,fileSize); > > // Adjust file size to bytes > fileSize *= (1024*1024); > > // Allocate a buffer for writing out random long longs > if (!(buffer = malloc(BLOCK_SIZE))) > handleError("createTestFile()","malloc"); > > // Open the file for writing > if (!(testFile = fopen(testFileName, "wb"))) > handleError("createTestFile()","fopen"); > > // Initialise the random number generator > srandom(time(NULL)); > > // Write data > reps = fileSize / BLOCK_SIZE; > bufferReps = BLOCK_SIZE / sizeof(long long); > for (i = 0; i < reps; i++) > { > // Fill buffer with random data > for (j = 0; j < bufferReps; j++) > buffer[j] = random(); > > // Write > written = fwrite(buffer, sizeof(long long), bufferReps, testFile); > if (written != bufferReps) > handleError("createTestFile()","fwrite"); > } > > // Flush and close > if (fflush(testFile)) > handleError("createTestFile()","fflush"); > if (fclose(testFile)) > handleError("createTestFile()","fclose"); > > // Free buffer > free(buffer); > } > > /** > * handleError() > */ > void handleError(char *functionName, char *activity) > { > fprintf(stderr, "Error in %s while attempting %s. Error %d (%s)\n", >functionName, activity, errno, strerror(errno)); > exit(1); > } > > /** > * main() > */ > int main(int argc, char *argv[]) > { > // Print usage and quit if argument count is definitely incorrect > if (argc < 3) > { > // Definitely wrong > printUsageAndQuit(); > } > else > { > // Dispatch > if (!strcmp(argv[1], "create")) > { > if (argc != 4) > printUsageAndQuit(); > > // Create the test file of the specified size > createTestFile(argv[2], atol(argv[3])); > } > else if (!strcmp(argv[1], "seqtest")) > { > if (argc != 3) > printUsageAndQuit(); > > // Test performance of sequential reads > testSeq(argv[2]); > } > else if (!strcmp(argv[1], "rndtest")) > { > if (argc != 4) > printUsageAndQuit(); > > // Test performance of random reads > testRandom(argv[2], atol(argv[3])); > } > else > { > // Unknown command > printUsageAndQuit(); > } > } > > return 0; > } > > /** > * printUsageAndQuit() > */ > void printUsageAndQuit() > { > puts("USAGE: rndpgcst [create <file> <size_in_mb>] | [seqtest <file>] | >[rndtest <file> <read_in_mb>]"); > > exit(1); > } > > /** > * testSeq() > */ > void testSeq(char *testFileName) > { > FILE *testFile; > char *buffer; > long long reps, totalRead, thisRead, timeTaken; > struct timeval startTime, endTime; > struct timezone timezoneDiscard; > > // Indicate op > printf("Sequential read test of %s\n",testFileName); > > // Grab a buffer > buffer = malloc(BLOCK_SIZE); > > // Open the file for reading > if (!(testFile = fopen(testFileName, "rb"))) > handleError("testSeq()","fopen"); > > // Start timer > if (gettimeofday(&startTime, &timezoneDiscard) == -1) > handleError("testSeq()", "gettimeofday start"); > > // Read all data from file > totalRead = 0; > while ((thisRead = fread(buffer, 1, BLOCK_SIZE, testFile)) != 0) > totalRead += thisRead; > > // End timer > if (gettimeofday(&endTime, &timezoneDiscard) == -1) > handleError("testSeq()", "gettimeofday start"); > > // Close > if (fclose(testFile)) > handleError("testSeq()","fclose"); > > // Free the buffer > free(buffer); > > // Display time taken > timeTaken = (endTime.tv_sec - startTime.tv_sec) * 1000000; > timeTaken += (endTime.tv_usec - startTime.tv_usec); > printf("%lld bytes read in %f seconds\n", totalRead, (double) timeTaken / >(double) 1000000); > } > > /** > * testRandom() > */ > void testRandom(char *testFileName, long long amountToRead) > { > FILE *testFile; > long long reps, i, fileSize, timeTaken, totalRead, readPos, thisRead, >offsetMax; > struct stat fileStats; > char *buffer; > struct timeval startTime, endTime; > struct timezone timezoneDiscard; > > // Indicate op > printf("Random read test of %s for %lld mb\n", testFileName, amountToRead); > > // Initialise the random number generator > srandom(time(NULL)); > > // Adjust amount to read > amountToRead *= (1024*1024); > > // Determine file size > if (stat(testFileName, &fileStats) == -1) > handleError("testRandom()", "stat"); > fileSize = fileStats.st_size; > > // Grab a buffer > buffer = malloc(BLOCK_SIZE); > > // Open the file for reading > if (!(testFile = fopen(testFileName, "rb"))) > handleError("testRandom()","fopen"); > > // Start timer > if (gettimeofday(&startTime, &timezoneDiscard) == -1) > handleError("testRandom()", "gettimeofday start"); > > // Read data from file > reps = amountToRead / BLOCK_SIZE; > offsetMax = fileSize / BLOCK_SIZE; > for (i = 0; i < reps; i++) > { > // Determine read position > readPos = (random() % offsetMax) * BLOCK_SIZE; > > // Seek and read > if (fseek(testFile, readPos, SEEK_SET) == -1) > handleError("testRandom()","fseek"); > if ((thisRead = fread(buffer, 1, BLOCK_SIZE, testFile)) != BLOCK_SIZE) > handleError("testRandom()","fread"); > } > > // End timer > if (gettimeofday(&endTime, &timezoneDiscard) == -1) > handleError("testRandom()", "gettimeofday start"); > > // Close > if (fclose(testFile)) > handleError("testRandom()","fclose"); > > // Free the buffer > free(buffer); > > // Display time taken > timeTaken = (endTime.tv_sec - startTime.tv_sec) * 1000000; > timeTaken += (endTime.tv_usec - startTime.tv_usec); > printf("%lld bytes read in %f seconds\n", amountToRead, (double) timeTaken / >(double) 1000000); > } ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly