Simon Riggs <[EMAIL PROTECTED]> writes:

> - yes, I would expect the results you get. If you sample 5% of rows and
> each block has on average at least 20 rows, then we should expect the
> majority of blocks to be hit.
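Right, and the expectation itself is easy to check: if rows are sampled
independently at rate p, a block of n rows is left untouched with
probability (1-p)^n, so at p = 0.05 and n = 20 about 1 - 0.95^20 = 64%
of blocks should be hit. A quick sketch of that arithmetic (illustrative
only, not part of the test program below):

#include <math.h>
#include <stdio.h>

/* Chance that a block of n rows is hit when each row is sampled
   independently with probability p: 1 - (1-p)^n. */
static double block_hit_prob(double p, int n)
{
    return 1.0 - pow(1.0 - p, n);
}

int main(void)
{
    printf("%.2f\n", block_hit_prob(0.05, 20));  /* prints 0.64 */
    return 0;
}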
These results are from my test program. Here, 5% means 5% of the 8k blocks
in the test file. In other words, reading a random 5% of the blocks from
the test file in sequential order, but seeking over the skipped blocks, is
just as slow as reading the entire file. I feel like that can't be right,
but I can't find anything wrong with the methodology. An updated program is
attached below; these are the results I got with it:

bash-3.00# for i in `seq 1 100` ; do umount /u6; mount /dev/sda1 /u6; ~stark/src/pg/a.out /u6/temp/small $i ; done
Reading 1% (1280/128000 blocks 1048576000 bytes) total time 7662706us MB/s 1.37 effective MB/s 136.84
Reading 2% (2560/128000 blocks 1048576000 bytes) total time 12495106us MB/s 1.68 effective MB/s 83.92
Reading 3% (3840/128000 blocks 1048576000 bytes) total time 15847342us MB/s 1.99 effective MB/s 66.17
Reading 4% (5120/128000 blocks 1048576000 bytes) total time 18281244us MB/s 2.29 effective MB/s 57.36
Reading 5% (6400/128000 blocks 1048576000 bytes) total time 18988843us MB/s 2.76 effective MB/s 55.22
Reading 6% (7680/128000 blocks 1048576000 bytes) total time 19225394us MB/s 3.27 effective MB/s 54.54
Reading 7% (8960/128000 blocks 1048576000 bytes) total time 19462241us MB/s 3.77 effective MB/s 53.88
Reading 8% (10240/128000 blocks 1048576000 bytes) total time 19747881us MB/s 4.25 effective MB/s 53.10
Reading 9% (11520/128000 blocks 1048576000 bytes) total time 19451411us MB/s 4.85 effective MB/s 53.91
Reading 10% (12800/128000 blocks 1048576000 bytes) total time 19546511us MB/s 5.36 effective MB/s 53.65
Reading 11% (14080/128000 blocks 1048576000 bytes) total time 18989375us MB/s 6.07 effective MB/s 55.22
Reading 12% (15360/128000 blocks 1048576000 bytes) total time 18722848us MB/s 6.72 effective MB/s 56.01
Reading 13% (16640/128000 blocks 1048576000 bytes) total time 18621588us MB/s 7.32 effective MB/s 56.31
Reading 14% (17920/128000 blocks 1048576000 bytes) total time 18581751us MB/s 7.90 effective MB/s 56.43
Reading 15% (19200/128000 blocks 1048576000 bytes) total time 18422160us MB/s 8.54 effective MB/s 56.92
Reading 16% (20480/128000 blocks 1048576000 bytes) total time 18148012us MB/s 9.24 effective MB/s 57.78
Reading 17% (21760/128000 blocks 1048576000 bytes) total time 18147779us MB/s 9.82 effective MB/s 57.78
Reading 18% (23040/128000 blocks 1048576000 bytes) total time 18023256us MB/s 10.47 effective MB/s 58.18
Reading 19% (24320/128000 blocks 1048576000 bytes) total time 18039846us MB/s 11.04 effective MB/s 58.13
Reading 20% (25600/128000 blocks 1048576000 bytes) total time 18081214us MB/s 11.60 effective MB/s 57.99
...
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE 8192

int main(int argc, char *argv[])
{
    char *fn;
    int fd;
    int perc;
    struct stat statbuf;
    struct timeval tv1, tv2;
    off_t size, offset;
    char buf[BLOCKSIZE];    /* was "char *buf[BLOCKSIZE]": an array of
                             * pointers, 8x larger than one block */
    int b_toread, b_toskip, b_read = 0, b_skipped = 0;
    long us;

    fn = argv[1];
    perc = atoi(argv[2]);

    fd = open(fn, O_RDONLY);
    if (fd < 0)
    {
        perror(fn);
        exit(1);
    }
    fstat(fd, &statbuf);

    /* Round the file size down to a whole number of blocks */
    size = statbuf.st_size;
    size = size / BLOCKSIZE * BLOCKSIZE;

    gettimeofday(&tv1, NULL);
    srandom(getpid() ^ tv1.tv_sec ^ tv1.tv_usec);

    b_toread = size / BLOCKSIZE * perc / 100;
    b_toskip = size / BLOCKSIZE - b_toread;

    /* Sequential sampling: walk the file block by block, reading each
     * block with probability (blocks still to read) / (blocks left).
     * This selects exactly b_toread blocks, in file order. */
    for (offset = 0; offset < size; offset += BLOCKSIZE)
    {
        if (random() % (b_toread + b_toskip) < b_toread)
        {
            lseek(fd, offset, SEEK_SET);
            read(fd, buf, BLOCKSIZE);
            b_toread--;
            b_read++;
        }
        else
        {
            b_toskip--;
            b_skipped++;
        }
    }

    gettimeofday(&tv2, NULL);
    us = (tv2.tv_sec - tv1.tv_sec) * 1000000 + (tv2.tv_usec - tv1.tv_usec);

    fprintf(stderr,
            "Reading %d%% (%d/%d blocks %ld bytes) total time %ldus MB/s %.2f effective MB/s %.2f\n",
            perc, b_read, b_read + b_skipped, (long) size, us,
            (double) b_read * BLOCKSIZE / us,
            (double) size / us);

    exit(0);
}
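One explanation I'd like to rule out is kernel readahead pulling in the
skipped blocks anyway, which would make the sparse read cost about the
same as a full scan. An untested sketch of how I'd check that, assuming
posix_fadvise(2) is available (Linux has it; note it returns an error
number rather than setting errno):

#define _XOPEN_SOURCE 600   /* for posix_fadvise */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* ... in main(), right after open(): tell the kernel the access
   pattern is random so it doesn't prefetch the blocks we skip. */
int err = posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
if (err != 0)
    fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

If the low-percentage runs speed up with readahead off, readahead was the
culprit; if the numbers stay flat, the cost is presumably pure seek time.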
--
greg