Re: F77 indexed file support

Roland Hughes via Fortran Wed, 08 Mar 2023 05:30:51 -0800

Hello Arjen,

Thanks for your reply.


You are confusing RMS Files-11 file versioning with Indexing.

Sorry, this got away from me. Once I started I couldn't stop.

Real computers, no matter who made them or which OS they ran, all provided at least one type of indexed file. These were business class platforms. Even scientific users needed a business class platform.


Organization types:

*DIRECT*

This was also called a "Hash File" on many platforms. It was a glorified sequential file you accessed via RECORD NUMBER. Sometimes data was stored and accessed as an on-disk array. Most had some kind of hash algorithm.

On the PC platform a somewhat crippled version of this was the file format used by DBase and the other Xbase tools of the day. Each "index" was stored in its own file. The platform could only have one "index" open at a time, so inserts/additions to the data were only reflected in that file. You had to PACK the data file and REINDEX each time you opened a different "index" because that was the only way to be certain they were in sync.


https://www.theminimumyouneedtoknow.com/xbase_book.html

If you really want to know more, you can download that book from the xBaseJ project on SourceForge. I wrote it and donated it to the project.

No matter the platform, DIRECT access had the problem of deletions. Records were only "marked" as deleted within the data. You had to PACK a file to remove them. Lazy developers used bad hash algos, so you also had to deal with collisions and missing data.
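A rough Python sketch of the mark-then-PACK idea, for anyone who has not lived through it. The 16-byte record layout and the function names are made up for illustration; the space/asterisk flag follows the xBase convention, but nothing here is any platform's actual on-disk format.

```python
import struct

# Hypothetical layout: 1-byte delete flag followed by a 15-byte payload.
RECORD = struct.Struct("c15s")
LIVE, DELETED = b" ", b"*"

def build(payloads):
    """Build an in-memory 'file' of live fixed-size records."""
    return bytearray(b"".join(RECORD.pack(LIVE, p) for p in payloads))

def mark_deleted(data, recno):
    """Flag slot `recno` as deleted; the bytes stay where they are."""
    data[recno * RECORD.size] = DELETED[0]

def pack(data):
    """Copy only the live records, which renumbers everything after a gap."""
    out = bytearray()
    for off in range(0, len(data), RECORD.size):
        chunk = data[off:off + RECORD.size]
        if chunk[:1] != DELETED:
            out += chunk
    return bytes(out)

data = build([b"alpha", b"beta", b"gamma"])
mark_deleted(data, 1)      # "beta" is flagged but still occupies its slot
packed = pack(data)        # only now is the space reclaimed
```

Note that after the pack, "gamma" has moved from record number 2 to record number 1, which is exactly why every index had to be rebuilt afterwards.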

Today we have amazing hash algos. Even in commercial relational database systems certain indexes/keys are implemented via hash because it is the fastest for look-up only data.

DIRECT access files are still popular on the lesser platforms because, if you have a logically contiguous sequential file and mandate fixed size records, the C/C++ fseek() function lets you basically implement one.
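A minimal sketch of that trick, written in Python rather than C for brevity; file.seek() plays the role of fseek(). The 32-byte record layout (8-byte key, 24-byte payload) and the helper names are assumptions for illustration only.

```python
import struct
import tempfile

# Hypothetical fixed-size record: 8-byte key plus 24-byte payload.
RECORD = struct.Struct("8s24s")

def write_record(f, recno, key, payload):
    """Store a record in slot `recno` (0-based) by seeking to its offset."""
    f.seek(recno * RECORD.size)
    f.write(RECORD.pack(key, payload))

def read_record(f, recno):
    """Fetch slot `recno`, or None if the slot lies past end of file."""
    f.seek(recno * RECORD.size)
    raw = f.read(RECORD.size)
    if len(raw) < RECORD.size:
        return None
    key, payload = RECORD.unpack(raw)
    return key.rstrip(b"\0"), payload.rstrip(b"\0")

def demo():
    with tempfile.TemporaryFile() as f:
        write_record(f, 0, b"DRAW0001", b"first drawing")
        write_record(f, 2, b"DRAW0003", b"third drawing")
        # Slot 1 was never written, so it reads back as empty filler.
        return read_record(f, 2), read_record(f, 1), read_record(f, 5)
```

The record number times the record size is the whole "access method". Everything else (hashing a key to a record number, handling collisions, tracking deletions) is left to the application.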


*ISAM*

I had to spend two weeks in COBOL I drawing on paper and chalk board until I understood how this worked. While the VAX claimed support for it (to make it easier to port IBM software to VAX) I never encountered anyone using it. Even this small description will be more than anyone wishes to know about it.

Indexed Sequential Access Method was really only used on platforms that allocated disk storage in Volumes, Cylinders, and Tracks. A certain number of tracks (or cylinders) were allocated to the "index area." This was assumed to be at the top of the file. There, key values would be grouped into partially filled "buckets" along with their data track number.

The Primary Data Area would be allocated a certain number of tracks, cylinders, or even volumes (a volume is an entire disk spindle).

The Overflow Data Area got allocated tracks, cylinders, and potentially volumes as well.

We will skip the conversation of "bucket splits." Buckets were fixed size, and you tried to keep them 50% or less full. Finding a record was a linear search in the index area, reading the first entry of each bucket until you found one greater than what you were looking for, which meant your value existed in the previous bucket, if at all.

Assuming your key was in said bucket, you then read the track in the data area and chewed through it record by record (they could be packed) looking for the record with your key or one greater. (lesser if you were using a descending index)

After going through one or more tracks and coming up empty handed, your search was not over. The system then had to perform a sequential search of the entire overflow area.
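The three-stage lookup described above can be sketched in Python with everything held in memory. The data structures here (a list of index buckets, a dict of tracks, a flat overflow list) and the name isam_find are illustrative assumptions, not how any real ISAM laid things out on disk.

```python
def isam_find(index_buckets, tracks, overflow, key):
    """Look up `key` the slow way: index area, data track, then overflow."""
    # 1. Linear scan of the index area, looking only at the first entry of
    #    each bucket. Stop at the first one greater than the target; the
    #    bucket before it is the only one that can hold the key.
    candidate = None
    for bucket in index_buckets:
        if bucket and bucket[0][0] > key:
            break
        candidate = bucket

    # 2. If the key is in that bucket, chew through its data track.
    if candidate is not None:
        for index_key, track_no in candidate:
            if index_key == key:
                for rec_key, record in tracks[track_no]:
                    if rec_key == key:
                        return record
                    if rec_key > key:   # ascending order: we passed it
                        break
                break

    # 3. Still empty handed: sequential search of the whole overflow area.
    for rec_key, record in overflow:
        if rec_key == key:
            return record
    return None

index_buckets = [[("A", 0), ("C", 0)], [("M", 1), ("P", 1)]]
tracks = {0: [("A", "rec A"), ("C", "rec C")],
          1: [("M", "rec M"), ("P", "rec P")]}
overflow = [("N", "rec N (overflow)")]
```

With this toy data, isam_find(index_buckets, tracks, overflow, "P") is answered from track 1, while "N" is only found after the full overflow scan.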

All of this pain I just shared with you was handled by the OS. Your COBOL, FORTRAN, BASIC, DIBOL, etc. program simply did READ RECORD blah KEY EQ more-blah.

If you were coding in Assembly, this pain you had to do yourself, especially if your OS didn't provide a system services library to do it mostly for you.

Deletions are simply flagged deleted. You have to REORG these files to empty out the overflow area, reclaim deleted data space, and clean up the buckets.


*VSAM*

Virtual Storage Access Method.

Has only Index Areas (plural) and Data Areas. Each "bucket" is at least one disk block in size. At its end it contains a link to the next area bucket. These buckets can be scattered anywhere on disk. If you have bound volumes or any other OS ability to group multiple disks into one logical disk, they can be on any of those spindles. Data areas are the same.

You have the option of processing this file sequentially by doing a keyed hit to the first record of some index and sequentially reading. It will read in key sequence until end.
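A small Python sketch of that keyed-start, read-to-end pattern over linked buckets. The dict-of-buckets layout, the "next" field, and read_sequential are assumptions made for illustration; the bucket ids are deliberately out of order to stand in for buckets scattered across the disk.

```python
def read_sequential(buckets, start_id, start_key):
    """Yield (key, record) pairs in key order, following bucket links."""
    bucket_id = start_id
    while bucket_id is not None:
        bucket = buckets[bucket_id]
        for key, record in bucket["entries"]:
            if key >= start_key:
                yield key, record
        bucket_id = bucket["next"]   # link stored at the end of the bucket

# Bucket 7 links to bucket 2: physical position has nothing to do with key order.
buckets = {
    7: {"entries": [("A", "rec A"), ("C", "rec C")], "next": 2},
    2: {"entries": [("F", "rec F"), ("K", "rec K")], "next": None},
}

rows = list(read_sequential(buckets, 7, "C"))
```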

Index buckets are required to hold the actual data bucket of the record. Records could span blocks/buckets and they could be packed.

Deleted records were actually deleted. If block/bucket spanning was enabled you paid a bit of an overhead price while things shuffled around.

The file system keeps track of the lowest and highest key value for each index as well as the bucket count for the index. A binary search is done to find the correct index bucket.
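A hedged Python sketch of the binary search step, using the standard bisect module. Giving each index bucket "low" and "high" fields and keeping its entries in a dict are simplifying assumptions of mine; the function names are hypothetical.

```python
from bisect import bisect_right

def find_bucket(low_keys, key):
    """Binary search: position of the last bucket whose low key is <= key."""
    pos = bisect_right(low_keys, key) - 1
    return pos if pos >= 0 else None

def vsam_find(index_buckets, key):
    """Return the data bucket number recorded for `key`, or None."""
    low_keys = [b["low"] for b in index_buckets]
    pos = find_bucket(low_keys, key)
    if pos is None or key > index_buckets[pos]["high"]:
        return None                      # key falls outside every bucket
    return index_buckets[pos]["entries"].get(key)

# Each entry maps a key to the data bucket that holds its record.
index_buckets = [
    {"low": "A", "high": "C", "entries": {"A": 11, "C": 40}},
    {"low": "F", "high": "K", "entries": {"F": 3, "K": 27}},
    {"low": "M", "high": "P", "entries": {"M": 8, "P": 19}},
]
```

Compared with the ISAM walk through every bucket's first entry, the number of buckets examined here grows with the logarithm of the bucket count rather than linearly.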

Again, COBOL, FORTRAN, BASIC, DIBOL, etc. programmers just did their language's version of READ RECORD blah KEY EQ more-blah. The OS did everything else.

Assembler programmers on platforms that didn't have system services to call had to do all of this on their own.


We all had to take Assembler so we all had to learn this stuff.

Sorry, once that started spilling out I couldn't stop it.

Roland

On 3/7/2023 11:32 PM, Arjen Markus wrote:

I have never worked much with VAXes, but I do remember that VAX used a file system where you made a new version of a file and the older versions were automatically kept. I guess that is the purpose of the INDEXED organisation. It is not so much a limitation of gfortran that it does not support this, but a consequence of the operating system's completely different view on files and file management.


Regards,

Arjen

On Tue, 7 Mar 2023 at 23:58, Thomas Koenig via Fortran <fortran@gcc.gnu.org> wrote:


    Hi Roland,

    >   210  OPEN (UNIT=K_DRAW_CHAN,
    >       1        FILE=DRAWING_DATA,
    >       2        STATUS='OLD',
    >       3        ORGANIZATION='INDEXED',

    I'd never heard of that one up to now.

    >       4        ACCESS='KEYED',
    >       5        RECORDTYPE='FIXED',
    >       6        FORM='UNFORMATTED',
    >       7        RECL=K_DRAWING_RECORD_SIZE/4,
    >       8        CARRIAGECONTROL='FORTRAN',
    >       9        KEY=(1:8:CHARACTER),
    >       1        DISP='KEEP',
    >       2        IOSTAT=L_DRAW_ERR,
    >       3        ERR=999)
    >
    > The ORGANIZATION='INDEXED' is key.
    >
    > GnuCOBOL
    >
    > https://gnucobol.sourceforge.io/
    >
    > uses the BerkleyDB (sp?) library so the standard COBOL indexed file
    > support from the big computers can at least be mimicked.
    >
    > I'm searching everywhere and I cannot find Gnu Fortran (any flavor)
    > having an ORGANIZATION clause in the OPEN().

    ORGANIZATION is not an extension that gfortran supports.
    ifort, which traces its lineage back to VMS Fortran, supports
    ORGANIZATION, but not 'INDEXED', according to

    https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/file-operation-i-o-statements/open-statement-specifiers/open-organization-specifier.html

    This is likely a Fortran interface to a VMS speciality; the older
    operating systems had stuff like that.  UNIX did away with all
    the record-orientation (I also remember VSAM and ISAM data sets
    on old IBM mainframes) and UNIX and derivatives, and Windows, now
    just offers the "stream of bytes" model.

    So, if you need the functionality, you will have to implement it
    yourself, possibly via a database.

    Best regards

            Thomas

--
Roland Hughes, President
Logikal Solutions
(630)-205-1593  (cell)
http://www.theminimumyouneedtoknow.com
http://www.infiniteexposure.net
http://www.johnsmith-book.com
