Hello Arjen,
Thanks for your reply.
You are confusing RMS Files-11 file versioning with Indexing.
Sorry, this got away from me. Once I started I couldn't stop.
Real computers, didn't matter who made them or their OS, all provided at
least one type of indexed file. These were business class platforms.
Even scientific users needed a business class platform.
Organization types:
*DIRECT*
This was also called a "Hash File" on many platforms. It was a glorified
sequential file you accessed via RECORD NUMBER. Sometimes data was
stored and accessed as an on-disk array. Most had some kind of hash
algorithm.
On the PC platform a somewhat crippled version of this was the file
format used by DBase and the other Xbase tools of the day. Each
"index" was stored in its own file. The platform could only have one
"index" open at a time so inserts/additions to the data were only
reflected in that file. You had to PACK the data file and REINDEX each
time you opened a different "index" because that was the only way to be
certain they were in sync.
https://www.theminimumyouneedtoknow.com/xbase_book.html
If you really want to know more, you can download that book from the
xBaseJ project on SourceForge. I wrote it and donated it to the project.
No matter what platform, DIRECT access had the problem of deletions. They
were only "marked" as deleted within the data. You had to PACK a file to
remove them. Lazy developers used bad hash algos so you also had to deal
with collisions and missing data.
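For anyone who has never had to PACK a file, here is a minimal sketch of
the idea in Python. The record layout (a 1-byte delete flag, an 8-byte
key, and a 20-byte payload) and the names RECORD, LIVE, DELETED, and
pack_records are assumptions made up for illustration, not the format of
any particular product.

```python
import struct

# Hypothetical fixed-size record: 1-byte delete flag, 8-byte key, 20-byte payload.
RECORD = struct.Struct("<c8s20s")
LIVE, DELETED = b" ", b"*"  # assumed flag values


def pack_records(raw: bytes) -> bytes:
    """Return a copy of the file image with the flagged records dropped."""
    kept = []
    # Walk the image one whole record at a time; a trailing partial record is ignored.
    for offset in range(0, len(raw) - RECORD.size + 1, RECORD.size):
        chunk = raw[offset:offset + RECORD.size]
        flag, _key, _payload = RECORD.unpack(chunk)
        if flag != DELETED:
            kept.append(chunk)
    return b"".join(kept)
```

Note that every surviving record after the first deleted one changes its
record number, which is why any separate "index" had to be rebuilt
afterwards.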
Today we have amazing hash algos. Even in commercial relational database
systems certain indexes/keys are implemented via hash because it is the
fastest for look-up only data.
DIRECT access files are still popular on the lesser platforms because,
if you have a logically contiguous sequential file and mandate fixed
size records, the C/C++ fseek() function lets you basically implement one.
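A rough sketch of that trick, written in Python rather than C so it stays
short. Python's seek() plays the role of fseek() here. RECORD_SIZE,
write_record, and read_record are hypothetical names, and the 32-byte
record length and 1-based record numbers are assumptions.

```python
import io

RECORD_SIZE = 32  # assumed fixed record length in bytes


def write_record(f, record_number: int, data: bytes) -> None:
    """Store data at a 1-based record number, space-padded to RECORD_SIZE."""
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    # The byte offset is computed directly from the record number.
    f.seek((record_number - 1) * RECORD_SIZE)
    f.write(data.ljust(RECORD_SIZE, b" "))


def read_record(f, record_number: int) -> bytes:
    """Fetch the record stored at a 1-based record number."""
    f.seek((record_number - 1) * RECORD_SIZE)
    data = f.read(RECORD_SIZE)
    if len(data) < RECORD_SIZE:
        raise EOFError("record number past end of file")
    return data


# Usage with an in-memory file; a disk file opened in "r+b" mode works the same way.
f = io.BytesIO()
for n, text in enumerate([b"alpha", b"bravo", b"charlie"], start=1):
    write_record(f, n, text)
third = read_record(f, 3).rstrip()
```

There is no key lookup in this at all. Something else (a hash of the key,
or a separate index) has to turn a key into the record number.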
*ISAM*
I had to spend two weeks in COBOL I drawing on paper and chalk board
until I understood how this worked. While the VAX claimed support for it
(to make it easier to port IBM software to VAX) I never encountered
anyone using it. Even this small description will be more than anyone
wishes to know about it.
Indexed Sequential Access Method was really only used on platforms that
allocated disk storage in Volumes, Cylinders, and Tracks. A certain
number of tracks (or cylinders) were allocated to the "index area." This
was assumed to be at the top of the file. There key values would be
grouped into partially filled "buckets" along with their data track number.
The Primary Data Area would be allocated a certain number of tracks,
cylinders, or even volumes (a volume is an entire disk spindle).
The Overflow Data Area got allocated tracks, cylinders, and potentially
volumes as well.
We will skip the discussion of "bucket splits." Buckets were fixed
size and you tried to keep them 50% or less full. Finding a record was a
linear search in the index area, reading the first entry of each bucket
until you found one greater than what you were looking for, which meant
your value existed in the previous bucket if at all.
Assuming your key was in said bucket, you then read the track in the
data area and chewed through it record by record (they could be packed)
looking for the record with your key or one greater. (lesser if you were
using a descending index)
After going through one or more tracks and coming up empty handed, your
search was not over. It then continued as a sequential search of the
entire overflow area.
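The lookup just described can be sketched in a few lines of Python. This
is a toy model, not any vendor's on-disk format. The in-memory structures
are assumptions: each index entry is (lowest key on the track, track
number), each track is a key-ordered list of (key, payload), and the
overflow area is an unordered list. The name isam_lookup is made up.

```python
def isam_lookup(index_buckets, tracks, overflow, key):
    """Return the payload stored under key, or None if it is not in the file."""
    # Step 1: linear scan of the FIRST entry of each index bucket. Stop at the
    # first bucket that starts above the key; the previous bucket is the candidate.
    chosen = None
    for bucket in index_buckets:
        if not bucket:
            continue
        if bucket[0][0] > key:
            break
        chosen = bucket

    if chosen is not None:
        # Step 2: inside the bucket, take the last entry whose key is <= the target.
        track_no = None
        for entry_key, entry_track in chosen:
            if entry_key > key:
                break
            track_no = entry_track
        # Step 3: chew through that track record by record.
        for rec_key, payload in tracks[track_no]:
            if rec_key == key:
                return payload
            if rec_key > key:
                break  # passed the spot where it would have been

    # Step 4: not in the primary data area, so scan the whole overflow area.
    for rec_key, payload in overflow:
        if rec_key == key:
            return payload
    return None


# Usage with made-up data: two index buckets, three tracks, two overflow records.
index_buckets = [[(10, 0), (40, 1)], [(70, 2)]]
tracks = {
    0: [(10, "a"), (20, "b")],
    1: [(40, "c"), (50, "d")],
    2: [(70, "e"), (90, "f")],
}
overflow = [(55, "late"), (5, "early")]
found = isam_lookup(index_buckets, tracks, overflow, 55)
```

Looking up 55 misses on track 1 and is only found by the overflow scan,
which is the expensive case a REORG is meant to clean up.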
All of this pain I just shared with you was handled by the OS. Your
COBOL, FORTRAN, BASIC, DIBOL, etc. program simply did READ RECORD blah
KEY EQ more-blah.
If you were coding in Assembly, this pain you had to do yourself,
especially if your OS didn't provide a system services library to do it
mostly for you.
Deletions are simply flagged deleted. You have to REORG these files to
empty out the overflow area, reclaim deleted data space, and clean up
the buckets.
*VSAM*
Virtual Storage Access Method.
Has only Index Areas (plural) and Data Areas. Each "bucket" is at least
one disk block in size. At its end it contains a link to the next bucket
in its area. These buckets can be scattered anywhere on disk. If you have
bound volumes or any other OS ability to group multiple disks into one
logical disk, they can be on any of those spindles. Data areas are the same.
You have the option of processing this file sequentially by doing a
keyed hit to the first record of some index and sequentially reading. It
will read in key sequence until end.
Index entries are required to record the actual data bucket that holds
the record.
Records could span blocks/buckets and they could be packed.
Deleted records were actually deleted. If block/bucket spanning was
enabled you paid a bit of overhead price while things shuffled around.
The file system keeps track of the lowest and highest key value for each
index as well as the bucket count for the index. A binary search is done
to find the correct index bucket.
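A small Python sketch of that binary search. It assumes, purely for
illustration, that the index buckets can be addressed by position and
that each one carries its own lowest and highest key. The tuple layout
and the name find_index_bucket are hypothetical.

```python
def find_index_bucket(buckets, key):
    """Binary search over key-ordered buckets of (low_key, high_key, entries).

    Returns the position of the bucket whose range covers key, or None.
    """
    lo, hi = 0, len(buckets) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        low_key, high_key, _entries = buckets[mid]
        if key < low_key:
            hi = mid - 1  # target lies in an earlier bucket
        elif key > high_key:
            lo = mid + 1  # target lies in a later bucket
        else:
            return mid
    return None


# Usage with made-up key ranges; the entries lists are left empty here.
buckets = [(1, 9, []), (10, 19, []), (20, 29, []), (40, 49, [])]
position = find_index_bucket(buckets, 15)
```

Compared with the linear walk of the ISAM index area, the number of
buckets examined grows with the logarithm of the bucket count.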
Again, COBOL, FORTRAN, BASIC, DIBOL, etc. programmers just did their
language's version of READ RECORD blah KEY EQ more-blah. The OS did
everything else.
Assembler programmers on platforms that didn't have system services to
call had to do all of this on their own.
We all had to take Assembler so we all had to learn this stuff.
Sorry, once that started spilling out I couldn't stop it.
Roland
On 3/7/2023 11:32 PM, Arjen Markus wrote:
I have never worked much with VAXes, but I do remember that VAX used a
file system where you made a new version of a file and the older
versions were automatically kept. I guess that is the purpose of the
INDEXED organisation. It is not so much a limitation of gfortran that
it does not support this, but a consequence of the operating system's
completely different view on files and file management.
Regards,
Arjen
Op di 7 mrt 2023 om 23:58 schreef Thomas Koenig via Fortran
<fortran@gcc.gnu.org>:
Hi Roland,
> 210 OPEN (UNIT=K_DRAW_CHAN,
> 1 FILE=DRAWING_DATA,
> 2 STATUS='OLD',
> 3 ORGANIZATION='INDEXED',
I'd never heard of that one up to now.
> 4 ACCESS='KEYED',
> 5 RECORDTYPE='FIXED',
> 6 FORM='UNFORMATTED',
> 7 RECL=K_DRAWING_RECORD_SIZE/4,
> 8 CARRIAGECONTROL='FORTRAN',
> 9 KEY=(1:8:CHARACTER),
> 1 DISP='KEEP',
> 2 IOSTAT=L_DRAW_ERR,
> 3 ERR=999)
>
> The ORGANIZATION='INDEXED' is key.
>
> GnuCOBOL
>
> https://gnucobol.sourceforge.io/
>
> uses the BerkleyDB (sp?) library so the standard COBOL indexed file
> support from the big computers can at least be mimicked.
>
> I'm searching everywhere and I cannot find Gnu Fortran (any flavor)
> having an ORGANIZATION clause in the OPEN().
ORGANIZATION is not an extension that gfortran supports.
ifort, which traces its lineage back to VMS Fortran, supports
ORGANIZATION, but not 'INDEXED', according to
https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/file-operation-i-o-statements/open-statement-specifiers/open-organization-specifier.html
This is likely a Fortran interface to a VMS speciality; the older
operating systems had stuff like that. UNIX did away with all
the record-orientation (I also remember VSAM and ISAM data sets
on old IBM mainframes) and UNIX and derivatives, and Windows, now
just offers the "stream of bytes" model.
So, if you need the functionality, you will have to implement it
yourself, possibly via a database.
Best regards
Thomas
--
Roland Hughes, President
Logikal Solutions
(630)-205-1593 (cell)
http://www.theminimumyouneedtoknow.com
http://www.infiniteexposure.net
http://www.johnsmith-book.com