Re: [Bug-apl] ISO Component File API preview

Blake McBride Thu, 10 Jul 2014 11:51:07 -0700

Additionally, the advantage of using an SQL database over writing your own
native component file system is that the SQL database handles all the
underlying structure (pointers and layout) as well as ordering and access
for you.  A component file system on top of SQL is utterly trivial compared
to the work it would take to write a native implementation.



On Thu, Jul 10, 2014 at 1:45 PM, Blake McBride <blake1...@gmail.com> wrote:

> Dear David,
>
> I haven't actually had time to look at your work yet.  I am going off of
> what Elias said, my fair amount of experience with SQL databases, my
> experience writing the keyed files system, and the comments you have made.
>  Forgive me if I am wrong, but based on your comments, it seems like you
> don't have a lot of experience with SQL databases.
>
> In SQL terminology, there are no 'files'.  A 'database' is analogous to a
> D: drive in Windows, or a partition, file system, or mount point on Unix.
>  What you would thing of as a 'file' is called a 'table'.  A database can
> have any number of tables.  You would never think of putting one table per
> database.  That would be like creating a new drive for each file.
>
> Each table can have any number of rows (or records).  Each record in a
> single table will have the same data elements as the other records in that
> same table - sort of like a spreadsheet.  You can order or select by any
> row or any combinations of rows.  In practice, however, it is best to have
> an index setup for the more popular types of queries you will be doing.
>
> SQLite keeps each database in a single Windows or Unix file.  PostgreSQL
> has a complex internal data representation made up of many Windows/Unix
> files.  Both of these  facts are totally abstracted away from you, and in
> fact shouldn't matter to you at all.
>
> So, in implementing a component file system, it makes most sense to equate
> a single APL component file to a single SQL table.  Each component would be
> a record in an SQL table.  Ordering and access are handled automatically by
> the database.  Your implementation should not care much at all whether it
> is SQLite or PostgreSQL.
>
> I hope this is helpful and not disrespectful.
>
> Sincerely,
>
> Blake
>
>
>
>
> On Thu, Jul 10, 2014 at 11:45 AM, David Lamkins <da...@lamkins.net> wrote:
>
>> Please keep in mind that one of the key design centers of a component
>> file system is that your APL data is stored on a file. SQLite seems a
>> perfect match for this use case. Using SQLite freed me from having to worry
>> about all the details of writing an indexed file and ensuring that updates
>> are done consistently even in the case of software or hardware failures.
>> It's not that I couldn't implement all of those details in APL and C(++);
>> it'd be a *lot* more work and subject to a lot more bugs in the near term.
>> For this reason, SQLite is the ideal layer upon which to build a
>> single-user component file system.
>>
>> Regarding files: Take a look at component file systems on other
>> platforms. They are, to the best of my knowledge, all backed by named
>> files. Again, SQLite makes this possible because one database is one file.
>>
>> Regarding resources: SQLite is a library that maps SQL interactions onto
>> its database file. There is no server. There is no additional overhead in
>> opening multiple SQLite databases.
>>
>> Leaving the SQLite clarifications behind...
>>
>> I've never seen anyone else layer a component file abstraction on top of
>> a database server. To me, the main benefit of this would be the (potential)
>> ability to share a component file from multiple APL instances in multiple
>> locations. That's something that you can't do with SQLite unless you build
>> a server to mediate connections, access rights and concurrency.
>>
>> That said: What do you (Elias and Blake) see as the benefit of using a
>> database server like PostgreSQL to host a component file abstraction,
>> keeping in mind that you already have access to the underlying database? Or
>> to phrase that differently: why use a component-file API when you really
>> need the features of a database server?
>>
>>
>>
>> On Thu, Jul 10, 2014 at 8:09 AM, Blake McBride <blake1...@gmail.com>
>> wrote:
>>
>>> David, I feel very strongly for many reasons that creating a new
>>> database for each component file is a big, big mistake.  The overhead on
>>> this has got to be inordinately expensive.  Did you see what I did with the
>>> keyed file system?  It wouldn't be hard to convert that to a component file
>>> system at all.  I think it is important that all component files be stored
>>> in a single database.
>>>
>>> If you need any help with PostgreSQL, Elias and I have a lot of
>>> experience with it.  Please feel free with any questions.  I am very happy
>>> to help.
>>>
>>> --blake
>>>
>>>
>>>
>>> On Thu, Jul 10, 2014 at 10:02 AM, Elias Mårtenson <loke...@gmail.com>
>>> wrote:
>>>
>>>> David's implementation opens a new connection (in SQLite, creates a new
>>>> database file) for each CF "file".
>>>>
>>>> This is not a particularly good idea, and even using SQLite is
>>>> extremely wasteful in terms of resources. You really only want a single
>>>> database connection.
>>>>
>>>> Even if your'e using a single database, creating a separate database
>>>> table for each "file" is also not good. Databases are not designed to have
>>>> tables created and removed all the time. What you should do is to create
>>>> the necessary tables once, and then use those to store all of the CF
>>>> "files".
>>>>
>>>> Regards,
>>>> Elias
>>>>
>>>>
>>>> On 10 July 2014 22:53, Blake McBride <blake1...@gmail.com> wrote:
>>>>
>>>>> I am not sure where you are going with this, but I thought it worked
>>>>> like Elias said - multiple files per database.  This means that once you
>>>>> form the connection with the database - a one time event - no component
>>>>> file operation has to worry about any database or SQL related stuff.  It 
>>>>> is
>>>>> all automatically associated to the database you opened in the beginning.
>>>>>  Your functions should all be totally compliant.
>>>>>
>>>>> In the rare case where you wanted to use multiple databases, you can
>>>>> have a function that switches relative databases, and then all subsequent
>>>>> conforming calls will be relative to the newly selected (underlying)
>>>>> database.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Blake
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jul 10, 2014 at 1:25 AM, David Lamkins <da...@lamkins.net>
>>>>> wrote:
>>>>>
>>>>>> The short answer is that I'd like to retain compliance with the ISO
>>>>>> spec. That means either coming up with conforming extensions (e.g. 
>>>>>> parsing
>>>>>> the "filename" to infer database type) or adding new functions (as has
>>>>>> already been done for transaction support).
>>>>>>
>>>>>> Mixing API's to explicitly accept a database handle seems like a
>>>>>> reasonable extension.
>>>>>>
>>>>>> I don't think it'd be a problem to require an update of the component
>>>>>> file implementation each time lib_sql adds support for a new database. In
>>>>>> fact, given the vagaries of SQL implementations, I can imagine the forced
>>>>>> update of the component file code being a near-certainty.
>>>>>>
>>>>>> I do like the notion of allowing multiple component files in a single
>>>>>> database.
>>>>>> On Jul 9, 2014 10:38 PM, "Elias Mårtenson" <loke...@gmail.com> wrote:
>>>>>>
>>>>>>> Thank you for looking into this.
>>>>>>>
>>>>>>> Since the SQL API is database-agnostic, it would make sense to make
>>>>>>> your library the same. Instead of trying to make sense of the file
>>>>>>> specification, why don't you just pass it along to the SQL API? In other
>>>>>>> words, just pass the database type (currently only "sqlite" or
>>>>>>> "postgresql", but who knows what other databases we might want to 
>>>>>>> support
>>>>>>> in the future?) and the connect parameters to the SQL library without
>>>>>>> looking at them?
>>>>>>>
>>>>>>> Or, even easier, you might not want to call SQL∆Connect at all, and
>>>>>>> simply accept an already-opened database handle?
>>>>>>>
>>>>>>> As for the "files" concept, wouldn't it make more sense to allow
>>>>>>> multiple files in a single database. The files index would then be 
>>>>>>> stored
>>>>>>> in a separate table, which would be linked to a column in the datastore.
>>>>>>> Listing the files would then be a simple SELECT from the 
>>>>>>> file-list-table.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Elias
>>>>>>>
>>>>>>>
>>>>>>> On 10 July 2014 13:30, David B. Lamkins <dlamk...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Well, it seems simple enough, but I have zero experience with
>>>>>>>> PostgreSQL
>>>>>>>> and it's going to take me a while to ramp up...
>>>>>>>>
>>>>>>>> Therefore: Please apply the attached experimental patch and see
>>>>>>>> whether
>>>>>>>> you can get something working.
>>>>>>>>
>>>>>>>> Here's the gist of the change as implemented in this patch:
>>>>>>>>
>>>>>>>> CF_OPEN takes either a file name (no path separator; .cf extension)
>>>>>>>> or a
>>>>>>>> PostgreSQL URI. In the former case, you get a SQLite database file
>>>>>>>> with
>>>>>>>> the given name. In the latter case, lib_sql passes the URI through
>>>>>>>> to
>>>>>>>> the PostgreSQL server.
>>>>>>>>
>>>>>>>> Once you have a database handle (and - in the case of the PostgreSQL
>>>>>>>> connection, I'd guess - a properly configured database) then
>>>>>>>> everything
>>>>>>>> else should work exactly the same. (Or maybe not... While
>>>>>>>> proofreading
>>>>>>>> this message, I realized that I depend upon SQLite's implict oid
>>>>>>>> support; this may not be the same on PostgreSQL. But that's a small
>>>>>>>> matter of either making a table definition that'll work in both
>>>>>>>> cases,
>>>>>>>> or creating a separate case for each SQL variant.)
>>>>>>>>
>>>>>>>> Elias, INSERT INTO should be valid in both SQLite and PostgreSQL. At
>>>>>>>> least, that's what my quick search of the documentation suggests.
>>>>>>>>
>>>>>>>> Of course, the ISO component file APIs that are tied to *files*
>>>>>>>> won't do
>>>>>>>> anything useful for a PostgreSQL connection. This includes CF_ERASE,
>>>>>>>> CF_RENAME and CF_CREATE; the first two for the obvious reasons and
>>>>>>>> the
>>>>>>>> last because (for now, until I figure out or someone tells me
>>>>>>>> otherwise)
>>>>>>>> I'm assuming that the PostgreSQL database passed to CF_OPEN will
>>>>>>>> have
>>>>>>>> been created by a database administrator. If we can prove basic
>>>>>>>> operation, I'll take care of making the the file APIs do something
>>>>>>>> reasonable (probably just signal a domain error) when passed a
>>>>>>>> PostgreSQL URI.
>>>>>>>>
>>>>>>>> The alternate (non-URI) form of PostgreSQL connection string is not
>>>>>>>> presently recognized. It may be tricky to distinguish between a
>>>>>>>> poorly-formed file name and a properly-formed non-URI connection
>>>>>>>> string.
>>>>>>>>
>>>>>>>> Guys: I'm willing to pursue this, but don't yet have the PostgreSQL
>>>>>>>> knowledge to enable me to proceed. If you want to provide patches,
>>>>>>>> that'd be great. If you're willing to talk me through setting up
>>>>>>>> PostgreSQL (It's installed; I just haven't figured out how to create
>>>>>>>> roles and databases yet...) that'd be even better.
>>>>>>>>
>>>>>>>> I'll mention in passing that this seems like a bit of an abuse of
>>>>>>>> the
>>>>>>>> component *file* concept since there's no readily-accessible (to the
>>>>>>>> application programmer, anyhow) file in the case of the PostgreSQL
>>>>>>>> database.
>>>>>>>>
>>>>>>>> On the other hand, the ability to host a component file abstraction
>>>>>>>> on a
>>>>>>>> database server seems like an interesting and possibly useful
>>>>>>>> "conforming extension". I imagine that this could be a relatively
>>>>>>>> easy
>>>>>>>> way to implement multiuser access, as well. (I'm assuming that
>>>>>>>> PostgreSQL handles multiple clients and can do The Right Thing
>>>>>>>> w.r.t.
>>>>>>>> locking.)
>>>>>>>>
>>>>>>>> Let's see where this goes. I'm looking forward to your feedback and
>>>>>>>> help...
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, 2014-07-10 at 12:56 +0800, Elias Mårtenson wrote:
>>>>>>>> > Yes, that's how I work too. My home server contains a Postgres
>>>>>>>> > instance that I use for pretty much everything. It's quite
>>>>>>>> convenient.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Regards,
>>>>>>>> > Elias
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On 10 July 2014 12:53, Blake McBride <blake1...@gmail.com> wrote:
>>>>>>>> >         Greetings,
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         PostgreSQL is very important to, at least, me.  I do a
>>>>>>>> lot of
>>>>>>>> >         production work in PostgreSQL.  I like SQLite too, but I
>>>>>>>> would
>>>>>>>> >         only use it when the data didn't relate to anything but
>>>>>>>> APL.
>>>>>>>> >          Here is what I propose.  Since your component file system
>>>>>>>> >         rides on top of SQL, and the standard doesn't know or care
>>>>>>>> >         about anything below the APL level, we should add a
>>>>>>>> function
>>>>>>>> >         that allows the user to specify the database information
>>>>>>>> >         (dbname, user, password, etc.).  That call would be made
>>>>>>>> as
>>>>>>>> >         sort of a setup step.  Once that setup step is specified,
>>>>>>>> all
>>>>>>>> >         of the standard API should work as described.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         This will give us a totally standard API.  If someone
>>>>>>>> wants to
>>>>>>>> >         switch to GNU APL, all they have to do is add one
>>>>>>>> function to
>>>>>>>> >         specify the database.  Not too much to ask.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         We kind of have to do this.  Even with SQLite, you still
>>>>>>>> have
>>>>>>>> >         to specify the database name (I presume one database
>>>>>>>> contains
>>>>>>>> >         many component files).
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         Thanks.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         Blake
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >         On Wed, Jul 9, 2014 at 9:44 PM, David B. Lamkins
>>>>>>>> >         <dlamk...@gmail.com> wrote:
>>>>>>>> >                 I'm certainly willing to consider alternatives.
>>>>>>>> IIUC,
>>>>>>>> >                 lib_sql also
>>>>>>>> >                 supports PostgreSQL. Anything else?
>>>>>>>> >
>>>>>>>> >                 How do I tell lib_sql to use a PostgreSQL server?
>>>>>>>> >
>>>>>>>> >                 The argument in favor of SQLite, of course, is
>>>>>>>> that
>>>>>>>> >                 it's serverless. No
>>>>>>>> >                 additional setup (beyond the installation of the
>>>>>>>> >                 library) required.
>>>>>>>> >
>>>>>>>> >                 Would there be any additional benefits or concerns
>>>>>>>> >                 when connecting to a
>>>>>>>> >                 PostgreSQL server?
>>>>>>>> >
>>>>>>>> >                 As you've no doubt noticed, there's nothing in the
>>>>>>>> >                 code (or in the
>>>>>>>> >                 standard API) to acknowledge or support the
>>>>>>>> notion of
>>>>>>>> >                 multiple users.
>>>>>>>> >                 Again: point in favor of SQLite...
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >                 On Thu, 2014-07-10 at 10:25 +0800, Elias Mårtenson
>>>>>>>> >                 wrote:
>>>>>>>> >                 > I was looking at your code, and I noticed that
>>>>>>>> it's
>>>>>>>> >                 SQLite-specific.
>>>>>>>> >                 > WOuldn't it make sense to make it
>>>>>>>> >                 SQL-implementation-agnostic?
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 > Based on what I can see, the only
>>>>>>>> SQLite-specific
>>>>>>>> >                 SQL you have in
>>>>>>>> >                 > there is "replace into" which I had never heard
>>>>>>>> >                 about before.
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 > Regards,
>>>>>>>> >                 > Elias
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 > On 9 July 2014 01:22, David Lamkins
>>>>>>>> >                 <da...@lamkins.net> wrote:
>>>>>>>> >                 >         I haven't yet written test scripts, but
>>>>>>>> I've
>>>>>>>> >                 informally tested
>>>>>>>> >                 >         all of the functions and am reasonably
>>>>>>>> >                 confident that the
>>>>>>>> >                 >         component file API is complete and
>>>>>>>> correct.
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         If you'd like to try out the API while
>>>>>>>> I'm
>>>>>>>> >                 working on scripted
>>>>>>>> >                 >         test cases, the repo is:
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> https://github.com/TieDyedDevil/iso-apl-cf
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         You'll find documentation is in the
>>>>>>>> comments
>>>>>>>> >                 and in Annex A of
>>>>>>>> >                 >         the ISO 13751 standard.
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         The standard "specifies a minimal set of
>>>>>>>> >                 functions which a
>>>>>>>> >                 >         conforming implementation must provide";
>>>>>>>> >                 I've implemented all
>>>>>>>> >                 >         of these. I've also added several useful
>>>>>>>> >                 functions not
>>>>>>>> >                 >         mentioned in the standard, including
>>>>>>>> >                 component inquiry,
>>>>>>>> >                 >         component drop, and transaction support.
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         Note that the code is not packaged for
>>>>>>>> my
>>>>>>>> >                 package manager; I
>>>>>>>> >                 >         assume that the component file
>>>>>>>> >                 implementation would become an
>>>>>>>> >                 >         L3 library in the event it's adopted for
>>>>>>>> >                 inclusion in GNU APL.
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         Júergen, I've specified the GPLv3
>>>>>>>> license
>>>>>>>> >                 since that's what
>>>>>>>> >                 >         GNU APL uses. If there's a more
>>>>>>>> appropriate
>>>>>>>> >                 choice of license
>>>>>>>> >                 >         for this library, please let me know.
>>>>>>>> >                 >
>>>>>>>> >                 >         --
>>>>>>>> >                 >         "The secret to creativity is knowing
>>>>>>>> how to
>>>>>>>> >                 hide your
>>>>>>>> >                 >         sources."
>>>>>>>> >                 >            Albert Einstein
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >                 >         http://soundcloud.com/davidlamkins
>>>>>>>> >                 >         http://reverbnation.com/lamkins
>>>>>>>> >                 >         http://reverbnation.com/lcw
>>>>>>>> >                 >         http://lamkins-guitar.com/
>>>>>>>> >                 >         http://lamkins.net/
>>>>>>>> >                 >         http://successful-lisp.com/
>>>>>>>> >                 >
>>>>>>>> >                 >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> "The secret to creativity is knowing how to hide your sources."
>>    Albert Einstein
>>
>>
>> http://soundcloud.com/davidlamkins
>> http://reverbnation.com/lamkins
>> http://reverbnation.com/lcw
>> http://lamkins-guitar.com/
>> http://lamkins.net/
>> http://successful-lisp.com/
>>
>
>

Re: [Bug-apl] ISO Component File API preview

Reply via email to