Troy A. Griffitts wrote:
> There is a basic and practical difference between a local and a remote
> installation, however abstract you want to get.
>
> Remote repositories have concepts like 'refresh from remote source'
> (apt-get update)
Short term for SWORD, that is likely to remain an important difference. Long term, either a local repo just returns "OK, done, nothing changed" when asked to refresh, or you can rethink whether this is really needed...

In the apt-get example you mentioned, I think the whole idea of needing "apt-get update" for a local metadata database arises only when the repository needs significant metadata to be usable, or when repositories get so large (tens of thousands of items, as with apt) that it is currently impractical for performance reasons to obtain the index info / metadata dynamically upon opening the repo. If the largest Debian package repository on the planet held 200 packages, would we really all be using the "apt-get update" type of approach, and keeping a local metadata database?

Thinking of other counter-examples: we do not "web update" before we can browse to a new web page. Nor do we "pdf update" before we can open a new PDF file, or "video update" for a video file. If YouTube is considered a large repository of video files, one does not "youtube update" before one can watch a new video :)

Some (probably too idealistic and blue sky) ideas and thoughts for the distant future of SWORD that arise when I think about this:

(1) If Peter von Kaehne's idea that "a SWORD module is like a PDF" is accurate and appropriate, then the whole idea of "installing" a SWORD module is an unhelpful anachronism that can go away at some point in future development. The end user does not really want to "install" a SWORD module; they want to use (read/search/annotate/etc.) it!

(2) Or, if modules are always going to be "installable" entities, for whatever reason, then it seems to me to make little sense to provide them online as a tree of files per module. It is surely simpler, more efficient, and maybe more logical(?) to provide them as a single compressed archive file per module, and let the "install" process also decompress them (either after transport to the machine running the application, or decompressing the byte stream as it arrives, if that is better for overall performance). Remote network transport time and disk write time are likely to dwarf any decompression time, even on embedded low-power CPUs.

Given this "SWORD modules are installable entities, not documents like PDFs" vision of the future, apt-get is a very workable analogy. A Debian package repository does not unpack every .deb file it offers so that users can fetch the files inside one at a time (those files are not generally useful individually anyway!). Instead, it stores the .deb files, which are compressed archives, along with metadata about them to aid searching. The client does the decompression and unpacking of the archives. I can imagine a SWORD repo operating this way, too.

Unless you allow direct remote access (RPC-like, or maybe even NFS-like?) to the items in the remote SWORD repo (potentially a nice blue sky idea, but not currently implemented!), what is the benefit of the "unpacked tree of files" format for repo owners, for front end developers, or for end users?

Right now, without knowing all the history, my understanding is that SWORD sort of does both, and so (to me) is confusing... online repositories are unpacked, but there is also a "raw zip" standardized way to store (and so transport) SWORD modules. When does the user pick one rather than the other? Why is the user being asked to make that choice? Is there really enough added value in having both to justify the additional system complexity that ensues from this "do both" approach to SWORD module storage in repositories?
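Just to make the "one compressed archive per module, the client unpacks it" idea a bit more concrete, the client-side step really is tiny. Something like the following (only a sketch: it assumes libzip, invents the function name and destination layout, and skips error handling and sanity checks such as rejecting "../" entries -- none of this is existing SWORD API):

    #include <zip.h>          // libzip
    #include <filesystem>
    #include <fstream>
    #include <string>
    #include <vector>

    // Unpack a single-archive module into destDir.  Error handling and
    // safety checks are omitted; the point is just how little the client
    // has to do once the archive has arrived.
    bool unpackModuleArchive(const std::string &zipPath,
                             const std::filesystem::path &destDir) {
        int err = 0;
        zip_t *za = zip_open(zipPath.c_str(), ZIP_RDONLY, &err);
        if (!za) return false;

        zip_int64_t count = zip_get_num_entries(za, 0);
        for (zip_uint64_t i = 0; i < (zip_uint64_t)count; ++i) {
            zip_stat_t st;
            if (zip_stat_index(za, i, 0, &st) != 0) continue;
            std::string name = st.name;
            std::filesystem::path out = destDir / name;
            if (!name.empty() && name.back() == '/') {   // directory entry
                std::filesystem::create_directories(out);
                continue;
            }
            std::filesystem::create_directories(out.parent_path());
            zip_file_t *zf = zip_fopen_index(za, i, 0);
            if (!zf) continue;
            std::vector<char> buf(st.size);
            zip_fread(zf, buf.data(), buf.size());
            zip_fclose(zf);
            std::ofstream(out, std::ios::binary).write(buf.data(), buf.size());
        }
        zip_close(za);
        return true;
    }

The transport step in front of this could then be any URI scheme the installer understands; the unpack step does not care how the archive arrived.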
(3) Ignoring backward compatibility (!), one could in future make SWORD modules available as .zip files (or some other defined compressed archive file format), *only*. An installer would then use URLs to find collections of these archive files (and the related repo metadata, if such is needed/useful), and more specific URLs to download the individual archive files, and then install them locally. This (as Greg pointed out) would allow for a very nicely abstracted set of methods that could expand to encompass any desired number of different URI schemes, from http: to ftp: to file: to sshfs: to something not yet invented.

[Aside: If this *is* done, it can often make a lot of sense to use an "embedded magic number" approach to identifying the files as SWORD modules rather than generic zip/gzip/bzip2/etc compressed files, as this permits special treatment of them without ugly workarounds like renaming them to end in something hopefully unique. Debian/Ubuntu .deb packages have such numbers, for example -- you can rename a .deb to a .foobar if you really want, and the file command will still identify it as a Debian binary package, so you can still set your file manager to do the right thing when you double-click it!]

It seems to me that movement in that general direction (SWORD modules are available as a single file in some defined format, decompressed and installed locally) might be better (long term) than adding features to the current (SWORD modules online are a hierarchy of many files) approach. This is potentially also a useful first step on the long road to "SWORD modules are a single data file which the application opens, just like a PDF, no installation of them is needed" -- first make modules be single files, then make the equivalent of whatever an installer does fast enough that you can do it at file (module) open time :)

> Local repositories usually aren't 'entered' in a list by a user, though
> I suppose they could be if it was useful. Practically there is usually
> 1 local source (a CD or USB drive) and the user can Browse... to the
> location. This is not easily replicated for remote sources. They are
> typically 'configured' and their configuration stored for future reference.

Consider bookmarks in a web browser -- one can bookmark both local files and remote ones there, and there is no difference in the user interface at all in that case. Why do SWORD modules require such a distinction? How does it help the end user to have that distinction be visible to them? How does it help the front end developer to have that distinction be visible in the API? If the distinction is unhelpful to the users, can it be abstracted away?

> Some I can think of: We now have just added support in 1.6.0 for
> non-anonymous FTP, so the user can input username/password if
> necessary-- useful for access to a private beta repository. We have
> supported Passive FTP as an option. With HTTP access, we might also add
> HTTP proxy features. These all require frontend user preferences.

I'd think that all of this can either be in the URL (username and password) or else a system-wide config option (proxies, passive vs active FTP -- though the "good default" these days for FTP seems to be: try passive, and if it fails in a certain way, fall back to active). This probably needs a way for the "open a URL" method to prompt the user for authentication information (username and password, usually), but that's all.
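To illustrate the "in the URL" part: splitting credentials back out of something like ftp://user:password@host/path is only a few lines, so no separate frontend preference is needed for them. A real implementation should of course use a proper URL parsing library; this is just a sketch with invented names, not SWORD API:

    #include <string>

    // Purely illustrative.  Splits a spec such as
    //   "ftp://beta:secret@www.example.com/pub/sword/"
    // into its pieces.
    struct SourceSpec {
        std::string scheme, user, password, hostAndPath;
    };

    SourceSpec parseSourceSpec(const std::string &url) {
        SourceSpec s;
        std::string rest = url;
        std::string::size_type p = rest.find("://");
        if (p != std::string::npos) {
            s.scheme = rest.substr(0, p);
            rest = rest.substr(p + 3);
        }
        std::string::size_type at = rest.find('@');
        std::string::size_type slash = rest.find('/');
        if (at != std::string::npos &&
            (slash == std::string::npos || at < slash)) {
            std::string cred = rest.substr(0, at);   // "user" or "user:pass"
            rest = rest.substr(at + 1);
            std::string::size_type colon = cred.find(':');
            s.user = cred.substr(0, colon);
            if (colon != std::string::npos)
                s.password = cred.substr(colon + 1);
        }
        s.hostAndPath = rest;
        return s;
    }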
The underlying subsystems (and control panels for users to set proxies, etc.) for doing it that way already exist on most and perhaps all platforms, as far as I know, so making them preferences that SWORD front ends need to handle specially seems like extra work for both SWORD developers and SWORD end users, for no real benefit?

> There will almost certainly be additional work for the frontends when we
> add HTTP support, though not as different as you fear might be.

I think that a careful design might be able to avoid that. Even if that is impractical for this first implementation, I think a design that moves the library towards a more unified approach to acquiring and opening/using modules would be good. For instance, longer term still, given a fast enough network pipe, why download and install any modules at all -- one should conceivably be able to have more of an RPC-style approach to accessing a remote module... a little like accessing a remote SQL database today, or even just a file on a network share... you don't have to copy the entire database (or file) to your PC first before you can use it :)

> Libraries of modules are exposed as SWMgr objects.
> An SWMgr object can be easily created from a local path:
>
> SWMgr localLibrary("/path");
>
> So for local sources, you don't need InstallMgr to obtain an SWMgr object.

Why not see that parameter as potentially being a URL, in a later version of the library, so unifying remote and local access? If a URL parsing library says it isn't a URL, treat it as file:// as a fallback...

    SWMgr library("http://www.example.com/sword/");

or similar? (A rough sketch of what I mean is in the P.S. below.) Other than performance over a slow network connection, is there any technical requirement for this to be restricted to "local"? Since "local" can in fact be very remote if one mounts a network filesystem, I'm not sure the distinction is all that useful to the end user anyway, is it?

Going even further, is it necessary or helpful for the API to have the concept of "libraries" at all, other than as bookmarks to modules to open or install? We don't normally expect PDF files to exist grouped into "libraries"; why would we expect SWORD modules to be so grouped, if they are in effect just like PDFs? (Even if they are in some ways perhaps more like databases, with all the searching and indexing machinery that they need... we don't generally group databases into sets of databases based on their physical or network location, either.)

> int InstallMgr::installModule(SWMgr *destMgr, const char *fromLocation,
>                               const char *modName);
> // which I don't hate

Can SWORD not go back to this, and allow *fromLocation to be a URL? I think this is more or less what Greg is suggesting :) Adding capability without changing the API is really nice if you can do it, and if the resulting API is actually *simpler* than before (plus, as a bonus, you don't hate it)... that sounds good all around.

Jonathan
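P.S. In case it helps to see the "path or URL, treated uniformly" idea from the caller's side, here is the kind of wrapper I am imagining. It is only a sketch: looksLikeURL() and openLibrary() are invented names, the URL branch does not exist in SWORD today, and only the local branch actually works:

    #include <swmgr.h>        // SWORD's SWMgr
    #include <string>

    using namespace sword;

    // Invented helper: would a URL parsing library accept this string?
    static bool looksLikeURL(const std::string &spec) {
        return spec.find("://") != std::string::npos;
    }

    // Invented wrapper: one entry point for local paths and remote sources.
    SWMgr *openLibrary(const std::string &spec) {
        if (!looksLikeURL(spec) || spec.compare(0, 7, "file://") == 0) {
            std::string path =
                (spec.compare(0, 7, "file://") == 0) ? spec.substr(7) : spec;
            return new SWMgr(path.c_str());     // this part works today
        }
        // Hypothetical future behaviour: fetch the remote source's
        // metadata and expose it through the same SWMgr interface.
        // return openRemoteLibrary(spec);
        return 0;
    }

    // What the caller would then be able to write:
    //   SWMgr *local  = openLibrary("/usr/share/sword");
    //   SWMgr *remote = openLibrary("http://www.example.com/sword/");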