Wow. I'm interested to see if you can really get this to work. Just make sure that, after you get the existing unit tests to pass, the functional tests that require external servers also pass. If you run mvn site:stage you will see the documentation on how to do that.

VFS-245 is opened against AbstractFileName and probably needs to be dealt with in the context of what you are doing.

On Jun 4, 2009, at 2:22 PM, Vince Bonfanti wrote:

I'm investigating creating a memcached-based FilesCache implementation for use with Google App Engine. The basic obstacle is that all objects used as keys or values must be serializable. Before I go too far down this path, I'd like to know if this is a reasonable thing to
do (or consider doing). Any thoughts or guidance would be greatly
appreciated.

Background info: the Google App Engine (GAE) environment is inherently
distributed: an unknown number of instances of your servlet-based web
application will be running at the same time. The GaeVFS project
(http://gaevfs.appspot.com) implements a VFS plugin on top of the GAE
datastore; this is needed because the "real" filesystem within GAE is
read-only, so GaeVFS provides a writeable filesystem for use by
applications. It seems that for proper operation of VFS, the FilesCache
implementation used by GaeVFS should (must?) also be inherently distributed.
Fortunately, GAE supplies a memcached API that will be perfect for this use
(if I can solve the serialization problem).

Here's my first cut at what classes I'd need to make serializable and some
notes on which fields might be marked transient:

*AbstractFileName*
serialized fields: scheme (String), absPath (String), type (FileType)
transient fields: uri, baseName, rootUri, extension, decodedPath

*FileType*
serialized fields: name (String), hasChildren (boolean), hasContent
(boolean), hasAttrs (boolean)
transient fields: none

*AbstractFileObject*
serialized fields: name (AbstractFileName), content (DefaultFileContent)
transient fields: fs (AbstractFileSystem), operation (FileOperations),
attached (boolean), type (FileType), parent (FileObject), children
(FileName[]), objects (List)
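
To make the serialized/transient split above concrete, here is a minimal sketch. It assumes a simplified stand-in class (SketchFileName, hypothetical, not the real AbstractFileName) and an assumed scheme + "://" + path URI format. The stored fields are written out, and the derived fields are marked transient and rebuilt lazily after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for AbstractFileName; not the real VFS class.
class SketchFileName implements Serializable {
    private static final long serialVersionUID = 1L;

    // Stored state: these fields are written to the stream.
    private final String scheme;
    private final String absPath;

    // Derived state: skipped by serialization, so it is null after
    // deserialization and must be recomputed on demand.
    private transient String uri;

    SketchFileName(String scheme, String absPath) {
        this.scheme = scheme;
        this.absPath = absPath;
    }

    // Assumed URI format for illustration only.
    String getURI() {
        if (uri == null) {
            uri = scheme + "://" + absPath;
        }
        return uri;
    }

    // Computed from absPath each time, so nothing needs to be stored.
    String getBaseName() {
        return absPath.substring(absPath.lastIndexOf('/') + 1);
    }
}

class SerializationSketch {
    // Serializes a value to bytes and reads it back, as a cache would.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

The point of the lazy getter is that a transient field comes back as null, so any accessor for derived data has to tolerate that rather than assume the constructor already filled it in.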

It's very important that the fs (AbstractFileSystem) field be transient to limit the number of other classes that need to be made serializable. It
looks like files are always retrieved from the files cache via the
AbstractFileSystem.getFileFromCache() method; if so then the "fs" field of AbstractFileObject can be restored within this method and doesn't need to be
serialized. We'll need to define a package-scope
AbstractFileObject.setFileSystem() method to support this. Or, this could be done within the MemcachedFilesCache.getFile() method before returning the FileObject (but then we'd need a FileObject.setFileSystem method, or do some
not-very-nice type casting).
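
As a rough illustration of the second option (restoring "fs" inside the cache's getFile), here is a sketch under stated assumptions. All class names are hypothetical stand-ins, and an in-memory map of byte arrays takes the place of memcached. The file system class is deliberately not Serializable, so the file object only serializes because its fs field is transient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for AbstractFileSystem; deliberately NOT Serializable.
class SketchFileSystem {
    final String rootUri;

    SketchFileSystem(String rootUri) {
        this.rootUri = rootUri;
    }
}

// Hypothetical stand-in for AbstractFileObject.
class SketchFileObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String path;
    // Transient: never written to the stream, null after deserialization.
    private transient SketchFileSystem fs;

    SketchFileObject(String path, SketchFileSystem fs) {
        this.path = path;
        this.fs = fs;
    }

    String getPath() {
        return path;
    }

    SketchFileSystem getFileSystem() {
        return fs;
    }

    // Package-scope setter, as proposed in the message above.
    void setFileSystem(SketchFileSystem fs) {
        this.fs = fs;
    }
}

// A byte[] map stands in for memcached: values only survive as serialized bytes.
class SketchSerializingCache {
    private final Map<String, byte[]> store = new HashMap<>();

    void putFile(SketchFileObject file) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(file);
            }
            store.put(file.getPath(), bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // Returns null on a cache miss; otherwise reattaches the file system
    // before handing the object back to the caller.
    SketchFileObject getFile(SketchFileSystem fs, String path) {
        byte[] data = store.get(path);
        if (data == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            SketchFileObject file = (SketchFileObject) in.readObject();
            file.setFileSystem(fs);
            return file;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}
```

Every object returned from this cache is a fresh copy, so callers cannot rely on object identity between two lookups of the same path.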

*DefaultFileContent*
serialized fields: file (AbstractFileObject), attrs (Map), roAttrs (Map),
fileContentInfo (FileContentInfo), fileContentInfoFactory
(FileContentInfoFactory), openStreams (int)
transient fields: threadData (ThreadLocal)
question: what types do the "attrs" and "roAttrs" maps contain? are these
serializable?
attrs and roAttrs contain attributes specific to the file system. For example, Webdav contains what you would get back from a PROPFIND method. The Jar file system seems to return values found in the jar manifest, etc. The values I found would all be Strings but I can't guarantee it.
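
Since the value types can't be guaranteed, one defensive option is to check an attribute map before caching it. The helper below is a hypothetical sketch (AttributeCheck is not part of VFS). The check is shallow: a Serializable container holding non-serializable elements would still pass here and fail at write time.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Commons VFS.
class AttributeCheck {
    // Returns the keys whose values do not implement Serializable.
    // Null values are skipped because null serializes without error.
    static List<String> nonSerializableKeys(Map<String, ?> attrs) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, ?> entry : attrs.entrySet()) {
            Object value = entry.getValue();
            if (value != null && !(value instanceof Serializable)) {
                bad.add(entry.getKey());
            }
        }
        return bad;
    }
}
```

A cache implementation could log or drop the offending entries instead of failing the whole put.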

question: FileContentInfo and FileContentInfoFactory are interfaces, what
are the implementations of these?
DefaultFileContentInfo, FileContentInfoFileNameFactory, HttpFileContentInfoFactory, MimeFileContentInfoFactory and WebdavFileContentInfoFactory.


