I'm investigating creating a memcached-based FilesCache implementation for
use with Google App Engine. The basic obstacle is that memcached requires
all objects used as keys or values to be serializable. Before I
go too far down this path, I'd like to know if this is a reasonable thing to
do (or consider doing). Any thoughts or guidance would be greatly
appreciated.

Background info: the Google App Engine (GAE) environment is inherently
distributed; an unknown number of instances of your servlet-based web
application may be running at the same time. The GaeVFS
project (http://gaevfs.appspot.com) implements a VFS plugin on top of the
GAE datastore; this is needed because the "real" filesystem within GAE is
read-only, so GaeVFS provides a writeable filesystem for use by
applications. It seems that for proper operation of VFS, the FilesCache
implementation used by GaeVFS should (must?) also be inherently distributed.
Fortunately, GAE supplies a memcached API that will be perfect for this use
(if I can solve the serialization problem).

Here's my first cut at what classes I'd need to make serializable and some
notes on which fields might be marked transient:

*AbstractFileName*
serialized fields: scheme (String), absPath (String), type (FileType)
transient fields: uri, baseName, rootUri, extension, decodedPath
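
To illustrate the split above, here is a minimal sketch of the pattern I have
in mind. SketchFileName is a hypothetical stand-in, not the real
AbstractFileName, and the getter logic is simplified; the assumption is that
every transient field can be recomputed lazily from the serialized ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for AbstractFileName; not the actual VFS class.
class SketchFileName implements Serializable {
    private static final long serialVersionUID = 1L;

    // Serialized state: enough to rebuild everything else.
    private final String scheme;
    private final String absPath;

    // Derived values: skipped by serialization, recomputed on first use.
    private transient String uri;
    private transient String baseName;

    SketchFileName(String scheme, String absPath) {
        this.scheme = scheme;
        this.absPath = absPath;
    }

    String getURI() {
        if (uri == null) {
            uri = scheme + "://" + absPath;
        }
        return uri;
    }

    String getBaseName() {
        if (baseName == null) {
            baseName = absPath.substring(absPath.lastIndexOf('/') + 1);
        }
        return baseName;
    }

    // Serializes and deserializes a copy, as a trip through memcached would.
    static SketchFileName roundTrip(SketchFileName name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(name);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchFileName) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip the transient fields come back as null, so the lazy
getters rebuild them from scheme and absPath on first access.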

*FileType*
serialized fields: name (String), hasChildren (boolean), hasContent
(boolean), hasAttrs (boolean)
transient fields: none
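
One wrinkle, assuming FileType is a type-safe enum class whose shared
constants are compared by identity: a deserialized copy would be a new
object, so == comparisons would fail. A rough sketch of how readResolve()
could map the copy back to the canonical constant follows. SketchFileType
and its constant values are hypothetical, not copied from the VFS source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical stand-in for FileType; the constants here are assumptions.
final class SketchFileType implements Serializable {
    private static final long serialVersionUID = 1L;

    static final SketchFileType FILE =
            new SketchFileType("file", false, true, true);
    static final SketchFileType FOLDER =
            new SketchFileType("folder", true, false, true);

    private final String name;
    private final boolean hasChildren;
    private final boolean hasContent;
    private final boolean hasAttrs;

    private SketchFileType(String name, boolean hasChildren,
                           boolean hasContent, boolean hasAttrs) {
        this.name = name;
        this.hasChildren = hasChildren;
        this.hasContent = hasContent;
        this.hasAttrs = hasAttrs;
    }

    // Called by the serialization runtime after reading the object;
    // swaps the fresh copy for the shared constant so identity holds.
    private Object readResolve() throws ObjectStreamException {
        if (FILE.name.equals(name)) {
            return FILE;
        }
        if (FOLDER.name.equals(name)) {
            return FOLDER;
        }
        throw new InvalidObjectException("unknown file type: " + name);
    }

    // Serializes and deserializes a value, as a trip through memcached would.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With readResolve() in place, only the name strictly needs to be serialized;
the boolean fields could then be marked transient as well.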

*AbstractFileObject*
serialized fields: name (AbstractFileName), content (DefaultFileContent)
transient fields: fs (AbstractFileSystem), operation (FileOperations),
attached (boolean), type (FileType), parent (FileObject), children
(FileName[]), objects (List)

It's very important that the fs (AbstractFileSystem) field be transient to
limit the number of other classes that need to be made serializable. It
looks like files are always retrieved from the files cache via the
AbstractFileSystem.getFileFromCache() method; if so then the "fs" field of
AbstractFileObject can be restored within this method and doesn't need to be
serialized. We'll need to define a package-scope
AbstractFileObject.setFileSystem() method to support this. Or, this could be
done within the MemcachedFilesCache.getFile() method before returning the
FileObject (but then we'd need a FileObject.setFileSystem method, or do some
not-very-nice type casting).
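
Roughly, the first option could look like the sketch below. All class names
are hypothetical stand-ins rather than the real VFS classes, and a HashMap
of byte arrays stands in for memcached; the point is only that the file
system re-attaches itself to each object it pulls out of the cache.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for AbstractFileSystem; deliberately not Serializable.
class SketchFileSystem {
    // Stands in for memcached: values are held only as serialized bytes.
    private final Map<String, byte[]> cache = new HashMap<>();

    void putFileInCache(SketchFileObject file) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(file);
            }
            cache.put(file.getPath(), bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    SketchFileObject getFileFromCache(String path) {
        byte[] bytes = cache.get(path);
        if (bytes == null) {
            return null; // cache miss
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes))) {
            SketchFileObject file = (SketchFileObject) in.readObject();
            file.setFileSystem(this); // restore the transient back-reference
            return file;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical stand-in for AbstractFileObject.
class SketchFileObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String path;
    private transient SketchFileSystem fs; // never written to the stream

    SketchFileObject(String path, SketchFileSystem fs) {
        this.path = path;
        this.fs = fs;
    }

    String getPath() {
        return path;
    }

    SketchFileSystem getFileSystem() {
        return fs;
    }

    // Package scope: only the file system in the same package calls this.
    void setFileSystem(SketchFileSystem fs) {
        this.fs = fs;
    }
}
```

Because fs is transient, the file system class itself never has to be made
serializable, which is what keeps the list of affected classes short.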

*DefaultFileContent*
serialized fields: file (AbstractFileObject), attrs (Map), roAttrs (Map),
fileContentInfo (FileContentInfo), fileContentInfoFactory
(FileContentInfoFactory), openStreams (int)
transient fields: threadData (ThreadLocal)
question: what types do the "attrs" and "roAttrs" maps contain? are these
serializable?
question: FileContentInfo and FileContentInfoFactory are interfaces; what
are the implementations of these?
