Hi Sunburned,

We are currently working on this problem in my laboratory. We are studying
two solutions:

- use GML as a native format (as TAB is for MapInfo),
- create a specific binary format for OJ.

When we have some results, I will send a mail to the OJ team.

Cheers.

R1.


On 3/29/07, Sunburned Surveyor <[EMAIL PROTECTED]> wrote:

I've been working at home for the past couple of weeks on a solution to the
problem of working with very large datasets in OpenJUMP. (For those of you
that don't know, OpenJUMP reads all features from a data source into
memory. This isn't a problem until you start working with some very large
datasets. For example, OpenJUMP runs out of memory before it can open the
shapefile with all of the parcels in my county. The size of the data source
OpenJUMP can work with is limited by the RAM of the computer OpenJUMP is
running on.) I'd like to give a brief explanation of how this system will
work, and then ask for some suggestions on an aspect of the design.



This system uses a very light-weight in-memory representation of the
Feature class. (This is required because portions of OpenJUMP's code
require the ability to manipulate individual features or all the features
in a feature collection "in-memory".) Objects of this light-weight Feature
class are really a façade and forward all method calls to a FeatureCache
object. A FeatureCache is an implementation of the FeatureCollection
interface that actually manages the data behind the light-weight Feature
objects.
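
A minimal sketch of the façade idea described above, in Java. The names
MiniFeature, FeatureResolver, FeatureFacade and MapBackedFeature are
hypothetical and used only for illustration; MiniFeature is a much smaller
stand-in for OpenJUMP's real Feature interface, not a copy of it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, much smaller stand-in for OpenJUMP's Feature interface. */
interface MiniFeature {
    Object getAttribute(String name);
    void setAttribute(String name, Object value);
}

/** Hypothetical cache contract: turns a feature ID into the full feature. */
interface FeatureResolver {
    MiniFeature resolve(int id);
}

/** Light-weight facade: holds only an ID and a reference to the cache. */
class FeatureFacade implements MiniFeature {
    private final int id;
    private final FeatureResolver cache;

    FeatureFacade(int id, FeatureResolver cache) {
        this.id = id;
        this.cache = cache;
    }

    @Override
    public Object getAttribute(String name) {
        // Every call is forwarded; the facade stores no attribute data itself.
        return cache.resolve(id).getAttribute(name);
    }

    @Override
    public void setAttribute(String name, Object value) {
        cache.resolve(id).setAttribute(name, value);
    }
}

/** Plain in-memory feature playing the role of the "regular" Feature. */
class MapBackedFeature implements MiniFeature {
    private final Map<String, Object> attributes = new HashMap<>();

    @Override
    public Object getAttribute(String name) {
        return attributes.get(name);
    }

    @Override
    public void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }
}
```

Because the façade keeps only an ID and a reference, its memory cost stays
small no matter how large the attributes or geometry of the real feature are.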



The FeatureCache maintains a "buffer". In this buffer it stores in-memory
representations of regular OpenJUMP Feature objects. The buffer grows only
to a maximum size, which the user can set to balance speed/performance
against memory usage. When a method call is made to the light-weight
Feature object, it is forwarded to the FeatureCache. The FeatureCache
passes this call to the regular Feature object if it is in the buffer. If
it is not in the buffer, the Feature object is created in memory from
information in permanent storage ("on-disk"). The method call is then
processed and the newly created Feature is placed in the buffer. If the
buffer is already at its limit, the oldest Feature in the buffer is written
back to permanent storage and removed from the buffer.
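
The buffer behaviour described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: FeatureStore,
BoundedFeatureBuffer and MapFeatureStore are hypothetical names, features
are identified by an int ID, and the "disk" is simulated with a HashMap. It
relies on java.util.LinkedHashMap, which keeps insertion order by default,
so the eldest entry is the oldest Feature in the buffer.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for permanent storage; a real one would use files. */
interface FeatureStore<F> {
    F load(int id);
    void save(int id, F feature);
}

/** Bounded buffer that writes back and drops its oldest entry when full. */
class BoundedFeatureBuffer<F> {
    private final FeatureStore<F> store;
    private final LinkedHashMap<Integer, F> buffer;

    BoundedFeatureBuffer(int maxSize, FeatureStore<F> store) {
        this.store = store;
        // Insertion-ordered map: the eldest entry is the first one inserted.
        this.buffer = new LinkedHashMap<Integer, F>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, F> eldest) {
                if (size() > maxSize) {
                    // Write the oldest feature back before it leaves the buffer.
                    store.save(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    /** Returns the buffered feature, loading it from the store on a miss. */
    F get(int id) {
        F feature = buffer.get(id);
        if (feature == null) {
            feature = store.load(id);
            buffer.put(id, feature); // may trigger eviction of the oldest entry
        }
        return feature;
    }

    boolean isBuffered(int id) {
        return buffer.containsKey(id);
    }

    int size() {
        return buffer.size();
    }
}

/** In-memory fake of the on-disk store, counting loads and saves. */
class MapFeatureStore implements FeatureStore<String> {
    final Map<Integer, String> disk = new HashMap<>();
    int loads = 0;
    int saves = 0;

    @Override
    public String load(int id) {
        loads++;
        return disk.get(id);
    }

    @Override
    public void save(int id, String feature) {
        saves++;
        disk.put(id, feature);
    }
}
```

With a maximum size of 2, asking for features 1, 2 and 3 in turn leaves 2
and 3 in the buffer and writes feature 1 back to the store. Evicting the
oldest entry matches the description above; evicting the least recently
used entry instead would be a different policy.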



There should be no major distinction between Features and a
FeatureCollection implemented by a FeatureCache and normal Features and
FeatureCollections that are stored entirely in memory. The only significant
difference will be the speed of operations and rendering. Both will be
slower with this system than they are with Features and FeatureCollections
stored entirely in memory. However, it will make it possible to work with
very large datasets.



Here is the part of the system that I would like to get some suggestions
on. I need to decide on a storage format for the features placed in
permanent storage, that is, on disk. I think I have three choices.



[1] Java's standard object serialization format.

[2] A custom binary storage format.

[3] A text-based format.



I believe the first two formats will be much quicker than the third. I
don't really think the second format is something I want to do, because I
think cooking up a custom binary format will be a real pain in the neck. So
I need to decide between the first format listed and the third format
listed.



If I use a text-based format, external tools will be able to easily work
with the FeatureCache, and I won't have to worry about versioning issues.
It will also be slower. If I use Java's standard object serialization
format, I'll have better performance, but I'll have to worry about
versioning issues that might come up if we change the definition of the
Feature interface. It will also make it difficult for external tools,
especially those that aren't written in Java, to work with the data in the
FeatureCache.
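
To make the trade-off concrete, here is a small sketch that stores the same
record both ways. ParcelRecord and ParcelCodec are hypothetical names, not
OpenJUMP classes, and the geometry is assumed to be held as a WKT string.
The explicit serialVersionUID is the usual way to keep serialized data
readable across compatible class changes (such as adding a field); without
it, the default ID is computed from the class and can change when the class
does, so older data then fails to load with an InvalidClassException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical record; the geometry is assumed to be stored as WKT text. */
class ParcelRecord implements Serializable {
    // Pinned version ID so compatible class changes keep old data readable.
    private static final long serialVersionUID = 1L;

    final int id;
    final String wkt;

    ParcelRecord(int id, String wkt) {
        this.id = id;
        this.wkt = wkt;
    }

    /** Text option: one tab-separated line that any tool can parse. */
    String toTextLine() {
        return id + "\t" + wkt;
    }

    static ParcelRecord fromTextLine(String line) {
        String[] parts = line.split("\t", 2);
        return new ParcelRecord(Integer.parseInt(parts[0]), parts[1]);
    }
}

/** Serialization option: Java's standard object streams. */
class ParcelCodec {
    static byte[] toBytes(ParcelRecord record) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(record);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static ParcelRecord fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ParcelRecord) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not read record", e);
        }
    }
}
```

The text line can be read by any tool in any language, while the serialized
bytes can only be read by Java code that has a compatible ParcelRecord class
on its classpath.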


I'd like to know what storage format the other developers would recommend
and why.

Thanks,

The Sunburned Surveyor

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share
your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Jump-pilot-devel mailing list
Jump-pilot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel

