Martin Davis wrote:

>I wouldn't try and set quantitative upper limits, just let them be 
>dictated by available memory and processor size.
>
>But it seems reasonable to state that the architecture is primarily 
>designed for in-memory datasets (although that isn't strictly true, 
>since the DataStore framework shows how the same architecture can be 
>used to deal with datasets in external memory accessed on-the-fly and 
>via caching).  Perhaps a better description is that in JUMP features 
>which are visible are expected to be in-memory.
>  
>
One limitation of the DataStore framework, as far as I have experienced it, is 
that each time you zoom or pan, the needed features are loaded into memory as 
if they were new features (new fid), and you lose the other features (those 
located outside the window) as if they had never existed. This makes it 
impossible to process a whole FeatureCollection.
My vision would be something in between: only references are loaded into 
memory (id and envelope), but a reference is kept for every feature of the 
FeatureCollection. With a good cache system, this should make it possible to 
load much bigger datasets than with in-memory loaders, and caching should 
keep performance reasonable.
Even though it seems reasonable and feasible to me, I have not managed to 
realize it so far (or only versions with terrible performance drawbacks :-( )
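
To make that a bit more concrete, here is the kind of structure I have in mind. 
This is only a sketch: the class names (FeatureRef, FeatureCache, FeatureLoader) 
are made up, only the JTS Envelope is a real class, and a real implementation 
would have to plug into the existing FeatureCollection interface.

import java.util.LinkedHashMap;
import java.util.Map;
import com.vividsolutions.jts.geom.Envelope;

// Lightweight reference kept in memory for *every* feature of the collection.
class FeatureRef {
    final int fid;            // stable id in the backing store
    final Envelope envelope;  // bounding box, enough for spatial queries
    FeatureRef(int fid, Envelope envelope) {
        this.fid = fid;
        this.envelope = envelope;
    }
}

// Hypothetical driver hook (shapefile, database, ...) that fetches one feature.
interface FeatureLoader {
    Object load(int fid);
}

// Full features are loaded on demand and evicted LRU-style, while the
// FeatureRef index above stays resident for the whole dataset.
class FeatureCache {
    private final int maxSize;
    private final FeatureLoader loader;
    private final LinkedHashMap<Integer, Object> cache;

    FeatureCache(int maxSize, FeatureLoader loader) {
        this.maxSize = maxSize;
        this.loader = loader;
        // an access-ordered LinkedHashMap gives simple LRU eviction
        this.cache = new LinkedHashMap<Integer, Object>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<Integer, Object> eldest) {
                return size() > FeatureCache.this.maxSize;
            }
        };
    }

    Object getFeature(FeatureRef ref) {
        Object feature = cache.get(ref.fid);
        if (feature == null) {
            feature = loader.load(ref.fid);   // hit the external store only on a miss
            cache.put(ref.fid, feature);
        }
        return feature;
    }
}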

>Personally I wouldn't try and go further than the current 
>in-memory/caching feature stream architecture.  Moving to a completely 
>external-memory based architecture (such as uDig) brings a whole passel 
>of new problems you have to work around (mostly to do with not having a 
>feature available when you want to work on it).
>  
>
I agree that managing large datasets in read/write mode is quite difficult and 
needs a good framework to manage transactions (I think the Kosmo team did good 
work in this area).
But I think read-only datasets are also an interesting use case, and they are 
not as difficult to manage as a read/write driver.

>The caching Layer paradigm is pretty powerful - it could easily be 
>extended to handle huge shapefiles, for instance, by providing a 
>suitable driver. 
>  
>
Agile (Alvaro Zabala) wrote a good scalable shapefile driver in the past. I 
could make it work with OJ after a minor modification, but I remember there 
were many dependencies.
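
For what it is worth, the random-access part of such a driver is not that hard. 
Below is a small, untested sketch showing how the .shx index can be read so that 
individual .shp records are fetched only when needed; it deliberately ignores 
the .dbf attributes and the geometry parsing, which the existing shapefile 
reader already handles.

import java.io.IOException;
import java.io.RandomAccessFile;

// Reads the .shx index so that .shp records can be fetched individually,
// which is the building block a lazy "huge shapefile" driver would need.
class ShxIndex {
    private final int[] offsets;   // byte offset of each record's content in the .shp
    private final int[] lengths;   // content length of each record, in bytes

    ShxIndex(String shxPath) throws IOException {
        RandomAccessFile shx = new RandomAccessFile(shxPath, "r");
        try {
            // .shx layout: 100-byte header, then 8 bytes per record
            int count = (int) ((shx.length() - 100) / 8);
            offsets = new int[count];
            lengths = new int[count];
            shx.seek(100);
            for (int i = 0; i < count; i++) {
                // offsets and lengths are stored big-endian, counted in 16-bit words
                offsets[i] = shx.readInt() * 2 + 8;   // +8 skips the record header in the .shp
                lengths[i] = shx.readInt() * 2;
            }
        } finally {
            shx.close();
        }
    }

    int size() { return offsets.length; }

    // Returns the raw content of record i; turning it into a JTS geometry
    // is left to the normal shapefile parsing code.
    byte[] readRecord(RandomAccessFile shp, int i) throws IOException {
        byte[] buffer = new byte[lengths[i]];
        shp.seek(offsets[i]);
        shp.readFully(buffer);
        return buffer;
    }
}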

>If the big-point-dataset thing is really such a big deal, I think it 
>could be  handled by loading the points into a very compact internal 
>representation (say an array of doubles), fronted by a FeatureCollection 
>implementation which simply creates Geometries and Features on-demand.  
>I suspect that this might scale very well for viewing.  Obviously a 
>custom data loader would be required, but there's no free lunch.
>  
>
Nice to benefit from the ideas of the JUMP architect :-)
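
Just to make the on-demand idea concrete, here is a stripped-down sketch that 
keeps the points as a flat array of doubles and builds a JTS Point only when 
asked. It uses only real JTS classes; a real version would of course implement 
the JUMP FeatureCollection/Feature interfaces instead of java.util.List and 
would carry attributes as well.

import java.util.AbstractList;
import com.vividsolutions.jts.geom.Coordinate;
import com.vividsolutions.jts.geom.GeometryFactory;
import com.vividsolutions.jts.geom.Point;

// Stores n points as 2n doubles (x0, y0, x1, y1, ...) and only materializes
// a Point object when one is actually requested.
class CompactPointList extends AbstractList<Point> {
    private final double[] coords;   // packed x/y pairs
    private final GeometryFactory factory = new GeometryFactory();

    CompactPointList(double[] coords) {
        if (coords.length % 2 != 0)
            throw new IllegalArgumentException("expected x/y pairs");
        this.coords = coords;
    }

    public int size() {
        return coords.length / 2;
    }

    public Point get(int i) {
        // the returned geometry can be garbage-collected right after use,
        // so the resident cost stays at roughly 16 bytes per point
        return factory.createPoint(new Coordinate(coords[2 * i], coords[2 * i + 1]));
    }
}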

Thanks

>Sunburned Surveyor wrote:
>  
>
>>I wonder if it is worth discussing whether or not there will be a
>>reasonable and practical limit to the size of data that we will try to
>>support in OpenJUMP. (For example, a maximum of 1,000 layers and
>>2,000,000 features.)
>>
>>I realize this is a little tricky, because each computer will be able
>>to handle different loads on memory and processor, but I think the
>>basic concept might have merit. I question the wisdom of modifications
>>to the core made in support of these super huge datasets when this
>>probably isn't our typical use case.
>>
>>The reality is that most of our users probably aren't working with
>>huge datasets. I think we have to remember that there will always be a
>>practical limit to the amount of data a computer program can work
>>with.
>>
>>As an example, at my day job we don't expect AutoCAD to handle
>>millions of points that come from our Laser Scanner. We have very
>>specialized software built specifically to handle millions of points.
>>This software filters, screens and processes these millions of points
>>to produce data that normal AutoCAD can handle. (For example, it
>>produces a "plane" from a set of points collected on the surface of a
>>wall, or a cylinder from a set of points collected on the surface of a
>>pipe.)
>>
>>Perhaps a better approach to huge GIS datasets is special tools that
>>can modify the data to produce meaningful results that can be used in
>>OpenJUMP on most computers.
>>
>>Or maybe we have a "specialized" version of OpenJUMP built to work
>>with super huge data sets that eliminates a lot of the bells and
>>whistles that users of smaller data sets enjoy. Or maybe we have a
>>plug-in that reads in millions of points and creates a surface from the
>>results. Or maybe we have a plug-in that reads in giant shapefiles,
>>such as a shapefile of all the roads in Europe, and then "tiles" this
>>into smaller, more manageable shapefiles.
>>
>>At any rate, I don't think OpenJUMP can be all things to all people,
>>and it might be worth considering when we cross the line that requires
>>use of some programs designed especially for the use of huge datasets.
>>
>>Any comments?
>>
>>The Sunburned Surveyor
>>

