>
>
>@Paul: IIRC Larry and Michaël intern()ed the Strings
>       from DBF files to save a lot of memory.
>  
>

@Paul: I must admit I recently observed a side effect of String interning.
Loading a large shapefile (several hundred thousand points) with 
toponyms (every object with a different string) ended with an error 
about PermGen memory size.
Interned strings are located in a special memory space called PermGen, 
and in this case that space was not large enough (its size can be changed 
with -XX:MaxPermSize).
Maybe we'll have to change the code so it can switch between String and 
String.intern() depending on how many similar strings there are in an attribute.
Another way is to manage a pool manually, but I have no precise idea how 
to implement it.
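
As a very rough starting point, a manual pool might look like the sketch
below. It is untested against OpenJUMP, and StringPool and canonical() are
names made up for illustration. The idea is to keep the canonical strings
in an ordinary HashMap on the heap instead of calling String.intern(), so
the pool can simply be dropped once an attribute column has been loaded.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of OpenJUMP: a heap-based pool of canonical Strings. */
final class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the pooled instance equal to s, adding s to the pool on first sight. */
    String canonical(String s) {
        if (s == null) {
            return null;
        }
        String pooled = pool.get(s);
        if (pooled == null) {
            pool.put(s, s);
            pooled = s;
        }
        return pooled;
    }

    /** Number of distinct strings currently held. */
    int size() {
        return pool.size();
    }
}

class StringPoolDemo {
    public static void main(String[] args) {
        StringPool pool = new StringPool();
        // new String(...) forces distinct objects, as a file reader would produce
        String a = pool.canonical(new String("forest"));
        String b = pool.canonical(new String("forest"));
        System.out.println(a == b);      // true: both refer to one shared instance
        System.out.println(pool.size()); // 1
    }
}
```

With toponyms, where nearly every value is different, such a pool saves
nothing and only adds map overhead, so it would still make sense to use it
only for attributes with many repeated values.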

@Sascha : Seems like a good idea to avoid coordinate duplication in 
large Polygon datasets. I'm curious to see the effects on editing 
geometries (note that in some cases, the desired behaviour while editing 
adjacent polygons may be to move both vertices at the same time and in 
other cases only one).
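
To illustrate the two editing behaviours, here is a small sketch of what
sharing implies. Vertex and VertexCache are hypothetical stand-ins made up
for this mail (not the JTS Coordinate and not the code of the patch), and
the bound of 35000 is only borrowed from the patch quoted below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for a mutable coordinate with public fields (hypothetical, not the JTS class). */
final class Vertex {
    double x, y;

    Vertex(double x, double y) {
        this.x = x;
        this.y = y;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Vertex)) return false;
        Vertex v = (Vertex) o;
        return Double.compare(x, v.x) == 0 && Double.compare(y, v.y) == 0;
    }

    @Override public int hashCode() {
        return 31 * Double.hashCode(x) + Double.hashCode(y);
    }
}

/** Bounded cache that hands out one shared Vertex per distinct (x, y). */
final class VertexCache {
    private final Map<Vertex, Vertex> cache;

    VertexCache(final int maxEntries) {
        cache = new LinkedHashMap<Vertex, Vertex>() {
            @Override protected boolean removeEldestEntry(Map.Entry<Vertex, Vertex> e) {
                return size() > maxEntries; // drop the oldest entry once the bound is exceeded
            }
        };
    }

    Vertex share(double x, double y) {
        Vertex key = new Vertex(x, y);
        Vertex shared = cache.get(key);
        if (shared == null) {
            cache.put(key, key);
            shared = key;
        }
        return shared;
    }
}

class SharedVertexDemo {
    public static void main(String[] args) {
        VertexCache cache = new VertexCache(35000);
        Vertex[] ringA = { cache.share(0, 0), cache.share(1, 0) };
        Vertex[] ringB = { cache.share(1, 0), cache.share(2, 0) };
        System.out.println(ringA[1] == ringB[0]); // true: one object in both rings

        ringA[1].y = 5.0;               // in-place edit of polygon A
        System.out.println(ringB[0].y); // 5.0: polygon B moved as well

        ringA[1] = new Vertex(ringA[1].x, 7.0); // copy first, then only A moves
        System.out.println(ringB[0].y); // 5.0: polygon B is unaffected this time
    }
}
```

Editing a shared instance in place moves the vertex in both polygons, while
replacing the array slot with a copy moves only one. Note also that changing
x or y of an instance that is still a key in the cache changes its hash code,
so that entry could no longer be found by a lookup.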

@Martin : Can you please explain what immutability means for 
coordinates? I see that x, y, z are public fields (and I remember I often 
changed them via small scripts, especially the z value). But maybe I 
have no clear idea about immutability and its advantages.
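
To make the question concrete, here is my rough understanding as a sketch.
ImmutablePoint and withZ() are invented for illustration and are not JTS
API. An immutable coordinate would have final fields, so a script could not
assign z in place and would have to build a new object instead.

```java
/** Hypothetical immutable 3D point; the real JTS Coordinate has public mutable fields. */
final class ImmutablePoint {
    final double x;
    final double y;
    final double z;

    ImmutablePoint(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    /** Returns a new point with a different z; this instance is left untouched. */
    ImmutablePoint withZ(double newZ) {
        return new ImmutablePoint(x, y, newZ);
    }
}

class ImmutablePointDemo {
    public static void main(String[] args) {
        ImmutablePoint shared = new ImmutablePoint(1.0, 2.0, 0.0);
        ImmutablePoint a = shared; // vertex as seen by geometry A
        ImmutablePoint b = shared; // same vertex as seen by geometry B

        a = a.withZ(100.0);        // "changing" z yields a new object for A only
        System.out.println(a.z);   // 100.0
        System.out.println(b.z);   // 0.0: B still sees the original value
    }
}
```

If that is the right picture, the advantage would be that an object shared
between geometries can never change under one of them by accident.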

Michaël
-

>Regards,
>   Sascha
>
>Paul Austin schrieb:
>  
>
>>Another huge memory saving can be done by using String.intern() on 
>>string objects as they are immutable anyway. I think the latest VMs do 
>>some garbage collection on the intern cache so it's not a bad thing to do.
>>
>>Paul
>>
>>Martin Davis wrote:
>>    
>>
>>>I'm almost 100% sure that JUMP treats Coordinate objects as immutable 
>>>(at least in the core code.  I do know that at least one plugin I wrote 
>>>changes the Coordinates in Geometries - my bad!).  I think this should 
>>>be a firm design principle of JUMP - it's simply not worth the risk to 
>>>mutate Coordinates in-place.  The same goes for Geometries,  I think.  
>>>There are lots of benefits to having immutability, and lots of risks to 
>>>not having it.
>>>
>>>So your Coordinate-sharing idea should work.  Whether this really makes 
>>>much of an impact in the general use case I can't say - it's very 
>>>dependent on the nature of the data being loaded.  50% savings doesn't 
>>>seem like that much to me - but I guess that depends on whether you are 
>>>trying to load a 2 GB shapefile!
>>>
>>>Perhaps this should be called Coordinate-interning, referring to the 
>>>similar strategy that Java uses for String constants.
>>>
>>>Another possible option for providing memory savings is to take 
>>>advantage of the JTS CoordinateSequence facility, and use 
>>>PackedCoordinateSequences for raw Geometry storage.  This might give an 
>>>even bigger memory savings. But it would *definitely* require changes to 
>>>the core, since JUMP was mostly written before the JTS CS was 
>>>introduced, so the code assumes it can get down-and-dirty with the 
>>>Coordinate arrays in JTS. 
>>>
>>>Sascha L. Teichmann wrote:
>>>  
>>>      
>>>
>>>>Just for curiosity:
>>>>
>>>>When I load a larger polygon shapefile (burlulc)
>>>>I noticed that the geometries share a lot of
>>>>common vertices. In the case of the burlulc layer,
>>>>over 1,500,000.
>>>>They are represented by com.vividsolutions.jts.geom.Coordinate
>>>>objects. If a shapefile gets loaded a new Coordinate object
>>>>for each vertex is created.
>>>>
>>>>Now I added a simple TreeMap to the PolygonHandler of
>>>>OpenJUMP's shapefile reader to reuse already created
>>>>Coordinate objects and share them with other geometries.
>>>>
>>>>After loading the data (+ triggering GC) the normal OJ
>>>>uses approx. 124MB memory. After the shared vertices
>>>>modification OJ uses only approx. 89MB.
>>>>
>>>>My question: May this mod lead to any side effects?
>>>>With JTS? With the CursorTools?
>>>>
>>>>Coordinate objects are not immutable, so I expect
>>>>side effects with e.g. neighboring polygons when
>>>>I edit one of them.
>>>>
>>>>I had a brief look at the code and played with
>>>>the CursorTools but I haven't found any side effects
>>>>yet.
>>>>
>>>>This idea comes from playing with OJ on a boring
>>>>Friday evening. It only cost me a few seconds to
>>>>implement and if you say "This idea is plain stupid!"
>>>>I'll drop it immediately .. ;-)
>>>>
>>>>Kind regards,
>>>>Sascha
>>>>
>>>>-------------------------------------------------------------------------
>>>>This SF.net email is sponsored by DB2 Express
>>>>Download DB2 Express C - the FREE version of DB2 express and take
>>>>control of your XML. No limits. Just data. Click to get it now.
>>>>http://sourceforge.net/powerbar/db2/
>>>>_______________________________________________
>>>>Jump-pilot-devel mailing list
>>>>Jump-pilot-devel@lists.sourceforge.net
>>>>https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
>>>>
>>>>  
>>>>    
>>>>        
>>>>
>>>  
>>>      
>>>
>>    
>>
>>------------------------------------------------------------------------
>>
>>Index: src/org/geotools/shapefile/PolygonHandler.java
>>===================================================================
>>--- src/org/geotools/shapefile/PolygonHandler.java    (Revision 863)
>>+++ src/org/geotools/shapefile/PolygonHandler.java    (Arbeitskopie)
>>@@ -3,6 +3,8 @@
>> import java.io.IOException;
>> import java.lang.reflect.Array;
>> import java.util.ArrayList;
>>+import java.util.Map;
>>+import java.util.LinkedHashMap;
>> 
>> import com.vividsolutions.jts.algorithm.CGAlgorithms;
>> import com.vividsolutions.jts.algorithm.RobustCGAlgorithms;
>>@@ -16,7 +18,47 @@
>> public class PolygonHandler implements ShapeHandler{
>>     protected static CGAlgorithms cga = new RobustCGAlgorithms();
>>     int myShapeType;
>>-    
>>+
>>+             /**
>>+              * Coordinate only calcs hash over x and y.
>>+              * Extending it to hash z too.
>>+              */
>>+             private static final class Coord extends Coordinate {
>>+
>>+                     public Coord(double x, double y) {
>>+                             super(x, y);
>>+                     }
>>+
>>+                     public boolean equals(Object o) { // equals3D()
>>+                             Coord c = (Coord)o;
>>+                             return x == c.x 
>>+                                     &&   y == c.y
>>+                                     &&  (z == c.z || (Double.isNaN(z) && Double.isNaN(c.z)));
>>+                     }
>>+
>>+                     public int hashCode() {
>>+                             //Algorithm from Effective Java by Joshua Bloch [Jon Aquino]
>>+                             int result = 17;
>>+                             result = 37 * result + hashCode(x);
>>+                             result = 37 * result + hashCode(y);
>>+                             if (!Double.isNaN(z))
>>+                                     result = 37 * result + hashCode(z);
>>+                             return result;                          
>>+                     }
>>+             } // class Coord
>>+
>>+             /** This is the number of coordinates to store for comparison.
>>+              *  If the number of vertices is very large it would be
>>+              *  inefficient to store them all in a HashMap.
>>+              *  Limiting does not provide the optimal solution
>>+              *  but if some spatial coherence is given it does 
>>+              *  a good job.
>>+              */
>>+             public static final int MAX_COORDINATE_CACHE = 35000;
>>+
>>+             /** the coordinate cache */
>>+             protected LinkedHashMap coordinateCache;
>>+
>>     public PolygonHandler()
>>     {
>>         myShapeType = 5;
>>@@ -53,7 +95,7 @@
>>     public Geometry read( EndianDataInputStream file , GeometryFactory geometryFactory, int contentLength)
>>     throws IOException, InvalidShapefileException
>>     {
>>-    
>>+
>>      int actualReadWords = 0; //actual number of words read (word = 16bits)
>>         
>>        // file.setLittleEndianMode(true);
>>@@ -87,34 +129,73 @@
>>         
>>         partOffsets = new int[numParts];
>>         
>>-        for(int i = 0;i<numParts;i++){
>>-            partOffsets[i]=file.readIntLE();
>>-                     actualReadWords += 2;
>>+        for (int i = 0; i < numParts; i++) {
>>+                                     partOffsets[i]=file.readIntLE();
>>         }
>>+                             actualReadWords += (numParts << 1); // numParts * 2
>>         
>>         //LinearRing[] rings = new LinearRing[numParts];
>>         ArrayList shells = new ArrayList();
>>         ArrayList holes = new ArrayList();
>>-        Coordinate[] coords = new Coordinate[numPoints];
>>-        
>>-        for(int t=0;t<numPoints;t++)
>>-        {
>>-            coords[t]= new Coordinate(file.readDoubleLE(),file.readDoubleLE());
>>-                     actualReadWords += 8;
>>-        }
>>-        
>>-        if (myShapeType == 15)
>>-        {
>>-                //z
>>-            file.readDoubleLE();  //zmin
>>-            file.readDoubleLE();  //zmax
>>-                     actualReadWords += 8;
>>-             for(int t=0;t<numPoints;t++)
>>-            {
>>-                coords[t].z = file.readDoubleLE();
>>-                             actualReadWords += 4;
>>-            }
>>-        }
>>+
>>+                             if (coordinateCache == null) {
>>+                                     coordinateCache  = new LinkedHashMap(MAX_COORDINATE_CACHE-1) {
>>+                                             protected boolean removeEldestEntry(Map.Entry entry) {
>>+                                                     return size() > MAX_COORDINATE_CACHE;
>>+                                             }
>>+                                     };
>>+                             }
>>+
>>+        Coordinate [] coords = new Coordinate[numPoints];
>>+
>>+                             // Coordinate is not able to hash 3D so wrap
>>+                             // the coords in subclass Coord.
>>+                             // This produces a lot of temporary objects so 
>>+                             // this path is separated from the simple x,y case. :-/
>>+
>>+                             if (myShapeType == 15) { // with z
>>+
>>+                                     for (int t = 0; t < numPoints; ++t)
>>+                                             coords[t] = new Coord(
>>+                                                     file.readDoubleLE(),
>>+                                                     file.readDoubleLE());
>>+
>>+                                     actualReadWords += (numPoints << 3); // numPoints * 8
>>+                                     
>>+                                     file.readDoubleLE();  //zmin
>>+                                     file.readDoubleLE();  //zmax
>>+                                     actualReadWords += 8;
>>+
>>+                                     for (int t = 0; t < numPoints; ++t)
>>+                                             coords[t].z = file.readDoubleLE();
>>+
>>+                                     actualReadWords += (numPoints << 2); // numPoints * 4
>>+
>>+                                     for (int t = 0; t < numPoints; ++t) {
>>+                                             Coord c = (Coord)coords[t];
>>+                                             Coordinate shared = (Coordinate)coordinateCache.get(c);
>>+
>>+                                             if (shared == null)
>>+                                                     coordinateCache.put(c, shared = new Coordinate(c));
>>+
>>+                                             coords[t] = shared;
>>+                                     }
>>+                             }
>>+                             else { // without z -- directly use Coordinate
>>+                                     Coordinate coord = new Coordinate();
>>+                                     for (int t = 0; t < numPoints; ++t) {
>>+                                             coord.x = file.readDoubleLE();
>>+                                             coord.y = file.readDoubleLE();
>>+                                             Coordinate shared = (Coordinate)coordinateCache.get(coord);
>>+                                             if (shared == null) {
>>+                                                     coordinateCache.put(coord, coord);
>>+                                                     shared = coord;
>>+                                                     coord = new Coordinate();
>>+                                             }
>>+                                             coords[t] = shared;
>>+                                     }
>>+                                     actualReadWords += (numPoints << 3); // numPoints * 8
>>+                             }
>>       
>>         if (myShapeType >= 15)
>>         {
>>    
>>
>>------------------------------------------------------------------------
>>
>>    
>>

