Hi,

I just replaced the String.intern() mechanism used in DbfFile with a
temporary HashMap, to save heap memory.

Please tell me if you notice any strange behaviour in the shapefile reader.

For those of you interested in performance tuning, here is a short summary
of what we did to try to optimize the DbfFile reader:

Example: loading 500000 features with attribute A (500000 distinct
values) and attribute B (500000 identical values)

version 1 (original JUMP): loads 500000 strings for A + 500000 strings
for B into the JVM heap space
==> a big waste of memory in case B

version 2 (optimized by MM and Larry): loads 500000 strings into PermGen
memory for attribute A and 1 string into PermGen memory for attribute B
==> need to tune the maximum PermGen size when loading big shapefiles with
attributes like A
==> PermGen memory cannot be reclaimed (there is no way to free the PermGen
space if the layer is deleted)
==> in my experience, the try/catch block Larry added to fall back from
PermGen space to heap space in case of error did not prevent a PermGen
exception from being thrown and stopping the process
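To illustrate the version 2 approach (the actual DbfFile code is not shown
here), this is a minimal sketch of how String.intern() deduplicates equal
strings; on the JVMs of that era (pre-Java 7), the interned copies lived in
the fixed-size PermGen area, which is why it could overflow:

```java
// Hypothetical demo of String.intern()-based deduplication.
public class InternDemo {
    public static void main(String[] args) {
        // Two distinct heap objects with equal contents, as a DBF reader
        // would produce when the same attribute value repeats.
        String a = new String("attrB");
        String b = new String("attrB");

        System.out.println(a == b);                   // false: two objects
        System.out.println(a.intern() == b.intern()); // true: one canonical copy
    }
}
```

The canonical copies returned by intern() are shared JVM-wide and cannot be
released when one layer is closed, matching the drawback described above.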

version 3 (today): loads 500000 strings into the heap space for A and 1
string into the heap space for B
==> loading guarantees uniqueness of field values within a single layer only
(strings will be duplicated for each loaded layer)
==> all the strings are in the heap space (and can be freed if the layer
is deleted)
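The version 3 idea can be sketched roughly as follows (illustrative names,
not the actual DbfFile code): a per-layer HashMap maps each string to its
first occurrence, so equal values share one heap object, and dropping the
map with the layer lets the garbage collector reclaim everything:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-layer string deduplication with a temporary HashMap.
public class StringDedup {
    private final Map<String, String> cache = new HashMap<>();

    // Returns a canonical heap instance for each distinct string value.
    public String dedup(String s) {
        String canonical = cache.get(s);
        if (canonical == null) {
            cache.put(s, s);  // first occurrence becomes the canonical copy
            return s;
        }
        return canonical;
    }

    // After the layer is loaded, the map can be cleared or simply dropped;
    // the canonical strings then live and die with the layer.
    public void clear() {
        cache.clear();
    }

    public static void main(String[] args) {
        StringDedup dedup = new StringDedup();
        String a = new String("attrB");
        String b = new String("attrB");
        System.out.println(a == b);                         // false
        System.out.println(dedup.dedup(a) == dedup.dedup(b)); // true
    }
}
```

Because the map is scoped to one load, the uniqueness guarantee holds per
layer only, exactly as noted above.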

Michaël

_______________________________________________
Jump-pilot-devel mailing list
Jump-pilot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
