Hi Even,
I think that we can discuss the starting point for 2.0.
You start with idea "...to "refactor" a code base of 1.2 million lines.."
Maybe we have to start from white list - create ideal (brilliant) set of
classes and create a well-designed foundation for GDAL. Also rewrite
several wide used drivers (shp, tab, PostGIS, etc.). Than this will
works very good, begin some kind of hackathon for rewriting overs.
Also we can add some testing framework inside GDAL (e.g. googletest) and
start empty repo on github for GDAL 2.0.
The GDAL can be enhanced, we can move common things that use several
drivers to core or replace some core thing with outer libraries. CMake
maybe used as build system.
Yes this is deep refactoring, but maybe this is need for GDAL 2.0?
Best regards,
Dmitry
27.03.2014 14:24, Even Rouault пишет:
Hi,
Hi all,
MY opinion is that's not a really good way to count how many classes you
will have and trying to have minimal classes.
Drawing a model with UML is not easy : it's not easy to have a good model
from the beginning, it's not easy to name all members and methods
correctly, and the worst is inheritance.
You can think you spend too much time to make the model. But you can't say
if you will spend less or much time to change code because your model was
too simple.
The issue is that I'm trying to "refactor" a code base of 1.2 million lines of
C/C++ code (not mentionning the 140 000 lines of Python tests), and with 211
existing drivers. So, given that nobody will probably ever want to fund the
effort, it is completely excluded to have to edit each of those drivers in a non
automated way...
The end solution will necessary be a compromise. I would also like the existing
C API to continue working as much as possible.
Because I don't know all the model of Gdal and the model can be "reset" I
have some questions :
1) Why GDALMajorObject not inherit of GDALIMajorObject ?
It was just an omission in the UML diagram. My C++ sketch had this inheritance.
2) Why all classes inherit (directly or not) of GDALMajorObject ?
That's how it is done currently.
What do
you store in it ?
It contains a list of list of strings, or in more concrete terms, metadata
domains that contain metadata items.
Is it possible to put it as a member in classes ?
That could have been a way of doing it, yes. Always the debate composition vs
inheritance.
Have
you consider templates ?
Actually, no. I tend to avoid them unless they are an obvious solution.
3) An interface for Drivers ? geName(), getLongName(), getDescription() ?
Or maybe abstract ...
4) What is GDALDataset in your diagram ?
It is the base class that a driver that supports both raster and vector would
have to subclass.
5) Where is the link between Driver (Gdal/Ogr) and Dataset (Raster/Vector)
? Member, templates ?
A Driver instanciates a Dataset by opening an existing one, or by creating a new
one. But the Dataset object itself is a free-standing objects that is only owned
by user code, and not by the Driver.
For now, I've dropped my crazy UML diagram that nobody would be able to
understand, even if I managed to make it work, and I'm pursuing my attempt
just merging OGRDataSource inside GDALDataset. And make OGRDriver and
OGRSFDriverRegistrar mostly disappear, or just create a minimized version of
them, as compatibility layer for existing drivers, that forward the real work to
GDALDriver and GDALDriverManager.
Even
Florent
2014-03-25 21:14 GMT+01:00 Even Rouault <[email protected]>:
Selon Etienne Tourigny <[email protected]>:
I also (respectfully) think that there are too many classes (mostly the
abstract classes and interfaces) which make it a bit hard to understand.
I like Vincent's second suggestion to have 3 base Dataset classes
(GDALRasterDataset, GDALVectorDataset, GDALHybridDataset) which inherit
from GDALDataset, and an enum (vector,raster,hybrid) to easily query
which
type the Dataset is (or which type its driver supports). You then
subclass
the relevant Dataset class and it *should* work auto-magically for
rasters.
For example, a given Dataset which subclasses GDALDataset would be
migrated as a subclass of GDALRasterDataset, which would imply that it
does not have a vector dataset. If the implementation changes to allow
vector, then you would change its parent and implement relevant vector
methods.
My view on this is KISS with a minimum number of classes...
Yes, the practice shows that the UML brainstorming leads to nowhere (at
least
with my current limited knowledge of C++ multi-inheritance)... So actually
instead of creating new classes, I think that eventually there will be less
classes... I'm not even sure about the interest of having subclasses of
GDALDataset to distinguish between raster only, vector only or
raster+vector. As
I wrote previously that could be done as a driver metadata instread.
So, for now, I've just "imported" the usefull methods of OGRDataSource into
GDALDataset, and made a few changes here or there so that it works. Well
except
ogr_refcount.py, because of a funny discovery was that the ref counting of
GDALDataset and OGRDataSource is (was) not the same. Starts at 1 for
GDALDataset
and 0 for OGRDataSource...
The temporary result is in the following branch:
https://github.com/rouault/gdal-1/commits/unification
OGRDataSource is kept with just that :
class CPL_DLL OGRDataSource : public GDALDataset
{
public:
OGRDataSource();
virtual const char *GetName() = 0;
static void DestroyDataSource( OGRDataSource * );
OGRSFDriver *GetOGRDriver() const;
void SetOGRDriver( OGRSFDriver *poDriver );
virtual const char* GetDriverName();
protected:
friend class OGRSFDriverRegistrar;
OGRSFDriver *m_poOGRDriver;
};
And it could eventually completely disappear (or maybe just remain as a
convenience class if we don't want to touch existing drivers) if I manage
to
merge OGRSFDriver into GDALDriver, and get rid of OGRSFDriverRegistrar.
Even
cheers
Etienne
On Tue, Mar 25, 2014 at 9:13 AM, Vincent Mora
<[email protected]>wrote:
On 25/03/2014 11:44, Even Rouault wrote:
Selon Vincent Mora <[email protected]>:
On 24/03/2014 21:46, Even Rouault wrote:
Hi,
"Release soon, release early", so for people who like UML diagrams
(there
is
also a prototype of C++ classes for those who don't like UML very
much),
here's
a blog entry with the outcome of my thoughts for a possible
re-organisation
of
the GDAL/OGR class hierarchy
http://erouault.blogspot.ca/2014/03/draft-gdalogr-class-
hierarchy-for-gdal.html
I don't understand the need for HandleRasterData() and
HandleVectorData().
Isn't inheriting from GDALEmptyRasterDataset or
GDALEmptyVectorDataset
sufficient to convey the information that the class doesn't handle
those
kind of datasets.
A call to GetLayerCount()/GetRasterCount() from the class user is
sufficient to say "no vector/raster data here", and a dynamic _cast
to
GDALEmptyVectorDataset/GDALEmptyRasterDataset is even clearer to me,
so
HandleRasterData()/HandleVectorData() would be a third way to tell
the
same thing.
Actually, when thinking more about this, this kind of information
should
be at
the driver level, and not at the dataset level. The idea is that there
are use
cases where you want to know if a driver can handle only vector, only
raster, or
both. For example if you still want to have separate Open dialog
boxes for
raster or vector.
Also I'm not sure about the names GDALAbstract* since those are
partial
implementations.
I know I'm not very good at naming. Do you have an alternative
proposal ?
The
issue is that with my draft we end up with a lot of classes and
interfaces, so
it is not obvious to find a good name to reflect their content.
Partial sound ok to me, but I not so good at naming ?
Considering the number of classes, I gave it some thoughts this morning
and, like Dmitry, thought of merging the Abtract (Partial) into the
interface before realizing that, even if it works (you can overload de
default function in Empty) it's a bit ugly and I ended up preferring
your
solution.
An altenative would be to have a Dataset with both aspects and provide
three partial specialisations: one for vector (it will behave like
empty
raster) one for raster, and one for both. Code duplications could be
avoided by implementing protected member functions in the Dataset
class and
simply calling them in the implementations of partial specialization.
This
solution avoid the diamond shaped inheritance diagram of hell :)
Even
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev