On Tue, Mar 30, 2010 at 9:36 AM, Dimitri Fontaine <dfonta...@hi-media.com> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>
>> On Tue, Mar 30, 2010 at 12:56 AM, Anindya Jyoti Roy <anind...@iitk.ac.in>
>> wrote:
>>> As Jeff Davis pointed out, I followed the modification he suggested and now
>>> I want to have a basic matching only. I think atleast the fingerprint
>>> processing can be done in summer (if not the image processing). Is it a good
>>> GSoC project now?
>>
>> I'm not sure. Can you provide a more detailed design?
>
> Apply the following to fingerprint searches ?
>
> http://www.postgresql.org/docs/current/static/gist-implementation.html
> http://wiki.postgresql.org/wiki/Image:Prato_2008_prefix.pdf
>
> I guess that what remains to be defined is how you get those
> fingerprint, what the datatype is named, is it fixed size or varlena,
> what operators you want to make available, and which will have index
> support. That's GiST + GIN, right ?

Well, yeah. I think the fingerprinting and operator support are the
real questions. My fear is that the student who is asking this
question does not really have a good handle on that aspect of the
project. Maybe I'm wrong. However the description that was given
was:

2> the database search engine will be able to search for image also
3> it will list the matching images in the order of degree of match.
4> in this matching system I will likely use the system of dividing the image into important parts and match them.

That's pretty vague. If someone came and said, I'm going to use XYZ
system from the following academic papers, that would inspire a lot
more confidence, at least for me. Also I think this item from the
original email reflects a fundamental misunderstanding of how this
would integrate into PostgreSQL:

5> The database will also contain fingerprints, that may be the primary key.

Again, if the student had said, the XYZ system above will work well
with GIN indexing because we can construct the posting lists like
thus-and-so, or if they had said, it will work well with GIST because
there is a similarity metric we can use to construct the penalty and
picksplit functions, I would feel a lot better. But the description
given is so general in terms of both what is to be done on the image
processing side and what is to be done on the PostgreSQL side that I
am afraid that the student is going to be in far too deep. Compare
this description with the one from the student who wants to implement
JSON support - that sounds a whole lot closer to something that
someone (perhaps him) could sit down and code.

My point here is not to discourage anyone or turn them off on trying
to submit a GSoC project related to PostgreSQL. Indeed, I really hope
they do.
But it will benefit the project much more if the projects are small
and successful than it will if they are large and not successful, or
successful according to some metric but not actually producing code
that will be widely used or merged into core.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers