Short: I'm looking for a front-end tool to help me process (qualify and classify / catalog) about 10,000 scanned images.
Long:

Hi,

I've promised someone to process a bunch (10,000) of images, and am realizing it might not be as simple as I hoped. I'm soliciting, in this forum of image processing experts, experiences and suggestions for possible solutions. I am a complete newbie to image processing, with zero experience (I know Gimp exists), so please don't assume that I know what I want. I only have a rough idea of what the ideal target should look like.

Before I start, a meta-question: is this a good forum to ask this in? In which forum(s) would it be good to ask these types of questions? (I've noticed that gimp-perl has on average only 1.5 posts/month.)

I'm looking for some tool(s) that would help me qualify and classify / catalog a bunch of images. I can easily build the database structures I need in MySQL or PostgreSQL myself. I'm having trouble finding the frontend tool which would allow me to view (and manipulate a little) the image and would be an effective data entry tool for these qualifications / classifications. I'm not so concerned at this point about viewing the images once they are all processed, although I imagine that the same tool might also be used for viewing them at the end.

I would prefer to run it all on Linux, but would settle for a Windows front-end, if necessary. I know Javascript, PHP & Perl if some integrating is needed. (I also know C++ and Java, but would prefer not to use them for this, if possible.)

I have about 4,000 photos w/EXIF info, but I'll write about those at the bottom of this post. More importantly, I have about 7,000 B&W (dithered) scans of docs of various contrasts (sometimes light gray); all of them are text (no photos). I currently have nearly all of them (99%) in pixel format (png and pbm); about 1% are jpegs. None are multipage scans, all are single-image scans.
I need to classify them in several "dimensions", but the elements / attributes of those dimensions may vary based on the type of content the document carries. I need to build a searchable database, so I can find them by specifying criteria in one or more dimensions. E.g. "all expense docs from `Botanical Gardens' involving period June 23, 2003 to July 23, 2003", and a set of 140 image files would fall out for display / browsing.

I would really hope to have a frontend which would be fully controllable via kbd, just because kbd is so much faster to use than mouse (for most things (*1)):

  Key  Meaning                                                   Flag
  a    "This is another page from the same doc. Write it into
        the DB and display next scan."
  b    "This page is blank - doesn't contain any information."   6.1
  n    "This scan is a page of another doc. Close the previous
        logical doc."
  d    "Add this page to a doc that has been created before."
  s    "Start a new doc."
  f    "This page pertains to finances."                         6.2
  c    "This page pertains to finances / income."                6.2.2
  e    "This page pertains to finances / expense."               6.2.1
  l    "This page pertains to legal."                            6.3
  i    "This page pertains to info."                             6.4
  ...

(*1) - mouse comes in handy for only two actions: see G4 and G7 below

So I guess I would be looking for a "graphical engine", or "display engine", capable of (hopefully fast) display and manipulation of images. A separate zoomed window for fine navigation would be a nice extra. It would be nice if it had combo boxes for choosing / adding items (see dimension 4 below), where the selection of items narrows down as you type lookup codes / starting letters of the entities.
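To make the keyboard-driven entry a bit more concrete, here is a minimal sketch in Python of how the key table above could drive the grouping of scans into logical docs. Everything in it is an assumption for illustration: the names `FLAG_KEYS` and `apply_keys` are made up, `s` and `n` are both treated as "start a new logical doc with this scan", and `d` is left out because it needs a way to pick an earlier doc. It is not tied to any existing tool.

```python
# Hypothetical key bindings for the flag keys in the key table above.
FLAG_KEYS = {
    "b": "blank",
    "f": "financial",
    "c": "financial/income",
    "e": "financial/expense",
    "l": "legal",
    "i": "info",
}


def apply_keys(events):
    """Group scans into logical docs from (scan_name, key) pairs.

    Assumed semantics: 's' and 'n' start a new logical doc with this scan,
    'a' appends the scan to the current doc, and flag keys tag the scan
    without changing the grouping.
    """
    docs = []   # list of logical docs, each a list of scan names
    flags = {}  # scan name -> set of flag names
    for scan, key in events:
        if key in ("s", "n"):
            docs.append([scan])
        elif key == "a":
            if not docs:
                docs.append([])  # no doc open yet: open one implicitly
            docs[-1].append(scan)
        elif key in FLAG_KEYS:
            flags.setdefault(scan, set()).add(FLAG_KEYS[key])
        else:
            raise ValueError(f"unbound key: {key!r}")
    return docs, flags


# Example session: two pages of a financial doc, then a new doc, then a blank page.
session = [
    ("p1.png", "s"), ("p1.png", "f"),
    ("p2.png", "a"),
    ("p3.png", "n"),
    ("p4.png", "b"),
]
docs, flags = apply_keys(session)
```

A real front-end would write each decision to the database right away instead of collecting it in memory; the point here is only that one keypress per decision maps onto a very small dispatch table.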
If worse comes to worst, I would settle for this whole thing being done in Javascript (I found that it is possible to draw lines / rectangles in Javascript - see maptuit.com: http://tremblant.www.maptuit.com/corporate/testdrive/getamap.html). But using a browser and Javascript for this image manipulation would be terribly slow, probably ugly, and I would hate it if I had to use MS Explorer's extensions :-( Not to mention that I have no idea how I could do an 8x-zoom popup with mouse-fine-control in Javascript. Plus browsers don't really allow for easy image panning.

Below is what I think my wishlist should be. But then again, I'm new to image processing ...

Thanks,
John

This is what I imagine the graphical engine should be able to do:

  G1  fit-to-window
  G2  fit-width-to-window
  G3  1-to-1 pixel zoom
  G4  8-to-1 pixel zoom (in a smaller window - see G7)
  G5  mouse movement in the above three items moves the image, so the
      whole page could be quickly visually scanned for defects
  G6  ability to specify areas (mostly rectangular, possibly occasionally
      rotated) of an image
      [ this would tango with the system feature 2.5 below - ability to
        treat these areas as separate scans (as pieces of different
        documents) ]
  G7  fine-navigation: nice extra: when the Control key or something is
      pressed, a fine-navigation (8x zoomed) window pops up on the side,
      and mouse movement is 8x finer - allows for specifying a fine
      rotation angle (1.7.2) by means of clicking on two points which
      *should* be in a straight horizontal or vertical line on the
      original
  G8  another nice extra: "increase contrast" algorithm - in a B/W or
      dithered picture: draw a 2 or 3-pixel wide line between pixels that
      are less than distance X apart (this will enhance). This is just my
      formulation of what a "contrast enhancing" algorithm should do.
      Or another algorithm with similar effect: if a pixel has another
      pixel less than distance X away, turn other pixels black in its 2
      or 3-pixel diameter.
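To pin down what I mean in G7 and G8, here is a rough sketch in plain Python (no imaging library). It is only my reading of the two ideas, under these assumptions: the image is a list of rows of 0 (white) / 1 (black); `skew_angle` and `thicken` are made-up names; the angle is measured against the horizontal in image coordinates (y grows downward) and rounded to 0.1 degree as in 1.7.2; and the "2 or 3-pixel diameter" is approximated by a square block around the pixel. This is the second G8 variant, not an established contrast algorithm.

```python
import math


def skew_angle(p1, p2):
    """Angle in degrees (rounded to 0.1) between the line p1->p2 and the horizontal.

    p1 and p2 are (x, y) points clicked on a line that *should* be horizontal.
    """
    (x1, y1), (x2, y2) = p1, p2
    return round(math.degrees(math.atan2(y2 - y1, x2 - x1)), 1)


def thicken(pixels, max_dist=3, radius=1):
    """Return a copy of a 0/1 image in which close-together black pixels are thickened.

    A black pixel that has another black pixel closer than max_dist
    (Euclidean distance) gets every pixel within `radius` of it turned
    black; radius=1 gives a 3x3 block. Isolated specks are left alone.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    black = {(x, y) for y in range(height) for x in range(width) if pixels[y][x]}
    out = [row[:] for row in pixels]  # do not modify the input
    reach = int(max_dist)
    for x, y in black:
        has_neighbour = any(
            (dx or dy)  # skip the pixel itself
            and dx * dx + dy * dy < max_dist * max_dist
            and (x + dx, y + dy) in black
            for dx in range(-reach, reach + 1)
            for dy in range(-reach, reach + 1)
        )
        if not has_neighbour:
            continue
        for ny in range(max(0, y - radius), min(height, y + radius + 1)):
            for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                out[ny][nx] = 1
    return out
```

Leaving isolated pixels untouched is deliberate: in a dithered scan a lone dot is more likely noise than part of a letter, so only dots that already have company get fattened. Whether a positive angle means rotating the scan clockwise or counter-clockwise to correct it would depend on the tool doing the rotation.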
The dimensions would be:

  1  picture quality dimension:
     1.2  resolution: 300? 600? other?
     1.3  lineart or dithered?
     1.4  legible scan?
     1.5  is the whole page scanned, or are parts / edges missing?
     1.6  needs re-scan?
     1.7  needs post-processing?
          1.7.1  rotation by X*90 degrees
          1.7.2  rotation by Y*0.1 degrees
          1.7.3  increasing "contrast" (difficult with B&W/dithered pics)

  2  document structure dimension:
     (2.1 to 2.3 erased)
     2.4   which scan is the chapter title page, if any?
     2.5   if one scan contains more than one logical document, how does
           the scan divide into areas containing them?
     2.6   which library does it belong to?
     2.7   which shelf within the library does it belong to?
     2.8   which volume of books on that shelf does it belong to?
     2.9   which book in that volume does it belong to?
     2.10  which chapter in that book does it belong to?
     2.11  which page of the chapter is it?
     2.12  which side of that page is it?

  3  time dimension:
     3.1  date & time
     3.2  period (from date to date)
     3.3  expiry date
     3.4  other date

  4  entities dimension:
     4.1  from which entity?    [ choose from / add to list of entities ]
     4.2  to which entity?      [ choose from / add to list of entities ]
     4.3  publishing entity?    [ choose from / add to list of entities ]
     4.4  from which address?   [ choose from / add to list of addresses ]
     4.5  to which address?     [ choose from / add to list of addresses ]

  5  values:
     5.1  ID1
     5.2  ID2
     5.3  title
     5.4  subject
     5.5  value1
     5.6  value2
     5.7  value3

  6  flags:
     6.1  blank page?
     6.2  financial?
          6.2.1  expense?
          6.2.2  income?
     6.3  legal?
     6.4  informational?
     6.5  expired?

  7  ownership / responsibility for this doc:
     7.1  Jack's group
          7.1.1  Jack
          7.1.2  Peter
     7.2  Mary's group
          7.2.1  Mary
          7.2.2  Dennis

Then I have about 4,000 JPEG color pics, most of them w/EXIF data.
With these, there may be additional qualifications, plus some of the above may not apply:

     1.8  rating of quality of composition (capturing the intended subject)
     1.9  rating of technical quality
          1.9.1  focused
          1.9.2  not shaken (when tripod not used)
          1.9.3  proper lighting / timing / contrast

and then sorting them into categories:

  8  category
     8.1  trees
          8.1.1  indoor
          8.1.2  outdoor
     8.2  bushes
     8.3  tools
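To show the kind of search I have in mind (the "all expense docs from `Botanical Gardens' involving period June 23, 2003 to July 23, 2003" example), here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a stand-in for MySQL or PostgreSQL, and the table names, column names and sample rows are all made up for illustration. A doc "involves" the period when its own period (3.2) overlaps the requested one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE doc (
    id          INTEGER PRIMARY KEY,
    from_entity INTEGER REFERENCES entity(id),  -- dimension 4.1
    period_from TEXT,                           -- dimension 3.2, ISO dates
    period_to   TEXT,
    is_expense  INTEGER NOT NULL DEFAULT 0      -- flag 6.2.1
);
CREATE TABLE scan (
    file   TEXT PRIMARY KEY,
    doc_id INTEGER REFERENCES doc(id),
    page   INTEGER                              -- dimension 2.11
);
""")

# Made-up sample rows, only so the query has something to return.
conn.executemany("INSERT INTO entity VALUES (?, ?)",
                 [(1, "Botanical Gardens"), (2, "City Hall")])
conn.executemany("INSERT INTO doc VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2003-06-01", "2003-06-30", 1),  # overlaps the requested period
    (2, 1, "2003-08-01", "2003-08-31", 1),  # same entity, too late
    (3, 2, "2003-07-01", "2003-07-15", 1),  # other entity
    (4, 1, "2003-07-01", "2003-07-10", 0),  # not an expense
])
conn.executemany("INSERT INTO scan VALUES (?, ?, ?)", [
    ("a1.png", 1, 1), ("a2.png", 1, 2),
    ("b1.png", 2, 1), ("c1.png", 3, 1), ("d1.png", 4, 1),
])


def find_expense_scans(conn, entity_name, period_from, period_to):
    """Return scan files of expense docs from an entity whose period overlaps the given one."""
    rows = conn.execute(
        """
        SELECT scan.file
        FROM scan
        JOIN doc ON doc.id = scan.doc_id
        JOIN entity ON entity.id = doc.from_entity
        WHERE entity.name = ?
          AND doc.is_expense = 1
          AND doc.period_from <= ?   -- doc starts before the query period ends
          AND doc.period_to >= ?     -- doc ends after the query period starts
        ORDER BY doc.id, scan.page
        """,
        (entity_name, period_to, period_from),
    )
    return [row[0] for row in rows]


hits = find_expense_scans(conn, "Botanical Gardens", "2003-06-23", "2003-07-23")
```

Storing dates as ISO strings (YYYY-MM-DD) is what lets the plain `<=` / `>=` comparisons work here; with MySQL or PostgreSQL a real DATE column would do the same job.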