There is a Perl program called exiftool that can read and write EXIF data without loading the image data (or at least without decoding it). I don’t know whether it would be faster than loading image data/properties with ImageIO. You could write a Perl script that uses a bundled copy of exiftool to read the EXIF data for many files in one run and output the results in a format your program can handle; launching perl/exiftool in a separate NSTask for each image would probably be pretty slow.
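To make the batching idea concrete, here is a rough sketch of such a helper, written in Python rather than Perl purely for illustration. It assumes an exiftool binary is reachable on the PATH (or bundled and passed in), and it relies on exiftool's -json and -r options plus the DateTimeOriginal and CreateDate tag names, which as far as I know is what exiftool calls the EXIF "digitized" date. The function names parse_exiftool_json and read_dates are made up for this example, and I have not timed any of this against ImageIO.

```python
import json
import subprocess


def parse_exiftool_json(text):
    """Map each file path to the date tags exiftool reported for it."""
    dates = {}
    for entry in json.loads(text):
        # exiftool's JSON output is a list with one object per file;
        # "SourceFile" holds the path, and missing tags are simply absent.
        dates[entry["SourceFile"]] = {
            "DateTimeOriginal": entry.get("DateTimeOriginal"),
            "CreateDate": entry.get("CreateDate"),
        }
    return dates


def read_dates(folder, exiftool="exiftool"):
    """Run exiftool ONCE over a whole folder instead of once per image."""
    result = subprocess.run(
        [exiftool, "-json", "-r", "-DateTimeOriginal", "-CreateDate", folder],
        capture_output=True,
        text=True,
    )
    # exiftool may exit non-zero if some files could not be read, so the
    # exit status is not treated as fatal here; only the output is parsed.
    if not result.stdout.strip():
        return {}
    return parse_exiftool_json(result.stdout)
```

The app would then launch the helper once per folder (for example from a single NSTask), read the resulting mapping, and store it in its on-disk cache.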
Jim Crate

> On Jan 7, 2023, at 2:07 PM, Alex Zavatone via Cocoa-dev <cocoa-dev@lists.apple.com> wrote:
>
> Hi Gabe. I’d add basic logging before you start each image and after you
> complete each image to see how long each one takes in each of your problem
> tests, so you can see the extent of how slow it is on your problem platforms.
>
> Then you can add more logging to expose the problems and start to address
> them once you see where the bottlenecks are.
>
> I wonder if there is a method to load the EXIF data out of the files without
> opening them completely. That would seem like the ideal approach.
>
> Cheers,
> Alex Zavatone
>
>> On Jan 7, 2023, at 12:36 PM, Gabriel Zachmann <z...@cs.uni-bremen.de> wrote:
>>
>> Hi Alex, hi everyone,
>>
>> thanks a lot for the many suggestions!
>> And sorry for following up on this so late!
>> I hope you are still willing to engage in this discussion.
>>
>> Yes, Alex, I agree in that the main question is:
>> how can I get the metadata of a large amount of images (say, 100k-300k)
>> *without* actually loading the whole image files.
>> (For your reference: I am interested in the date tags embedded in the EXIF
>> dictionary, and those dates will be read just once per image, then cached in
>> a dictionary containing filename & dates, and that dictionary will get
>> stored on disk for future use by the app.)
>>
>>> CGImageSourceRef imageSourceRef =
>>>     CGImageSourceCreateWithURL((CFURLRef)imageUrl, NULL);
>>
>> I have tried this:
>>
>> for ( NSString* filename in imagefiles )
>> {
>>     NSURL * imgurl = [NSURL fileURLWithPath: filename isDirectory: NO];
>>     CGImageSourceRef sourceref = CGImageSourceCreateWithURL( (__bridge CFURLRef) imgurl, NULL );
>> }
>>
>> This takes 1 minute for around 300k images stored on my internal SSD.
>> That would be OK.
>>
>> However! .. if performed on a folder stored on an external hard disk, I get
>> the following timings:
>>
>> - 20 min for 150k images (45 GB)
>> - 12 min for 150k images (45 GB), second time
>> - 150 sec for 25k images (18 GB)
>> - 170 sec for 25k images (18 GB), with the lines below (*)
>> - 80 sec for 22k images (3 GB)
>> - 80 sec for 22k images (3 GB), with the lines below (*)
>>
>> All experiments were done on different folders on the same hard disk, WD
>> MyPassport Ultra, 1 TB, USB-A connector to MacBook Air M2.
>> Timings with the same number of files/GB were taken on the same folder,
>> respectively.
>>
>> (*): these were timings where I added the following lines to the loop:
>>
>>     CFDictionaryRef fileProps = CGImageSourceCopyPropertiesAtIndex( sourceref, 0, NULL );
>>     bool success = CFDictionaryGetValueIfPresent( fileProps,
>>         kCGImagePropertyExifDictionary, (const void **) & exif_dict );
>>     CFDictionaryGetValueIfPresent( exif_dict,
>>         kCGImagePropertyExifDateTimeDigitized, (const void **) & dateref );
>>     iso_date = [isoDateFormatter_ dateFromString: (__bridge NSString * _Nonnull)(dateref) ];
>>     [datesAndTimes_ addObject: iso_date ];
>>
>> (Plus some error checking, which I omit here.)
>>
>> First of all, we can see that the vast majority of time is spent on
>> CGImageSourceCreateWithURL().
>> Second, there seem to be some caching effects, although I have a hard time
>> understanding that, but that is not the point.
>> Third, the durations are not linear; I guess it might have something to do
>> with the sizes of the files, too, but again, didn't investigate further.
>>
>> So, it looks to me like CGImageSourceCreateWithURL() really loads the
>> complete image file.
>>
>> I don't see why Ole Begemann (ref'ed in Alex' post) can claim his approach
>> does not load the whole image.
>>
>>
>> Some people suggested parallelizing the whole task, using
>> dispatch_queue_create or NSOperationQueue.
>> (Thanks Steve, Gary, Jack!)
>> Before restructuring my code for that, I would like to better understand why
>> you think that will speed up things.
>> The code above pretty much does no computations, so most of the time is, I
>> guess, spent on waiting for the data to arrive from hard disk.
>> So, why would several threads loading those images in parallel help
>> here? In my thinking, they will just compete for the same resource, i.e.,
>> hard disk.
>>
>>
>> I also googled quite a bit, to no avail.
>>
>> Any and all hints, suggestions, and insights will be highly appreciated!
>> Best, Gab
>>
>>
>>> if (!imageSourceRef)
>>>     return;
>>>
>>> CFDictionaryRef props = CGImageSourceCopyPropertiesAtIndex(imageSourceRef, 0, NULL);
>>>
>>> NSDictionary *properties = (NSDictionary*)CFBridgingRelease(props);
>>>
>>> if (!properties) {
>>>     return;
>>> }
>>>
>>> NSNumber *heightNumber = [properties objectForKey:@"PixelHeight"];
>>> NSNumber *widthNumber = [properties objectForKey:@"PixelWidth"];
>>> int height = 0;
>>> int width = 0;
>>>
>>> if (heightNumber) {
>>>     height = [heightNumber intValue];
>>> }
>>> if (widthNumber) {
>>>     width = [widthNumber intValue];
>>> }
>>>
>>>
>>> Or this link by Ole Begemann?
>>>
>>> https://oleb.net/blog/2011/09/accessing-image-properties-without-loading-the-image-into-memory/
>>>
>>> I love these questions. I find out more about iOS programming by
>>> researching other people’s problems than the ones that I’m currently faced
>>> with.
>>>
>>> Hopefully some of these will help.
>>>
>>> Cheers,
>>> Alex Zavatone
>>
>
> _______________________________________________
>
> Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/cocoa-dev/jim%40quevivadev.com
>
> This email sent to j...@quevivadev.com