There is a Perl program called ExifTool (exiftool) that can read and write 
EXIF data without loading the image data (or at least it doesn’t decode the 
image data). I don’t know whether it would be faster than loading image 
data/properties with ImageIO. Because instantiating perl/exiftool repeatedly 
for each image in a separate NSTask would probably be pretty slow, you could 
write a Perl script that uses your bundled exiftool to load the EXIF data for 
many files at once and output the results in a format your program can handle. 
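
A rough, untested sketch of the "one process for many files" idea, in case it 
helps. It assumes exiftoolURL points at your bundled copy of exiftool (that 
location is hypothetical), that the images live under a single directory, and 
that you want the DateTimeOriginal tag. The function name is made up for 
illustration. -json, -r and -DateTimeOriginal are regular exiftool options, 
and each JSON record carries a "SourceFile" key.

```objc
#import <Foundation/Foundation.h>

// Launches exiftool ONCE for a whole directory tree and returns a dictionary
// mapping file path -> DateTimeOriginal string.
// Assumption: exiftoolURL is the (hypothetical) location of a bundled exiftool.
static NSDictionary<NSString *, NSString *> *
ReadExifDatesInDirectory(NSURL *exiftoolURL, NSString *directoryPath, NSError **error)
{
    NSTask *task = [[NSTask alloc] init];
    task.executableURL = exiftoolURL;
    // -json: emit a JSON array, -r: recurse into subdirectories,
    // -DateTimeOriginal: extract only this tag.
    task.arguments = @[ @"-json", @"-r", @"-DateTimeOriginal", directoryPath ];

    NSPipe *pipe = [NSPipe pipe];
    task.standardOutput = pipe;
    task.standardError = [NSFileHandle fileHandleWithNullDevice];

    if (![task launchAndReturnError:error]) {
        return nil;
    }

    // Drain the pipe before waiting, so a full pipe cannot stall the child.
    NSData *output = [pipe.fileHandleForReading readDataToEndOfFile];
    [task waitUntilExit];

    id records = [NSJSONSerialization JSONObjectWithData:output options:0 error:error];
    if (![records isKindOfClass:[NSArray class]]) {
        return nil;
    }

    NSMutableDictionary<NSString *, NSString *> *dates = [NSMutableDictionary dictionary];
    for (id record in (NSArray *)records) {
        if (![record isKindOfClass:[NSDictionary class]]) continue;
        id file = record[@"SourceFile"];
        id date = record[@"DateTimeOriginal"];
        // Files without the tag simply have no DateTimeOriginal key.
        if ([file isKindOfClass:[NSString class]] && [date isKindOfClass:[NSString class]]) {
            dates[file] = date;
        }
    }
    return dates;
}
```

The resulting dictionary could then be written to disk as the cache. Whether 
this beats ImageIO on an external disk is something you would have to 
measure; I have not.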

Jim Crate


> On Jan 7, 2023, at 2:07 PM, Alex Zavatone via Cocoa-dev 
> <cocoa-dev@lists.apple.com> wrote:
> 
> Hi Gabe.  I’d add basic logging before you start each image and after you 
> complete each image to see how long each one takes in each of your problem 
> tests, so you can see just how slow it is on your problem platforms.
> 
> Then you can add more logging to expose the problems and start to address 
> them once you see where the bottlenecks are.
> 
> I wonder if there is a method to load the EXIF data out of the files without 
> opening them completely.  That would seem like the ideal approach.
> 
> Cheers,
> Alex Zavatone
> 
>> On Jan 7, 2023, at 12:36 PM, Gabriel Zachmann <z...@cs.uni-bremen.de> wrote:
>> 
>> Hi Alex, hi everyone,
>> 
>> thanks a lot for the many suggestions!
>> And sorry for following up on this so late!
>> I hope you are still willing to engage in this discussion.
>> 
>> Yes, Alex, I agree that the main question is:
>> how can I get the metadata of a large number of images (say, 100k-300k) 
>> *without* actually loading the whole image files?
>> (For your reference: I am interested in the date tags embedded in the EXIF 
>> dictionary, and those dates will be read just once per image, then cached in 
>> a dictionary containing filename & dates, and that dictionary will get 
>> stored on disk for future use by the app.)
>> 
>>> CGImageSourceRef imageSourceRef = 
>>> CGImageSourceCreateWithURL((CFURLRef)imageUrl, NULL);
>> 
>> I have tried this:
>> 
>>  for ( NSString* filename in imagefiles ) 
>>  {
>>     NSURL * imgurl = [NSURL fileURLWithPath: filename isDirectory: NO];
>>     CGImageSourceRef sourceref = CGImageSourceCreateWithURL(
>>         (__bridge CFURLRef) imgurl, NULL );
>>     if ( sourceref != NULL )
>>         CFRelease( sourceref );   // Create rule: the caller owns the source
>>  }
>> 
>> This takes 1 minute for around 300k images stored on my internal SSD.
>> That would be OK.
>> 
>> However! .. if performed on a folder stored on an external hard disk, I get 
>> the following timings:
>> 
>>         - 20 min for 150k images (45 GB) 
>>         - 12 min for 150k images (45 GB), second time
>>         - 150 sec for 25k images (18 GB)
>>         - 170 sec for 25k images (18 GB), with the lines below (*)
>>         - 80 sec for 22k images (3 GB)
>>         - 80 sec for 22k images (3 GB), with the lines below (*)
>> 
>> All experiments were done on different folders on the same hard disk, WD 
>> MyPassport Ultra, 1 TB, USB-A connector to Macbook Air M2.
>> Timings with the same file count and size were measured on the same folder.
>> 
>> (*): these were timings where I added the following lines to the loop:
>> 
>>       CFDictionaryRef fileProps = CGImageSourceCopyPropertiesAtIndex(
>>           sourceref, 0, NULL );
>>       bool success = CFDictionaryGetValueIfPresent( fileProps,
>>           kCGImagePropertyExifDictionary, (const void **) & exif_dict );
>>       CFDictionaryGetValueIfPresent( exif_dict,
>>         kCGImagePropertyExifDateTimeDigitized, (const void **) & dateref );
>>       iso_date = [isoDateFormatter_ dateFromString:
>>           (__bridge NSString * _Nonnull)(dateref) ];
>>       [datesAndTimes_ addObject: iso_date ];
>>       CFRelease( fileProps );   // Copy rule: the caller owns the dictionary
>> 
>> (Plus some error checking, which I omit here.)
>> 
>> First of all, we can see that the vast majority of time is spent on 
>> CGImageSourceCreateWithURL().
>> Second, there seem to be some caching effects, although I have a hard time 
>> understanding them, but that is not the point.
>> Third, the durations do not scale linearly with the number of files; I 
>> guess it might have something to do with the sizes of the files, too, but 
>> again, I didn't investigate further.
>> 
>> So, it looks to me like CGImageSourceCreateWithURL() really loads the 
>> complete image file.
>> 
>> I don't see why Ole Begemann (ref'ed in Alex's post) can claim his approach 
>> does not load the whole image.
>> 
>> 
>> Some people suggested parallelizing the whole task, using 
>> dispatch_queue_create or NSOperationQueue.
>> (Thanks Steve, Gary, Jack!)
>> Before restructuring my code for that, I would like to better understand why 
>> you think that will speed up things.
>> The code above does hardly any computation, so most of the time is, I 
>> guess, spent waiting for the data to arrive from the hard disk.
>> So, why would several threads loading those images in parallel help 
>> here? In my thinking, they will just compete for the same resource, i.e., 
>> the hard disk.
>> 
>> 
>> I also googled quite a bit, to no avail.
>> 
>> Any and all hints, suggestions, and insights will be highly appreciated!
>> Best, Gab
>> 
>> 
>>> 
>> 
>> 
>>> if (!imageSourceRef)
>>> return;
>>> 
>>> CFDictionaryRef props = CGImageSourceCopyPropertiesAtIndex(imageSourceRef, 
>>> 0, NULL);
>>> CFRelease(imageSourceRef);  // the source is no longer needed
>>> 
>>> NSDictionary *properties = (NSDictionary*)CFBridgingRelease(props);
>>> 
>>> if (!properties) {
>>> return;
>>> }
>>> 
>>> NSNumber *heightNumber = [properties objectForKey:@"PixelHeight"];
>>> NSNumber *widthNumber = [properties objectForKey:@"PixelWidth"];
>>> int height = 0;
>>> int width = 0;
>>> 
>>> if (heightNumber) {
>>> height = [heightNumber intValue];
>>> }
>>> if (widthNumber) {
>>> width = [widthNumber intValue];
>>> }
>>> 
>>> 
>>> Or this link by Ole Begemann?
>>> 
>>> https://oleb.net/blog/2011/09/accessing-image-properties-without-loading-the-image-into-memory/
>>> 
>>> I love these questions.  I find out more about iOS programming by 
>>> researching other people’s problems than the ones that I’m currently faced 
>>> with.
>>> 
>>> Hopefully some of these will help.
>>> 
>>> Cheers,
>>> Alex Zavatone
>> 
> 
> _______________________________________________
> 
> Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
> 
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
> 
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/cocoa-dev/jim%40quevivadev.com
> 
> This email sent to j...@quevivadev.com
