https://bugs.kde.org/show_bug.cgi?id=497938

            Bug ID: 497938
           Summary: Text-based Image Search
    Classification: Applications
           Product: digikam
           Version: unspecified
          Platform: Other
                OS: Other
            Status: REPORTED
          Severity: wishlist
          Priority: NOR
         Component: Searches-Advanced
          Assignee: digikam-bugs-n...@kde.org
          Reporter: chair-tweet-de...@duck.com
  Target Milestone: ---

SUMMARY

The proposal is to add a text-based image search feature in digiKam. The idea
is to allow users to input a text query (e.g., "cat on a couch") and retrieve
images that match based on their visual content, rather than relying on tags or
metadata. This would enable more flexible and intuitive searches, particularly
in large image collections where textual information may be limited or missing.

ADDITIONAL INFORMATION

To implement this feature, each image in the library would be associated with
an embedding calculated when it is added to the database (or in batch). This
calculation could be done using an AI model like CLIP (or similar models),
which generates a vector representation of the image based on its visual
content. These embeddings could be stored either in the image files themselves
(e.g., in EXIF metadata) or in an associated database, so that this information
is preserved over time.
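
As a rough illustration of the "associated database" option, the sketch below
stores one embedding per image as a binary blob in SQLite. Everything in it is
an assumption made for illustration: the table name `image_embeddings`, its
columns, the model label, and the helper functions are hypothetical and do not
reflect digiKam's actual schema. The vectors are toy values rather than real
model output.

```python
import sqlite3
from array import array


def pack_embedding(vector):
    """Serialize a list of floats as raw 32-bit floats (native byte order)."""
    return array("f", vector).tobytes()


def unpack_embedding(blob):
    """Inverse of pack_embedding: bytes back to a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return list(values)


def open_store(path=":memory:"):
    """Open (or create) the hypothetical embedding table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_embeddings ("
        "image_id INTEGER PRIMARY KEY, "
        "model TEXT NOT NULL, "
        "vector BLOB NOT NULL)"
    )
    return conn


def save_embedding(conn, image_id, model, vector):
    """Insert or overwrite the embedding for one image."""
    conn.execute(
        "INSERT OR REPLACE INTO image_embeddings (image_id, model, vector) "
        "VALUES (?, ?, ?)",
        (image_id, model, pack_embedding(vector)),
    )
    conn.commit()


def load_embedding(conn, image_id):
    """Return the stored vector, or None if the image has no embedding yet."""
    row = conn.execute(
        "SELECT vector FROM image_embeddings WHERE image_id = ?", (image_id,)
    ).fetchone()
    return None if row is None else unpack_embedding(row[0])


conn = open_store()
save_embedding(conn, 1, "clip-example", [0.5, 0.25, -1.0])
print(load_embedding(conn, 1))
```

Recording the model name next to each vector is a deliberate choice in this
sketch: embeddings from different models are not comparable, so the library
would need to know which images must be re-processed if the model changes.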

For the search, the idea would be to leverage a vector search engine that
compares the stored image embeddings with those generated from the text
queries. This would require integrating a vector search library such as FAISS,
or a comparable vector database, to enable fast and scalable search within large
image collections. When
a user submits a text query, an embedding is generated for the description and
compared to the pre-calculated image embeddings, with the most relevant images
returned based on vector similarity.
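
The comparison step described above can be sketched as a brute-force
cosine-similarity ranking. This is only a minimal illustration under stated
assumptions: the three-dimensional vectors are made-up stand-ins for real
embeddings, the file names are invented, and `cosine_similarity` and `search`
are hypothetical helper names. A real implementation would obtain the query
vector from the text encoder of the same model that produced the image
embeddings, and would likely replace the linear scan with an index from a
library such as FAISS.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_embedding, image_embeddings, top_k=5):
    """Return the top_k (image_id, score) pairs, best match first."""
    scored = (
        (cosine_similarity(query_embedding, vector), image_id)
        for image_id, vector in image_embeddings.items()
    )
    return [(image_id, score) for score, image_id in heapq.nlargest(top_k, scored)]


# Toy data: placeholders for embeddings computed when images were imported.
library = {
    "cat_couch.jpg": [0.9, 0.1, 0.0],
    "beach.jpg": [0.0, 0.2, 0.9],
    "dog_sofa.jpg": [0.7, 0.6, 0.1],
}

# Placeholder for the embedding of a text query such as "cat on a couch".
query = [1.0, 0.2, 0.0]

for image_id, score in search(query, library, top_k=2):
    print(f"{image_id}: {score:.3f}")
```

With this toy data the two images whose vectors point in roughly the same
direction as the query are ranked first, and the unrelated one is dropped by
the `top_k` cut-off.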

The interface could include a new option in the search section of digiKam,
allowing users to enter a textual description and see the corresponding images
based on their content. This approach would enable a more powerful search
system, going beyond simple keywords or tags associated with the images.


The technical details provided here are intended to illustrate what the feature
might look like. However, as I am not familiar with the internal structure of
digiKam, this is purely a conceptual framework, not a detailed action plan.
Further discussion and adjustments will be required to fit the specific
architecture of the software.
