Building Blocks for Content-Based Image Retrieval and Similarity Search Systems in Digital Libraries (PhD Thesis, finished)
A continuously increasing amount of information is available in digital formats. This information is not limited to textual documents, but can also be represented as images, audio, movies, 3D models, or compound documents consisting of several parts. Collections of these documents form large-scale digital libraries, which can be publicly available online (e.g. the ACM Digital Library or the free Wikipedia) or private (e.g. a personal collection of documents).
Due to the number of artifacts held in such collections, users must be provided with a variety of tools and interfaces to retrieve the information they are looking for. In particular, keyword-based search may not be ideal or sufficient, e.g. to describe an image such that another person can find it, especially if that person has a different native language, as is common today when the internet connects all continents. In such cases, a picture may indeed say more than a thousand words.
Content-based image retrieval, as a complementary approach to keyword-based queries, ranks results based on their similarity to query examples. This is particularly promising for image retrieval, where visually similar objects can represent good query results. However, it poses the problem of how to find a sufficiently good starting image. To solve this problem, the ability to sketch an image on paper and/or to provide keywords may help.
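The core idea of ranking by similarity to a query example can be illustrated with a minimal sketch. All names here, as well as the choice of a grayscale histogram feature and the Euclidean distance, are illustrative assumptions, not the concrete techniques used by the system described in this thesis:

```python
import numpy as np

def extract_histogram(pixels, bins=8):
    # Illustrative feature: a normalized grayscale histogram over pixel values 0..255.
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    return hist / hist.sum()

def rank_by_similarity(query_feature, collection):
    # Rank documents by Euclidean distance between feature vectors;
    # a smaller distance means the document is more similar to the query example.
    scored = [(np.linalg.norm(query_feature - feature), doc_id)
              for doc_id, feature in collection.items()]
    return [doc_id for _, doc_id in sorted(scored)]
```

Any feature extractor that maps images to vectors, together with any suitable distance measure, can take the place of the two choices above; the ranking step itself stays the same.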
The purpose of this thesis is to identify important components that not only support various search paradigms, but also combine them to achieve a good user experience. This requires that systems do not force users to follow a single approach and/or a particular order. For instance, a user should not be forced to start with either a query image, a sketch, or a keyword-based query, but can start with whatever serves the information need best. During the search, it must be possible to add or remove information in order to refine the query. It is also essential to capture as much knowledge from the user as possible. For example, images in queries should not be taken "as-is", but seen in the context of the query; the user must therefore be enabled to highlight the areas of an image that make it relevant to the query. The system must be easily extensible with new knowledge resources, so that it can be adjusted to the information need of the user, e.g. by term-based query expansion in the domain of the current query.

Figure: Sketch and image of a Van Gogh self-portrait painting, and the process to compare them.
As an example application, a novel system for query by sketch is implemented. It uses interactive paper to let the user draw sketches on paper, which serve as a starting point for the system. It also allows the user to define regions of interest on paper to give detailed feedback on which areas of an image make it a good result for a query. This information is used to improve the effectiveness and efficiency of the system compared to state-of-the-art image retrieval systems. Keyword-based queries are supported and enriched by annotations added by users. Workflows provide extensibility to enrich the system with additional information, e.g. a domain-specific thesaurus.
The basic building blocks of such a system are a number of components to
- provide efficient index structures to increase speed of query execution
- extract features to make documents comparable
- manage collections of documents
- compute similarity with an appropriate distance measure
- enable the user to issue a query, e.g. by drawing a sketch or selecting a region of interest.
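To illustrate how an index structure can speed up query execution for similarity search, the following sketch uses pivot-based filtering with the triangle inequality, a standard metric-indexing idea: distances from every object to a fixed pivot are precomputed, and the bound |d(q,p) - d(p,x)| <= d(q,x) lets the query skip objects without computing their actual distance. The class and function names are hypothetical, and this brute-force-plus-pruning structure is only an assumed stand-in for the index structures used in the actual system:

```python
import math

def euclidean(a, b):
    # Euclidean distance between two equal-length feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class PivotIndex:
    # Precompute each object's distance to one pivot at build time,
    # then prune candidates at query time via the triangle inequality.
    def __init__(self, objects, pivot):
        self.pivot = pivot
        self.entries = [(obj, euclidean(pivot, obj)) for obj in objects]

    def range_query(self, query, radius):
        dqp = euclidean(query, self.pivot)
        results = []
        for obj, dpx in self.entries:
            # Lower bound on d(query, obj): if it already exceeds the
            # search radius, the exact distance need not be computed.
            if abs(dqp - dpx) > radius:
                continue
            if euclidean(query, obj) <= radius:
                results.append(obj)
        return results
```

Real metric index structures (e.g. trees built over many pivots) generalize this pruning idea; the point of the sketch is only that a cheap precomputed bound can replace many expensive distance computations.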
This thesis project, carried out at UNIBAS, is the continuation of work started at UMIT in Fall 2003.