Probabilistic Object-oriented Logics for Annotation-based Retrieval

General Information



POLAR is a framework for annotation-based document and annotation retrieval, and discussion search.

Annotations are becoming increasingly important in today's information systems as a means of establishing communicative and collaborative functions. Annotation-based discussion can be part of a larger work process in digital libraries as well as in Web-based systems like Wikipedia and newswire sites like ZDNet News, where people discuss the published articles and their content. Furthermore, semantic annotation and tagging are gaining popularity.

Annotations fall into two basic categories: meta-level annotations, which contain assertions about the annotated document, and content-level annotations, which extend the content of a document. It is a natural step to use annotations and annotation-based discussions as an information source for document, annotation and discussion search. Classical retrieval tools let us search for documents only as atomic units without any context, whereas frameworks like POOL can model and exploit document structure and nested documents (similar to current XML IR methods). Annotation hypertexts, however, are not necessarily trees, unlike structured documents in POOL, so POLAR takes the special nature of annotations into account for different retrieval strategies. POLAR can therefore cope not only with structured documents, as POOL does, but also with several kinds of annotations.
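The distinction between content-level and meta-level annotations, and the point that annotation hypertexts need not be trees, can be illustrated with a minimal Python sketch. This is not POLAR's actual data model or syntax: the names `Kind`, `Node` and `is_tree_like` are hypothetical and chosen only for illustration. The sketch assumes that an annotation may refer to several targets at once, which is what breaks the tree shape.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Kind(Enum):
    CONTENT = "content"  # extends the content of its target
    META = "meta"        # makes an assertion about its target


@dataclass
class Node:
    """A document or an annotation (hypothetical model, not POLAR's own)."""
    node_id: str
    text: str
    kind: Optional[Kind] = None                       # None for a plain document
    targets: List[str] = field(default_factory=list)  # ids of the annotated nodes


def is_tree_like(nodes: Dict[str, Node]) -> bool:
    """Check only the single-target condition: no node annotates more than one target."""
    return all(len(node.targets) <= 1 for node in nodes.values())


# Illustrative data: two articles, a comment, a reply, and a comparison of both articles.
nodes = {
    "d1": Node("d1", "article on search engines"),
    "d2": Node("d2", "article on digital libraries"),
    "a1": Node("a1", "adds a missing detail", Kind.CONTENT, ["d1"]),
    "a2": Node("a2", "compares both articles", Kind.META, ["d1", "d2"]),
    "a3": Node("a3", "reply to the first comment", Kind.CONTENT, ["a1"]),
}
```

Here `a3` annotates another annotation, which forms a discussion thread, while `a2` has two targets, so the structure as a whole is a graph and not a tree.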

POLAR thus offers the following features for annotation-based retrieval:

  • Indexing and modeling of annotation hypertexts, comprising
    • Object-modeling based on probabilistic propositions (e.g. index terms, attributes and categorisations)
    • Structured documents and annotations
    • Content and meta annotations
    • References
    • Merged annotation targets
    • Annotated passages (fragments)
    • Annotation types
    • Polarity of annotations
  • Annotation-based document and discussion search
    • Database queries
    • Content-oriented queries using knowledge and relevance augmentation and probabilistic retrieval
  • Later versions of POLAR are intended to support semantic annotation and retrieval
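The idea behind knowledge augmentation, in which the content of annotations contributes to the representation of the annotated document, can be sketched as follows. This is only one possible reading and is not taken from the POLAR implementation: the function `augment`, the parameter `access_prob` and the noisy-OR combination under an independence assumption are all assumptions made for illustration.

```python
from typing import Dict, List


def augment(doc_terms: Dict[str, float],
            annotation_terms: List[Dict[str, float]],
            access_prob: float) -> Dict[str, float]:
    """Combine a document's term probabilities with those of its annotations.

    Each annotation term probability is first weighted by access_prob
    (an assumed probability of reaching the annotation from the document)
    and then merged with the existing value by a noisy-OR, which assumes
    that the pieces of evidence are independent.
    """
    augmented = dict(doc_terms)
    for terms in annotation_terms:
        for term, prob in terms.items():
            propagated = access_prob * prob
            prior = augmented.get(term, 0.0)
            augmented[term] = 1.0 - (1.0 - prior) * (1.0 - propagated)
    return augmented


# Illustrative weights only; they are not derived from any real collection.
document = {"retrieval": 0.5}
annotations = [{"retrieval": 0.5, "annotation": 0.8}]
augmented_document = augment(document, annotations, access_prob=0.5)
```

In this sketch a term that occurs only in an annotation ("annotation") becomes part of the augmented document representation with a reduced weight, and a term that occurs in both ("retrieval") gets a higher weight than in the document alone.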

Some experiments on annotation-based document search have been performed recently. The underlying collection is a snapshot of ZDNet News containing roughly 4,700 articles and more than 91,000 user comments. We created a test set containing 20 topics, with relevance judgements for 17 of them.



Related Projects


  • A first POLAR prototype, supporting a large part of the above functionality, has been developed and used for experiments on annotation-based document retrieval and discussion search. The POLAR implementation is based on the JaySpirit Java API for HySpirit. Details and software packages will follow.