Semantic Cluster Analysis in Information Retrieval

General information

From 01.07.2009 until 30.09.2014
Contact Persons:
Involved Persons:
Sponsored by:
  • DFG
Reference number:
  • DFG: FU 205/22-1
  • UDE: ka00043i
Participating Institutions:


Clustering methods combine an object model, a similarity metric and a fusion principle, the last of which is the focus of current research.
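The three components can be illustrated with a minimal sketch. It is not the project's method: the bag-of-words object model, the cosine similarity and the single-link agglomerative fusion are assumptions chosen for brevity, and all function names and the threshold value are hypothetical.

```python
from collections import Counter
from math import sqrt


def object_model(text):
    """Object model (assumed): a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Similarity metric (assumed): cosine of the angle between two term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def single_link_fusion(vectors, threshold):
    """Fusion principle (assumed): merge the two most similar clusters
    until no pair reaches the similarity threshold."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link: similarity of the closest pair of members.
                sim = max(
                    cosine_similarity(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if sim > best:
                    best, pair = sim, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i].extend(clusters.pop(j))
    return clusters


def cluster_documents(texts, threshold=0.5):
    """Combine object model, similarity metric and fusion principle."""
    return single_link_fusion([object_model(t) for t in texts], threshold)


docs = [
    "semantic cluster analysis",
    "cluster analysis methods",
    "interactive search systems",
    "search systems framework",
]
print(cluster_documents(docs))  # lists of document indices, one list per cluster
```

Each of the three functions can be exchanged independently, for example a different document representation or a different merging strategy, which is the sense in which the components are combined.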

For more advanced problems, clustering can only succeed when the three elements are combined in a meaningful way and knowledge about both the analysis task and the user is taken into account. This principle of 'semantic clustering' is expected to solve clustering problems in IR more efficiently and effectively than current methods.

This project aims to investigate the theoretical, methodological and experimental aspects of this problem. Here, 'semantics' will play multiple roles:

  1. in the form of specialized retrieval models which consider knowledge about the IR task at hand,
  2. by integrating domain knowledge,
  3. as ensemble clustering, i.e. combining fusion methods,
  4. from the user when performing interactive or multi-clustering.

Finally, semantics will form the basis for cluster labeling, which is currently the biggest challenge in document clustering.
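The ensemble clustering mentioned in the list above can be sketched as follows. This is only one common way to combine several clusterings (a co-association, or evidence-accumulation, consensus) and is assumed here for illustration; the source does not state which combination scheme the project uses. The function names and the `min_agreement` parameter are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Share of clusterings in which each pair of objects lands in the same cluster."""
    n = len(labelings[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in labelings:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(labelings) for pair, c in counts.items()}


def consensus_clusters(labelings, min_agreement=0.5):
    """Group objects that are co-clustered in more than `min_agreement`
    of the input clusterings (connected components via union-find)."""
    n = len(labelings[0])
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (i, j), share in co_association(labelings).items():
        if share > min_agreement:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three clusterings of five documents, e.g. produced by different fusion methods.
# Cluster ids are arbitrary; only co-membership matters.
labelings = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(consensus_clusters(labelings))
```

Because only co-membership is counted, the individual clusterings do not need matching cluster identifiers or the same number of clusters.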

Additional information:

  • Project website of the working group MediaSystems of the Bauhaus-Universität Weimar




Diploma, Master and Bachelor theses (only in German)

Related projects

  • ezDL
    ezDL is a framework for interactive search systems


  • To perform and evaluate user studies, an experimental environment has been developed. It is based on ezDL, the framework for interactive search systems, and uses AItools to cluster the documents of a retrieval result and present them to the user.