Classification and Intelligent Search on Information in XML


XML can be used to represent all kinds of documents in product catalogs, digital libraries, and scientific data repositories, and across the Web. However, merely casting documents into XML does not necessarily make their semantics explicit or more amenable to effective information searching. To fully leverage XML on a global scale, CLASSIX addresses the following issues:

  • Providing an easy-to-use yet powerful and efficient search language that combines concepts from current XML pattern-matching languages, such as XPath and XQuery, with ontology-backed, information-retrieval-style ranking of search results.
  • Extracting more semantics from existing document collections by constructing structural and ontological skeletons, e.g., in the form of DTDs or XML schemas that describe the data at a higher semantic level and can also facilitate new forms of indexing for efficiency.
  • Classifying existing documents according to a given thematic or personalized hierarchical ontology to make searching more effective, e.g., by exploiting relevance feedback, and more efficient, e.g., by limiting the search focus.
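
As a rough illustration of the first point, the sketch below combines a structural path condition with a simple term-weighted ranking. It is not the CLASSIX search language: the toy documents, the element names, the `search` function, and the TF-IDF-style scoring are all assumptions made for this example, and the ontology-backed part is left out entirely.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy collection; element names are illustrative only.
DOCS = {
    "d1": "<book><title>XML retrieval</title>"
          "<abstract>Ranking XML elements for retrieval</abstract></book>",
    "d2": "<book><title>Databases</title>"
          "<abstract>Relational query processing</abstract></book>",
    "d3": "<article><title>XML retrieval</title></article>",
}


def search(docs, path, terms):
    """Return (doc_id, score) pairs, best first.

    `path` is an ElementTree path (a small XPath subset) that selects
    the elements to search; `terms` are lowercase query words.
    """
    # Structural step: keep only the text of elements matching the path.
    term_counts = {}
    for doc_id, xml in docs.items():
        root = ET.fromstring(xml)
        words = []
        for elem in root.findall(path):
            words.extend("".join(elem.itertext()).lower().split())
        if words:
            term_counts[doc_id] = Counter(words)

    # Ranking step: sum of term frequency times a smoothed inverse
    # document frequency, computed over the structurally matching docs.
    n = len(term_counts)
    scores = {}
    for doc_id, tf in term_counts.items():
        score = 0.0
        for term in terms:
            df = sum(1 for counts in term_counts.values() if term in counts)
            if df:
                score += tf[term] * math.log(1 + n / df)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Only documents with an <abstract> mentioning the terms are returned.
    print(search(DOCS, ".//abstract", ["xml", "retrieval"]))
```

Restricting the query to `.//abstract` excludes documents that mention the terms only in their titles, which is the kind of structure-aware filtering that plain keyword search cannot express.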



Diploma, Master's, and Bachelor's Theses

Related Projects

    Focussed retrieval of structured documents
  • HyREX
    Hyper-media Retrieval Engine for XML
  • INEX
    Initiative for the Evaluation of XML retrieval