Categorizing Web Documents in Hierarchical Catalogues

  • Citation key:
    Frommholz:01a
  • Title:
    Categorizing Web Documents in Hierarchical Catalogues
  • Author(s):
    Ingo Frommholz
  • In:
    23rd European Conference on Information Retrieval Research (ECIR 2001)
  • Year:
    2001

Abstract:


Automatic categorization of web documents (e.g. HTML documents) is the task of automatically finding relevant categories for a new document that is to be inserted into a web catalogue such as Yahoo!. Many approaches exist for this difficult task. This paper considers a special kind of web catalogue: one whose category scheme is hierarchically ordered. It discusses a method that uses knowledge about the hierarchy to improve categorization results. The method can be applied as a post-processing step and can therefore be combined with other known (non-hierarchical) categorization approaches.
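
The abstract does not spell out the method itself, so the following is only an illustrative sketch of what hierarchy-aware post-processing could look like, not the paper's algorithm. It assumes a flat classifier has already produced one score per category, and it blends each score with the average score of the category's parent and children. The function name `rescore_with_hierarchy`, the weight `alpha`, and the example catalogue are all hypothetical.

```python
from typing import Dict, List, Optional


def rescore_with_hierarchy(
    scores: Dict[str, float],
    parent: Dict[str, Optional[str]],
    alpha: float = 0.3,
) -> Dict[str, float]:
    """Blend each flat score with the mean score of its hierarchy neighbours.

    Assumption (not taken from the paper): a category becomes more plausible
    when its parent or children also score well for the same document.
    """
    # Invert the parent map to get the children of each category.
    children: Dict[str, List[str]] = {}
    for cat, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(cat)

    adjusted: Dict[str, float] = {}
    for cat, score in scores.items():
        neighbours = list(children.get(cat, []))
        if parent.get(cat) is not None:
            neighbours.append(parent[cat])
        neighbour_scores = [scores[n] for n in neighbours if n in scores]
        if neighbour_scores:
            context = sum(neighbour_scores) / len(neighbour_scores)
            adjusted[cat] = (1 - alpha) * score + alpha * context
        else:
            # No scored neighbours: keep the flat score unchanged.
            adjusted[cat] = score
    return adjusted


# Hypothetical two-branch catalogue and flat classifier scores.
parent = {"Science": None, "Physics": "Science", "Sports": None, "Golf": "Sports"}
flat_scores = {"Science": 0.8, "Physics": 0.5, "Sports": 0.1, "Golf": 0.5}
adjusted = rescore_with_hierarchy(flat_scores, parent)
```

In this toy input, "Physics" and "Golf" receive the same flat score, but after rescoring "Physics" ranks higher because its parent also scored well. Because the step only consumes category scores, it can sit behind any non-hierarchical classifier, which is the combination property the abstract describes.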

Full text as PDF