Generating Search Term Variants for Text Collections with Historic Spellings

  • Citation key:
    Ernst/Fuhr:06
  • Title:
    Generating Search Term Variants for Text Collections with Historic Spellings
  • Author(s):
    Andrea Ernst-Gerlach
    Norbert Fuhr
  • In:
    • Citation key:
      ECIR:06
    • Title:
      28th European Conference on Information Retrieval Research (ECIR 2006)
    • Editors:
      Mounia Lalmas
      Andy MacFarlane
      Stefan M. Rüger
      Anastasios Tombros
      Theodora Tsikrika
      Alexei Yavlinsky
    • Publisher:
      Springer
    • In:
      ECIR
    • Year:
      2006
  • Year:
    2006

Abstract:


In this paper, we describe a new approach to retrieval in texts with non-standard spelling, which is important for historic texts in English or German. For this purpose, we present a new algorithm for generating search term variants in historic orthography. By applying a spell checker to a corpus of historic texts, we generate a list of candidate terms, to which the contemporary spellings have to be assigned manually. Our algorithm then produces a set of probabilistic rules. The rule probabilities can be used for ranking in the retrieval stage. An experimental comparison shows that our approach outperforms competing methods.
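
The idea of expanding a modern search term into ranked historic variants can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the rule format, the two example rules, their probabilities, and the function name `generate_variants` are all assumptions made for illustration, and each variant here comes from applying only a single rule at a single position.

```python
from typing import Dict, List, Tuple

# Hypothetical rules: (modern substring, historic substring, probability).
# The values are invented for illustration and are not taken from the paper.
RULES: List[Tuple[str, str, float]] = [
    ("t", "th", 0.3),
    ("ei", "ey", 0.2),
]


def generate_variants(
    term: str, rules: List[Tuple[str, str, float]] = RULES
) -> List[Tuple[str, float]]:
    """Return candidate historic spellings of a modern term, best first.

    Each variant results from applying one rule at one position; its score
    is the probability attached to that rule.
    """
    scores: Dict[str, float] = {}
    for modern, historic, prob in rules:
        start = term.find(modern)
        while start != -1:
            variant = term[:start] + historic + term[start + len(modern):]
            # Keep the best score if several rules yield the same variant.
            scores[variant] = max(scores.get(variant, 0.0), prob)
            start = term.find(modern, start + 1)
    # Sort by descending probability, then alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for variant, score in generate_variants("teil"):
        print(variant, score)
```

In a retrieval setting, the generated variants could be added to the query alongside the original term, with the scores used as term weights so that matches on less likely spellings rank lower.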
