A Probabilistic NF2 Relational Algebra for Integrated Information Retrieval and Database Systems
- N. Fuhr
- T. Rölleke
- Society for Design and Process Science (SDPS)
- Proceedings of the 2nd World Conference on Integrated Design and Process Technology (IDPT)
The integration of information retrieval (IR) and database systems requires a data model that allows for modelling documents as entities, representing uncertainty and vagueness, and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document is represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as set-valued attributes. We redefine the relational operators for this type of relation such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers, as in most IR models. This effect can also be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.
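The idea of weighted tuples and ranked results can be illustrated with a small sketch. This is not the algebra defined in the paper: it flattens the nested document/term subrelation into a flat relation for brevity, and it assumes that all tuples are stochastically independent, so that duplicate tuples produced by a projection can be merged as `1 - (1 - p1)(1 - p2)`. The names `WeightedTuple`, `select`, `project`, and `rank`, as well as the sample weights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedTuple:
    """A tuple plus the probability that it belongs to the relation."""
    values: tuple
    prob: float


def select(relation, predicate):
    """Keep the tuples whose values satisfy the predicate; weights are unchanged."""
    return [t for t in relation if predicate(t.values)]


def project(relation, columns):
    """Project onto the given column positions.

    Tuples that become duplicates are merged. ASSUMPTION: the merged tuples
    are independent events, so their disjunction has probability
    1 - (1 - p_old) * (1 - p_new).
    """
    merged = {}
    for t in relation:
        key = tuple(t.values[c] for c in columns)
        merged[key] = 1.0 - (1.0 - merged.get(key, 0.0)) * (1.0 - t.prob)
    return [WeightedTuple(key, p) for key, p in merged.items()]


def rank(relation):
    """Order tuples by decreasing probability, as in an IR ranking."""
    return sorted(relation, key=lambda t: t.prob, reverse=True)


# Hypothetical index relation (document, term) with made-up term weights.
index = [
    WeightedTuple(("d1", "ir"), 0.8),
    WeightedTuple(("d1", "db"), 0.5),
    WeightedTuple(("d2", "ir"), 0.6),
    WeightedTuple(("d3", "db"), 0.7),
]

# Query: documents indexed with "ir" or "db", ranked by probability.
hits = select(index, lambda v: v[1] in {"ir", "db"})
ranking = rank(project(hits, columns=[0]))

for t in ranking:
    print(t.values[0], round(t.prob, 3))
```

Under these assumptions, a document indexed with both query terms receives a higher weight than either term weight alone, which is what moves it to the top of the ranking. A faithful implementation would need to handle nested relations and dependent events, which this sketch deliberately leaves out.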