Peer-to-Peer Architectures for Federated Search of Complex Digital Libraries


Duration:
From 1 November 2003 until 31 December 2006
Contact Persons:
Involved Persons:
Sponsored by:
  • DFG
  • NSF
Reference number:
  • DFG: BIB47 DOuv 02-01
  • UDE: 15311523 (ka00043c)
Participating Institutions:

The set of providers of Digital Libraries (DLs) and services on the Web is growing both in absolute numbers and in diversity. From a user's point of view, there should be a single virtual library ("one-stop shop") comprising all sources relevant to their information needs. Peer-to-peer architectures have been effective at integrating large numbers of very simple DLs, for example for file sharing. This research project will demonstrate the use of peer-to-peer architectures for federated search across large numbers of complex digital libraries that are integrated only very loosely.

The project is based on the assumption that it is neither possible nor desirable to enforce homogeneity in a large-scale federation of complex digital libraries. DL providers will differ in the schemas they use, the quality of their data, and their degree of cooperation. We will develop transformation methods that take into account the intrinsic imprecision and vagueness of mappings between different schemas. For this purpose, appropriate methods for describing DL schemas and the (uncertain) mappings between them must be developed.
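To make the idea of uncertain mappings concrete, the following is a minimal sketch and not the project's actual method. It assumes that each mapping between two schema attributes carries a probability of being correct, that alternative mappings for the same source attribute are mutually exclusive, and that query conditions are independent. The attribute names, probabilities, and the `Mapping` and `match_probability` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str  # attribute in the querying peer's schema
    target: str  # attribute in the remote DL's schema
    prob: float  # assumed probability that this mapping is correct


# Hypothetical mappings; the values are illustrative, not measured.
MAPPINGS = [
    Mapping("creator", "author", 0.8),
    Mapping("creator", "editor", 0.2),
    Mapping("title", "title", 0.7),
    Mapping("title", "booktitle", 0.3),
]


def match_probability(record, query, mappings):
    """Estimate how likely a record in the target schema satisfies a query
    posed in the source schema.

    For each (attribute, value) condition, the probabilities of all mappings
    whose target field contains the value are summed (alternatives are assumed
    mutually exclusive); conditions are multiplied (assumed independent).
    """
    prob = 1.0
    for attribute, value in query:
        satisfied = sum(
            m.prob
            for m in mappings
            if m.source == attribute
            and value.lower() in record.get(m.target, "").lower()
        )
        prob *= satisfied
    return prob


query = [("creator", "smith"), ("title", "retrieval")]
record = {"author": "J. Smith", "title": "Distributed Retrieval"}
print(match_probability(record, query, MAPPINGS))
```

A record that matches only through the less likely mappings (for example, the value appears in `editor` and `booktitle`) receives a lower weight instead of being excluded, which is the practical effect of treating mappings as uncertain rather than exact.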

A growing number of Web services can be used to improve retrieval results from DLs: mapping services help bridge heterogeneity, and enhancing services provide functions for retrieving additional relevant documents. We will develop methods for service description and service selection so that these services can be incorporated dynamically into the P2P retrieval system.

Large-scale peer-to-peer networks require routing services that deliver messages to their intended destinations efficiently. We will develop content-based routing services (resource description, resource selection, and data fusion) for peer-to-peer networks. Content-based routing raises a variety of new issues in the peer-to-peer environment, for example partial representations of DL contents and a more complex process for deciding whether to satisfy a message locally or route it to another node.
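The three routing services named above can be illustrated with a deliberately simple sketch. It is an assumption-laden toy, not the project's algorithms: resource descriptions are plain document-frequency tables, selection sums the document frequencies of the query terms, and fusion assumes that scores returned by different peers are directly comparable (which in general they are not). All function names and the sample collections are hypothetical.

```python
from collections import Counter


def describe(documents):
    """Resource description: document frequency of each term in a collection."""
    df = Counter()
    for doc in documents:
        df.update(set(doc.lower().split()))
    return df


def select(query, descriptions, k=2):
    """Resource selection: return up to k peers whose descriptions best match
    the query, ranked by the summed document frequency of the query terms.
    Peers with a score of zero are never selected."""
    terms = query.lower().split()
    scores = {peer: sum(df[t] for t in terms) for peer, df in descriptions.items()}
    ranked = sorted((p for p in scores if scores[p] > 0),
                    key=lambda p: (-scores[p], p))
    return ranked[:k]


def fuse(result_lists, limit=10):
    """Data fusion: merge per-peer lists of (score, document) pairs into a
    single ranking, assuming the scores are comparable across peers."""
    merged = [item for results in result_lists for item in results]
    return sorted(merged, key=lambda item: -item[0])[:limit]


# Hypothetical collections held by the local node and two neighbours.
collections = {
    "local": ["peer to peer retrieval", "digital library search"],
    "A": ["federated search of digital library collections",
          "library schema mapping"],
    "B": ["cooking recipes"],
}
descriptions = {peer: describe(docs) for peer, docs in collections.items()}

targets = select("digital library", descriptions)
print(targets)
```

If `"local"` appears among the selected targets, the node answers the query from its own collection; every other selected name is a neighbour to which the message would be forwarded. With only partial descriptions of neighbours, the same decision has to be made on incomplete term statistics, which is one of the issues the paragraph above points to.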

To make our implementations of these methods available to other researchers and developers, we will implement all methods using the JXTA framework, which is currently used by a number of other projects in the DL and peer-to-peer areas.


Publications


Jie Lu; Jamie Callan (2004)
Merging retrieval results in hierarchical peer-to-peer networks. (poster description) Proceedings of the Twenty-Seventh International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Sheffield, UK, ACM.

Jie Lu; Jamie Callan (2004)
Federated search of text-based digital libraries in hierarchical peer-to-peer networks. Peer-to-Peer IR Workshop of the Twenty-Seventh International ACM SIGIR Conference on Research and Development in Information Retrieval, Sheffield, UK, ACM.

Jie Lu; Jamie Callan (2003)
Content-based information retrieval in peer-to-peer networks. Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM), New Orleans, ACM.


Talks


Diploma, Master and Bachelor theses

Available only in German.



Related projects


DAFFODIL
Distributed Agents for User-Friendly Access of Digital Libraries
MIND
Resource Selection and Data Fusion for Multimedia International Digital Libraries

Project meetings


November 21/22, 2004, Pittsburgh:
Technical meeting
July 25, 2004, Sheffield:
Technical meeting
March 8/9, 2004, Duisburg:
Technical meeting
November 10/11, 2003, Pittsburgh:
Kick-off meeting

Testbeds


DTF in P2P networks:
Used in ECIR 2006 paper (300 KB)
Schema mapping:
BIBDB, OAI (3 MB) (down)