Pepper

Peer-to-Peer Architectures for Federated Search of Complex Digital Libraries
General information
- Guy Bertrand Noutsa Tsemo
- Andre Lingemann
- Ghita Mezzour (CMU)
- DFG
- NSF
- DFG: BIB47 DOuv 02-01
- UDE: 15311523 (ka00043c)
Description
The set of providers of digital libraries (DLs) and services on the Web is growing both in absolute numbers and in diversity. From a user's point of view, there should be a single virtual library (a "one-stop shop") comprising all sources relevant to their information needs. Peer-to-peer architectures have been effective at integrating large numbers of very simple DLs, for example for file sharing. This research project will demonstrate the use of peer-to-peer architectures for federated search across large numbers of complex digital libraries that are integrated only very loosely.
The project is based on the assumption that it is neither possible nor desirable to enforce homogeneity in a large-scale federation of complex digital libraries. DL providers will differ in the schemas they use, the quality of their data, and their degree of cooperation. We will develop transformation methods that take into account the intrinsic imprecision and vagueness of mappings between different schemas. For this purpose, appropriate methods for describing DL schemas and the (uncertain) mappings between them must be developed.
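As a rough illustration of what an uncertain mapping between schemas could look like in practice, the following sketch rewrites a weighted query from one schema into another. It is not the project's method: the `Mapping` class, the `translate_query` function, the attribute names, and the probabilities are all hypothetical, and multiplying a condition weight by the mapping probability is just one simple way to propagate uncertainty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical uncertain mapping between two schema attributes."""
    source: str         # attribute name in the source schema
    target: str         # attribute name in the target schema
    probability: float  # assumed confidence that the mapping holds


def translate_query(query, mappings):
    """Rewrite weighted query conditions into the target schema.

    query maps (attribute, term) to a weight; the result maps
    (target_attribute, term) to weight * mapping probability.
    Conditions without any mapping are dropped.
    """
    translated = {}
    for (attr, term), weight in query.items():
        for m in mappings:
            if m.source == attr:
                key = (m.target, term)
                translated[key] = translated.get(key, 0.0) + weight * m.probability
    return translated


# Illustrative mappings only; the probabilities are made up.
mappings = [
    Mapping("creator", "author", 0.9),
    Mapping("creator", "editor", 0.25),
    Mapping("title", "title", 1.0),
]
query = {("creator", "fuhr"): 1.0, ("title", "retrieval"): 0.5}
result = translate_query(query, mappings)
```

In this sketch a single source condition can fan out into several target conditions, each down-weighted by how uncertain the corresponding mapping is, so a receiving library can still answer the query without sharing the sender's schema.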
There is a growing number of Web services that can be used for improving retrieval results from DLs; mapping services help in bridging heterogeneity, and enhancing services provide functions for retrieving additional, relevant documents. We will develop methods for dynamic incorporation of these services into the P2P retrieval system, by developing appropriate methods for both service description and service selection.
Large-scale peer-to-peer networks require routing services so that messages are routed to desired destinations efficiently. We will develop content-based routing services (resource description, resource selection, and data fusion) for peer-to-peer networks. Content-based routing services raise a variety of new issues in the peer-to-peer environment, for example partial representations of DL contents, and a more complex process for deciding whether to satisfy messages locally or route them to another node.
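The three routing components named above (resource description, resource selection, data fusion) can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions, not the decision-theoretic or CORI-based models from the publications below: descriptions are plain document-frequency tables, selection uses an average term-coverage score, and fusion merges by raw score, which assumes that scores from different peers are comparable. All function names and the sample libraries are hypothetical.

```python
from collections import Counter


def describe(documents):
    """Resource description: term -> document frequency for one library."""
    df = Counter()
    for doc in documents:
        df.update(set(doc.lower().split()))
    return df


def score(description, num_docs, query_terms):
    """Average fraction of a library's documents containing each query term."""
    if num_docs == 0 or not query_terms:
        return 0.0
    return sum(description[t] / num_docs for t in query_terms) / len(query_terms)


def select(peers, query_terms, k):
    """Resource selection: up to k best-scoring peers with a nonzero score."""
    scores = {name: score(desc, n, query_terms) for name, (desc, n) in peers.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked[:k] if scores[name] > 0]


def fuse(result_lists, limit):
    """Data fusion: merge per-peer (doc_id, score) lists into one ranking."""
    merged = [item for results in result_lists for item in results]
    merged.sort(key=lambda item: item[1], reverse=True)
    return merged[:limit]


# Hypothetical neighbour libraries known to one node.
libraries = {
    "dl_a": ["peer to peer retrieval", "query routing in networks"],
    "dl_b": ["digital library metadata", "schema matching"],
    "dl_c": ["peer networks", "cooking recipes"],
}
peers = {name: (describe(docs), len(docs)) for name, docs in libraries.items()}
selected = select(peers, ["peer", "routing"], k=2)
fused = fuse([[("a1", 0.8), ("a2", 0.3)], [("c1", 0.5)]], limit=2)
```

A node would forward the query only to the peers in `selected` and merge whatever they return with `fuse`. Deciding whether to answer locally or route onwards could reuse the same score by also scoring the node's own collection, though that decision is not modelled here.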
To make our implementations of these methods available to other researchers and developers, we will implement all methods using the JXTA framework, which is currently used by a number of other projects in the DL and peer-to-peer areas.
Publications
- Henrik Nottelmann; Gudrun Fischer (2007).
- Search and browse services for heterogeneous collections with the peer-to-peer network Pepper. Information Processing & Management 43
- Nottelmann, Henrik; Fuhr, Norbert (2007).
- A Decision-Theoretic Model for Decentralised Query Routing in Hierarchical Peer-To-Peer Networks. In 29th European Conference on Information Retrieval Research (ECIR 2007)
- Nottelmann, Henrik; Aberer, Karl; Callan, Jamie; Nejdl, Wolfgang (2006).
- The CIKM 2005 Workshop on Information Retrieval in Peer-to-Peer Networks. SIGIR Forum 40(1)
- H. Nottelmann; N. Fuhr (2006).
- Adding Probabilities and Rules to OWL Lite Subsets based on Probabilistic Datalog. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 14(1)
- Nottelmann, Henrik; Fuhr, Norbert (2006).
- Comparing different architectures for query routing in peer-to-peer networks. In 28th European Conference on Information Retrieval Research (ECIR 2006)
- Nottelmann, Henrik; Straccia, Umberto (2006).
- A Probabilistic, Logic-based Framework for Automated Web Directory Alignment. In: Zongmin Ma (ed.):
- Henrik Nottelmann; Umberto Straccia (2006).
- Information retrieval and machine learning for probabilistic schema matching. Information Processing and Management 43
- Gudrun Fischer; André Nurzenski (2005).
- Towards Scatter/Gather Browsing in a Hierarchical Peer-to-Peer Network. In Proceedings of the 2005 ACM Workshop on Information Retrieval in Peer-to-Peer Networks (P2PIR 2005), Bremen, Germany, November 4, 2005
- H. Nottelmann (2005).
- PIRE: An extensible IR engine based on probabilistic Datalog. In 27th European Conference on Information Retrieval Research (ECIR 2005)
- Henrik Nottelmann (2005).
- Inside PIRE: An extensible, open-source IR engine based on probabilistic logics. Technical Report, University of Duisburg-Essen
- Henrik Nottelmann; Gudrun Fischer; Alexej Titarenko; André Nurzenski (2005).
- An integrated approach for searching and browsing in heterogeneous peer-to-peer networks. In Proc. Heterogeneous and Distributed Information Retrieval
- H. Nottelmann; U. Straccia (2005).
- sPLMap: A probabilistic approach to schema matching. In 27th European Conference on Information Retrieval Research (ECIR 2005)
- Henrik Nottelmann; Umberto Straccia (2005).
- Information retrieval and machine learning for probabilistic schema matching (poster). In Proceedings of the 14th International Conference on Information and Knowledge Management
- Henrik Nottelmann; Karl Aberer; Jamie Callan; Wolfgang Nejdl (eds.) (2005).
- Proceedings of the 2005 ACM Workshop on Information Retrieval in Peer-to-Peer Networks (P2PIR 2005), Bremen, Germany, November 4, 2005.
- H. Nottelmann; N. Fuhr (2004).
- Combining CORI and the decision-theoretic approach for advanced resource selection. In 26th European Conference on Information Retrieval Research (ECIR 2004)
- Henrik Nottelmann; Norbert Fuhr (2004).
- pDAML+OIL: A probabilistic extension to DAML+OIL based on probabilistic Datalog. In Proceedings Information Processing and Management of Uncertainty in Knowledge-Based Systems
- H. Nottelmann; N. Fuhr (2004).
- A logic-based approach for computing service executions plans in peer-to-peer networks. In SIGIR Workshop on Peer-to-Peer Information Retrieval
- N. Fuhr; C.-P. Klas (2001).
- Combining RDF and Agent-Based Architectures for Semantic Interoperability in Digital Libraries. In Proceedings of the DELOS-Workshop on Interoperability in Digital Libraries
- Jie Lu; Jamie Callan (2004)
- Merging retrieval results in hierarchical peer-to-peer networks. (poster description) Proceedings of the Twenty-Seventh International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Sheffield, UK, ACM.
- Jie Lu; Jamie Callan (2004)
- Federated search of text-based digital libraries in hierarchical peer-to-peer networks. Peer-to-Peer IR Workshop of the Twenty-Seventh International ACM SIGIR Conference on Research and Development in Information Retrieval, Sheffield, UK, ACM.
- Jie Lu; Jamie Callan (2003)
- Content-based information retrieval in peer-to-peer networks. Proceedings of the Twelfth International Conference on Information and Knowledge Management (CIKM), New Orleans, ACM.
Talks
- Norbert Fuhr (2007).
- A Decision-Theoretic Model for Decentralised Query Routing in Hierarchical Peer-To-Peer Networks. Talk at the European Conference on Information Retrieval Research, Rome, Italy
- Norbert Fuhr (2006).
- Comparing different architectures for query routing in peer-to-peer networks. Talk at the Max-Planck-Institute of Informatics (Saarbrücken, Germany)
- Henrik Nottelmann (2005).
- Pepper - Information Retrieval in hierarchical Peer-to-Peer networks with heterogeneous services. Talk at the 'P2PIR in Germany' workshop (Leipzig)
- Henrik Nottelmann (2005).
- Decision-theoretic resource selection in hierarchical peer-to-peer networks. Talk at the CMU LTI group meeting
- Henrik Nottelmann; Gudrun Fischer; Alexej Titarenko; André Nurzenski (2005).
- An integrated approach for searching and browsing in heterogeneous peer-to-peer networks. Talk at the HDIR 2005 workshop (co-located with SIGIR)
- Henrik Nottelmann (2003).
- Probabilistic logics for defining and using P2P service descriptions. Workshop on Metadata Management in Grid and Peer-to-Peer Systems (MMGPS), London
- Henrik Nottelmann (2003).
- Probabilistic logics for defining and using P2P service descriptions. QMIR Seminar, London
Diploma, Master and Bachelor theses (available only in German)
- Information Retrieval im Semantic Web
- Finished diploma thesis
- Service-Beschreibungen in Peer-to-Peer-Netzen
- Finished master thesis
- Cluster-basiertes Browsing in Peer-to-Peer-Netzen
- Finished diploma thesis
- IR im P2P-Netz JXTA
- Finished diploma thesis
Project meetings
- November 21/22, 2004, Pittsburgh:
- Technical meeting
- July 25, 2004, Sheffield:
- Technical meeting
- March 8/9, 2004, Duisburg:
- Technical meeting
- November 10/11, 2003, Pittsburgh:
- Kick-off meeting
Testbeds
- DTF in P2P networks (used in ECIR 2006 paper):
- Used in ECIR 2006 paper (300 KB)
- Schema mapping:
- BIBDB, OAI (3 MB) (down)