DFG project SINIR: simulated usage data for the library of the future
In the DFG project SINIR, a research team from the University of Passau is working with the world’s largest specialist library for economics literature on concepts for the digital library of the future, with the help of simulated usage data.
The DFG project SINIR (Simulating Interactive Information Retrieval) is special for two reasons. Firstly, the solution being developed adapts to the search profile of the individual user. Secondly, the route to that solution is itself unusual: during development, the researchers work largely without real usage data and without laborious user tests.
“The project is about optimizing access to digital libraries,” explains Prof. Dr. Michael Granitzer, who holds the Chair of Data Science at the University of Passau and is leading the project together with Matthias Hagen, Professor of Big Data Analytics at the Martin Luther University Halle-Wittenberg, and Prof. Dr. Klaus Tochtermann, Director of the Leibniz Information Centre for Economics (ZBW) in Kiel. “With the help of our method, we can significantly shorten the development process,” says Prof. Dr. Granitzer.
The research team simulates a library system and feeds a large amount of artificially generated usage data into it. This data is meant to represent as many usage preferences as possible. “A wide range of search profiles exists. Some users search every day, others come only once and then look very deliberately for a specific book,” explains Prof. Dr. Granitzer.
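To illustrate the idea of profile-driven usage data, the following is a minimal sketch in Python, not the project's actual method. The two profiles, their parameter values, and all names (`Profile`, `simulate_user`, `simulate_population`) are assumptions made up for this example, loosely modelled on the "daily searcher" and the "one-time visitor" mentioned above.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical search profile; all parameter values are illustrative."""
    name: str
    visit_probability: float      # chance of visiting on any given day
    max_visits: Optional[int]     # None means no upper limit on visits
    queries_per_visit: int        # number of queries issued per visit


# Assumed profiles: one user who searches every day, one who comes only once.
PROFILES = {
    "daily": Profile("daily", visit_probability=1.0, max_visits=None,
                     queries_per_visit=3),
    "one_time": Profile("one_time", visit_probability=0.2, max_visits=1,
                        queries_per_visit=1),
}


def simulate_user(user_id, profile, days, rng):
    """Return (user_id, day, query_index) log entries for one simulated user."""
    log = []
    visits = 0
    for day in range(days):
        # Stop once the profile's visit budget is used up.
        if profile.max_visits is not None and visits >= profile.max_visits:
            break
        if rng.random() < profile.visit_probability:
            visits += 1
            for query_index in range(profile.queries_per_visit):
                log.append((user_id, day, query_index))
    return log


def simulate_population(n_users, days, seed=0):
    """Assign each simulated user a random profile and collect one joint log."""
    rng = random.Random(seed)
    profiles = list(PROFILES.values())
    log = []
    for user_id in range(n_users):
        profile = rng.choice(profiles)
        log.extend(simulate_user(user_id, profile, days, rng))
    return log


if __name__ == "__main__":
    synthetic_log = simulate_population(n_users=100, days=30, seed=1)
    print(len(synthetic_log), "simulated query events")
```

A fixed seed makes the synthetic log reproducible, which is one practical advantage of simulated over real usage data.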
Real usage data – enhanced with artificial data
The usage models therefore simulate, for example, queries, clicks and interactions with search results. However, the approach does not work entirely without existing data. The researchers train some of the models on real log data, that is, the files that automatically record the search behaviour of real users. They then enhance these models with artificially generated data.
These usage models help the researchers to improve digital search. They allow inferences about how library records should be indexed with keywords so that they are easier to find. They also indicate the best order in which to present search results.
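How simulated clicks might inform the order of search results can be sketched as follows. This is a simplified, position-based click model chosen for illustration only; the source does not say which click models SINIR uses. The document identifiers, relevance values, the `examine_decay` parameter and both function names are assumptions.

```python
import random
from collections import Counter


def simulate_clicks(ranking, relevance, rng, examine_decay=0.7):
    """Simulate one user scanning a result list from top to bottom.

    Assumption: the result at rank i (starting at 0) is examined with
    probability examine_decay ** i, and an examined result is clicked
    with probability equal to its assumed relevance.
    """
    clicks = []
    for rank, doc in enumerate(ranking):
        examined = rng.random() < examine_decay ** rank
        if examined and rng.random() < relevance.get(doc, 0.0):
            clicks.append(doc)
    return clicks


def rerank_by_clicks(ranking, click_counts):
    """Order results by simulated click count; ties keep the original order."""
    # sorted() is stable, so equally clicked results stay in their old order.
    return sorted(ranking, key=lambda doc: -click_counts[doc])


if __name__ == "__main__":
    rng = random.Random(42)
    ranking = ["doc_a", "doc_b", "doc_c"]                        # hypothetical IDs
    relevance = {"doc_a": 0.1, "doc_b": 0.2, "doc_c": 0.9}       # assumed values

    counts = Counter()
    for _ in range(1000):                                        # 1000 simulated sessions
        counts.update(simulate_clicks(ranking, relevance, rng))

    print(counts)
    print(rerank_by_clicks(ranking, counts))
```

Raw click counts favour results that were already ranked highly, so a realistic evaluation would need to correct for this position bias; the sketch leaves that out for brevity.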
Participants and funding
The research team evaluates the method it has developed on a real, existing digital library: the economics search engine EconBiz. EconBiz is operated by the project partner ZBW, the Leibniz Information Centre for Economics and the world’s largest specialist library for economics literature. The project is also supported by the University of Twente as an associated partner. The contact there is Christin Seifert, who until 2018 was acting professor for the Chair of Complex Systems Engineering at the Faculty of Informatics of the University of Passau. The research team will also make the results available to other digital libraries in the form of an open-source framework.
The Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) is funding the project for a period of three years under project number GR 4277/2-1, AOBJ: 651840.