Since 2016, CrossAsia has been setting up an Integrated Text Repository (CrossAsia ITR). This technical infrastructure is designed to ensure secure and sustainable operation of, and access to, Asia-related resources independently of particular provider systems. We achieve this by using standard technologies for metadata, APIs (e.g., OAI) and repository frameworks (e.g., Fedora). We plan to assign a standard identifier (e.g., DOI) to each digital object in the ITR, which will make it possible to address and version the objects.
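As an illustration of what access through a standard API can look like, the following minimal sketch builds a harvesting request in the style of OAI-PMH, assuming that protocol is the OAI interface meant above. The endpoint URL and the function name are hypothetical placeholders and do not describe the actual CrossAsia service; only the `verb`, `metadataPrefix`, `set` and `from` parameters come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only (not a real CrossAsia URL).
BASE_URL = "https://example.org/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    'oai_dc' (unqualified Dublin Core) is the metadata format that every
    OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec      # optional: restrict to one set
    if from_date is not None:
        params["from"] = from_date    # optional: lower bound, e.g. "2016-01-01"
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_list_records_url(BASE_URL, from_date="2016-01-01"))
```

Because the request consists only of a URL with standard parameters, any OAI-PMH-capable harvester could in principle retrieve the metadata without knowing anything about the repository software behind it.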
To standardise and prepare the data for ingest into the ITR according to defined routines, the full-text and image-text data is extracted from its original database environments. Where possible, the data is stored in the ITR structure at a very fine level of granularity, so that it can be controlled and referenced persistently. In addition, information about usage rights is added to the metadata and object data in order to safeguard existing rights and prevent improper use. This also makes it possible to integrate content into current and future CrossAsia services and to provide data for analysis, exploration, enrichment and visualisation in the field of digital science.
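The idea of fine-grained objects that carry their own rights information can be sketched as follows. This is only a simplified illustration under our own assumptions: the class, its field names, the identifier format and the two rights labels are hypothetical and do not reflect the actual ITR data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageObject:
    """One page-level object (hypothetical structure, for illustration)."""
    object_id: str    # hypothetical persistent identifier for this page
    title_id: str     # identifier of the title the page belongs to
    page_number: int
    text: str
    rights: str       # hypothetical rights label: "open" or "licensed"


def may_serve(page: PageObject, user_is_licensed: bool) -> bool:
    """Open pages are served to everyone, licensed pages only to licensed users.

    Any unknown rights label is refused, so missing or unexpected rights
    information never leads to unintended access.
    """
    if page.rights == "open":
        return True
    return page.rights == "licensed" and user_is_licensed
```

Keeping the rights label on each object, rather than only on the collection, means that every service built on top of the repository can apply the same access check to individual pages.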
Currently, CrossAsia stores and manages more than 120,000 titles with 13 million pages in the ITR. The data is also available via a full-text search.