

1.3. Data repository and showcase for NGFN-2.

Responsible: Prof. Dr. Walter Mewes, GSF, Munich.

Background:

The aim of this subproject is to establish a repository for diverse data types, such as expression arrays and protein/protein interactions, that will serve both as an information resource for the NGFN and as an open Web portal representing NGFN-2 to the scientific community. SMP bioinformatics plans to implement a two-layered data repository design, with a "local instance" hosting data of interest only to local researchers and a "central repository" for data of broader interest. This subproject deals with the implementation of the central repository and will thus give the scientific community access to NGFN data, while also providing a comprehensive view for comparative data analysis.

Planned work:

This subproject comprises the following five goals:

1) Array data repository (MIPSexpress technology): Here we plan to implement and maintain a central repository compliant with the current MIAME standards (MGED). In addition, bidirectional data exchange via various file formats, including the recent OMG standard MAGE-ML, will be provided. A special focus of this repository will be the in-depth integration of SOP management structures, lab-based procedure information, and technical parameter sets. Data provided to the central repository can be compared with datasets from independent or related groups with respect to bio-experimental annotation, data quality, and the comparative statistics of the behaviour of individual genes or gene sets. Tools for the statistical analysis of data with respect to their functional annotation will be integrated as applications within the portal of the repository. Finally, we also aim to provide a standardized and user-transparent data transport interface to both local centers and central repositories outside Germany.

2) Data resource for NGFN data and results (NGFN Matrix): Here we will implement a data retrieval layer that allows users to integrate array-technology data with various kinds of gene-oriented annotation. In particular, information on the functional links between genes, as obtained from functional ontologies such as the "MIPS Functional Catalogue" or GO, will substantially enhance the significance of individual signals and will be of great value for the interpretation of array-technology data.

3) Information system: Here we will place major emphasis on relating the primary experimental data to the information available from the functional annotation of mammalian genomes, such as protein/protein interactions, regulatory and metabolic networks, and other functional information. Sets of interacting genes will be generated from the annotation by appropriate analytical tools and made accessible to user-defined applications through standardized transport layers such as BioMoby, HOBIT or BioBus.

4) Standardization: The members of the work package will participate actively in standardization initiatives and will facilitate the implementation of the resulting standards in the experimental groups of NGFN as well as in software products.

5) Local database support: Here we plan to support the integration of local information management systems with standardized interfaces (e.g. iCHIP) and to provide our expertise in technical integration and application interfacing.