German Mouse Clinic – Data Management

Introduction
The German Mouse Clinic (GMC) is a platform for the systematic, standardized and comprehensive phenotyping of mutant mice and their littermate controls. The GMC offers a primary screening procedure that covers more than 240 parameters. In addition, more in-depth secondary and tertiary screens are offered for the examination of further parameters.
Compared to similar facilities, one of the major advantages of the GMC concept is that 14 screens are housed next to each other in one building (modular design, each screen's labs with an adjacent mouse room) (1). To gain a maximum of phenotypic result data from a minimum number of animals, the examinations are performed sequentially on the same animals, and their chronology is arranged in a standardized workflow.
In general, between 4,000 and 5,000 mice live in the GMC at any one time. Since the GMC started, around 35,000 mice have been housed in it. The annual phenotyping capacity is designed for 1,500 mice in the primary screen and an additional 1,500 mice in the secondary and tertiary screens. Around 50 scientists, technical assistants and animal caretakers work inside the GMC.
Based on our long-standing experience with the Munich ENU mutagenesis project (2), a central IT system can be regarded as a vital prerequisite for a project of this size. When the GMC project started in 2001, we searched for commercially available software to form the central module of the new GMC IT system. Such software was expected to provide advanced management of all kinds of animal and facility data and to support breeding. Users should be able to plan their experiments in advance and to monitor the status of each experiment. Work lists and cage cards should be printable, and the upload and download of phenotype data should be supported. No such software was available at that time, and we had neither the manpower nor the time to build new software from scratch.
Generally speaking, the basic concepts of the German Mouse Clinic and of a diagnostic clinic for humans do not differ that much. We therefore decided to license a clinical information system from a local software development company and to enhance and adapt its functionality to the specific needs of a phenotyping unit for rodents. A graphical user interface was to enable the users to interactively enter and retrieve all kinds of data generated inside the GMC.

Project Status
The GMC IT system now consists of several modules. The software was adapted to the GMC's needs within the first 2.5 years, starting in January 2002. It allows the interactive input of animal- and facility-based data: matings, litters and weanings can be entered and processed. Full genealogical trees, including mouse core data and phenotyping results, can be shown in a genealogy browser and exported. When a batch of new mice is imported into the GMC from other mouse facilities, the basic data of these mice (such as birth date, gender, genetic background, ear tag and coat colour) can easily be imported into the GMC database via a standardized Excel interface. Mice can be assigned to groups for any purpose, e.g. to facilitate the workflow management of these mice or the export of result data to Excel files.

In the primary screen, mice move through the GMC and are examined sequentially in up to 12 of the 14 screens (some experiments are dead-end experiments). A comprehensive workflow management has been established for the coordinated characterization of batches of mutant and control mice. Experiments can be planned weeks in advance to support the users in arranging their phenotyping slots. Work lists can be printed out for the daily work in the lab. Color-coded cage cards are used to describe the cage mates.
Some analysers in the GMC record raw data and export them to databases or flat files, while others have no interface for exporting data to files. Results stored in flat files or transferred via interface devices are parsed and checked for input errors by a variety of scripts. Raw data are not stored in the database. The final validation of the data is performed by the scientists at the level of Excel files. Examination parameters are mostly grouped into parameter sets of up to 50 parameters. VBA scripts extract the results from individually formatted files and pass them to central data interfaces (e*Gate Integration Suite). Before the results are inserted into the GMC database, they are automatically checked for plausibility and referential integrity. Currently, approx. 425,000 result datasets from more than 4,000 mice are stored in the database, an average of 104 parameters per mouse (minimum 1, maximum 351). Forms are available for the assisted manual input and display of the phenotype analysis results of a given mouse. Interfaces for data export to Excel files (e.g. for a detailed statistical analysis of phenotype data, or for work lists for the input and upload of result data) are also available.
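The automatic checks performed before insertion can be illustrated with a minimal Python sketch. It is not the GMC implementation, which relies on VBA scripts and the e*Gate interfaces; the parameter names, the plausibility ranges and the mouse IDs are hypothetical and serve only to show the two kinds of check (referential integrity and plausibility).

```python
from dataclasses import dataclass

# Hypothetical plausibility ranges per parameter (illustrative values only).
PLAUSIBLE_RANGES = {
    "body_weight_g": (5.0, 80.0),
    "body_temp_c": (30.0, 42.0),
}


@dataclass
class Result:
    mouse_id: str
    parameter: str
    value: float


def check_results(results, known_mouse_ids):
    """Split results into accepted rows and (row, reason) rejections."""
    accepted, rejected = [], []
    for r in results:
        if r.mouse_id not in known_mouse_ids:
            # Referential integrity: the mouse must already exist.
            rejected.append((r, "unknown mouse"))
        elif r.parameter not in PLAUSIBLE_RANGES:
            # Referential integrity: the parameter must be defined.
            rejected.append((r, "unknown parameter"))
        else:
            low, high = PLAUSIBLE_RANGES[r.parameter]
            if low <= r.value <= high:
                accepted.append(r)
            else:
                # Plausibility: the value must lie in the expected range.
                rejected.append((r, "value out of range"))
    return accepted, rejected
```

Rejected rows are returned together with a reason so that they could be handed back to the screener for correction instead of being dropped silently.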
The results of the examinations in the primary screens are accessible to all GMC screeners. The results can therefore be viewed from different scientific perspectives, which allows a more comprehensive assessment of the phenotypes, their possible causes and their relations to each other.
Screeners can easily create individual ad-hoc queries to the database using the tool “Querybuilder” to retrieve information that is not directly available from the graphical user interface. This requires neither inside knowledge of the database schema (including primary key / foreign key relations) nor knowledge of the query language SQL. Almost all relevant information can be selected or used as a restrictive condition with a few mouse clicks, so users are not limited to pre-defined queries. Even aggregate functions, sorting and complex conditions (logical AND and OR in one query) are available. The corresponding SQL command is generated automatically and executed either directly or after manual modification. The result set can be exported to an Excel file or processed further within the software.
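How a tool of this kind can turn user selections into an SQL command is sketched below in Python. This is a simplified illustration, not the Querybuilder itself: the function name, the table and the column names are hypothetical, and the sketch covers only AND-combined conditions and sorting, not OR or aggregate functions.

```python
def build_query(table, columns, conditions=(), order_by=None):
    """Return (sql, params) for a SELECT with AND-combined conditions.

    conditions is a sequence of (column, operator, value) triples.
    Identifiers are validated; values are passed as bound parameters.
    """
    allowed_ops = {"=", "<>", "<", ">", "<=", ">="}

    def ident(name):
        # Accept only plain identifiers so user input cannot inject SQL.
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")
        return name

    sql = f"SELECT {', '.join(ident(c) for c in columns)} FROM {ident(table)}"
    clauses, params = [], []
    for column, op, value in conditions:
        if op not in allowed_ops:
            raise ValueError(f"unsupported operator: {op!r}")
        clauses.append(f"{ident(column)} {op} ?")
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if order_by is not None:
        sql += f" ORDER BY {ident(order_by)}"
    return sql, params
```

For example, selecting the columns ear_tag and sex from a table mouse, restricted to sex = 'f' and sorted by ear_tag, yields a SELECT statement with one placeholder and the parameter list ['f'], which can be shown to the user for manual modification before execution.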

Some experiments require official approval and are limited to a certain number of animals. A tool is available that automatically calculates the number of animals housed in the GMC in a given period of time, grouped by particular criteria (e.g. genetically modified animals or animals in an experimental state). This will facilitate the calculation and increase the accuracy of the animal numbers reported to the regulatory authorities.
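The counting logic behind such a tool can be sketched in a few lines of Python. The record layout (category, entry date, exit date) and the category labels are assumptions made for this illustration; an animal is counted if its stay overlaps the reporting period at all.

```python
from collections import Counter
from datetime import date


def count_housed(animals, start, end):
    """Count animals whose stay overlaps [start, end], grouped by category.

    animals: iterable of (category, entry_date, exit_date_or_None);
    an exit date of None means the animal is still housed.
    """
    counts = Counter()
    for category, entered, left in animals:
        if entered <= end and (left is None or left >= start):
            counts[category] += 1
    return dict(counts)
```

Usage, with hypothetical data: for the first quarter of 2005, an animal that left in December 2004 or arrived in April 2005 is not counted, while one that arrived in February 2005 and is still housed is.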
Movements of mice between cages inside the facility are stored in the database. The animals are kept in individually ventilated cages (IVCs). The hygienic status of the IVC racks is monitored by soiled air/bedding sentinel mice. Using this information, it is possible to detect animals potentially affected by infectious diseases or infested with parasites. In the past, a batch of mice was affected by helminths in spite of the mandatory health monitoring of animals according to the FELASA recommendations prior to import. Using the movement and sentinel data, we could quickly identify the affected mice and prevent the helminths from spreading to other animals (3).
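As an illustration of how stored movement data can be used for this kind of tracing, the following Python sketch finds all mice that shared a cage with a given mouse during overlapping periods. The data layout and the function name are hypothetical, and the sketch covers only direct cage contacts; rack-level sentinel information is not modelled.

```python
from datetime import date


def cage_contacts(stays, index_mouse):
    """Return mice that shared a cage with index_mouse at the same time.

    stays: list of (mouse_id, cage_id, start, end) with inclusive dates.
    """
    index_stays = [s for s in stays if s[0] == index_mouse]
    contacts = set()
    for mouse, cage, start, end in stays:
        if mouse == index_mouse:
            continue
        for _, i_cage, i_start, i_end in index_stays:
            # Same cage and overlapping date intervals.
            if cage == i_cage and start <= i_end and i_start <= end:
                contacts.add(mouse)
                break
    return contacts
```

Applying the function repeatedly to each newly found contact would extend the search to indirect contacts.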

The application runs on a scalable Windows 2000 terminal server farm. It is available from any operating system on the client computers via Citrix MetaFrame XPa technology. This technology also provides a management console that enables administrators to shadow client sessions running on the server and even to take over control (if the user agrees). An administrator can thus offer quick remote assistance to the users without physically entering the hygienically sensitive mouse facility.
A relational database linked to the application is used as a central data store.

To prevent data loss from local hard disk drives, we store the files on a central file server. Recently, the hard disk storage capacity available to the members of the institute and the GMC on a central “Network Attached Storage” (NAS) RAID system has been increased to more than 1 TByte. On the NAS, differential snapshot backups are performed three times a day and full backups to tape once a week. Previous versions of files, up to one month old, can be restored easily. User- and/or group-specific access privileges can be assigned to the directories. The NAS system itself is administered by the GSF computer center. The GMC IT group sets up the directories, organisational groups and access privileges for the projects of the institute within the GSF Active Directory Domain. The NAS will be used as a high-availability system for the storage of any kind of file, even large files such as videos or images from X-ray scanners or our micro-computed tomography machine.

Management software for the GMC coordination team
The GMC coordination team, consisting of five members, has to manage a multitude of tasks associated with phenotyping requests from potential new collaboration partners. Among other things, they have to allocate phenotyping slots of the primary screen, coordinate the signing of the collaboration agreements, ask for health certificates, provide mating assistance, manage the transfer and import of the animals, organize weekly GMC meetings and result presentations, and edit examination reports. To centralize the information on current requests and to avoid missing deadlines for critical tasks, a new web-based management software has been developed in-house. The mouse providers have to fill in a detailed request form on the GMC homepage (www.mouseclinic.de) to give the information needed to decide on the request. The data are stored in a relational Sybase ASE 12.5 database and can be retrieved and edited by the GMC managers via password-protected web-based forms. After a request has been accepted, several tasks are automatically assigned to certain members of the GMC management. A few days before the deadline of a task is reached, an automatic reminder system sends an e-mail to the manager responsible for this task or, ultimately, to the whole management team. A first version of the management software will replace our current request form in August 2005.
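The selection step of such a reminder system can be sketched as follows in Python. The task fields, the lead time of three days and the rule that overdue tasks are escalated to the whole team are assumptions made for this illustration; the report does not specify them, and sending the e-mails is left out.

```python
from datetime import date, timedelta


def reminders_due(tasks, today, lead_days=3):
    """Return (task_name, recipient) pairs for open tasks that need a reminder.

    tasks: list of dicts with keys name, owner, deadline (date), done (bool).
    """
    due = []
    for task in tasks:
        if task["done"]:
            continue
        if task["deadline"] < today:
            # Assumed escalation rule: overdue tasks go to the whole team.
            due.append((task["name"], "team"))
        elif task["deadline"] - today <= timedelta(days=lead_days):
            # Deadline is close: remind the responsible manager.
            due.append((task["name"], task["owner"]))
    return due
```

A daily scheduled job could call this function with the current date and pass each returned pair to the mail system.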

Outlook
Outsourcing the development of an IT system for the management of all experiments and data in the GMC was definitely the fastest way to get an urgently needed system running. In recent years, however, we have come to see the problems of running commercial software that we cannot modify ourselves according to changing user needs, as we have full access only to the database, not to the source code of the application. We now know the exact demands of the GMC users on animal and result documentation software much better than we did at the beginning of the project, which makes it easier to develop tailor-made software.
To be more flexible and independent, we decided to use our knowledge from previous projects to establish a new software system in-house with a range of functionality similar to that of our existing system. An additional focus will be an improved integration of phenotype and genotype data. Links to public literature and phenome databases will be implemented. From this we expect a more user-friendly and focussed search for known aspects of genes and phenotypes in the literature. This might also help to accelerate the search for possible candidate genes for mutant mouse lines that have been phenotyped but not yet mapped.
The new information system will follow the idea of a so-called LAMP system (Linux server, Apache web server, MySQL database management system, Perl or PHP for the web application), although it will run on an HP-UX server cluster with a Sybase ASE 12.5 database management system. The web-based application itself will most probably be written in Perl and PHP. The new software is scheduled to go into operation at the end of Q1/2006. To migrate the database content to the new database, we will have to write a script that unloads the data, reorganizes it to fit the different database schema and then loads it into the new database.
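The three steps of such a migration script (unload, reorganize, load) are sketched below in Python. The sketch uses the sqlite3 module from the standard library only so that it is runnable as it stands; the planned system uses Sybase ASE, and both table layouts (animal in the old schema, mouse in the new one) are hypothetical.

```python
import sqlite3


def migrate(old_db, new_db):
    """Copy animals from the old schema into the new one; return the row count."""
    # Unload: read all rows from the (hypothetical) old table.
    rows = old_db.execute("SELECT id, eartag, sex FROM animal").fetchall()

    # Reorganize: trim ear tags and expand the sex codes for the new schema.
    sex_names = {"m": "male", "f": "female"}
    converted = [(i, tag.strip(), sex_names[sex]) for i, tag, sex in rows]

    # Load: insert everything in one transaction (rolled back on error).
    with new_db:
        new_db.executemany(
            "INSERT INTO mouse (mouse_id, ear_tag, sex) VALUES (?, ?, ?)",
            converted,
        )
    return len(converted)
```

Running the load step inside a single transaction means that a failed migration leaves the new database unchanged, so the script can simply be corrected and run again.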
The GMC IT group is in intensive, regular discussion with representatives of the users in order to involve them in the development process and to keep the complexity of the business logic as low as possible. The more intuitive and user-friendly the application is, the less time has to be spent on user training and support. We attach great importance to the users' willingness to add their results to the database as a basis for further data evaluation. Interfaces for the transfer of result data to statistical analysis applications will be provided.
A new tool for the visualisation of parametric phenotype results will be established to facilitate the identification of previously unknown correlations between parameters.
In the long term, we plan to make the results of the phenotype examinations available to the scientific community on the web after publishing them. Details will be discussed with other members of European consortia.

References
1. Gailus V et al. Introducing the German Mouse Clinic: open access platform for standardized phenotyping. Nature Methods 2005; 2(6):403-404.
2. Pargent et al. MouseNet database: digital management of a large-scale mutagenesis project. Mamm Genome 2000; 11(7):590-593.
3. Brielmeier M et al. Microbiological monitoring of laboratory mice and biocontainment in individually ventilated cages (IVCs): a field study. Lab Animals (submitted).