Maize Database Implementation--
Unveiling at Annual Meeting


Mary Berlyn, Research Scientist
Dept. of Biology and School of Forestry and Environmental Studies
Yale University
Stan Letovsky, President
Letovsky Associates, New Haven, CT

Maize geneticists had an opportunity to use the Sun SparcStation Sybase implementation of the Maize DB (Database) at their annual maize genetics meeting in Asilomar, CA, last March.

The scientists queried for stocks with specific genotypic combinations, map diagrams of gene and RFLP marker positions or cytogenetic markers, phenotypic traits associated with specific mutations, genes producing specific gene products or product classes, and the converse of such queries.

Interest in the database was high. Scientists from both academia and industry expressed satisfaction and support for the project, which is headed by USDA Agricultural Research Service Geneticist Ed Coe at the University of Missouri.

The demonstration allowed participants to query with the entire suite of first-phase, fully functioning software, with comprehensive data entry in many areas--including a wide range of Maize Genetics Cooperation Stock Center and other stocks, genes, gene products, and mutations--and with more limited entries of mapping data.

Database Development

The demonstration culminated a year of work that began, symmetrically enough, with the first meeting of the Maize Database Advisory Group, recruited and convened by Ed Coe at the 1991 Annual Maize Genetics Meeting in Delavan, WI. This group subsequently met to recommend priorities and requirements and to provide general advice for the development of the database.

A University of Missouri working group--Ed, Denis Hancock, Mary Polacco, and Marty Sachs--periodically met with Mary Berlyn, Yale University, and Stan Letovsky, Letovsky Associates, to further develop design specifications. After further work on the design, completed by Stan and Mary in September, the specifications were converted into database software by Stan. The Maize DB implementation in Sybase was delivered to the Columbia, MO, group in November 1991. Heavy-duty data entry in Columbia and New Haven and small modifications to the software followed.

At the March maize genetics meeting, Denis and Stan installed the database on a workstation in the poster hall (and an on-line vt100 version ran on a portable computer connected to the server in Columbia). The database developers then all vied to show their favorite features of Maize DB to interested participants.

Contents

Some of the contents and relationships within the database are indicated in the following statements. (Capitalized nouns indicate major objects in the database.)

Query Mode

The interface to the Maize DB is form-based. A query is formulated by placing the desired characteristic (e.g., phenotypic trait, mutation type, combination of specific mutants, or endpoint coordinates) into the appropriate field on the form for the object that the user wishes to retrieve. For example, to retrieve a strain with a mutation in each of the two orange pericarp genes orp1 and orp2, the user enters query mode on the Strain Form and places orp1 and orp2 in the Mutation list field.

To obtain a list of all genes between genetic coordinates 0 and 40 on Chromosome 1, the user enters query mode on Sites and enters 20 +/- 20 in the Coordinate fields and Gene in the Type field. A Menu button converts the user's specification into the corresponding Sybase query and returns a list of all objects (strains in the first example, genes in the second) that satisfy those constraints. Menu options then allow the user to examine selected strains or genes in detail and to draw a map of the genes.
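The translation from a filled-in form to a database query can be illustrated with a minimal sketch. It is not the Maize DB software: it uses Python's built-in sqlite3 module in place of Sybase, and the table names (site, strain_mutation), column names, strain names (S1 to S3), and gene names (geneA to geneD) are all hypothetical placeholders. Only orp1, orp2, and the "20 +/- 20" entry come from the examples above.

```python
import sqlite3


def parse_coordinate(spec):
    """Turn a 'center +/- tolerance' entry such as '20 +/- 20' into (low, high)."""
    center, tolerance = (float(part) for part in spec.split("+/-"))
    return center - tolerance, center + tolerance


def sites_query(chromosome, coordinate, site_type):
    """Build a parameterized query for sites of one type within a coordinate range."""
    low, high = parse_coordinate(coordinate)
    sql = (
        "SELECT name FROM site "
        "WHERE chromosome = ? AND type = ? AND coordinate BETWEEN ? AND ? "
        "ORDER BY coordinate"
    )
    return sql, (chromosome, site_type, low, high)


def strains_query(mutations):
    """Build a query for strains that carry every mutation in the list."""
    placeholders = ", ".join("?" for _ in mutations)
    sql = (
        "SELECT strain FROM strain_mutation "
        f"WHERE mutation IN ({placeholders}) "
        "GROUP BY strain HAVING COUNT(DISTINCT mutation) = ? "
        "ORDER BY strain"
    )
    return sql, (*mutations, len(mutations))


def demo():
    """Run both example queries against a tiny, made-up data set."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE site (name TEXT, type TEXT, chromosome INTEGER, coordinate REAL);
        CREATE TABLE strain_mutation (strain TEXT, mutation TEXT);
        INSERT INTO site VALUES
            ('geneA', 'Gene', 1, 12.0), ('geneB', 'Gene', 1, 38.5),
            ('geneC', 'Gene', 1, 77.0), ('probeX', 'Probe', 1, 20.0),
            ('geneD', 'Gene', 2, 15.0);
        INSERT INTO strain_mutation VALUES
            ('S1', 'orp1'), ('S1', 'orp2'), ('S2', 'orp1'), ('S3', 'orp2');
        """
    )
    genes = [row[0] for row in db.execute(*sites_query(1, "20 +/- 20", "Gene"))]
    strains = [row[0] for row in db.execute(*strains_query(["orp1", "orp2"]))]
    db.close()
    return genes, strains
```

In this sketch the "20 +/- 20" entry becomes the range 0 to 40, and the Mutation list becomes a grouped query that keeps only strains matching every listed mutation. How the actual Sybase queries were phrased is not described here.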

Rapid Travel

The interface also provides for rapid travel between forms to get detailed information about components of the description. For example, in examining a strain, the user may wish to see all information and references relating to one of the mutations. Pointing to that field and pushing a menu button presents the Mutation form for that mutation entry. The user can then either return to the original form or further expand the query by pointing to the Gene field to find the gene's location on the chromosome, and perhaps be surprised to learn that the gene with this well-defined morphological trait in fact codes for a subunit of the amino acid biosynthetic enzyme tryptophan synthetase.

Another user may have started a query by asking for a strain with a mutation affecting tryptophan biosynthesis, and a third by specifying and selecting from all mutations affecting pericarp color. They may end up with the same set of information, and may even ultimately choose the same Stock, but approach it from different perspectives and travel different routes through the database. In this manner, a wide variety of information at many levels of genetic analysis is available through paths determined by the user.
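The idea that different starting points lead to the same Stock can be sketched with a few linked records. The structure below is an illustrative assumption, not the Maize DB schema: the dictionaries, field names, strain names, and the second mutation and gene (mut-x, gene-x) are hypothetical, and the product and pathway entries simply echo the tryptophan synthetase example above.

```python
# Hypothetical records standing in for the Strain, Mutation, and Gene forms.
STRAINS = {
    "S1": {"mutations": ["orp1-ref", "orp2-ref"]},
    "S2": {"mutations": ["mut-x"]},
}
MUTATIONS = {
    "orp1-ref": {"gene": "orp1", "trait": "orange pericarp"},
    "orp2-ref": {"gene": "orp2", "trait": "orange pericarp"},
    "mut-x": {"gene": "gene-x", "trait": "other trait"},
}
GENES = {
    "orp1": {"product": "tryptophan synthetase subunit",
             "pathway": "tryptophan biosynthesis"},
    "orp2": {"product": "tryptophan synthetase subunit",
             "pathway": "tryptophan biosynthesis"},
    "gene-x": {"product": "unknown", "pathway": "other pathway"},
}


def strains_by_trait(trait):
    """Route 1: start from a phenotypic trait and collect matching strains."""
    return sorted(
        name for name, record in STRAINS.items()
        if any(MUTATIONS[m]["trait"] == trait for m in record["mutations"])
    )


def strains_by_pathway(pathway):
    """Route 2: start from a biosynthetic pathway and collect matching strains."""
    return sorted(
        name for name, record in STRAINS.items()
        if any(GENES[MUTATIONS[m]["gene"]]["pathway"] == pathway
               for m in record["mutations"])
    )


def travel(strain):
    """Follow Strain -> Mutation -> Gene links, one hop per form."""
    return [
        (m, MUTATIONS[m]["gene"], GENES[MUTATIONS[m]["gene"]]["product"])
        for m in STRAINS[strain]["mutations"]
    ]
```

Here strains_by_trait("orange pericarp") and strains_by_pathway("tryptophan biosynthesis") both arrive at the same strain, and travel("S1") walks the links that a user would follow by pointing to a field and pressing a menu button.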

Comprehensive data entry continues; modifications and extensions of the software are underway. This phase of development will emphasize extensions in the analysis and storage of both classical and RFLP mapping data.