|
With the genomes of several organisms sequenced, one goal is
to use high-throughput approaches to elucidate the function of
genes in a global and systematic way. A typical procedure for
inferring gene function is to identify sets of genes whose expression
profiles are highly similar and then predict the function of uncharacterized
genes based on the functions of the other genes in the set. I
will discuss two different approaches for extracting reliable
functional information about genes from diverse collections of
microarray data. Each uses a different form of prior knowledge
to guide the analysis.
The first approach, called the gene recommender, identifies
new candidate genes related to a set of genes of interest. It
was inspired by collaborative filtering methods that recommend
books and movies. The method takes advantage of the idea that
most genes of related function tend to be co-regulated in only
a subset of the experiments represented in the database. By implementing
a simple feature-selection method and then building a cluster
around the input set using only the selected experiments, the
method successfully identified new genes likely to be involved
in a tumorigenesis pathway.
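
The two steps described above (select the experiments in which the input genes appear co-regulated, then rank other genes using only those experiments) can be sketched roughly as follows. This is a minimal illustration, not the scoring used in the talk: the experiment score (absolute mean divided by spread of the query genes), the use of Pearson correlation against the query centroid, the function names, and the toy data are all assumptions made for the example. Values are assumed to be already normalized per gene.

```python
from statistics import mean, pstdev

def select_experiments(expr, query, top_k):
    """Pick the top_k experiments where the query genes move together strongly.

    Assumed heuristic: a large |mean| with a small spread across the query
    genes suggests they are co-regulated in that experiment.
    """
    n_experiments = len(next(iter(expr.values())))
    scores = []
    for j in range(n_experiments):
        values = [expr[g][j] for g in query]
        score = abs(mean(values)) / (pstdev(values) + 1e-9)
        scores.append((score, j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:top_k])

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def recommend(expr, query, top_k):
    """Rank non-query genes by similarity to the query centroid,
    measured only over the selected experiments."""
    selected = select_experiments(expr, query, top_k)
    centroid = [mean(expr[g][j] for g in query) for j in selected]
    ranked = []
    for gene, profile in expr.items():
        if gene in query:
            continue
        r = pearson([profile[j] for j in selected], centroid)
        ranked.append((r, gene))
    ranked.sort(reverse=True)
    return [(gene, r) for r, gene in ranked]

# Toy data (hypothetical): rows are genes, columns are five experiments.
expr = {
    "q1":    [2.0, 1.8, 0.1, -0.3, -2.1],
    "q2":    [2.2, 1.5, -0.4, 0.5, -1.9],
    "cand":  [1.9, 1.7, 3.0, -3.0, -2.0],
    "noise": [-0.5, 0.4, 0.2, 0.1, 0.3],
}
query = {"q1", "q2"}
selected = select_experiments(expr, query, top_k=3)
ranking = recommend(expr, query, top_k=3)
```

In this toy example the candidate gene disagrees with the query genes in the two experiments where the query genes are themselves uninformative, so restricting the comparison to the selected experiments is what lets it rank first.
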
The second approach uses microarray data from multiple organisms
to infer gene function. The most reliable gene-expression relationships
may be those detectable in multiple organisms. Not only have such
relationships been reproduced in independent experiments, they
have also survived evolutionary selective pressures. I will present
a method that identifies these relationships from DNA microarray
data and then uses them to predict gene function. The results
suggest that combining data across multiple species improves the
functional annotation of conserved genes compared with using data
from any single species alone.
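
The idea of keeping only expression relationships that recur in multiple organisms, and then using them to annotate genes, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method presented in the talk: it assumes genes have already been mapped to shared ortholog-group identifiers (the `OG…` names are hypothetical), uses a fixed Pearson correlation threshold, and predicts function by a simple majority vote over annotated neighbors.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def conserved_links(datasets, threshold=0.8, min_species=2):
    """Return gene pairs whose correlation reaches the threshold
    in at least min_species organisms."""
    support = Counter()
    for expr in datasets.values():
        for a, b in combinations(sorted(expr), 2):
            if pearson(expr[a], expr[b]) >= threshold:
                support[(a, b)] += 1
    return {pair for pair, n in support.items() if n >= min_species}

def predict_function(gene, links, annotations):
    """Guilt by association: most common annotation among linked partners,
    or None if no annotated partner exists."""
    votes = Counter()
    for a, b in links:
        if gene in (a, b):
            partner = b if a == gene else a
            if partner in annotations:
                votes[annotations[partner]] += 1
    return votes.most_common(1)[0][0] if votes else None

# Toy data (hypothetical): each organism has its own set of experiments.
datasets = {
    "yeast": {
        "OG1": [1.0, 2.0, 3.0, 4.0],
        "OG2": [1.1, 2.0, 3.2, 3.9],
        "OG3": [4.0, 1.0, 3.0, 2.0],
    },
    "worm": {
        "OG1": [0.5, 1.5, 1.0, 2.0, 3.0],
        "OG2": [0.4, 1.6, 1.1, 2.2, 2.9],
        "OG3": [0.5, 1.4, 1.1, 2.1, 3.1],
    },
}
annotations = {"OG1": "ribosome biogenesis"}  # hypothetical known function

links = conserved_links(datasets)
prediction_og2 = predict_function("OG2", links, annotations)
prediction_og3 = predict_function("OG3", links, annotations)
```

Here OG3 tracks OG1 in one organism only, so the link is discarded and no function is predicted for it, whereas the OG1-OG2 relationship holds in both organisms and carries the annotation over.
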
Tuesday, February 18, 2003
Moore School Bldg. - Room #216
3:00 - 4:30 p.m.
|