THE
PROTEIN CHEMISTRY GROUP
Extended Phylogenetic Patterns Search (EPPS)
Comparative genomics led to the definition of
clusters of orthologous groups of proteins (COGs) by comparing protein sequences
encoded in (currently 66) complete genomes
(Tatusov et al., 1997;
Tatusov et al., 2001).
Each COG consists of individual proteins or groups of orthologs from at
least 3 lineages and thus corresponds to a conserved domain. The COG database
can be analyzed by extended phylogenetic patterns search (EPPS),
which is an extended version of phylogenetic patterns search located at
http://www.ncbi.nlm.nih.gov/COG/old/phylox.html.
EPPS provides a means for finding COGs that contain or exclude selected
organisms. To do so, define a pattern over the (currently 66) organisms by assigning
each organism one of the following three settings,
clicking the corresponding green, red or grey box to the left of
that organism:
Green box = Yes: the COG must contain this organism
Red box = No: the COG must not contain this organism
Grey box = Ignored: this organism is totally ignored by EPPS
Clicking Y, N or I
at the top of a column applies the
corresponding selection to all organisms in that column. The total numbers of
organisms selected as Yes and as No are shown automatically under
Selected. The accuracy of the analysis can then be specified separately for (i) the
organisms that must and (ii) the organisms that must not be included in the
output COGs: the maximum number of permitted exceptions
to an exact match is set for each case under
Exceptions for 'Yes' and
Exceptions for 'No'. After clicking Start Search,
the subset of COGs that fits the defined pattern within the specified number of exceptions is listed at the bottom
of the search page.
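
The matching rule described above can be illustrated with a minimal Python sketch. It is not the EPPS implementation. It assumes that each COG is represented as a set of organism labels and that an exception is either a Yes organism missing from the COG or a No organism present in it. The function names, COG identifiers and organism labels are hypothetical.

```python
from typing import Dict, List, Set


def matches_pattern(
    cog_organisms: Set[str],
    yes: Set[str],
    no: Set[str],
    max_yes_exceptions: int = 0,
    max_no_exceptions: int = 0,
) -> bool:
    """Return True if a COG fits the Yes/No pattern within the allowed exceptions.

    Organisms in neither `yes` nor `no` are treated as Ignored.
    """
    # Yes organisms that the COG fails to contain.
    missing_yes = len(yes - cog_organisms)
    # No organisms that the COG contains anyway.
    present_no = len(no & cog_organisms)
    return missing_yes <= max_yes_exceptions and present_no <= max_no_exceptions


def search_cogs(
    cogs: Dict[str, Set[str]],
    yes: Set[str],
    no: Set[str],
    max_yes_exceptions: int = 0,
    max_no_exceptions: int = 0,
) -> List[str]:
    """Return the identifiers of all COGs that fit the pattern."""
    return [
        cog_id
        for cog_id, organisms in cogs.items()
        if matches_pattern(organisms, yes, no, max_yes_exceptions, max_no_exceptions)
    ]


# Hypothetical toy data: three COGs over five organisms.
cogs = {
    "COG_A": {"org1", "org2", "org3"},
    "COG_B": {"org1", "org3", "org4"},
    "COG_C": {"org2", "org4", "org5"},
}
yes = {"org1", "org2"}  # green boxes
no = {"org4"}           # red box; org3 and org5 are Ignored

exact = search_cogs(cogs, yes, no)
relaxed = search_cogs(cogs, yes, no, max_yes_exceptions=1, max_no_exceptions=1)
print(exact)    # ['COG_A']
print(relaxed)  # ['COG_A', 'COG_B', 'COG_C']
```

With both exception limits at zero only exact matches are returned. Raising either limit admits COGs that deviate from the pattern by at most that many organisms, counted separately for the Yes and the No selections.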