Format of Input and Output Sequence Files
1. List of orthologous genes and their locations in 26 conserved operons
This table lists predicted operons in Geobacter sulfurreducens with links to operon
sequence data. The names and locations of open reading frames predicted to be
within the operon are provided. Also listed are GenBank GI identification
numbers for G. sulfurreducens. For each open reading frame in G. sulfurreducens,
links are also provided to the sequences of its putative orthologs
in Geobacter metallireducens and Desulfovibrio vulgaris, along
with the E-values for their Blastp similarity to the ORF in G.
sulfurreducens.
2. List of files with sequence data for 26 operons
This table
provides links to sequence files for each operon, AlignACE input files, and
AlignACE output files.
The name of
each file with sequence data for individual operons is OperonNumber_SpeciesName,
where
OperonNumber is the operon name (see List
of orthologous genes and their locations in 26 conserved operons) assigned
by FGENESB (Solovyev and Salamov, unpublished).
SpeciesName is gsul for Geobacter
sulfurreducens, gmet for Geobacter metallireducens and dvul
for Desulfovibrio vulgaris.
Examples:
1256_gsul contains sequence data for operon 1256 in
G. sulfurreducens;
1256_gmet contains sequence data for operon 1256 in
G. metallireducens;
1256_dvul contains sequence data for operon 1256 in
D. vulgaris.
The sequence
files for individual operons are in FASTA format.
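The naming convention above can be split back into its two parts programmatically. The following is a minimal sketch, not part of the original data set: the function name `parse_operon_file_name` and the `SPECIES` table are hypothetical, and it assumes the species code is always the text after the last underscore.

```python
# Species codes as described in the text above.
SPECIES = {
    "gsul": "Geobacter sulfurreducens",
    "gmet": "Geobacter metallireducens",
    "dvul": "Desulfovibrio vulgaris",
}


def parse_operon_file_name(name):
    """Split 'OperonNumber_SpeciesName' into (operon name, full species name).

    Hypothetical helper: assumes the species code follows the last underscore.
    """
    operon, sep, code = name.rpartition("_")
    if not sep or not operon or code not in SPECIES:
        raise ValueError(f"not an OperonNumber_SpeciesName file name: {name!r}")
    return operon, SPECIES[code]


# Example with one of the file names listed above.
print(parse_operon_file_name("1256_gsul"))
```

Names that do not follow the convention (for example, an unknown species code) raise `ValueError` rather than being guessed at.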
3. Input sequence files for AlignACE
The files
are of the type *.in, where * is
the operon name.
Examples:
2.in is the AlignACE input file for operon 2;
17.in is the AlignACE input file for operon 17.
All
sequences in AlignACE input files are in FASTA format.
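Because the input files are plain FASTA, they can be read with a few lines of code. The sketch below is illustrative only: `read_fasta` is a hypothetical helper, the two records are made-up data rather than content from any file on this site, and it assumes the usual FASTA layout of a `>` header line followed by one or more sequence lines.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Made-up records for illustration; not taken from a real AlignACE input file.
example = io.StringIO(">seq1 hypothetical upstream region\nACGT\nTTGA\n>seq2\nGGCC\n")
records = list(read_fasta(example))
for header, sequence in records:
    print(header, len(sequence))
```

To read an actual input file, pass an open file object instead of the in-memory example, such as the result of `open("2.in")`.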
AlignACE was
developed by Roth et al. (Roth, F. P., Hughes, J. D., Estep, P. W., Church, G.
M. 1998. Finding DNA regulatory motifs within unaligned noncoding sequences
clustered by whole-genome mRNA quantitation. Nature Biotechnol. 16, 939-945).
4. Motifs predicted by comparative genomics analysis (AlignACE output files)
The files
are of the type *.out, where * is the operon name.
Listed are
motifs predicted for the noncoding regions of the 26 conserved operons.
Examples of
files:
2.out is the file with predicted motif
sequences for operon 2;
17.out is the file with predicted motif
sequences for operon 17.
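Since input and output files share the operon name and differ only in extension, the two can be matched up by name. The following is a rough sketch under that assumption; `pair_alignace_files` and the example listing are hypothetical and only use the file names mentioned on this page.

```python
from pathlib import PurePath


def pair_alignace_files(names):
    """Group AlignACE file names by operon name: {operon: {"in": ..., "out": ...}}.

    Hypothetical helper: files without a .in or .out extension are ignored.
    """
    pairs = {}
    for name in names:
        path = PurePath(name)
        kind = path.suffix.lstrip(".")
        if kind in ("in", "out"):
            pairs.setdefault(path.stem, {})[kind] = name
    return pairs


# Illustrative listing built from the example names used on this page.
listing = ["2.in", "2.out", "17.in", "17.out", "1256_gsul"]
pairs = pair_alignace_files(listing)
for operon, files in pairs.items():
    print(operon, files.get("in"), files.get("out"))
```

An operon with an input file but no output file simply appears with only the `"in"` key, which makes missing results easy to spot.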
Output files provided here are in the AlignACE output format (Roth et al. 1998). These files were used in further analyses to identify likely biologically significant motifs. The AlignACE output format is described in detail in the help file of the George M. Church Laboratory Analysis Software for mRNA Abundance Data Web site.