GFF (General Feature Format) lines are based on the Sanger
GFF2 specification. GFF lines have nine required fields that must be
tab-separated. If the fields are separated by spaces instead of tabs, the track will not
display correctly. For more information on GFF format, refer to Sanger's
GFF page.
Note that there is also a GFF3 specification that is not currently supported by the Browser.
All GFF tracks must be formatted according to Sanger's GFF2 specification.
If you would like to obtain browser data in GFF (GTF) format, please refer to
Genes in gtf or gff format on the Wiki.
Here is a brief description of the GFF fields:
- seqname - The name of the sequence. Must be a chromosome or scaffold.
- source - The program that generated this feature.
- feature - The name of this type of feature. Some examples of
standard feature types are "CDS", "start_codon", "stop_codon", and
"exon".
- start - The starting position of the feature in the sequence. The first base is numbered 1.
- end - The ending position of the feature (inclusive).
- score - A score between 0 and 1000. If the track line
useScore attribute is set to 1 for this annotation data set, the
score value will determine the level of gray in which
this feature is displayed (higher numbers = darker gray). If there is no
score value, enter ".".
- strand - Valid entries include '+', '-', or '.' (for don't know/don't care).
- frame - If the feature is a coding exon, frame should be a number from 0 to 2 that represents the reading frame of the
first base. If the feature is not a coding exon, the value should be '.'.
- group - All lines with the same group are linked together into a single item.
Example:
browser position chr22:10000000-10025000
browser hide all
track name=regulatory description="TeleGene(tm) Regulatory Regions" visibility=2
chr22 TeleGene enhancer 10000000 10001000 500 + . touch1
chr22 TeleGene promoter 10010000 1001010
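The nine-field layout described above can be checked before a track is uploaded. The following Python sketch is only an illustration, not part of the Browser: the names GffFeature, parse_gff_line, and feature_length are hypothetical, and rejecting scores outside 0 to 1000 is an assumption drawn from the field description. It splits a line on tabs only, so a line whose fields are separated by spaces is rejected.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    """One GFF data line (hypothetical container, for illustration only)."""
    seqname: str
    source: str
    feature: str
    start: int              # 1-based position of the first base
    end: int                # inclusive end position
    score: Optional[float]  # None when the field is "."
    strand: str             # "+", "-", or "."
    frame: Optional[int]    # 0, 1, 2, or None when the field is "."
    group: str


def parse_gff_line(line: str) -> GffFeature:
    """Split one GFF line on tabs and validate the nine fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, group = fields

    if strand not in ("+", "-", "."):
        raise ValueError(f"invalid strand: {strand!r}")
    if frame not in ("0", "1", "2", "."):
        raise ValueError(f"invalid frame: {frame!r}")

    parsed_score = None if score == "." else float(score)
    # Assumption: scores outside the documented 0-1000 range are an error.
    if parsed_score is not None and not 0 <= parsed_score <= 1000:
        raise ValueError(f"score out of range 0-1000: {score}")

    return GffFeature(seqname, source, feature, int(start), int(end),
                      parsed_score, strand,
                      None if frame == "." else int(frame), group)


def feature_length(f: GffFeature) -> int:
    """Length in bases; start is 1-based and end is inclusive."""
    return f.end - f.start + 1


example = "chr22\tTeleGene\tenhancer\t10000000\t10001000\t500\t+\t.\ttouch1"
print(feature_length(parse_gff_line(example)))  # prints 1001
```

Because start is 1-based and end is inclusive, the length is end - start + 1, which is why the enhancer in the example spans 1001 bases rather than 1000.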
GTF format
GTF (Gene Transfer Format) is a refinement of GFF that tightens the specification.
The first eight GTF fields are the same as GFF. The group field has been
expanded into a list of attributes. Each attribute consists of a type/value pair. Each attribute
must end in a semicolon and be separated from any following attribute by exactly one space.
The attribute list must begin with the two mandatory attributes:
- gene_id value - A globally unique identifier for the genomic
source of the sequence.
- transcript_id value - A globally unique identifier for the
predicted transcript.
Example: Here is an example of the ninth field in a GTF data line:
gene_id "Em:U62317.C22.6.mRNA"; transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1
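As a rough illustration of how the attribute list can be read, the Python sketch below turns the ninth field into a dictionary of type/value pairs. The function names parse_gtf_attributes and has_mandatory_attributes are hypothetical, and the regular expression is an assumption that is deliberately more lenient than the rules above: it accepts quoted or unquoted values and tolerates a missing final semicolon, as in the example line.

```python
import re

# A type, whitespace, then either a double-quoted value or a bare token,
# optionally followed by a semicolon.
_ATTR_RE = re.compile(r'\s*(\S+)\s+("([^"]*)"|[^;"\s]+)\s*;?')


def parse_gtf_attributes(field: str) -> dict:
    """Return the type/value pairs of a GTF attribute list, in file order."""
    attrs = {}
    for match in _ATTR_RE.finditer(field):
        key = match.group(1)
        quoted = match.group(3)
        # Use the text inside the quotes when present, else the bare token.
        attrs[key] = quoted if quoted is not None else match.group(2)
    return attrs


def has_mandatory_attributes(attrs: dict) -> bool:
    """Check that the list begins with gene_id followed by transcript_id."""
    return list(attrs)[:2] == ["gene_id", "transcript_id"]


ninth_field = ('gene_id "Em:U62317.C22.6.mRNA"; '
               'transcript_id "Em:U62317.C22.6.mRNA"; exon_number 1')
attrs = parse_gtf_attributes(ninth_field)
print(attrs["transcript_id"])           # prints Em:U62317.C22.6.mRNA
print(has_mandatory_attributes(attrs))  # prints True
```

Values are kept as strings; converting exon_number to an integer is left to the caller.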
The Genome Browser groups together GTF lines that have the same
transcript_id value. It only looks at features of type exon and
CDS.
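The grouping rule can be pictured with a short Python sketch. It is a simplification under stated assumptions, not the Browser's actual code: the function name group_by_transcript is hypothetical, lines without nine tab-separated fields are skipped silently, and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict

_TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]*)"')


def group_by_transcript(lines):
    """Collect (feature, start, end) tuples per transcript_id value.

    Only features of type exon and CDS are kept; all other feature
    types are ignored.
    """
    transcripts = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 9:
            continue  # not a data line (for example, a track line)
        feature, start, end, attributes = fields[2], fields[3], fields[4], fields[8]
        if feature not in ("exon", "CDS"):
            continue
        match = _TRANSCRIPT_RE.search(attributes)
        if match:
            transcripts[match.group(1)].append((feature, int(start), int(end)))
    return dict(transcripts)


# Made-up sample lines, for illustration only.
sample = [
    'chr22\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr22\tsrc\tCDS\t120\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr22\tsrc\tstart_codon\t120\t122\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr22\tsrc\texon\t500\t600\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
for transcript_id, features in group_by_transcript(sample).items():
    print(transcript_id, features)
```

In this sample the start_codon line is dropped, so transcript t1 ends up with one exon and one CDS entry, and t2 with a single exon.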
For more information on this format, see
http://mblab./GTF2.html.
If you would like to obtain browser data in GTF format, please refer to
Genes in gtf or gff format on the Wiki.