Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases

A feature key is a single word or abbreviation that indicates a functional role or region associated with a sequence. The SWISS-PROT features are listed below, organized by feature type. Most features are accompanied by an example line showing the feature key, the start and end positions, and an optional description of the sequence location or region.
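As a rough illustration of how the example lines in this section are laid out, the following Python sketch splits one line into its key, start position, end position, and description. The names `Feature` and `parse_ft_line` are hypothetical. The sketch assumes whitespace-separated fields and purely numeric positions, as printed here; real flat-file entries may use a fixed-column layout and non-numeric endpoints, which it does not handle.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One feature line: key, 1-based start and end, free-text description."""
    key: str
    start: int
    end: int
    description: str


def parse_ft_line(line: str) -> Feature:
    """Parse a whitespace-separated FT line as printed in this section.

    Assumption: fields are separated by whitespace and both positions
    are plain integers.
    """
    # Split into at most five fields so the description keeps its spaces.
    fields = line.split(None, 4)
    if len(fields) < 4 or fields[0] != "FT":
        raise ValueError(f"not a feature line: {line!r}")
    key, start, end = fields[1], int(fields[2]), int(fields[3])
    description = fields[4].strip() if len(fields) > 4 else ""
    return Feature(key, start, end, description)


feature = parse_ft_line("FT CONFLICT 304 304 MISSING (IN REF. 3).")
print(feature.key, feature.start, feature.end, feature.description)
```

Lines without a description, such as the SE_CYS or HELIX examples, simply yield an empty description string.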

3.3.1 Change Indicators

CONFLICT

Different papers report differing sequences:

FT CONFLICT 304 304 MISSING (IN REF. 3).

MUTAGEN

Indicates an experimentally altered site:

FT MUTAGEN 65 65 H->F: 100% ACTIVITY LOSS.

VARIANT

Authors report that sequence variants exist:

FT VARIANT 136 136 M -> I.

VARSPLIC

Describes sequence variants produced by alternative splicing:

FT VARSPLIC 33 49 MISSING (IN SHORT ISOFORM).
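To make the VARIANT notation concrete, the sketch below applies a simple substitution description such as `M -> I` to a sequence. The function name `apply_substitution` and the toy sequence are hypothetical, and the code assumes 1-based positions and a description made up of nothing but `original -> replacement`. Descriptions with trailing comments (as in the MUTAGEN example) or the `MISSING` keyword are not handled.

```python
def apply_substitution(sequence: str, position: int, description: str) -> str:
    """Return a copy of sequence with a simple 'X -> Y' change applied.

    Assumptions: position is 1-based, and description contains only
    the original and replacement residues separated by '->'.
    """
    original, replacement = (
        part.strip() for part in description.rstrip(".").split("->")
    )
    index = position - 1
    # Check that the annotated residue(s) really occur at this position.
    if sequence[index:index + len(original)] != original:
        raise ValueError(f"expected {original!r} at position {position}")
    return sequence[:index] + replacement + sequence[index + len(original):]


# Toy sequence (not a real protein): residue 3 is M.
variant = apply_substitution("ACMKL", 3, "M -> I.")
print(variant)
```

Checking the original residue before substituting is a cheap way to catch position errors, for example when an annotation refers to a different isoform.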

3.3.2 Amino Acid Modifications

BINDING

Binding site for chemical group (co-enzyme, prosthetic group, etc.):

FT BINDING 14 14 HEME (COVALENT).

CARBOHYD

Glycosylation site:

FT CARBOHYD 53 53 N-LINKED (GLCNAC...) (POTENTIAL).

DISULFID

Disulfide bond:

FT DISULFID 23 84 PROBABLE.

LIPID

Covalent binding of a lipid moiety:

FT LIPID 2 2 MYRISTATE.

Table 3-2 lists the attached groups that are currently defined.

Table 3-2. SWISS-PROT lipid moiety attached groups

| Attached group | Description |
|---|---|
| MYRISTATE | Myristate group attached through an amide bond to the N-terminal glycine residue of the mature form of a protein or to an internal lysine residue. |
| PALMITATE | Palmitate group attached through a thioester bond to a cysteine residue or through an ester bond to a serine or threonine residue. |
| FARNESYL | Farnesyl group attached through a thioether bond to a cysteine residue. |
| GERANYL-GERANYL | Geranyl-geranyl group attached through a thioether bond to a cysteine residue. |
| GPI-ANCHOR | Glycosyl-phosphatidylinositol (GPI) group linked to the alpha-carboxyl group of the C-terminal residue of the mature form of a protein. |
| N-ACYL DIGLYCERIDE | N-terminal cysteine of the mature form of a prokaryotic lipoprotein with an amide-linked fatty acid and a glyceryl group to which two fatty acids are linked by ester linkages. |

METAL

Binding site for a metal ion:

FT METAL 28 28 COPPER (POTENTIAL).

MOD_RES

Posttranslational modification of a residue:

FT MOD_RES 686 686 PHOSPHORYLATION (BY PKC).

Table 3-3 lists the most frequent modifications.

Table 3-3. Frequently used SWISS-PROT amino acid modifications

| Modification | Description |
|---|---|
| ACETYLATION | N-terminal or other. |
| AMIDATION | Generally at the C-terminal of a mature active peptide. |
| BLOCKED | Undetermined N- or C-terminal blocking group. |
| FORMYLATION | Of the N-terminal methionine. |
| GAMMA-CARBOXYGLUTAMIC ACID | Of glutamate. |
| HYDROXYLATION | Of asparagine, aspartic acid, proline or lysine. |
| METHYLATION | Generally of lysine or arginine. |
| PHOSPHORYLATION | Of serine, threonine, tyrosine, aspartic acid or histidine. |
| PYRROLIDONE CARBOXYLIC ACID | N-terminal glutamate which has formed an internal cyclic lactam. This is also called "pyro-Glu". |
| SULFATION | Generally of tyrosine. |
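The residue constraints in Table 3-3 can serve as a simple sanity check on MOD_RES annotations. The sketch below is illustrative only: `EXPECTED_RESIDUES` and `residue_is_plausible` are hypothetical names, and only the modifications whose target residues the table states without the qualifier "generally" are encoded. Any other modification is accepted without checking.

```python
# One-letter residue codes taken from the unqualified rows of Table 3-3.
EXPECTED_RESIDUES = {
    "FORMYLATION": {"M"},
    "GAMMA-CARBOXYGLUTAMIC ACID": {"E"},
    "HYDROXYLATION": {"N", "D", "P", "K"},
    "PHOSPHORYLATION": {"S", "T", "Y", "D", "H"},
}


def residue_is_plausible(sequence: str, position: int, description: str) -> bool:
    """Check whether the residue at a 1-based position suits the modification.

    Assumption: the description starts with the modification name, as in
    'PHOSPHORYLATION (BY PKC).'.
    """
    residue = sequence[position - 1]
    for modification, allowed in EXPECTED_RESIDUES.items():
        if description.upper().startswith(modification):
            return residue in allowed
    # Modifications without a strict residue rule are not checked.
    return True


# Toy sequence (not a real protein): residue 3 is S, residue 2 is K.
print(residue_is_plausible("MKSAT", 3, "PHOSPHORYLATION (BY PKC)."))
print(residue_is_plausible("MKSAT", 2, "PHOSPHORYLATION (BY PKC)."))
```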

SE_CYS

Selenocysteine:

FT SE_CYS 52 52

THIOETH

Thioether bond.

THIOLEST

Thiolester bond.

3.3.3 Regions

CA_BIND

Extent of a calcium-binding region:

FT CA_BIND 759 770 EF-HAND 1 (POTENTIAL).

CHAIN

Extent of a polypeptide chain in the mature protein:

FT CHAIN 21 119 BETA-2 MICROGLOBULIN.

DNA_BIND

Extent of a DNA-binding region:

FT DNA_BIND 69 128 HOMEOBOX.

DOMAIN

Extent of a domain of interest on the sequence:

FT DOMAIN 22 788 EXTRACELLULAR (POTENTIAL).

NP_BIND

Extent of a nucleotide phosphate-binding region:

FT NP_BIND 13 25 ATP.

PEPTIDE

Extent of a released active peptide:

FT PEPTIDE 13 107 NEUROPHYSIN 2.

PROPEP

Extent of a propeptide:

FT PROPEP 550 574 REMOVED IN MATURE FORM.

REPEAT

Extent of an internal sequence repetition:

FT REPEAT 225 307 1.

SIGNAL

Extent of a signal sequence (prepeptide).

SIMILAR

Extent of a similarity with another protein sequence:

FT SIMILAR 139 153 STRONG WITH CA-BINDING EF-HAND SEQUENCE.

TRANSIT

Extent of a transit peptide (mitochondrial, chloroplastic, thylakoid, cyanelle or for a microbody):

FT TRANSIT 1 25 MITOCHONDRION.

TRANSMEM

Extent of a transmembrane region.

ZN_FING

Extent of a zinc finger region:

FT ZN_FING 319 343 GATA-TYPE.
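Because every region feature gives a start and an end position, the annotated residues can be cut out of the sequence directly. The following sketch assumes that positions are 1-based and inclusive; `extract_region` and the toy sequence are hypothetical.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the residues from start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"region {start}..{end} lies outside the sequence")
    # Python slices are 0-based and exclude the end index.
    return sequence[start - 1:end]


# Toy sequence (not a real protein): residues 3 through 6.
region = extract_region("MKTAYIAKQR", 3, 6)
print(region)
```

With a real entry, the same call with the CHAIN coordinates would yield the mature polypeptide, and with the SIGNAL or PROPEP coordinates the segments that are removed.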

3.3.4 Secondary Structure

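Secondary structure features describe local conformations of the polypeptide chain, such as helices, strands, and turns. Table 3-4 lists the one-letter secondary structure codes and the feature type (HELIX, STRAND, or TURN) to which each code is assigned.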

Table 3-4. SWISS-PROT secondary structure codes

| Abbreviation | Description | Type |
|---|---|---|
| B | Residue in an isolated beta-bridge | STRAND |
| E | Hydrogen-bonded beta-strand (extended strand) | STRAND |
| G | 3(10) helix | HELIX |
| H | Alpha-helix | HELIX |
| I | Pi-helix | HELIX |
| S | Bend (five-residue bend centered at residue i) | Not specified |
| T | H-bonded turn (3-turn, 4-turn or 5-turn) | TURN |

For example:

FT HELIX 4 14
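The mapping in Table 3-4 can be sketched as a small conversion from a per-residue string of one-letter codes to HELIX, STRAND, and TURN ranges. This is an illustration under stated assumptions, not the procedure used to build the database: the names `CODE_TO_FEATURE` and `secondary_structure_features` are hypothetical, the input string is invented, any character without a feature type (including S) is treated as unassigned, and adjacent codes of the same type (for example G followed by H) are merged into one range.

```python
from itertools import groupby

# Feature type for each one-letter code, following Table 3-4.
# S (bend) has no feature type and is therefore left out.
CODE_TO_FEATURE = {
    "B": "STRAND",
    "E": "STRAND",
    "G": "HELIX",
    "H": "HELIX",
    "I": "HELIX",
    "T": "TURN",
}


def secondary_structure_features(codes: str) -> list[tuple[str, int, int]]:
    """Collapse per-residue codes into (type, start, end) ranges, 1-based."""
    features = []
    position = 1
    # groupby yields consecutive runs that share the same feature type;
    # unknown characters map to None and are skipped.
    for feature_type, run in groupby(codes, key=CODE_TO_FEATURE.get):
        length = len(list(run))
        if feature_type is not None:
            features.append((feature_type, position, position + length - 1))
        position += length
    return features


# Invented code string: 3 unassigned, 11 helix, 1 unassigned, 2 turn.
codes = "-" * 3 + "H" * 11 + "-" + "TT"
for feature_type, start, end in secondary_structure_features(codes):
    print("FT", feature_type, start, end)
```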

3.3.5 Others

ACT_SITE

Amino acid(s) involved in the activity of an enzyme:

FT ACT_SITE 193 193 ACCEPTS A PROTON DURING CATALYSIS.

INIT_MET

Initiator methionine:

FT INIT_MET 0 0

NON_CONS

Non-consecutive residues:

FT NON_CONS 1683 1684

NON_TER

The residue at an extremity of the sequence is not the terminal residue:

FT NON_TER 129 129

SITE

Any other interesting site on the sequence:

FT SITE 759 760 CLEAVAGE (BY THROMBIN).

UNSURE

Uncertainties in the sequence.
