Repetitive sequences in P. falciparum proteins |
The repeat consensus for each protein was determined from P. falciparum 3D7 strain sequences using the programme XSTREAM, with gaps removed for clarity. The phases of some repeats were adjusted to match those described in the literature. Where the repeating motif differs between isolates, the alternative repeating motifs are shown in parentheses after the 3D7 consensus sequence. Variation in repeat number across 16 P. falciparum strains (3D7, IT, HB3, DD2, 7G8, CD1, GA1, GB4, GN1, KE1, KH1, KH2, ML1, SD1, SN1 and TG1; the two-letter codes give the country of origin of the 11 field isolates) was determined manually in Jalview from sequences aligned with TCOFFEE. Several proteins, such as AP2-G, also contain homopolymeric amino acid repeats; these are not indicated in the table. Long-read PACBIO genome sequencing data were obtained from the Pf3K consortium. |
| PfID | Annotation | Consensus motif in 3D7 | Number of repeats | Localisation | EC | Transcript |
| --- | --- | --- | --- | --- | --- | --- |
| PF3D7_0930300 | merozoite surface protein 1 | GASAQS (GSGGSVASG/SGGSVT) | 2–6 (3–5/4) | Merozoite surface | | |
| PF3D7_0206800 | merozoite surface protein 2 | AGGS (TTTESNSPSPPI/PAGAGASGNP/AGAGASGS/GASGSG/SGSAGG/GAGAS/SGSAG) | 7–11 (4/6/8/13/6/9/3) | Merozoite surface | | |
| PF3D7_0207600 | serine repeat antigen 5 | QGSTGASP | 3–12 | PV + Merozoite surface | 3.4.22.- | |
| PF3D7_1035300 | glutamate-rich protein GLURP | EILPEDKNEKVQHEIVEVE | 4–14 | Merozoite surface | | |
| | | SEKSVSEPAEHVEIV | 3 | | | |
| PF3D7_1036400 | liver stage antigen 1 | DLEQERLAKEKLQEQQS | 32–93 | Liver stage | | |
| PF3D7_0318200 | DNA-directed RNA polymerase II subunit RPB1 | YSPTSPK | 6–11 | Nucleus | 2.7.7.6 | |
| PF3D7_0831800 | histidine-rich protein II | AHHAAD | 38–46 | RBC cytoplasm | | |
| PF3D7_1222600 | AP2 domain transcription factor AP2-G | KNN | 7–9 | Nucleus | | |
| | | DTYN | 16–18 | | | |
| PF3D7_1370300 | membrane associated histidine-rich protein | HDHD | 3–5 | Maurer's clefts | | |
| | | DHG | 6–12 | | | |
| PF3D7_0304600 | circumsporozoite (CS) protein | NANP | 40–46 | Sporozoite surface | | |
| PF3D7_0935900 | ring-exported protein 1 | KPQAEKDASKLTTTYDQTKEV | 3–6 | Maurer's clefts | | |
| | | NKETKPQNDKYTL | 2 | | | |
| PF3D7_0501300 | skeleton-binding protein 1 | ASGIGNLVGDA | 5–6 | Maurer's clefts | | |
| | | QNAQ | 7–14 | | | |
| PF3D7_1149000 | antigen 332, DBL-like protein | PVEEKNVSEEI | 6–11 | Maurer's clefts | | |
| | | FVTGELPEEDIINEKVQEEEE | 3–5 | | | |
| | | EESASEEIVEDEGSV | 5–7 | | | |
| | | ENVEEKKTMDEEIVDQGSVV | 3–4 | | | |
| | | TEEVVEEEGSV | 5–10 | | | |
| | | EEVVEEGSAT | 2–11 | | | |
| | | EEIVEEEESSS | 11–12 | | | |
| | | EEIVEEEGSVV | 10–24 | | | |
| | | SVTEELVDEG | 2–3 | | | |
| | | TEEIVED-EGSF | 6–7 | | | |
| | | NEEILEEEGSY | 6–11 | | | |
| | | GSATDYFVGQGSDNEEIIEE | 2–4 | | | |
| | | IKEEQLDSEE | 6–20 | | | |
| | | EVEEVSVDD | 2–5 | | | |
| | | EEIEEIESVT | 2–4 | | | |
| | | TEDVEEVSS | 4–8 | | | |
| PF3D7_0401800 | Plasmodium exported protein (PHISTb), unknown function | STA | 9–18 | Maurer's clefts | | |
| | | RSASAASTT | 8–12 | | | |
| | | STSTTQSPST | 3–6 | | | |
| PF3D7_0102200 | ring-infected erythrocyte surface antigen | EEPTVADEHV | 6–8 | RBC periphery | | |
| | | EENV | 41–47 | | | |
| PF3D7_0113000 | glutamic acid-rich protein | EKK | 12–17 | RBC periphery | | |
| | | EKEKKKQ | 7 | | | |
| | | EEHKE | 8–9 | | | |
| | | KGKKD | 5 | | | |
| | | EEDEDDA | 8–10 | | | |
| PF3D7_0201900 | erythrocyte membrane protein 3 | LEEYNETDLAKGKEVTNKAHEN | 17–19 | RBC periphery | | |
| | | KNKELQNKGSEGLKENAEL | 9–12 | | | |
| | | NKDISNKDMKNKELL | 2–3 | | | |
| | | QQNTGLKNTPSEG | 54–87 | | | |
| PF3D7_0202000 | knob-associated histidine-rich protein | SKKHKDHDGEKKK | 4 | RBC periphery | | |
| | | ATKEASTSKE | 4–7 | | | |
| PF3D7_0402000 | Plasmodium exported protein (PHISTa), unknown function | KQGGKKEEV | 9–14 | RBC periphery | | |
| PF3D7_0500800 | erythrocyte membrane protein 2; mature parasite-infected erythrocyte surface antigen | GESKET | 11–22 | RBC periphery | | |
| | | EKNDEKKDKVLGEGDKEDVK | 4 | | | |
| | | KEKEEV | 7–11 | | | |
| | | KEKEEV | 3–9 | | | |
| | | ESEE | 17–26 | | | |
| PF3D7_0532400 | lysine-rich membrane-associated PHISTb protein | NKKVRGA | 5 | RBC periphery | | |
| | | ENKKAGT | 5–7 | | | |
| PF3D7_1102300 | Plasmodium exported protein, unknown function | ERKEREEREKK | 9 | RBC periphery | | |
| | | EREKREKKEKE | 13–14 | | | |
| PF3D7_1148700 | Plasmodium exported protein (PHISTc), unknown function; gametocyte exported protein 12 | KECVPNECMK | 8 | RBC periphery | | |
| PF3D7_1201000 | Plasmodium exported protein (PHISTb), unknown function | EKDEK | 18–36 | RBC periphery | | |
| | | DDDDEDDED | 7–8 | | | |
| PF3D7_1476200 | Plasmodium exported protein (PHISTb), unknown function | KEQEKEKERKRKE | 4 | RBC periphery | | |
| PF3D7_1301400 | Plasmodium exported protein (hyp12), unknown function | KKKEKQE | 8 | RBC cytoplasm | | |
| | | NEDE | | | | |
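
As a rough illustration of how a "Number of repeats" range might be checked by script, the sketch below counts the longest run of back-to-back exact copies of a consensus motif in each isolate's protein sequence and reports the minimum and maximum across isolates. This is not the procedure used for the table, where counts were determined manually in Jalview and the consensus came from XSTREAM; exact matching will undercount degenerate repeats. The function names and the toy sequences are hypothetical and are not real isolate data.

```python
import re


def max_tandem_copies(sequence: str, motif: str) -> int:
    """Largest number of back-to-back exact copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be non-empty")
    # The lookahead tries every start position, so a run is found in whatever
    # phase it occurs, even when the motif can overlap itself.
    pattern = re.compile(f"(?=((?:{re.escape(motif)})+))")
    runs = (len(m.group(1)) // len(motif) for m in pattern.finditer(sequence))
    return max(runs, default=0)


def repeat_range(sequences: dict[str, str], motif: str) -> tuple[int, int]:
    """(min, max) of the longest exact tandem run across a set of sequences."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    counts = [max_tandem_copies(seq, motif) for seq in sequences.values()]
    return min(counts), max(counts)


# Toy sequences (invented for illustration only, not real isolate data).
toy = {
    "isolate_A": "MK" + "NANP" * 4 + "GG",
    "isolate_B": "MK" + "NANP" * 2 + "NVDP" + "NANP" * 3 + "GG",
}
print(repeat_range(toy, "NANP"))  # (3, 4)
```

Because only uninterrupted exact copies are counted, a variant unit such as NVDP splits a run in two. Matching the manually curated counts would need a tolerance for mismatches and indels.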