Repetitive sequences in P. falciparum proteins
The repeat consensus for each protein was determined from P. falciparum 3D7 strain sequences using the programme XSTREAM, with gaps removed for clarity. The phases of some repeats were adjusted to match those described in the literature. Where the repeating motif differs between isolates, the alternative repeating motifs are shown in parentheses after the 3D7 consensus sequence. Variation in repeat number across 16 P. falciparum strains (3D7, IT, HB3, DD2, 7G8, CD1, GA1, GB4, GN1, KE1, KH1, KH2, ML1, SD1, SN1 and TG1, where the two-letter codes represent the country of origin of the 11 field isolates) was determined manually in Jalview, with sequences aligned using TCOFFEE. Several proteins, such as AP2-G, also contain homopolymeric amino acid repeats; these are not indicated in the table. Long-read PacBio genome sequencing data were obtained from the Pf3K consortium.
| | PfID | Annotation | Consensus motif in 3D7 | Number of repeats | Localisation | EC | Transcript |
|---|---|---|---|---|---|---|---|
| 41688 | PF3D7_0930300 | merozoite surface protein 1 | GASAQS (GSGGSVASG/SGGSVT) | 2–6 (3–5/4) | Merozoite surface | | |
| 41678 | PF3D7_0206800 | merozoite surface protein 2 | AGGS (TTTESNSPSPPI/PAGAGASGNP/AGAGASGS/GASGSG/SGSAGG/GAGAS/SGSAG) | 7–11 (4/6/8/13/6/9/3) | Merozoite surface | | |
| 41679 | PF3D7_0207600 | serine repeat antigen 5 | QGSTGASP | 3–12 | PV + Merozoite surface | 3.4.22.- | |
| 41690 | PF3D7_1035300 | glutamate-rich protein GLURP | EILPEDKNEKVQHEIVEVE | 4–14 | Merozoite surface | | |
| | | | SEKSVSEPAEHVEIV | | | | |
| 41691 | PF3D7_1036400 | liver stage antigen 1 | DLEQERLAKEKLQEQQS | 32–93 | Liver Stage | | |
| 41681 | PF3D7_0318200 | DNA-directed RNA polymerase II subunit RPB1 | YSPTSPK | 6–11 | Nucleus | 2.7.7.6 | |
| 41687 | PF3D7_0831800 | histidine-rich protein II | AHHAAD | 38–46 | RBC cytoplasm | | |
| 41696 | PF3D7_1222600 | AP2 domain transcription factor AP2-G | KNN | 7–9 | Nucleus | | |
| | | | DTYN | 16–18 | | | |
| 41698 | PF3D7_1370300 | membrane associated histidine-rich protein | HDHD | 3–5 | Maurer's clefts | | |
| | | | DHG | 6–12 | | | |
| 41680 | PF3D7_0304600 | circumsporozoite (CS) protein | NANP | 40–46 | Sporozoite Surface | | |
| 41689 | PF3D7_0935900 | ring-exported protein 1 | KPQAEKDASKLTTTYDQTKEV | 3–6 | Maurer's clefts | | |
| | | | NKETKPQNDKYTL | | | | |
| 41685 | PF3D7_0501300 | skeleton-binding protein 1 | ASGIGNLVGDA | 5–6 | Maurer's clefts | | |
| | | | QNAQ | 7–14 | | | |
| 41694 | PF3D7_1149000 | antigen 332, DBL-like protein | PVEEKNVSEEI | 6–11 | Maurer's clefts | | |
| | | | FVTGELPEEDIINEKVQEEEE | 3–5 | | | |
| | | | EESASEEIVEDEGSV | 5–7 | | | |
| | | | ENVEEKKTMDEEIVDQGSVV | 3–4 | | | |
| | | | TEEVVEEEGSV | 5–10 | | | |
| | | | EEVVEEGSAT | 2–11 | | | |
| | | | EEIVEEEESSS | 11–12 | | | |
| | | | EEIVEEEGSVV | 10–24 | | | |
| | | | SVTEELVDEG | 2–3 | | | |
| | | | TEEIVED-EGSF | 6–7 | | | |
| | | | NEEILEEEGSY | 6–11 | | | |
| | | | GSATDYFVGQGSDNEEIIEE | 2–4 | | | |
| | | | IKEEQLDSEE | 6–20 | | | |
| | | | EVEEVSVDD | 2–5 | | | |
| | | | EEIEEIESVT | 2–4 | | | |
| | | | TEDVEEVSS | 4–8 | | | |
| 41682 | PF3D7_0401800 | Plasmodium exported protein (PHISTb), unknown function | STA | 9–18 | Maurer's clefts | | |
| | | | RSASAASTT | 8–12 | | | |
| | | | STSTTQSPST | 3–6 | | | |
| 41674 | PF3D7_0102200 | ring-infected erythrocyte surface antigen | EEPTVADEHV | 6–8 | RBC periphery | | |
| | | | EENV | 41–47 | | | |
| 41675 | PF3D7_0113000 | glutamic acid-rich protein | EKK | 12–17 | RBC periphery | | |
| | | | EKEKKKQ | | | | |
| | | | EEHKE | 8–9 | | | |
| | | | KGKKD | | | | |
| | | | EEDEDDA | 8–10 | | | |
| 41676 | PF3D7_0201900 | erythrocyte membrane protein 3 | LEEYNETDLAKGKEVTNKAHEN | 17–19 | RBC periphery | | |
| | | | KNKELQNKGSEGLKENAEL | 9–12 | | | |
| | | | NKDISNKDMKNKELL | 2–3 | | | |
| | | | QQNTGLKNTPSEG | 54–87 | | | |
| 41677 | PF3D7_0202000 | knob-associated histidine-rich protein | SKKHKDHDGEKKK | | RBC periphery | | |
| | | | ATKEASTSKE | 4–7 | | | |
| 41683 | PF3D7_0402000 | Plasmodium exported protein (PHISTa), unknown function | KQGGKKEEV | 9–14 | RBC periphery | | |
| 41684 | PF3D7_0500800 | erythrocyte membrane protein 2 mature parasite-infected erythrocyte surface antigen | GESKET | 11–22 | RBC periphery | | |
| | | | EKNDEKKDKVLGEGDKEDVK | | | | |
| | | | KEKEEV | 7–11 | | | |
| | | | KEKEEV | 3–9 | | | |
| | | | ESEE | 17–26 | | | |
| 41686 | PF3D7_0532400 | lysine-rich membrane-associated PHISTb protein | NKKVRGA | | RBC periphery | | |
| | | | ENKKAGT | 5–7 | | | |
| 41692 | PF3D7_1102300 | Plasmodium exported protein, unknown function | ERKEREEREKK | | RBC periphery | | |
| | | | EREKREKKEKE | 13–14 | | | |
| 41693 | PF3D7_1148700 | Plasmodium exported protein (PHISTc), unknown function gametocyte exported protein 12 | KECVPNECMK | | RBC periphery | | |
| 41695 | PF3D7_1201000 | Plasmodium exported protein (PHISTb), unknown function | EKDEK | 18–36 | RBC periphery | | |
| | | | DDDDEDDED | 7–8 | | | |
| 41699 | PF3D7_1476200 | Plasmodium exported protein (PHISTb), unknown function | KEQEKEKERKRKE | | RBC periphery | | |
| 41697 | PF3D7_1301400 | Plasmodium exported protein (hyp12), unknown function | KKKEKQE | | RBC cytoplasm | | |
| | | | NEDE | | | | |
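The repeat numbers in the table were determined manually from alignments. As a rough cross-check, the longest uninterrupted run of a consensus motif can be counted directly in each isolate's protein sequence. The following Python sketch illustrates that idea only; it is not the XSTREAM algorithm or the Jalview/TCOFFEE workflow. The function names `longest_tandem_run` and `repeat_range`, the isolate labels and the toy sequences are all hypothetical. Because real repeat arrays are often degenerate or interrupted, an exact-match count like this will tend to underestimate manually curated values.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back exact copies of motif."""
    if not motif:
        raise ValueError("motif must be non-empty")
    # A lookahead lets a run start at every position, so the result
    # does not depend on where a left-to-right scan happens to begin.
    pattern = re.compile(f"(?=((?:{re.escape(motif)})+))")
    return max(
        (len(m.group(1)) // len(motif) for m in pattern.finditer(sequence)),
        default=0,
    )


def repeat_range(sequences: dict[str, str], motif: str) -> tuple[int, int]:
    """Return (min, max) of the longest run across a set of isolates."""
    counts = [longest_tandem_run(seq, motif) for seq in sequences.values()]
    return min(counts), max(counts)


# Toy sequences (hypothetical, not real isolate data).
toy_isolates = {
    "isolate_A": "MK" + "NANP" * 4 + "GG",
    "isolate_B": "MK" + "NANP" * 6 + "NVDP" + "NANP" * 2 + "GG",
}

print(repeat_range(toy_isolates, "NANP"))  # (4, 6)
```

Note that the count depends on the phase chosen for the motif: the same array can give a different number for NANP than for a shifted motif such as ANPN, which is one reason the phases in the table were aligned to those used in the literature.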