Protist parasites cause important human and animal diseases, and because of their early divergence from other eukaryotes they possess structural and biochemical characteristics not found in other cells. The completion of the genome projects of most human protist parasites and the development of novel molecular tools for their study guarantee a rapid progress in understanding how they invade, modify and survive within their hosts. The ultimate goal of these studies will be the identification of targets for the design of drugs, diagnostics and vaccines. In addition, the accessibility of some of these parasites to multiple genetic manipulations has converted them into model systems in cell and molecular biology studies that could lead to the understanding of basic biological processes, as well as their evolution and pathogenesis. In the present chapter we discuss the biochemical and molecular peculiarities of these parasites and the molecular tools available for their study.
It is estimated that more than half of the human population, plus a much greater number of domestic and wild animals, suffer from parasitic infections. The scale of the problem is illustrated by malaria alone, which is estimated to cause more than 250 million human cases and approximately 1 million deaths each year.
Information about the protist (or, as they are more commonly known, protozoan) parasites of medical importance is summarized in Table 1. In addition to the parasites listed in Table 1, several species are important to humans because they cause disease in domestic animals or because they are used as model organisms: for example, Trypanosoma brucei, Trypanosoma congolense and Trypanosoma vivax, which cause nagana in cattle; Eimeria spp., which cause coccidiosis in chickens; Babesia spp., which cause babesiosis in domestic animals and very occasionally in humans; Theileria spp., which cause theileriosis in cattle; and Tritrichomonas foetus, which causes trichomoniasis in cattle. Other species that are non-pathogenic to humans or domestic animals are used as models for the pathogenic species, such as Crithidia fasciculata and Leishmania tarentolae, used as models for human pathogenic trypanosomatids, and Plasmodium berghei and Plasmodium yoelii yoelii, used as models for human malaria parasites.
In addition to their relevance to human and animal health, protist parasites have become the object of extensive study by cellular, molecular and evolutionary biologists. Because of their early divergence from other eukaryotes they exhibit unusual biological characteristics. Their study has resulted not only in the discovery of distinctive organelles, such as the hydrogenosome, mitosome, glycosome, apicoplast and acidocalcisome, but also in the discovery of structures and biological processes that were later found in other organisms, such as extranuclear (mitochondrial) DNA, RNA editing, GPI (glycosylphosphatidylinositol) membrane anchor synthesis and trans-splicing.
Research in molecular parasitology has entered the ‘post-genomic’ era after the completion of several genome sequences for parasitic protists, including those of the three main pathogenic species of the family Trypanosomatidae, T. brucei, Trypanosoma cruzi and Leishmania major; the apicomplexan parasites Plasmodium falciparum, P. yoelii yoelii and other Plasmodium species, Cryptosporidium parvum, Cryptosporidium hominis, Theileria parva, Theileria annulata and Toxoplasma gondii (http://www.toxodb.org); and the amitochondriate protists Entamoeba histolytica, Trichomonas vaginalis and Giardia lamblia. In addition, transcriptome and proteome analyses of a number of these parasites have generated massive amounts of information. The ultimate goal of these studies is the identification of candidate drug, diagnostic or vaccine targets to combat the diseases caused by these pathogens. These studies will also provide key molecular information about basic biological processes, evolution and pathogenesis.
In the present chapter I will review the biochemical and molecular peculiarities of the most important protist parasites relevant to human health and the molecular tools available for their study.
Trypanosomatids belong to the family Trypanosomatidae and order Kinetoplastida and include numerous genera, all of which are parasitic. Some are said to be monogenetic because they have only one host, in general an insect, and some are digenetic because they have a second host, which could be a plant (Phytomonas and others) or an animal (Trypanosoma and Leishmania). Each of the species responsible for human disease (the T. cruzi and T. brucei group, and Leishmania spp.) has a different insect vector, very different life cycle and causes radically different diseases (Table 1).
The study of the biology of trypanosomatids has resulted in the discovery of several biochemical and molecular peculiarities, some of them unique to these organisms, and others that were later found in other eukaryotes. The name Kinetoplastida derives from the presence of the kinetoplast, a structure that contains 5–20% of the total cellular DNA and consists of a large network of several thousand similar minicircles and a few dozen maxicircles, the latter encoding a few mitochondrial proteins and ribosomal RNAs. This was the first extranuclear DNA ever discovered, long before mammalian mitochondria were shown to contain DNA. All trypanosomatids contain unique specialized peroxisomes, termed glycosomes, where most glycolytic enzymes are found [15,16]. Another structure first described in trypanosomatids is the acidocalcisome, an acidic compartment rich in calcium and polyphosphate that has been found in a wide range of organisms from bacteria to humans and is involved in phosphorus and cation storage, pH homoeostasis and osmoregulation [17,18]. Unique metabolic features of trypanosomatids include their substitution of trypanothione (a glutathione–spermidine conjugate) for glutathione in many reactions involved in protection against oxidative stress [19,20], and the heavy use of GPI anchors to attach proteins and oligosaccharide components to the outer surface of the plasma membrane, the study of which was pioneered in these parasites [21,22].
Trypanosomatid nucleic acids also have peculiar characteristics. Antigenic variation in African trypanosomes, the mechanism by which the parasites change the antigenic character of their glycoprotein surface coat to evade the host's immune system, occurs through unique programmed DNA rearrangements [23,24]. Bent DNA was first described in kinetoplast DNA minicircles, and RNA editing, a process by which messenger RNAs are modified after transcription, was first discovered in trypanosomatids and later found in several other eukaryotes [26,27]. Transcription is polycistronic, and trans-splicing plus polyadenylation are required to produce mature mRNAs. In trans-splicing, which is coupled to polyadenylation, a spliced leader sequence is transferred to the 5´ end of each mRNA as it is processed from the polycistronic precursor. Trans-splicing has also been found in nematodes, euglenoids, trematodes and chordates.
Genetic work in human trypanosomatids has been limited by their diploidy and by infrequent genetic or sexual exchange, which has only been described in the insect host phase of African trypanosomes and Leishmania. Genomic data indicate that hybrids of various lineages of T. cruzi exist, showing that genetic exchange in this parasite can occur in nature, although infrequently. Nevertheless, both forward and reverse genetic approaches can be used with trypanosomatids. Using classical or forward genetic approaches, cells are mutagenized (e.g. by chemical mutagens or by insertional mutagenesis, which allows incorporation of DNA at random into the genome) to induce DNA lesions, and mutants with a phenotype of interest are sought. In contrast, using reverse genetic approaches, the study of a gene starts with the gene sequence rather than a mutant phenotype. The function of the gene is altered using various techniques, and the effect on the organism is analysed. This last approach has greatly benefited from the genome projects.
Forward genetic approaches have been used mainly in Leishmania spp. and T. brucei (Table 2). In Leishmania, mutagenesis with chemical agents such as nitrosoguanidine or insertional mutagenesis with transposable elements (sequences of DNA that can move to different positions within the genome) such as mariner have been successful. The transposable element mariner has also been successfully used in T. brucei. Chemical mutagenesis in Leishmania has led to the identification of important genes involved in the synthesis of surface LPG (lipophosphoglycan). In T. brucei, RNAi (RNA interference) has been adapted to knock down genes involved in surface-protein expression. An RNAi library containing random segments of genomic DNA was transfected into trypanosomes engineered so that, upon induction, each transfectant synthesized dsRNA (double-stranded RNA) corresponding to its inserted DNA, leading to the loss of mRNA expression of the corresponding gene. Selection of parasites according to their phenotype allowed the identification of the genes responsible for the changes.
A number of tools have been developed for reverse genetics in several trypanosomatids (Table 2) [31,33]. One advantage is that most genes in trypanosomatids lack introns. Vectors for transient or stable transfection of DNA have allowed studies on the localization of proteins when the genes were fused to small peptides and detected by antibodies, or to fluorescent proteins such as GFP (green fluorescent protein) and detected by direct fluorescence. The knockout of genes can be performed by homologous recombination. A selectable marker (usually a drug-resistance gene) flanked by the untranslated 5´- and 3´-segments of the gene of interest is used for gene replacement. Because trypanosomatids are diploid, a complete knockout requires two rounds of transfection and selection using different selectable markers. In Leishmania and T. cruzi, attempts to knock out essential genes can lead to the emergence of polyploidy (the presence of more than two sets of chromosomes), which has hampered the use of this strategy. Complementation of the knockouts with an extra copy of the depleted gene is essential to demonstrate the specificity of the defect and is usually performed in T. brucei and Leishmania. Conditional knockouts of essential genes have been obtained after the introduction of a gene copy under the control of a tetracycline-inducible system in T. brucei, and tetracycline-inducible systems have been developed for T. cruzi and Leishmania [31,33].
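The reason a complete knockout in a diploid organism needs two rounds and two different markers can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: the `Locus` class, its methods and the marker names `marker_A` and `marker_B` are hypothetical and do not correspond to any specific vector or protocol.

```python
from dataclasses import dataclass, field

WILD_TYPE = "WT"


@dataclass
class Locus:
    """Toy model of one gene in a diploid cell: two alleles, each either
    wild type or replaced by a selectable marker (hypothetical names)."""
    alleles: list = field(default_factory=lambda: [WILD_TYPE, WILD_TYPE])

    def replace_allele(self, marker: str) -> bool:
        """Replace one remaining wild-type allele with a marker.

        Returns True if an allele was replaced, False if none was left.
        """
        for i, allele in enumerate(self.alleles):
            if allele == WILD_TYPE:
                self.alleles[i] = marker
                return True
        return False

    def survives_selection(self, drugs: set) -> bool:
        """A cell survives only if it carries a marker for every drug applied."""
        return drugs.issubset(set(self.alleles))

    def is_null(self) -> bool:
        """True when no wild-type allele remains (complete knockout)."""
        return WILD_TYPE not in self.alleles


locus = Locus()

# Round 1: one allele is replaced; the other still expresses the gene.
locus.replace_allele("marker_A")
after_round_1 = (locus.is_null(), locus.survives_selection({"marker_A"}))

# Round 2: a different marker lets double selection demand both replacements.
locus.replace_allele("marker_B")
after_round_2 = (locus.is_null(),
                 locus.survives_selection({"marker_A", "marker_B"}))

print(after_round_1, after_round_2)
```

In this model, selecting with both drugs at once is what distinguishes a double replacement from a single one; reusing the same marker in the second round would leave the two outcomes indistinguishable by selection alone.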
Gene knockdowns using RNAi technology have been well developed in T. brucei, but cannot be used in T. cruzi and most Leishmania parasites, whose genomes lack the enzymes involved in this pathway.
The phylum Apicomplexa is defined by the presence of the apical complex, which includes a microtubule anchoring ring through which secretory organelles (rhoptries, micronemes and dense granules) release their content for cell invasion. The phylum includes a large number of organisms, among which are several human and animal pathogens such as Plasmodium spp., Cryptosporidium spp. and T. gondii (Table 1). In addition to their specialized secretory system, some of these parasites (e.g. Plasmodium spp. and T. gondii) but not others (e.g. Cryptosporidium spp.) possess a relict, non-photosynthetic, DNA-containing plastid known as the apicoplast. The apicoplast harbours several metabolic pathways, such as those involved in the biosynthesis of fatty acids, isoprenoids, iron–sulphur clusters and haem. Another specialized structure of Apicomplexa is the digestive vacuole of the erythrocytic stages of Plasmodium and Babesia spp., which contains the hydrolytic enzymes necessary for haemoglobin digestion. A vacuole termed the PLV (plant-like vacuole; also called VAC) was recently identified in T. gondii and has similarities to the yeast and plant vacuole in that it contains proteolytic enzymes and has roles in water and ion balance, and in the secretory pathway [37,38]. These organelles (digestive vacuole and PLV) probably replace the function of lysosomes, which have not been described in this group of parasites. Apicomplexa also appear to lack peroxisomes and, like the trypanosomatids, they possess a single mitochondrion per cell as well as acidocalcisomes.
In contrast with trypanosomatids, Plasmodium spp. and T. gondii have a haploid genome, and forward genetic approaches have been possible [40,41] (Table 2). Genetic crosses have been important for the study of genes involved in chloroquine resistance in malaria parasites and for genetic mapping in T. gondii. Early work using chemical mutagenesis generated temperature-sensitive mutants in T. gondii, and more recent work (using N-ethyl-N-nitrosourea) has produced mutants with defects in stage differentiation, invasion and egress, and cell division and cell-cycle progression. Identification of the mutated gene is usually performed by complementation (reintroduction of the gene) with a wild-type cDNA library or a cosmid library. Insertional mutagenesis has been used for the identification of promoters and genes involved in T. gondii differentiation and survival in activated macrophages. The high frequency of non-homologous recombination in T. gondii has been an advantage for the use of this technique. Another approach has been to combine insertional mutagenesis with GFP-tagging to identify the subcellular destination of different proteins of unknown function [40,41,44]. The generation of mutants using a different DNA transposon system has also been tried in malaria parasites.
Reverse genetic approaches have been very successful in T. gondii, which has become a model organism for the Apicomplexa [40,41]. Transient and stable transfection protocols, numerous vectors using different promoters with various strengths and stage specificities (for tachyzoites or bradyzoites), and positive and negative selectable markers are all well established. Many of these techniques have been exploited to investigate the subcellular localization of proteins by fusing genes to GFP or its derivatives or to smaller epitope tags and detecting their products using direct fluorescence microscopy or specific antibodies and immunofluorescence microscopy respectively.
Several strategies for gene knockout have been developed. The high frequency of non-homologous recombination, which is beneficial for insertional mutagenesis, is a problem when targeting constructs to their corresponding locus to perform gene knockouts. Several strategies have circumvented this problem. One of these is the use of long flanking regions to facilitate integration by homologous recombination. Another is the use of a second selectable marker (for example encoding a fluorescent protein) outside the homologous flanking sequence. Only the cells that are not fluorescent have undergone homologous recombination; those that become fluorescent are eliminated because the vector has integrated at random [40,41]. Another strategy to eliminate the expression of an essential gene has been the modulation of protein stability using a small molecule. The target protein is fused to a mutated version of the rapamycin-binding protein FKBP12, which is only correctly folded in the presence of the small ligand Shield-1. In the absence of ligand, the stability of the mutated domain is compromised, resulting in its degradation along with that of its fusion partner. This approach, which is especially useful for cytosolic proteins, has been used in T. gondii and P. falciparum and more recently in the trypanosomatid L. major. A more recent strategy has been the use of tachyzoites in which the non-homologous end-joining DNA repair pathway was disrupted by deletion of the KU80 gene, which resulted in a much greater efficiency of gene targeting that could be useful for localization studies or to obtain gene knockouts.
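The screening logic of the two-marker strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory protocol: the `Transfectant` record, the `classify` function and the clone names are hypothetical, and the model assumes that each clone can be scored cleanly for drug resistance and fluorescence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfectant:
    """Hypothetical record for one clone recovered after transfection."""
    name: str
    drug_resistant: bool  # carries the selectable marker placed between the flanks
    fluorescent: bool     # carries the second marker placed outside the flanks


def classify(clone: Transfectant) -> str:
    """Infer the likely integration event from the two marker read-outs."""
    if not clone.drug_resistant:
        return "no integration"
    # Homologous recombination exchanges only the sequence between the
    # flanks, so the outside fluorescent marker is lost; random insertion
    # of the whole vector retains it.
    if clone.fluorescent:
        return "random integration"
    return "homologous replacement"


clones = [
    Transfectant("clone_1", drug_resistant=True, fluorescent=False),
    Transfectant("clone_2", drug_resistant=True, fluorescent=True),
    Transfectant("clone_3", drug_resistant=False, fluorescent=False),
]

# Keep only the clones worth confirming as knockouts.
candidates = [c.name for c in clones if classify(c) == "homologous replacement"]
print(candidates)
```

Clones retained by such a screen would still need independent confirmation of the replacement at the target locus; the sketch only captures the enrichment step.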
For genetic analysis of essential genes, a system similar to that developed for T. brucei (the tet-repressor system) was also developed in T. gondii that was suitable for the expression of toxic genes and dominant-negative mutants, although not adequate for conditional knockouts. For this purpose, a tetracycline transactivator-based inducible system was developed in T. gondii that was also useful in P. falciparum. First, a cell line expressing an inducible copy of the gene is constructed and in a second step the knockout of the target gene is performed [40,41].
RNAi is apparently not as efficient in Apicomplexa as in T. brucei, and its use has been more limited in T. gondii. In addition, there is no evidence for the presence of RNAi machinery in the genomes of Plasmodium spp. [40,41].
The common characteristics of the amitochondriate protists are the absence of typical mitochondria, an anaerobic carbohydrate metabolism, micro-aerophilic properties, and their placement on deep-branching lineages in eukaryotic phylogenetic trees. The mitochondria in these organisms are replaced by homologues that were called hydrogenosomes in Trichomonas and mitosomes in Giardia and Entamoeba. Both types of organelles are bounded by a double membrane and, in these parasites, do not contain DNA. Hydrogenosomes produce hydrogen and have also been described in diverse anaerobic ciliates and in anaerobic chytrid fungi. In the anaerobic ciliate Nyctotherus ovalis, hydrogenosomes retain a mitochondrial genome, providing a link between them and mitochondria. Mitosomes are remnant mitochondria that lack ATP-generating pathways and have been found in parasitic members of Amoebozoa (e.g. Entamoeba), Microsporidia, Diplomonads (e.g. Giardia) and Apicomplexa (e.g. Cryptosporidium). Another homologue of the mitochondrion is the MLO (mitochondrion-like organelle) present in Blastocystis, which could also be a human pathogen (Table 1). The MLO has similarities to the hydrogenosomes of N. ovalis, including the retention of an organelle genome.
All of these amitochondriate parasites have an anaerobic carbohydrate metabolism characterized by the presence of the enzyme pyruvate:ferredoxin oxidoreductase, which replaces the pyruvate dehydrogenase of aerobic eukaryotes in the generation of acetyl-CoA. This is a highly unusual enzyme, the only example known to combine two substrate free radicals (CoA thiyl radical and a pyruvate-derived carbon-centred radical) to form a high-energy compound (acetyl-CoA). The enzyme is also the main route of metronidazole reduction, which is necessary for the mode of action of this drug against these parasites.
The molecular tools available for work with amitochondriate protists are much less developed than those for trypanosomatids and Apicomplexa (Table 2). Forward genetic approaches are limited since these parasites do not appear to have sexual exchanges. Chemical mutagenesis and lectin selection have been used with Trichomonas vaginalis to identify lipophosphoglycan mutants. Reverse genetic approaches have started to give interesting results. Transient and stable transfections are possible in Giardia, Entamoeba and Trichomonas. Giardia possesses a double-stranded RNA virus, and it has been possible to engineer it to introduce and express exogenous and endogenous genes. Tetracycline-inducible transfection systems have been developed in Giardia and Entamoeba. However, disruption of genes by homologous recombination is very difficult in Giardia, which has two diploid nuclei, or in Entamoeba, which is also polyploid. Some gene knockouts have been reported after introducing selectable markers flanked by long regions of parasite DNA on linearized plasmids in Giardia and Trichomonas, but not in Entamoeba. Antisense RNA-based systems were developed in Giardia using its dsRNA virus and in Entamoeba using target genes in the antisense direction downstream of the 5´-untranslated regions of ribosomal protein L21.
The completion of the genome projects and the availability of molecular tools to work with most human protist parasites almost guarantee that rapid progress in understanding the biology of these parasites will be achieved during this century and that novel targets for drugs, diagnostics and vaccines will be identified. A challenge shared with the study of other cells is to define the function of the many genes annotated as hypothetical in the genomes of these parasites. The study of individual genes is a slow and laborious process, and methods will need to be developed for the high-throughput identification of gene function and of potential drug, diagnostic or vaccine targets.
• The genomes of the most important human protist parasites have been completed, and transcriptome and proteome analyses of many of them are being completed.
• Protist parasites have unique structures and metabolic pathways that could be exploited for drug or vaccine design, and their study has also led to the discovery of structures and pathways later found in other organisms.
• Forward and especially reverse genetic approaches are feasible with most of these protists.
• Trypanosoma brucei and Toxoplasma gondii have become model organisms for cellular and molecular studies.
I apologize to those whose work could not be cited due to space limitations. R.D. is supported by grants from the U.S. National Institutes of Health (AI-068647 and AI-077583).
- © The Authors Journal compilation © 2011 Biochemical Society