In recent years, our understanding of the functioning of ABC (ATP-binding cassette) systems has been boosted by the combination of biochemical and structural approaches. However, the origin and the distribution of ABC proteins among living organisms are difficult to understand in a phylogenetic perspective, because it is hard to discriminate orthology and paralogy, due to the existence of horizontal gene transfer. In this chapter, I present an update of the classification of ABC systems and discuss a hypothetical scenario of their evolution. The hypothetical presence of ABC ATPases in the last common ancestor of modern organisms is discussed, as well as the additional possibility that ABC systems might have been transmitted to eukaryotes, after the two endosymbiosis events that led to the constitution of eukaryotic organelles. I update the functional information of selected ABC systems and introduce new families of ABC proteins that have been included recently into this vast superfamily, thanks to the availability of high-resolution three-dimensional structures.
ABC (ATP-binding cassette) systems form one of the largest families of proteins. They share a highly conserved module, the ABC, that binds and hydrolyses ATP, and the energy of these reactions is coupled to a wide variety of cellular processes, including not only transport of various molecules, but also many housekeeping functions such as translation of mRNA, and DNA replication and repair. The ABC module ranks first among the top 20 families in the Pfam database (release 24.0), which contains approximately 12 000 different protein families. It is characterized unequivocally by five short sequence motifs, which must be present in this order for a protein to qualify as an ABC ATPase (Figure 1A): the Walker A motif, a highly conserved glutamine residue in the Q-loop, the signature motif which is distinctive of ABC ATPases, the Walker B motif and a downstream highly conserved histidine residue within the H-loop or Switch region. The crystal structures of several ABC proteins or modules have been determined (reviewed in [2,3]). An ABC module comprises two structural domains: a RecA-like domain containing the Walker A and B motifs and a helical domain that is unique to ABC ATPases and that contains the signature motif (Figure 1B). The two domains are joined by two flexible loops, one of which contains a highly conserved glutamine residue and is known as the Q-loop. The Q-loop mediates interactions between the ABC subunits and the transmembrane subunits of ABC transporters (reviewed in ). ABC systems are widespread among living organisms, from some large viruses to mammals. In sharp contrast with the broad functional diversity of these systems, they display a limited assortment of structural organizations, which strongly correlates with their ultimate function. The archetype of an ABC transporter is constituted by two IM (integral membrane) modules, each comprising four to eight TMHs (transmembrane helices), and two ABC modules, which can be connected in many different ways.
ABC importers also comprise an extracytoplasmic SBP (solute-binding protein) involved in the capture of substrates and their presentation to the inner membrane complex. Soluble ABC proteins involved in housekeeping processes are generally composed of two ABC modules fused together.
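The motif-order criterion described above (Walker A, Q-loop glutamine, signature, Walker B, H-loop histidine, in that order) can be illustrated with a short sketch. The Python snippet below is only an illustration under stated assumptions: the regular expressions are simplified consensus patterns chosen here (for example `LSGGQ` for the signature, and four hydrophobic residues followed by aspartate for Walker B), not curated profiles such as those used by Pfam, and the function name `find_abc_motifs` is hypothetical.

```python
import re

# Simplified consensus patterns (assumptions for illustration only;
# real ABC modules tolerate substitutions that these patterns would miss).
ABC_MOTIF_ORDER = re.compile(
    r"(?P<walker_a>G.{4}GK[ST])"        # Walker A (P-loop)
    r".*?(?P<q_loop>Q)"                 # conserved glutamine of the Q-loop
    r".*?(?P<signature>LSGGQ)"          # ABC signature motif
    r".*?(?P<walker_b>[AILMFVWY]{4}D)"  # Walker B: four hydrophobic residues, then Asp
    r".*?(?P<h_loop>H)"                 # conserved histidine of the H-loop
)


def find_abc_motifs(sequence):
    """Return 0-based start positions of the five motifs if they all
    occur in the required order, otherwise None."""
    match = ABC_MOTIF_ORDER.search(sequence.upper())
    if match is None:
        return None
    return {name: match.start(name) for name in match.groupdict()}


# Toy sequences (invented, not real proteins).
ordered = "MGPSGSGKSTAAAQAAALSGGQAAALLLLDEAAAHAA"
misordered = "LSGGQAAAGPSGSGKSTAAAQLLLLDAAAH"  # signature precedes Walker A

print(find_abc_motifs(ordered))
print(find_abc_motifs(misordered))
```

A single combined pattern is used so that the regex engine can backtrack: a step-by-step search would wrongly consume the glutamine of the signature as the Q-loop residue when no earlier glutamine is present. A realistic scan would replace these patterns with profile-based searches.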
Classes, families and evolution of ABC proteins
Classification of ABC proteins has been the object of many studies, with a focus on eukaryote or vertebrate ABC transporters, owing to the medical importance of these systems [6–9]. The classification of these systems into seven families (A–G), although satisfactory for most eukaryotes, is a kind of ‘tree that hides the forest’. Extending the analysis to plants and lower eukaryotes revealed the occurrence of systems that were found mainly in prokaryotes, such as the newer families H and I [10–12]. To understand the diversity of ABC systems, phylogenetic analyses should be performed on the members of the whole superfamily, including those of the most humble micro-organism. It appears that all eukaryote families are represented in prokaryotes, and they rather constitute subfamilies, which are included in larger families of ABC systems [13–16].
A representative phylogenetic tree, based on the sequences of approximately 800 ABC modules and updating the data of previously published analyses [14,15], is shown in Figure 2. The sequences segregate into 34 clusters or families. Some clusters, comprising obviously highly related proteins known to function together, such as the two ATPases of hydrophobic amino acid importers (HAA family), were merged into a single family. The vast majority of the families are monophyletic, i.e. all their members have a common ancestor. Families could be divided into subfamilies, on the basis of similarities between IM modules, to better distinguish the biological roles of ABC systems. The final 29 families with their subfamilies are listed in Table 1.
ABC modules segregate into three main classes that match fairly well with the three functional divisions of ABC systems: importers, exporters and soluble ATPases. Class 1 comprises systems with fused ABC and IM modules and it contains the vast majority of export systems. Class 2 comprises soluble proteins with two tandemly repeated ABC (ABC2) modules and no IM modules, which are probably not transporters. Class 3 contains systems with IM and ABC modules carried by independent polypeptide chains, which are mostly importers. The majority of ABC importers include an additional extracytoplasmic component, the SBP. Clustering methods applied to the IMs and SBPs of ABC importers showed good agreement with the classification of ABCs, suggesting that the components of ABC transporters co-evolve with minimal shuffling (reviewed in ). A good correlation exists between the sequences of the ABCs and the global substrate specificity of ABC systems. This apparent relationship between sequence and function could reflect constraints imposed by the interaction of ABC proteins with their IM partners that carry substrate recognition sites [17,18].
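The architectural rule of thumb in this paragraph can be written down as a small sketch. The Python code below is a hedged illustration only: the `Polypeptide` type and the `likely_class` function are hypothetical names introduced here, and the classes themselves are defined by the phylogeny of ABC modules, so module architecture is an indicator of class membership and not a substitute for sequence analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polypeptide:
    """One polypeptide chain of an ABC system (hypothetical helper type)."""
    abc_modules: int  # number of ABC modules on this chain
    im_modules: int   # number of integral membrane modules on this chain


def likely_class(chains):
    """Guess the class (1, 2 or 3) from module architecture, or None."""
    # Class 1: ABC and IM modules fused on the same chain.
    if any(c.abc_modules > 0 and c.im_modules > 0 for c in chains):
        return 1
    has_im = any(c.im_modules > 0 for c in chains)
    # Class 2: tandemly repeated ABC modules and no IM module at all.
    if not has_im and any(c.abc_modules >= 2 for c in chains):
        return 2
    # Class 3: ABC and IM modules carried by independent chains.
    if has_im and any(c.abc_modules > 0 for c in chains):
        return 3
    return None


print(likely_class([Polypeptide(abc_modules=1, im_modules=1)]))   # fused half transporter
print(likely_class([Polypeptide(abc_modules=2, im_modules=0)]))   # soluble ABC2 protein
print(likely_class([Polypeptide(1, 0), Polypeptide(0, 1)]))       # separate ABC and IM chains
```

The SBP is deliberately left out of the sketch, because the text describes it as an additional component of most importers and not as a criterion that separates the classes.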
Proteins from all three domains of life (bacteria, archaea and eukaryota) are found in each class of ABC systems. This observation suggests that ABC systems began to specialize very early, probably before the separation of the three domains . Although the ABC module probably descends from a unique ancestor, there is now evidence that IMs of exporters (unfortunately, importers have not been included in this study) have at least three different ancestors . They have been named ABC1 with six TMHs, ABC2 with four TMHs and ABC3 with eight TMHs, and they correspond to transporters of class 1, to the cluster of drug and polysaccharide exporters of class 3, and to the o228 family of class 3 systems respectively (see Figure 1 and Table 1).
From these analyses, we propose two hypothetical scenarios for the origin and evolution of ABC systems. The ancestral ABC module was probably present in the LUCA (last universal common ancestor). The ABC module might have been associated with IM proteins of different origins to constitute transporters, and a duplication–fusion event led to class 2 proteins. In the first hypothesis, LUCA had probably all classes of ABC systems, which have been inherited by organisms of the three domains of life. Prokaryotes inherited all ABC classes. Class 3 systems, and in particular binding-protein-dependent importers, are virtually absent from eukaryotic genomes, although there is evidence that some were acquired first and lost subsequently, since remnants of such transporters have been found in the chloroplast genome of plants and algae (reviewed in [5,14,15]). According to an alternative hypothesis, LUCA had only the ABC module. All classes of ABC systems emerged in bacteria or archaea, and were extensively exchanged thanks to horizontal gene transfer. Eukaryotes probably acquired ABC systems from the symbiotic bacteria that are the putative ancestors of organelles. From genes encoding the so-called half transporters, eukaryotes developed DPL (ABCB), OAD (ABCC) and EPD (ABCG) full-length transporters by several independent duplication–fusion events.
It is not the purpose of this chapter to present an exhaustive analysis of these families, since other reviews deal with this topic [5,15]. Instead I update the information on recently characterized ABC transporters, and I review in more detail the functions of soluble ABC ATPases, which are involved in important housekeeping functions, with a focus on SMC (structural maintenance of chromosomes)-like ABC ATPases that have been recently included in the superfamily.
Class 1 transporters, all exporters?
Class 1 comprises proteins carrying IM and ABC modules in the same polypeptide chain; they segregate into several families and subfamilies, which are described in Table 1. The vast majority of these proteins are involved in export of metabolites, cell-surface components and noxious substances (reviewed in ). However, the transport polarity of a few class 1 transporters is still debated, as shown in the following selected examples.
In Arabidopsis, the AtABCB14 transporter, a member of the Pgp (P-glycoprotein) subfamily, modulates stomatal closure on transition to elevated CO2. In plants lacking AtABCB14, stomatal closure induced by high CO2 levels was accelerated. Malate has been suggested to be one of the factors mediating the stomatal response to CO2. Indeed, exogenously applied malate to plant epidermal strips induced a similar AtABCB14-dependent response to that of high CO2 levels. The gene encoding AtABCB14 was able to complement the growth defect of an Escherichia coli mutant affected in transport of C4-dicarboxylic acids, and to promote uptake of malate in this strain .
In bacteria, some members of the SID subfamily, such as the YbtPQ system in Yersinia, have been proposed to participate in the uptake of the siderophore yersiniabactin [21,22]. Similar conclusions were drawn from studies on homologous transporters in Mycobacteria. However, it was suggested that the homologous streptococcal transporter EqbKL is probably involved in the export of the siderophore equibactin.
Crystal structures help our understanding of the functional mechanism of class 2 soluble ABC ATPases
The UVR family involved in DNA repair and drug resistance
The UVR family comprises bacterial UvrA proteins involved in the NER (nucleotide excision repair) pathway of DNA, which involves the recognition and removal of damaged DNA and its repair. Briefly, a complex consisting of a dimer of UvrA and a monomer of UvrB recognizes the lesion. UvrA dissociates from the complex and UvrB recruits UvrC, which performs a dual incision at both sides of the lesion. Then, UvrD dislodges the damaged oligonucleotide, and the gap is repaired by the action of DNA polymerase I and DNA ligase (reviewed in ).
The three-dimensional structure of BstUvrA from Geobacillus stearothermophilus (formerly Bacillus stearothermophilus) has helped us to understand the basis of the interactions of UvrA with UvrB and DNA. The structure consists of a head-to-head dimer with four nucleotide-binding sites, each monomer being made of two tandemly repeated ABC modules folded in a head-to-tail conformation . It was proposed that ATP binding and hydrolysis regulate dimerization indirectly via rearrangement of the ATPase modules. The structure of the monomer revealed the presence of three zinc-binding motifs and two large IDs (inserted domains) within the helical domains of the N-terminal and the C-terminal ABC modules. The proximal IDs are important for the interaction between UvrA and UvrB . The ventral surface of BstUvrA was involved in DNA binding, on the basis of biochemical studies of mutant proteins in combination with analysis of the surface properties of the UvrA structure [26,27].
Some bacterial genomes encode a paralogue of UvrA that differs from canonical UvrA by the deletion of the UvrB recognition/interaction domain. In Streptomycetes, these proteins are involved in resistance to DNA-intercalating drugs, such as daunorubicin or nogalamycin [28,29], a role that appears to be independent of the UvrB and UvrC proteins. In addition to the canonical DrUvrA1 protein, the Deinococcus radiodurans genome encodes DrUvrA2, which lacks the UvrB-interaction domain and whose exact role is unclear. Recently, a detailed biochemical and structural analysis of DrUvrA2 has improved our understanding of the dynamics of DNA binding by UvrA proteins. Although the structures of the core ABC modules of BstUvrA and DrUvrA2 are very similar, the two proteins differ in the absence of the UvrB-binding domain in DrUvrA2 and in the disposition of the IDs. The IDs in DrUvrA2 adopt a much closer conformation with respect to BstUvrA. Moreover, three distinct crystal structures of DrUvrA2 show different orientations of the IDs with respect to the dimer core. The variety of ID conformations suggests a mechanism whereby the IDs move apart, embrace the DNA, and deliver it to the ventral surface of UvrA dimers, which contains residues that have been shown to be critical for DNA binding.
The ART family involved in gene regulation and in macrolide antibiotic resistance
Phylogenetic analyses distinguish a large class 2 family of proteins named ART, which can be subdivided into three subfamilies. The EF3 subfamily comprises proteins homologous to the yeast translation elongation factor eEF3 (eukaryotic elongation factor 3). Members of this family are found mainly in fungi, but also in some green algae, diatoms and large viruses infecting unicellular organisms (E. Dassa, unpublished work). The yeast protein is required for in vitro translation and for in vivo growth. It interacts with EF1α to stimulate binding of a cognate aminoacyl-tRNA to the ribosomal A (aminoacyl-tRNA)-site and to participate in the release of deacylated tRNA from the ribosomal E (exit)-site . Recent studies suggested that the post-termination complex consisting of a ribosome, mRNA and tRNA is completely disassembled into free subunits by eEF3 and ATP, thereby allowing a new round of translation . Higher eukaryotes lack a homologue of eEF3; they should therefore have different ribosome-recycling pathways. The crystal structure of eEF3 has been solved , showing that it consists of an N-terminal HEAT repeat domain, followed by a four-helix bundle and two ABC ATPase modules, with a chromodomain inserted within the helical domain of the second one. The two ABC domains are somewhat distant from each other on the crystal structure, but cryoelectron microscopy of a complex consisting of AMP-PNP (adenosine 5′-[β,γ-imido]triphosphate)-bound eEF3 and functional yeast post-termination ribosomes shows that ABC modules adopt a canonical closed head-to-tail conformation. eEF3 uses an entirely new factor-binding site near the ribosomal E-site, with the chromodomain likely to stabilize the ribosomal L1 stalk in an open conformation, thus allowing tRNA release .
REG subfamily members participate in several non-transport processes. Although these proteins are well conserved (see Figure 2), no clear picture emerges from their described functions (reviewed in ). The two most thoroughly investigated proteins are the yeast protein Gcn20p (Gcn is general control non-derepressible), which was shown to participate in the complex regulation of Gcn4p, a protein that stimulates the transcription of amino acid biosynthetic genes in response to amino acid starvation (reviewed in ); and the E. coli Uup protein, a generalist DNA-binding protein whose inactivation leads to an increase in precise excision of transposons and a decrease in growth fitness [36,37].
ARE subfamily proteins are involved in resistance towards MLS (macrolide/lincosamide/streptogramin) antibiotics, which target ribosomes, by an unknown mechanism. They have been proposed either to participate in the constitution of efflux systems, whose membrane partners await identification, or to act at the ribosome to impair access of antibiotics to their target on 23S rRNA (reviewed in ).
The RLI family involved in ribosome biogenesis and translation termination
The RLI family is conserved among eukaryotes and archaea, but no information is available on the archaeal proteins. After the initial observation that mammalian RLI or ABCE1 binds to RNAse L, modulating interferon production and the stability of mRNAs , several reports suggested that this protein is involved in ribosome assembly. Depletion of yeast RLI1 in vivo leads to a cessation of growth, a lower polysome content and a decrease in the average size of a polysome . RLI1 was also found associated with both pre-40S particles and mature 40S subunits, and with eIF (eukaryotic translation initiation factor) 3, eIF5 and eIF2 [41,42]. RLI1 is associated with ribosomes and with Hcr1p, a protein involved in rRNA processing and translation initiation. Depletion of RLI1 also causes a nuclear export defect of the small and large ribosomal subunits and subsequently a translational arrest . A new function for RLI1 in translation termination was recently identified [44,45], in addition to its roles in ribosomal subunit maturation, transport of ribosomal subunits to the cytoplasm and translation initiation. RLI1 interacts physically with eRF (eukaryotic translation termination release factor) 1/Sup45 and eRF3/Sup35 in Saccharomyces cerevisiae. Down-regulation of RLI1 expression leads to defects in the recognition of a stop codon, as seen in mutants of other termination factors. The [Fe–S] cluster of RLI1 is required for its activity in translation termination . Biochemical analyses of post-termination complexes showed that ABCE1 dissociates post-termination complexes into free 60S subunits and an mRNA–tRNA–40S complex . This differs from what was observed with eEF3 (see above) and suggests that the mRNA–tRNA–40S complex may reinitiate translation of a downstream ORF (open reading frame) located on the same mRNA .
The crystal structure of the Pyrococcus furiosus RLI at 1.9 Å (1 Å=0.1 nm) resolution, devoid of its [Fe–S]-binding sites, and that of the complete protein of Pyrococcus abyssi were determined in complexes with Mg2+ and ADP [46,47]. The ABC modules of RLI adopt a canonical ABC dimer arrangement, with two composite active sites. The linker between the two ABC modules and the C-terminus of the protein constitute a hinge at the interface of the ABCs opposite to the active-site cleft, and mutations in the linker eliminate function. The first ABC module contains an HLH (helix–loop–helix) insertion in its helical domain whose exact role is unknown. RLI1 harbours two essential [4Fe–4S] clusters, structurally highly similar to those of bacterial ferredoxins; seven of the eight conserved cysteine residues co-ordinating the [Fe–S] clusters are essential for cell viability. The close proximity of the [Fe–S] domain to the adenine and ribose-binding region on the N-terminal ABC module suggests a link between [Fe–S] domain function and ATP-induced conformational changes.
Class 3 systems, a common core organization and opposite polarities of transport
Class 3 systems carry their functional modules on separate polypeptide chains, encoded by different genes, which are usually organized in an operon. They are found in bacteria and archaea, with the notable exception of the ABCA subfamily, which clusters within the DRA family and whose members are identified exclusively in eukaryotes (reviewed in ).
The vast majority of class 3 systems are importers of various nutrients. Their function depends on the presence of an extracellular SBP that recognizes substrates with high affinity (reviewed in [5,50]). SBP-dependent ABC importers are probably the best characterized ABC systems at the structural and mechanistic levels. The functions of the putative importers of the MKL family have been characterized [51–57]. Such a system is composed of one ABC, one IM, one membrane-anchored extracytoplasmic protein that may act as a substrate-binding protein and another, soluble, extracytoplasmic protein. Originally implicated in toluene tolerance in Pseudomonas putida and in intracellular spreading in Shigella flexneri, these systems have been demonstrated to be involved in retrograde transport of phospholipids, from the outer to the inner membrane. In Actinobacteria, an MKL system called Mce4 was shown to be involved in cholesterol import; it is encoded by an operon included in a large cluster of genes whose products are cholesterol-regulated and implicated in the degradation of steroids [55,56]. A homologous plastidic system was shown to mediate the import of lipid precursors from the endoplasmic reticulum into the chloroplast membrane (reviewed in ). These data support the notion that MKL systems constitute a conserved pathway of lipid import.
Class 3 also contains a family of SBP-independent importers (CBNV family), also called ECF (energy-coupling factor) transporters, involved in the uptake of cobalt and nickel  and vitamins and cofactors [59–61] in bacteria and archaea. ECF transporters consist of an ABC protein (component A), a conserved transmembrane protein (component T) and a transmembrane substrate-capture protein (component S) with an unknown stoichiometry. Remarkably, components S are able to promote a basal transport of substrates, without the help of the energy-coupling components (reviewed in ).
Another subclass of class 3 systems also lacks SBPs, but their members are involved in drug resistance and in the biogenesis of extracellular complex polysaccharides (DRA, DRI and CLS families). These systems have been proposed to participate in the export of such molecules, but none was demonstrated to involve an efflux mechanism directly. These transporters cluster with importers, suggesting either that their transport polarity has changed during evolution or that they are not directly involved in the export of these substances (reviewed in [5,14]).
New members that join the team
Comparisons of three-dimensional structures have allowed the assignment to the ABC superfamily of a significant number of protein families that were not retrieved by BLAST searches, either because the overall similarity was lower than the default cut-off or because conserved motifs were separated from each other by the insertion of large domains, such as coiled coils. However, all distinctive motifs of ABC ATPases could be found in their sequences, except in MutS (Figure 3).
SMC proteins involved in chromosome organization
SMC proteins perform key functions including cohesion of sister chromatids, and condensation and segregation of chromosomes. Some SMC proteins also participate in DNA repair (see [63,64] for comprehensive reviews). These proteins, found in eukaryotes and in most prokaryotes, form a highly conserved family. The primary structure of SMC proteins consists of five distinct domains. The highly conserved N-terminal domain contains the Walker A motif and the Q-loop; the C-terminal domain contains the ABC signature and the Walker B motif. These domains are separated by a central domain that contains the ‘hinge’, which is flanked by two long helices. The latter form an intramolecular coiled coil that brings the N- and C-terminal domains into close contact, forming a functional ABC ATPase monomer called the ‘head’. Eukaryotic genomes encode six paralogous SMC proteins that form three different heterodimers, thanks to interaction between hinge domains (see  for a review). SMC1, SMC3, Pds5, Scc3 and α-kleisin constitute the cohesin complex, which is essential for sister chromatid cohesion during mitosis. SMC2 and SMC4 interact with CAP (chromosome-associated protein)-D2, CAP-G and γ-kleisin to form the condensin complex that plays a central role in DNA condensation into chromosomes and in chromatid segregation during metaphase and anaphase steps of mitosis (reviewed in ). The SMC5–SMC6 complex is less well characterized, and appears to function in the response to DNA damage, in recombination at DNA DSBs (double-strand breaks), in the regulation of the stability of rDNA and much more (see  for a review). SMC heterodimers interact via their hinges to form V-shaped assemblies, and also via hinges and ATPase domains to form ring-like structures. Such a conformation was proposed, from cellular biology and microscopy approaches, to embrace two chromatids in order to maintain them in close proximity.
Bacterial and archaeal genomes encode several different SMC proteins, which perform essentially the same functions as their eukaryote homologues. In contrast with their eukaryote counterparts, prokaryote SMCs form homodimers . In γ-proteobacteria, including E. coli, SMC is replaced with its distant relative called MukB . In bacterial cells, SMC/MukB may contribute to chromosome condensation and segregation. SMC/MukB interact with the kleisin subunits ScpA/MukF and with ScpB/MukE. Microscopic observations suggest that SMC may be loaded on to newly replicated DNA at the replication forks to condense newly synthesized regions; subsequently, these regions are moved away from the centrally located replication machinery towards opposite cell poles by an unknown mechanism. Previously, it was shown that MukBEF could bridge two DNA molecules in multiple steps . First, a MukB–DNA complex is formed, which subsequently captures another protein-free DNA fragment . The initial tether is quickly strengthened by recruitment of additional MukB proteins. DNA bridging is modulated by ATP and MukEF. These findings explain how SMC proteins catalyse both intra- and inter-chromosomal links inside the cell. Thus bacterial SMC proteins act as global organizers of the chromosome.
The crystal structure of the head domain of Thermotoga maritima SMC  in a monomeric form displays a typical ABC fold, which is almost identical with that of the monomer of the ATPase of Rad50 (see below).
ABC proteins involved in DNA DSB repair
DSBs can be produced by a variety of different mechanisms. In eukaryotes and archaea, the SMC-like protein Rad50, in complex with Mre11 and Nbs1, is an essential component involved in early steps of DNA repair at DSBs, including damage recognition, DNA end processing and cell-cycle arrest signalling (reviewed in ). Mutations in human genes encoding Nbs1 and Mre11 lead to diseases that predispose to cancer: Nijmegen breakage syndrome and ataxia telangiectasia-like disorder respectively (see  for a review). In a patient with a Nijmegen breakage syndrome-like disorder, mutations were identified in the RAD50 gene. Rad50 proteins display the same organization as SMC proteins, but a smaller CXXC domain, called the hook, replaces the hinge domain. Two CXXC hook domains can co-ordinate a zinc ion, thereby allowing two Rad50 molecules to interact. In the course of DNA-damage recognition, multiple Rad50–Mre11–Nbs1 complexes bind to DNA. It is now widely accepted that the complex serves to prevent separation of chromosomes at a DSB. The DNA ends are modified by unwinding and nuclease digestion, a step that requires additional unidentified components. Interaction between hook domains serves to tether the two DNA ends. Repair of broken DNA ends is then achieved either by homologous recombination or by non-homologous end-joining. A large body of evidence from many organisms indicates that the Mre11 complex also has important functions during the process of DNA replication. The crystal structure of the ATPase module of P. furiosus Rad50 was instrumental in defining the biologically relevant dimeric state of ABC ATPases, and provided a framework for understanding how ATP controls conformational switching in the ABC ATPase superfamily.
In bacteria, a role in DNA repair was proposed for the SbcCD complex. SbcC shares with Rad50 proteins a central domain carrying a zinc bridge motif (reviewed in ). SbcC interacts with SbcD, an Mre11-like endonuclease. In E. coli, the SbcCD complex is able to cleave hairpin structures in DNA that can halt replication . SbcCD is required, along with RecBCD and RecFOR complexes (see the next paragraph), for repair of DSBs generated by a restriction endonuclease, through homologous recombination with an intact homologous chromosome . However, an alternative model for DSB repair at hairpins that does not require homologous recombination was proposed . A non-polar deletion of the sbcC gene rendered Bacillus subtilis cells sensitive to mitomycin C, which causes interstrand cross-links in DNA, and to γ-irradiation, which provokes DSBs in DNA. This sensitivity is increased in a recN-deletion background, indicating that SbcC and RecN (see below) determine independent DNA-repair pathways . In B. subtilis, SbcC accumulates at the replication centres upon induction of DNA cross-links. These findings suggest that the SbcCD complex is involved in the restart of stalled replication forks, possibly by promoting homologous recombination for replication restart .
RecN is a member of the SMC protein family and has been implicated in DNA repair via homologous recombination in several bacterial species. The coiled-coil domain of RecN is shorter than that of SbcC, and its central domain is different from the hinge and zinc-binding domains of SMC- and Rad50-like proteins. In B. subtilis, RecN accumulates early at defined DSBs and later at sites that are away from the replication centres in nucleoids. This indicates that RecN and SbcCD collaborate in DSB repair. The subsequent recruitment of RecA and other proteins at DSBs would allow recombinational repair between the broken and intact chromatids.
Evolution of SMC-like proteins
Together with some proteins encoded in the genomes of certain phages, prokaryote and eukaryote SMC proteins constitute a vast conserved subclass of ABC proteins involved in DNA dynamics and repair [83,84]. Phylogenetic analyses suggest that SMC-like proteins evolved from a common ancestor. The tree displays two major branches. The first comprises SMC proteins stricto sensu (cohesins and condensins) with eukaryal, bacterial and archaeal sub-branches, and the second one contains three sub-branches with the Rad50 homologues (eukaryota, archaea and SbcC), the MukB homologues and the RecN homologues. Each of the six eukaryotic SMC subfamilies probably originated from an ancient common ancestor through a series of gene-duplication events .
An ABC protein involved in recombination-mediated DNA repair
In E. coli and in many other bacteria, there are two main pathways of homologous recombination, both of which require the action of RecA to perform a strand invasion and exchange reaction. The primary function of these two pathways in bacteria is the recombination-mediated repair of stalled or collapsed DNA replication forks (see  for a review). The first is dependent on the RecBCD complex (AddAB in firmicutes), and acts on DNA ends. The second involves the RecFOR complex, and acts primarily on gaps within DNA, but also on DNA ends under certain conditions. The crystal structure of RecF from D. radiodurans has been determined. The monomer consists of a RecA-like subdomain containing the Walker A and B motifs and a mostly helical domain that contains the signature motif. RecF is structurally similar to the head region of Rad50, but lacks its long coiled-coil region.
MutS, another potential ABC protein?
It has been proposed, by comparison of crystal structures, that the ATPase module of MutS, a protein involved in DNA mismatch repair, belongs to the ABC superfamily [88–90]. Although its ATPase modules interact together in a head-to-tail conformation, as do ABC proteins, MutS ATPase is apparently devoid of a large helical domain and there is no evidence for the conservation of the Q-loop and the ABC signature (Figure 3). These considerations suggest that MutS is distant from bona fide ABC ATPases.
ABC systems are involved in the transport of very diverse molecules, and class 2 and SMC ATPases perform important housekeeping functions. The homology of ABC modules of all classes and the similarity in their structures and conformational changes upon ATP binding suggest a common mechanism of energy coupling. We are far from understanding the diversity of ABC systems, since only a negligible minority among those detected in genomes has been functionally characterized. Investigation of the remaining ABC systems will illustrate their functional diversity and their versatility. The quite good correlation between the sequence of ABC ATPases and their overall function is helpful for the annotation of ABC systems in sequenced genomes. We are maintaining ABCISSE, a database which includes functional, sequence and structural information on ABC systems and which is available at http://www.pasteur.fr/abcisse.
• ABC systems form one of the largest families of proteins.
• ABC modules segregate into three classes (mainly import, export and soluble systems); each of these comprises proteins from the three domains of life. Their divergence probably occurred once in the history of ABC systems.
• Class 1 and 3 systems are involved in the transport of a wide variety of different molecules.
• Class 2 systems participate in mRNA translation, ribosome biogenesis and DNA repair by nucleotide excision.
• New soluble ATPases have recently been included within the ABC superfamily; they perform vital functions in living organisms: DNA replication and repair by recombination.
- © The Authors Journal compilation © 2011 Biochemical Society