4.2: Genetic screening for mutations- Forward genetics - Biology

One way to identify genes that affect a particular biological process is to induce random mutations in a large population, and then look for mutants with phenotypes that might be caused by a disruption of a particular biochemical pathway. This strategy of mutant screening has been used very effectively to identify and understand the molecular components of hundreds of different biological processes. For example, to find the basic biological processes of memory and learning, researchers have screened mutagenized populations of Drosophila to recover flies (or larvae) that lack the normal ability to learn. Because of the similarity of biology among all organisms, some of the genes identified by this mutant screen of a model organism may be relevant to learning and memory in humans, including conditions such as Alzheimer’s disease.

Genetic Screens

In a typical mutant screen, researchers treat a parental population with a mutagen. This may involve soaking seeds in EMS, or mixing a mutagen with the food fed to flies. Usually, no phenotypes are visible among the individuals that are directly exposed to the mutagen because in all the cells every strand of DNA will be affected independently. Thus the induced mutations will be heterozygous and limited to single cells. However, what is most important to geneticists are the mutations in the germline of the mutagenized individuals. The germline is defined as the gametes and any of their developmental precursors, and is therefore distinct from the somatic cells (i.e. non-reproductive cells) of the body. Because most induced mutations are recessive, the progeny of mutagenized individuals must be mated in a way that allows the new mutations to become homozygous (or hemizygous). Strategies for doing this vary between organisms. In any case, the generation in which induced mutations are expected to occur can be examined for the presence of novel phenotypes. Once a relevant mutant has been identified, geneticists can begin to make inferences about what the normal function of the mutated gene is, based on its mutant phenotype. This can then be investigated further with molecular genetic techniques.

Exposure of an organism to a mutagen causes mutations in essentially random positions along the chromosomes. Most of the mutant phenotypes recovered from a genetic screen are caused by loss-of-function mutations. These alleles are due to changes in the DNA sequence that cause it to no longer produce the same level of active protein as the wild-type allele. Loss-of-function alleles tend to be recessive because the wild-type allele is haplosufficient (see Chapter 3). A loss-of-function allele that produces no active protein is called an amorph, or null. On the other hand, alleles with only a partial loss-of-function are called hypomorphic. More rarely, a mutant allele may have a gain-of-function, producing either more of the active protein (hypermorph) or producing an active protein with a new function (neomorph). Finally, antimorph alleles have an activity that is dominant and opposite to the wild-type function; antimorphs are also known as dominant negative mutations.
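The allele classes named above can be summarized as a small decision rule. This is only an illustrative sketch: the function name `classify_allele`, the `relative_activity` scale (1.0 meaning wild-type level), and the two boolean flags are assumptions introduced here, not part of any standard tool.

```python
def classify_allele(relative_activity, new_function=False, antagonizes_wild_type=False):
    """Return the allele class described in the text.

    relative_activity: level of active protein relative to wild type (1.0 = wild type).
    new_function, antagonizes_wild_type: hypothetical flags supplied by the experimenter.
    """
    if antagonizes_wild_type:
        return "antimorph"    # dominant negative: opposes the wild-type function
    if new_function:
        return "neomorph"     # active protein with a new function
    if relative_activity == 0:
        return "amorph"       # null: no active protein
    if relative_activity < 1:
        return "hypomorph"    # partial loss of function
    if relative_activity > 1:
        return "hypermorph"   # more active protein than wild type
    return "wild-type-like"


print(classify_allele(0.0))
print(classify_allele(0.3))
print(classify_allele(1.0, antagonizes_wild_type=True))
```

In practice, assigning an allele to one of these classes relies on genetic tests rather than a single activity number, so the thresholds here should be read as a mnemonic.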

Forward genetics

Forward genetics is a molecular genetics approach for determining the genetic basis of a phenotype. Forward genetics methods begin with the identification of a phenotype and then find or create model organisms that display the characteristic being studied.

This was initially done by using naturally occurring mutations or inducing mutants with radiation, chemicals, or insertional mutagenesis (e.g. transposable elements). Subsequent breeding takes place, mutant individuals are isolated, and then the gene is mapped. Forward genetics can be thought of as a counter to reverse genetics, which determines the function of a gene by analyzing the phenotypic effects of altered DNA sequences. [1] Mutant phenotypes are often observed long before having any idea which gene is responsible, which can lead to genes being named after their mutant phenotype (e.g. Drosophila rosy gene which is named after the eye colour in mutants). [2]

Genetic screening of non-classic CAH females with hyperandrogenemia identifies a novel CYP11B1 gene mutation

Objective: Congenital adrenal hyperplasia (CAH) is an autosomal recessive endocrine disorder with symptoms of varying severity. Mild hyperandrogenemia is the most common clinical feature in non-classic CAH patients, and 95% of the cases are identified by mutations in the CYP21A2 gene. In the present study, the second most common cause for non-classic CAH (NC-CAH), 11β-hydroxylase deficiency due to mutations in the CYP11B1 gene, is investigated.

Design: Screening of the CYP21A2 and CYP11B1 genes by direct sequencing was carried out for the detection of possible genetic defects in patients with suspected CAH.

Results: It was observed that CYP11B1 variants co-exist with mutations in CYP21A2 only in rare cases in patients clinically diagnosed with CAH. A total of 23 NC-CAH female patients out of 75 were identified with only one mutation in the CYP21A2 gene. The novel CYP11B1 gene mutation, p.Val484Asp, was identified in a patient with CAH in the heterozygous state. Structural characterization indicated that the novel p.Val484Asp variant likely causes distortion of the surrounding beta sheet and indirect destabilization of the cavity that occurs on the opposite face of the structural elements, leading to partial impairment of the enzymatic activity.

Conclusions: CYP21A2 gene mutations are the most frequent genetic defects in cases of NC-CAH, even when these patients are in the heterozygous state. These mutations have a diverse phenotype, giving rise to a variable extent of cortisol synthesis impairment. It is also clear that CYP11B1 mutations are a rare type of defect causing CAH.


Production of Caméor mutant population

Caméor is an early-flowering garden pea cultivar that completes its reproductive cycle within four months, permitting three successive generations a year under greenhouse conditions. Although pea is predominantly self-fertilizing, some residual cross-pollination can occur. In order to avoid contamination, 100 Caméor plants, derived from single seeds, were analyzed for genetic uniformity using a set of 16 short sequence repeat markers distributed over every arm of the seven predicted pea chromosomes [23] and left to set seeds in insect-proof greenhouses. In total, 10,000 Caméor seeds were produced and used to create the mutant population.

In order to balance maximum mutation density with acceptable plant survival rate, we first conducted a 'kill-curve' analysis on batches of 100 seeds, using a range of doses from 8 to 57 mM EMS. Most treated first generation mutant (M1) plants exhibited retarded growth at an early seedling stage, but all of them recovered. Thirty plants from each treatment were then grown until maturity and assessed for fertility and seed production. A high loss of fertility was observed at the highest doses, with fewer than 30% of plants fertile at doses higher than 32 mM EMS. The highest EMS doses allowing 50% of plants to set seeds, 16 mM and 24 mM, were retained and tested on large batches of seeds (Table 1). Little difference was observed between these two doses, with a tendency toward higher seed production with 16 mM EMS, so a final dose of 20 mM EMS was used for population production. The mean number of seeds per pod was also slightly higher for the plants treated with 16 mM than for those treated with 24 mM EMS. The high rate of arrested embryos in pods of M1 plants treated with EMS doses of 16-24 mM attested to the efficacy of the mutagenesis. Out of 8,600 M1 plants, 4,817 lines that had produced more than 5 M2 seeds each were individually harvested. To produce M3 seeds, four M2 seeds per M1 plant were sown in two-liter pots and M3 seeds were harvested from two sister plants, referred to as A and B. Leaf material was harvested from the healthiest looking plant, referred to as A (Figure 1). Seed stocks were sent to the Grain Legumes stock center in Dijon for multiplication, distribution and long-term storage of the lines.

Establishment of pea EMS mutant library. Caméor seeds were EMS mutagenized. Out of 8,600 M1 plants self-fertilized in an insect-proof glasshouse, 4,817 produced more than 5 M2 seeds each. Four M2 seeds, referred to as A-D, per M1 parent were grown to maturity and scored for phenotypes. DNA was extracted from the plants referred to as A, which were left to set M3 seeds. As a backup, M3 seeds were harvested from the sister B plants. The collected M3 seeds were sent to the Grain Legumes Biological Resource Center for distribution, maintenance of the lines and long-term storage of the mutant library.

Phenotyping of the Caméor mutant population

As we intended to create a reference mutant collection that could be used for forward and reverse genetics, we carried out a systematic phenotyping of the mutant population. Our phenotype scoring was based on visual characterization of four plants per M2 family at key developmental stages, from germination until fruit maturation. To facilitate the phenotype scoring we defined a phenotype ontology adapted to pea. This phenotyping tool does not cover all phenotypic alterations (for example, no root evaluation was carried out) and was constructed for high-throughput scoring of many mutant lines in a relatively short growing season. The vocabulary used to describe the mutant plants was organized in a hierarchical tree and is composed of 107 subcategories of phenotypes clustered at different levels. The complete list of the vocabulary used is shown in Additional data file 1 and the number of lines found in each major phenotype category is shown in Table 2.

Out of the 4,817 M2 families, 1,840 showed a visible phenotype, which represents 38% of the lines. Among the lines that showed a visible phenotype, 45% were scored for a single phenotype and 55% displayed multiple phenotypes, that is, they fall into more than one major phenotype category (Figure 2a). This rate of pleiotropy is an underestimate, as the phenotypic characterization is based on high-throughput visual observation of only four plants per M2 family. Detailed morphological and biochemical characterization of higher numbers of plants per M2 family would result in more phenotypic effects per mutant and, thus, a higher rate of pleiotropy. The most commonly observed phenotypes are related to stem size, leaf and plant architecture, followed by those related to cotyledons, stipules and seeds, with the least abundant phenotypes being related to flowers, plantlet architecture and petiole morphology (Figure 2b). Examples of phenotypes corresponding to the primary categories described are shown in Figure 3.
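One reason that scoring four plants per family can miss phenotypes follows from simple Mendelian arithmetic. The sketch below assumes a selfed M1 parent that is fully heterozygous and non-chimeric for a recessive mutation, so each M2 plant is homozygous with probability 1/4. Real M1 plants are often chimeric, which would lower these numbers. The function name is hypothetical.

```python
def prob_recessive_seen(n_plants, p_homozygous=0.25):
    """Probability that at least one of n_plants M2 siblings is homozygous mutant."""
    return 1 - (1 - p_homozygous) ** n_plants


# Compare the four plants scored per family with larger hypothetical sample sizes.
for n in (1, 4, 8, 16):
    print(n, round(prob_recessive_seen(n), 3))
```

Under these assumptions, four plants give roughly a 68% chance of seeing a given recessive phenotype at least once, which is consistent with the authors' remark that scoring more plants per family would reveal more phenotypes.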

Distribution of phenotypic characteristics of the mutant population and rate of pleiotropy. (a) Number of M2 families in each phenotypic group. The x-axis indicates the nine major phenotypic categories, listed in Table 2, and the y-axis indicates the total number of M2 families. Each bar represents the number of mutants in the corresponding category. The blue bar represents the quantity of pleiotropic mutants (having more than one phenotype), given by the first number in the category label. The red bar represents the non-pleiotropic mutants and is given by the second number in the category label. (b) Total number of M2 families (y-axis) sharing 1-5 major phenotypic categories (x-axis). The bar for one phenotypic category indicates how many mutants are categorized in only one phenotypic group (non-pleiotropic mutants), and the bars for the 2-5 phenotypic categories represent the number of mutants that share two to five phenotypes, respectively. In each case, the total number of mutants is indicated on the top of the bar.

Examples of mutant phenotypes representing the nine major phenotypic groups. (a) Plant 566: cotyledon color, albino. (b) Plant 939: plantlet architecture, bushy; plant architecture, hyper compact; leaf color, pale green; stem size, extreme dwarf. (c) Plant 54: plant architecture, determinate growth. (d) Plant 1,236: plant architecture, basal branching; leaf color, pale green, yellow; leaf size, medium; stem size, dwarf. (e, f) Plant 903: leaf, cone shaped at leaf base; flowers, sterile flowers. (g) Plant 1,567: leaf, distorted; stipule, silver-argentous. (h) Plant 630: flowers, cauliflower type inflorescence; flowers, abnormal all; stem, dwarf; leaf, upcurling.

Caméor TILLING platform

To set up the pea TILLING platform, DNA samples were prepared from 4,704 M2 plants, each representing an independent family and organized in pools of 8 M2 families. One key factor in TILLING is the availability of the annotated genomic sequence of the gene to be tilled. Even though the pea genome has not yet been sequenced, acquisition of the genomic sequences of target genes is facilitated by the high degree of synteny between pea and the model plant Medicago truncatula, which is being sequenced [24]. The CODDLE program (Codons Optimized to Discover Deleterious Lesions [25, 26]) combined with the PRIMER3 tool [27] are used to define the best amplicon for TILLING. PCR products used for TILLING have a maximum size of about 1,500 bp and, therefore, longer genes are divided into several amplicons. To reduce variation in the quality and the quantity of the PCR amplification product due to the pea genome complexity and low amount of genomic DNA used in PCR, nested PCR is performed. Mutations are detected in the amplified targets using the mismatch-specific endonuclease ENDO1, as described previously [28]. Individual mutant lines are identified following a pool deconvolution step, and then the mutated base is identified by sequencing.
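The pooling and amplicon-size constraints described above amount to simple bookkeeping. A minimal sketch, assuming families are pooled in numerical order and that a long gene is cut into equal, non-overlapping windows; on the real platform CODDLE and PRIMER3 choose the amplicons, so this is an illustration only, and all function names are hypothetical.

```python
import math

POOL_SIZE = 8            # M2 families per DNA pool (from the text)
MAX_AMPLICON_BP = 1500   # approximate maximum TILLING amplicon size (from the text)


def make_pools(family_ids, pool_size=POOL_SIZE):
    """Group family identifiers into consecutive pools of pool_size."""
    return [family_ids[i:i + pool_size] for i in range(0, len(family_ids), pool_size)]


def split_gene(gene_length_bp, max_bp=MAX_AMPLICON_BP):
    """Cut a gene into the fewest equal windows that each fit within max_bp."""
    n_windows = math.ceil(gene_length_bp / max_bp)
    size = math.ceil(gene_length_bp / n_windows)
    return [(start, min(start + size, gene_length_bp))
            for start in range(0, gene_length_bp, size)]


pools = make_pools(list(range(1, 4705)))
print(len(pools))         # 4,704 families at 8 per pool
print(split_gene(4000))   # a hypothetical 4 kb gene
```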

A primary objective in a mutagenesis project is to generate a saturated resource where every locus is mutated and represented by multiple alleles. To evaluate the existence of multiple alleles per locus, we screened for mutations in the pea Methyl transferase 1 gene (PsMet1) [29]. Three amplicons of 1,383, 1,310 and 1,149 bp were tilled (Figure 4) and 96 mutants were identified (Figure 5). Sequence analysis of the mutations showed that 6 were intronic, 37 silent, 50 missense and 3 nonsense mutations (Figure 4b). Although characterization of PsMet1 mutants is beyond the scope of this article, we found that retrieval of the mutant alleles from the A plant M3 seed stocks was successful, without the need to use backup M3 seed stocks collected from the sister B plants (Figure 1). The exonic mutants were mostly present as heterozygotes (79 out of 90 mutations), but 11 lines were homozygous for the mutations. As expected with EMS mutagenesis, these mutations were distributed relatively evenly within the screened amplicons (Figure 4b).

Comparison between predicted and obtained mutations. (a) Output of the CODDLE program using as an example the PsMet1 genomic sequence. Exons are represented by white boxes and introns by red lines. The CODDLE program was used to identify those regions of the gene in which G:C to A:T transitions are most likely to result in deleterious effects on the encoded protein (represented by the probability curve traced in turquoise). The CODDLE algorithm is based on an evaluation of protein sequence conservation from comparison of database accessions of homologous proteins. For PsMet1, three fragments were chosen based on these CODDLE results (blue lines). External and internal primers were designed to amplify each region by nested PCR. (b) Graphic representation of mutations identified in the three regions of the gene PsMet1. This drawing was made using the PARSESNP program [43], which maps the mutation on a gene model to illustrate the distribution of mutations. Purple triangles represent silent mutations and black and red triangles represent missense and truncation mutations, respectively.

TILLING screen. Example of a PsMet1 TILLING screen on eightfold pooled pea DNA. The image of the cleavage reaction is collected from both channels (dyes IRD700 and IRD800). The sizes of the cleavage products (circled) from the two dye-labeled DNA strands (red or green) add up to the size of the full-length PCR product (top of the gel). PCR artifacts are distinguishable from true mutants by yellow points (red and green added) as they appear at the same size in both channels. The size of the cleavage product (the sizing ladder can be seen at the left and middle of the image) indicates approximately where the single nucleotide polymorphism is located in the fragment.
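The gel-reading rule in this caption (fragment sizes in the two channels that sum to the full-length product, versus artifacts at the same size in both channels) can be sketched as follows. The function name, the 20 bp tolerance, the example band sizes, and the assumption that IRD700 labels the forward primer are all introduced here for illustration.

```python
def interpret_bands(amplicon_bp, ird700_bp, ird800_bp, tol_bp=20):
    """Classify a pair of bands seen in the two dye channels of one lane.

    Returns (call, approximate position from the IRD700-labeled end or None).
    """
    sums_to_full_length = abs(ird700_bp + ird800_bp - amplicon_bp) <= tol_bp
    same_size = abs(ird700_bp - ird800_bp) <= tol_bp
    if sums_to_full_length:
        return ("candidate mutation", ird700_bp)
    if same_size:
        return ("artifact", None)      # same size in both channels, sizes do not add up
    return ("unexplained", None)


print(interpret_bands(1383, 400, 983))   # complementary fragments
print(interpret_bands(1383, 600, 600))   # identical bands in both channels
```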

To further evaluate the quality of the mutant population, we extended the TILLING screen to another 19 genes and identified 371 point mutations in those genes (Table 3). As expected for EMS, all the mutations were G:C to A:T transitions [6, 30]. Induced mutations discovered in exons consisted of 66.75% missense, 28.51% silent and 4.74% stop mutations (Table 4). The observed proportion of missense mutations was slightly higher than that predicted by CODDLE (63.80%), whereas stop mutations were recovered in a slightly lower proportion than predicted (6.90%). As many tilled amplicons harbor intronic segments, some recovered mutations were intronic. Although some of these could potentially affect the efficiency of mRNA splicing, such an impact is unpredictable. Thus, intronic mutants were not characterized further. In contrast, the large number of non-synonymous mutations recovered is of interest as they may lead to gain- or loss-of-function phenotypes. Such mutations will also permit dissection of the function of the protein with respect to its sub-domain structure.

We calculated the mutation frequency in the 20 targeted genes (Table 3) according to Greene et al. [6]: mutation frequency equals the size of the amplicon multiplied by the total number of samples screened divided by the total number of identified mutants. We estimated the average mutation rate to be one mutation every 200 kb. This mutation density is 1.5 times higher than the rate of one mutation per 300 kb reported for Arabidopsis, the best characterized TILLING mutant population to date [6]. Therefore, the 16-24 mM dose of EMS used to create the pea mutant population appears to be an adequate dose for TILLING. On average, we identified 34 alleles per tilled gene (after normalization to TILLING of the entire population). Considering that about half of missense mutations should have a deleterious effect on a typical protein [31], 25 alleles per tilled kilobase would be sufficient for phenotypic analyses.
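The frequency calculation quoted above can be written out directly. The sketch below applies it to the PsMet1 figures given earlier (three amplicons, 96 mutations), assuming that all 4,704 pooled families were screened for every amplicon and applying no correction for unreadable amplicon ends. Both are assumptions made here, and the function name is hypothetical.

```python
def kb_per_mutation(total_amplicon_bp, samples_screened, mutations_found):
    """Screened sequence (amplicon size x samples) per identified mutation, in kb."""
    return total_amplicon_bp * samples_screened / mutations_found / 1000


psmet1_bp = 1383 + 1310 + 1149   # the three PsMet1 amplicons from the text
print(round(kb_per_mutation(psmet1_bp, 4704, 96)))
```

Under these assumptions PsMet1 alone gives about one mutation per 188 kb, in line with the reported average of one mutation every 200 kb.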

Setup of the UTILLdb database

We scored 4,817 lines in the mutant population for phenotypic alterations using 107 subcategories of phenotypes. In TILLING screens we searched for mutations in 20 genes and identified 467 alleles. In order to manage and integrate the expanding data from both the phenotype recordings and TILLING target genes, we implemented the database UTILLdb. UTILLdb was developed according to a relational database system, interconnecting four main modules: lines, phenotype categories, sequences and mutations. Two main types of data are accessible, the morphological phenotypes of mutants and the sequences of tilled genes and corresponding alleles, when available. UTILLdb may be searched using a sequence, through a BLAST tool [32] or for a phenotypic feature using a keyword search. The outcome of the search is shown as a table of results that displays the phenotype of each line, with associated pictures and mutated sequence if it exists. Thus, the user could ask whether lines that share mutations in a specific gene share the same phenotypes and vice versa. As we expect the phenotypic characterization of the TILLING mutants to become more detailed as they are analyzed by UTILLdb users, UTILLdb was designed so that the passport data of the mutant lines can be extended or modified as needed. UTILLdb is publicly accessible through a web interface [33]. A link is implemented to facilitate seed ordering. UTILLdb serves also as an entry point for users wishing to have their favorite gene tilled on the Caméor TILLING platform. Results from those screens as well as the phenotype of the mutants identified will be implemented in UTILLdb.

3. Target-selected mutagenesis

Target-selected mutagenesis is used to identify mutations in a specific gene from a randomly mutagenized genome. The approach involves standard mutagenesis using EMS or UV/TMP, for example, followed by screening using the polymerase chain reaction (PCR) or another method to identify animals carrying lesions in the gene of interest.

Two community consortia, the C. elegans Knockout Consortium and the C. elegans National Bioresource Project of Japan (NBRP), are currently using target-selected mutagenesis to identify mutant alleles for every gene in the C. elegans genome (C. elegans Deletion Mutant Consortium, 2012). Together, over 6,700 deletion or null alleles have been isolated by these groups and are available to the scientific community (C. elegans Deletion Mutant Consortium, 2012). Strains from the C. elegans Knockout Consortium are available from the Caenorhabditis Genetics Center (CGC) under the allele designation ok or gk. Strains generated by the NBRP, with allele designation tm, are available after submission of a Materials Transfer Agreement (Mitani, 2009). Deletion strains are listed on Wormbase, which should be consulted first to determine if an allele is already available for a gene of interest. If not, one may request a deletion strain from the Consortium. Once received, a strain should be outcrossed to remove irrelevant mutations. It is always important to verify the presence of the mutation in a strain. In addition, deletion mutations obtained by target-selection can be accompanied by proximal duplications, resulting in obligate heterozygosity for the gene under consideration. In these situations, phenotypic assessment of the mutant strain may not reveal the defects associated with complete gene loss. To confirm that a gene of interest is responsible for an observed defect, transgenic rescue experiments should be performed and/or multiple alleles examined (C. elegans Deletion Mutant Consortium, 2012). In addition to the collection of deletions, the Million Mutations Project has identified over 800,000 single nucleotide polymorphisms (SNPs) and over 16,000 indels in more than 20,000 genes (C. elegans Deletion Mutant Consortium, 2012; Thompson et al., 2013). While some of these SNPs do not affect gene function, many may provide useful structure-function data.

3.1. Target-selected mutagenesis using PCR

The goal of target-selected mutagenesis is to generate a library containing pools of mutagenized animals that can be screened for specific DNA lesions. Deletion alleles are easier to identify using PCR than point mutations, so UV/TMP and EMS are the preferred mutagens (Lesa, 2006). For library construction, 600,000 hermaphrodites are randomly mutagenized. Once the mutagenized animals become gravid, they are bleached and L1s are allowed to hatch in M9, overnight. The L1s are then subdivided into 1,152 subcultures (55 mm NGM plates) of 500 animals each. After two generations of self-fertilization, 20% of the animals per plate are rinsed off for genomic DNA isolation into twelve 96-well plates (Lesa, 2006). The remaining animals are stored at 15°C for up to six weeks. Although libraries can be frozen, live-animal libraries are less prone to failures in mutant recovery (Lesa, 2006).
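The library dimensions above fit together as twelve 96-well plates. A short sketch of the bookkeeping, assuming subcultures are numbered from 0 and assigned to wells row by row (A1 to A12, then B1, and so on); the numbering scheme and function name are assumptions made for illustration.

```python
PLATES = 12
WELLS_PER_PLATE = 96
ANIMALS_PER_SUBCULTURE = 500


def well_address(subculture_index):
    """Map a 0-based subculture index to (plate number, well label)."""
    plate, well = divmod(subculture_index, WELLS_PER_PLATE)
    row, col = divmod(well, 12)
    return plate + 1, f"{'ABCDEFGH'[row]}{col + 1}"


n_subcultures = PLATES * WELLS_PER_PLATE
print(n_subcultures)                            # subcultures in the library
print(n_subcultures * ANIMALS_PER_SUBCULTURE)   # L1 animals seeded in total
print(well_address(0), well_address(1151))      # first and last subculture
```

The seeded total of 576,000 L1 animals is slightly below the 600,000 mutagenized hermaphrodites quoted in the text, which are the parents rather than the plated progeny.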

Three approaches are commonly used to search for deletions in a gene of interest: restricted PCR conditions, poison primers, and thermostable restriction enzymes. Each approach employs nested PCR primers designed to flank the genomic sequence of interest to increase specificity, and each method favors amplification of deletion products over larger wild-type products (Edgley et al., 2002; Jansen et al., 1997; Wei et al., 2002; Zwaal et al., 1993). All of these methods are prone to high false positive rates (50%-90%); therefore, each pooled sample should be screened in duplicate (Jansen et al., 1997). A comprehensive protocol for library creation and the poison primer PCR technique is available in the Reverse genetics chapter in WormBook.

3.1.1. Restricted PCR extension time (Jansen et al., 1997)

Each library pool is incubated first with a primer pair (10 pmol per primer) designed 2.5-3.5 kb apart over a region of interest, followed by incubation of a 1:500 dilution of the first PCR with a nested primer pair (Liu et al., 1999). After addition of polymerase, nucleotides and buffer, samples are cycled with short extension times (45 s to 1 min) to favor accumulation of the deletion product (Jansen et al., 1997). DNA polymerases have specific turnover rates (maximum number of nucleotides polymerized in a given amount of time), which vary depending on the type of Taq used (Kornberg and Baker, 1992). For example, if the wild-type (WT) product is 2 kb, the deletion product is 500 bp, and the Taq polymerase has a turnover rate of 1 kb/min, approximately 2 min will be needed to complete the WT reaction and 30 s for the deletion product. If the extension time is limited to 45 s, only the deletion product will be amplified. Over many rounds, some WT product is amplified, but at a much slower rate than the deletion. This greatly increases the odds of detecting the less abundant deletion allele. Detected deletions are variable in extent and average around 1.4 kb in size (Liu et al., 1999). Completed reactions are run on an agarose gel, and pools that have a product smaller than the wild-type product are identified. Library animals from this pool are plated either singly or in smaller pools and retested to identify single hermaphrodites carrying the relevant lesion (Jansen et al., 1997).
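The worked example in this paragraph reduces to comparing the time each product needs with the extension time allowed. A minimal sketch using the numbers from the text; the function names are hypothetical, and the model ignores the slow accumulation of wild-type product over many cycles that the text mentions.

```python
def extension_needed_s(product_bp, rate_bp_per_min=1000):
    """Seconds a polymerase with the given turnover rate needs to copy the product."""
    return product_bp / rate_bp_per_min * 60


def completes(product_bp, extension_s, rate_bp_per_min=1000):
    """True if the product can be fully extended within one extension step."""
    return extension_needed_s(product_bp, rate_bp_per_min) <= extension_s


WILD_TYPE_BP, DELETION_BP, EXTENSION_S = 2000, 500, 45   # values from the text
print(extension_needed_s(WILD_TYPE_BP), completes(WILD_TYPE_BP, EXTENSION_S))
print(extension_needed_s(DELETION_BP), completes(DELETION_BP, EXTENSION_S))
```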

3.1.2. Poison primer method (Edgley et al., 2002)

This approach was developed to enrich for deletions that are less than 500 bp in size (Edgley et al., 2002). The protocol is similar to that for the restricted PCR extension time method, except that a third primer is added between the external primers of the first PCR (Figure 3). This poison primer is designed to amplify only off of wild-type genomic DNA, competing with the full-length wild-type product, thereby reducing the quantity of wild-type template available for the second PCR. The second set of nested primers is only able to bind the deleted product and the full-length wild-type product, whose total concentration is halved because of the poison primer product. This leads to a 500-fold increase in sensitivity by enhancing the ratio of deletion to wild-type products. Use of a short extension time (45 s) also facilitates synthesis of the deletion product. Because the poison primer determines the location of amplified deletion products, it is usually best to target the 5’ end of a gene to generate null alleles (Edgley et al., 2002).

Figure 3. Poison primer method for detecting deletions. This technique enriches rare, deleted DNA over the more abundant, full-length, wild-type product. Primers a, b, and p are used in the first PCR to generate a wild-type template, a mutant template, and a ‘poison’ template. In the second PCR, these products are used as template with primers a’ and b’ to increase the signal of the deleted product. Without the poison primer, full-length product dominates the PCR reaction, and the deletion product is below the detection limit. When the poison primer is included, total full-length amplification is decreased, and the deletion product is revealed.

3.1.3. Thermostable restriction enzymes (Huang et al., 2006; Wei et al., 2002)

In this less commonly used method, sequences in a target gene are identified that have a thermostable restriction enzyme recognition site in the interval spanned by the primer pairs (Wei et al., 2002). These enzymes are stable at high temperatures, making them ideal to use with PCR. Pooled genomic DNA is digested for 2 h before PCR to cleave genomic DNA and reduce the amount of wild-type template available. The enzymes are also included during the PCR amplifications to continuously cleave any wild-type product. Two sets of nested primers are used. Targeted deletion sequences lack the restriction enzyme site and are not cleaved, thereby increasing the sensitivity of the reaction (Wei et al., 2002). Deletions ranging from 330 bp to 1.7 kb have been found using this method (Huang et al., 2006; Wei et al., 2002). Validated thermostable restriction enzymes include PspGI, TliI, BstUI, ApeKI, and Tsp45I (Huang et al., 2006; Wei et al., 2002).
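The selection principle, that wild-type template carrying the recognition site is cleaved while a deletion lacking it survives, can be illustrated with a plain string search. The recognition sequence CGCG used for BstUI is not given in the source and should be confirmed against supplier data; the two sequences and the function name are hypothetical.

```python
BSTUI_SITE = "CGCG"   # assumed recognition sequence, not stated in the source


def is_cleaved(amplicon, site=BSTUI_SITE):
    """True if the amplicon contains the recognition site and would be cut."""
    return site in amplicon.upper()


wild_type = "ATGGCTAACCGCGTTAGGCTTAACGGA"       # hypothetical, contains one site
deletion = wild_type[:6] + wild_type[16:]       # hypothetical allele missing the site

print(is_cleaved(wild_type))   # wild-type template is removed by digestion
print(is_cleaved(deletion))    # deletion template remains available for PCR
```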


3.2. TILLING

TILLING (Targeting Induced Local Lesions IN Genomes) allows the identification of point mutations and small indels in a gene of interest (McCallum et al., 2000). Originally developed in Arabidopsis thaliana, TILLING uses a single-strand DNA nuclease to determine the location of mismatches after hybridizing a mutated genome to a non-mutated genome (Colbert et al., 2001; Till et al., 2003). While this method has only been used once to isolate point mutations in a gene of interest in C. elegans, it has been used more in other organisms. Mutations in a population of C. elegans are generated using EMS or ENU (Gilchrist et al., 2006). PCR targeting a gene of interest is performed on genomic DNA derived from many pools of animals (Figure 4). Primers are tagged with different fluorophores, marking the 5’ and 3’ ends of the product with different colors. After PCR, the DNA in each reaction is denatured. Reannealing then allows heteroduplex formation between wild-type and mutant DNA. Reannealed DNA is incubated with a single-strand DNA nuclease, CEL1, derived from celery juice extract (CJE) (Till et al., 2004). CJE cuts heteroduplex DNA where a mismatch or indel creates a single-stranded bulge. Samples are run on denaturing LI-COR gels, and the gels are examined in both fluorescent channels to determine in which pool a mismatch occurs (Gilchrist et al., 2006). In a screen for ten genes (ranging from 788 bp to 9.1 kb in size), 71 mutations were identified. Of these, 59% constituted missense alleles, and 3% were nonsense alleles (Gilchrist et al., 2006).

Figure 4. TILLING allows identification of point mutations in pools of mutated DNA. Both mutated DNA and wild-type DNA are subjected to PCR, using differentially labeled 5’ and 3’ primers (here in green and red). The PCR products are denatured and reannealed slowly, so that heteroduplex DNA forms between a mutated DNA strand and a wild-type DNA strand. The reannealed PCR products are treated with Cel1, an enzyme that specifically cleaves heteroduplex DNA. The reaction is visualized on a LI-COR gel in two fluorescence channels. When heteroduplex DNA is cut, smaller DNA fragments are seen in both the green and red channels within the same lane. Full-length, uncut product will be prominent at the top of the gel.

Individual mutagenized F1 animals are plated singly onto 1,500 seeded NGM plates. After the food supply is expended, half of the animals are frozen at -80°C. The other half are used to generate genomic DNA individually into 96-well plates. To reduce the number of PCR reactions, eight F1 DNA samples are pooled together for analysis. Once a positive pool has been identified, genomic DNA samples are assayed individually to determine from which plate the sample originally arose. PCR primers are designed over an area of 1.5 kb, mainly encompassing exonic regions. Primers can be designed using the program CODDLE to select for regions high in G/C content, which are most prone to mutation by EMS (Gilchrist et al., 2006).
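The saving from eightfold pooling can be estimated by counting reactions: one per pool, plus one per member of each positive pool during deconvolution. The sketch below uses the 1,500 plates and pool size of eight from the text; the number of positive pools (five) is a made-up placeholder, and the function name is hypothetical.

```python
import math


def pcr_reactions(n_samples, pool_size, positive_pools):
    """Reactions for the pooled screen plus individual retests of positive pools."""
    pooled = math.ceil(n_samples / pool_size)
    deconvolution = positive_pools * pool_size
    return pooled + deconvolution


# 1,500 F1 samples, pools of eight, and a hypothetical five positive pools.
print(pcr_reactions(1500, 8, 5), "reactions instead of", 1500)
```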

3.3. G4 DNA-induced deletion mutagenesis

The consensus sequence G3+N1-7G3+N1-7G3+N1-7G3+ (four runs of three or more guanines separated by spacers of one to seven nucleotides) folds in vitro to form a guanine quadruplex (G4) structure that may block replication in vivo (Kruisselbrink et al., 2008). The FANCJ protein (DOG-1 in C. elegans) is a DNA helicase required for proper replication of such guanine-rich tracts (Kruisselbrink et al., 2008). In dog-1 mutants, these tracts are prone to 5’ deletions. There are more than 2,900 guanine quadruplex sites in the C. elegans genome, annotated in WormBase under “G4 Motif” in GBrowse, and over 1,600 genes are located 5’ of a G4 site (Pontier et al., 2009). In one study of ten genes, 11 deletion alleles were generated, most of which extended up to several kilobases upstream of the targeted G4 site (Pontier et al., 2009). This method has been used once to generate deletion alleles (Pontier et al., 2009).
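The consensus can be written as a regular expression, reading G3+ as a run of three or more guanines and N1-7 as a spacer of one to seven nucleotides. The Python sketch below is a minimal scanner under that reading. It looks only at the strand it is given (a motif on the opposite strand would appear as a C-rich pattern and is not handled), and the function name and test sequence are made up for illustration.

```python
import re

# Four G-runs (>= 3 G each) separated by three spacers of 1-7 nucleotides.
G4_MOTIF = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_sites(seq: str):
    """Return (start, end) coordinates (0-based, end-exclusive) of
    non-overlapping matches to the G4 consensus on the given strand."""
    return [(m.start(), m.end()) for m in G4_MOTIF.finditer(seq.upper())]


# Toy sequence with four GGG runs; not a real C. elegans locus.
print(find_g4_sites("ATGGGAGGGTTGGGCAGGGTA"))
```

Because `finditer` reports non-overlapping matches, long G-rich tracts containing several overlapping candidate motifs are reported as fewer sites than an exhaustive annotation might list.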

dog-1(pk2247) mutants are picked to individual NGM plates and allowed to grow until no OP50 remains on the plate (Pontier et al., 2009). Half of the animals on each plate are rinsed off with M9 for genomic DNA isolation. Prior to DNA isolation, animals are split into two groups. If a deletion is present in only one of the samples, it probably arose from a somatic event. However, if a deletion is detected in both samples, it is most likely a heritable mutation (Pontier et al., 2009). Deletions are screened using any of the PCR protocols described above. After isolation, the strain is outcrossed to wild-type animals to remove the dog-1 mutation and any off-target deletions.

What is Forward Genetics?

Forward genetics is the approach of determining the genetic basis of a particular phenotype. It initially relied on naturally occurring mutations and on mutants induced by radiation, chemicals, or transposable elements (insertional mutagenesis). Mutagenesis is followed by breeding, isolation of the mutant individuals, and finally mapping of the gene. Forward genetics is therefore considered the counterpart of reverse genetics, which determines gene function by analyzing the phenotypic effects of altered DNA sequences. Mutant phenotypes are usually observed before the responsible gene is identified, so genes are often named after the mutant phenotype. The Drosophila rosy gene, named after the mutant's eye color, is an example.

In the conventional genetic approach, a researcher seeking the genetic basis of a phenotype maps the gene to the chromosome on which it lies. This is done by crossing the mutant with individuals that carry other uncommon traits. Statistical analysis then determines how frequently the two traits are inherited together. This conventional mapping method takes a considerable amount of time.
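The statistic behind this kind of mapping is the recombination frequency: the fraction of progeny in which the two traits have been separated. The Python sketch below computes it from progeny counts of a two-point test cross. The counts are invented for illustration, and reading the percentage as map distance in centimorgans is only a reasonable approximation for closely linked loci.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of test-cross progeny showing a recombinant trait combination."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


# Hypothetical test cross: 900 parental-type and 100 recombinant progeny.
rf = recombination_frequency(900, 100)
print(f"recombination frequency = {rf:.2f} (about {100 * rf:.0f} cM)")
```

A frequency near 0.5 indicates that the two traits assort independently, so the marker gives no information about the position of the mutation.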

Drosophila Models of Huntington Disease

LESLIE M. THOMPSON, J. LAWRENCE MARSH, in Animal Models of Movement Disorders, 2005

1. Screens

Genetic screens can identify mutations that lead to more severe (enhancers) or less severe (suppressors) phenotypes. Two types of mutations can be screened: loss-of-function alleles, in which a mutation typically causes the activity of one allele to be lost, and gain-of-function alleles, in which insertion of a transposable P-element containing a UAS-based enhancer and promoter (EP) results in ectopic activation of the gene in the presence of GAL4 drivers. Eye degeneration in a fly model expressing polyQ alone (UAS-127Q) was used in a P-element screen for modifiers of the phenotype [28]. Several modifiers were identified, highlighting a role for protein folding pathways (dHDJ1, dTPR2) and suggesting a possible association with pathways involved in cell proliferation and DNA replication (Drosophila homolog of human myeloid leukemia factor 1, MLF1) [29]. A screen for suppressors and enhancers of the phenotype in the SCA1 model identified several classes of modifier genes. These included transcriptional regulators, protein modification genes, and chaperones as modifiers of mutant ataxin 1 pathology [25].

Genomic screening: A pet subject

Many canine inherited diseases are late-onset conditions that are often not apparent until maturity. Breeders and owners must therefore wait until a puppy grows before knowing if it is affected. This delays vital preventive veterinary treatment and can result in breeding from affected animals, thus perpetuating the disease. Replacing this traditional approach with genomic screening is a much faster way to manage breed-specific problems: carrier status is easy to screen in young puppies, so breeders and owners do not have to wait for clinical signs before taking action.

Moreover, with more research, scientists are also finding that different gene mutations cause the same or similar diseases. For example, various genes associated with retinal function cause progressive retinal atrophy (PRA) and blindness in many dog breeds. Inheritance is mostly autosomal recessive (PRCD in many breeds, PDE6B in the Irish setter, RD3 in collies, and PDE6A in cardigan Welsh corgis all cause rod-cone dysplasia), but in the Siberian husky and Samoyed the condition is due to RPGR, an X-linked gene, and inheritance is autosomal dominant in bullmastiffs and English mastiffs (RHO). Golden retrievers are affected by two different gene mutations (SLC4A3 and TTC8) causing PRA. There is still a lot to learn about heritability and disease expression in dogs.

Materials and Methods

Mutagenesis and breeding

ENU mutagenesis followed standard protocols (Stottmann and Beier 2010). In brief, 6- to 8-week-old A/J mice (Jackson Labs) were injected i.p. with a fractionated dose of ENU once weekly for 3 weeks with 90, 95, or 100 mg/kg ENU (Sigma). A/J mice were used due to their ability to tolerate larger doses of ENU with acceptable morbidity and mortality, thus increasing their mutation load and increasing productivity of the screen (Weber et al. 2000). ENU was prepared using standard methods and dissolved in ethanol. After allowing for a period of infertility following the mutagenesis, a standard three-generation breeding scheme was followed (Bode et al. 1988; Herron et al. 2002; also see Results and Figure 1). The A/J males were crossed with FVB/J females (Jackson Labs), retinoic acid response element (RARE)-lacZ–positive females, or Lis1 heterozygous females (largely from a 129 background). The resulting G2 females were backcrossed to generate G3 embryos or postnatal day (P) 14 pups. For timed pregnancies, successful matings were identified by the presence of a copulation plug, and noon of the day of detection was set as embryonic day (E) 0.5. All animals were housed in accordance with the Harvard Medical School Center for Animal Resources and Comparative Medicine.

ENU breeding scheme. (A) A traditional three-generation breeding scheme for recovering recessive traits in the third generation (G3). G0 males are mutagenized and outcrossed to generate the G1 population. These are again outcrossed to generate G2 females, which may carry any given ENU mutation. These are backcrossed to the G1 male to create the G3 generation, which is then screened for phenotype(s). Solid indicates homozygosity for an ENU mutation, open indicates no mutation, solid/open indicates heterozygosity. (B) Introduction of a reporter allele is accomplished by mating RARE-lacZ–positive animals to the G1 male and genotyping subsequent females for the transgene. (C) Similarly, a sensitizing locus is incorporated by mating Lis1 heterozygous females with G1 males.
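The expected yield of this scheme follows from simple Mendelian ratios. The Python sketch below works through them for a single recessive mutation, assuming the G1 male is heterozygous, the outcross partner is wild type at that locus, and all genotypes are equally viable. The variable names are illustrative.

```python
from fractions import Fraction

# G1 male (heterozygous) x wild-type female: each G2 daughter has a
# 1/2 chance of inheriting the mutation.
P_G2_CARRIER = Fraction(1, 2)

# Carrier G2 female x heterozygous G1 male: 1/4 of G3 are homozygous.
P_G3_HOMOZYGOUS_IF_CARRIER = Fraction(1, 4)

# Chance that a G3 embryo from a randomly chosen G2 female is homozygous.
p_g3_homozygous = P_G2_CARRIER * P_G3_HOMOZYGOUS_IF_CARRIER

print("from a carrier G2 female:", P_G3_HOMOZYGOUS_IF_CARRIER)
print("from a random G2 female: ", p_g3_homozygous)
```

This is why several G2 females per G1 male are usually backcrossed: on average only half of them carry any given mutation.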

Mouse genotyping

RARE-lacZ mice were genotyped with standard lacZ primers (F-TTTAACGCCGTGCGCTGTTCG, R-GATCCAGCGATACAGCGCGTC). Lis1 heterozygotes were identified using primers for the neomycin resistance cassette (F-TCCTGCCGAGAAAGTATCCATCAT, R-CCAGCCGGCCACAGTCGT) as previously described (Hirotsune et al. 1998). Looptail mice carry a mutation in the vang-like 2 (Vangl2) gene, and heterozygous mice were identified by their abnormally looped tail. Direct sequencing of the mutation (in the eighth exon of Vangl2) was done to genotype embryos (F-AGAGGATGAAGGGTGGGTG, R-GTGTCAGGGCCAGAGAACC).

Whole-genome single nucleotide polymorphism scanning

To map our mutations, we genotyped a custom whole-genome panel of 768 single nucleotide polymorphisms (SNPs) with the Illumina GoldenGate technology at the Broad Institute Center for Genotyping and Analysis. A total of 256 of the 768 SNPs on this panel were polymorphic between A/J and FVB/J. The initial genomic interval carrying the mutation was defined by flanking SNPs, which identified a homozygous A/J haplotype shared by all affected animals. Fine mapping was done by analyzing further meioses with microsatellite markers, restriction fragment length polymorphisms [identified via the MGI database or the algorithm SNP2RFLP (Beckstead et al. 2008)], or direct sequencing of SNPs. Sequencing of candidate genes was done with standard methods, either exon-based sequencing or using random-primed cDNA from mutant RNA for transcript analysis. Primer design was either with Primer3 or via the Exon Primer function in the University of California at Santa Cruz (UCSC) Genome Browser. All DNA sequencing other than SNP mapping was done at the Dana-Farber/Harvard Cancer Center DNA Resource Core. In two of these mutant lines, crn2 and hith2, our initial mapping was with a cohort of affected animals with multiple phenotypes and resulted in two candidate regions. Analysis of more embryos and standard microsatellite mapping demonstrated that not all phenotypes were allelic and allowed us to focus on the chromosomal location reported in Table 2 for mapping crn2 and hith2.
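The interval-finding step described above can be sketched in a few lines of Python: look for the longest run of consecutive markers at which every affected animal is homozygous for the mutagenized (A/J) allele, then report the nearest markers outside that run as the bounds. This is only an illustration of the logic, not the authors' pipeline. The genotype codes ("AA", "AF", "FF"), marker positions, and function name are all hypothetical.

```python
def shared_homozygous_interval(positions, genotypes, hom="AA"):
    """positions: marker positions (Mb), sorted, on one chromosome.
    genotypes: one list of calls per affected animal, in marker order.
    Returns (left_flank, right_flank) in Mb for the longest run of markers
    homozygous in all animals; None marks an unbounded chromosome end."""
    shared = [all(g[i] == hom for g in genotypes) for i in range(len(positions))]
    best = None
    i = 0
    while i < len(shared):
        if shared[i]:
            j = i
            while j + 1 < len(shared) and shared[j + 1]:
                j += 1  # extend the run of shared homozygous markers
            if best is None or (j - i) > (best[1] - best[0]):
                best = (i, j)
            i = j + 1
        else:
            i += 1
    if best is None:
        return None
    start, end = best
    left = positions[start - 1] if start > 0 else None
    right = positions[end + 1] if end + 1 < len(positions) else None
    return left, right


# Three hypothetical affected embryos typed at five markers.
positions = [10, 25, 40, 55, 70]
genotypes = [
    ["AF", "AA", "AA", "AF", "FF"],
    ["AA", "AA", "AA", "AA", "AF"],
    ["AF", "AA", "AA", "AF", "AF"],
]
print(shared_homozygous_interval(positions, genotypes))
```

Each additional affected animal can only shrink the shared run, which is why analyzing further meioses narrows the interval.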

Table 2

Name | Phenotype description | Chromosome | Interval (Mb) | Locus
Brain dimple (brdp) | Telencephalic dysmorphology, mild hydrocephaly | 13 | 15� (3) |
Cleft lip and palate, exencephaly (clpex) | Cleft lip and palate, exencephaly | 7 | 81� (4) |
Cleft palate 1 (clft1) | Cleft palate only | 1 | 183� (10) | Irf6
Rudolph (rud) | Cortical dysmorphology, shortened long bones | 1 | 160� (8) | Hsd17b7
Craniorachischisis (crn2) | Craniorachischisis | 15 | 49� (4 a) | Scrib
Progressive hydrocephalus (prh) | Progressive hydrocephaly | 3 | 0� (4) |
Cleft face (clft3) | Midline fusion defect, small brain, encephalocele | 150� (4) |
Hole in the head 2 (hith2) | Encephalocele, hydrocephalus | 8 | 36� (4 a) | Casp3

Eight monogenic mutations were recovered in this screen. The causal gene or initial mapping interval from the whole-genome SNP scan is reported with the number of mutants used in parentheses.

Histology and immunohistochemistry

Histological analysis used standard methods (Nagy 2003). Embryos were fixed in either Bouin's fixative or 4% paraformaldehyde followed by paraffin embedding and sectioning at 14 μm. Sections were stained with hematoxylin and eosin or Cresyl violet. Caspase-3 immunohistochemistry was performed on paraffin sections following the manufacturer's instructions using reagents described below. After deparaffinization of the slides, antigen retrieval was performed with Citra Buffer (Biogenex). Blocking against background from avidin–biotin immunohistochemistry was done with commercial reagents (DAKO, Vectorlab). Slides were incubated with a primary antibody against full-length caspase-3 (Cell Signaling) overnight at 4° at 1:200. Visualization was with a biotin-labeled secondary antibody (Vectastain Universal ABC kit) and counterstained with hematoxylin QS (Biogenex).

Analysis of crn2 cochlear phenotype

E18 whole-mount cochleae were fixed in 4% paraformaldehyde, dissected to expose the sensory epithelium, and stained with Alexa Fluor-conjugated phalloidin (1:200; Invitrogen). Images shown are confocal projections of the cochlear sensory epithelium. Absolute rotation of hair-cell stereocilia bundles was assessed at positions of 5, 25, and 50% of the total cochlear length from the basal end of the duct, as previously described (Montcouquiol et al. 2003). Data are presented as the average deviation from 0°; error bars represent SEM. Significant differences were determined by two-tailed t-test.
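The summary statistic described here, the mean absolute deviation from 0° with its standard error, can be computed as in the Python sketch below. The angle values are invented, and the function name is illustrative. The t-test itself is left out to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, stdev


def mean_abs_deviation_sem(angles_deg):
    """Return (mean, SEM) of the absolute bundle rotations, in degrees."""
    deviations = [abs(a) for a in angles_deg]  # absolute rotation from 0 degrees
    if len(deviations) < 2:
        raise ValueError("need at least two measurements to estimate SEM")
    sem = stdev(deviations) / sqrt(len(deviations))  # sample SD / sqrt(n)
    return mean(deviations), sem


# Hypothetical signed rotations (degrees) for three hair cells.
print(mean_abs_deviation_sem([-1.0, 2.0, -3.0]))
```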

Supporting information

S1 Fig. Wolbachia recovers from severe titre reduction within four fly generations.

Relative Wolbachia titres of the progeny of tetracycline-treated flies. wMelCS_b-carrying females laid eggs in food containing varying doses of tetracycline. The progeny of three females were used to set up the experiment. At the first generation, four females were randomly selected for egg-laying in antibiotic-free fly food and Wolbachia titre was measured using qPCR. Titres of untreated females were used to normalize the qPCR results. The progeny of a female with the median titre was used to set up the next generation. Wolbachia titre in the F1 was significantly determined by the concentration of the antibiotic (p < 0.001 for all doses compared with control at generation 1), but recovered to normal within four fly generations (p > 0.05 for all doses compared with control at generation 4).

S2 Fig. EMS decreases female fecundity and Wolbachia titre in the next generation in a dose-dependent manner.

The total number of eggs (A) and adults (B) from females treated with varying doses of EMS. The reproductive output of 10 females was determined in the first ten days after EMS treatment by daily transferring females to new vials for egg laying. Females fed on a sucrose solution served as controls. Each dot represents the total number of eggs (A) or adults (B) laid by individual females during ten days. The effect of EMS on the reproductive output of females was estimated using a non-linear model and was highly significant (p < 0.001 for both numbers of eggs and adults per female). (C and D) Wolbachia titres in the F1 progeny of females treated with varying EMS doses. Wolbachia titre was quantified on individual females (n = 5–13 per dose), after laying eggs for three days. Wolbachia titres were normalized against the titres of untreated females. Dashed red lines represent the mean value predicted using non-linear models. The effect of EMS on Wolbachia titres in the next generation was highly significant (p < 0.001 for both panels).

S3 Fig. Isolation of over-proliferative Wolbachia variants.

(A-D) Relative Wolbachia titres in a control (wMelCS_b) and EMS-treated D. melanogaster lines. Flies to set up the next generation were selected as described for Fig 1. Line 2B was isolated in the same batch as Line 2A (wMelOctoless) and they may not be independent. Likewise, Lines 3A (wMelOctoless2), 3B, and 3C were also isolated in the same batch.

S4 Fig. Generation of isogenic D. melanogaster lines with wMelPop2 and wMelOctoless.

The first, second and third chromosomes of flies carrying wMelPop2, wMelOctoless, and wMelOctoless2 were replaced through the use of balancer chromosomes. Wolbachia infection (and also mitochondria) was kept in the stock by crossing females with Wolbachia with the indicated males. The mitochondria are only shown in females because of their strictly maternal transmission. All males were free of Wolbachia infection. Dashed lines indicate the genotype selected from the previous cross. Virgin females in the first cross were considered mutant in all chromosomes (*), for illustrative purposes. Question marks (?) represent recombined chromosomes.

S5 Fig. Proliferation of wMelOctoless and wMelOctoless2 in a host isogenic genetic background.

Relative Wolbachia titres in D. melanogaster males carrying wMelOctoless and wMelOctoless2 at 0 and 7 days post adult eclosion, at 25°C. This experiment was set up as described in Fig 1. Relative Wolbachia titre was determined using qPCR and normalized to that of 0–1 days-old wMelCS_b-infected males. Each dot represents the relative titre of a single male.

S6 Fig. Confirmation of amplification and deletion of Octomom genes by qPCR.

The amplification and deletion of individual Octomom genes (wMel loci WD0507–WD0514) were confirmed using qPCR in wMelPop2 and wMelOctoless, respectively. The copy number of three genes outside the Octomom region (wMel loci WD0505, WD0519, and rpoD) was also determined. Five females carrying wMelCS_b, wMelPop2, and wMelOctoless were used in the analysis. The copy number of wMelPop2 and wMelOctoless genes is relative to that of wMelCS_b.

S7 Fig. Selection for lines carrying Wolbachia with a desired Octomom copy number.

The relative copy number of genomic WD0513 in Wolbachia-carrying stocks throughout 30 fly generations. Each generation, 5–20 females were randomly collected for egg-laying for 3–4 days and used to determine the relative copy number of WD0513, as a proxy for the Octomom copy number. The progeny of a single female was used to set up the next generation. qPCR results were normalized to that of wMelCS_b, which has a single copy of Octomom per genome.
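The caption does not say how the qPCR values were converted into relative copy numbers. One common approach, shown below purely as an assumption, is the 2^(−ΔΔCt) calculation, which presumes perfect (100%) amplification efficiency for both the target and a reference gene. The Ct values and the function name are hypothetical.

```python
def relative_quantity(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """2^(-ddCt) estimate of target abundance relative to a calibrator sample.

    ct_target / ct_reference: Ct values in the sample of interest.
    ct_target_cal / ct_reference_cal: Ct values in the calibrator
    (for example, a line with a single copy of the target per genome).
    Assumes 100% amplification efficiency for both amplicons.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target amplifies one cycle earlier in the
# sample than in the calibrator, with identical reference-gene Ct values.
print(relative_quantity(20.0, 18.0, 21.0, 18.0))
```

If amplification efficiencies differ between primer pairs, an efficiency-corrected formula would be needed instead of the fixed base of 2.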

S8 Fig. Octomom region is amplified in tandem in wMelPop2 and wMelPop.

Oxford Nanopore MinION reads supporting the amplification of the Octomom region in tandem in wMelPop2 and wMelPop Wolbachia variants. We mapped wMelPop2 and wMelPop long reads (BioProject: PRJNA587443) to the Octomom region in their genomes (Accessions CP046922.1 and CP046921.1, respectively) using minimap2 v2.17-r941 [48] and plotted the alignment summary (S7 Table) for illustrative purposes.

S9 Fig. Identification of the genetic bases for over-proliferation of the Wolbachia in Line 2B and Line 3A (wMelOctoless2).

Relative coverage in the genomic region containing the Octomom region. As in Fig 2B, Illumina paired-end reads were mapped to the wMelCS_b (GenBank: CP046924.1) genome, and the number of reads mapping to each position was normalized by dividing by the median coverage across the genome. Coverage information for wMelCS_b, wMelPop2 and wMelOctoless is also given in Fig 2B. We identified the deletion of Octomom as the cause of proliferation in Line 2B and Line 3A (wMelOctoless2), as no other difference was found when compared to wMelCS_b.
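The normalization described in this caption is a one-line calculation: divide the read depth at each position by the genome-wide median depth. The Python sketch below illustrates it with made-up depth values. The function name is illustrative, and in practice the per-position depths would come from the read-mapping output.

```python
from statistics import median


def relative_coverage(depths):
    """Divide per-position read depth by the median depth across the genome."""
    med = median(depths)
    if med == 0:
        raise ValueError("median coverage is zero; cannot normalize")
    return [d / med for d in depths]


# Toy depths: most positions near 100x, one at 300x, one with no reads.
print(relative_coverage([100, 100, 300, 0, 100]))
```

Under this normalization a single-copy region sits near 1, a threefold amplification near 3, and a deleted region near 0.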

S10 Fig. The amplification or deletion of Octomom increases Wolbachia proliferation rate in adults.

Time-course of relative Wolbachia titres in adults at 18°C (A), 25°C (B) and 29°C (C) with different Wolbachia variants. Replicate of experiment shown in Fig 3. Wolbachia titres were determined and analysed as described for Fig 3.

S11 Fig. Octomom copy number determines Wolbachia titres on the day of adult eclosion.

Relative Wolbachia titres on the day of adult eclosion. Males developed at 25°C were collected within 24 hours after eclosion for Wolbachia titre measurement using qPCR. Data used in this figure are also shown in Fig 3 and S10 Fig (time point 0). Letters represent significant groups after p-value correction.

S12 Fig. wMelPop2 and wMelPop are phenotypically indistinguishable.

(A) WD0513 copy number of wMelPop2 and wMelPop in two experimental replicates. Using WD0513 as a proxy, the Octomom copy number of wMelPop2 and wMelPop was tightly controlled prior to phenotypic comparison. (B) Wolbachia relative titres at 25°C. The progeny of wMelPop2- and wMelPop-infected females carrying three copies of Octomom were used to set up the experiments. Males that developed at 25°C were collected upon eclosion, aged to specific time-points and used to determine Wolbachia titres using qPCR. Wolbachia titres were normalized to that of wMelCS_b-carrying flies collected on the day of eclosion. Proliferation rates of wMelPop2 and wMelPop were not different (p = 0.32). (C) Lifespan of male (solid lines) and female (dashed lines) flies at 25°C. Males were transferred to new vials every five days, and females every four days. (D) Coefficients of a Cox mixed model, representing the effect of wMelPop2 and wMelPop on the lifespan relative to wMelCS_b-carrying flies. wMelPop2 and wMelPop were equally pathogenic (p = 0.29).

S13 Fig. Octomom copy number dynamics throughout fly development and during adult life.

Relative copies of WD0513 throughout D. melanogaster development (A) and during adult life (B). WD0513 relative copy numbers were determined in samples shown in Fig 4 (for panel A) and Fig 3 and S10 Fig (for panel B). WD0513 copies were normalized to that of 0–1 days-old wMelCS_b-infected males. (A) Vertical dashed lines separate developmental stages (i.e. eggs, larvae, pupae, and adults). The x-axis is not on a continuous scale. (B) The two replicates are represented by different symbols.

S14 Fig. wMelPop2 and wMelOctoless are pathogenic to both males and females.

Lifespan of D. melanogaster males at 18°C (A), 25°C (B), and 29°C (C). Survivorship was determined as in Fig 5. This is a replicate of Fig 5. (D) Survival of D. melanogaster females at 25°C. Survival was determined as in Fig 5, except that females were transferred to new vials every four days. The experiment was performed twice. (E) Coefficients of a Cox mixed model of the lifespan of females relative to Wolbachia-free control. Both replicate experiments were pooled for statistical comparisons. Bars represent the standard error of the coefficient, and letters the statistically significant groups.

S15 Fig. Wolbachia variants, not differences in the host genetic background, are pathogenic.

(A-B) Survival of D. melanogaster females at 29°C. Virgin wMelCS_b-carrying females were crossed with males carrying wMelOctoless or wMelPop2 (with 3 or 8–9 Octomom copies) and vice-versa. The resulting progeny developed at 25°C and were placed at 29°C after adult eclosion. The survival of 50 female progeny, which have the same genetic background but differ in Wolbachia infection, was determined per condition, per replicate. Females were maintained in groups of ten and transferred to new vials every four days. The experiment was performed twice. (C) Coefficients of a Cox mixed model representing the effect of the parental crosses on the survivorship of females. Significance was assessed after p-value correction for multiple comparisons, and significant groups are represented by letters.

S16 Fig. Correlation between Wolbachia-induced phenotypes and bacterial titres or doubling time.

(A-C) Correlation between Wolbachia titre at the day of eclosion and the strength of life-shortening phenotype determined at 18°C (A), 25°C (B), and 29°C (C). The y-axis represents the strength of Wolbachia life-shortening phenotype (estimated using Cox mixed model shown in Fig 5). The x-axis represents the natural log of the relative Wolbachia titre estimated using a linear mixed model. Bacterial titres were normalized to that of wMelCS_b-infected flies (shown in S11 Fig). (D and E) The correlation between the strength of anti-viral protection and Wolbachia doubling time. The y-axis represents the strength of anti-viral protection (estimated using Cox mixed model shown in Fig 6). The x-axis represents Wolbachia doubling time in adults at 18°C (D), or 25°C (E) (shown in Table 1). The Pearson correlation coefficient (r) and its significance (p) are given in each panel. A grey line represents the trend (fit of linear regression). Error bars represent the standard errors of the estimates. None of these correlations were statistically significant and they complement correlations shown in Fig 5 and Fig 6.

S17 Fig. Survival of flies with different Wolbachia variants after challenge with DCV or buffer solution.

(A-C) Survival of males carrying different Wolbachia variants after a challenge with DCV (A) or a buffer solution (B and C). Fifty 3–5 days-old Drosophila males, per line, were pricked with DCV (10^9 TCID50/ml) or buffer and survival curves were determined at 18°C for 40 days. A is a replicate of Fig 6A, 6B and 6C are controls for these experiments. (D) Coefficients of Cox mixed models of buffer-pricked flies. Both replicates were pooled for statistical analysis. Bars represent the standard error of the estimate, and the letters the statistically significant groups after p-value correction.

S1 Table. Number of F1 females screened for new over-proliferative Wolbachia variants per experimental condition.

wMelCS_b-infected G0 females, raised in control or antibiotic-treated food (12.5 μg/ml), were fed different doses of ethyl-methanesulfonate (EMS) and allowed to lay eggs in individual vials. F1 females were collected as virgins, mated to non-mutagenized males and also allowed to lay eggs individually. F1 females were used for Wolbachia titre measurement when they were 10 days old. The number of F1 females tested per experimental condition is shown.

S2 Table. Coverage statistics of the sequencing project.

Coverage statistics (mean and range) of Illumina reads mapped to either Wolbachia or mitochondria of D. melanogaster Release 6 genome sequence (KJ947872.2:1–14,000). Sequencing data of each Wolbachia variant were mapped to its own genome assembly (BioProject ID: PRJNA587443), except for Wolbachia in Line 2B and wMelOctoless2, which were mapped to the wMelCS_b genome (Accession: CP046924.1). ND–not determined.

S3 Table. Flies infected with new over-proliferative Wolbachia variants did not inherit mutated mitochondria.

Illumina reads on flies infected with different Wolbachia variants were mapped to the mitochondria of D. melanogaster Release 6 genome sequence (KJ947872.2:1–14,000). A summary of the mapping is given in S2 Table. The mitogenome of flies infected with wMelCS_b, wMelOctoless and wMelPop2 was identical. We found an SNP unique to flies infected with wMelCS-like variants (G→A on position 10,793) but absent in flies infected with wMelPop. We confirmed this SNP using Sanger sequencing.

S4 Table. Assembly and annotation statistics.

Wolbachia genomes were assembled using the Unicycler v0.4.8-beta pipeline and annotated using NCBI Prokaryotic Genome Annotation Pipeline v4.10. wMel reference genome (Accession: AE017196.1) is included for comparison purposes.

S5 Table. SNPs and indels between newly assembled wMel and wMel reference genome.

The genome of a newly assembled Cluster III wMel Wolbachia variant (Accession: CP046925.1) was aligned to wMel reference genome (Accession: AE017196.1) using Mauve v2.4.0. All the differences were confirmed via Sanger sequencing.

S6 Table. SNPs and indels between wMelCS_b and wMel reference genome.

The genome of wMelCS_b (Accession: CP046924.1) was aligned to wMel reference genome (Accession: AE017196.1) using Mauve v2.4.0.

S7 Table. Alignment summary of long reads supporting the amplification of the Octomom region in tandem.

Long reads (MinION, Oxford Nanopore) supporting the amplification of the Octomom region in tandem in wMelPop2 (Accession: CP046922.1) and wMelPop (Accession: CP046921.1) genomes. Long reads were mapped to the Octomom region using minimap2 v2.17-r941 and the number of Octomom copies was determined using blastn v2.8.1+.

S8 Table. Primers used for amplification and quantification of individual Wolbachia genes.

Primers used in this study have been previously described [16,17].

S9 Table. List of primers used to improve Wolbachia draft genomes.

Primers used to amplify and sequence, using Sanger technology, genomic regions containing predicted differences between Wolbachia draft genomes.

S1 Text. Confirmation of the amplification and deletion of the Octomom in wMelPop2 and wMelOctoless, respectively.

The genomes of wMelCS_b, wMelPop2 and wMelOctoless were aligned using Mauve v2.4.0. The three-fold amplification of Octomom in wMelPop2 and its deletion in wMelOctoless were the only differences identified when compared with wMelCS_b.
