1. To prepare a bacterial artificial chromosome (BAC) library of Aspergillus fumigatus suitable for complete genome sequencing.
2. To fingerprint 3000 clones from the BAC library and generate a partial map of the genome.
3. To sequence completely 10 physically linked BAC clones, to submit the annotated data to the public databases and to set up a web interface for the display of the genomic information.
How much sequence and genetic organisation data are there for Aspergillus fumigatus?
Aspergillus fumigatus has an estimated genome size of 30-35 Mb. Only ~40 genes have been sequenced and deposited in the sequence databases (39 complete genes in the 19/7/98 database updates). All of these genes except two have at least one intron. There are an estimated 10,000-12,000 open reading frames in A. fumigatus, based on the data generated by Incyte for Candida albicans (10,380 ORFs before completing shotgun sequencing). Gene density can be estimated in A. nidulans from a single cosmid clone of 38,807 bp in which 11 putative ORFs were found (1 gene every 3.5 kb). Although there are 8 chromosomes in all the other Aspergilli studied (A. nidulans, A. niger, A. flavus, A. oryzae), it is not known how many there are in A. fumigatus. We identified 4 bands on CHEF gels (with variation between isolates [unpublished data]) and a recent paper (Yousef et al., 1998) has confirmed the resolution of 4 bands (but not provided a figure). Phylogeny of the Aspergilli based on the small subunit ribosomal DNA has shown that although the species are very closely related (<2.5% distance), A. nidulans is the most evolutionarily distant from all the other species (Verweij et al., 1995). Comparison of homologous genes (not including ribosomal RNA genes) that have been cloned from both A. fumigatus and A. nidulans shows that there is typically about 60% identity (range 47 to 74%) between the 2 species. Other data which show that these two species are closely related include the fact that all of the introns in homologous genes are positionally conserved, the fact that at least one gene cluster is conserved and the fact that a homologous intergenic region containing two promoters still shows a high level of sequence identity (42%).
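The gene density figure quoted above can be extrapolated to the whole genome as a rough cross-check on the 10,000-12,000 ORF estimate. The following is a minimal sketch in Python that uses only the numbers given in the text; the function names are illustrative, and a single A. nidulans cosmid is a very small sample on which to base a genome-wide projection.

```python
# Figures taken from the paragraph above.
COSMID_LENGTH_BP = 38_807      # sequenced A. nidulans cosmid
COSMID_ORFS = 11               # putative ORFs found in that cosmid
GENOME_SIZE_MB = (30, 35)      # estimated A. fumigatus genome size range


def gene_spacing_bp(length_bp, n_orfs):
    """Average number of base pairs per gene in a sequenced region."""
    return length_bp / n_orfs


def projected_gene_count(genome_mb, spacing_bp):
    """Extrapolate a gene count, assuming uniform density across the genome."""
    return round(genome_mb * 1_000_000 / spacing_bp)


spacing = gene_spacing_bp(COSMID_LENGTH_BP, COSMID_ORFS)
low, high = (projected_gene_count(mb, spacing) for mb in GENOME_SIZE_MB)
print(f"one gene every {spacing / 1000:.1f} kb")
print(f"projected genes for 30-35 Mb: {low} to {high}")
```

Under the uniform-density assumption the projection falls somewhat below the Candida-based estimate, which is one reason the pilot sequence will be used to measure gene density directly.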
Why sequence Aspergillus fumigatus?
Aspergillus fumigatus is the most common mould causing infection worldwide. The first infection described in man, an aspergilloma, was in Edinburgh in 1842 (Bennett, 1842) and many cases of invasive disease in non-immunocompromised patients were reported from the UK between 1890 (Wheaton, 1890) and 1947 (Cawley, 1947). These cases and more recent series emphasise that A. fumigatus is a primary, albeit rare, pathogen of man. Allergic disease due to Aspergillus was first described in London in 1952 (Hinson et al., 1952) and the first invasive (and fatal) infection in an immunocompromised patient was described in 1953 in the British Medical Journal in a patient from Gloucester (Rankin, 1953).
The frequency of invasive disease rose ~14-fold over the 12 years up to 1992, as judged after death in unselected autopsies, and invasive aspergillosis has overtaken candidiasis as the most frequent fungal infection found after death in tertiary care hospitals in Europe. Thus 4% of all patients dying had invasive aspergillosis, compared with about 2% with invasive candidiasis (Groll et al., 1996). Patients at risk (frequency of disease) include those with chronic granulomatous disease (25-40%), lung transplant recipients (17-26%), allogeneic bone marrow transplant patients (4-30%), neutropenic patients with leukaemia (5-25%), heart transplant recipients (2-13%), pancreas transplant recipients (1-4%), renal transplant patients in Europe and the USA (~1%) and in India (~10%), and patients with AIDS, multiple myeloma and severe combined immunodeficiency (~4%) (Denning, 1998). Over 500,000 transplants are performed annually worldwide. Acute leukaemia affects about 3/100,000 of the population and on average each patient receives 3 cycles of chemotherapy, with each cycle being a major risk period. Similar figures are found for high grade lymphoma patients who are also at high risk of invasive aspergillosis. In the industrialised nations alone these figures add up to about 250,000 periods of major risk per year. AIDS cases are predicted to exceed 40 million by the year 2000, which would generate about 1.4 million cases of invasive aspergillosis, although in developing countries most patients will not live long enough to get this disease.
The crude mortality from invasive aspergillosis is around 85% and falls to around 50% if treatment is given (Denning, 1996). The new drugs in trial (voriconazole, etc.) may reduce the mortality slightly (~10%) (Denning et al., 1997a), but patients in trials tend to do better than those treated in clinical practice.
In addition to invasive disease, Aspergillus causes a number of other diseases in man. These include aspergilloma ("colonisation" of existing pulmonary cavities), sinusitis in normal people, allergic bronchopulmonary and sinus infections, keratitis (which usually leads to blindness in that eye and is common in the developing world) and postoperative infections in normal people. Aspergilloma numbers are set to rise dramatically because of the increasing problem with tuberculosis and aspergillomas are notoriously difficult to treat. Cavities of 2 cm or larger after tuberculosis subsequently develop aspergillomas in 15-20% of patients (in the UK). The 5 year survival of patients with aspergillomas is about 40%. Allergic bronchopulmonary aspergillosis occurs in patients with cystic fibrosis and asthmatics (an increasing number) causing pulmonary fibrosis and death usually within 10 years of diagnosis.
Aspergillus is also a pathogen of many vertebrates and some non-vertebrates (Smith, 1989). Examples of veterinary disease caused by Aspergillus include pulmonary and airsac infections in many birds, but particularly newly hatched chicks, common crows, mallard ducks, parrots and, in outbreaks, penguins in captivity. Both sinusitis and disseminated infections occur in long-nosed dogs. Bovine abortion, guttural pouch aspergillosis in horses (which causes catastrophic haemorrhage) and disseminated infections in most animals including whales and dolphins are occasional problems of undetermined frequency.
A. fumigatus is the main pathogen in all these infections with the exception of sinusitis which is often caused by A. flavus and otitis externa which is most often caused by A. niger. While A. nidulans is an attractive option for sequencing (genome sequencing is in progress), as an organism A. nidulans is a rare pathogen in man and animals and can therefore only indirectly improve our understanding of aspergillosis. Arguments can be made for sequencing A. niger and A. flavus instead of A. fumigatus. Aside from the disease related arguments which are compellingly in favour of A. fumigatus, there are 3 additional scientific reasons which help to counter the arguments in favour of other species. These are:
1) As A. fumigatus is highly thermotolerant, comfortably grows at 50°C and can resist remarkable extremes of environment (like rocket fuel for example), it is highly possible that useful enzymes can be isolated from it which may be of scientific and commercial interest. Such enzymes are likely to be involved in complex metabolic pathways. One example is a phytase capable of withstanding 100°C for 20 min (Pasamontes et al., 1997).
2) A. fumigatus will have metabolic pathways and genes not present in A. nidulans. One example is the 1,8-dihydroxynaphthalene-melanin anabolic pathway. The gene (arp1) for one enzyme involved in this pathway, scytalone dehydratase, has been cloned in A. fumigatus and no homologue has been detected in A. nidulans (Tsai et al., 1997). Another example is the metabolic product fumagillin, which appears to inhibit endothelial cell proliferation and is being investigated as a novel chemotherapeutic agent (Ingber et al., 1990).
3) A. fumigatus is haploid and therefore 'easy' to genetically engineer. The single and multiple gene disruptions that have been done point to a way of assigning function to unknown genes in the future which is technically more straightforward than is the case, for example, with Candida albicans.
We have recently described resistance to itraconazole in A. fumigatus (Denning et al., 1997b). Two mechanisms were identified: a change in the P450 target (we are currently cloning the corresponding gene to look for mutations) and probable drug efflux (we have cloned one ABC transporter gene from A. fumigatus and have 2 other clones to examine). Testing (using a validated method) shows that 6% of UK A. fumigatus isolates are resistant to itraconazole and we have over 15 resistant 'wild' or clinical isolates in our collection. Resistance has now been documented in the UK, Sweden, the Netherlands, France and the USA. We have also convincingly demonstrated resistance in vivo to amphotericin B in A. fumigatus (Verweij et al., 1998), but cannot match this with an in vitro test yet. For this reason we do not know how common this phenomenon is. As these are the only 2 licensed drugs active against Aspergillus, the development of resistance is of concern. In addition, A. fumigatus is a major economic problem in the timber industry where wood drying kilns are used. As it survives at high temperature, it tends to grow on the surface of timber and causes a dark green discolouration of the wood which reduces its value. This costs Sweden, for example, millions of kronor each year (estimated at 275 million SEK in 1994) and is proving difficult to resolve. Exposure of the timber workers to this huge load of fungal spores leads to a toxic syndrome, akin to extrinsic allergic alveolitis (or farmer's lung) but more severe, which is incompletely understood. One possible explanation is the toxin fumitremorgin produced by the conidia of A. fumigatus (Land et al., 1987).
Other sequencing efforts
Shotgun sequence from the genome of A. fumigatus is currently being generated by Incyte and being sold as part of their Pathoseek database. Genome Therapeutics are sequencing some selected genes in A. fumigatus for Schering Plough and intend to place much of this on the Internet. Privately GlaxoWellcome have decided to generate sequence from the A. fumigatus genome in addition to making a modest financial contribution to the sequencing of chromosome IV (the smallest chromosome) of A. nidulans. We have heard, but this information is unsubstantiated, that Pfizer have already generated sequence from the genome of A. fumigatus and direct questions to the head of their antifungal discovery program have been unrewarding and answers guarded. Thus within 2 years a large amount of sequence will have been generated from A. fumigatus and this information will not be released for public access. We are assuming that these sequencing efforts involve a shotgun approach and will not include any attempt to complete the genomic sequence. Millennium have just announced that they have completed a 5-fold shotgun sequencing of A. nidulans and will make this data openly accessible from July 1999. So far only two cosmid clones from A. nidulans have been sequenced and submitted to the public databases. Both clones are from chromosome VIII and one includes the brlA gene. In addition, a physical map of the 31 Mb A. nidulans genome has been published using overlapping cosmid clones which have been assembled into 49 contigs (Prade et al., 1997).
The general strategy for sequencing A. fumigatus is as follows:
a. Preparing a BAC library.
b. Fingerprinting 3000 BAC clones and assembling into contigs.
c. Fine mapping 10 BAC clones (~1 Mb of sequence) representing the sequence on either side of the nitrate assimilation gene cluster.
d. Sequencing the 10 BAC clones to evaluate the library and mapping strategy, and to gain an insight into the gene density and similarity to A. nidulans and S. cerevisiae.
e. Analysing and annotating the complete sequence data.
This pilot project will aim to achieve all of the above. In addition informal contacts will be made with sequencing centres and key scientists to put together an international consortium to facilitate sequencing the rest of the genome.
1. The isolate
As the primary purpose of the sequencing effort is to assist in the understanding of pathogenicity, it is essential to select an isolate of proven and known pathogenicity. Unfortunately the details about almost all isolates in culture collections are paltry and insufficient to be confident about the pathological features of a given infection. It is also essential to select an isolate grown from a sterile source rather than respiratory fluids to ensure that it was not a colonising isolate. Certain features of pathogenicity are important to 'capture', including vascular invasion and lung infarction as well as dissemination to distant organs. For these reasons we have selected AF293. This isolate was grown from the lung of a 57 year old woman from Shrewsbury with aplastic anaemia who died of undiagnosed invasive pulmonary aspergillosis. The aplastic anaemia may have been due to gold therapy for rheumatoid arthritis. Her lung revealed haemorrhagic nodules at autopsy which indicates that there was vascular invasion and infarction. The isolate is susceptible to itraconazole and amphotericin B. Full clinical and pathological details are available. The isolate will be deposited in the National Collection of Pathogenic Fungi, held in Bristol.
2. Preparation of a BAC library
Our consensus is that a BAC library would be the optimal library for sequencing in an organism with a genome size of about 30-35 Mb. The DNA will be prepared in agarose plugs, in the same manner as for CHEF gels, by encapsulating protoplasts in low gelling temperature agarose followed by SDS lysis and proteinase K digestion. The DNA will be incompletely digested for varying times initially with Sau3AI or alternatively with HindIII/EcoRI and then separated in a preparative agarose gel. The resulting large fragments of the appropriate size range (80 to 200 kb) will be cloned into the pBACe3.6 vector, packaged using a P1 extract and transfected into E. coli. This work will be carried out at the Sanger Centre by Michael Quail and Michael Anderson with input as required from Sharen Bowman, Rhian Gwilliam (Sanger Centre) and Philip Glaser (Paris).
The quality of the library will be assessed by checking for the presence and size of the inserts and verifying that 20 random clones contain differing regions of the genome. If there are major problems with construction of the BAC library, then we will use a cosmid library for the mapping and sequencing. This library has been constructed using DNA partially digested with Sau3AI and cloned into pCosHX. The DNA was isolated from strain B-5233, a clinical isolate from a leukaemia patient who died of invasive aspergillosis (Tsai et al., 1997).
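Whether 3000 clones are enough to represent the genome depends on the insert size actually achieved. The sketch below, in Python, computes the fold coverage and the probability that any given locus is present in at least one clone, using the standard Clarke and Carbon relation P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one insert. The insert and genome size ranges are those quoted in the text; treating every clone as having the same insert size, and the function names, are simplifying assumptions.

```python
def fold_coverage(n_clones, mean_insert_kb, genome_mb):
    """Total cloned DNA divided by genome size (both converted to kb)."""
    return n_clones * mean_insert_kb / (genome_mb * 1000)


def prob_locus_represented(n_clones, mean_insert_kb, genome_mb):
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    fraction = mean_insert_kb / (genome_mb * 1000)
    return 1 - (1 - fraction) ** n_clones


# Worst case: smallest inserts (80 kb) and largest genome estimate (35 Mb).
# Best case: largest inserts (200 kb) and smallest genome estimate (30 Mb).
for label, insert_kb, genome_mb in (("worst", 80, 35), ("best", 200, 30)):
    cov = fold_coverage(3000, insert_kb, genome_mb)
    prob = prob_locus_represented(3000, insert_kb, genome_mb)
    print(f"{label} case: {cov:.1f}-fold coverage, P(locus present) = {prob:.4f}")
```

Even in the worst case the fingerprinted set gives several-fold coverage, so gaps in the map would be more likely to reflect cloning bias than too few clones.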
3. Mapping of the clones
Once the library quality has been confirmed, 3000 clones will be plated out and grown up in triplicate microtitre plates. DNA will be extracted and subjected to fluorescent fingerprinting with Sau3AI and HindIII enzymes. At the Sanger Centre, all these procedures are automated, including the analysis of the restriction fragments by FPC to generate contigs. These contigs will form the basis for a later total genome mapping project. The clones will also be gridded out robotically onto membranes for hybridisation (see below).
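The principle behind fingerprint-based contig assembly can be illustrated with a toy example. The Python sketch below counts restriction bands shared by two clones within a size tolerance and then groups clones that share enough bands into contigs. It is a deliberately simplified stand-in, not the scoring method used by FPC, and the clone names, band sizes and thresholds are all hypothetical.

```python
def shared_bands(fp_a, fp_b, tolerance=2):
    """Count bands matching within `tolerance`, using each band at most once."""
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return shared


def build_contigs(fingerprints, min_shared=2, tolerance=2):
    """Group clones by single linkage: any pair sharing enough bands is joined."""
    names = list(fingerprints)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            if shared_bands(fingerprints[a], fingerprints[b], tolerance) >= min_shared:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical band sizes for four clones.
clones = {
    "bac_001": [120, 250, 400, 610, 880],
    "bac_002": [251, 399, 611, 900, 1200],
    "bac_003": [900, 1201, 1500, 1750],
    "bac_004": [130, 300, 555, 2000],
}
print(build_contigs(clones))
```

In this example bac_001 and bac_003 share no bands directly but fall into the same contig through bac_002, which is how overlapping clones extend a contig along a chromosome.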
4. Mapping and sequencing of 10 BAC clones
Ten BAC clones will be sequenced in order to verify the BAC library and mapping approach, and to accurately cost any future large scale sequencing project for this organism. We will also derive some data about gene density in A. fumigatus, the frequency of introns and repetitive elements and the level of conservation in overall genetic organisation with A. nidulans. The clones will be selected on the following basis using the published genetic and physical maps of A. nidulans (Prade et al., 1997; Clutterbuck, 1998). The nitrate assimilation gene cluster is located on chromosome VIII (5.0 Mb) in A. nidulans and maps well away from both the centromere and telomere (13 and 30 markers lie between it and the centromere and telomere, respectively). The three genes have been cloned from A. nidulans (crnA, niiA and niaD) (Johnstone et al., 1990; Unkles et al., 1991). These three genes have also been physically mapped to a single cluster in A. fumigatus which is structurally identical to that of A. nidulans. The sequences of niiA and of the intergenic region between niiA and niaD have been published (Yousef et al., 1998). The brlA genetic marker (which has been cloned) maps to between 4.9 and 9.7 centimorgans from the niaD marker in A. nidulans and lies 450 kb away on a 1000 kb contig. The brlA gene is involved in regulating conidiophore development (Adams et al., 1988) and is therefore highly likely to have a homologue in A. fumigatus. The benA genetic marker lies 99 cM from niaD (495 to 2970 kb) and has been physically mapped within 50 kb of the centromere. The gene has been cloned in both A. nidulans and A. fumigatus. In addition, three other genetic loci which lie between benA and brlA have been cloned in A. nidulans and these will be used as intermediate probes for the mapping. Probes will be prepared from all these genes and used to identify the BAC clones that carry these genes by hybridisation against the gridded 3000 clone BAC library. If the genetic organisation between A. fumigatus and A. nidulans has been maintained in this region, the BAC clones identified will form part of a contig, as determined from the previously generated physical restriction maps. If, for whatever reason, no overlap between the benA and brlA markers can be established, then we will walk from the niaD-containing contig using the end sequences of the clones as probes against the cosmid library. Once we have a suitable contig around 1 Mb in size, we will sequence the 10 clones which have minimal overlaps. It is likely that the sequence will contain 250 to 350 coding regions, almost all previously unknown in A. fumigatus. If necessary, other cloned A. fumigatus genes which have a genetically and physically mapped A. nidulans homologue will also be used as probes to pull out clones from the library to act as starting points for chromosome walking. Such genes include pyrG which maps to chromosome I and rodA which maps to chromosome III.
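Choosing the clones with minimal overlaps from a mapped contig amounts to selecting a minimal tiling path. A minimal sketch of one way to do this in Python is given below, using a greedy interval-cover rule: at each step take the clone that starts within the region already covered and extends furthest. The clone names and map positions are hypothetical, and real selection would also weigh insert quality and map confidence.

```python
def minimal_tiling_path(clones, start, end):
    """Greedy cover of [start, end] (kb) by clones given as name -> (left, right)."""
    path = []
    covered_to = start
    remaining = dict(clones)
    while covered_to < end:
        # Clones that overlap the covered region and extend beyond it.
        candidates = {
            name: span for name, span in remaining.items()
            if span[0] <= covered_to < span[1]
        }
        if not candidates:
            raise ValueError(f"gap in clone coverage at {covered_to} kb")
        best = max(candidates, key=lambda name: candidates[name][1])
        path.append(best)
        covered_to = candidates[best][1]
        del remaining[best]
    return path


# Hypothetical map positions (kb) of six overlapping clones.
mapped = {
    "A": (0, 120), "B": (40, 150), "C": (100, 230),
    "D": (140, 260), "E": (220, 350), "F": (250, 330),
}
print(minimal_tiling_path(mapped, 0, 350))

# Expected gene content of a ~1 Mb contig at one gene per 3.5 kb.
expected_genes = round(1000 / 3.5)
print(expected_genes)
```

At the density of roughly one gene every 3.5 kb estimated from A. nidulans, a 1 Mb contig would carry about 286 genes, which sits within the 250 to 350 coding regions anticipated above.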
5. Sequence analysis
In line with Beowulf and Sanger Centre policy on data release, all the raw sequence information will be released directly onto the internet the evening the information is generated. Finished sequence for each BAC will be analysed and annotated at the Sanger Centre and placed in the public databases. Analysis at the Sanger Centre will be done with a comprehensive set of tools used for the analysis of the Plasmodium falciparum, S. cerevisiae, S. pombe and bacterial genome sequences. The sequences will be searched for coding regions, introns and other features such as repetitive elements, and comparisons will be made against DNA, protein and protein motif databases to assign the initial putative function for coding regions. These analyses will be brought together using DIANA and AceDB to generate an annotated completed sequence for each BAC clone which will be submitted to the EMBL database. All of our unfinished and finished sequence data and annotation will be accessible from our website, both for similarity (BLAST) searching and for download.
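As an illustration of the first step in searching finished sequence for coding regions, the Python sketch below scans all six reading frames for stretches running from an ATG to the next in-frame stop codon. It is not one of the Sanger Centre tools named above, and the function names and minimum length are assumptions. Because almost all known A. fumigatus genes contain introns, a scan of this kind can only flag candidate exons and would need to be followed by proper gene prediction and database comparison.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) half-open coordinates of ATG..stop runs in one frame."""
    start = None
    for pos in range(frame, len(seq) - 2, 3):
        codon = seq[pos:pos + 3]
        if start is None and codon == "ATG":
            start = pos
        elif start is not None and codon in STOP_CODONS:
            # Length is counted in codons, including the stop codon.
            if (pos + 3 - start) // 3 >= min_codons:
                yield start, pos + 3
            start = None


def find_orfs(seq, min_codons=100):
    """Scan six frames; minus-strand coordinates refer to the reverse complement."""
    seq = seq.upper()
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(template, frame, min_codons):
                hits.append((strand, frame, start, end))
    return hits


# Tiny synthetic example: one short ORF in frame 2 of the forward strand.
example = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(example, min_codons=3))
```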
The sequences will be further analysed in Manchester by Michael Anderson in collaboration with Andy Brass who leads the Bio-informatics Group at the University of Manchester. These analyses will build on work being developed jointly with computer science on automatic cDNA analysis. The novel feature of this system is the use made of keywords extracted from all the relevant sequence matches to help inform the functional prediction. In addition, we will build on our work with Stephen Oliver on the yeast genome information management system (the GIMS project) and we will use the very rapid comparative genome analysis software developed within the lab (the RAPID system of Crispin Miller - submitted to the Bioinformatics journal). These tools will be used to perform comparative genome analysis between Aspergillus species and with S. cerevisiae as an additional means for decorating the A. fumigatus sequence with relevant functional information.
The results from the various stages will be collated and these collected data for each coding region and other features of interest, including relevant literature references, will be brought together in a relational database (Sybase) which will be accessible from the world-wide web. At this stage a new website will be set up with links to other sites such as the Aspergillus Website (www.aspergillus.ac.man.uk) and other fungal genome databases. This site will form the base for collating the data from the whole genome project should this go ahead in the future.
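The kind of relational layout described above can be sketched as follows. The example uses Python's built-in sqlite3 module purely as a stand-in for Sybase, and the table names, column names and sample rows are hypothetical placeholders rather than the schema that will be deployed.

```python
import sqlite3

SCHEMA = """
CREATE TABLE bac_clone (
    clone_id   INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    length_bp  INTEGER
);
CREATE TABLE feature (
    feature_id        INTEGER PRIMARY KEY,
    clone_id          INTEGER NOT NULL REFERENCES bac_clone(clone_id),
    kind              TEXT NOT NULL,      -- e.g. 'CDS', 'intron', 'repeat'
    start_bp          INTEGER NOT NULL,
    end_bp            INTEGER NOT NULL,
    strand            TEXT CHECK (strand IN ('+', '-')),
    putative_function TEXT
);
CREATE TABLE literature (
    feature_id INTEGER NOT NULL REFERENCES feature(feature_id),
    citation   TEXT NOT NULL
);
"""


def build_database():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def features_on_clone(conn, clone_name):
    """Return the features annotated on one clone, ordered by position."""
    query = """
        SELECT f.kind, f.start_bp, f.end_bp, f.putative_function
        FROM feature AS f
        JOIN bac_clone AS c ON c.clone_id = f.clone_id
        WHERE c.name = ?
        ORDER BY f.start_bp
    """
    return conn.execute(query, (clone_name,)).fetchall()


# Placeholder rows, for illustration only.
conn = build_database()
conn.execute("INSERT INTO bac_clone (clone_id, name, length_bp) VALUES (1, 'example_bac_01', 100000)")
conn.executemany(
    "INSERT INTO feature (clone_id, kind, start_bp, end_bp, strand, putative_function) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "CDS", 5200, 7900, "+", "hypothetical protein"),
        (1, "repeat", 1500, 1800, "+", None),
    ],
)
print(features_on_clone(conn, "example_bac_01"))
```

Keeping features and literature references in separate tables keyed to the clone allows the same structure to be extended clone by clone if the whole genome project goes ahead.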
The Aspergillus scientific community.
Who will benefit from and be able to utilise this sequence data? There are, of course, a large number of scientists working on A. nidulans all over the world and these groups would benefit from the information on A. fumigatus, not only for purposes of comparison, but also because genetic engineering may allow the transformation of an A. fumigatus gene into A. nidulans for industrial or experimental purposes. The same can be said for the smaller A. niger, A. oryzae and even smaller A. terreus communities.
There is a small but growing pool of scientists working on A. fumigatus itself. In the UK these include D. Holden (London), N. Gow (Aberdeen), S. Kelly (Aberystwyth), D. Adams (Leeds), R. Hay and A. Hamilton (London), J. Burnie (Manchester), K. Reid (Oxford) and a number of others. In Europe there are a large number of active scientists, including M. Monod (Lausanne), J-P. Latgé (Paris), R. Ruchel (Goettingen), J. Martinez (Valencia), F. Leal (Salamanca), G. Tronchin (Angers), J. Bastide (Montpellier), R. Grillot (Grenoble), Z. Erjavec and H. Kaufmann (Groningen), P. Verweij and W. Melchers (Nijmegen), C. Cremeri and A. Schaffner (Zurich) as well as a number of clinicians with a keen interest in the disease. Relatively fewer scientists and clinicians are active in the field in the USA and because of this the NIH recently put out a special call for proposals in the areas of aspergillosis and drug resistance in fungi. There is greater expertise generally in the USA in the allergic manifestations of aspergillosis than in the invasive and immediately life-threatening forms.
In addition to the academic scientific community there will be real interest in the data from the pharmaceutical industry. Many firms are actively seeking new antifungal drugs, partly because of the emergence of resistance and partly because of the inadequacy of current therapy, particularly for invasive aspergillosis. Companies active in this area include GlaxoWellcome, Zeneca, Pfizer, Merck, Lilly, Johnson and Johnson (Janssen), Roche, Bristol Myers Squibb, Uriach, Aronex, Phytera, Xoma and a number of other smaller concerns. Many companies have an agrochemical research programme for which the A. fumigatus sequence will be directly relevant. The worldwide antifungal market is currently in excess of £2.6 billion and the agrochemical market approximately £2 billion.
References
Adams TH, Boylan MT, Timberlake WE. brlA is necessary and sufficient to direct conidiophore development in Aspergillus nidulans. Cell 1988;54:353-362.
Bennett JH. On the parasitic vegetable structures found growing in living animals. Trans Roy Soc Edinburgh 1842;15:277-279.
Cawley EP. Aspergillosis and the aspergilli. Arch Intern Med 1947;80:423-434.
Clutterbuck J. The Aspergillus nidulans linkage map. http://www.gla.ac.uk/Acad/IBLS/molgen/aspergillus/index.html
Denning DW. Therapeutic outcome of invasive aspergillosis. Clin Infect Dis 1996;23:608-615.
Denning DW. Invasive aspergillosis (State-of-the-art clinical article). Clin Infect Dis 1998;26:781-805.
Denning DW, del Favero A, Gluckman E, Rhunke M, Yonren S, Troke P, Sarantis N. The efficacy and tolerability of UK 109-196 (Voriconazole) in the treatment of invasive aspergillosis (IA). Abstr F552 ISHAM 1997a; Parma, Italy 8-13 June.
Denning DW, Venkateswarlu K, Oakley K, Anderson MJ, Manning NJ, Stevens DA, Warnock DW, Kelly SL. Itraconazole resistance in Aspergillus fumigatus. Antimicrob Ag Chemother 1997b;41:1364-1368.
Groll AH, Shah PM, Mentzel C, Schneider M, Just-Neubling G, Huebling G, Huebner K. Trends in the postmortem epidemiology of invasive fungal infections at a university hospital. J Infect 1996;33:23-32.
Hinson KFW, Moon AJ, Plummer NS. Broncho-pulmonary aspergillosis. A review and a report of eight new cases. Thorax 1952;7:317-333.
Ingber D, Fujita T, Kishimoto S, Sudo K, Kanamaru T, Brem H, Folkman J. Synthetic analogues of fumagillin that inhibit angiogenesis and suppress tumour growth. Nature 1990;348:555-557.
Johnstone IL, McCabe PC, Greaves P, Gurr SJ, Cole GE, Brow MAD, Unkles SE, Clutterbuck AJ, Kinghorn JR, Innis MA. Isolation and characterisation of the crnA-niiA-niaD gene cluster for nitrate assimilation in Aspergillus nidulans. Gene 1990;90:181-192.
Land CJ, Hult K, Fuchs R, Hagelberg S, Lundstrom H. Tremorgenic mycotoxins from Aspergillus fumigatus as a possible occupational health problem in sawmills. Appl Environ Microbiol 1987;53:787-790.
Pasamontes L, Haiker M, Wyss M, Tessier M, van Loon APGM. Gene cloning, purification, and characterization of a heat-stable phytase from the fungus Aspergillus fumigatus. Appl Environ Microbiol 1997;63:1696-1700.
Prade RA, Griffith J, Kochut K, Arnold J, Timberlake WE. In vitro reconstruction of the Aspergillus (=Emericella) nidulans genome. Proc Natl Acad Sci USA 1997;94:14564-14569.
Rankin NE. Disseminated aspergillosis and moniliasis associated with agranulocytosis and antibiotic therapy. Br Med J 1953; April 25:918-919.
Smith JMB. Opportunistic Mycoses of Man and other Animals. CAB International, Wallingford, Oxon, UK 1989, pp81-114.
Tsai H-F, Washburn RG, Chang YC, Kwon-Chung KJ. Aspergillus fumigatus arp1 modulates conidial pigmentation and complement deposition. Mol Microbiol 1997;26:175-183.
Unkles SE, Hawker KL, Grieve C, Campbell EI, Montague P, Kinghorn JR. crnA encodes a nitrate transporter in Aspergillus nidulans. Proc Natl Acad Sci USA 1991;88:204-208.
Verweij PE, Meis JFGM, Vandenhurk P, Zoll J, Samson RA, Melchers WJG. Phylogenetic relationships of five species of Aspergillus and related taxa as deduced by comparison of sequences of small subunit ribosomal RNA. J Med Vet Mycol 1995;33:185-190.
Verweij PE, Oakley KL, Morrissey J, Morrissey G, Denning DW. Efficacy of LY303366 against amphotericin B "susceptible" and "resistant" A. fumigatus infection in a murine model of invasive aspergillosis. Antimicrob Ag Chemother 1998;42:873-878.
Wheaton SW. Case primarily of tubercle, in which a fungus (aspergillus) grew in the bronchi and lung, simulating actinomycosis. Path Trans 1890;41:34-37.
Yousef GA, Moore MM. Mapping of the nitrate-assimilation gene cluster (crnA-niiA-niaD) and characterization of the nitrite reductase gene (niiA) in the opportunistic fungal pathogen Aspergillus fumigatus. Curr Genet 1998;33:206-215.