MOLECULAR EPIDEMIOLOGY AND THE NEW TUBERCULOSIS

Peter M. Small(1) and Andrew Moss(2)

From (1) Department of Medicine, Division of Infectious Diseases and Geographic Medicine and Howard Hughes Medical Institute, Stanford University, and (2) Department of Epidemiology and Biostatistics, University of California, San Francisco.

Address correspondence to:
Peter M. Small, M.D.
Room 251, Beckman Center
Stanford University
Stanford, CA 94305
Phone (415) 725-7908
Fax (415) 723-1399

Dr. Small is supported by the Howard Hughes Medical Institute and NIH Grant KO8 AI01137-01. Dr. Moss is supported by a grant from the Kaiser Family Foundation.

Running Title: Molecular Epidemiology of Tuberculosis

ABSTRACT

DNA fingerprinting techniques now exist which identify specific strains of M. tuberculosis. These techniques may be integrated with conventional epidemiologic approaches to better understand tuberculosis in its modern form. This paper reviews the lessons learned from this approach about the pathogenesis and epidemiology of tuberculosis. In addition, it speculates about the potential future applications of molecular epidemiology, including its use as an adjunct to conventional public health measures.

KEY WORDS: Tuberculosis, epidemiology, RFLP, drug resistance, molecular epidemiology.

1. INTRODUCTION

Molecular epidemiology is the integration of molecular-biologic techniques, which identify specific strains of organisms, with epidemiology, which seeks to understand the distribution of disease in populations. Here we discuss the application of molecular epidemiology to tuberculosis. Because of the impact of HIV on tuberculosis, it has been suggested (1,2) that this disease of antiquity has become a "new entity" with a new epidemiology, perhaps requiring a new approach to case-finding and control. We will examine what molecular epidemiology has suggested about this "new" tuberculosis and speculate on the potential of molecular epidemiology for confronting tuberculosis in its modern form.
2. THE NEW TUBERCULOSIS

Between 1985 and 1991 the number of tuberculosis cases reported in the US rose from 22,201 to 26,283 cases, reversing a historic decline (3). In New York City the incidence of tuberculosis more than doubled over the last decade. In the two New York health districts with the highest tuberculosis incidence, the 1991 rates were 219 and 124 per 100,000 population, comparable to those in the pre-antibiotic era, or present-day Tanzania (4).

New York City is the center of an even more disturbing phenomenon, the emergence of M. tuberculosis strains which are resistant to antimicrobial agents. These include multiple-drug resistant (MDR) strains which are resistant to several antibiotics. In NYC 19% of all prevalent strains in mid-1991 were resistant to both isoniazid and rifampin, as were 44% of all strains from previously treated patients (5). Inattention to infection control, delays in suspecting and identifying drug resistance, and inefficient therapy have permitted the dissemination of these strains in institutional settings, resulting in over 250 active cases identified in outbreaks. Patients infected with MDR strains often respond poorly even to the best available therapy. In a review of 171 patients infected with strains resistant to at least isoniazid and rifampin, the overall response rate was only 56% despite a mean of 51 months of treatment (6). Although MDR-TB has only been seen sporadically outside NYC, the conditions which spawned it are present globally.

HIV has a profound effect on the pathogenesis of tuberculosis. Infection in AIDS patients progresses very rapidly to active disease (7). In patients who are latently infected with M. tuberculosis, co-infection with HIV dramatically increases the risk of progression to active disease, so that the apparent risk of a co-infected person developing active disease over 2 years exceeds the lifetime risk in HIV seronegatives (8).
It has been speculated that late-stage HIV patients may also be at additional risk of infection, as well as risk of rapid progression (7). Finally, AIDS patients do not maintain protective immunity following infection and thus remain susceptible to exogenous reinfection (9). As a result of these factors infected HIV seropositive individuals can rapidly become active cases and disseminators themselves, accelerating the epidemic process. The cumulative effect of HIV has been described by Phil Hopewell as "telescoping" the natural history of tuberculosis (1).

The synergy between HIV infection and the social conditions which promote tuberculosis is probably the largest single factor fueling the development of a "new" tuberculosis. These social conditions include increased immigration, the rise of poverty and homelessness, intravenous drug use, and the breakdown of tuberculosis control systems (1,2,10). The crowding which results from poverty and homelessness has long been known to contribute to the spread of tuberculosis (11) and is common in the same geographical areas as HIV infection. In New York, the documented failures of the tuberculosis control system (12,13) clearly interact with both HIV infection and poverty to increase the spread of disease.

Perhaps the main result of the interaction of HIV and M. tuberculosis has been to increase the proportion of tuberculosis attributable to new infection rather than reactivation of latent infection acquired early in life. It has been estimated that between 1985 and 1991, 13,700 cases of tuberculosis were the result of active transmission in hospitals, homeless shelters, prisons and other communal living situations (10). Thus, in the United States, tuberculosis prevention must look increasingly towards identifying the biologic, social and environmental factors which foster tuberculosis transmission.
3. MOLECULAR EPIDEMIOLOGY OF TUBERCULOSIS

Tuberculosis epidemiology has been hindered by the lack of a method for determining if bacterial isolates from different patients have a common origin, i.e. are the same strain or clone. Such a relationship could be inferred if the isolates share enough phenotypic and genotypic traits, and is most convincingly demonstrated when these traits are known to be stable.

One genotypic trait which shows great promise as an indicator of clonality in M. tuberculosis is diversity in the copy number and genomic location of a DNA sequence designated IS6110. Restriction fragment length polymorphism (RFLP) analysis is a "DNA fingerprinting" technique which exploits this diversity to generate strain-specific patterns (14). The technique involves extracting the DNA from an M. tuberculosis culture, digesting this DNA with a restriction endonuclease, separating the resulting fragments according to size using agarose gel electrophoresis, "Southern" blotting these fragments to a membrane and probing this to identify which fragments contain the IS6110 element. For each strain analysed this process yields a unique "fingerprint" pattern with 2 to 20 bands. Laboratory procedures have been standardized among American and European laboratories using this technique (15), facilitating the comparison of results obtained in different laboratories. Software has been developed which permits the automated storage, analysis, and comparison of patterns from large numbers of isolates (Figure 1).

The stability of the fingerprint has been demonstrated by examining M. tuberculosis isolated from chronic excretors and in outbreak settings. Repeat isolates from individuals who persistently excrete M. tuberculosis have been shown to have identical DNA fingerprints (16). Isolates from individuals who are epidemiologically linked in point source outbreaks also have the same fingerprint (7,14,17,18).
Thus there is increasing consensus that RFLP patterns are a reliable indicator of strain identity. However, this may not be true of strains which have only a few copies of IS6110. Preliminary data from several laboratories (J. van Embden, D. Alland, personal communications; P. Small, unpublished data) suggest that patients who are epidemiologically unrelated can have isolates with the same single-band pattern. This is probably because in such strains, the single copy of IS6110 is located in one specific region of the genome (the "integration hotspot") (19). Although this has only been demonstrated for strains with single copies of IS6110, it is speculated that similar nonspecific results may be encountered with 2 or 3 copies. In fact, the reliability of RFLP typing in identifying a strain may increase with the number of copies of IS6110.

In addition, the DNA fingerprints generated by RFLP typing are not absolutely stable, due to genetic rearrangements in the bacteria. The addition or loss of a single band has been demonstrated in chronic excretors (20,21) and in the outbreak setting (7). In practice, however, related isolates whose patterns differ by a single band can still be identified. It is not surprising that there is instability in the RFLP pattern, given that the repetitive element from which the patterns are generated has the characteristics of a mobile genetic element. Transposition of this particular element appears to be rare on the time-scale of most point-source outbreak investigations, so that RFLP patterns are unlikely to change significantly in such settings. However, this phenomenon may decrease the applicability of RFLP typing in tracking strains over extended periods of time.

Because the RFLP technique can only be performed on microgram quantities of DNA, its use is currently limited to culture-positive cases.
Thus strains cannot be tracked in patients who have been infected but have not developed disease, i.e. PPD converters. In addition, because the technique requires extraction of DNA from viable cultures, it is time consuming and can only be performed in specialized bio-safety laboratories. Other IS6110-based techniques which utilize DNA amplification with the polymerase chain reaction are being developed which may circumvent some of these problems (22,23).

4. WHAT HAS MOLECULAR EPIDEMIOLOGY TAUGHT US?

Outbreak investigation. RFLP typing has become a routine tool in investigating suspected institutional outbreaks. It has been used to identify clonal outbreaks in a congregate living facility (7), a shelter (24), prisons (25,26,27), and multiple hospitals (17,18,28,29). Because the epidemiological and molecular investigations have been consistent in these outbreaks, there is increasing consensus that (1) RFLP identity of strains seen in these outbreaks denotes infection with the same strain, (2) infection with the same strain usually results from infection from a common source, and (3) the common source infection is usually recent. It must be emphasized, however, that these remain unproven assumptions and that the imputation of a recent common source requires epidemiological confirmation.

In addition to the individual outbreak investigations, RFLP typing has shown that one strain was involved in at least three different CDC-identified MDR outbreaks in New York, and demonstrated additional occurrences of the same strain in at least twelve other New York hospitals (17,30,31). Resistance patterns (typically resistant to all five first-line drugs plus kanamycin and sometimes ethionamide) were very similar in at least 100 individuals infected with this strain in 1991-92 (B. Kreiswirth, personal communication). This suggests the potential use of RFLP typing for community-wide monitoring of particularly resistant strains.
Several different approaches to the use of fingerprinting should be noted, depending on the kind and extent of epidemiologic or microbiologic information available. The New York and Florida MDR outbreaks reported by the Centers for Disease Control (28) were suspected because unusual drug resistance patterns had been noted; in these instances RFLP typing served to show a common strain which, in combination with epidemiological information, demonstrated hospital transmission. A similar conclusion might have been reached on the basis of sensitivity typing alone. In the San Francisco congregate housing outbreak, RFLP typing confirmed transmission that surveillance alone had suggested (7). In settings where population-based surveillance is conducted, DNA fingerprinting may even detect transmission that was not suspected epidemiologically. For example, in San Francisco, the strain from a non-compliant alcoholic who was smear positive for 4 months has now been identified in 10 others. Though direct transmission has not been demonstrated epidemiologically, the demographics of these patients make transmission plausible (P. Small, unpublished data).

The latter case raises the question of the relationship between RFLP typing and conventional TB epidemiology. Under what circumstances can an epidemiologic relationship be inferred from RFLP typing? It is possible that epidemiologically unrelated cases have been infected with different strains which coincidentally have the same RFLP pattern. On the other hand, apparently unrelated cases may actually be related, e.g. by casual transmission. At this point in our understanding of these issues, RFLP typing should probably not be used alone except in a hypothesis-generating mode. In most circumstances there should be some prior epidemiological strategy before large numbers of samples are routinely RFLP tested.

Biology of the new tuberculosis: rapid progression.
The San Francisco residential facility outbreak provided the first conclusive demonstration of the accelerated progression of tuberculosis in HIV-infected individuals (7). In this outbreak, 37% of AIDS patients who were close contacts of a smear-positive case developed active disease within four months of their exposure. Similar conclusions follow from the MDR outbreaks in New York and Florida. In fact the use of RFLP typing to identify outbreaks is highly dependent on the phenomenon of accelerated progression, and hence on HIV infection. The view through the RFLP window is primarily a view of those persons with advanced HIV disease who are co-infected with M. tuberculosis. This suggests that the outbreaks detected amongst late-stage HIV-infected persons are likely to be followed increasingly by subsequent cases of the same strain in HIV-negative persons.

Biology of the new tuberculosis: exogenous reinfection. RFLP analysis has shown that AIDS patients do not develop sufficient immunity from tuberculosis to prevent reinfection from subsequent exposure. This was demonstrated in a group of New York AIDS patients whose apparent relapse following initial improvement during or after anti-tuberculosis treatment was actually due to exogenous reinfection with a second strain of M. tuberculosis (9). While there appears to be no doubt that highly immunocompromised patients can be exogenously reinfected with new strains of M. tuberculosis, the frequency with which this occurs remains undetermined. In addition, the possibility that, with sufficient exposure, exogenous reinfection can occur in immunocompetent persons remains unexplored.

5. WHAT CAN MOLECULAR EPIDEMIOLOGY TEACH US IN THE FUTURE?

Extent of recent infection. RFLP typing may allow us to estimate the relative contribution of newly acquired versus reactivated latent infection in a population.
Under the assumptions of section 4 above, each group of patients whose isolates have the same RFLP type (which we refer to as a "cluster") may include one reactivated index case, but the other cases represent recently acquired rather than reactivated disease. Thus the proportion of the total incident cases that are in clusters is an estimate of the proportion resulting from recent infection (an underestimate because late-developing cases are ignored). Preliminary data from population-based screening in San Francisco demonstrate clustering in 33% of incident isolates examined, suggesting that a relatively high proportion of tuberculosis in San Francisco was recently acquired. Clustering was more prevalent in cases diagnosed in high-incidence census tracts and was associated with risk factors for HIV (P. Small, unpublished data).

Resistance. The relationship between clustering and drug susceptibility patterns may also be used to examine primary and acquired drug resistance. Primary resistance, caused by clonal dissemination of resistant strains, will give rise to limited RFLP diversity among drug-resistant strains in a population. Acquired drug resistance, arising from inadequate treatment of individual patients, will give rise to a high proportion of unique RFLP patterns. Most drug-resistant strains in San Francisco, where resistance is relatively rare and occurs primarily in elderly immigrants, have unique RFLP patterns. In New York, where drug resistance is believed to be most common in relatively young, HIV-infected persons, there may be dominant resistant strains (P. Small, unpublished data, and B. Kreiswirth, personal communication).
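The clustering arithmetic described above under "Extent of recent infection" and "Resistance" can be illustrated with a minimal sketch in Python. It is not the software or data used in this work: the case identifiers, band sizes, and function names are hypothetical, and two isolates are assumed to share a pattern only when every band matches exactly, which ignores the single-band changes and low-copy-number caveats discussed in section 3. The second proportion, which subtracts one presumed index case per cluster, is one possible reading of the statement that each cluster may include one reactivated index case.

```python
from collections import Counter


def cluster_summary(fingerprints):
    """Summarize clustering for a dict of case id -> RFLP pattern.

    Each pattern is a hashable value, here a tuple of band sizes
    (hypothetical units); identical tuples count as the same strain.
    """
    if not fingerprints:
        raise ValueError("no cases supplied")
    counts = Counter(fingerprints.values())
    total = len(fingerprints)
    # A cluster is any pattern shared by two or more cases.
    cluster_sizes = [n for n in counts.values() if n >= 2]
    clustered = sum(cluster_sizes)
    return {
        "cases": total,
        "clusters": len(cluster_sizes),
        "clustered_cases": clustered,
        # Proportion of incident cases that fall in any cluster.
        "proportion_clustered": clustered / total,
        # Same, after removing one presumed index case per cluster.
        "proportion_excluding_index": (clustered - len(cluster_sizes)) / total,
    }


def unique_pattern_share(fingerprints, resistant_ids):
    """Share of drug-resistant cases whose pattern occurs only once
    among all cases (an assumed proxy for acquired resistance)."""
    counts = Counter(fingerprints.values())
    resistant = [c for c in resistant_ids if c in fingerprints]
    if not resistant:
        raise ValueError("no resistant cases supplied")
    unique = sum(1 for c in resistant if counts[fingerprints[c]] == 1)
    return unique / len(resistant)


# Hypothetical toy data: eight cases, two clusters, three unique patterns.
cases = {
    "A": (1200, 3400, 5000),
    "B": (1200, 3400, 5000),
    "C": (1200, 3400, 5000),
    "D": (2100, 4400),
    "E": (2100, 4400),
    "F": (900, 6300, 7700),
    "G": (1500, 2800),
    "H": (3100, 3900, 8200),
}

if __name__ == "__main__":
    print(cluster_summary(cases))
    print(unique_pattern_share(cases, ["B", "F", "G"]))
```

In this toy example five of the eight cases fall in clusters, and two of the three resistant cases have unique patterns. A real analysis would need a tolerance for band-position measurement error and a rule for patterns that differ by a single band.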
Risk factors for tuberculosis. If "clustering" is demonstrated to be a valid proxy for recent infection, then a formal multivariate comparison of cluster and noncluster cases in terms of risk factors for tuberculosis such as drug use, homelessness, poor compliance and so on will allow us to estimate the importance of each of these factors in recent transmission of tuberculosis, and hence its importance for prevention of the new tuberculosis. However, it should be remembered that this approach is based on active disease and not infection, and thus will overestimate the importance of rapid progressors. In particular, the telescoping of tuberculosis in HIV infection strongly biases this approach towards the importance of HIV-related risk factors.

Phenotypic characteristics of strains. The most speculative, but perhaps most intriguing, application of RFLP typing is to identify phenotypic variability between strains. Strain variability in drug resistance has profound implications for patient care and public health. Also, animal models have suggested that certain strains, such as those from Southern India, are less virulent. It is possible that some strains have specific tissue tropism, and are, for example, predisposed to cause scrofula or meningitis. The biologic and epidemiologic comparison of individual strains may make it possible to identify such phenotypes in human populations. This will assist in the identification of genetic mechanisms, such as genes encoding adhesion to or invasion of tissues, which confer greater transmissibility.

6. IMPLICATIONS FOR PUBLIC HEALTH

RFLP typing is clearly a valuable technology for tuberculosis control. It is only one technique, however, and it should not compete with standard control approaches for resources. Its major limitations are its dependence on culture, and thus its restriction to culture-positive disease and inability to address the epidemiology of asymptomatic infection.
Ironically, this new molecular technique suggests that much current tuberculosis is due to active transmission, re-emphasizing the importance of traditional TB-control activities like aggressive casefinding and rapid treatment. However, molecular techniques should be integrated into these activities both to assist in detecting point source outbreaks and as a tool for examining tuberculosis control.

Molecular techniques can be used to gain a more sophisticated understanding of some of the traditional questions in tuberculosis epidemiology. Who is spreading tuberculosis and where is it being spread? What activities confer a risk of infection and how can risk be reduced? What are the risk factors for progression to active disease and do current chemoprophylaxis recommendations include all those at high risk? If molecular pathogenic determinants of M. tuberculosis are discovered, could these be factored into tuberculosis control? For example, should there be more contact investigation on strains which have an invasion gene which enhances transmissibility and less effort made to trace contacts of patients infected with strains which are known to cause only lymphadenopathy? Finally, why is MDR-TB emerging in certain communities, and what is the fate of individual MDR strains? Molecular epidemiology can contribute to answering these questions, all of which have important implications for tuberculosis control.

ACKNOWLEDGEMENTS: We are indebted to Gary Schoolnik, Phillip Hopewell and Tony Paz, and to the San Francisco General Hospital Mycobacterial Disease Research Group, for many thoughtful discussions.

FIGURE LEGEND: Software has been developed for automated analysis and comparison of large numbers of RFLP results. This is a photograph of the computer screen showing the digitized image of the autoradiograph (left) indicating the band position (marked with fine dashes).
The "window" inset is a dendrogram representing the degree of similarity between isolates from this and other gels.

1 Hopewell PC. Impact of human immunodeficiency virus infection on the epidemiology, clinical features, management and control of tuberculosis. Clin Infect Dis 1992;18:540-546.
2 Snider DE, Roper WL. The new tuberculosis. N Engl J Med 1992;326:703-5.
3 CDC. Morbidity and Mortality Weekly Report 1992;41(14):240.
4 Styblo K. The impact of HIV infection on the global epidemiology of tuberculosis. Bull Int Union Tuberc Lung Dis 1991;61:27-32.
5 Frieden TR, Sterling T, Pablos-Mendez A, Kilburn JO, Cauthen GM, Dooley SW. The emergence of drug-resistant tuberculosis in New York. N Engl J Med 1993;328:521-26.
6 Goble M, Iseman MD, Madsen LA et al. Treatment of 171 patients with pulmonary tuberculosis resistant to isoniazid and rifampin. N Engl J Med 1993;328:527-32.
7 Daley CD, Small PM, Schecter GS, Schoolnik GK, McAdam R, Jacobs R, Hopewell PC. An outbreak of tuberculosis with accelerated progression among persons infected with the human immunodeficiency virus: An analysis using restriction-fragment-length polymorphisms. N Engl J Med 1992;326:231-235.
8 Selwyn PA, Hartel D, Lewis VE et al. A prospective study of the risk of tuberculosis among intravenous drug users with human immunodeficiency virus infection. N Engl J Med 1989;320:545-50.
9 Small PM, Shaefer RW, Hopewell PC, Singh SP, Murphy M, Desmond E, Sierra M, Schoolnik GK. Exogenous reinfection with multidrug-resistant Mycobacterium tuberculosis in patients with advanced HIV infection. N Engl J Med 1993;328:1137-44.
10 Bloom BR, Murray CJL. Tuberculosis: commentary on a reemergent killer. Science 1992;257:1055-63.
11 Ransome A. Researches on Tuberculosis: The Webber-Parkes Prize Essay. London: Smith, Elder and Company, 1898.
12 Brudney K, Dobkin J. Resurgent tuberculosis in New York City. Human immunodeficiency virus, homelessness and the decline of tuberculosis control programs. Am Rev Respir Dis 1991;144:745-49.
13 Brudney K, Dobkin J. A tale of two cities: tuberculosis control in Nicaragua and New York City. Semin Respir Infect 1991;6:261-72.
14 Hermans PW, van Soolingen D, Dale JW, Schuitema ARJ, McAdam RA, Catty D, van Embden JDA. Insertion element IS986 from M. tuberculosis: A useful tool for diagnosis and epidemiology of tuberculosis. J Clin Micro 1990;28(9):2051-2058.
15 Van Embden JDA, Crawford JT, Dale JW et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology.
16 Otal E, Martin C, Vincent-Levy-Frebault V, Thierry D, Gicquel B. Restriction fragment length polymorphism analysis using IS6110 as an epidemiologic marker in tuberculosis. J Clin Micro 1992;29:1252-3.
17 Edlin BR, Tokars JI, Grieco MH et al. An outbreak of multidrug-resistant tuberculosis among hospitalized patients with the acquired immunodeficiency syndrome. N Engl J Med 1992;325:1514-21.
18 Fischl M, Uttamchandani R, Daikos GL et al. An outbreak of tuberculosis caused by multiple-drug-resistant tubercle bacilli among patients with HIV infection. Ann Intern Med 1992;117:177-182.
19 Hermans PW, van Soolingen D, Bik EM, De Haas PEW, Dale JW, van Embden JDA. Insertion element IS987 from Mycobacterium bovis BCG is located in a hot-spot integration region for insertion elements in Mycobacterium tuberculosis complex strains. Infection and Immunity 1991;59:2695-2705.
20 Small PM, Shaefer RW, Hopewell PC, Singh SP, Murphy M, Desmond E, Sierra M, Schoolnik GK. Exogenous reinfection with multidrug-resistant Mycobacterium tuberculosis in patients with advanced HIV infection. N Engl J Med 1993;328:1137-44.
21 Das S, Chan SL, Allen BW, Mitchison DA, Lowrie DB. Application of DNA fingerprinting with IS986 to sequential mycobacterial isolates obtained from pulmonary tuberculosis patients in Hong Kong before, during and after short-course chemotherapy. Tubercle and Lung Disease 1993;74:47-51.
22 Palittapongarnpim P, Chomyc S, Fanning A, Kunimoto D. DNA fingerprinting of Mycobacterium tuberculosis isolates by ligation-mediated polymerase chain reaction. Nuc Acids Res 1993;21:761-762.
23 Haas WH, Butler WR, Woodley CL, Crawford JT. Mixed-linker polymerase chain reaction: A new method for rapid fingerprinting of isolates of the Mycobacterium tuberculosis complex. J Clin Micro 1993;31:1293-1298.
24 Alland D, Brudney K, McAdam R et al. An outbreak of tuberculosis in a New York City men's shelter. Abstract 47D. World Congress on Tuberculosis. Washington DC 1993.
25 CDC. Transmission of multidrug-resistant tuberculosis among immunocompromised persons in a correctional system. New York 1991. MMWR 1992;41:507-9.
26 Valway S, Papania M, Richards S et al. Multi-drug resistant tuberculosis (MDR-TB) in New York State correctional facilities 1990-91. Abstract 54B. World Congress on Tuberculosis. Washington DC 1992.
27 Dooley SW, Villarino ME, Lawrence M, et al. Nosocomial transmission of tuberculosis in a hospital unit for HIV-infected patients. JAMA 1992;267:2632-4.
28 CDC. Nosocomial transmission of multidrug-resistant tuberculosis among HIV-infected persons, Florida and New York, 1988-1991. MMWR 1991;40:585-91.
29 Alfalla C, Hewlett D, Horn D et al. An outbreak of multidrug-resistant tuberculosis among 28 patients at a New York City hospital. Abstract No 240724. ICAAC. Anaheim 1992.
30 Kent J, Valway S, Onorato I. Epidemiologically linked outbreaks of multidrug-resistant tuberculosis (MDR-TB), New York State, 1990-92. Abstract 51B. World Congress on Tuberculosis. Washington DC 1992.
31 Kabus D, Kreiswirth B, Hanna B et al. DNA fingerprinting of tuberculosis in New York City. Abstract 26B. World Congress on Tuberculosis. Washington DC 1992.