The Results of an Independent Peer Review Evaluation Coordinated by the Interagency
Coordinating Committee on the Validation of Alternative Methods (ICCVAM)
and the National Toxicology Program Center for the Evaluation of Alternative Toxicological
Methods (NICEATM).
A Test Method for Assessing the Allergic Contact Dermatitis Potential of Chemicals/Compounds
ACD Allergic Contact Dermatitis
AOO Acetone-Olive Oil
BA Buehler Assay
CAS Chemical Abstracts Service
cRT-PCR Competitive Reverse Transcriptase-Polymerase Chain Reaction
CV Coefficient of Variation
DMF N,N-Dimethylformamide
DMSO Dimethyl sulfoxide
DNCB 2,4-Dinitrochlorobenzene
DPM Disintegrations Per Minute
DTH Delayed-Type Hypersensitivity
ELISA Enzyme-Linked Immunosorbent Assay
FCM Flow Cytometric (Flow Cytometry)
FDA Food and Drug Administration
GLP Good Laboratory Practice Regulations
GPMT Guinea Pig Maximization Test
GPT Guinea Pig Tests (Nonstandard)
HCA Hexylcinnamic aldehyde
HMT Human Maximization Test
HPTA Human Patch Test Allergen
ICCVAM Interagency Coordinating Committee on the Validation of Alternative Methods
IgE Immunoglobulin Class E
IL-2 Interleukin Type 2
IL-6 Interleukin Type 6
i.v. Intravenous
LLNA Murine Local Lymph Node Assay
LNC Lymph Node Cells
MEK Methyl ethyl ketone
NICEATM NTP Interagency Center for the Evaluation of Alternative Toxicological Methods
NTP National Toxicology Program
PCNA Proliferating Cell Nuclear Antigen
PG Propylene glycol
PRP ICCVAM Peer Review Panel Evaluating the LLNA
SD Standard Deviation
SI Stimulation Index
SLS Sodium lauryl sulfate
SOP Standard Operating Procedures
Th1 T-Helper Cell Type 1
Th2 T-Helper Cell Type 2
Peer Review Panel Members
The following individuals served on the Peer Review Panel that evaluated the LLNA on September 17, 1998.
Acknowledgements
The following individuals are acknowledged for their contributions to the peer review process.
ICCVAM Immunotoxicology Working Group (IWG)
Agency for Toxic Substances and Disease Registry William Cibulas, Ph.D. Consumer Product Safety Commission Marilyn Wind, Ph.D. Kailash Gupta, Ph.D. Susan Aitken, Ph.D. Department of Defense Harrold Salem, Ph.D. U.S. Army Edgewood Research, Development and Engineering Center Robert Finch, Ph.D. U.S. Army Center for Environmental Health Development Laboratory, Fort Detrick Army Base John M. Frazier, Ph.D. DOD Tri-Service Toxicology Laboratory, Wright-Patterson Air Force Base Department of Energy Marvin Frazier, Ph.D. Department of the Interior Barnett A. Rattner, Ph.D. Department of Transportation James K. O'Steen George Cushmac, Ph.D. Environmental Protection Agency Richard Hill, M.D., Ph.D. (Co-Chairperson) Office of Prevention, Pesticides, and Toxic Substances (OPPTS) Angela Auletta, Ph.D. Office of Prevention, Pesticides, and Toxic Substances (OPPTS) Karen Hamernik, Ph.D. Office of Prevention, Pesticides, and Toxic Substances (OPPTS) Hugh Tilson, Ph.D. National Health and Environmental Effects Research/Office of Research and Development (NHEERL/ORD) Food and Drug Administration
David Longfellow, Ph.D. Victor A. Fung, Ph.D. National Institute of Environmental Health Sciences William S. Stokes, D.V.M. (Co-Chairperson) John Bucher, Ph.D. Errol Zeiger, Ph.D. Rajendra Chhabra, Ph.D. National Institutes of Health, Office of the Director Louis Sibal, Ph.D. Christina Blakeslee National Institute of Occupational Safety and Health Douglas Sharpnack, D.V.M. Kenneth Weber, Ph.D. National Library of Medicine Vera Hudson Occupational Safety and Health Administration Surender Ahir, Ph.D.
Preface
The Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM), with support from the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), recently sponsored the independent scientific peer review of the validation status of the Murine Local Lymph Node Assay (LLNA), a new test method proposed for assessing the allergic contact dermatitis potential of chemicals. The review was one of the critical components in the ICCVAM process that culminates in achieving regulatory acceptance and implementation of scientifically validated toxicological testing methods. These methods are generally more predictive of adverse human health effects than current methods, and they may be alternative methods that provide for improved animal well-being and that reduce or eliminate the need for animals. These activities were conducted in accordance with public health directives of Public Law 103-43, which directed the National Institute of Environmental Health Sciences to develop and validate improved alternative toxicological testing methods, and to develop criteria and processes for the validation and regulatory acceptance of such methods (NIEHS, 1997).
ICCVAM was established as a collaborative effort by NIEHS and 13 other Federal regulatory and research agencies and programs. The purpose of ICCVAM is to coordinate issues within the Federal government that relate to the development, validation, acceptance, and national/international harmonization of toxicological test methods. The Committee's functions include the coordination of interagency scientific reviews of toxicological test methods and communication with outside stakeholders throughout the process of test method development and validation. The following Federal regulatory and research agencies and organizations participate in this effort:
The LLNA was proposed to ICCVAM in 1997 as a method that could be used as a stand-alone alternative to the Guinea Pig Maximization Test (GPMT) and the Buehler Assay (BA), methods which are currently accepted by regulatory authorities for assessing the allergic contact dermatitis potential of chemicals. The LLNA was proposed by Dr. Frank Gerberick from Procter and Gamble, Dr. Ian Kimber from Zeneca (UK) and Dr. David Basketter from Unilever (UK). Through interactions with the sponsors, an ICCVAM Immunotoxicity Working Group (IWG) composed of Federal employees assembled information for an independent scientific peer review of the method. The IWG reviewed and appropriately augmented the ICCVAM Test Method Submission Guidelines (ICCVAM, 1998) to provide useful guidance to the test method sponsors on the information needed for the review. The initial submission from the sponsors was reviewed by the IWG and additional information was requested. Suggested experts for the peer review panel (PRP) were solicited from Federal agencies and national and international professional societies and organizations. The IWG recommended a PRP composition that would represent a broad range of experience and expertise, including immunotoxicology, clinical immunology, molecular biology, and biostatistics. PRP members were from industry, academia, and government, and included scientists from the US, Denmark, Japan, and Norway. The PRP was charged with developing a scientific consensus on the usefulness and limitations of the new test method for assessing allergic contact dermatitis. In reaching this determination, the PRP was requested to evaluate all available information and data on the LLNA, and to assess the extent to which each of the ICCVAM criteria for validation and regulatory acceptance of toxicological test methods was addressed.
The criteria used for the evaluation are described in the document Validation and Regulatory Acceptance of Toxicological Test Methods: A Report of the Ad Hoc Interagency Coordinating Committee on the Validation of Alternative Methods, NIH publication 97-3981 (ICCVAM, 1997). The PRP was provided with guidance for their evaluation (Appendix E), which included questions from the IWG to ensure that the assessment provided adequate information to facilitate ICCVAM and agency decisions on the method. Test method submission materials were made available to the public and a request for public comments was made via a Federal Register Notice (Appendix G) and other announcements. Information was sought regarding the usefulness of the LLNA, including information about completed, ongoing, or planned studies, and other data or information about the LLNA. All comments and information submitted in response to the request were provided to the PRP in advance of the review meeting. The PRP met in public session on September 17, 1998, at the Gaithersburg Hilton, 620 Perry Parkway, Gaithersburg, Maryland, and an opportunity for public comment was provided during the meeting. PRP members presented their evaluations and proposed conclusions and recommendations on each of the major sections, and the PRP subsequently reached a consensus for each section. Following the meeting, the written evaluations, conclusions, and recommendations were consolidated as this PRP Report. Following the peer review meeting, the IWG prepared a proposed test method protocol (Appendix J) that incorporated the recommendations of the PRP into the original test method protocol submitted by the test sponsors (Appendix D). This protocol may be helpful to regulatory authorities that find the method acceptable for their purposes. Additional data analyses prepared by NICEATM for the PRP are also included as appendices in this document, as is the original test method submission.
This entire report has been reviewed and endorsed by IWG and ICCVAM. This report along with ICCVAM recommendations on the usefulness of the method will be forwarded by ICCVAM to Federal agencies for their consideration. Federal agencies will determine the regulatory acceptability and applicability of this method according to their statutory mandates, and as deemed appropriate, issue guidelines, guidance documents, or proposed changes in regulations. The work of the PRP was truly a team effort, and their thoughtful and unselfish contributions are gratefully acknowledged. While all members contributed to this evaluation, the exceptional efforts of Dr. Jack Dean, who served as the PRP chair, and Dr. Lorraine Twerdok, who served as executive secretary for the PRP, deserve special recognition. The efforts of the IWG, and especially the IWG Co-Chairs Ms. Denise Sailstad and Dr. David Hattan, were instrumental in assuring a meaningful and comprehensive review that would address regulatory needs. Finally, the efforts of the NICEATM staff to ensure accurate analyses and timely distribution of information for the review, particularly Dr. Raymond Tice and Ms. Karen Haneke, are acknowledged. On behalf of ICCVAM, we thank all of the many individuals who contributed to this report.
William S. Stokes, Co-Chair, ICCVAM
Richard N. Hill, Co-Chair, ICCVAM
Executive Summary
For decades, guinea pig assays have been the standard used to assess the allergic contact dermatitis (ACD) potential of chemicals and products. These assays, in highly experienced hands, have considerable credibility, but are subject to false positive and false negative results. Interpretation of the results requires experience and expertise; follow-up testing in humans is sometimes required. In January 1998, the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) received the Local Lymph Node Assay (LLNA) Submission (Submission) from Drs. G. Frank Gerberick (Procter & Gamble, US), Ian Kimber (Zeneca, UK), and David A. Basketter (Unilever, UK) (Sponsors) for peer review. Following the receipt of this Submission, ICCVAM assembled an independent peer review panel (PRP) to evaluate the usefulness of the LLNA for hazard identification of potential human contact sensitizers. The ultimate aim of new ACD assays, such as the LLNA, is to minimize the frequency and severity of sensitization in human populations.
Evaluation of the LLNA Submission was separated into seven sections, with three to five PRP members assigned to conduct an in-depth analysis of each section. This report is organized by these sections, as follows: (1) Test Method Description; (2) Test Method Data Quality; (3) Test Method Performance; (4) Test Method Reliability (Repeatability/Reproducibility); (5) Other Scientific Reviews; (6) Other Considerations; and (7) Related Issues. The evaluations from the seven sections are then summarized in Overall Summary Conclusions. This report focuses on the performance of the LLNA, and some of the critical assumptions (i.e., the potency of the standard allergens) have only been evaluated minimally. A public meeting of the PRP took place on September 17, 1998, in Gaithersburg, MD, to reach conclusions and make recommendations regarding the usefulness of the LLNA for hazard identification. In addition to reaching final conclusions on the analysis by section, the PRP also addressed the following two major questions:
In response to the first question, the consensus of the PRP was that the LLNA results, as submitted and supplemented by the Sponsors, demonstrated that the assay performed at least as well as currently accepted guinea pig methods (GPMT/BA) for the hazard identification of strong to moderate chemical sensitizing agents. The data submitted indicate that the LLNA does not accurately predict all weak sensitizers (false negatives) and some strong irritants (false positives). The term weak sensitizer is somewhat arbitrary, since the terms weak, moderate, and strong apply to the percentage of animals reacting in the GPMT/BA as described in the published literature or papers submitted by the Sponsors. When comparing the LLNA with currently accepted methods (i.e., guinea pig methods), the LLNA appears to provide an equivalent prediction of the risk for human ACD. The review involved the evaluation of data on 209 chemicals, of which both LLNA and guinea pig data were available for 126 chemicals and both LLNA and human (HMT and HPTA) data were provided for 74 chemicals. An in-depth review of all the chemicals that have been defined in the published literature as human allergens was not conducted for this evaluation. From the analysis generated during the review process, the accuracy of the LLNA vs. GPMT/BA was 89% (N=97), LLNA vs. all guinea pig tests (GPT) was 86% (N=126), the LLNA vs. human data was 72% (N=74), GPMT/BA vs. human was 72% (N=57), and all guinea pig tests (GPT) vs. human was 73% (N=62). In terms of accuracy, sensitivity, specificity, and positive and negative predictivity, the PRP found the performance of the LLNA to be similar to that of the GPMT/BA. Equally important, the performance of the LLNA and the GPMT/BA was similar when each was compared to human data (HMT/HPTA). Performance calculations may be found in Tables 2 and 3 of this report. The PRP also agreed that the LLNA has several advantages over guinea pig methods for the following reasons:
Possible assay weaknesses (e.g., false negative results with some weak sensitizing agents and metals, false positive results with some strong irritants) were identified. It was recommended that these should be evaluated in future workshops. Also, data to support the testing of mixtures in the LLNA were not provided and the evaluation of pharmaceuticals was limited. In response to the second question, the PRP concluded that the LLNA offers several advantages with respect to animal use refinement compared to conventional guinea pig methods in that it involves less pain and distress. The method evaluates the induction phase and not the elicitation phase of the response, which significantly reduces the distress suffered by mice used in the LLNA when compared to guinea pig procedures (GPMT/BA). Furthermore, Freund's adjuvant is not used, and there is a substantial reduction in time required to perform the assay. Animal usage may also be reduced (protocol-dependent). In summary, the PRP unanimously recommended the LLNA as a stand-alone alternative for contact sensitization hazard assessment, provided that the following protocol modifications were made:
Additionally, the PRP recommended that retrospective data audits be conducted on at least three of the intra- and inter-laboratory LLNA validation studies conducted by the Sponsors. The panel commented that as additional experience is gained with the LLNA, there will be an opportunity to refine these interpretations. Further, the PRP concluded unanimously that the LLNA is a definite improvement with respect to animal welfare (i.e., refinement and reduction) over the currently accepted GPMT. The LLNA test as proposed measures lymphocyte proliferation using incorporation of 3H-methyl thymidine in draining lymph nodes of animals topically exposed to the test article. The measured lymphocyte proliferation response is an essential biological element in the induction phase of sensitization. In contrast, currently used guinea pig assays measure skin reactivity to a secondary challenge with the substance under investigation. It may even be argued that for hazard identification, sensitization (the primary immune response) is more relevant than the secondary response (eczematous reaction) of challenged skin. Sensitization is a prerequisite for ACD, and it is sensitization that constitutes the hazard. In a sensitized person, be it a respiratory or contact allergy, an allergic disease manifestation will not always develop upon challenge: there are individual-dependent factors, dose and mode of exposure factors, and adjuvant effects (including irritant potential and substances that increase skin penetration). All of these factors can be considered part of the risk assessment process rather than hazard identification. In the guinea pig models, hazard is combined with a set of defined risk conditions (secondary challenge conditions) and disease-analogous skin responses are measured. 
Thus, because of its pivotal role and obligatory presence in the process of allergic sensitization, cellular proliferative activity in the lymph node(s) draining the area of skin exposed to the substance under investigation must be considered an important and biologically relevant parameter in relation to contact allergy. In the proposed LLNA, increased levels of radioactive thymidine or uridine incorporation, measured from lymph nodes draining the application site, result from increased proliferation of cells in the lymph node at the time of chemical exposure and of cells that migrate to the lymph node because of the chemical exposure. Thus, there are two mechanisms behind an increased stimulation index with the current protocol: a net influx of lymphoid cells/increase in cell numbers, and an increased proliferative rate. A stimulation index (SI) ≥ 3 may predominantly reflect an increase in cell numbers and/or an increased proliferative activity (per cell) of cells residing in the lymph node. This dual response probably increases the sensitivity of the test, because it measures the additive effect of two biological phenomena.
1.1. Sufficiency of test method and protocol description
The Submission contains a thorough protocol. The scientific basis for the test is described as the measurement of the incorporation of 3H-methyl thymidine into lymphocytes in draining lymph nodes of animals topically exposed to the test article, as a measurement of sensitization. The endpoint of interest is stated clearly (SI ≥ 3). The proposed protocol provides sufficient detail such that appropriately trained personnel should be able to properly conduct independent studies. Dosing procedures, including the preparation and disposal of dosing solutions, are clear. The protocol specifies that the test article be applied to the dorsal aspect of the ear.
Dosing only the dorsal aspect of the ear, as opposed to splitting the dose between the dorsal and ventral aspects, increases the concentration of chemical exposure per surface area. Information is provided on the appropriate choice of vehicles and the selection of doses, including the need to assess for a dose-response relationship. Problems associated with choice of vehicles and concentrations to be tested are discussed in Section III. The range of applications of the method is described in the Submission. It is implied but not directly stated that the method is to be used for low molecular weight organic chemicals and that the assay has not been validated for all metals or larger molecular weight compounds, such as proteins. The majority of the supporting data represents the testing of simple chemicals. One publication was included in the Submission on the testing of pharmaceuticals (Kimber et al., 1998), although the number of pharmaceuticals tested was limited. The use of the LLNA to assess the skin sensitizing potential of mixtures and extracts was also not addressed in the Submission or by the PRP. Safety issues relating to the handling of chemicals and radioisotopes were well presented. Appropriate forms for record keeping were included as an appendix to the Submission. Acceptable variations in the protocol (e.g., the choice of animal strains, the number of mice per dose group, and the choice of vehicles) are described and prioritized. Although the use of different vehicles is described, the majority of the data presented in the Submission resulted from test articles applied in acetone-olive oil (AOO). The majority of the data were analyzed from pooled animals per group. However, the PRP strongly supports the analysis of data from individual animals. An aspect of the protocol that could cause differences in procedure between laboratories is the description of the lymph nodes to be assayed.
These nodes, referred to as the auricular lymph nodes, are a designation for nodes draining the ear. Given that this is not standard anatomical nomenclature, it is possible that different laboratories could be removing different nodes for evaluation. To the best of the reviewers' knowledge, there is no specific nomenclature for this set of lymph nodes. The anatomical location (e.g., diagram or photograph) of the auricular lymph nodes would be a beneficial addition to the protocol. Furthermore, it should be noted that locating the proper lymph nodes might be difficult when there is no induction by the test material. It is suggested that inexperienced personnel practice with a known sensitizer until competence is obtained.
1.1.1. Adequacy of agreement between the protocol used to generate Submission data and the proposed protocol
Much of the data presented in support of the Submission were collected by following the proposed protocol. In some cases, slight modifications were made. Variations from the protocol included the use of four days of consecutive dosing instead of three, and the use of 125I-iododeoxyuridine instead of 3H-methyl thymidine. In cases where variations occurred between laboratories in inter-laboratory validation studies, similar results were obtained from modified protocols (Kimber et al., 1995; Loveless et al., 1996). Information on variations in the protocol used for each of the chemicals included in the provided LLNA database would have been useful in understanding the total experience with the current "standard" protocol. In most instances, there is no clear rationale for the choice of one modification over another. Having a two-day rest period prior to injecting with 3H-methyl thymidine instead of one day is more convenient in a setting where people are working five-day weeks. There has been much more experience with the use of 3H-methyl thymidine as compared to 125I-iododeoxyuridine in the LLNA.
Following discussion, the PRP recommended allowing the use of either 3H-methyl thymidine or 125I-iododeoxyuridine. 125I-iododeoxyuridine has a shorter half-life, which results in lower costs associated with radioactive waste disposal.
1.1.2. Appropriateness of dose selection procedure
The dose selection process as defined by the protocol is based on previous experience in guinea pig tests, structure analysis, and solubility factors. If the LLNA is to be used as a 'stand-alone' assay on new substances, reference to guinea pig tests is inappropriate. Where no information is available, concentrations to be tested should be based on toxicity, solubility, and irritancy. The standard protocol states that three to five concentrations are selected among ten possible dose levels ranging from 0.1% to 100%. The published LLNA tests are usually performed by testing the substance of interest using a minimum of three concentrations. It is crucial to test high concentrations to avoid false negatives. An example of this potential problem is with ethylenediamine (free base) in Table 3 of Assessment of the Skin Sensitization Potential of Topical Medicaments using the Local Lymph Node Assay: An Interlaboratory Evaluation (Kimber et al., 1998). Ethylenediamine would have been classified as nonsensitizing if concentrations of 0.1 to 1.0% had been selected. Strong sensitization responses were observed at concentrations of 5.0 and 10% in AOO. Some other well known allergens require high concentrations to yield an SI ≥ 3 (e.g., eugenol, hexyl cinnamic aldehyde, and penicillin G) (Montelius et al., 1998). For much of the data presented in the Submission, compounds were not tested at the highest possible concentrations and solubility data were not provided. The PRP recommends that a rationale for the selection of vehicle as well as for concentrations tested be included for each test article. Discussion of this issue is included in Section III.
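The ethylenediamine case described above can be illustrated with a minimal sketch. It assumes that a test article is called positive when any tested concentration reaches SI ≥ 3. The function name and the SI values are hypothetical placeholders chosen for illustration; they are not data from Kimber et al. (1998).

```python
SI_THRESHOLD = 3.0  # decision criterion discussed in this report (SI >= 3)


def is_positive(si_by_concentration, tested):
    """Return True if any tested concentration (in percent) reaches the SI threshold."""
    return any(si_by_concentration[c] >= SI_THRESHOLD for c in tested)


# Hypothetical SI values keyed by concentration in percent; NOT measured data.
hypothetical_si = {0.1: 1.1, 0.5: 1.4, 1.0: 1.9, 5.0: 4.2, 10.0: 6.0}

# Testing only the low end of the range misses the response entirely.
low_range_only = is_positive(hypothetical_si, tested=[0.1, 0.5, 1.0])

# Including higher concentrations reveals the sensitizing response.
with_high_doses = is_positive(hypothetical_si, tested=[1.0, 5.0, 10.0])

print(low_range_only, with_high_doses)
```

Under these assumed values, the same substance is classified differently depending only on which concentrations were selected, which is why the PRP asks for a documented rationale for the concentrations tested.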
No information was provided regarding the need for determination of dermal irritation or acute toxicity data prior to conducting the actual test. If one assumes that irritation is not a confounding issue in the LLNA as it is in the guinea pig assays, where the end point is a measurement of erythema and edema, then there are benefits to being able to test higher concentrations of compounds. If one were limited to testing non-irritating concentrations of highly irritating compounds, it is possible that concentrations high enough to reach a sensitizing dose may not be tested, resulting in false negative responses. Although several reports have presented data where exposure to highly irritating concentrations of chemicals resulted in an SI ≥ 3, the Sponsors have addressed the issue of irritation and suggest that proliferation induced by irritation may be non-dose responsive and rarely exceeds the required three-fold increase in SI over control to predict sensitization potential. The Sponsors have stated that local or systemic toxicity may result in a suppression of the response at high doses. It is possible that, in the absence of preliminary toxicity testing, using toxic concentrations of chemicals may result in the need for repeat studies. The protocol does not specify that animals be weighed at the beginning and end of the study. Collecting weight gain data is recommended, because it would allow for an evaluation of toxicity that may be useful in assessing data in which a decline in the dose-response relationship is seen at high doses. To collect animal weight data, identification of individual animals is required. Individual animal identification is also a requirement for studies performed in compliance with Good Laboratory Practice (GLP) regulations. Additional comments relating to irritation were made by PRP members.
The PRP members questioned whether a grading system for dermal irritation should be developed to quantify the degree of skin irritation at the treatment sites. It is not clear what prevents the application of a severe irritant or a corrosive substance. Further, the PRP questions whether there is a need for a prestudy screen of the irritation potential of the test material. Although solubility and potential toxicity may influence the concentrations that will be used in a test, the protocol does not provide clear guidance on the selection of a concentration for the performance of the assay.
1.1.3. Appropriateness of the number of dose groups
The protocol specifies that a vehicle group and three to five test groups be assayed. Assuming that the appropriate concentrations are chosen (see Section 1.1.2 above), this study design is appropriate for a toxicology study. However, in the absence of any data on toxicity or solubility, details regarding how test concentrations should be chosen are necessary.
1.2. Adequacy and completeness of the test method protocol
1.2.1. Test method material and equipment, and animal usage
The test method protocol is detailed and provides sufficient information on materials and equipment needed and technical procedures, such that trained personnel should be able to conduct the LLNA. The appendix of the Submission provides details on reagent preparation and sample sheets for record keeping. The LLNA is analyzed based on a comparison of the mean DPM from treated animals as compared to controls. This differs from the scoring of the guinea pig assays in which a test substance is scored as positive or negative based on the percentage of animals in a group which are responders (15% in a nonadjuvant assay and at least 8% in an adjuvant test) (Marzulli and Maibach, 1996). The guinea pigs used in these assays are outbred animals with a greater genetic variability than the inbred mice chosen for use in the murine LLNA.
Test results have shown that, based on using an SI ≥ 3 as the sole criterion for determining a positive response in the LLNA, an N of four or five mice per test group provides comparable results to the guinea pig tests with 10 to 20 animals. The specified age range of 8 to 12 weeks is appropriate for immunotoxicological studies. Mice become immune competent at approximately six to eight weeks of age (Shultz and Bailey, 1975; Tyan, 1981). The strain chosen is a known Th1 (T-helper cell type 1) responder. However, the choice of strain has been made without a systematic comparison of alternatives. There is adequate documentation for the influence of genetic factors on contact allergy, although there is less documentation on how important a role this might have in testing. There is adequate documentation that inbred mouse strains differ in delayed-type hypersensitivity (DTH) reactions to antigens (Shultz and Bailey, 1975). Few studies have been conducted to compare the responsiveness of other inbred mouse strains to the CBA mouse in the LLNA. The documentation in the paper cited on this point (Kimber and Weisenberger, 1989) is preliminary, with only one (strong) sensitizer (2,4-dinitrochlorobenzene [DNCB]), and with a protocol different from the one submitted to ICCVAM. A range of sensitizers should be tested in parallel in a number of representative inbred strains of mice before another strain can be considered validated. A better description of the responder properties of various mouse strains would be useful for evaluation of the robustness of the LLNA. Different lines of mice within a given strain (i.e., substrains) show genetic differences and will drift further apart genetically over time. Substrains may differ in their immune responses; one example is the DTH response to mycobacterial antigens in different substrains of C3H mice (Løvik et al., 1982).
If different mouse strains are found to differ significantly in their LLNA response and genetic factors play a role, one obvious measure to help avoid false negatives would be to retest (suspicious) negative substances in a different strain of mice. Documentation provided (Kimber et al., 1998) suggests that for some CBA substrains, substrain differences have minimal effect on the LLNA response. The Sponsors' protocol permits the use of both male and female mice, but only one sex in each experiment is proposed. Female CBA mice have been shown to develop a stronger contact dermatitis response as compared to males (Ptek et al., 1988). Furthermore, males are considered to show larger variation because of a greater tendency to fight and to be involved in social ranking processes if group housed. However, this clearly is mouse strain-dependent. In the future, the use of both genders of mice might offer economic advantages, both for institutions breeding their own mice, and for users who buy their mice from commercial breeders. The documentation supplied is with female mice only. If the protocol permits the use of male mice, systematic studies on sex differences in the response should be documented.

1.2.2. Test method data collection procedure

The protocol adequately describes the measurement of the incorporation of 3H-methyl thymidine into proliferating lymphocytes in draining lymph nodes as a measure of sensitization. However, there appear to be two methods of performing the assay, one based on using lymph node samples pooled across mice within a treatment group (favored by the European collaborators) and another based on individual animal responses (favored by the American collaborators), which is evident in reviewing the publications from the inter-laboratory validation studies. It appears that an assessment of DPM in lymph nodes from individual animals is preferable to using lymph nodes pooled within a dose group to determine radioisotope incorporation.
The pooled approach precludes statistical analysis of the data, which should be used to aid in result interpretation. Thus, the draft protocol should be modified to recommend only the collection and analysis of individual animal data.

1.2.3. Data analysis, evaluation, and decision criteria

The protocol allows for pooling of the draining lymph nodes from multiple mice within each test group or the analysis of pooled nodes from individual animals. The mean DPM for each test group is compared to the control group, and if the mean DPM of a test group is at least three-fold higher than that of the concurrent control (SI ≥ 3), the test chemical is considered to be a sensitizer. The Sponsors state that the three-fold increase is an arbitrary number chosen based on the performance of the assay with a group of known sensitizers. Extensive analysis performed by NICEATM with the assay supported the three-fold increase as an adequate indicator of the sensitizing ability of chemicals. The Sponsors state that the three-fold factor takes into consideration the variability within and between groups and allows for the assumption that irritation may elicit a low level of lymphocyte proliferation. The PRP had significant concerns about the lack of emphasis on statistical analysis in the Submission. Pooling lymph nodes from animals by dose group for radioisotope incorporation, rather than evaluating lymph nodes from individual animals to estimate the SI, does not represent replicate testing and precludes any statistical analysis of the data. Statistical analysis would definitely benefit the LLNA protocol. It would indicate whether or not an apparently high SI (≥ 3) is due to chance variation (e.g., see Table 4, Kimber et al., 1995), thereby reducing possible false positives. It may detect whether an apparently low SI (<3) for a particular compound is statistically higher than can be explained by chance variation, and may thereby reduce the number of potential false negative responses.
In both of these situations, the statistical results would at least call into question the decision based solely on SI, and thus suggest a retest. Additionally, the evaluation of individual animal data provides for trend analysis to confirm dose responsiveness. However, not all statistical differences are biologically meaningful or relevant for regulatory decision making. It is a practical question whether the qualitative statement from a statistical test is sufficient, or whether a quantitative element/magnitude of the difference also has to be considered. The SI represents one such quantitative parameter. Similar combinations of statistical and practical decision rules are used in genetic toxicology tests. Although the statistical significance of an observed response is very important, no rigid statistical decision rule should be the sole factor in determining the biological significance of a skin sensitization response. Other factors that should be considered include the magnitude of the effect (SI ≥ 3), the strength of the dose-response relationship, chemical toxicity and solubility, and the consistency of the (positive and negative) control response with other contemporary studies. It is the recommendation of the PRP that data be generated by analyzing lymph nodes from individual animals. This view was supported by individuals at the Public Meeting representing regulatory agencies. This would allow for the use of a SI ≥ 3 for identifying positive responses, while the dose-response relationship, evaluation of incidence, and statistical analysis may be used as aids in evaluating test results. Use of individual animal data allows for a formal statistical analysis of whether or not an elevated SI is significant relative to controls. These results can be used in conjunction with the three-fold SI rule to determine the skin sensitization potential of the test chemical. The following guidelines should be considered.
The calculated measure of response (SI) will generally be simply the ratio of the mean DPM responses in the dosed and control groups. However, the investigator should be alert to possible "outlier" responses for individual animals within a group that may necessitate the use of an alternative measure of response (e.g., median rather than mean) or elimination of the outlier. Each SI should include a measure of variability that takes into account the inter-animal variability in both the dosed and control groups. For example, dividing each dosed group animal response by the mean control response and calculating the SD of these ratios does not take into account the variability inherent in the control group. The SI is a ratio of two random variables, and the formula for the SD of this ratio is available in many standard statistical textbooks. The statistical analysis should include an assessment of the dose-response relationship as well as pairwise dosed group vs. control comparisons. In choosing an appropriate method of statistical analysis, the investigator should maintain an awareness of possible inequality of variances and other related problems that may necessitate a data transformation or a nonparametric statistical analysis.

1.3. Positive, negative, and irritation control chemicals

The protocol does not adequately address the use of controls. The protocol specifies the inclusion of a vehicle control but not a positive or irritation control. The inclusion of a single concentration of a moderate grade sensitizer as a concurrent positive control would provide validity to the assay indicating that all procedures involved in the assay were conducted properly. In addition, a positive control will provide a standard to compare between studies and laboratories. Regulatory agency representatives present at the public meeting supported the need for a concurrent positive control with each assay.
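As an illustration of the guideline in Section 1.2.3 that each SI should carry a measure of variability reflecting both the dosed and control groups, the following Python sketch computes the SI from individual animal DPM values together with an approximate standard error. It assumes the two groups are independent and uses the first-order (delta method) approximation for the variance of a ratio of two means; this is one common textbook formula, not a method specified in the Submission. The function name and the DPM values are hypothetical.

```python
import math
import statistics


def stimulation_index(treated_dpm, control_dpm):
    """Return (SI, approximate standard error of SI).

    SI is the ratio of the mean DPM in the dosed group to the mean DPM
    in the vehicle control group. The standard error uses the first-order
    delta-method approximation for a ratio of two independent means:
        SE(SI) ~= SI * sqrt(s_t^2 / (n_t * mean_t^2) + s_c^2 / (n_c * mean_c^2))
    so variability in BOTH groups contributes (an assumption of this sketch).
    """
    mean_t = statistics.mean(treated_dpm)
    mean_c = statistics.mean(control_dpm)
    si = mean_t / mean_c
    # Squared relative standard error of each group mean.
    rel_var_t = statistics.variance(treated_dpm) / (len(treated_dpm) * mean_t ** 2)
    rel_var_c = statistics.variance(control_dpm) / (len(control_dpm) * mean_c ** 2)
    return si, si * math.sqrt(rel_var_t + rel_var_c)


# Hypothetical individual-animal DPM values (five mice per group).
control = [100, 120, 80, 100, 100]
treated = [300, 360, 240, 300, 300]

si, se = stimulation_index(treated, control)
print(f"SI = {si:.2f}, approximate SE = {se:.2f}")
```

A result such as this would still need to be read alongside the dose-response relationship and an appropriate significance test; the sketch only addresses how variability in the control group can be carried into the reported SI.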
The PRP recommends the use of a positive control in the form of a sensitizer inducing a moderate response. Based on the criteria set for the evaluation of the LLNA, there is no need for an irritation control.

1.4. Dose response interpretation

The dose-response relationship is an advantage of this method and becomes important in the evaluation of equivocal results. The ability to evaluate multiple concentrations of the chemicals is an advantage of the LLNA because it provides added confidence that compounds that are skin sensitizers will be detected. The Sponsors have designated a SI ≥ 3 as the limit for classifying a chemical as a sensitizer. In equivocal cases where the SI does not reach three-fold, but there is a positive dose response, repeating the study to assess reproducibility may be appropriate. Also, the dose response relationship allows for the evaluation of potential systemic toxicity. In cases where a suppressed response is seen at high doses, the dose response may allow for recognition of a toxic response.

1.5. Strengths and/or limitations

The strengths of the LLNA are its quantitative nature, the inclusion of a dose response relationship, the ability to test colored substances, improved animal welfare, and the reduction in the time required to conduct a study. The usefulness of the method for testing mixtures and extracts was not addressed in the proposal. Some strong irritants and sensitizing metals appear to be problematic for the LLNA. A failing of the LLNA, as described, is its inability to identify some metal salts as contact allergens. Ikarashi et al. (1992a; 1992b; 1993) suggest that the use of DMSO as a vehicle results in a positive LLNA test when metal salts, including nickel and copper salts, are applied to the skin. To better evaluate interlaboratory comparisons, the PRP would like to have seen more data generated from blinded studies.

1.6. Editorial/technical corrections

The PRP found the protocol to be well written and easy to follow.

1.7.
Conclusions

The PRP found the recommended protocol to be thorough. The strengths of the assay were seen as its mechanistic basis, quantitative endpoint, and the inclusion of a dose response relationship. Weaknesses were seen as the assay resulting in false negatives (e.g., some metals and some clinically relevant allergens) and false positives (e.g., some irritants). Furthermore, there is limited experience with pharmaceuticals and mixtures/extracts. The value of adding a concurrent positive control was seen as providing validity to the assay and giving a standard by which to compare between studies and laboratories. It is crucial to test high concentrations of test materials to avoid false negatives. The choice of the highest concentrations tested should be based on solubility and toxicity. The choice of suitable vehicles is described and prioritized. However, the majority of the data presented in the Submission resulted from exposure to test articles applied in AOO.

1.8. Recommendations

The following changes to the protocol were recommended:
Validation studies appear to have been conducted in the "spirit" of Good Laboratory Practice (GLP) (or Good Research Practice) as determined by standard operating procedures (SOP) at the individual institutions. Formal audited reports were not prepared because the data were primarily intended for publication. By definition, without an audited final report, a study does not conform to GLP. Data record forms in the sample protocol (Appendix D) and supplemental individual animal data supplied solely for PRP review indicated that record-keeping and data collection were adequate.

2.1. Protocol consistency during validation

Assurance was not provided to indicate adherence to a standard protocol during the validation studies. Early validation studies were conducted before a standard protocol was available; thus, slight procedural variations occurred as described in the next section. Two protocol modifications were intentionally introduced during the later validation studies.

2.2. Protocol variations and modification during validation

Several variations/modifications of the standard protocol are described in the validation studies. These variations and modifications included:
However, data based on using a four-day treatment protocol were not included in the database and this modification is currently not considered acceptable. Procedural variations nos. 2 to 4 are difficult to identify as true changes or modifications of the standard protocol, since they appeared to have more to do with how a particular laboratory performed the LLNA, rather than being an intentional modification for assay optimization. With the available documentation, in most cases it was not possible to distinguish which studies used which of these modifications. Consequently, a rigorous evaluation of the effects of these four protocol variations on test results was not possible. Modification nos. 5 and 6 were intentional modifications and are clearly described in Kimber et al. (1998). The justification for these two modifications was to evaluate the effects of slight modification on the predictive value of the test. This justification is adequate and, overall, these variations and modifications did not significantly alter test results, indicating that the LLNA is relatively insensitive to minor variations in procedure.

2.3. Data audits

In the absence of formal audited reports and GLP compliance statements, it is not possible to determine if data audits were conducted by Quality Assurance Units. The Sponsors state that much of the data presented in support of the Submission were derived from audited GLP compliant studies (Appendix C), implying that data audits were conducted. Additionally, the Sponsors state that, with retrospective audits, GLP compliance statements could be issued for the great majority of substances tested. The integrity of the validation data is also supported by the fact that all interlaboratory validation data were made available to, and scrutinized by, all participants.

2.4.
Recommendation

Due to lack of representative quality assurance and GLP documentation in the Submission, it is recommended that data quality and adherence to protocol (in individual studies) be confirmed by retrospective auditing of at least three individual LLNA studies. The studies should be selected by NICEATM from those conducted in the later phase of the interlaboratory validation, and should include laboratories from both the US and UK.

3.1. Data presentation

The Sponsors' Submission applies a three-fold SI for evaluating the sensitization potential of a chemical using the LLNA. The Sponsors' initial Submission, which included only a table of "+" and "-" data, did not provide sufficient detail for the comprehensive evaluation of the LLNA. However, subsequent literature evaluation (Basketter and Scholes, 1992; Basketter et al., 1994; Basketter et al., 1996a; Basketter et al., 1998; Gerberick et al., 1992; Kimber et al., 1990; Loveless et al., 1996) carried out by NICEATM and PRP members provided more detailed information on SI for a majority of the chemicals evaluated. This compilation permitted a more definitive evaluation of LLNA performance, in particular, the application of the SI ≥ 3.0 rule and the determination of sensitivity and specificity of the assay in comparison to the GPMT/BA and human sensitization data. There were minor data inconsistencies, including double reporting under chemical synonyms for one chemical, inaccurate reporting of whether or not a standard guinea pig test method was used, and minor omissions in the Submission. Most of these inconsistencies were resolved during the review process and in discussions and teleconferences with the Sponsors. Comparison to literature citations confirmed the accuracy of almost all of the LLNA classifications provided by the Sponsors.
However, the PRP could not confirm positive results (but did confirm negative results) reported for aniline, 4-chloroaniline, streptomycin sulfate, or a -trimethyl-ammonium 4-tolyoxy-4-benzenesulfonate, nor the equivocal result reported for neomycin sulfate. These chemicals were considered negative in the analysis of LLNA assay data, although it is recognized that unpublished data may exist that would support a positive call. Hydroquinone and quinol had the same CAS number and were changed to a single listing. Benzoic acid and glycerol were tested using a non-standard LLNA protocol and, in agreement with the Sponsors and consistent with other similar data, excluded from further consideration. Benzocaine yielded equivocal LLNA results among six separate studies and was excluded from subsequent performance evaluations. The revised data are compared to the Submission in Table 1.

The LLNA was validated for hazard identification of chemicals, as defined by the National Research Council (NRC, 1983) with a proclivity to produce ACD. The LLNA assesses the induction process and does not assess the elicitation process. ACD refers to an immunologically mediated process in man or animal that is characterized by redness and swelling of the skin and is a cell mediated (type IV) process (Kawabata et al., 1996). For the purposes of this report, the LLNA assesses type IV hypersensitivity and no attempt has been made to validate this assay for immediate hypersensitivity and contact urticaria syndrome.

Table 1. Comparison of Original and Revised Concordance Between the LLNA and Guinea Pig Tests
3.2. Adequacy of the test method performance evaluation

There is a century of experience on the identification of chemicals that produce ACD in man. The definition of ACD in man is operational in nature in that several components are required for verification: these include history, physical examination, diagnostic patch testing with appropriate controls, and natural history after removal of the contact allergen. For this review, the PRP compared the LLNA against guinea pig data and compared both the LLNA and guinea pig test data against human data, where available. This PRP did not conduct an in-depth review of all the chemicals that have been defined in the published literature as human allergens. The PRP, with the assistance of NICEATM, compared the LLNA to the guinea pig assays in terms of specificity, sensitivity, positive and negative predictivity, and accuracy. The purpose of this evaluation was to determine if the LLNA, as a test for hazard identification, is equivalent to or superior to the guinea pig assays. To accurately make that comparison, the guinea pig assay would have to undergo the same rigorous evaluation as the LLNA. The PRP is not aware of any such evaluation. Although much effort was expended to compare the LLNA to the GPMT/BA, the goal of LLNA testing is for hazard identification and to prevent human sensitization. Thus, the PRP attempted to compare the performance of the LLNA to available sources of human data that were viewed as the "gold standard." Of the 209 chemicals tested in the LLNA, 97 were also tested in the GPMT/BA, an additional 29 were tested using non-standard guinea pig tests, and 39 were tested using the human maximization test (HMT). Inclusion of compounds that are included in human patch test allergen (HPTA) panels expanded the comparative human data set to 74 compounds. These human data were not further validated as that would have required an exhaustive study of the literature to determine their potency.
Thus, these data should be considered with the caveat that a few of the HPTA compounds may cause human sensitization only infrequently. Several deficiencies in the Submission materials were noted by the PRP. Since the choice of vehicle may be problematic in the LLNA, vehicle effects should have been more thoroughly evaluated. Acetone or AOO appeared to be the preferred vehicle in most studies, followed by N,N-dimethyl formamide (DMF), methyl ethyl ketone (MEK), propylene glycol (PG), dimethyl sulfoxide (DMSO), and saline or 50% acetone/saline. There are very few data available on vehicles other than AOO, DMF, and DMSO. It is desirable that predictive animal tests be performed with vehicles relevant for human exposure where possible. The choice of vehicle may be decisive for the determination of the SI. For instance, olive oil may pose problems in the LLNA since it is reported as an allergen giving an SI of 16 to 23 when tested at 100%, and 2.9 to 3.6 when tested as AOO (4:1) (Montelius et al., 1996). The choice of test concentrations is also crucial to the proper performance of the LLNA. It is given in the standard protocol that "three to five concentrations are selected among ten possibilities ranging from 0.1% to 100%." The preponderance of data is based on tests performed using three concentrations. It appears that some well known allergens require high concentrations to yield a SI ≥ 3 (e.g., eugenol, hexylcinnamic aldehyde, ethylenediamine, and penicillin G). For some non-sensitizing irritants (e.g., nonanoic acid and methyl salicylate), it appears that high concentrations yield a SI ≥ 3 (Montelius et al., 1998). It was not stated clearly enough in the Submission that the range of concentrations tested may be decisive for the result.

3.3. Adequacy of the numbers of chemicals/products evaluated

There have been a substantial number of chemicals and classes of chemicals tested using the LLNA to evaluate its performance.
Few other toxicological assays have had this type of rigorous evaluation prior to use. However, the PRP noted that several classes of compounds for which the LLNA has been used were under-represented in the Submission. These include some weak sensitizers, irritants, organometals, and petroleum additives. The PRP noted that preferential testing of potent and moderate sensitizers over weak sensitizers would tend to yield better performance data for the LLNA than would be expected in general use for hazard assessment. The PRP disagrees with the statement in the Submission (Appendix C, page C-22) that a LLNA false negative for nickel sulfate is " . . . as unsurprising as it is unimportant" since ". . . new metals are not being invented." The PRP recognizes the importance of LLNA testing of new organometals, particularly in the petroleum additives industry. Data derived from the testing of coded samples in blinded studies would have allowed for a better comparison of LLNA performance to guinea pig and human data. The PRP is aware that such data exist but that they were considered proprietary and were not available for analysis.

3.4. Adequacy of test method performance data

There is consensus among the PRP that with the inclusion of the additional material requested of the Sponsors, plus that drawn from published sources, sufficient information was available to evaluate the LLNA. As stated above, additional data for weak sensitizers, some irritants and certain metals, plus data from blinded studies, would have added further rigor to the review.

3.5. Sensitivity, specificity, concordance, false positive rate, and false negative rates

The revised database described above and included in Appendix A was analyzed to determine sensitivity, specificity, false positive and false negative rates, and accuracy of the methods compared to guinea pig and human data. The results of these analyses are tabulated below in Tables 2 and 3.
Table 2 is based on analysis of all available data for each comparison; Table 3 is limited to compounds for which there are LLNA, guinea pig and human sensitization data for the same compound.

3.5.1. Prediction of non-sensitizers

According to a chi-square evaluation, there is a significant association between the LLNA and guinea pig test (GPMT/BA plus GPT) classification of positive and negative sensitizers (p value < 0.001). Based on 126 compounds (93 guinea pig positive and 33 guinea pig negative), the LLNA exhibited a sensitivity of 87%, specificity of 82%, and accuracy of 86%. The predictive value of a positive test was 93% and the predictive value of a negative test was 69%. The latter value suggests that the LLNA is more likely than guinea pig tests to identify compounds as non-sensitizers. However, the predictive value of a negative test when compared against the GPMT/BA only was 80%. From a regulatory standpoint, false negatives are of greater concern than false positives. In comparison to the human data, the LLNA exhibited a sensitivity of 72%, specificity of 67%, and accuracy of 72%. The predictive value of a positive test was 96% and the predictive value of a negative test was 17%. GPT gave a similar value for negative predictivity. It should be recognized that this latter value was based on only four human non-sensitizers. These analyses were also performed applying different SI values to establish a LLNA result as positive. As shown in Table 4, no overall improvement in accuracy was demonstrated if a SI of 2.0, 2.5, 3.5 or 4.0 was chosen instead of 3.0. A higher threshold improves the specificity but reduces the sensitivity. A SI ≥ 3 provided better concordance with guinea pig tests than the other thresholds tested.
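The performance measures quoted in this section all follow from a 2x2 cross-classification of LLNA results against the reference test. The Python sketch below shows the arithmetic for the LLNA versus guinea pig comparison. The four cell counts are an assumption: they were back-calculated here from the reported totals (93 guinea pig positives, 33 guinea pig negatives) and the reported percentages, and were not taken from Table 2. The function name is hypothetical.

```python
def performance(tp, fn, tn, fp):
    """Standard 2x2 performance measures against a reference test.

    tp: reference positive, LLNA positive    fn: reference positive, LLNA negative
    tn: reference negative, LLNA negative    fp: reference negative, LLNA positive
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive predictivity": tp / (tp + fp),
        "negative predictivity": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts, back-calculated from the reported percentages
# (93 guinea pig positives, 33 guinea pig negatives); not source data.
metrics = performance(tp=81, fn=12, tn=27, fp=6)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these assumed counts the rounded values agree with the figures reported above (87%, 82%, 93%, 69%, and 86%). The sketch also makes visible why negative predictivity is sensitive to small numbers of reference negatives: its denominator contains only the compounds the LLNA called negative.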
Table 4. Influence of the Threshold SI on Sensitivity and Specificity
Using human response data as the "gold standard", three compounds (aniline, nickel sulfate, neomycin sulfate) were false negatives in the LLNA and one (sodium lauryl sulfate [SLS]/sodium dodecyl sulfate) was a false positive in the LLNA. The GPMT/BA registered four false negatives (musk ambrette, ammonium thioglycolate, ethylene glycol dimethacrylate, neomycin sulfate) and no false positives. While these data show one more false positive for the LLNA than the GPMT/BA, the rates of mis-classification for both are low and not significantly different.

3.5.2. Prediction of positive sensitizers

The LLNA shows a high concordance with human data and guinea pig test data for strong and moderate sensitizers. The Sponsors reported a 93% positive predictivity in comparison with the guinea pig assays. Improvements in the LLNA should be targeted toward enhancing the detection of weak sensitizers. It is the opinion of some of the PRP members that improved detection of weak sensitizers may be accomplished using the LLNA if the number of exposures (or dose groups) and the number of animals were increased. However, from some false negative cases, the data demonstrate that compounds negative in the LLNA are strongly so and increasing the numbers of test animals would not be likely to have any effect on the test outcome. As stated in the previous section, three compounds yielded false negatives in the LLNA in comparison to human response data. The GPMT/BA also registered three false negatives. The analyses of sensitivity and specificity indicated the predictive value of a positive LLNA test was 93% and the predictive value of a negative test was 80% compared to GPMT/BA. When compared to human data the predictive value of a positive LLNA test was 96% and the predictive value of a negative LLNA test was 17%. Similar positive and negative predictivity values (100% and 16%, respectively) were found when the GPMT test was compared to human data.

3.6.
Acceptability of sensitivity, specificity, concordance, and false positive and negative rates

Analysis of concordance between the LLNA and guinea pig data and the LLNA and human data gives confidence that the LLNA can reasonably predict human responses to sensitizers when compared to currently accepted methods for regulatory decision making. Potential problems in the LLNA rest with certain non-sensitizing irritants mis-classified as positive for sensitization and false negatives (compared to human data) represented by compounds from several different classes.

3.7. Scientific validity of conclusions on assay usefulness

3.7.1. Clinical relevance and human predictivity

The results of the LLNA are clinically relevant and the test is predictive except for some weak human contact allergens. The functioning of the immune systems of mice and humans is very similar as it relates to ACD. Human ACD generally arises through dermal exposure to non-abraded skin. It is a two-step process requiring first induction of specific immunity, followed by an elicitation response several weeks later. The LLNA utilizes topical application of the test compound to non-abraded skin and quantifies the induction phase (proliferation of T-lymphocytes in the draining auricular lymph nodes) as the indication of the potential of a compound to produce sensitization. One concern is that some non-sensitizing, irritant compounds may produce sufficiently profound lymphocyte proliferation to yield a false positive result. Also, some compounds that are recognized as human sensitizers do not produce a sufficiently strong proliferative response in the LLNA and are mis-classified as negative. This is also true for the guinea pig tests.

3.7.2. Regulatory utility of the method

The utility of the method for regulatory use in hazard assessment of chemicals as potential human contact sensitizers has been clearly established, subject to the limitations discussed above.

4.
Test Method Reliability (Repeatability/Reproducibility)

In general, the initial LLNA Submission presented qualitative data, which demonstrate adequate intra- and inter-laboratory repeatability and reproducibility. The Submission was deficient, however, in the presentation of quantitative data supporting the reliability of the test method. The reproducibility of the test method results across laboratories was adequate for a biological assay. In all but one interlaboratory comparison study, all of the test chemicals were identified prior to testing. In the only blinded study, 20 of 25 test chemicals were coded and of these, six chemicals were not reproducibly identified among the four laboratories. More confidence in the intra- and inter-laboratory repeatability and reproducibility of the test method would have been achieved had more quantitative blinded studies been performed. Also, while in most cases the sensitizers and non-sensitizers were correctly identified, it is likely to be more difficult to yield repeatable data with non-sensitizing irritant compounds or weak sensitizers.

4.1. Adequacy of intralaboratory repeatability and reproducibility evaluations

The data evaluated for intralaboratory repeatability and reproducibility were limited, in that only six chemicals were evaluated. These data (i.e., Basketter et al., 1996a; Kimber et al., 1998; Loveless et al., 1996) are presented in a summarized form in Tables 1 and 2 (Appendix C, pages C-12 and C-13, respectively) of the Submission. These data, while limited, indicate sufficient agreement; however, there are some discrepancies between the tables. For example, Table 1 of the Submission indicates that three tests were carried out on DNCB and all were positive. However, Table 2 of the Submission indicates that only two tests were carried out for this chemical, not three.
Table 1 of the Submission presents qualitative intralaboratory repeatability data from one laboratory for six compounds including one potent sensitizer assayed three times, three moderate sensitizers assayed four to six times, and two non-sensitizers assayed four or six times. The data indicate that the LLNA correctly identified four known sensitizers, which occurred in three to six repeated tests on each chemical. In this same laboratory, methyl salicylate was correctly identified as a non-sensitizer in each of four tests, while benzocaine was identified as a non-sensitizer in five of six tests. Table 2 of the Submission presents quantitative intralaboratory data (i.e., EC3 values, defined as the estimated concentration needed to produce an SI of three) from five laboratories that performed two tests each on the potent sensitizer DNCB and two laboratories that performed six tests each on the moderate sensitizer HCA. An assessment (Appendix K) of the DNCB data presented in Table 2 of the Submission indicates a lack of significant intra-laboratory variability. The data in Table 2 of the Submission also allow for a calculation of coefficient of variation (CV) for intralaboratory variability, which is presented in Table 5. Recognizing the limitations of such a calculation (i.e., five of the CVs were based on only two tests), overall the CVs are reasonable. In all cases, the sensitizers and non-sensitizers were correctly identified. However, it is likely to be more difficult to yield repeatable data with non-sensitizing irritant compounds or weak sensitizers. The information provided is sufficient to show that the LLNA can be reproducibly performed in a qualitative manner. However, it would be useful if future evaluations included further statistical analysis of the data to more accurately establish responses by chemical class.
Also, it would be useful if future studies included an analysis of the intralaboratory repeatability of this method with an emphasis on compounds with a maximum SI clustered around three.

Table 5: Analysis of Intralaboratory Variability
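For reference, a CV of the kind summarized in Table 5 can be computed from repeated EC3 estimates within one laboratory as the sample SD divided by the mean. The Python sketch below assumes this conventional definition; the report does not state the exact formula used, and the EC3 values shown are hypothetical, not taken from the Submission.

```python
import statistics


def coefficient_of_variation(values):
    """Sample SD divided by the mean, expressed as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical EC3 estimates (% concentration) from two repeat tests
# in a single laboratory; illustrative values only.
ec3_repeat_tests = [8.0, 12.0]

cv = coefficient_of_variation(ec3_repeat_tests)
print(f"CV = {cv:.1f}%")
```

With only two tests per laboratory, as was the case for five of the CVs, the sample SD rests on a single degree of freedom, which is one reason such values should be interpreted with caution.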