
Appendix A. Description of the Survey, Limitations of the Data, and Other Sources of Data

A.1 Description of the Survey

A.1.1 Sample Design

The 1999 National Household Survey on Drug Abuse (NHSDA) sample was part of a coordinated 5-year design that will provide estimates for all 50 States plus the District of Columbia (DC) through the year 2003. The coordinated design will facilitate 50 percent overlap in first-stage units (area segments) between each two successive years from 1999 through 2003.

For the 5-year 50-State design, 8 States were designated as large sample States (California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas) with samples large enough to support direct State estimates. Sample sizes in these States ranged from 2,669 to 4,681. For the remaining 42 States and DC, smaller, but adequate, samples were selected to support State estimates using small area estimation (SAE) techniques. Sample sizes in these states ranged from 756 to 1,280.

States were first stratified into a total of 900 field interviewer (FI) regions (48 regions in each large sample State and 12 regions in each small sample State). These regions were contiguous geographic areas designed to yield the same number of interviews on average. Within FI regions, adjacent Census blocks were combined to form the first-stage sampling units, called "area segments." A total of 96 segments per FI region were selected with probability proportional to population size in order to support the 5-year sample and any supplemental studies that the Substance Abuse and Mental Health Services Administration (SAMHSA) may choose to field. Eight sample segments per FI region were fielded during the 1999 survey year.

These sampled segments were allocated equally into four separate samples, one for each 3-month period during the year, so that the survey is essentially continuous in the field. In each of these area segments, a listing of all addresses was made, from which a sample of 223,868 addresses was selected. Of these, 187,842 were determined to be eligible sample units. In these sample units (which can be either households or units within group quarters), sample persons were randomly selected using an automated screening procedure programmed in a handheld computer carried by the interviewers. Youths (aged 12 to 17 years) and young adults (aged 18 to 25 years) were oversampled at this stage. Because of the large sample size associated with this sample, there was no need to oversample racial/ethnic groups, as was done on prior NHSDAs. Consistent with previous NHSDAs, the final respondent sample of 66,706 persons was representative of the U.S. general population (since 1991, the civilian, noninstitutionalized population) aged 12 or older. In addition, State samples were representative of their respective State populations.

During Quarter 1 of the 1999 NHSDA, it became evident that response rates were not comparable to those achieved in prior years. The principal cause for the reduction in response rates was the shortage of FIs and their inexperience. One action taken to overcome the response problem was to subsample from all pending cases so that the cases retained could be worked more thoroughly. This special subsampling was conducted in two phases. During the first phase, a total of 8,640 of the 13,161 unfinished dwelling units (i.e., pending screeners) were pulled out of the sample. In the second phase, dwelling units eligible to be sampled included those that were unfinished and those with pending person interviews. A total of 3,958 such units were removed in the second phase. To reduce the effect of unequal weights, all pending dwelling units (all units from the first phase and 1,827 units from the second phase) were put back into the sample in Quarter 2. The sample weights were adjusted to reflect the subsampling and the reinstatement of cases.

The 1999 NHSDA also included a supplemental sample using the paper-and-pencil interviewing (PAPI) mode for the purposes of measuring trends with estimates comparable to 1998 and prior years. The design for the supplemental PAPI study used a probability subsample of 250 FI regions and employed a coordinated oversampling strategy to increase the representation of blacks and Hispanics. All segments selected for the main computer-assisted interviewing (CAI) study within the 250 FI regions were also selected for the PAPI study. Oversampling of blacks and Hispanics was achieved by a coordinated sampling scheme that oversampled FI regions with high concentrations of blacks and Hispanics and by screening for and oversampling blacks and Hispanics in dwelling units designated for the PAPI sample. The automated sampling procedure, when applied in the PAPI segments, specified which dwelling units were to be interviewed in the CAI mode and which were to be interviewed in the PAPI mode, and then applied the appropriate person selection scheme for that particular survey. A sample of 46,328 addresses was selected for the PAPI study. Of these, 40,584 were determined to be eligible, and the final respondent sample consisted of 13,809 persons.

A.1.2 Data Collection Methodology (CAI)

The data collection method used in the NHSDA involves in-person interviews with sample persons, incorporating procedures that increase respondents' cooperation and willingness to report honestly about their illicit drug use behavior. Confidentiality is stressed in all written and verbal communications with potential respondents, respondents' names are not collected with the data, and CAI methods, including audio computer-assisted self-interviewing (ACASI), are used to provide a private and confidential setting to complete the interview.

Introductory letters are sent to sampled addresses, followed by an interviewer visit. A 5-minute screening procedure conducted using a handheld computer involves listing all household members along with their basic demographic data. The computer uses the demographic data in a preprogrammed selection algorithm to select zero to two sample persons, depending on the composition of the household. This selection process is designed to provide the necessary sample sizes for the specified population age groupings.

Interviewers attempt to immediately conduct the NHSDA interview with each selected person in the household. The interviewer requests the selected respondent to identify a private area in the home away from other household members to conduct the interview. The interview averages about an hour and includes a combination of CAPI (computer-assisted personal interviewing) and ACASI. The interview begins in CAPI mode with the FI reading the questions from the computer screen and entering the respondent's replies into the computer. The interview then transitions to ACASI mode for the sensitive questions. In this mode, the respondent can read the questions silently on the computer screen and/or listen to the questions read through headphones and enter his/her responses directly into the computer. At the conclusion of the ACASI section, the interview returns to CAPI mode with the interviewer completing the questionnaire.

No personal identifying information is captured in the CAI record for the respondent. At the end of the day when an interviewer has completed one or more interviews, he/she transmits the data to RTI via home telephone lines.

A.1.3 Data Processing (CAI)

Interviewers initiate nightly data transmissions of interview data and call records on days when they work. Computers at RTI direct the information to a raw data file that consists of one record for each completed interview. Even though much editing and consistency checking is done by the CAI program during the interview, additional complex edits and consistency checks are completed at RTI. Resolution of most inconsistencies and missing data is done using machine-editing routines developed specifically for the CAI instrument. Cases are retained only if the respondent provided data on lifetime use of cigarettes and at least nine other substances.

Statistical Imputation. For some key variables that still have missing values after the application of editing, statistical imputation is used to replace missing data with appropriate response codes. Considerable changes in the imputation procedures that have been used in past NHSDAs were introduced for the 1999 CAI sample. Three types of statistical imputation procedures are used: (1) a standard unweighted sequential hot-deck imputation, (2) a univariate combination of weighted regression imputation and a random nearest neighbor hot-deck imputation (which could be viewed as a univariate predictive mean neighborhood method), and (3) a combination of weighted regression and a random nearest neighbor hot-deck imputation using a neighborhood where imputation is accomplished on several response variables at once (which could be viewed as a multivariate predictive mean neighborhood method). Because the primary demographic variables (e.g., age, gender, race/ethnicity, employment, education) are imputed first, few variables are available for model-based imputation. Moreover, most demographic variables have a very low level of missingness. Hence, unweighted sequential hot deck is used to impute missing values for demographic variables. The demographic variables can then be used as covariates in models for drug use measures. These models also include other drug use variables as covariates. The univariate predictive mean neighborhood method is used as an intermediate imputation procedure for recency of use, 12-month frequency of use, and 30-day frequency of use where these variables occur. The final imputed values for these variables are determined using multivariate predictive mean neighborhoods. The final imputed values for age at first use for all drugs and age at first daily cigarette use are determined using univariate predictive mean neighborhoods.
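As a minimal illustration of the unweighted sequential hot-deck step mentioned above (its mechanics are described in the next paragraph), the sketch below sorts records by related covariates and fills each missing value from the nearest preceding respondent. The function and field names are illustrative assumptions, not the production NHSDA routine.

```python
# Minimal sketch of unweighted sequential hot-deck imputation (illustrative only).
# Records are sorted by covariates closely related to the item being imputed;
# a missing value is then filled with the nearest preceding responding value.

def sequential_hot_deck(records, sort_keys, item):
    """records: list of dicts; sort_keys: covariates used for sorting;
    item: variable to impute, with None marking a missing value."""
    ordered = sorted(records, key=lambda r: tuple(r[k] for k in sort_keys))
    last_donor = None
    for rec in ordered:
        if rec[item] is not None:
            last_donor = rec[item]      # remember the most recent respondent value
        elif last_donor is not None:
            rec[item] = last_donor      # impute from the nearest preceding donor
    return ordered

# Example: impute a missing past month smoking indicator after sorting by
# age group and gender (hypothetical field names).
people = [
    {"age_grp": 1, "gender": "F", "smoke30": 0},
    {"age_grp": 1, "gender": "F", "smoke30": None},   # receives the value 0
    {"age_grp": 2, "gender": "M", "smoke30": 1},
]
print(sequential_hot_deck(people, ["age_grp", "gender"], "smoke30"))
```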

Hot-deck imputation involves the replacement of a missing value with a valid code taken from another respondent who is "similar" and has complete data. Responding and nonresponding units are sorted together by a variable or collection of variables closely related to the variable of interest Y. For sequential hot-deck imputation, a missing value of Y is replaced by the nearest responding value preceding it in the sequence. With random nearest neighbor hot-deck imputation, the missing value of Y is replaced by a responding value from a donor randomly selected from a set of potential donors close to the unit with the missing value according to some distance metric. The predictive mean neighborhood imputation involves determining a predicted mean using a model, such as a linear regression or logistic regression, depending on the response variable, where the models incorporate the design weights. In the univariate case, the neighborhood of potential donors is determined by calculating the relative distance between the predicted mean for an item nonrespondent and the predicted mean for each potential donor, and choosing those within a small preset value (this is the "distance metric"). The pool of donors is further restricted to satisfy logical constraints whenever necessary (e.g., age at first crack use must not be younger than age at first cocaine use). Whenever possible, more than one response variable was considered at a time. In that (multivariate) case, the Mahalanobis distance across a vector of several response variables' predicted means is calculated between a given item nonrespondent and each candidate donor. The k smallest Mahalanobis distances (e.g., k = 30) determine the neighborhood of candidate donors, and the nonrespondent's missing values in this vector are replaced by those of a randomly selected donor. A respondent may be missing only some of the responses within this vector of response variables; in that case, only the missing values were replaced, and donors were restricted to be logically consistent with the response variables that were not missing.
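The donor-neighborhood step of the multivariate predictive mean neighborhood procedure can be sketched as follows, assuming the predicted means have already been produced by weighted regression models fit elsewhere; the function name, the use of NumPy, and the particular value of k are illustrative assumptions.

```python
# Illustrative sketch of the multivariate predictive mean neighborhood step:
# donors are the k candidates whose vectors of predicted means are closest,
# in Mahalanobis distance, to the nonrespondent's predicted means.
import numpy as np

def pmn_neighborhood(pred_nonresp, pred_donors, k=30):
    """pred_nonresp: (p,) predicted means for one item nonrespondent.
    pred_donors: (n, p) predicted means for candidate donors with observed values.
    Returns indices of the k nearest candidate donors."""
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(pred_donors, rowvar=False)))
    diff = pred_donors - pred_nonresp
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)   # squared Mahalanobis distances
    return np.argsort(d2)[:k]

# One donor is then drawn at random from the neighborhood (further restricted,
# in practice, to donors logically consistent with the observed responses).
rng = np.random.default_rng(0)
donors = rng.normal(size=(200, 3))                 # stand-in predicted means
neighborhood = pmn_neighborhood(np.array([0.2, 1.5, -0.4]), donors)
chosen = rng.choice(neighborhood)
```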

Although statistical imputation could not proceed separately within each State due to insufficient pools of donors, information about the State of residence of each respondent is incorporated in the modeling and hot-deck steps. For most drugs, respondents were separated into three State usage categories for each drug depending on the response variable of interest. Respondents from States with high usage of a given drug were placed in one category, respondents from medium usage States into another, and the remainder into a third category. This categorical "State rank" variable was used as one set of covariates in the imputation models. In addition, eligible donors for each item nonrespondent were restricted to be of the same State usage category (the same "State rank") as the item nonrespondent.

Weights. The general approach to developing and calibrating analysis weights involved developing design-based weights, d_k, as the inverse of the selection probabilities of the households and persons. Adjustment factors, a_k(\lambda), were then applied to the design-based weights to adjust for nonresponse, to control for extreme weights when necessary, and to poststratify to known population control totals. In view of the importance of State-level estimates with the new 50-State design, it was necessary to control for a much larger number of known population totals. Several other modifications to the general weight adjustment strategy that had been used in past NHSDAs were also implemented for the first time with the 1999 CAI sample.

Weight adjustments were based on a generalization of Deville and Särndal's (1992) logit model. This generalized exponential model (GEM) (Folsom & Singh, 2000) incorporates unit-specific bounds (lk, uk), ks, for the adjustment factor ak() as follows:

\[
a_k(\lambda) = \frac{l_k(u_k - c_k) + u_k(c_k - l_k)\exp(A_k x_k'\lambda)}{(u_k - c_k) + (c_k - l_k)\exp(A_k x_k'\lambda)}\, ,
\]

where $c_k$ are prespecified centering constants such that $l_k < c_k < u_k$ and $A_k = (u_k - l_k)/[(u_k - c_k)(c_k - l_k)]$. The variables $l_k$, $c_k$, and $u_k$ are user-specified bounds, and $\lambda$ is the column vector of p model parameters corresponding to the p covariates $x_k$. The $\lambda$-parameters are estimated by solving the calibration equations

\[
\sum_{k \in s} d_k\, a_k(\lambda)\, x_k = \tilde{T}_x\, ,
\]

where $\tilde{T}_x$ denotes control totals, which could be either nonrandom, as is generally the case with poststratification, or random, as is generally the case for nonresponse adjustment.

The final weights $w_k = d_k a_k(\lambda)$ minimize the distance function $\Delta(w, d)$ defined as

\[
\Delta(w, d) = \sum_{k \in s} \frac{d_k}{A_k}\left[(a_k - l_k)\log\frac{a_k - l_k}{c_k - l_k} + (u_k - a_k)\log\frac{u_k - a_k}{u_k - c_k}\right],
\]

where $a_k = w_k/d_k$.
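The sketch below evaluates the GEM adjustment factor a_k(λ) given above for fixed bounds and a trial λ. In practice λ is obtained by solving the calibration equations (for example, by a Newton-type algorithm), so this is only an illustration of the functional form following Folsom and Singh (2000), with made-up bounds and covariates.

```python
# Sketch of the GEM adjustment factor a_k(lambda); lam here is a fixed trial
# value, whereas in practice it is estimated from the calibration equations.
import numpy as np

def gem_factor(x, lam, l, c, u):
    """x: (n, p) covariates; lam: (p,) parameters; l, c, u: (n,) unit-specific bounds."""
    A = (u - l) / ((u - c) * (c - l))
    e = np.exp(A * (x @ lam))
    return (l * (u - c) + u * (c - l) * e) / ((u - c) + (c - l) * e)

# Example: with lam = 0 every factor equals its centering constant c_k,
# and the factors always stay strictly between l_k and u_k.
x = np.random.default_rng(1).normal(size=(5, 3))
l, c, u = np.full(5, 0.5), np.full(5, 1.0), np.full(5, 2.0)
print(gem_factor(x, np.zeros(3), l, c, u))                 # -> all 1.0
print(gem_factor(x, np.array([0.3, -0.1, 0.2]), l, c, u))  # bounded by (0.5, 2.0)
```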

This general approach was used at several stages of the weight adjustment process including (1) adjustment of household weights for extremes, (2) adjustment of household weights for nonresponse, (3) poststratification of household weights to meet population controls for various demographic groups by State, (4) adjustment of person weights for extremes, (5) poststratification of selected person weights, (6) adjustment of person weights for nonresponse, and (7) poststratification of person weights.

Every effort was made to include as many relevant State-specific covariates (typically defined by demographic domains within States) as possible in the multivariate models used to calibrate the weights (nonresponse adjustment and poststratification steps). Because further subdivision of State samples by demographic covariates often produced small cell sample sizes, it was not possible to retain all State-specific covariates and still estimate the necessary model parameters with reasonable precision. Therefore, a hierarchical structure was used in grouping States with covariates defined at the national level, at the census division level within the Nation, at the State-group within census division, and, whenever possible, at the State level. In every case, the controls for the total population within a State and the five age groups within a State were maintained. Census control totals by age and race were required for the civilian, noninstitutionalized population of each State. Published Census projections (U.S. Bureau of the Census, 2000) reflected the total residential population (which includes those in the military and those who are institutionalized). The 1990 census 5 percent public use microdata file was used to distribute the State residential population into two groups, then the raking-ratio adjustment method was used to obtain the desired domain-level counts such that they respect both the State-level residential population counts as well as the national-level civilian and noncivilian counts for each domain. This was done for the midpoint of each NHSDA data collection period (i.e., quarter) such that counts aggregated over the quarters correspond to the annual counts.
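The raking-ratio step can be illustrated with a small two-way example: a seed table of domain counts is alternately scaled so its rows match State residential totals and its columns match national civilian/noncivilian totals. The numbers and the fixed iteration count below are illustrative, not census figures.

```python
# Minimal sketch of raking-ratio (iterative proportional fitting) adjustment.
import numpy as np

def rake(table, row_totals, col_totals, iters=50):
    t = table.astype(float).copy()
    for _ in range(iters):
        t *= (row_totals / t.sum(axis=1))[:, None]   # match State-level totals
        t *= (col_totals / t.sum(axis=0))[None, :]   # match civilian/noncivilian totals
    return t

seed = np.array([[90.0, 10.0],     # State A: civilian, noncivilian (illustrative)
                 [45.0,  5.0]])    # State B
raked = rake(seed,
             row_totals=np.array([120.0, 60.0]),
             col_totals=np.array([170.0, 10.0]))
```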

Several other enhancements to the weighting procedures were also implemented. The control of extreme weights through winsorization was incorporated into the calibration processes. Winsorization truncates extreme values at prespecified levels and distributes the trimmed portions of weights to the nontruncated cases; note that this process was carried out using the GEM model discussed above. A step was added to poststratify the household level weights to obtain census-consistent estimates based on the household rosters from all screened households; these household roster-based estimates then provided the control totals needed to calibrate the respondent pair weights for subsequent planned analyses. An additional step poststratified the selected persons sample to conform with the adjusted roster estimates. The final step in poststratification related the respondent person sample to external census data (defined within State whenever possible as discussed above).

A.2 Limitations of the Data

A.2.1 Target Population

An important limitation of the NHSDA estimates of tobacco use prevalence is that they are only designed to describe the target population of the survey (i.e., the civilian, noninstitutionalized population aged 12 or older). Although this population includes almost 98 percent of the total U.S. population aged 12 or older, it does exclude some important and unique subpopulations who may have very different tobacco-using patterns. The survey excludes active military personnel, who have been shown to have lower rates of cigarette use. Persons living in institutional group quarters, such as prisons and residential drug treatment centers, are not covered in the NHSDA, and homeless persons not living in a shelter on the survey date are also excluded. Section A.3.2 describes a survey that provides data for military personnel.

A.2.2 Sampling Error and Statistical Significance

The sampling error of an estimate is the error caused by the selection of a sample instead of conducting a census of the population. Sampling error is reduced by selecting a large sample and by using efficient sample design and estimation strategies, such as stratification, optimal allocation, and ratio estimation.

With the use of probability sampling methods in the NHSDA, it is possible to develop estimates of sampling error from the survey data. These estimates have been calculated for all prevalence estimates presented in this report using a Taylor series linearization approach that takes into account the effects of the complex NHSDA design features. The sampling errors are used to identify unreliable estimates and to test for the statistical significance of differences between estimates.

As was done in NHSDAs prior to 1999, direct survey estimates considered to be unreliable due to unacceptably large sampling error are not shown; instead, they are noted by asterisks (*) in the tables containing such estimates in the appendices. The criterion used for suppressing all direct survey estimates was based on the relative standard error (RSE), which is defined as the ratio of the standard error over the estimate.

For proportion estimates, p, within the range 0 < p < 1, rates and the corresponding estimated numbers of users were suppressed if

\[
\frac{SE(p)/p}{-\ln(p)} > 0.175 \quad \text{when } p < 0.5
\]

or

\[
\frac{SE(p)/(1-p)}{-\ln(1-p)} > 0.175 \quad \text{when } p > 0.5.
\]

This is an ad hoc rule that requires an effective sample size in excess of 50 when 0.10 < p < 0.90. As p approaches 0.00 or 1.00, it requires increasingly larger effective sample sizes. Estimates were also suppressed if they were close to zero or 100 percent (i.e., if p < 0.00005 or p > 0.99995).

For estimates of other totals and means (not bounded between 0 and 1), estimates were suppressed if

\[
\frac{SE(\hat{\theta})}{\hat{\theta}} > 0.5\, ,
\]

where $\hat{\theta}$ denotes the estimate.

Additionally, estimates of mean age were suppressed if the sample size was smaller than 10 respondents.
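Assuming the RSE-based criteria stated above, the suppression checks can be sketched as follows; the function names and the use of an effective sample size in the example are illustrative.

```python
# Sketch of the suppression checks described above; thresholds and names are
# assumptions of this illustration, not a reproduction of production code.
import math

def suppress_proportion(p, se_p, threshold=0.175):
    """Return True if a prevalence estimate p with standard error se_p is suppressed."""
    if p < 0.00005 or p > 0.99995:
        return True
    if p <= 0.5:
        return (se_p / p) / (-math.log(p)) > threshold
    return (se_p / (1.0 - p)) / (-math.log(1.0 - p)) > threshold

def suppress_mean(est, se, n, threshold=0.5, min_n=10):
    """Suppress a mean (e.g., mean age at first use) on RSE or small sample size."""
    return n < min_n or (se / est) > threshold

# Example: p = 0.12 with an effective sample size of 40 is suppressed.
p, n_eff = 0.12, 40
se = math.sqrt(p * (1 - p) / n_eff)
print(suppress_proportion(p, se))   # True
```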

When making comparisons of estimates for different population subgroups from the same data year, the covariance term, which is usually small and positive, has typically been ignored. This results in somewhat conservative tests of hypotheses that will sometimes fail to establish statistical significance when in fact it exists.
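For example, a difference between two subgroup prevalence estimates can be tested by treating the estimates as independent (i.e., omitting the covariance term), as sketched below with made-up values.

```python
# Illustrative test of the difference between two subgroup prevalence estimates,
# with the covariance term omitted as described above (a conservative choice
# when the covariance is small and positive). Values are made up.
import math

def z_test_difference(p1, se1, p2, se2):
    """Return the z statistic for H0: p1 == p2 with the covariance term omitted."""
    return (p1 - p2) / math.sqrt(se1**2 + se2**2)

z = z_test_difference(0.147, 0.006, 0.132, 0.005)
significant = abs(z) > 1.96          # two-sided test at the 0.05 level
```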

A.2.3 Nonsampling Error

Nonsampling errors occur from nonresponse, coding errors, computer processing errors, errors in the sampling frame, reporting errors, and other errors. Nonsampling errors are reduced through data editing, statistical adjustments for nonresponse, and close monitoring and periodic retraining of interviewers.

Although nonsampling errors can often be much larger than sampling errors, measurement of most nonsampling errors is difficult or impossible. However, some indication of the effects of some types of nonsampling errors can be obtained through proxy measures, such as response rates and from other research studies.

Response rates for the NHSDA were stable for the period from 1994 to 1998, with the screening response rate at about 93 percent and the interview response rate at about 78 percent. Of the 187,842 eligible households sampled for the 1999 NHSDA main study, 169,166 were successfully screened for a weight-adjusted screening response rate of 89.6 percent. In these screened households, a total of 89,883 sample persons were selected, and completed interviews were obtained from 66,706 of these sample persons, for a weighted interview response rate of 68.6 percent. Some 11,276 (18.0 percent) sample persons were classified as refusals, 5,692 (6.7 percent) were not available or never at home, and 6,209 (6.8 percent) did not participate for various other reasons, such as physical or mental incompetence or language barrier. The response rate was highest among the 12- to 17-year-old age group (78.1 percent). The response rate was 71.2 percent for the 18- to 25-year-old age group and 66.7 percent for adults aged 26 or older.
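For reference, the overall unit response rate implied by these weighted rates is approximately the product of the screening and interview response rates, as in the short calculation below.

```python
# Approximate overall response rate implied by the weighted rates quoted above.
screening_rate = 0.896
interview_rate = 0.686
overall_rate = screening_rate * interview_rate
print(f"{overall_rate:.1%}")   # about 61.5%
```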

The increase in nonresponse in the 1999 NHSDA can be attributed primarily to an insufficient number of FIs and their inexperience. Recruiting and training of FIs were major challenges due to the number required for the large sample and the tight labor market. This resulted in a relatively inexperienced staff of FIs. There were 2,010 FIs hired and trained, and more than a third of them (37.6 percent) did not complete the survey year. Both prior NHSDA experience and on-the-job experience were shown to be related to nonresponse. Previously experienced interviewers and interviewers with one, two, or three quarters of on-the-job experience were more successful at obtaining an interview. The overall nonresponse was also demonstrated to be a product of the combined influences of urbanicity and the age and gender of the respondent. Interviews were completed at a greater rate in rural regions than in urban areas and by younger and female respondents.

Among survey participants, item response rates were above 98 percent for most questionnaire items. However, inconsistent responses for some items, including the tobacco use items, were common. Estimates of tobacco use from the NHSDA were based on the responses to multiple questions by respondents, so that the maximum amount of information was used in determining whether a respondent was classified as a tobacco user. Inconsistencies in responses were resolved through a logical editing process involving some judgment on the part of survey analysts (as such, it is a potential source of nonsampling error). Because of the automatic routing through the CAI questionnaire (e.g., lifetime drug use questions that skip entire modules when answered "no"), there was less editing of this type than in the PAPI questionnaire used in previous years. In addition, less logical editing was used because with the CAI data, statistical imputation was relied upon more heavily to determine the final values of drug use variables in cases where there was the potential to use logical editing to make a determination. The combined amount of editing and imputation in the CAI data is still considerably less than the total amount in the PAPI study.

NHSDA estimates are based on self-reports of substance use, and their value depends on respondents' truthfulness and memory. Although many studies have generally established the validity of self-report data and the NHSDA procedures were designed to encourage honesty and recall, some degree of underreporting is assumed. No adjustment to NHSDA data has been made to correct for this. The methodology used in the NHSDA has been shown to produce more valid results than other self-report methods (e.g., by telephone) (Aquilino, 1994; Brittingham, Tourangeau, & Kay, 1998; Turner, Lessler, & Gfroerer, 1992). However, comparisons of NHSDA data with data from surveys conducted in classrooms suggest that underreporting of drug use by youths in their homes may be substantial (Gfroerer, 1993; Gfroerer, Wright, & Kopstein, 1997).

A.2.4 Incidence Estimates

For diseases, the incidence rate for a population is defined as the number of new cases of the disease, N, divided by the person time, PT, of exposure, or

\[
IR = \frac{N}{PT}\, .
\]

The person time of exposure can be measured for the full period of the study or for a shorter period. The person time of exposure ends at the time of diagnosis (e.g., Greenberg et al., 1996, pp. 16-19). We follow similar conventions for defining the incidence of first use of a substance.

Beginning in 1999, the NHSDA questionnaire allows for collection of year and month of first use for recent initiates. Month, day, and year of birth are also obtained directly or imputed in the process. In addition, the questionnaire call record provides the date of the interview. If we impute a day of first use within the year and month of first use reported or imputed, we then have the key respondent inputs in terms of exact dates. Exposure time can be determined in terms of days and converted to an annual basis.

Having exact dates of birth and first use also allows us to determine person time of exposure during the targeted period, t. Let the target time period for measuring incidence be specified in terms of dates. For example, for the period 1998, we would specify

\[
t = [t_1, t_2) = [\text{1 January 1998},\ \text{1 January 1999}),
\]

a period that includes 1 January 1998 and all days up to but not including 1 January 1999. The target age group can also be defined by a half-open interval as a = [a_1, a_2). For example, the age group 12 to 17 would be defined by a = [12, 18) for persons at least age 12, but not yet age 18. If person i was in age group a during period t, the time and age interval, I_{t,a,i}, can then be determined by the intersection

\[
I_{t,a,i} = [t_1, t_2) \cap [b_i + a_1,\ b_i + a_2),
\]

assuming we can write the date of birth, b_i, in terms of day (DOB_i), month (MOB_i), and year (YOB_i). Either this intersection will be empty (I_{t,a,i} = \emptyset) or we will designate it by the half-open interval [t_{1,a,i}, t_{2,a,i}), where

\[
t_{1,a,i} = \max(t_1,\ b_i + a_1)
\]

and

\[
t_{2,a,i} = \min(t_2,\ b_i + a_2)\, .
\]

The date of first use, t_{fu,d,i}, is also expressed as an exact date. An incident of first use of drug d by person i in age group a occurs in time t if t_{fu,d,i} \in I_{t,a,i}. The indicator function I_i(d, a, t) used to count incidents of first use is set to 1 when t_{fu,d,i} \in I_{t,a,i}, and to 0 otherwise. The person time exposure, measured in years and denoted by e_i(d, a, t) for a person i of age group a, depends on the date of first use. If the date of first use precedes the target period (t_{fu,d,i} < t_{1,a,i}), then e_i(d, a, t) = 0. If the date of first use occurs after the target period (t_{fu,d,i} \geq t_{2,a,i}) or if person i has never used drug d, then

\[
e_i(d, a, t) = \frac{t_{2,a,i} - t_{1,a,i}}{365.25}\, .
\]

If the date of first use occurs during the target period I_{t,a,i}, then

\[
e_i(d, a, t) = \frac{t_{fu,d,i} - t_{1,a,i}}{365.25}\, ,
\]

where differences between dates are measured in days and divided by 365.25 to convert exposure to an annual basis.

Note that both I_i(d, a, t) and e_i(d, a, t) are set to zero if the target period I_{t,a,i} is empty (i.e., person i is not in age group a during time t). The incidence rate is then estimated as a weighted ratio estimate:

\[
\widehat{IR}(d, a, t) = \frac{\sum_i w_i\, I_i(d, a, t)}{\sum_i w_i\, e_i(d, a, t)}\, ,
\]

where the w_i are the analytic weights.
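A sketch of this computation for one drug, age group, and target period appears below; the date arithmetic (including the use of a 365.25-day year) and the field names are illustrative assumptions rather than the production NHSDA code.

```python
# Sketch of the exact-date incidence computation for one drug, age group, and
# target period. Field names, weights, and the 365.25-day year are illustrative.
from datetime import date, timedelta

def incidence_inputs(dob, first_use, t1, t2, a1, a2):
    """Return (indicator, exposure_years) for one respondent.
    dob: date of birth; first_use: date of first use or None (never used);
    [t1, t2): target period; [a1, a2): target ages in years."""
    lo = max(t1, dob + timedelta(days=round(a1 * 365.25)))
    hi = min(t2, dob + timedelta(days=round(a2 * 365.25)))
    if lo >= hi:                                   # not in age group a during period t
        return 0, 0.0
    if first_use is not None and first_use < lo:   # initiated before the period
        return 0, 0.0
    if first_use is None or first_use >= hi:       # never used, or initiated afterward
        return 0, (hi - lo).days / 365.25
    return 1, (first_use - lo).days / 365.25       # initiated during the period

def incidence_rate(people, weights, **period):
    num = den = 0.0
    for person, w in zip(people, weights):
        ind, exposure = incidence_inputs(**person, **period)
        num += w * ind
        den += w * exposure
    return num / den

rate = incidence_rate(
    [{"dob": date(1985, 6, 1), "first_use": date(1998, 7, 4)},
     {"dob": date(1984, 2, 15), "first_use": None}],
    weights=[1200.0, 950.0],
    t1=date(1998, 1, 1), t2=date(1999, 1, 1), a1=12, a2=18,
)
```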

In prior years, before exact date data were available for computing incidence rates, a person was considered to be of age a during the entire time interval t if his/her ath birthday occurred during time interval t (generally, a single year). If the person initiated use during the year, the person time exposure was approximated as one-half year for all such persons rather than computing it exactly for each person.

Because of the new methodology, the incidence estimates discussed in Chapter 4 are not strictly comparable to prior year estimates. However, because they are based on retrospective reports by survey respondents as was the case for earlier estimates, they may be subject to some of the same kinds of biases.

Bias due to differential mortality occurs because some persons who were alive and exposed to the risk of first drug use in the historical periods shown in the tables died before the 1999 NHSDA was conducted. This bias is probably very small for estimates shown in this report. Incidence estimates are also affected by memory errors, including recall decay (tendency to forget events occurring long ago) and forward telescoping (tendency to report that an event occurred more recently than it actually did). These memory errors would both tend to result in estimates for earlier years (i.e., 1960s and 1970s) that are downwardly biased (because of recall decay) and estimates for later years that are upwardly biased (because of telescoping). There is also likely to be some underreporting bias due to social acceptability of drug use behaviors and respondents' fear of disclosure. This is likely to have the greatest impact on recent estimates, which reflect more recent use and reporting by younger respondents. Finally, for substance use that is frequently initiated at age 10 or younger, estimates based on retrospective reports 1 year later underestimate total incidence because 11-year-old children are not sampled by the NHSDA. Prior analyses showed that alcohol and cigarette (any use) incidence estimates could be significantly affected by this. Therefore, for these drugs no 1998 estimates were made.

A.3 Other Sources of Data

A variety of other surveys and data systems collect data on tobacco use. It is useful to consider the results of these other studies when discussing NHSDA data. In doing comparisons, it is important to understand the methodological differences between different surveys and the impact that these differences could have on estimates of tobacco product prevalence. This section briefly describes several of these other data systems, including recent results from them.

In-depth comparisons have been done of the methodologies of the three major federally sponsored national surveys of substance use by youths (i.e., the National Household Survey on Drug Abuse [NHSDA], the Monitoring the Future [MTF] study, and the Youth Risk Behavior Survey [YRBS]). In 1997, a comparison between the NHSDA and the MTF was published (Gfroerer et al., 1997). And in 1999, a series of papers comparing different aspects of the three national surveys was commissioned by the U.S. Department of Health and Human Services (DHHS), Office of the Assistant Secretary for Planning and Evaluation. Experts in survey methods for the latter effort reported the following findings:

A.3.1 Other National Surveys of Tobacco Use

Monitoring the Future (MTF). The MTF is a national survey that tracks drug use trends and related attitudes among America's adolescents. This survey is conducted annually by the Institute for Social Research at the University of Michigan through a grant awarded by the National Institute on Drug Abuse (NIDA). The MTF is composed of three substudies: (a) an annual survey of high school seniors initiated in 1975; (b) ongoing panel studies of representative samples from each graduating class that have been conducted by mail since 1976; and (c) annual surveys of 8th and 10th graders initiated in 1991. In 2000, for all three grades combined, there were 435 public and private schools and almost 45,200 students in the sample. The senior sample included 13,286 seniors in 134 public and private schools. As noted on the MTF website, in 2000 the 10th grade sample involved 14,576 students from 145 schools, and the 8th grade sample size was 17,311 students from 156 schools (MTF, 2000).

Comparisons between the MTF and students sampled in the NHSDA have generally shown NHSDA substance use prevalence levels to be lower than MTF estimates, with relative differences being largest for 8th graders. However, the direction of trends has generally been similar between the two surveys. The lower prevalences in the NHSDA may be due to more underreporting in the household setting as compared to the MTF school setting. The MTF does not survey dropouts, a group generally shown (using the NHSDA) to have higher rates of use (Gfroerer et al., 1997).

This school-based survey showed increases in smoking rates among students from 1991 to 1996. Cigarette smoking peaked in 1996 among 8th and 10th graders nationwide and in 1997 among 12th graders. Since those peak years, cigarette use has gradually declined, and recently released data indicated that cigarette use among adolescents declined sharply between the last two MTF surveys (MTF, 2000). For example, current (past month) smoking decreased significantly among 8th graders, falling from 17.5 percent in 1999 to 14.6 percent in 2000. Past month cigarette use also declined sharply among 12th graders, dropping from 34.6 percent in 1999 to 31.4 percent in 2000. Daily smoking in the past month declined from 15.9 to 14.0 percent among 10th graders and from 23.1 to 20.6 percent among 12th graders. The proportion of students smoking heavily (i.e., smoking a half-pack or more of cigarettes per day) decreased among 10th graders from 7.6 percent in 1999 to 6.2 percent in 2000 and among 12th graders from 13.2 percent in 1999 to 11.3 percent in 2000. Prevalence rates for the use of smokeless tobacco remained stable.

Youth Risk Behavior Survey (YRBS). The YRBS is a component of CDC's Youth Risk Behavior Surveillance System, which biennially measures the prevalence of six priority health risk behavior categories: (1) behaviors that contribute to unintentional and intentional injuries, (2) tobacco use, (3) alcohol and other drug use, (4) sexual behaviors that contribute to unintended pregnancy and sexually transmitted diseases (STDs), (5) unhealthy dietary behaviors, and (6) physical inactivity. The 1999 national school-based survey used a three-stage cluster sample design to produce a nationally representative sample of students in grades 9 through 12. The 1999 State and local surveys used a two-stage cluster sample design to produce representative samples of students in grades 9 through 12 in their jurisdictions (CDC, 2000a). The 1999 national YRBS sample included 15,349 students in grades 9 through 12 in the 50 States and DC.

The YRBS found an increasing trend in current cigarette use among students in grades 9 through 12. Current smoking rose from 27.5 percent in 1991 to 34.8 percent in 1999 (CDC, 2000a). Overall, lifetime, current, and frequent cigarette use prevalence rates (frequent use defined as smoking on 20 or more of the 30 days preceding the survey) in the 1999 survey were 70.4, 34.8, and 16.8 percent, respectively. Although the NHSDA trend for smoking among youths (aged 12 to 17) has not shown these increases, the NHSDA estimates for years prior to 1994 were apparently substantial underestimates because the data were collected without private self-administered answer sheets. When the NHSDA converted to the use of these answer sheets in 1994, the smoking rate for adolescents approximately doubled. This raises questions about the accuracy of the NHSDA measurement of the trend prior to 1994, even after adjustments are made to account for the effect of the new questionnaire.

National Youth Tobacco Survey (NYTS). The American Legacy Foundation released findings from its 1999 NYTS in October 2000. The 1999 NYTS was designed to collect data on tobacco-related issues for a nationally representative sample of students in grades 6 through 12. The survey was given to over 15,000 students in 131 schools across the United States in the fall of 1999. The students completed anonymous, self-administered questionnaires that included a variety of tobacco-related questions. Major topics covered by the 1999 NYTS included patterns of tobacco use, knowledge and attitudes about tobacco, minors' ability to purchase tobacco products, and exposure to environmental tobacco smoke (ETS). The American Legacy Foundation found that in 1999, approximately 7.3 percent of all adolescents were established smokers (they had smoked at least 100 cigarettes in their lifetime) (American Legacy Foundation, 2000).

College Alcohol Study (CAS). The Harvard School of Public Health's CAS is an ongoing survey supported by a grant from the Robert Wood Johnson Foundation. It surveys more than 15,000 students (18 to 24 years of age) at 140 four-year colleges in 40 States. The objective of the CAS is to examine high-risk behaviors and to identify student- and college-level factors associated with these behaviors among college students. These behaviors include heavy episodic or binge drinking, smoking, illicit drug use, gun possession, violence, and other behavioral, social, and health-related problems facing America's college students today. The principal investigator is Henry Wechsler.

The CAS includes all forms of tobacco use: cigarettes, cigars, pipes, and smokeless tobacco. The prevalence of cigarette smoking by college students, which rose sharply between 1993 and 1997, stabilized between 1997 and 1999 (Harvard School of Public Health, 2000). In the 1999 CAS, a total of 14,138 students in 119 four-year colleges were surveyed. The 1999 data indicated that nearly half of all respondents (45.7 percent) had used a tobacco product in the past year, and one third (32.9 percent) had used a tobacco product in the past month (current use). Cigarettes accounted for most of the tobacco use (28.5 percent of the 18- to 24-year-old college students had smoked cigarettes in the 30 days prior to the survey). Cigar use was also substantial, with 37.1 percent citing lifetime use, 23.0 percent reporting past year use, and 8.5 percent saying they were current cigar users. Among college students, men were significantly more likely than women to be tobacco users, and tobacco use was significantly higher among white students than among African-American, Hispanic, and Asian students.

National Longitudinal Study of Adolescent Health (Add Health). In 1994-96, Add Health was conducted to measure the effects of family, peer group, school, neighborhood, religious institution, and community influences on such health risks as tobacco, drug, and alcohol use. The survey also asked about substance abuse (alcohol, tobacco, and illicit drugs). The survey consisted of three phases. First, roughly 90,000 students from grades 7 through 12 at 145 schools around the United States answered brief questionnaires. Next, interviews were conducted with about 20,000 students and their parents in the students' homes. Then, 1 year later, the students were interviewed a second time in their homes. Results from the September 1994 survey indicate that nearly 3.2 percent of 7th and 8th graders smoked 6 or more cigarettes a day, as did 12.8 percent of 9th through 12th graders (Resnick et al., 1997).

Partnership Attitude Tracking Study (PATS). In November 1999, the Partnership for a Drug-Free America (PDFA) released results from the 1999 PATS, the only ongoing national research that tracks drug use and drug-related attitudes among children as young as 8 and 9 years old, teenagers, and their parents. Data from the 1999 PATS showed declines in cigarette use among teenagers (see PDFA, 2000). For teenagers in grades 7 through 12, the prevalence of past month cigarette use declined from 42 percent in 1998 to 37 percent in 1999. For those in grades 7 and 8, past month smoking declined from 36 percent in 1998 to 33 percent in 1999. Among 9th and 10th graders, past month cigarette use declined from 44 percent in 1998 to 35 percent in 1999. For the oldest teenagers (those in grades 11 and 12), past month cigarette use decreased from 47 percent in 1998 to 42 percent in 1999.

National Health Interview Survey (NHIS). The NHIS is a continuing nationwide sample survey that collects data using personal household interviews. In 1997, the data collection methodology changed from paper-and-pencil questionnaires to a computer-assisted personal interviewing (CAPI) instrument. The 1998 NHIS was conducted by the Bureau of the Census for the National Center for Health Statistics (NCHS). The survey estimated that 24.0 percent of the population aged 18 or older were current cigarette smokers in 1998. Among males, 25.9 percent reported current cigarette smoking, compared to 22.1 percent of females aged 18 or older. Current smokers are defined as those who have smoked at least 100 cigarettes in their lifetime and report that they currently smoke, including those who smoke only on some days. The current smoker definition used in the NHIS is somewhat different from that used in the NHSDA, where current cigarette smoking is defined as any use in the past month.

Surgeon General's Report on Smoking and Health. The Surgeon General's report on smoking and health (DHHS, 1994) included smoking prevalence data from a number of sources, including the NHSDA. Comparisons between the various sources were made and methodological differences were assessed. These comparisons used NHSDA data prior to 1994, which were based on the interviewer-administered smoking questions, and thus show low rates of smoking in the NHSDA, particularly among youths.

A.3.2 Survey of Population Not Covered by the NHSDA

Worldwide Survey of Substance Abuse and Health Behaviors Among Military Personnel. The 1998 Worldwide Survey of Substance Abuse and Health Behaviors Among Military Personnel was sponsored by the Department of Defense and conducted by Research Triangle Institute (RTI). The survey interviewed 17,264 active duty Armed Forces personnel worldwide. Military personnel generally exhibited lower rates of cigarette use than the civilian population, but this finding seems largely due to an increase in smoking among civilians rather than significant decreases among military personnel or changes in the military population (Bray et al., 1999).
