ABC News' Polling Methodology and Standards
The Nuts and Bolts of Our Public Opinion Surveys
by GARY LANGER, Langer Research Associates
May 3, 2013
A summary of ABC News polling standards and methodology follows.
Langer Research Associates, primary polling provider to ABC News, advises the news division on standards for disclosure, validity, reliability and unbiased content in survey research, and evaluates data when requested to establish whether it meets these standards.
On disclosure, in addition to the identities of the research sponsor and field work provider, we require a detailed statement of methodology, the full questionnaire and complete marginal data. If any of these are lacking, we recommend against reporting the results. Proprietary research is not exempted.
Methodologically, in all or nearly all cases we require a probability sample, with high levels of coverage of a credible sampling frame. Self-selected or so-called “convenience” samples, including internet, e-mail, “blast fax,” call-in, street intercept, and non-probability mail-in samples, do not meet our standards for validity and reliability, and we recommend against reporting them.
We do accept some probability-sample surveys that do not meet our own methodological standards – in terms of within-household respondent selection, for example – but may recommend cautious use of such data, with qualifying language. We recommend against reporting others, such as pre-recorded autodialed surveys, even when a random-digit dialed telephone sample is employed.
In terms of content, we examine methodological statements for misleading or false claims, questionnaires for leading or biasing wording or ordering, and analyses and news releases for inaccurate or selective conclusions.
In addition to recommending against reporting surveys that do not meet these standards, we promote and strongly encourage the reporting of good-quality polls that break new ground in opinion research.
Field work for most of ABC’s U.S. polling is carried out by Abt SRBI of New York, N.Y., using a dual-frame sample design covering both landline telephone and cell phone-only respondents, with samples produced by Survey Sampling Inc. of Shelton, Conn. We tested cell-only sampling in August 2008, made it a regular part of our sample design in October 2008 and reported on our approach in detail at the annual meeting of the American Association for Public Opinion Research in May 2010; see that paper here.
In the landline component of these surveys, a sample of landline households in the continental United States is selected by SSI via random digit dialing procedures, in which all landline telephone numbers, listed and unlisted, have an equal probability of selection.
SSI starts with a database of all listed telephone numbers, updated on a four- to six-week rolling basis, 25 percent of listings at a time. This database of directory-listed numbers is then used to determine all active blocks – contiguous groups of 100 phone numbers for which more than one residential number is listed. All possible numbers in active blocks are added to the random digit database.
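The block-building step above can be sketched in a few lines of Python. This is an illustration only, not SSI's implementation: the function names, the 10-digit string format and the sample phone numbers are assumptions made for the example, and the threshold of two listed numbers reflects the "more than one" rule as stated here.

```python
from collections import Counter

def active_blocks(listed_numbers, min_listed=2):
    """Return the 100-block stems (first 8 digits of a 10-digit number)
    that contain at least `min_listed` directory-listed residential numbers."""
    counts = Counter(number[:8] for number in listed_numbers)
    return {stem for stem, n in counts.items() if n >= min_listed}

def expand_blocks(stems):
    """List every possible number (suffixes 00-99) in each active block,
    whether or not the number is listed."""
    return [f"{stem}{suffix:02d}" for stem in sorted(stems) for suffix in range(100)]

# Hypothetical directory listings
listed = ["2125550101", "2125550157", "2125550263", "6465550012"]
stems = active_blocks(listed)   # only the block 21255501xx has two listings
frame = expand_blocks(stems)    # all 100 numbers in that block
```

Because every number in an active block enters the frame, unlisted numbers such as 2125550199 are eligible for selection alongside listed ones.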
Until 2005, ABC News followed the industry norm of excluding all listed business numbers (compiled from sources such as Yellow Pages directories and the Dun & Bradstreet Business Data database) from the sample. However, an ABC-led study (Merkle, Langer, Cohen, Piekarski, Benford and Lambert, 2009, Public Opinion Quarterly) found that this “cleaning” process excludes respondents who have home-based business-listed phones and no other lines at home on which they take calls, creating 3 percent noncoverage of eligible households with no offsetting gains in productivity. As a result of this evaluation, we do not exclude listed business numbers from our landline sample, with the exception of those in business-only blocks or exchanges.
Each telephone exchange in the landline sample is assigned to the county where it’s most prevalent. In the first stage of selection, the database is sorted by state and county, and the number of telephone numbers to be sampled within each county is determined using systematic sampling procedures from a random start, such that each county is assigned a sample size proportional to its share of possible numbers. In the second stage of selection, telephone numbers are sorted within county by area code, exchange and active block, and using systematic sampling procedures from a random start, individual phone numbers within each county are selected. The sampled phone numbers are pre-dialed via a non-ringing auto-dialer to reduce dialing of non-working numbers.
Wireless telephone numbers also have an equal probability of selection in the ABC sample. To produce these samples, SSI begins with the latest monthly listing of every existing telephone area code and exchange. About half of these are pooled by their producers in contiguous groups of ten 100-blocks, known as 1,000-blocks, with information including whether each pooled 1,000-block does or does not include wireless numbers, either solely or on a shared basis with landline numbers. All such wireless-inclusive 1,000-blocks are included for sampling purposes. For numbers that are not 1,000-block pooled, wireless service information is available at the exchange level only; therefore all numbers in those exchanges also are included for sampling purposes.
All numbers used in wireless sampling are then handled at the 100-block level. Given the absence of any cell-phone directory, all 100-blocks used for sampling purposes are considered active. Shared 100-blocks that include listed landline numbers are removed from the wireless sample, because they are included in the landline frame; shared 100-blocks with no listed landline numbers are retained in the wireless frame. As such the two frames are mutually exclusive, with no overlap. (The number of cell numbers included in the landline sample for any reason – porting, forwarding, or shared 100-blocks with listed landlines – is insignificant, one-tenth of 1 percent in an SSI study. Fewer than that would be cell-only.)
Each 100-block is assigned to a county based on the billing coordinates of the exchange. The database is sorted by county code, carrier name and 100-block. A sampling interval is determined by dividing the universe of eligible 100-blocks by the desired sample size. From a random start within the first sampling interval, a systematic nth selection of 100-blocks is performed and a 2-digit random number between 00 and 99 is appended to each selected 100-block stem.
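The selection just described, a sampling interval, a random start within the first interval, an nth selection and a random two-digit suffix, can be sketched as follows. The function names, the carrier names and the tiny universe of 100 blocks are hypothetical, chosen only to make the example self-contained; this is not the production procedure.

```python
import random

def systematic_sample(sorted_blocks, sample_size, rng):
    """Systematic nth selection from a sorted list, starting at a random
    point within the first sampling interval."""
    if not 0 < sample_size <= len(sorted_blocks):
        raise ValueError("sample_size must be between 1 and the universe size")
    interval = len(sorted_blocks) / sample_size   # universe / desired sample size
    start = rng.random() * interval               # random start in first interval
    last = len(sorted_blocks) - 1                 # guard against float rounding
    picks = [min(int(start + i * interval), last) for i in range(sample_size)]
    return [sorted_blocks[p] for p in picks]

def append_suffix(stem, rng):
    """Append a 2-digit random number between 00 and 99 to a 100-block stem."""
    return f"{stem}{rng.randrange(100):02d}"

rng = random.Random(2013)  # fixed seed so the example is repeatable

# Hypothetical universe of (county code, carrier name, 8-digit 100-block stem),
# sorted by county code, carrier name and 100-block
universe = sorted(
    (county, carrier, f"{area}{exchange}{block:02d}")
    for county, area in (("001", "212"), ("002", "646"))
    for carrier, exchange in (("CarrierA", "555"), ("CarrierB", "556"))
    for block in range(25)
)

selected = systematic_sample(universe, 10, rng)
numbers = [append_suffix(stem, rng) for _, _, stem in selected]
```

Because the list is sorted before selection, stepping through it at a fixed interval spreads the sample across counties and carriers in proportion to their share of 100-blocks.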
Unlike sampled landlines, sampled wireless phone numbers are not pre-dialed via non-ringing auto-dialer, and are hand-dialed by Abt SRBI interviewers.
Sample Management and Interviewing
As noted, ABC’s approach is to view these landline and wireless sampling frames as mutually exclusive; the purpose of the wireless frame is to address the landline frame’s non-coverage of cell-only households, using the most recent estimates from the federal government’s in-person National Health Interview Survey.
Abt SRBI draws wireless sample proportionate to its distribution in the country’s four U.S. Census regions, per NHIS data. Respondents are screened for cell-only status; those with landlines they use to take calls are not interviewed by cell phone, since they are covered in the separate landline frame. Cell-only respondents’ place of residence is checked and their Census region adjusted accordingly, if necessary.
Landline sample is drawn proportionate to its estimated distribution in the country’s nine Census divisions.
In each sample, phone numbers are released for interviewing in replicates by Census region (cell) or division (landline) to allow for sample control. Numbers are called multiple times during the field period in multi-night polls. Interviews are conducted via a computer-assisted telephone interviewing (CATI) system. Abt SRBI’s professional interviewers, and their supervisors, are extensively trained in interviewing practices, including techniques designed to achieve the highest possible respondent cooperation.
Cell-only respondents are not offered compensation. A reimbursement check is offered if use of minutes is raised as an objection and the respondent subsequently supplies his or her mailing address; on average this occurs in six cases per survey, out of a current per-survey sample of approximately 200 cell-only respondents.
Final data are weighted using demographic information from the U.S. Census to adjust for sampling and non-sampling deviations from population values. Until 2008 ABC News used a cell-based weighting system in which respondents were classified into one of 48 or 32 cells (depending on sample size) based on their age, race, sex and education; weights were assigned so the proportion in each cell matched the Census Bureau’s most recent Current Population Survey. To achieve greater consistency and reduce the chance of large weights, ABC News in 2007 tested and evaluated iterative weighting, commonly known as raking or rim weighting, in which the sample is weighted sequentially to Census targets one variable at a time, continuing until the optimum distribution across variables (again, age, race, sex and education) is achieved. ABC News adopted rim weighting in January 2008. Weights are capped at lows of 0.2 and highs of 6.
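For readers unfamiliar with raking, a minimal sketch in Python follows. It is not ABC's production weighting: it uses two variables rather than four, the respondents and population targets are invented, and it assumes the caps are applied after the raking converges, a detail not specified above.

```python
def rake(respondents, targets, max_iter=100, tol=1e-6):
    """Iterative proportional fitting (raking / rim weighting).

    respondents: list of dicts, e.g. {"sex": "F", "educ": "college"}
    targets:     {variable: {category: population share}}, shares sum to 1
    Returns one weight per respondent, scaled to average 1.0.
    """
    n = len(respondents)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():       # one variable at a time
            total = sum(weights)
            current = {cat: 0.0 for cat in shares}
            for r, w in zip(respondents, weights):
                current[r[var]] += w / total      # current weighted share
            for i, r in enumerate(respondents):
                factor = shares[r[var]] / current[r[var]]
                weights[i] *= factor              # match this variable's target
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:                       # all variables already on target
            break
    mean = sum(weights) / n
    return [w / mean for w in weights]

def cap(weights, low=0.2, high=6.0):
    """Trim weights to the stated bounds (assumed here to be a final step)."""
    return [min(max(w, low), high) for w in weights]

# Hypothetical sample of 10 and hypothetical population targets
sample = (
    [{"sex": "M", "educ": "college"}] * 4
    + [{"sex": "M", "educ": "no college"}] * 2
    + [{"sex": "F", "educ": "college"}] * 3
    + [{"sex": "F", "educ": "no college"}] * 1
)
targets = {
    "sex": {"M": 0.48, "F": 0.52},
    "educ": {"college": 0.30, "no college": 0.70},
}

raked = rake(sample, targets)
final = cap(raked)

def weighted_share(var, cat, weights):
    total = sum(weights)
    return sum(w for r, w in zip(sample, weights) if r[var] == cat) / total
```

Each pass corrects one variable and slightly disturbs the others; repeating the passes brings all the marginal distributions to their targets at once, without requiring a population figure for every cross-classified cell.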
In procedures since the start of the dual-frame design, cell-only and landline samples first are weighted by Census region to their respective proportions of the population (per NHIS cell-only estimates). The combined sample is then rim-weighted to full-population Census parameters for age, race, sex and education. A post-weight is applied to the cell-only sample if needed to correct its final proportion within the full sample.
Surveys commonly are weighted to the number of telephone lines in each respondent’s home to adjust for the higher probability of selection of multiple-line households. ABC News has studied the effect of such weighting (Merkle and Langer, Public Opinion Quarterly, Vol. 72 No. 1, Spring 2008), concluding that it carries the risk of distortion and, when done properly, has no meaningful impact on the data. ABC News polls therefore are not weighted to the number of household phone lines.
Poll results may deviate from full population values because they rely on a sample rather than a census of the full population. Sampling error can be calculated when probability sampling methods, such as those described here, are employed, using the standard formula (at the 95 percent confidence level) of (SQRT(.25/sample size))*1.96. There can be other sources of differences in polls, such as question wording and order, design effect from clustering in an area probability sample, systematic non-coverage or selection bias.
As a function of sample size, sampling error is higher for subgroups. We analyze subgroups only as small as 100 cases (or very near it), for which the error margin is 10 percentage points. See our fuller description of sampling error here.
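As a worked illustration of the formula given above, the short Python sketch below computes the error margin for a few sample sizes. The function name is ours; the calculation is simply (SQRT(.25/sample size))*1.96.

```python
import math

def margin_of_error(sample_size, z=1.96):
    """Sampling error at the 95 percent confidence level,
    using the formula sqrt(.25 / sample size) * 1.96."""
    return math.sqrt(0.25 / sample_size) * z

for n in (1000, 500, 100):
    print(n, round(100 * margin_of_error(n), 1))
# prints:
# 1000 3.1
# 500 4.4
# 100 9.8
```

The result for 100 cases, 9.8 points, is the basis for the 10-point figure cited for the smallest subgroups we analyze.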
A survey’s response rate represents its contact rate (the number of households reached out of total telephone numbers dialed, excluding an estimate of nonworking and business numbers) multiplied by its cooperation rate (the number of individuals who complete interviews out of total households reached).
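The arithmetic can be illustrated with a short Python sketch. The disposition counts are invented for the example and the function and parameter names are ours; actual calculations depend on how the nonworking and business numbers are estimated.

```python
def response_rate(dialed, est_ineligible, households_reached, completes):
    """Return (contact rate, cooperation rate, response rate).

    est_ineligible is the estimated count of nonworking and business
    numbers among those dialed.
    """
    eligible = dialed - est_ineligible
    contact_rate = households_reached / eligible
    cooperation_rate = completes / households_reached
    return contact_rate, cooperation_rate, contact_rate * cooperation_rate

# Hypothetical dispositions: 10,000 numbers dialed, 4,000 estimated
# nonworking or business, 3,600 households reached, 1,800 interviews
contact, cooperation, response = response_rate(10_000, 4_000, 3_600, 1_800)
# contact is 0.60, cooperation is 0.50, response is 0.30
```

Note how sensitive the result is to the ineligibility estimate: assuming fewer nonworking and business numbers enlarges the denominator of the contact rate and lowers the response rate, with no change in the interviews completed.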
It cannot be assumed that a higher response rate in and of itself ensures greater data integrity. By including business-listed numbers, for instance, ABC News increases coverage, yet decreases contact rates (and therefore overall response rates). Adding cell-only phones also increases coverage but lessens response rates. On the other hand, surveys that, for instance, do no within-household selection, or use listed-only samples, will increase their cooperation or contact rates (and therefore response rates), but at the expense of random selection or population coverage. (For a summary see Langer, 2003, Public Perspective, May/June: 16-8.)
Research has found no significant attitudinal biases as a result of response rate differences. A study published in 2000, “Consequences of Reducing Nonresponse in a National Telephone Survey” (Keeter, Miller, Kohut, Groves and Presser, POQ 64:125-48), found similar results in surveys with 61 and 36 percent response rates. A follow-up in 2006, “Gauging the Impact of Growing Nonresponse on Estimates from a National RDD Telephone Survey” (Keeter, Kennedy, Dimock, Best and Craighill, POQ 70:759-79), based on surveys with 50 and 25 percent response rates, again found “little to suggest that unit nonresponse within the range of response rates obtained seriously threatens the quality of survey estimates.” As far back as 1981, in “Questions & Answers in Attitude Surveys,” Schuman and Presser, describing two samples with different response rates but similar results, reported (p. 332), “Apparently the answers and associations we investigate are largely unrelated to factors affecting these response rate differences.”
In spring 2003 ABC News and the Washington Post produced sample dispositions for five randomly selected ABC/Post surveys at the request of Prof. Jon Krosnick, then of Ohio State University, for use in a study of response rates. The cooperation rate calculations produced by Krosnick’s team for these five surveys ranged from 43 to 62 percent, averaging 52 percent; response rates ranged from 25 to 32 percent, based on what AAPOR describes as a “very conservative” estimate of the number of business and nonworking numbers in the sample (known as “e”). The range was 31 to 42 percent using a more common estimate of this variable proposed by Keeter et al. in 2000.
In their study (“The Causes and Consequences of Response Rates in Surveys by the News Media and Government Contractor Survey Research Firms,” in Advances in Telephone Survey Methodology, Chapter 23, Wiley 2007), Holbrook, Krosnick and Pfent concluded, “lower response rates seem not to substantially decrease demographic representativeness within the range we examined. This evidence challenges the assumptions that response rates are a key indicator of survey quality.”
Pre-election polling presents particular challenges. As Election Day approaches these polls are most relevant and accurate if conducted among voters. Yet actual voters are an unknown population – one that exists only on (or, with absentees, shortly before) Election Day. Pre-election polls make their best estimate of this population.
Our practice at ABC News is to develop a range of “likely voter” models, employing elements such as self-reported voter registration, intention to vote, attention to the race, past voting, age, respondents’ knowledge of their polling places and political party identification. We evaluate the level of voter turnout produced by these models and diagnose differences across models when they occur.
The use of political party identification in likely voter models is a subject of debate among opinion researchers. It’s used commonly by campaign pollsters, less so among academic researchers. After extensive evaluation ABC News has employed party ID as a factor in some likely voter models for our general election tracking polls, chiefly to adjust for trendless night-to-night variability in political partisanship. (A tracking poll is a series of consecutive, one-night standalone polls reported in a multi-night rolling average.)
ABC News has presented detailed evaluations of our 2000, 2004 and 2008 tracking polls at polling conferences and in published work (Langer and Merkle 2001; Merkle, Langer and Lambert 2005; also in “Public Opinion Polling in a Globalized World,” Springer 2008; Langer et al. 2009) showing that party ID factoring had little effect on our estimates of vote preferences.
With thanks to Linda Piekarski of SSI for review and comment.