Note: this piece originally appeared in Adweek
By now, everyone can agree: done right, data-based insights can drive smarter, more personalized touchpoints with the people who matter, and improve the overall quality of business decisions. For this to work, though, companies need accurate, trustworthy data. As we like to say at my organization, “garbage in, garbage out.”
About a year ago, Eric Keating from Zaius wrote for Adweek that “smart marketers should turn to first-party data first.” Since then, the walls of the walled gardens have grown higher, and the legislative landscape has gotten more complex. If smart marketers needed to turn to first-party data in 2018, then in 2019 it's a must-have for any consumer-facing business.
It sounds both simple and obvious: data that comes directly from the target audience is the most accurate, trustworthy data there is. But there's one aspect many aren't considering: gathering information from surveys is easy, while gathering unbiased information from surveys, a common source of first-party data, is really, really hard. All surveys invite bias, and it is the marketer's job to understand how the biases in their survey data affect the insights generated. All too often, this step is overlooked.
While survey science is complicated, there are some basic concepts that everyone on the marketing team should understand.
Social desirability bias: Where a respondent answers untruthfully because they are conscious of how their answers appear to others. People are more likely to lie about sensitive subjects, whether that’s downplaying something like drinking or overstating something like donations to charitable causes. If you work at, say, an alcoholic beverage manufacturer, or a marijuana company, make sure you ask whoever is running your survey how they mitigate this bias.
Mode bias: Where the way in which the information was collected (phone, online, mailed, 1:1 in-person, focus group) invites bias. There is no perfect mode of data collection, so it’s important to understand the benefits and limitations of each and choose accordingly.
Most companies use focus groups or online questionnaires. Focus groups are great for qualitative research: they can provide helpful feedback for UX designers, give insight into brand awareness, or offer context on community dynamics. But they also invite the social desirability bias mentioned above. Online surveys minimize social desirability bias, but without a moderator, respondents can get confused or lose focus and answer inaccurately. Internet coverage is also an important issue, as the population without Internet access may be systematically different from the population with it. If a public health agency wants to assess hardship in access to quality health care, a simple online survey won't suffice.
Coverage bias: Where you fail to survey a segment of the population that matters to your research question. What you want to learn should dictate who you talk to: if you're running a customer acquisition campaign, you need to survey people outside your customer base; if you want to understand customer churn, you must survey former customers; and if you're conducting a national survey, you need a representative sample of respondents from across the country.
Sampling error: A measure of statistical inaccuracy that decreases as the sample size grows, though not linearly (see the sketch below). Surveying 2,000 people may cost twice as much as surveying 1,000, but the statistical accuracy of the n=2,000 survey is not double that of the n=1,000 survey. When commissioning survey research, understand what sample size brings sampling error down to an acceptable level so you don't overpay.
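To make the non-linearity concrete, here is a minimal sketch using the standard margin-of-error approximation for an estimated proportion, assuming a simple random sample; the sample sizes are illustrations, not recommendations:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for an estimated proportion p
    from a simple random sample of size n (p=0.5 is the worst case)."""
    return z * math.sqrt(p * (1 - p) / n)

# Doubling the sample does not halve the error:
print(f"n=1,000: +/- {margin_of_error(1000):.1%}")  # roughly +/- 3.1%
print(f"n=2,000: +/- {margin_of_error(2000):.1%}")  # roughly +/- 2.2%
print(f"n=4,000: +/- {margin_of_error(4000):.1%}")  # roughly +/- 1.5%
```

Because the error shrinks with the square root of n, halving the margin of error requires roughly quadrupling the sample, which is why a bigger survey isn't always worth the budget.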
Response bias: Where those surveyed don't accurately represent everyone who matters. For example, if the population of interest skews female (e.g. makeup consumers) but the survey respondents skew male, the population estimates will be biased. On the other hand, a gender imbalance may not cause bias if men and women don't differ on the attitude or outcome in question (e.g. public transportation habits). It's difficult to recruit a respondent pool that exactly matches who you want to study; the mismatch is corrected with weighting, which requires a statistical background. Make sure any survey provider offers proper weighting (a simple version is sketched below).
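For intuition, here is a toy sketch of post-stratification weighting, one common correction; the gender shares and intent rates are invented for illustration, not drawn from any real study:

```python
# Toy post-stratification: weight each respondent group so the weighted
# sample matches known population shares (all numbers are hypothetical).
population_share = {"female": 0.70, "male": 0.30}  # who we want to represent
sample_share     = {"female": 0.40, "male": 0.60}  # who actually responded

# Each respondent's weight is their group's population share divided by
# its sample share: under-represented groups count for more.
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)  # {'female': 1.75, 'male': 0.5}

# Hypothetical purchase-intent rates observed in the survey, by group:
intent = {"female": 0.62, "male": 0.48}

naive    = sum(sample_share[g] * intent[g] for g in intent)      # unweighted
weighted = sum(population_share[g] * intent[g] for g in intent)  # weighted
print(f"naive: {naive:.0%}, weighted: {weighted:.0%}")  # 54% vs 58%
```

Real weighting schemes balance many variables at once (age, region, education, and so on), which is where the statistical expertise comes in.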
Problematic questions: Where questions are asked in a way that leads to biased responses. Some are confusing or reliant on memory (e.g. “How many hamburgers did you eat last year?”), some are leading (e.g. “You support tax cuts for the middle class, don't you?”), and sometimes there are simply too many of them, so the respondent speeds through without giving careful thought to each question, just to get to the end. We see responses begin to drop off dramatically when survey length exceeds 11-23 minutes, depending on the mode. To make matters more complex, our research shows almost no relationship between what people “like” and what actually changes their minds (preference and persuasion are different).
A talent manager friend once showed me how to spot paid product placements in magazine photos. It was like taking the Matrix’s “red pill” — it’s on my radar now and I can’t look at People Magazine the same way. Similarly, remain vigilant for these signs of survey bias, even if your team doesn’t have the budget for a social scientist. The more important first-party data becomes, the more critical it is for entire marketing teams to help spot potential biases.
Ellen Houston is the Managing Director of Applied Data Science for Civis Analytics.