What are we weighting for?
1/27/09 / Beth Mulligan
Let’s say that you just conducted a public survey of your community for a community needs assessment. In your community, 29% of residents are between the ages of 18 and 34, and 29% are age 55 or older. Yet among your survey respondents, only 8% are aged 18-34, while 52% are aged 55 or older (an extreme, but not impossible, pattern to find in a random-digit-dial (RDD) telephone survey where no effort is made to request younger respondents in each household).
Now, let’s say that one of the issues your community is trying to assess is whether to allocate funding to the parks and recreation department for developing an athletic field to be used for ultimate frisbee, soccer, and other sports leagues. And let’s say that most of the 18-34 year olds in your survey rated funding for this field as their highest priority among a set of alternatives, while none of the respondents aged 55 or older rated it their highest priority.
How should you determine the overall amount of support for an athletic field?
If you simply calculate the percentage of survey respondents who rated the field as their first priority, you may conclude that 18% rate the new field their top priority for the community (including most of the young people plus some people in each of the other age categories – see table below).
However, does that figure really represent the opinions of your community as a whole? Look at the differences between the demographic makeup of your survey respondents and the demographic makeup of your community. Are they pretty different? If so, the best way to determine what your community thinks of the athletic field is to weight the survey responses based on the community demographics. So, respondents aged 18-34 get larger weights because they are underrepresented in the sample, while respondents aged 55 and older get smaller weights because they are overrepresented. Then you tabulate the weighted responses for the survey. The weighted results may show that not 18%, but 37% of community residents rate the new athletic field as their top priority.
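If you’re curious how that arithmetic works, here is a minimal sketch of the weighting step in Python. The 18-34 and 55+ shares come from the example above; the middle (35-54) shares and the counts of “top priority” answers are illustrative assumptions chosen so that the totals come out to 18% and 37%, not data from an actual survey.

```python
# A minimal sketch of post-stratification weighting.
# Community and respondent shares for 18-34 and 55+ come from the example;
# the 35-54 shares and the "top priority" counts are illustrative assumptions.

community_share = {"18-34": 0.29, "35-54": 0.42, "55+": 0.29}
respondent_share = {"18-34": 0.08, "35-54": 0.40, "55+": 0.52}

# Each respondent's weight is their group's share of the community divided
# by its share of the respondents: underrepresented groups get weights
# above 1, overrepresented groups get weights below 1.
weights = {g: community_share[g] / respondent_share[g] for g in community_share}
# -> roughly {"18-34": 3.63, "35-54": 1.05, "55+": 0.56}

# Hypothetical respondent-level data from 100 interviews:
# (age group, rated the athletic field their top priority?)
respondents = (
    [("18-34", True)] * 7 + [("18-34", False)] * 1
    + [("35-54", True)] * 11 + [("35-54", False)] * 29
    + [("55+", False)] * 52
)

# Unweighted support: the plain share of respondents who ranked it first
unweighted = sum(top for _, top in respondents) / len(respondents)

# Weighted support: the same tally, but each answer counts in proportion
# to that respondent's weight
weighted = (sum(weights[g] for g, top in respondents if top)
            / sum(weights[g] for g, _ in respondents))

print(f"unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
# unweighted: 18%, weighted: 37%
```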
Does the difference between 18% and 37% matter to your community? Let’s say the breakdown of support for a list of five initiatives looked like this:
If you went to your city council with the results in the first column, it would seem that there was a pretty clear mandate to allocate funds for a senior center in your community. However, if you went to your city council with the results in the second column, it would seem that there is about equal support for an athletic field or a senior center among residents in the community. Since the unweighted results are based on a significantly greater percentage of people aged 55 and older than there are in the community as a whole, and significantly fewer younger adults, using the raw responses would be a particularly biased way to determine the city’s priorities.
When your community does a survey, whether you’re a community member or a council decision maker, you want the results to tell you about your actual community. Weighting the survey respondents based on the demographic characteristics of the community (usually by age and gender, at least) is the only way to ensure this. Keep in mind that if your survey sample is substantially different from the actual community, weighting can also have a relatively large effect on the margin of error for your results, and it’s important to adjust it accordingly.
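One common way to quantify that adjustment is Kish’s approximation for the design effect of unequal weights, sketched below using the illustrative weights from the example above. This is just one approximation your analysts might use, not a prescription.

```python
import math

def kish_design_effect(weights):
    """Kish's approximation: deff = n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

# Per-respondent weights from the illustrative example above:
# 8 younger respondents, 40 middle-aged, 52 older
w = [0.29 / 0.08] * 8 + [0.42 / 0.40] * 40 + [0.29 / 0.52] * 52

deff = kish_design_effect(w)      # about 1.65
effective_n = len(w) / deff       # about 60 of the 100 interviews

# Margin of error at 95% confidence for a proportion near 50%,
# before and after accounting for the unequal weights
moe_raw = 1.96 * math.sqrt(0.5 * 0.5 / len(w))
moe_weighted = moe_raw * math.sqrt(deff)
print(f"effective sample size: {effective_n:.0f}")
print(f"margin of error: ±{moe_raw:.1%} raw, ±{moe_weighted:.1%} after weighting")
```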
It’s important to realize that even in surveys where you’ve drawn a simple random sample of the population, the respondent demographics will nevertheless differ somewhat from the community demographics, because demographic groups differ in their willingness to participate in surveys. You cannot assume your sample is representative of the population just because you tried to reach a simple random sample of it. Make sure your research team uses appropriate methodology so that you’re getting results about your actual community. What are you waiting for?