Sampling and Surveys: How Trust Ratings Relate to Reality
It's actually a lot like eating pizza.
(SPOT.ph) Surveys are powerful tools for gathering data, usually from big groups of people. Survey questionnaires can measure different variables, both quantitative and qualitative, such as satisfaction scores or personal details like age, sex, and educational background. In national contexts, surveys are often used to take the pulse of public sentiment on issues, political figures, and candidates.
While surveys give us information to make better decisions, not all surveys are designed or implemented equally well. What we usually see in the news are survey findings already summarized and neatly packaged for us, but this doesn’t mean we shouldn’t question the results and the methodology used to arrive at them. Faced with percentages and jargon you might not understand, you might find yourself overwhelmed and asking, “How can a sample of a mere thousand represent millions? How is this sample even chosen? How do we know how reliable the survey is? What does it really tell us?”
To answer these questions, we need to understand a very important concept in statistics called sampling.
Sampling is a lot like eating pizza. You don’t need to eat the whole pie to know what it tastes like; more often than not, a slice will suffice. However, not all slices are created equal. In getting a taste of the pizza, you don’t want a slice that’s mostly crust, nor do you want one cut right from the middle. You also don’t want the particular slice that contains most of the pepperoni. (Now that would just be too greedy.) These kinds of slices, while part of the pizza, do not best represent it. In some cases, even a small slice is enough, and tasting any more of the pizza would just be redundant.
Like taking a slice out of a pizza, sampling is a statistical way of representing the population, or the group that contains all the elements, through a derived subset called the sample. We do not need to measure the whole population or every single element within it to get a reliable and fairly accurate sense of what it looks like. We conduct sampling because a census, or an assessment of the whole population, would require far more resources such as time, money, and effort. Meanwhile, sampling, when done properly, provides us enough information to estimate and make reasonable conclusions and decisions regarding the population.
There are two main types of sampling: probability and non-probability sampling. Probability sampling entails a random selection of elements within a population for a sample, such that the chance of each element being selected can be computed. This sampling method is widely preferred because random selection gives us the best chance of having a sample that accurately represents the population.
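To make the idea of a computable selection chance concrete, here is a minimal Python sketch of simple random sampling, one common form of probability sampling. The population of 100,000 people, the 60% who like pineapple on pizza, and the function name simple_random_sample are all made up for illustration; they are not from any real survey.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Draw n elements without replacement; each element has the same chance, n / N."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return rng.sample(population, n)


# Hypothetical population (an assumption for this sketch):
# 100,000 people, 60% of whom like pineapple on pizza (1 = likes it, 0 = does not).
population = [1] * 60_000 + [0] * 40_000

n = 1_000
sample = simple_random_sample(population, n)

# Under simple random sampling the selection chance is known in advance.
inclusion_probability = n / len(population)
sample_share = sum(sample) / n

print(f"Chance of any one person being picked: {inclusion_probability:.2%}")
print(f"Share in favor in the sample: {sample_share:.1%} (population: 60.0%)")
```

Even though only 1 in 100 people is picked, the share in the sample should land close to the 60% in the population, because every person had the same chance of ending up in the slice.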
In non-probability sampling, on the other hand, we cannot compute the probability of an element being selected because members of the sample are chosen subjectively. While this method is less costly in terms of money, time, and effort, it introduces bias to the sample, which will skew our results. Statistical bias differs from the everyday sense of “bias”: it isn’t just an inclination toward a certain side, but rather “the tendency of a statistic to overestimate or underestimate a parameter.” This means the sample does not represent the population as accurately as one drawn through probability sampling would, which poses a problem for conclusions drawn from the findings. For example, online surveys can be biased due to self-selection, because individuals volunteer to be part of the sample. One must be cautious in generalizing from a biased sample, since it may not validly capture what the population looks like.
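The self-selection problem can be sketched with a short simulation. Everything here is an assumption made for illustration: the population is the same made-up one in which 60% like pineapple on pizza, and the response rates (30% for fans, 10% for everyone else) are invented to show what happens when people with stronger feelings are more likely to answer an online poll.

```python
import random

rng = random.Random(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 60% like pineapple on pizza (1), 40% do not (0).
population = [1] * 60_000 + [0] * 40_000
true_share = sum(population) / len(population)


def volunteers(person):
    """Assumed behavior: fans answer the online poll more often than non-fans."""
    response_rate = 0.30 if person == 1 else 0.10  # made-up rates
    return rng.random() < response_rate


# The "sample" is whoever chose to respond, not a random draw.
online_sample = [person for person in population if volunteers(person)]
online_share = sum(online_sample) / len(online_sample)

print(f"True share in favor:        {true_share:.1%}")
print(f"Share among online answers: {online_share:.1%}")
```

With these assumed response rates, roughly 82% of the respondents are fans even though only 60% of the population is. The poll has tens of thousands of respondents and still overestimates the parameter, which is exactly what statistical bias means: a bigger sample does not fix a skewed selection process.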
Another key difference between probability and non-probability sampling is that the margin of error and the confidence level cannot be computed for a non-probability sample, which is another of its weaknesses. The margin of error and confidence level
For example, suppose a survey is conducted to find out whether Filipinos are for or against pineapples on pizza. If the survey results show that 60% are for pineapples on pizza with a ±3% margin of error, we can reasonably expect the true share of Filipinos who are for pineapples on pizza to fall between 57% and 63%. The confidence level, on the other hand, tells us how often a range computed this way would capture the true figure if the study were repeated. If the confidence level is 95% (a value commonly used for most surveys), we can expect about 95 out of 100 repetitions of the survey sampling to produce a range that contains the true population value. In general, we want to keep our margin of error low and our confidence level high, while also taking into consideration the resources used in conducting the survey. In non-probability sampling, these two measures are absent and thus, the generalizability of the sample findings is more difficult to assess.
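For readers who want to see where a figure like ±3% can come from, the sketch below uses the standard normal-approximation formula for a proportion from a simple random sample: margin of error = z × sqrt(p × (1 − p) / n). The sample size of 1,000 is an assumption (the example above does not state one), and real surveys with more complex sampling designs may compute their margins differently. The second function repeats a made-up survey many times to illustrate what a 95% confidence level means.

```python
import math
import random
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a proportion from a simple random sample."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Assumed sample size of 1,000 respondents, 60% in favor.
moe = margin_of_error(0.60, 1_000)
print(f"60% ± {moe:.1%}")


def coverage(true_p=0.60, n=1_000, repetitions=1_000, seed=1):
    """Repeat a hypothetical survey and count how often the interval holds the true share."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        in_favor = sum(rng.random() < true_p for _ in range(n))
        p_hat = in_favor / n
        error = margin_of_error(p_hat, n)
        if p_hat - error <= true_p <= p_hat + error:
            hits += 1
    return hits / repetitions


print(f"Intervals containing the true value: {coverage():.1%}")
```

Under these assumptions the margin of error comes out to about ±3.0%, and close to 95% of the repeated surveys produce a range that contains the true 60%. Note that the size of the population does not appear in the formula at all, which is why a well-drawn sample of around a thousand can speak for millions.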
Sampling is one of the fundamental concepts in statistics and has enabled us to know more about our world in more