A frequent issue in weighting is how to handle non-response in the sample when the target proportions add back to the total population. For example, we may wish to weight a sample to 51% female and 49% male, but what do we do with the 5% of respondents for whom no gender is indicated? This is a particularly vexing problem in sample balancing, where some variables may have non-response and others not.
A weight is the ratio of a cell's proportion in the population to its proportion in the sample. For a sample to be representative, probability theory requires that every member of the population have a known, non-zero probability of selection. This means that every case must have a non-zero weight and that the weighted cases must add back to 100% of the projected total. This is why QBAL will not allow you to weight cells with data to zero or to assign non-zero weights to cells with no data, and why it gives an error message if either situation occurs.
One possible solution is imputation, where non-response to a variable is changed to the "most likely" response, based on logical or statistical analysis of other information in the data. While imputation is widely used in media research and experimental design, it is often not practical, or even possible, in ad hoc surveys. Furthermore, some statisticians object to imputation on principle because it modifies the data using arbitrary criteria.
A more common solution is to adjust the target weights for each item by the proportion of the sample responding to that variable. This has the advantage that when responses are percentaged on the total answering, the weighted distribution matches the targets supplied, yet when responses are percentaged on the total sample, non-response appears in the proportion actually obtained from the sample.
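The arithmetic behind this adjustment can be sketched in a few lines of Python. This is only an illustration using the gender example from the opening paragraph; the function name adjust_targets and its arguments are hypothetical and are not part of QBAL or QTAB.

```python
def adjust_targets(targets_pct, nonresponse_pct):
    """Rescale targets stated on the answering base to the total-sample base.

    targets_pct: target percentages that sum to 100 among those answering.
    nonresponse_pct: percent of the total sample giving no answer.
    Returns the adjusted targets followed by the non-response share.
    """
    response_rate = (100.0 - nonresponse_pct) / 100.0
    adjusted = [t * response_rate for t in targets_pct]
    return adjusted + [nonresponse_pct]


# Gender example: targets of 51% female and 49% male, 5% gender not indicated.
adjusted = adjust_targets([51.0, 49.0], 5.0)
print([round(a, 2) for a in adjusted])
```

With 5% non-response, the targets become 48.45% female and 46.55% male on the total sample, and the three entries again add to 100%.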
When QBAL is used to compute weights and insert them in a data file, the list of target proportions (supplied as a numberset in the spec file) always has one more entry than the number of cells defined in the corresponding variable. The last numberset value is used to weight cases with non-response, and must be zero if all cases are explicitly accounted for in the variable specs.
If a QBAL variable does not account for everyone as specified, you can change the last entry in the numberset to a non-zero value to force computing a weight for the non-response cases, or (preferably) explicitly specify another cell to catch the non-response and add that value to the numberset before the final zero. Either way, you need to know the number of non-responses to the variable and you need to adjust the target weights to take them into account.
QTAB can be used to tabulate the non-response and to adjust a set of target weights in a single step. qbadjust.zip contains the specs, data and listings for the QTAB run that produced the table shown below.
QBADJUST on file: qt32demo.dat - Apr 25 2001, 14:30

TABLE SIZE
Size of Company

                       Case     % of       % of  Original  Adjusted
                      Count    Total  Answering    Target    Target
Total                  1065  100.00%          -         -   100.00%
Total Answering        1008   94.65%    100.00%         -    94.65%
Under $1 million        259   24.32%     25.69%    35.00%    33.13%
$1-$10 million          418   39.25%     41.47%    30.00%    28.39%
$10-$50 million         232   21.78%     23.02%    20.00%    18.93%
Over $50 million         99    9.30%      9.82%    15.00%    14.20%
No Answer                57    5.35%          -         -     5.35%
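As a rough check on the figures in this table, the sketch below recomputes the adjusted targets from the case counts and then derives a weight for each cell as adjusted target share divided by sample share. It assumes simple single-variable cell weighting, which is not necessarily how QBAL computes weights when balancing several variables at once, and the function name cell_weights is hypothetical.

```python
# Case counts and original targets (percent of those answering), from the table.
counts = {
    "Under $1 million": 259,
    "$1-$10 million": 418,
    "$10-$50 million": 232,
    "Over $50 million": 99,
}
targets_pct = {
    "Under $1 million": 35.0,
    "$1-$10 million": 30.0,
    "$10-$50 million": 20.0,
    "Over $50 million": 15.0,
}
no_answer_count = 57


def cell_weights(counts, targets_pct, no_answer_count):
    """Return one weight per cell: adjusted target share / sample share."""
    answering = sum(counts.values())
    total = answering + no_answer_count
    response_rate = answering / total
    weights = {}
    for cell, n in counts.items():
        adjusted_share = targets_pct[cell] / 100.0 * response_rate
        weights[cell] = adjusted_share / (n / total)
    # The "No Answer" target equals its own sample share, so its weight is 1.
    weights["No Answer"] = 1.0
    return weights


weights = cell_weights(counts, targets_pct, no_answer_count)
all_counts = {**counts, "No Answer": no_answer_count}
weighted_total = sum(weights[cell] * n for cell, n in all_counts.items())

for cell, w in weights.items():
    print(f"{cell:<18}{w:8.4f}")
print(f"Weighted total: {weighted_total:.1f}")
```

Every cell with data receives a non-zero weight, and the weighted cases add back to the 1065 cases in the sample, which is the requirement described earlier.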
These QTAB specs were designed as a template so that a user can replace the sample variables and make minimal changes (fully documented in the spec file) in order to generate similar tables for their own data sets. Since QBAL and QTAB share a common specification language, QBAL variables and numbersets transfer easily to QTAB.
Note that the "No Answer" row is not explicitly specified in these QTAB variables, but is generated after the data are tabulated. When transferring the results to QBAL, you must take this into account, either by explicitly adding a cell for non-response or by terminating the numberset with the adjusted "No Answer" proportion from the table instead of "0".
Explicitly adding the non-response cell, with its own entry in the numberset, is preferable because it documents the adjustment in the specs.