

SUBJECT: STANDARD FOR ESTIMATING SAMPLING ERROR 

EFFECTIVE DATE: 03/16/87 CES STANDARD 870502 







PURPOSE: To establish procedures for developing estimation methods that can be used to generate standard errors for key statistics to be studied or reported in CES publications. 








o During the sample design phase for any survey to be conducted by or for the Center, consideration should be given to the methods to be used for variance estimation. If variances are to be calculated directly on the statistics to be studied, the variance estimators should be developed as part of the sample design. If variances are not calculated directly, but instead are calculated using a replication technique, the formation of replicates should be done as part of the sample design and the variance estimators that use the replicates should be developed as part of the sample design. 








o Where possible, estimation should make use of other data from the survey, from prior surveys, or from administrative records or censuses to minimize the variance of the survey estimates. Ratio or regression estimators should be used whenever they reduce the variance of the statistics being estimated without substantially complicating the analysis. 
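
As an illustration of the ratio estimation described above, the following is a minimal sketch only. It assumes a simple random sample with equal weights and an auxiliary total known from a prior census; the function name, variable names, and data values are hypothetical and do not come from this standard.

```python
from typing import Sequence


def ratio_estimate_total(y: Sequence[float], x: Sequence[float],
                         x_population_total: float) -> float:
    """Ratio estimate of the population total of y.

    y: survey values for the sampled units.
    x: auxiliary values for the same units (for example, from a prior survey).
    x_population_total: known population total of x (for example, from a census).
    Assumes equal sampling weights; a real design would apply its own weights.
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    sum_x = sum(x)
    if sum_x == 0:
        raise ValueError("auxiliary values must not sum to zero")
    r_hat = sum(y) / sum_x            # estimated ratio of y to x
    return r_hat * x_population_total  # scale up by the known total of x


# Hypothetical data: current and prior-year enrollment for four sampled schools.
enrollment_now = [120.0, 340.0, 95.0, 410.0]
enrollment_prior = [110.0, 330.0, 100.0, 400.0]
prior_census_total = 50000.0

estimate = ratio_estimate_total(enrollment_now, enrollment_prior,
                                prior_census_total)
print(round(estimate, 1))
```

The gain in precision depends on how strongly the survey variable is correlated with the auxiliary variable; the sketch does not measure that gain.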








o If replication techniques are used, research should be done as part of the sample design to determine the number of random groups to be used and the stability of the estimators for the statistics being estimated. As a rule of thumb, jackknife estimators and balanced repeated replication are preferred over random groups estimators because they are more stable in most cases and because jackknife estimators can eliminate the first-order bias in ratio or regression estimates. 
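
The jackknife approach mentioned above can be sketched as follows. This is an illustration only: it uses a delete-one jackknife and assumes a simple random sample drawn with replacement, whereas a stratified or clustered design would form replicates by dropping primary sampling units within strata. The function names and data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def jackknife(sample: Sequence[T],
              statistic: Callable[[List[T]], float]) -> Tuple[float, float]:
    """Return (bias-corrected estimate, jackknife variance estimate)."""
    items = list(sample)
    n = len(items)
    if n < 2:
        raise ValueError("at least two observations are required")
    full = statistic(items)
    # Recompute the statistic n times, each time leaving one unit out.
    leave_one_out = [statistic(items[:i] + items[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    # This combination removes the first-order bias term of the statistic.
    bias_corrected = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    return bias_corrected, variance


def ratio(pairs: List[Tuple[float, float]]) -> float:
    """Ratio of the sum of y to the sum of x over (y, x) pairs."""
    return sum(y for y, _ in pairs) / sum(x for _, x in pairs)


# Hypothetical (y, x) pairs for four sampled units.
pairs = [(120.0, 110.0), (340.0, 330.0), (95.0, 100.0), (410.0, 400.0)]
ratio_jk, ratio_var = jackknife(pairs, ratio)
print(ratio_jk, ratio_var ** 0.5)  # estimate and its standard error
```

For the sample mean, this variance estimate reduces to the familiar sample variance divided by n, which gives a convenient check on the implementation.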








o When a large number of statistics are estimated and presented in a survey report, it may be more practical to model the variances and present the model in summary form. This can be done by presenting the relationship between the relative variances and the values being estimated as a hyperbola called a GATT curve. A regression is run with the relative variance as the dependent variable and the reciprocal of the estimate for which the variance is calculated as the independent variable. A broad selection of the variables to be reported on in the survey is used in the regression to establish the relationship, and different GATT curves can be calculated for different subgroups of the population to reflect both different sample sizes and different design effects. The closeness of the fit of the GATT curve should be tested before the GATT curve is reported in a publication; a poorly fitting curve gives a bad approximation to the actual variances and should not be reported. 
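
The regression described above can be sketched as follows. The sketch assumes the two-parameter form relvariance = a + b / estimate, fitted by ordinary least squares, and uses R-squared as one possible measure of closeness of fit. This standard does not prescribe a particular fit statistic or threshold, so both are assumptions, as are the function names and the data values.

```python
from typing import Sequence, Tuple


def fit_variance_curve(estimates: Sequence[float],
                       relvariances: Sequence[float]
                       ) -> Tuple[float, float, float]:
    """Fit relvariance = a + b / estimate; return (a, b, r_squared)."""
    if len(estimates) != len(relvariances) or len(estimates) < 3:
        raise ValueError("need at least three matched (estimate, relvariance) pairs")
    u = [1.0 / e for e in estimates]  # independent variable: reciprocal of estimate
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(relvariances) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, relvariances))
    b = s_uv / s_uu
    a = mean_v - b * mean_u
    ss_res = sum((vi - (a + b * ui)) ** 2 for ui, vi in zip(u, relvariances))
    ss_tot = sum((vi - mean_v) ** 2 for vi in relvariances)
    return a, b, 1.0 - ss_res / ss_tot


def predicted_standard_error(estimate: float, a: float, b: float) -> float:
    """Standard error implied by the fitted curve for a given estimate."""
    return estimate * (a + b / estimate) ** 0.5


# Hypothetical estimates and their directly computed relative variances.
estimates = [1000.0, 2000.0, 5000.0, 10000.0]
relvariances = [0.0060, 0.0035, 0.0020, 0.0015]

a, b, r_squared = fit_variance_curve(estimates, relvariances)
print(a, b, r_squared)
print(predicted_standard_error(4000.0, a, b))
```

Separate curves for population subgroups would be obtained by calling the fitting function once per subgroup with that subgroup's estimates and relative variances.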