# Statistics Colloquium: Dr. Tommy Wright

## US Census Bureau

Friday, September 8, 2017

11:00 AM - 12:00 PM

Mathematics/Psychology : 401

**Title:**

*No Calculation When Observation Can Be Made*

**Abstract**

For use alongside the general and complete observations that would be known from a full census, Kiaer (1895, 1897) presents a purposive “Representative Method” for sampling from a finite population to provide “…more penetrating, more detailed, and more specialized surveys…” Many credit this method with planting the seeds of the sampling methods now used to produce official social and economic statistics. At a time when nearly all official statistics were produced by censuses, Kiaer met strong opposition, especially from the statistician von Mayr, who said (in translation), “…no calculations when observations can be made.”

Neyman (1934) brought probability to this Representative Method using stratified random sampling. Probability makes it possible to express uncertainty about the results from the Representative Method and to say how good the results are. Neyman presents details for the well-known and widely used optimal allocation of the fixed sample size among the various strata to minimize sampling error. When sample sizes are rounded to integers from Neyman’s allocation, minimum sampling error is not guaranteed. Wright (2012) improves Neyman’s result with a simple derivation obtaining exact results that always yield integer sample size allocations while minimizing sampling error. Wright (2014, 2016, 2017) obtains exact integer optimal allocation results when there are mixed constraints on sample sizes for each stratum or when there are desired precision constraints. With exact optimal allocation, we demonstrate a decrease in needed sample size for the same precision using 2007 Economic Census data in the sample design for part of the subsequent Service Annual Survey.
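To make the rounding issue concrete, here is a minimal Python sketch, not taken from the abstract. It assumes stratified simple random sampling without replacement, an estimated population total, known stratum sizes `N` and standard deviations `S`, and at least one sampled unit per stratum. The integer allocation uses a greedy priority-value scheme: adding a unit to stratum h, which currently has m units, lowers the variance by N_h² S_h² / (m(m+1)), so each remaining unit goes to the stratum with the largest such value. The function names and the example numbers are hypothetical, and the sketch ignores the upper bound n_h ≤ N_h.

```python
import heapq
import math


def neyman_allocation(n, N, S):
    """Real-valued Neyman allocation: n_h = n * N_h * S_h / sum_k(N_k * S_k)."""
    weights = [N_h * S_h for N_h, S_h in zip(N, S)]
    total = sum(weights)
    return [n * w / total for w in weights]


def sampling_variance(alloc, N, S):
    """Variance of the estimated total: sum_h N_h^2 S_h^2 / n_h - sum_h N_h S_h^2."""
    return sum(
        N_h ** 2 * S_h ** 2 / n_h - N_h * S_h ** 2
        for n_h, N_h, S_h in zip(alloc, N, S)
    )


def exact_integer_allocation(n, N, S):
    """Greedy integer allocation (assumed scheme, see the lead-in).

    Starts with one unit per stratum, then hands out the remaining
    n - H units one at a time by largest priority value
    N_h * S_h / sqrt(m * (m + 1)), where m is the stratum's current size.
    """
    H = len(N)
    if n < H:
        raise ValueError("need at least one sampled unit per stratum")
    alloc = [1] * H
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-N[h] * S[h] / math.sqrt(1 * 2), h) for h in range(H)]
    heapq.heapify(heap)
    for _ in range(n - H):
        _, h = heapq.heappop(heap)
        alloc[h] += 1
        m = alloc[h]
        heapq.heappush(heap, (-N[h] * S[h] / math.sqrt(m * (m + 1)), h))
    return alloc


# Hypothetical three-stratum population, for illustration only.
N = [50, 30, 20]        # stratum sizes
S = [2.0, 5.0, 10.0]    # stratum standard deviations
n = 10                  # fixed total sample size

real_valued = neyman_allocation(n, N, S)
rounded = [round(x) for x in real_valued]
exact = exact_integer_allocation(n, N, S)

print("Neyman (real-valued):", real_valued)
print("Neyman (rounded):    ", rounded, "sum =", sum(rounded))
print("Greedy integer:      ", exact, "variance =", sampling_variance(exact, N, S))
```

Because the variance is a sum of terms that are each convex in n_h, handing out units by largest marginal decrease yields an integer allocation with minimum variance under these assumptions, whereas simple rounding need not even preserve the total sample size.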

We conclude by returning to the phrase “…no calculation when observation can be made” to reflect on current worldwide considerations of making greater use of data from additional sources (e.g., administrative records, commercial data, big data…) to produce official statistics.