The United States contains over 85,000 local government entities, from townships to airport authorities and school districts. If the maxim that "all politics is local" holds true, political science has been missing much of political reality. The problem lies in faulty data analysis methods used to make a moderately educated guess (an inference) about the behavior of individuals from data on group behavior, known as aggregate or ecological data. This "ecological inference problem" was first recognized in 1919, in an effort to calculate how newly enfranchised women would cast their ballots nationwide. While some statistics were available from prior state elections, the ecological inference problem prevented scholars from distinguishing men's and women's votes within the same electoral precinct. Despite modern statistics, ecological inferences are still required in political science research when individual-level surveys are:
Ecological inferences are also required in other social sciences, as well as in marketing, education, public policy, geography, history, medicine, statistics and more. Almost all researchers who use aggregate data have encountered some form of the ecological inference problem. They may finally have relief with a groundbreaking solution from Harvard University political scientist Gary King. Supported by NSF's Methodology, Measurement and Statistics Program and by the Political Science Program, Dr. King has devised a new statistical model and implemented it in computer software; the work also resulted in a Princeton University Press book, A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data.
In the history of the ecological inference literature, only 49 comparisons exist between estimates from aggregate data and known individual-level data, a reflection of the field's focus on hypothesis and theory without economic, sociological or other foundations. In contrast, one of the linchpins of King's model is that it is not merely theoretical but validated with extensive real-life data. King tested his statistical method on data sets of groups for which individual behaviors were known, making more than 16,000 comparisons across five data sets between his estimates and individuals' actual behavior. For example, estimates of the levels of African-American and Caucasian voter registration were compared to the known answers in public records.
The method does not always work, since information is lost in the aggregation process, but King's approach indicates how much information is left in aggregate data to make inferences about individuals. As such, it is possible to learn when the inferences will be relatively certain and when they will not.
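To make concrete the idea that aggregate data retain a measurable amount of information about individuals, here is a minimal Python sketch of the classical deterministic method of bounds. It is not King's statistical model or his software; it only illustrates, under simple assumptions, why some precincts pin down a group's behavior tightly while others say almost nothing. The function name `turnout_bounds` and the precinct figures are hypothetical and chosen purely for illustration.

```python
def turnout_bounds(group_share, turnout):
    """Return (low, high) bounds on the turnout rate within one group.

    group_share: fraction of the precinct's eligible voters in the group (0..1)
    turnout:     fraction of all eligible voters in the precinct who voted (0..1)
    """
    if not (0.0 <= group_share <= 1.0 and 0.0 <= turnout <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    if group_share == 0.0:
        # No group members in the precinct: the aggregate data say nothing.
        return (0.0, 1.0)
    # Lowest possible rate: assume everyone outside the group voted.
    low = max(0.0, (turnout - (1.0 - group_share)) / group_share)
    # Highest possible rate: assume every vote cast came from the group.
    high = min(1.0, turnout / group_share)
    return (low, high)


# Hypothetical precincts: (share of voters in the group, overall turnout)
precincts = {"mixed": (0.5, 0.5), "homogeneous": (0.9, 0.8)}
for name, (share, turnout) in precincts.items():
    low, high = turnout_bounds(share, turnout)
    print(f"{name}: group turnout lies between {low:.2f} and {high:.2f}")
```

In the hypothetical mixed precinct the bounds span the full range from 0 to 1, so the aggregate figures alone reveal nothing about the group. In the nearly homogeneous precinct they narrow to roughly 0.78 to 0.89, so an inference there would be relatively certain.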
In the words of Frank Scioli, director of NSF's Political Science Program, "I expect Gary King's solution will contribute to the production of more accurate, insightful data analysis in a variety of research studies, leading to more informed policy making and better understanding of our economy and society." King's solution to the ecological inference problem can, for example…
This research is supported by the Methodology, Measurement and Statistics Program and by the Political Science Program.