UP504
(Prof. Campbell)
Data: Structure,
Characteristics and Sources (including accessing data on the www)
last updated Tuesday, January 28, 2003 5:12 PM |
Sections of this document: Overview definitions US Census census forms census geography census 2000 other sources |
Other UP504 class pages of interest: other useful statistical sites overview of US Census sources |
When you are to gather or construct a data table, there are several dimensions to consider:
1. time (single point in time, comparative statics, time-series)
2. space (geographic location: e.g., city, county, MSA, state, country)
3. unit of analysis (e.g., person, household)
4. variables (e.g., annual income, age, occupation)
Also: what comparative cases (if any) will you use?
Some common data problems:
exploratory-inductive: But sometimes serendipity leads to unexpected data.
Sample vs. Full Count (Census)
sample size - N
population size - M
sampling fraction = N/M
normally we assume that N/M -> 0 (that is, one is sampling a very small fraction of the population)
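One way to see why the size of N/M matters is the finite population correction, a standard textbook multiplier on the standard error when sampling without replacement. It is not stated in these notes; the sketch below keeps the notes' symbols (N = sample size, M = population size), and the function names are illustrative only.

```python
import math


def sampling_fraction(n: int, m: int) -> float:
    """Return N/M for a sample of size n drawn from a population of size m."""
    if m <= 0 or not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m and m > 0")
    return n / m


def finite_population_correction(n: int, m: int) -> float:
    """Standard-error multiplier sqrt((M - N) / (M - 1)).

    Close to 1 when N/M is near 0; exactly 0 for a full count (N = M).
    """
    if m <= 1 or not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m and m > 1")
    return math.sqrt((m - n) / (m - 1))


if __name__ == "__main__":
    # Hypothetical sizes: a tiny fraction, a 50 percent sample, a full count.
    for n, m in [(1_000, 1_000_000), (500, 1_000), (1_000, 1_000)]:
        print(n, m, sampling_fraction(n, m), finite_population_correction(n, m))
```

When N/M is close to 0 the multiplier is close to 1 and can be ignored, which is the usual assumption. For a full count the multiplier is 0: there is no sampling error left.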
Data Sources (and Citations)
1. paper
2. electronic based on a paper published version
3. electronic with no paper published source
(also: data tapes)
1. Web browser (to view this document)
2. web page composer /html editor (to create this document)
3. FTP (to download and upload this page to my ifs space so that it is available on the web); one Mac version is Fetch.
4. Excel -- to analyze downloaded data (or use SPSS, SAS, Systat, etc.)
5. Adobe Acrobat (to read formatted .pdf files)
census
OED, 2nd ed.
census se.nss, sb. [L. census registering of Roman citizens and their property, registered property, wealth, f. censere to rate, assess, estimate. ]
1. The registration of citizens and their property in ancient Rome for purposes of taxation.
2. Applied to certain taxes, esp. a capitation or poll-tax. Obs.
3.
a. An official enumeration of the population of a country or district, with various statistics relating to them. Also attrib.
A census of the population has been taken every tenth year since
1790 in the United States of America, since 1791 in France, and since 1801 in
Great Britain. In Ireland the earliest census was in
1813, since which it has been taken simultaneously with that of
Great Britain.
b. attrib., as in census return,
-table,
-taker; census-paper, a paper left at each house, to be filled up with the names, ages, etc., of the inmates, and returned to the enumerators on the day of taking the census.
-----
ENCYCLOPAEDIA BRITANNICA
http://www.britannica.com
census
an enumeration of people, houses, firms, or other important
items in a country or
region at a particular time. Used alone, the term usually
refers to a population
census--the type to be described in this article. However,
many countries take
censuses of housing, manufacturing, and agriculture.
-----
statistic
OED, 2nd ed.
statistic stati.stik, a. and sb. [ad. G. statistik sb. statistisch
adj., Fr. statistique adj. and fem. sb., ad. mod.L. statisticus, f. *statista
(Ital. statista) statist. Cf. Ital. statistico adj.,
statistica sb., Sp., Pg. estadístico adj., estadística
sb. The earliest known occurrence of the word seems to be in the title of the
satirical work Microscopium Statisticum, by `Helenus Politanus', Frankfort
(?), 1672. Here the sense is prob. `pertaining to statists or to statecraft'
(cf. statistical a. 1). The earliest use of the adj. in anything resembling
its present meaning is found in mod.L. statisticum collegium,
said to have been used by Martin Schmeizel (professor at Jena, died 1747) for
a course of lectures on the constitutions, resources, and policy of the various
States of the world. The G. statistik was used as a name
for this department of knowledge by G. Achenwall in his Vorbereitung zur Staatswissenschaft
(1748); the context shows that he did not regard the term as
novel. The Fr. statistique sb. is cited by Littré from Bachaumont (died
1771); Fr. writers of the 18th c. refer to Achenwall as having brought the word
into use. The sense-development of the word may have been
influenced by the notion that it was a direct derivative of L. status state
sb. ]
B. sb.
1.
a. = statistics 1. rare.
b. A quantitative fact or statement.
c. Statistics. Any of the numerical characteristics of a sample
(as opposed to one of the population from which it is drawn). Cf. parameter
2 f.
2. = statistician.
-------
sample
sample s.mp'l, , sb. Forms: 4 sampel, saumpel, -pul, -ple, saunpil,
4-5 saumpil, 4-6 sampill, saumple, 5 sampil(le, sampull, saumpyl, 4- sample.
[ME. sample, aphetic f. essample: see
example sb. ]
1. A fact, incident, story, or suppositious case, which serves to illustrate, confirm, or render credible some proposition or statement. (Cf. example sb. 1.) Obs.
2.
a. A relatively small quantity of material, or an individual
object, from which the quality of the mass, group, species, etc. which it represents
may be inferred; a specimen. Now chiefly Comm., a
small quantity of some commodity, presented or shown to customers
as a specimen of the goods offered for sale. (An individual article offered
as a specimen of goods sold by number and not by
weight or measure is now more commonly called a pattern.)
b. of immaterial things.
c. A specimen taken for scientific testing or analysis.
d. Statistics. A portion drawn from a population, the study of
which is intended to lead to statistical estimates of the attributes of the
whole population.
The term "census" has at least three common uses:
1. as a type of count: a full count (at least in theory) rather than a sample
2. as a data set: the actual count of the U.S. population every ten years. Hence Decennial censuses (every 10 years - 1980, 1990, 2000, etc.)
3. as a government
agency: the government agency that administers this count (the Bureau
of the Census, which is under the Department of Commerce). Note:
the decennial census is but one of MANY sets of data that the agency collects.
The U.S. Constitution provides for a census of the population every 10 years, primarily to establish a basis for apportionment of members of the House of Representatives among the States. For over a century after the first census in 1790, the census organization was a temporary one, created only for each decennial census. In 1902, the Bureau of the Census was established as a permanent Federal agency, responsible for enumerating the population and also for compiling statistics on other subjects. Historically the census of population has been a complete count. That is, an attempt is made to account for every person, for each person's residence, and for other characteristics (sex, age, family relationships, etc.). Since the 1940 census, in addition to the complete count information, some data have been obtained from representative samples of the population. In the 1990 census, variable sampling rates were employed. For most of the country, 1 in every 6 households (about 17 percent) received the long form or sample questionnaire; in governmental units estimated to have fewer than 2,500 inhabitants, every other household (50 percent) received the sample questionnaire to enhance the reliability of sample data for small areas. Exact agreement is not to be expected between sample data and the complete census count. Sample data may be used with confidence where large numbers are involved and assumed to indicate trends and relationships where small numbers are involved.
Census data presented here have not been adjusted for underenumeration. Results from the evaluation program for the 1990 census indicate that the overall national undercount was between 1 and 2 percent. The estimate from the Post Enumeration Survey (PES) was 1.6 percent and the estimate from Demographic Analysis (DA) was 1.8 percent. Both the PES and DA estimates show disproportionately high undercounts for some demographic groups. For example, the PES estimates of percent net undercount for Blacks (4.4 percent), Hispanics (5.0 percent), and American Indians (4.5 percent) were higher than the estimated undercount of non-Hispanic whites (0.7 percent). Historical DA estimates demonstrate that the overall undercount rate in the census has declined significantly over the past 50 years (from an estimated 5.4 percent in 1940 to 1.8 percent in 1990), yet the undercount of Blacks has remained disproportionately high.
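To make the undercount percentages concrete, here is a minimal sketch of the arithmetic. It assumes the net undercount rate is expressed as a percent of the true (adjusted) population, so that adjusted = counted / (1 - rate); that definition is an assumption here, not something the text above spells out. The rates are the 1990 PES figures quoted above; the example count is hypothetical.

```python
# 1990 PES net undercount estimates (percent), as quoted in the text above.
PES_NET_UNDERCOUNT_1990 = {
    "Total": 1.6,
    "Black": 4.4,
    "Hispanic": 5.0,
    "American Indian": 4.5,
    "Non-Hispanic white": 0.7,
}


def adjusted_count(census_count: float, undercount_pct: float) -> float:
    """Estimate the true population from a census count.

    Assumes undercount_pct is the share of the TRUE population that was missed.
    """
    if not 0 <= undercount_pct < 100:
        raise ValueError("undercount_pct must be in [0, 100)")
    return census_count / (1 - undercount_pct / 100)


def persons_missed(census_count: float, undercount_pct: float) -> float:
    """Estimated number of people not counted."""
    return adjusted_count(census_count, undercount_pct) - census_count


if __name__ == "__main__":
    hypothetical_count = 100_000  # illustrative only, not a real census figure
    for group, pct in PES_NET_UNDERCOUNT_1990.items():
        missed = persons_missed(hypothetical_count, pct)
        print(f"{group}: about {missed:,.0f} missed per 100,000 counted")
```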
link: The
2000 U.S. Census
Where is each person counted?
( US Census language reproduced below with web sources ...)
2000 | 1990 |
The 2000 Census Residence Rules
"Planners of the first U.S. decennial census in 1790 established the concept of 'usual residence' as the main principle in determining where people were to be counted. This concept has been followed in all subsequent censuses and is the guiding principle for Census 2000. Usual residence has been defined as the place where the person lives and sleeps most of the time. This place is not necessarily the same as the person's voting residence or legal residence. Also, noncitizens who are living in the United States are included, regardless of their immigration status."
"Citizens of foreign countries who have established a household or are part of an established household in the U.S. while working or studying, including family members with them - Counted at the household.
Citizens of foreign countries who are living in the U.S. at embassies, ministries, legations, or consulates - Counted at the embassy, etc.
Citizens of foreign countries temporarily traveling or visiting in the U.S. - Not included in the census."
Boarding school students - Counted at their parental home rather than at the boarding school.
College students living away from home while attending college - Counted where they are living at college.
College students living at their parental home while attending college - Counted at their parental home. |
For the 1990 Census: "Persons temporarily away from their usual residence, whether in the United States or overseas, on a vacation or on a business trip, were counted at their usual residence. Persons who occupied more than one residence during the year were counted at the one they considered to be their usual residence. Persons who moved on or near Census Day were counted at the place they considered to be their usual residence." How about students? |
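The 2000 residence rules quoted above can be read as a lookup from a person's situation to the place where they are counted. The sketch below is only an illustration of that structure: the situation labels and the function name are made up for this example, and the fallback to "usual residence" is a simplification of the guiding principle, not the Census Bureau's full rule set.

```python
# Situation -> where counted, paraphrasing the Census 2000 rules quoted above.
# None means the person is not included in the census at all.
RESIDENCE_RULES_2000 = {
    "foreign citizen in an established U.S. household": "household",
    "foreign citizen living at an embassy or consulate": "embassy",
    "foreign citizen temporarily visiting the U.S.": None,
    "boarding school student": "parental home",
    "college student living away from home": "college residence",
    "college student living at parental home": "parental home",
}


def where_counted(situation: str) -> str:
    """Return a short description of where a person is counted."""
    if situation not in RESIDENCE_RULES_2000:
        # Guiding principle: the place where the person lives and sleeps
        # most of the time.
        return "counted at the usual residence"
    place = RESIDENCE_RULES_2000[situation]
    if place is None:
        return "not included in the census"
    return f"counted at the {place}"


if __name__ == "__main__":
    for situation in RESIDENCE_RULES_2000:
        print(f"{situation}: {where_counted(situation)}")
```

Note the asymmetry the table makes visible: a college student living away is counted at college, while a boarding school student is counted at the parental home.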
questionnaire type | who received the questionnaire | 2000 - Format of Compiled Census Data (Summary File) | 1990 - Format of Compiled Census Data (Summary Tape File) |
long form | a sample (either 1/6 or 1/2 or 1/8 of hhds. receive this form, depending on population size of location): overall: 1-in-6. see documentation on sampling rates. | SF3 | STF3 |
short form | full count (every hhd. receives this form) | SF1 | STF1 |
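As a rough illustration of variable sampling rates, the sketch below encodes only the two 1990 rates described earlier in these notes: 1 in 2 households for governmental units estimated to have fewer than 2,500 inhabitants, and 1 in 6 elsewhere. The 1/8 rate mentioned in the table is not modeled because its population threshold is not given here; see the documentation on sampling rates. Function names and household counts are hypothetical.

```python
from fractions import Fraction


def long_form_rate_1990(estimated_population: int) -> Fraction:
    """Simplified long-form sampling rate for a governmental unit (1990)."""
    if estimated_population < 0:
        raise ValueError("population cannot be negative")
    if estimated_population < 2_500:
        return Fraction(1, 2)  # every other household
    return Fraction(1, 6)      # about 17 percent


def expected_long_forms(households: int, estimated_population: int) -> float:
    """Expected number of households that receive the long form."""
    return float(households * long_form_rate_1990(estimated_population))


if __name__ == "__main__":
    # Hypothetical places: a small village and a mid-sized city.
    print(expected_long_forms(400, 1_000))     # village, sampled 1 in 2
    print(expected_long_forms(6_000, 15_000))  # city, sampled 1 in 6
```

The higher rate in small places keeps the absolute number of sampled households large enough for the sample data to be usable there.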
In between the 10 Year Census -- How are population estimates made?
Current Population Survey (CPS)
This is a monthly nationwide survey of a scientifically
selected sample representing the noninstitutional civilian population. The sample
is located in 754 areas comprising 2,121 counties, independent cities, and minor
civil divisions with coverage in every State and the District of Columbia and
is subject to sampling error. At the present time, about 50,000 occupied households
are eligible for interview every month; of these between 4 and 5 percent are,
for various reasons, unavailable for interview.
While the primary purpose of the CPS is to obtain
monthly statistics on the labor force, it also serves as a vehicle for inquiries
on other subjects. Using CPS data, the Bureau issues a series of publications
under the general title of Current Population Reports, which cover population
characteristics (P20), consumer income (P60), special studies (P23), and other
topics.
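The CPS figures above imply a range for the number of households actually interviewed each month. A minimal sketch of that arithmetic, using the roughly 50,000 eligible households and the 4 to 5 percent unavailable quoted above (the function name is illustrative):

```python
def interviewed_range(eligible_households: int = 50_000,
                      unavailable_pct: tuple = (4, 5)) -> tuple:
    """Return (fewest, most) households interviewed in a month.

    unavailable_pct is the (low, high) whole-number percent of eligible
    households that cannot be interviewed.
    """
    low_pct, high_pct = unavailable_pct
    fewest = eligible_households * (100 - high_pct) // 100
    most = eligible_households * (100 - low_pct) // 100
    return fewest, most


if __name__ == "__main__":
    fewest, most = interviewed_range()
    print(f"between {fewest:,} and {most:,} households interviewed")
```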
Urban and rural
Hispanic (may be of any racial category, so don't add Hispanic counts to the racial categories, since Hispanic origin cuts across them)
see US Census definition
LINKS: US Census Geography Census Geography US Census Geography Reference Resources US Census Geographic Services and Information the "Geographic Overview" (on tracts, blocks, etc.) Current 1998 List of Metropolitan Areas Metropolitan
Areas and Components, 1996, With Fips Codes |
A Hierarchy of Census Areas (from the 1990 Census): from BIG to small
see a pdf version
of this hierarchy
1 | Nation (US) |
4 | Regions (e.g., Midwest) |
9 | Divisions (e.g., East North Central) |
57 | States and Statistically Equivalent Entities (e.g., Michigan) |
3,248 | Counties and Statistically Equivalent Entities (e.g., Washtenaw) |
60,228 | County Subdivisions and Places (e.g., Ann Arbor) |
576 | American Indian and Alaska Native Areas |
62,276 | Census Tracts and Block Numbering Areas (BNAs) |
229,192 | Block Groups (BGs) |
7,017,427 | Blocks |
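The counts in the hierarchy give a feel for how fine-grained each level is. The sketch below stores the 1990 counts from the table and computes the average number of units per parent unit. It leaves out County Subdivisions and Places and American Indian and Alaska Native Areas, on the assumption that those two rows do not sit in a single nested chain with the others; the variable and function names are illustrative.

```python
# 1990 counts from the table above, largest areas first.
HIERARCHY_1990 = [
    ("Nation", 1),
    ("Regions", 4),
    ("Divisions", 9),
    ("States and equivalents", 57),
    ("Counties and equivalents", 3_248),
    ("Census tracts and BNAs", 62_276),
    ("Block groups", 229_192),
    ("Blocks", 7_017_427),
]


def average_units_per_parent(levels):
    """Return (parent, child, average children per parent) for adjacent levels."""
    return [
        (parent, child, child_count / parent_count)
        for (parent, parent_count), (child, child_count) in zip(levels, levels[1:])
    ]


if __name__ == "__main__":
    for parent, child, ratio in average_units_per_parent(HIERARCHY_1990):
        print(f"{child} per {parent}: {ratio:.2f}")
```

On these figures there are on average about 31 blocks per block group and fewer than 4 block groups per tract, which is why block-level tables are so much larger than tract-level ones.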
What are blocks?
"Census blocks are small areas bounded on all sides by visible features such as streets, roads, streams, and railroad tracks, and by invisible boundaries such as city, town, township, and county limits, property lines, and short, imaginary extensions of streets and roads."
source: technical documentation
Metropolitan Areas: MSAs, CMSAs, etc.
Metropolitan Areas: Detroit as an example
A Map of Lower Michigan Counties
CMSA code | PMSA code | State-county FIPS code | Area |
35 | | | Detroit-Ann Arbor-Flint, MI CMSA |
35 | 0440 | | Ann Arbor, MI PMSA |
35 | 0440 | 26091 | Lenawee County |
35 | 0440 | 26093 | Livingston County |
35 | 0440 | 26161 | Washtenaw County |
35 | 2160 | | Detroit, MI PMSA |
35 | 2160 | 26087 | Lapeer County |
35 | 2160 | 26099 | Macomb County |
35 | 2160 | 26115 | Monroe County |
35 | 2160 | 26125 | Oakland County |
35 | 2160 | 26147 | St. Clair County |
35 | 2160 | 26163 | Wayne County |
35 | 2640 | | Flint, MI PMSA |
35 | 2640 | 26049 | Genesee County |
Population in the Detroit-Ann Arbor-Flint, MI CMSA and its three component PMSAs, 1980 - 1994 (in thousands)
METROPOLITAN AREA | 1980 | 1990 | 1991 | 1992 | 1993 | 1994 | % change 1980-90 | % change 1990-94 |
Detroit-Ann Arbor-Flint,MI CMSA | 5,293 | 5,187 | 5,215 | 5,236 | 5,246 | 5,256 | -2.0 | 1.3 |
Ann Arbor, MI PMSA | 455 | 490 | 498 | 504 | 509 | 515 | 7.7 | 5.1 |
Detroit, MI PMSA | 4,388 | 4,267 | 4,285 | 4,299 | 4,304 | 4,307 | -2.8 | 0.9 |
Flint, MI PMSA | 450 | 430 | 432 | 432 | 433 | 433 | -4.4 | 0.7 |
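The last two columns of the table appear to be percent changes in population. A short sketch that recomputes them from the 1980, 1990, and 1994 figures (in thousands) and checks that the three PMSAs add up to the CMSA; the dictionary layout and function name are illustrative.

```python
# Population in thousands, copied from the table above.
POPULATION = {
    "Detroit-Ann Arbor-Flint, MI CMSA": {1980: 5293, 1990: 5187, 1994: 5256},
    "Ann Arbor, MI PMSA": {1980: 455, 1990: 490, 1994: 515},
    "Detroit, MI PMSA": {1980: 4388, 1990: 4267, 1994: 4307},
    "Flint, MI PMSA": {1980: 450, 1990: 430, 1994: 433},
}


def pct_change(start: float, end: float) -> float:
    """Percent change from start to end."""
    return (end - start) / start * 100


if __name__ == "__main__":
    for area, pop in POPULATION.items():
        change_80_90 = round(pct_change(pop[1980], pop[1990]), 1)
        change_90_94 = round(pct_change(pop[1990], pop[1994]), 1)
        print(f"{area}: {change_80_90} (1980-90), {change_90_94} (1990-94)")

    # The three PMSAs should sum to the CMSA total in the census years.
    cmsa = POPULATION["Detroit-Ann Arbor-Flint, MI CMSA"]
    for year in (1980, 1990):
        parts = sum(pop[year] for area, pop in POPULATION.items() if "PMSA" in area)
        print(year, parts == cmsa[year])
```

The comparison shows the Detroit and Flint PMSAs losing population in the 1980s while the Ann Arbor PMSA grew, a pattern the CMSA total alone would hide.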
GUIDE TO FIPS CODES:
(Note: FIPS = Federal Information Processing Standards) see this resource
MSA= Metropolitan Statistical Area
CMSA= Consolidated Metropolitan Statistical Area
PMSA= Primary Metropolitan Statistical Area
SS= State
CCC= County
PPPPP= Place (city/town)
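Following the SS and CCC pattern above, a five-digit county code such as 26161 (Washtenaw County) is a two-digit state code followed by a three-digit county code. A minimal sketch of splitting one apart; the function name is illustrative. Codes are kept as strings so that leading zeros are not lost.

```python
def split_county_fips(code: str) -> tuple:
    """Split a 5-digit SSCCC code into (state, county) strings."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError("expected a 5-digit string such as '26161'")
    return code[:2], code[2:]


if __name__ == "__main__":
    # County codes from the Detroit-Ann Arbor-Flint example in these notes.
    counties = {
        "26161": "Washtenaw County",
        "26163": "Wayne County",
        "26049": "Genesee County",
    }
    for code, name in counties.items():
        state, county = split_county_fips(code)
        print(f"{name}: state {state}, county {county}")
```

Storing these codes as numbers in a spreadsheet is a common source of trouble: a state code such as 06 loses its leading zero and no longer matches other tables.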
Type of Metropolitan Area | Description | Number | Example |
MSA (metropolitan statistical area) | stand alone metro area (a county or counties) | 268 | (e.g., Lansing-East Lansing, MI MSA) |
CMSA (consolidated MSA) | a very large metro area, consisting of a collection of PMSAs | 21 | (e.g., Detroit-Ann Arbor-Flint, MI CMSA) |
PMSA (primary MSA) | a subset of CMSAs | 73 | (e.g., Ann Arbor, MI PMSA) |
New York CMSA has 15 PMSAs
LA CMSA has four (albeit big ones)
Detroit CMSA has three: Ann Arbor, Detroit, and Flint.
MA (Metropolitan Area) The MA classification is
a statistical standard developed for use by Federal agencies in the production,
analysis, and publication of data on MAs. The MAs are designated by the Office
of Management and Budget. A Metropolitan Area can be classified as a Metropolitan
Statistical Area (MSA) or as a Consolidated Metropolitan Statistical Area (CMSA),
that is, an MA divided into Primary Metropolitan Statistical Areas (PMSAs). See
also MSA/CMSA/PMSA.
PMSA (Primary Metropolitan Statistical
Area) An area defined by the Office of Management and Budget as a Federal statistical
standard, comprised of one or more counties (county subdivisions in New England),
within a metropolitan area, having a population of 1,000,000 or more. When PMSAs
are established, the larger area of which they are component parts is designated
a Consolidated Metropolitan Statistical Area.
CMSA (Consolidated Metropolitan Statistical
Area) An area defined by the Office of Management and Budget as a Federal statistical
standard. In metropolitan areas where Primary Metropolitan Statistical Areas
(PMSAs) are defined, the larger area of which the PMSAs are components is designated
a CMSA.
MSA (Metropolitan Statistical Area) An
area defined by the Office of Management and Budget as a Federal statistical
standard. An area qualifies for recognition as an MSA if it includes a city
of at least 50,000 population or an urbanized area of at least 50,000 with a
total metropolitan area population of at least 100,000. See also (MA).
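The MSA qualification rule above can be written as a small test. The sketch assumes one reading of the sentence: an area qualifies if it has a city of at least 50,000, or if it has an urbanized area of at least 50,000 together with a total metropolitan population of at least 100,000. The grouping of "or" and "with" is an interpretation, and the function and parameter names are illustrative.

```python
def qualifies_as_msa(largest_city_pop: int,
                     urbanized_area_pop: int,
                     total_metro_pop: int) -> bool:
    """Apply the MSA population thresholds quoted above (one reading)."""
    has_large_city = largest_city_pop >= 50_000
    has_urbanized_area = (
        urbanized_area_pop >= 50_000 and total_metro_pop >= 100_000
    )
    return has_large_city or has_urbanized_area


if __name__ == "__main__":
    # Hypothetical areas, for illustration only.
    print(qualifies_as_msa(60_000, 60_000, 80_000))   # qualifies via the city
    print(qualifies_as_msa(30_000, 55_000, 120_000))  # qualifies via the urbanized area
    print(qualifies_as_msa(30_000, 55_000, 90_000))   # does not qualify
```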
NECMA (New England County Metropolitan
Area) A county-based equivalent to the official metropolitan areas in the six
New England States, where the standard components are county subdivisions (cities
and towns) instead of counties as in other states.
For descriptive details and a
listing of titles and components of MA's, see Appendix II.
Metropolitan Areas (MA's)
The general concept of a metropolitan area is one
of a core area containing a large population nucleus, together with adjacent
communities that have a high degree of social and economic integration with
that core.
Metropolitan statistical areas (MSA's),
consolidated metropolitan statistical areas (CMSA's),
and primary metropolitan statistical areas (PMSA's)
are defined by the Office of Management and Budget (OMB) as a standard for Federal agencies in the preparation and publication of statistics relating to metropolitan areas.
The entire territory of the United States is classified
as metropolitan (inside MSA's or CMSA's -- PMSA's are components of CMSA's) or
nonmetropolitan (outside MSA's or CMSA's).
MSA's, CMSA's, and PMSA's are defined in terms of entire counties except in New England, where the definitions are in terms of cities and towns. The OMB also defines New England County Metropolitan Areas (NECMA's) which are county-based alternatives to the MSA's and CMSA's in the six New England States. From time to time, new MA's are created and the boundaries of others change. As a result, data for MA's over time may not be comparable and the analysis of historical trends must be made cautiously. For descriptive details and a listing of titles and components of MA's, see Appendix II.
Also, New England has NECMAs (New England County Metropolitan Areas): county-based
alternatives to the standard MAs, which in New England are defined by cities and towns.
home page | FAQ (frequently asked questions) | new in 2000: ability to select multiple racial categories. |
time table of data products release from 2000 Census |
American FactFinder - the data retrieval system for the 2000 Census |
How to access the 2000 Census Data:
for an overview, see Comparison of 2000 Census Delivery Vehicles, UM Documents Center
two of many options:
(the most common way) |
Accessing CensusCD 2000 Long Form via UM Library Citrix® Service
http://www.lib.umich.edu/citrix/cens00.html (through the UM Library system) |
US Government (including the Bureau of the Census) | |
|
http://www.census.gov/main/www/access.html |
American FactFinder (the US Census Bureau's new interactive database engine) | http://factfinder.census.gov/servlet/BasicFactsServlet |
|
http://www.census.gov/main/www/glossary.html |
|
http://www.census.gov/statab/www/ |
|
http://www.census.gov/epcd/cbp/map/96data/26/161.TXT |
|
http://www.fedstats.gov/ |
|
http://www.census.gov/statab/www/smadb.html |
|
http://www.bts.gov/ |
US Census Maps | http://www.census.gov/geo/www/maps/ |
US Census Map Products | http://www.census.gov/geo/www/maps/CP_MapProducts.htm see the population density map for 2000 |
CDC MAPPING | http://www.cdc.gov/nchs/products/pubs/pubd/other/atlas/atlas.htm |
State of Michigan | |
|
http://www.michigan.gov/census/ |
|
http://www.michigan.gov/cgi |
OTHER | |
|
http://www.undp.org/ |
|
www.cyburbia.com |
|
www.mapblast.com
www.mapquest.com |
History of Statistics (UCLA site) | http://www.stat.ucla.edu/history/ including an early Chinese version of Pascal's Triangle (binomial distribution) |