RUCA Data
ZIP Code RUCA Approximation
Because the smallest geographic identifiers available for most health care data sets are ZIP codes, a ZIP code approximation of the Census tract-based RUCA codes was developed. The ZIP code approximation is derived from the Census tract codes and is not based on commuting data unique to the ZIP code geographic unit.
The RUCA ZIP code approximation was the result of the following generalized analytical steps:
1. 2004 residential and commercial ZIP code information was obtained from Claritas Inc.
2. A ZIP/Census tract crosswalk file from Claritas Inc. identified their 2004 estimates of the populations of all ZIP and Census tract segments. For instance, a ZIP code area might have its boundaries crossed by four Census tracts. For such a ZIP code, the file would contain a population estimate for each of the four segments.
3. Each Census tract was assigned a RUCA code as explained in the section on Census tract RUCA methodology, which used Census Bureau definitions of Urbanized Areas and Urban Clusters and work commuting flows based on the 2000 Census results.
4. A file was created in which each residential ZIP code had its estimated segment populations aggregated by the 33 RUCA codes. For example, if a ZIP code area had four Census tract segments, each with a population of 4,000, and two of the tracts were assigned RUCA code 1.0, one 1.1, and one 2.0, the file would show the ZIP code number, a population of 8,000 for 1.0, 4,000 for 1.1, 4,000 for 2.0, zero for 2.1, and zero for each of the remaining codes.
5. A RUCA assignment algorithm was developed that assigned each ZIP code a RUCA code based on the distribution of its population across the RUCA codes. A clear example would be a ZIP code in which 95% of the population was in RUCA code 7.0 Census tracts and 5% was in RUCA code 8.0 tracts; such a ZIP would be assigned 7.0. The algorithm was developed through trial and error, with assignment rules added progressively. Combining the rules retrospectively would reduce their number, but their order is often deliberate and ensures that a more appropriate code is assigned.
The assignment algorithm consisted of the following rules (the rules are in order and once a ZIP is assigned with a rule it is not considered further):
- if one code (x.x) represents 66.67% or more of the population, it was assigned
- if all the codes of a number type (e.g., the 1s = 1.0 and 1.1, the 2s, the 4s, etc.) together represent 66.67% or more of the population, the larger of the subcodes was assigned (e.g., 1.0)
- if 1.0 + 1.1 + 2.0 + 2.1 represent 66.7% or more of the population, the larger of the subcodes was assigned
- if 1.0 + 1.1 + 2.0 + 2.1 represent 50.0% or more of the population, no other code represents more than 40.0% of the population, and 4.1 + 5.1 + 7.1 + 8.1 + 10.1 represent more than 20.0% of the population, then the larger of codes 1.0, 1.1, 2.0, 2.1 was assigned
- if either the 4s + 5s + 6s or the 7s + 8s + 9s represent 66.7% or more of the population, then the largest of the codes in that group (e.g., the largest of the x.x codes from among the 4/5/6 group) was assigned
- if the maximum x.x code represents more than 50.0% (and less than 66.7%) of the population, it was assigned
- if all the 1s + 2s + 3s together represent 66.7% or more of the population, the largest was assigned
- if the 7s through 10s (excluding 7.1, 8.1, and 10.1) represent 66.7% or more of the population, the 1s and 2s represent less than 50%, and one of the 7-10 codes represents more than 40%, the largest of the codes from 7.0 through 10.6 was assigned
- A few of the residential ZIPs with population (201 or 0.6%) remained unassigned after the preceding eight assignment rules were implemented. These ZIPs were unusual in their RUCA Census tract configurations and it was deemed prudent to hand assign them. Dr. Hart assigned each of these ZIP codes a RUCA twice and then adjudicated the few conflicts. Most of the hand assignments were apparent when examined. For some other ZIPs, the assignment was very difficult because of odd combinations of Census tract and ZIP code area boundaries. For example, in rare instances, ZIP code areas extended radially outward from the edge of an Urbanized Area and encompassed many different codes ranging across the rural-urban spectrum. In these troublesome cases, assignment was shaded toward urban (i.e., given a problematic assignment decision between two codes, the more urban one was assigned).
Assignments of ZIP codes to RUCAs by rule were as follows:
Rule 1: 27,235
Rule 2: 419
Rule 3: 436
Rule 4: 33
Rule 5: 689
Rule 6: 976
Rule 7: 34
Rule 8: 5
Rule 9: 201
Total: 29,944
6. Commercial ZIP codes were assigned the RUCA of the most appropriate residential ZIP code based on a Claritas Inc. file (e.g., a commercial ZIP that was surrounded by a residential ZIP was assigned the residential ZIP's RUCA). Adding these ZIPs brings the file to 41,798 ZIP codes.
7. Because the Census tract RUCA assignments were based on 2000 Census Bureau data and the ZIP code information and population were based on 2004 ZIP codes and population estimates, there were some loose ends related to ZIPs that had no population and a few other peculiarities (0.5% of the ZIP assignments and a negligible percentage of the population). In each case, assignments were based on geographically contiguous areas. There were 152 of these ZIPs, which brings the total file size to 41,950.
8. Six ZIP code areas cross state lines. Each was assigned to the state where the majority of its population was estimated to reside, and flags were added to the downloadable files.
9. The ZIP code approximation RUCAs are being used in several contexts and match extremely well with the ZIP codes from various data sources (e.g., HCFA claims data ZIP codes). ZIP codes from outside the 50 states and the District of Columbia do not match, nor do ZIP codes that are not legitimate (e.g., errors in reported ZIP codes and, in a few instances, ZIP codes that are used only locally and are not United States Postal Service standard).
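Steps 4 and 5 above can be illustrated with a minimal Python sketch. It is not the original implementation: the tuple layout of the crosswalk records, the function names, and the ZIP code "99999" are assumptions made for illustration. Only the first three rules are coded, and "the larger of the subcodes" is interpreted here as the subcode with the largest population share, which is one possible reading of the rule text.

```python
from collections import defaultdict

METRO_CODES = {"1.0", "1.1", "2.0", "2.1"}


def aggregate_by_ruca(segments):
    """Step 4: sum segment populations per RUCA code for each ZIP.

    `segments` is an iterable of (zip_code, ruca_code, population) tuples,
    a hypothetical layout standing in for the ZIP/Census tract crosswalk.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for zip_code, ruca, population in segments:
        totals[zip_code][ruca] += population
    return {z: dict(codes) for z, codes in totals.items()}


def assign_ruca(pop_by_code):
    """Step 5, rules 1-3 only, applied in order.

    Returns (code, rule_number), or (None, None) when none of the three
    rules applies (later rules and hand assignment are not sketched here).
    """
    total = sum(pop_by_code.values())
    if total <= 0:
        return None, None
    pct = {code: 100.0 * pop / total for code, pop in pop_by_code.items()}

    # Rule 1: a single x.x code holds 66.67% or more of the population.
    top = max(pct, key=pct.get)
    if pct[top] >= 66.67:
        return top, 1

    # Rule 2: all codes of one number type (same integer part) together
    # hold 66.67% or more; assign the subcode with the largest share
    # (an assumed reading of "the larger of the subcodes").
    families = defaultdict(dict)
    for code, share in pct.items():
        families[code.split(".")[0]][code] = share
    for members in families.values():
        if sum(members.values()) >= 66.67:
            return max(members, key=members.get), 2

    # Rule 3: 1.0 + 1.1 + 2.0 + 2.1 together hold 66.7% or more.
    metro = {c: s for c, s in pct.items() if c in METRO_CODES}
    if metro and sum(metro.values()) >= 66.7:
        return max(metro, key=metro.get), 3

    return None, None


# The worked example from step 4: four segments of 4,000 people each.
segments = [
    ("99999", "1.0", 4000),
    ("99999", "1.0", 4000),
    ("99999", "1.1", 4000),
    ("99999", "2.0", 4000),
]
by_zip = aggregate_by_ruca(segments)
print(by_zip["99999"])               # {'1.0': 8000, '1.1': 4000, '2.0': 4000}
print(assign_ruca(by_zip["99999"]))  # ('1.0', 2)
```

RUCA codes are kept as strings so that "10.1" and "1.0" compare exactly and the integer part can be read off without floating-point rounding. In this example no single code reaches 66.67% (1.0 holds 50%), so rule 1 does not fire; the 1s together hold 75%, so rule 2 assigns 1.0.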
The table below shows how well the ZIP code RUCAs correspond to the Census tract RUCAs.
RUCA Code | Census Tract Population (% of total) | ZIP Code Population (% of total) | % that agrees exactly (x.x level) | % that agrees (x level) | % that agrees (4 categories)* | % that agrees (dichotomy)**
1.0 | 201,940,419 (68.9%) | 206,023,383 (70.3%) | 99.3 | 99.3 | 100 | 100
1.1 | 3,258,625 (1.1%) | 3,368,644 (1.1%) | 98.0 | 98.7 | 100 | 100
2.0 | 25,450,158 (8.7%) | 21,517,274 (7.4%) | 75.0 | 75.5 | 96.3 | 96.3
2.1 | 1,619,843 (0.6%) | 1,075,221 (0.4%) | 50.3 | 60.2 | 97.6 | 97.6
3.0 | 2,061,081 (0.7%) | 1,605,802 (0.5%) | 61.9 | 61.9 | 81.1 | 81.1
4.0 | 13,399,399 (4.6%) | 15,831,520 (5.4%) | 96.7 | 97 | 99.4 | 99.7
4.1 | 1,358,782 (0.5%) | 1,678,667 (0.6%) | 93.6 | 93.8 | 99.6 | 99.6
4.2 | 5,130,490 (1.8%) | 6,251,631 (2.1%) | 95.7 | 96.4 | 97.0 | 97.2
5.0 | 6,672,064 (2.3%) | 4,540,423 (1.5%) | 56.9 | 57.3 | 94.1 | 99.0
5.1 | 150,017 (0.1%) | 90,207 (0.0%) | 35.0 | 36.4 | 70.0 | 70.0
5.2 | 1,276,674 (0.4%) | 598,367 (0.2%) | 35.3 | 39.9 | 84.5 | 86.7
6.0 | 1,108,659 (0.4%) | 898,197 (0.3%) | 60.2 | 60.3 | 81.8 | 97.1
6.1 | 487,528 (0.2%) | 298,609 (0.1%) | 46.1 | 48.7 | 78.9 | 85.6
7.0 | 5,658,521 (1.9%) | 6,969,376 (2.4%) | 91.8 | 92.5 | 97.8 | 99.6
7.1 | 738,858 (0.3%) | 1,013,113 (0.3%) | 82.2 | 82.3 | 99.5 | 99.5
7.2 | 216,914 (0.1%) | 260,866 (0.1%) | 79.4 | 79.6 | 79.7 | 99.6
7.3 | 2,432,786 (0.8%) | 2,954,397 (1.0%) | 88.0 | 90.7 | 91.1 | 92.5
7.4 | 1,435,868 (0.5%) | 1,806,151 (0.6%) | 91.1 | 92.4 | 95.3 | 99.9
8.0 | 2,433,005 (0.8%) | 1,863,564 (0.6%) | 55.1 | 55.4 | 94.8 | 99.2
8.1 | 43,975 (0.0%) | 33,190 (0.0%) | 49.2 | 50.4 | 73.5 | 73.5
8.2 | 40,864 (0.0%) | 43,975 (0.0%) | 99.8 | 99.8 | 100 | 100
8.3 | 393,360 (0.1%) | 166,650 (0.1%) | 29.9 | 33.3 | 84.0 | 87.8
8.4 | 150,546 (0.1%) | 65,081 (0.0%) | 32.9 | 37.3 | 86.4 | 99.6
9.0 | 1,132,146 (0.4%) | 866,163 (0.3%) | 56.4 | 56.8 | 85.8 | 97.2
9.1 | 521,374 (0.2%) | 412,667 (0.1%) | 57.8 | 60.0 | 78.2 | 83.5
9.2 | 278,524 (0.1%) | 246,112 (0.1%) | 63.4 | 65.5 | 83.7 | 99.3
10.0 | 4,456,294 (1.5%) | 4,293,875 (1.5%) | 88.9 | 95.2 | 95.1 | 98.8
10.1 | 310,836 (0.1%) | 293,379 (0.1%) | 73.3 | 77.3 | 93.2 | 93.2
10.2 | 355,964 (0.1%) | 330,750 (0.1%) | 77.2 | 80.8 | 80.8 | 99.8
10.3 | 387,077 (0.1%) | 249,905 (0.1%) | 57.0 | 61.2 | 61.2 | 98.0
10.4 | 2,341,613 (0.8%) | 2,136,706 (0.7%) | 76.0 | 81.8 | 81.5 | 89.2
10.5 | 2,446,814 (0.8%) | 2,326,439 (0.8%) | 78.9 | 82.7 | 82.6 | 98.5
10.6 | 3,247,599 (1.1%) | 2,826,382 (1.0%) | 74.3 | 78.7 | 78.7 | 98.7
TOTAL | 292,936,686 (100%) | 292,936,686 (100%) | 92.9% | 93.4% | 98.07% | 99.02%
*The four categories are: urban (1.0, 1.1, 2.0, 2.1, 3.0, 4.1, 5.1, 7.1, 8.1, 10.1); large rural (4.0, 4.2, 5.0, 5.2, 6.0, 6.1); small rural (7.0, 7.2, 7.3, 7.4, 8.0, 8.2, 8.3, 8.4, 9.0, 9.1, 9.2); isolated (10.0, 10.2, 10.3, 10.4, 10.5, 10.6)
**Dichotomy refers to urban (1.0, 1.1, 2.0, 2.1, 3.0, 4.1, 5.1, 7.1, 8.1, 10.1) and rural (4.0, 4.2, 5.0, 5.2, 6.0, 6.1, 7.0, 7.2, 7.3, 7.4, 8.0, 8.2, 8.3, 8.4, 9.0, 9.1, 9.2, 10.0, 10.2, 10.3, 10.4, 10.5, 10.6)
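The groupings in the two footnotes can be written as a small lookup, which also makes concrete what each agreement column in the table compares. The sketch below is illustrative only: the function names are assumptions, and the source does not describe how the agreement percentages were computed, so this shows only the comparison made for a single pair of codes.

```python
# Code sets copied from the footnotes; codes are strings to keep "10.1" exact.
URBAN = {"1.0", "1.1", "2.0", "2.1", "3.0", "4.1", "5.1", "7.1", "8.1", "10.1"}
LARGE_RURAL = {"4.0", "4.2", "5.0", "5.2", "6.0", "6.1"}
SMALL_RURAL = {"7.0", "7.2", "7.3", "7.4", "8.0", "8.2", "8.3", "8.4",
               "9.0", "9.1", "9.2"}
ISOLATED = {"10.0", "10.2", "10.3", "10.4", "10.5", "10.6"}

CATEGORIES = {
    "urban": URBAN,
    "large rural": LARGE_RURAL,
    "small rural": SMALL_RURAL,
    "isolated": ISOLATED,
}


def four_category(code):
    """Return the four-category label for an x.x RUCA code."""
    for label, members in CATEGORIES.items():
        if code in members:
            return label
    raise ValueError(f"unknown RUCA code: {code!r}")


def dichotomy(code):
    """Return 'urban' or 'rural'; every non-urban category counts as rural."""
    return "urban" if four_category(code) == "urban" else "rural"


def agreement_levels(tract_code, zip_code):
    """Compare a Census tract code with a ZIP code at the four table levels."""
    return {
        "exact": tract_code == zip_code,
        "x_level": tract_code.split(".")[0] == zip_code.split(".")[0],
        "four_categories": four_category(tract_code) == four_category(zip_code),
        "dichotomy": dichotomy(tract_code) == dichotomy(zip_code),
    }


# The four sets together cover all 33 RUCA codes.
print(len(URBAN | LARGE_RURAL | SMALL_RURAL | ISOLATED))  # 33

# A 10.0 tract in a ZIP assigned 10.1 agrees at the x level only.
print(agreement_levels("10.0", "10.1"))
```

Because the x.1 codes of types 4, 5, 7, 8, and 10 are grouped with urban, two codes can agree at the x level yet fall into different categories. This is consistent with rows such as 10.0, where the four-category agreement (95.1) is slightly lower than the x-level agreement (95.2).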