Description
Hello Jens,
I noticed some inconsistent behavior when using get_census(level = "CMA", ...) across different census years, and I wasn’t able to find documentation describing this pattern:
- 1996/2001: Only CMAs are returned
- 2006/2011/2016: Both CMAs and CAs are returned, but CAs with type code D return NA values for all vectors
- 2021: Both CMAs and CAs are returned and include valid data
Here I request level = "CMA" for every census year and count the rows of each result, which shows the jump between 2001 and 2006:
> CMAs <- lapply(
+ c("CA1996", "CA01", "CA06", "CA11", "CA16", "CA21"),
+ cancensus::get_census,
+ regions = list(C = "01"),
+ level = "CMA"
+ )
> sapply(CMAs, nrow)
[1] 43 46 152 155 157 160
And examples using the 2006 and 2016 datasets:
> cancensus::get_census("CA06", regions = list(C = "01"), level = "CMA", vectors = "v_CA06_103")[c(1,3,10)]
# A tibble: 152 × 3
GeoUID `Region Name` `v_CA06_103: Rented`
<chr> <fct> <dbl>
1 10001 St. John's (B) 20115
2 10005 Bay Roberts (D) NA
3 10010 Grand Falls-Windsor (D) NA
4 10015 Corner Brook (D) NA
> cancensus::get_census("CA16", regions = list(C = "01"), level = "CMA", vectors = "v_CA16_4838")[c(1,3,10)]
# A tibble: 157 × 3
GeoUID `Region Name` `v_CA16_4838: Renter`
<chr> <fct> <dbl>
1 10001 St. John's (B) 25485
2 10005 Bay Roberts (D) NA
3 10010 Grand Falls-Windsor (D) NA
4 10011 Gander (D) NA
5 10015 Corner Brook (D) NA
Is this a methodological change on the StatsCan side? Either way, within cancensus it breaks the assumption that level = "CMA" returns only CMA-level geographies with usable data. Could CMAs and CAs be separated when using level = "CMA"? And what causes the NA values for the CAs marked (D) in 2006, 2011 and 2016? Alternatively, some documentation of the inconsistency would help (I apologize if it already exists).
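In the meantime, here is a rough sketch of the workaround I have in mind. It assumes that the letter in parentheses at the end of `Region Name` is the region type code, with B marking a CMA and D marking a CA as in the output above (other codes may exist that I have not seen here). `region_type()` and `keep_region_types()` are hypothetical helpers, not part of cancensus, and the small data frame is a hand-built stand-in for the 2016 result so that the snippet runs without an API key.

```r
# Hypothetical helper (not part of cancensus): extract the trailing type code,
# e.g. "B" from "St. John's (B)". Returns NA when no code is present.
region_type <- function(region_name) {
  name <- as.character(region_name)  # `Region Name` comes back as a factor
  pattern <- "^.*\\(([A-Z])\\)[[:space:]]*$"
  ifelse(grepl(pattern, name), sub(pattern, "\\1", name), NA_character_)
}

# Hypothetical helper: keep only rows whose type code is in `types`.
# Assumption: "B" identifies CMAs, as the output above suggests.
keep_region_types <- function(census_data, types = "B") {
  keep <- region_type(census_data[["Region Name"]]) %in% types
  census_data[keep, , drop = FALSE]
}

# Stand-in for the first rows of the CA16 result shown above.
example <- data.frame(
  GeoUID = c("10001", "10005", "10010"),
  `Region Name` = factor(c("St. John's (B)", "Bay Roberts (D)",
                           "Grand Falls-Windsor (D)")),
  `v_CA16_4838: Renter` = c(25485, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# How many regions of each type does the result contain?
type_counts <- table(region_type(example[["Region Name"]]))

# Drop everything that is not flagged as a CMA.
cmas_only <- keep_region_types(example)
```

With real data, the same call would be `keep_region_types(cancensus::get_census("CA16", regions = list(C = "01"), level = "CMA"))`. Filtering on the name suffix is fragile, though, which is why a dedicated option or a documented column would be preferable.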
Thanks for maintaining this package, it's well-designed and a powerful tool!