Inconsistent behavior for level = "CMA" across census years #208

@bdbmax

Hello Jens,

I noticed some inconsistent behavior when using get_census(level = "CMA", ...) across different census years, and I wasn’t able to find documentation describing this pattern:

  • 1996/2001: Only CMAs are returned
  • 2006/2011/2016: Both CMAs and CAs are returned, but CAs with municipal code D return NA values for all vectors
  • 2021: Both CMAs and CAs are returned and include valid data

The following pulls level = "CMA" data for every census year and counts the rows returned, which shows the jump between 2001 and 2006:

> CMAs <- lapply(
+   c("CA1996", "CA01", "CA06", "CA11", "CA16", "CA21"),
+   cancensus::get_census,
+   regions = list(C = "01"),
+   level = "CMA"
+ )
> sapply(CMAs, nrow)
[1]  43  46 152 155 157 160

And examples using the 2006 and 2016 datasets:

> cancensus::get_census("CA06", regions = list(C = "01"), level = "CMA", vectors = "v_CA06_103")[c(1,3,10)]
# A tibble: 152 × 3
   GeoUID `Region Name`           `v_CA06_103: Rented`
   <chr>  <fct>                                  <dbl>
 1 10001  St. John's (B)                         20115
 2 10005  Bay Roberts (D)                           NA
 3 10010  Grand Falls-Windsor (D)                   NA
 4 10015  Corner Brook (D)                          NA
> cancensus::get_census("CA16", regions = list(C = "01"), level = "CMA", vectors = "v_CA16_4838")[c(1,3,10)]
# A tibble: 157 × 3
   GeoUID `Region Name`           `v_CA16_4838: Renter`
   <chr>  <fct>                                   <dbl>
 1 10001  St. John's (B)                          25485
 2 10005  Bay Roberts (D)                            NA
 3 10010  Grand Falls-Windsor (D)                    NA
 4 10011  Gander (D)                                 NA
 5 10015  Corner Brook (D)                           NA

This is probably a methodological change on the StatsCan side. But in cancensus, it breaks the assumption that level = "CMA" returns only CMA-level geographies with usable data. Could CMAs and CAs be separated when using level = "CMA"? And what causes the NA values for the (D) CAs in 2006, 2011 and 2016? Failing that, some documentation of the inconsistency would help (I apologize if it already exists and I missed it).
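In the meantime, here is a rough sketch of the kind of separation I have in mind. It is not part of cancensus: `region_type_code()` is a hypothetical helper, and it assumes the "(B)" / "(D)" suffix in `Region Name` is a reliable type marker, with "(B)" marking CMAs as the St. John's row suggests. The toy data is copied from the 2006 output above so that no API call is needed.

```r
# Hypothetical helper (not a cancensus function): extract the type code
# from a trailing "(X)" suffix in `Region Name`; NA when there is none.
region_type_code <- function(region_name) {
  name <- as.character(region_name)  # `Region Name` comes back as a factor
  pattern <- "^.*\\(([A-Z]+)\\)[[:space:]]*$"
  has_suffix <- grepl(pattern, name)
  code <- sub(pattern, "\\1", name)
  ifelse(has_suffix, code, NA_character_)
}

# Toy data copied from the 2006 output shown in this issue.
toy <- data.frame(
  GeoUID = c("10001", "10005", "10010", "10015"),
  `Region Name` = factor(c("St. John's (B)", "Bay Roberts (D)",
                           "Grand Falls-Windsor (D)", "Corner Brook (D)")),
  rented = c(20115, NA, NA, NA),
  check.names = FALSE
)

toy$type_code <- region_type_code(toy[["Region Name"]])

# Assumption: "B" marks CMAs; every other code is treated as a CA.
cmas_only <- toy[!is.na(toy$type_code) & toy$type_code == "B", ]
cas_only  <- toy[!is.na(toy$type_code) & toy$type_code != "B", ]
```

On a real `get_census()` result, the same helper could be applied to the `Region Name` column. If the suffix codes differ between census years, the filter would need adjusting.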

Thanks for maintaining this package, it's well-designed and a powerful tool!
