ARcenso: a Package Born From Chaos, Powered by Community

Andrea Gómez Vargas & Emanuel Ciardullo

useR! 2025

August 9, 2025

Hi


  • Sociologist

  • Learned R through the R community

  • Code with a social perspective

  • Population Statistics Analyst at INDEC, Argentina's national statistics office.

  • rOpenSci Champion, Cohort 2023–2024

Why is census data important?

  • Helps understand the population’s characteristics and needs.

  • Is essential for designing effective public policies (social, economic, and territorial planning).

  • Serves as a vital component of academic, social, and market research.

  • Forms the foundation for evidence-based decision-making.

The 2022 Census Operation


Argentina conducted its national census in 2022, the country’s largest statistical operation. Starting in 2023, we began publishing the final results, to which I contributed.

In addition to presenting the data, we had to edit and generate hundreds of Excel tables using R, which was a very manual process.

An idea in the middle of chaos


We had to make many Excel tables, without knowing whether they were useful or even usable. During that process, I thought this should be an R package, but the idea remained up in the air until the rOpenSci Champions Program came along and it took shape.

rOpenSci Champions Program


This is a program that promotes leadership in open science and free software. It offers mentoring and peer support, particularly for individuals from historically underrepresented groups.

What was (is) my project?


Develop an R data package that makes available the official national population census data of Argentina, produced by the National Institute of Statistics and Censuses (INDEC), covering the period from 1970 to 2022. The data are homogenized, organized, and ready to use. The package provides open access to these datasets, facilitating their use by the public, researchers, and decision-makers.

Why?


Historical census results for 1970, 1980, 1991, 2001, 2010 and 2022 in Argentina are scattered across physical books, PDFs, Excel files, and REDATAM outputs. There is no unified system or format that allows working with the data from these six censuses as a single database.

This fragmentation limits data accessibility, interoperability, and reuse, especially for users working within the R environment.

The proposal

From Excel tables to tidy tables in R

Original Excel download

Tidy table in R

How it all started: key questions

Six censuses to organize, one package to build

Conceptual Framework: UN Census Data Structure


Census Topics

  • Core: Essential variables
    (e.g., age, sex, population)
  • Derived-core: Calculated variables
    (e.g., fertility rates)
  • Additional: Country-specific topics
    (e.g., religion)

Conceptual Units

  • Population: Individuals
  • Housing: Physical dwellings
  • Household: People sharing a dwelling

Geographic Coverage

  • National level
  • Jurisdictional level

Conceptual Framework: FAIR Principles


Findable → Six national censuses (1970–2022) in one R package, clearly versioned.

Accessible → Open, homogenized datasets with docs & metadata.

Interoperable → Tidy tables ready to integrate with other R data.

Reusable → Standardized codes, open license, reproducible structures.

A Thousand Excels, a Thousand Formats

From Excel Storms to a Clear Path

Stage  Census years   Geographic level
1      1970           National and 24 jurisdictions
       1980           National level
2      1991 and 2001  National level
3      2010           National level
4      2022           National level
5      1980 and 1991  24 jurisdictions
6      2001 and 2010  24 jurisdictions
7      2022           24 jurisdictions

Our technical approach




  • Download: Automated web scraping to collect census tables from official sources.

  • Select: Listed, classified, and extracted relevant files and metadata (census year, geography, topics).

  • Transform: Converted Excel tables into tidy, standardized datasets using base R.

  • Function development: Built R functions to access, manipulate, and visualize the data efficiently.

  • Package creation: Integrated datasets and functions into the ARcenso package for easy use and reproducibility.

  • Version control: Used Git and GitHub for tracking changes, collaboration, and release management.
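
The Transform step above can be sketched in base R. This is a minimal illustration, not the package's actual code: the `wide` data frame is a hypothetical stand-in for a cross-tab read from a spreadsheet, and the column names `tamano_hogar`, `regimen_tenencia`, and `hogares` are assumptions modeled on the 1970 output shown later in the talk.

```r
# Hypothetical wide table, shaped like a spreadsheet cross-tab:
# one row per household size, one column per tenure category.
wide <- data.frame(
  tamano_hogar = c("De 1 persona", "De 2 personas"),
  Propietario  = c(255900, 652950),
  Inquilino    = c(199350, 302400),
  stringsAsFactors = FALSE
)

# Columns that hold counts rather than identifiers
value_cols <- setdiff(names(wide), "tamano_hogar")

# Stack the count columns: one row per (household size, tenure) pair
tidy <- data.frame(
  tamano_hogar     = rep(wide$tamano_hogar, times = length(value_cols)),
  regimen_tenencia = rep(value_cols, each = nrow(wide)),
  hogares          = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Sort by household size so related rows sit together
tidy <- tidy[order(tidy$tamano_hogar), ]
rownames(tidy) <- NULL

tidy
```

Each count now sits in its own row with explicit labels, which is the shape that makes tables from different census years easier to combine.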

ARcenso: From Idea to Reality

{ARcenso} 📦


Installation

# install.packages("remotes")
remotes::install_github("SoyAndrea/arcenso")



Package activation

library(arcenso)

get_census()

Get tables

get_census(year = 1970,
           topic = "CONDICIONES HABITACIONALES",
           geolvl = "Total del país")
#> $c70_total_del_pais_poblacion_c18
#>                   regimen_de_tenencia hogares personas  cuartos
#> 1                         Propietario 3553250 13778700 11197900
#> 2            Inquilino o arrendatario 1380950  4692800  3305350
#> 3 Ocupante en relación de dependencia  353300  1402500   880050
#> 4                   Ocupante gratuito  575650  2271150  1196500
#> 5                    En otro carácter  192950   816350   419800
#> 
#> $c70_total_del_pais_poblacion_c20
#>     tamaño_hogar                     regimen_tenencia hogares
#> 1   De 1 persona                                Total  615900
#> 2   De 1 persona                          Propietario  255900
#> 3   De 1 persona             Inquilino o arrendatario  199350
#> 4   De 1 persona Ocupante con relación de dependencia   52600
#> 5   De 1 persona                    Ocupante gratuito   82100
#> 6   De 1 persona                                 Otro   25950
#> 7  De 2 personas                                Total 1125250
#> 8  De 2 personas                          Propietario  652950
#> 9  De 2 personas             Inquilino o arrendatario  302400
#> 10 De 2 personas Ocupante con relación de dependencia   49250
#> 11 De 2 personas                    Ocupante gratuito   91300
#> 12 De 2 personas                                 Otro   29350
#> 13 De 3 personas                                Total 1230600
#> 14 De 3 personas                          Propietario  744800
#> 15 De 3 personas             Inquilino o arrendatario  290650
#> 16 De 3 personas Ocupante con relación de dependencia   62150
#> 17 De 3 personas                    Ocupante gratuito  103200
#> 18 De 3 personas                                 Otro   29800
#> 19 De 4 personas                                Total 1255000
#> 20 De 4 personas                          Propietario  787900
#> 21 De 4 personas             Inquilino o arrendatario  266000
#> 22 De 4 personas Ocupante con relación de dependencia   65650
#> 23 De 4 personas                    Ocupante gratuito  102850
#> 24 De 4 personas                                 Otro   32600
#> 25 De 5 personas                                Total  818550
#> 26 De 5 personas                          Propietario  516100
#> 27 De 5 personas             Inquilino o arrendatario  157500
#> 28 De 5 personas Ocupante con relación de dependencia   48200
#> 29 De 5 personas                    Ocupante gratuito   71550
#> 30 De 5 personas                                 Otro   25200
#> 31 De 6 personas                                Total  443250
#> 32 De 6 personas                          Propietario  272000
#> 33 De 6 personas             Inquilino o arrendatario   80000
#> 34 De 6 personas Ocupante con relación de dependencia   29000
#> 35 De 6 personas                    Ocupante gratuito   45750
#> 36 De 6 personas                                 Otro   16500
#> 37 De 7 personas                                Total  276750
#> 38 De 7 personas                          Propietario  163400
#> 39 De 7 personas             Inquilino o arrendatario   44950
#> 40 De 7 personas Ocupante con relación de dependencia   19950
#> 41 De 7 personas                    Ocupante gratuito   35200
#> 42 De 7 personas                                 Otro   13250
#> 43 De 8 personas                                Total  121450
#> 44 De 8 personas                          Propietario   70600
#> 45 De 8 personas             Inquilino o arrendatario   18250
#> 46 De 8 personas Ocupante con relación de dependencia   10050
#> 47 De 8 personas                    Ocupante gratuito   16250
#> 48 De 8 personas                                 Otro    6300
#> 49 De 9 personas                                Total   76000
#> 50 De 9 personas                          Propietario   40950
#> 51 De 9 personas             Inquilino o arrendatario    9400
#> 52 De 9 personas Ocupante con relación de dependencia    7150
#> 53 De 9 personas                    Ocupante gratuito   12900
#> 54 De 9 personas                                 Otro    5600
#> 55   De 10 y más                                Total   93350
#> 56   De 10 y más                          Propietario   48650
#> 57   De 10 y más             Inquilino o arrendatario   12450
#> 58   De 10 y más Ocupante con relación de dependencia    9300
#> 59   De 10 y más                    Ocupante gratuito   14550
#> 60   De 10 y más                                 Otro    8400
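
As the output above suggests, the result appears to be a named list of data frames, one per table. The sketch below shows one way to work with such a list. To stay self-contained it does not call the package: `tables` is a hand-built stand-in that copies the first table printed above, and the `share_pct` column is an illustrative addition.

```r
# Stand-in for the list returned above (values copied from table c18)
tables <- list(
  c70_total_del_pais_poblacion_c18 = data.frame(
    regimen_de_tenencia = c("Propietario",
                            "Inquilino o arrendatario",
                            "Ocupante en relación de dependencia",
                            "Ocupante gratuito",
                            "En otro carácter"),
    hogares = c(3553250, 1380950, 353300, 575650, 192950),
    stringsAsFactors = FALSE
  )
)

# Pull one table out of the list by name
tenure <- tables[["c70_total_del_pais_poblacion_c18"]]

# Share of households in each tenure category, in percent
tenure$share_pct <- round(100 * tenure$hogares / sum(tenure$hogares), 1)

tenure[, c("regimen_de_tenencia", "share_pct")]
```

With the real package, the same steps would apply to whichever element of the returned list you select.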

check_repository()

Report of available tables


check_repository(year = 1970,
                 topic = "CONDICIONES HABITACIONALES",
                 geolvl = "Total del país")
#>                            Archivo
#> 1 c70_total_del_pais_poblacion_c18
#> 2 c70_total_del_pais_poblacion_c20
#>                                                                                                      Titulo
#> 1    Cuadro 18. Total del país. Hogares particulares, personas y cuartos, por régimen de tenencia. Año 1970
#> 2 Cuadro 20. Total del país. Hogares particulares, por tamaño del hogar según régimen de tenencia. Año 1970

arcenso()

Shiny app for browsing the tables

arcenso()

Documentation

Next steps 💫



  • Improve and expand the package documentation

  • Continue the phased roadmap

  • Reach rOpenSci peer review standards

  • Increase package adoption and visibility

  • Seek institutional support and funding

Citizen Origins, Community Purpose

ARcenso is a citizen-driven initiative that emerged from our professional experience working with census data.

It was born from the need for accessible, tidy data among data users, and inspired by the collaborative spirit of the R community.

The People Behind ARcenso

With the support of the rOpenSci Champions Program, cohort 2023–2024, this project is led by Andrea Gómez Vargas, main developer, alongside Emanuel Ciardullo as co-developer and Luis D. Verde as mentor. Tamara Derner designed the project hex logo, and Ariana Bardauil created the Quarto slide theme.

The R Community Powering ARcenso

Behind every open-source tool, there’s a community lifting each other up.

A Shared Tool, A Growing Community


ARcenso has been presented at local R chapters, universities, and regional conferences.


What began as a personal need became a shared resource.

From User to Developer

I started out searching for solutions.

I ended up building one.

Thank U, Gracias! 😁