The Project

During the development of my corpus of Costa Rican slang dictionaries, a peculiarity of one of the texts stood out as possible inspiration for further analysis. Carlos Gagini’s Diccionario de costarriqueñismos is full of toponyms, which he listed as vocabulary words. His interest in place names is especially evident when compared to Arturo Agüero Chaves’s text, which has considerably fewer geographical terms (see Figure 11 of the corpus). I was able to obtain a physical copy of Miguel Ángel Quesada’s Nuevo diccionario de costarriqueñismos (the third slang dictionary that I mentioned in my corpus but didn’t work with), and it is just like Agüero’s book in this sense.

Gagini’s dictionary also includes three appendixes that list locations in the country. The first one contains geographical names that appeared in official government publications between 1859 and 1917, excluding “names of saints and those that are very well-known” (Gagini, 1918, p. 250). These, Gagini explains, had already appeared in Félix F. Noriega’s Diccionario geográfico (“geographical dictionary”), published in 1904. The second appendix is composed of geographical names taken from publications of the Central Office of Statistics (currently the National Institute of Statistics and Census). The third one presents geographical names taken from maps of Costa Rica.


Figure 1. Beginning of Appendix I from Gagini’s dictionary (p. 250).

Figure 2. Beginning of Appendix II from Gagini’s dictionary (p. 273).

Figure 3. Beginning of Appendix III from Gagini’s dictionary (p. 274).

Alongside the topic of toponyms, I was also attracted by the coexistence of different languages in Costa Rican slang, particularly that of indigenous languages and Spanish. I once worked on a project that analyzed two Costa Rican sculptures through the lens of the country’s dual identity, shaped by its history of colonization. The dictionaries inevitably reflect this same identity. It’s no surprise that Gagini was also passionate about the study of indigenous cultures and languages, publishing “Ensayo lexicográfico sobre la lengua de Térraba” (“Lexicographical Essay on the Language of Térraba”) in 1892 and Los aborígenes de Costa Rica (“Costa Rica’s Aboriginal Population”) in 1917 (“Gagini, Carlos,” n.d.).

My idea for this first project was to create a map with the districts of Costa Rica, indicating the language of origin of each district’s name. After learning how to use the online mapping platform CARTO, this seemed like a feasible task in terms of the tools that I needed, given that all the data processing could be done with this software. However, the data collection would determine which variables and information I could actually analyze alongside the toponyms.


Part I. Data Collection

Costa Rica’s administrative division has three levels. These are as follows, with their current number of components (“Administrative-Territorial Division of the Republic”, 2015):

– 7 provinces

– 81 cantons

– 481 districts

I chose to focus on the districts, thinking that the finer the granularity, the richer the analysis of languages across the country. Though I still think this to be true, I now realize that it would have been more appropriate to work with cantons, given time constraints. I didn’t have the time to collect data for 481 districts, and made the decision to focus only on the districts of San José (the province with the most districts, 123 in total). Though these still outnumber the cantons, they are not representative of the entire country: San José occupies only 10% of national territory (“Costa Rica y sus provincias”, 2007).

The latest list of districts was taken from La Gaceta, one of the official publications from which Gagini collected toponyms for Appendix II (see Figure 2). Then, the file with the polygons representing all of the country’s districts was downloaded from ArcGIS (file owned by user carloscastilloalfaro). When I compared the data in these two sets, I realized that two of the districts in the 2015 publication are not included in the file. It was easy to figure out the reason for this discrepancy. La Gaceta’s decree states that the district Jaris was created on June 28, 2012, and the district Quitirrisí on October 23, 2014 (“Administrative-Territorial Division of the Republic”, 2015, p. 9). The file download page indicates that it was uploaded on July 30, 2012. It is clear why Quitirrisí has no associated polygon. As for Jaris, the maker of the file might have been unaware of the recent creation of a new district.
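The comparison between the two sets can be sketched in Python as a simple set difference. This is only an illustration under stated assumptions, not necessarily how the comparison was originally carried out: the function name `missing_polygons` is hypothetical, the two short lists are an illustrative subset rather than the full data, and the normalization step (ignoring case and extra spaces) is an assumption about how the names might differ between sources.

```python
def missing_polygons(official_names, polygon_names):
    """Return the official district names that have no matching polygon."""
    def normalize(name):
        # Assumption: names may differ only in letter case and spacing.
        return " ".join(name.split()).casefold()

    available = {normalize(name) for name in polygon_names}
    return sorted(name for name in official_names if normalize(name) not in available)


# Illustrative subset only, not the full list of districts.
official = ["Carmen", "Merced", "Jaris", "Quitirrisí"]  # from the decree
polygons = ["CARMEN", "Merced "]                        # from the polygon file

print(missing_polygons(official, polygons))
```

Running the same check in the opposite direction (polygons with no official name) would also reveal districts that were renamed or merged between the two dates.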

This situation reminds me of an aspect of the humanities that I read about. “Humanists study the world created by humanity” (Gardiner & Musto, 2015, p. 14). Borders are a political matter, and thus don’t necessarily fit within the humanities. But they are human constructions, neither absolute nor definite. They are one of the features of the world we’ve created that most inform our perception of said world. And because our views of the world are ever-changing, so are borders.

Having uploaded the ArcGIS file on CARTO, I altered the resulting dataset to include information about the districts that was relevant to the project. The modified CSV file can be downloaded here. The following are extra columns that I added to the original data table:

– Language: indicates whether the word is of Spanish, indigenous, or another origin

– Sublanguage: if the language is indigenous, it indicates which indigenous sublanguage it is

– Category: I created three categories, based on how many names fit into them, and/or how relevant I thought they were in relation to the languages. These are Botany (botanical terms), Roman Catholicism (terms related to this religion, recognized as the State religion by the constitution), and Person (it only includes four names, which correspond to names of historical figures)

– Subcategory: Roman Catholicism is further divided into Saint (names of Saints) and Mary (names of the Virgin Mary). Person is divided into Cacique (indigenous leaders) and Roman Catholic Church (Church authorities). I didn’t count Church authorities as a subcategory of Roman Catholicism because, unlike Saints and the Virgin, these two figures (a pope and a bishop) are not objects of veneration

– Meaning: short description and/or English translation of the name

– IDS value: IDS stands for Índice de Desarrollo Social, or “social development index” in English. It’s calculated by the Ministry of National Planning and Economic Policy, to “classify districts and cantons of the country according to their level of social development” (“Costa Rica: Índice de Desarrollo Social (IDS) 2013”, 2013). I included this in the dataset out of curiosity to see if there was any correlation between the language of the name of a district and its position in the IDS ranking (from 2013).
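The column rules listed above can be written down as a small record type with consistency checks. This is a hedged sketch, not the structure of the actual CARTO table: the class name `DistrictName`, the field names, and the value "Indigenous" for the Language column are assumptions made for illustration. The example row uses San Francisco de Dos Ríos, with its IDS value left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional

# Subcategories allowed under each category, as described in the list above.
ALLOWED_SUBCATEGORIES = {
    "Botany": set(),
    "Roman Catholicism": {"Saint", "Mary"},
    "Person": {"Cacique", "Roman Catholic Church"},
}


@dataclass
class DistrictName:
    name: str
    language: str                      # e.g. "Spanish" or "Indigenous" (assumed labels)
    sublanguage: Optional[str] = None  # only meaningful for indigenous names
    category: Optional[str] = None
    subcategory: Optional[str] = None
    meaning: str = ""
    ids_value: Optional[float] = None  # left empty when unknown

    def problems(self):
        """Return a list of rule violations (empty if the row is consistent)."""
        issues = []
        if self.sublanguage and self.language != "Indigenous":
            issues.append("sublanguage set for a non-indigenous name")
        if self.category is not None and self.category not in ALLOWED_SUBCATEGORIES:
            issues.append(f"unknown category: {self.category}")
        if self.subcategory is not None:
            allowed = ALLOWED_SUBCATEGORIES.get(self.category, set())
            if self.subcategory not in allowed:
                issues.append(f"subcategory {self.subcategory!r} not allowed here")
        return issues


row = DistrictName(
    name="San Francisco de Dos Ríos",
    language="Spanish",
    category="Roman Catholicism",
    subcategory="Saint",
    meaning="Saint Francis of Two Rivers",
)
print(row.problems())
```

A check like this is mostly useful while entering data by hand, since it flags rows where a subcategory was typed under the wrong category.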

Gathering information for these categories was time-consuming, as I had to consult several different sources. Yet I’m immensely grateful to have had access to such sources, and that the information they provide is out there in the first place.  Research is unavoidably collaborative.

Amongst these sources were the three Costa Rican slang dictionaries, academic articles about indigenous toponymy in the region, a database of names of saints and other holy people, and even newspaper articles. I was as thorough as I could be, and it was during this process that I decided to create categories for some of the names (such as those used in the slang dictionaries).

Curating the data implied issues of its own. These are some examples:

– Synonyms (taxonomy): I encountered a problem with the first name that I wanted to translate into its botanical term. Two of the dictionaries had it as an entry, but the associated scientific names were different. I originally thought that it was a matter of accuracy, but later learned that synonyms exist in taxonomy as the scientific names of the same specimen under different systems. I thus had to make decisions regarding which name to use

– Different plants: in one case though (with the word “Uruca”), two sources did point to two different plants that are very similar, but not the same. Again, I had to decide which one to go for

– Disregard for full names: many of the district names have references to locations in them. One example is San Francisco de Dos Ríos, which means “Saint Francis of Two Rivers”.  The name is made of two parts: Saint Francis (Roman Catholicism, Saint) and two rivers, referring to the actual two rivers that border the district in the north and south. I made the choice to ignore the reference to the rivers. Firstly, because it simplifies the handling of the data (just one category per name), and secondly, because it is my judgement (for the purposes of this project) that the naming of the saint says more about the culture that named the place than the geographical information.


Part II. Maps

The following are the maps that I created in CARTO, using the modified dataset. In the first two (maps 1 and 2), the hexagonal pattern corresponds to indigenous languages, the dotted pattern to Spanish, and the line pattern to Haitian Creole.

Map 1. Districts of San José according to language of origin of their names (patterns) and their name categories (colors)


Map 2. Districts of San José according to language of origin of their names (patterns) and IDS values (colors). The relative development levels indicated in the legend follow the classification system used in the government report about the 2013 social development index.


Map 3. Districts of San José whose names originate from an indigenous language, divided into sublanguages (colors) and compared to a map of the probable distribution of native languages across the territory before the arrival of Spanish colonizers. The basemap was obtained by downloading the original map from Wikipedia, changing its colors with a photo editor, rectifying it on Map Warper, and then connecting it to CARTO with its tiles address. The darker gray patches indicate where a specific indigenous language was predominant.


Map 4. Districts of San José whose names belong to the Roman Catholicism category. As expected, all of these names are also in Spanish (as Catholicism was brought to the continent by the Spaniards).  These districts are divided according to subcategory (colors). The blue districts belong to another category (Person), but are included in this map given the connection of these two persons with the Catholic Church (they were Church authorities).


My main issue, as I mentioned before, had to do with the limited scope that choosing only the districts of San José gave me in terms of analyzing toponyms in Costa Rica. However, it gave me a very complete picture of San José. So, before discussing these results, I should note that a second version of this project that seeks to represent Costa Rica would have to include either all 481 districts or the cantons across the country. Another improvement would be to carry out quantitative comparisons among the data.

San José is the province where the capital city of San José is located. It’s the densest province in terms of population (“Costa Rica y sus provincias”, 2007) and has the highest development rates (the first 5 districts in the ranking belong to San José, and the last 18 belong to other provinces). Alongside this, most of the districts have names in Spanish, and the category with the most districts is Roman Catholicism. This could hint at a correlation between the tradition of the colonizer and progress, yet the answer is not conclusive. For instance, I live in Escazú, one of the districts that is highest in the index (7th place) and named after a cacique. It is also true that out of 6 districts in the very low relative social development level, only one has an indigenous name.
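One way to put this comparison on a more quantitative footing is to summarize the IDS values per language group. The sketch below is a minimal illustration in Python, not an analysis that was actually run: the function name `ids_by_language` is hypothetical, and the numbers are placeholders invented for the example, NOT real IDS values.

```python
from collections import defaultdict
from statistics import mean, median


def ids_by_language(rows):
    """Group (language, ids_value) pairs and summarize each group."""
    groups = defaultdict(list)
    for language, ids_value in rows:
        groups[language].append(ids_value)
    return {
        language: {"n": len(values), "mean": mean(values), "median": median(values)}
        for language, values in groups.items()
    }


# Placeholder numbers for illustration only; these are NOT real IDS values.
rows = [
    ("Spanish", 80.0),
    ("Spanish", 60.0),
    ("Indigenous", 90.0),
    ("Indigenous", 50.0),
    ("Indigenous", 70.0),
]

summary = ids_by_language(rows)
for language, stats in summary.items():
    print(language, stats)
```

With only a handful of indigenous-named districts in each development level, group sizes matter as much as the averages, which is why the sketch reports `n` alongside the mean and median.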

Regarding the third map, I originally did not include the georeferenced basemap with the distribution of the original languages, and only theorized that the districts with names in specific sublanguages might correspond with the places where those languages were spoken. After a wake-up call from my professor, I compared my map with the one I found online, and indeed, the districts with Huetar names are in or just outside the region where the language was spoken.

I think it would have been interesting to find a map of “religious distribution” in the province and compare it to my fourth map. I didn’t have much hope of finding one, and in fact didn’t, but I’m curious to know if there is any relationship between the districts named after Roman Catholic figures and the beliefs and practices of their inhabitants.

There are certainly many more analyses that could be carried out with this data, and more so with data from across the country. At a personal level, I enjoyed focusing on my province, to understand it better as a whole; there are many, many districts that I haven’t been to, and even for those I have visited, I wasn’t aware of the meaning of many of their names. I can’t help but think about how knowing another person’s name is traditionally a crucial component of establishing a connection. Knowing a place’s name might have the same effect. No wonder Gagini, known for his efforts to establish “the” Costa Rican identity in national literature, cared so much about toponyms.



(1) Executive Power. (2015). Decree No. 39286-MGP. Administrative-Territorial Division of the Republic. Published in La Gaceta Diario Oficial No. 220, Thursday November 12 of 2015. Costa Rica.

(2) Gagini, C. (1918). Diccionario de costarriqueñismos. San José: Imprenta Nacional.

(3) Gagini, Carlos, (n.d.). Retrieved from

(4) Gardiner, E. & Musto, R. G. (2015). The Digital Humanities: A Primer for Students and Scholars. New York: Cambridge University Press.

(5) Costa Rica: Índice de Desarrollo Social (IDS) 2013. (September, 2013). Retrieved from

(6) León-Castella, A. & Murillo, M. E. (April 9, 2007). Costa Rica y sus provincias. Retrieved from

