Walter de Gruyter GmbH
‘Divorced ~ castrated’: Gender stereotypes in the Croatian section of Kontekst.io, a computer-generated thesaurus of synonyms and semantically related terms
2025
Summary: This article examines gender stereotypes in selected entries referring to marital status in Kontekst.io, a computer-generated thesaurus of synonyms and semantically related words. It is the most popular local online thesaurus in Croatia. We focus specifically on its Croatian section, which is derived from a large web corpus of Croatian comprising 1.4 billion words, using a natural language processing (NLP) model based on word embeddings. The analyzed entries are udana/udata (married, feminine), oženjen (married, masculine), razvedena (divorced, feminine), razveden (divorced, masculine), udovica (widow) and udovac (widower). We first categorize the synonyms and semantically related terms from these entries into semantic fields. We then critically analyze gender bias field by field, using the frameworks of critical lexicography and critical discourse analysis. The results point to the presence of gender bias in the analyzed entries.