Don’t Patronize Me! An annotated dataset with Patronizing and Condescending Language
We are all condescending or patronizing sometimes, even unintentionally. And this happens more often when we speak about vulnerable communities.
Don’t worry, I know this is a mistake you usually make, we all make it sometimes, but I am bringing you a solution. Ok, that was condescending, sorry, but it was meant to be an example of Patronizing and Condescending Language (PCL) and how it makes us feel.
What is PCL?
Somebody is patronizing or condescending when their language denotes a superior attitude towards others, talks down to them, or describes them or their situation in a charitable way, raising a feeling of pity and compassion.
The tricky aspect of PCL, especially if it is geared towards vulnerable communities, is that often the author has positive intentions when using it. And they might even be unaware of the tone of their message and how it can impact others.
But these messages generally just want to help people in need. What’s wrong with that?
An NGO campaign might want to raise funds to help a community or an individual. A person talking to someone in a more vulnerable situation might want to express understanding, support or admiration for the challenges faced by the person they consider vulnerable. News stories might look to raise awareness about underprivileged groups and move their audience to action.
But have all of them thought about how these supposedly vulnerable people feel when they are referred to in such a way? Do they know whether those who have been labelled as vulnerable or underprivileged would consider themselves as such? Do they want or need your help? Will they be willing to receive the compassionate or pitying looks of others?
There is nothing wrong, of course, in wanting to help others. But we can raise funds and awareness, move people to action or describe a rough life experience without being condescending. And we must, because the impacts of PCL are many and harmful.
Many researchers in sociolinguistics have studied the traits of PCL and its impacts on potentially vulnerable communities. Some of them are:
- PCL fuels discriminatory behaviour (Mendelsohn et al., 2020);
- it creates and feeds stereotypes (Fiske, 1993), which lead to greater exclusion, discrimination, rumour spreading and misinformation (Nolan and Mikami, 2013);
- it strengthens power-knowledge relationships (Foucault, 1980) by presenting communities in need as passive receivers of help, waiting for a saviour to help them out of their situation (Bell, 2013; Straubhaar, 2015);
- it minimizes or tends to avoid stating the reasons for very deep-rooted problems, and sometimes even subtly blames individuals for their situations;
- and it proposes ephemeral and simple solutions (Chouliaraki, 2010) which oversimplify the deep-rooted problems vulnerable communities face.
Long story short, patronizing and condescending language creates a discriminatory mindset which makes it more difficult for vulnerable communities to overcome difficult situations and reach total inclusion (Nolan and Mikami, 2013).
How can Natural Language Processing help with PCL?
Although harmful behaviour in language (e.g. hate speech, offensive language, fake news, rumour propagation or misinformation) has been widely studied in NLP, PCL was a neglected area of study until very recently. In recent years, though, NLP has seen more interest in detecting subtler kinds of bias and relations of power, among others, which opens an interesting opportunity for studying PCL.
With the objective of encouraging more research on PCL, we have created the Don’t Patronize Me! dataset, a collection of more than 10,000 paragraphs from news stories about vulnerable communities published in 20 English-speaking countries.
In addition, we propose a novel taxonomy of three top-level and seven low-level PCL categories, which describes the different types of condescension that can be found in news articles about vulnerable communities.
The data is annotated at two levels: 1) classifying each paragraph as containing Patronizing and Condescending Language or not, and 2) for those paragraphs that contain PCL, identifying which spans of text contain the condescension and which of the seven categories in our taxonomy they belong to.
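To make the two annotation levels concrete, here is a minimal Python sketch of how one annotated paragraph could be represented. It is only an illustration: the class names, field names, character-offset convention and the `placeholder-category` label are my assumptions, not the actual format of the released files, and the example sentence is made up rather than taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PCLSpan:
    """A stretch of text marked as condescending (annotation level 2)."""
    start: int      # character offset where the span begins (assumed convention)
    end: int        # character offset just past the span's last character
    category: str   # one of the seven low-level categories (placeholder here)


@dataclass
class AnnotatedParagraph:
    """One news paragraph carrying both annotation levels (hypothetical layout)."""
    text: str
    contains_pcl: bool                                   # level 1: binary label
    spans: List[PCLSpan] = field(default_factory=list)   # level 2: spans + categories

    def __post_init__(self) -> None:
        # Spans only make sense for paragraphs labelled as containing PCL.
        if not self.contains_pcl and self.spans:
            raise ValueError("a paragraph without PCL cannot carry spans")
        for span in self.spans:
            if not 0 <= span.start < span.end <= len(self.text):
                raise ValueError("span offsets fall outside the paragraph")

    def span_texts(self) -> List[str]:
        """Return the text covered by each annotated span."""
        return [self.text[s.start:s.end] for s in self.spans]


# Made-up sentence, used only to show the structure.
text = "They are so brave, and they need us to give them a voice."
start = text.index("they need us")
example = AnnotatedParagraph(
    text=text,
    contains_pcl=True,
    spans=[PCLSpan(start, len(text) - 1, "placeholder-category")],
)
print(example.span_texts())
```

Keeping the binary label and the spans in one record makes the dependency between the two levels explicit: a paragraph labelled as free of PCL cannot carry any spans.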
Identifying PCL is hard even for humans because it is subjective and subtle. For instance, I might find something condescending which another person might consider an objective portrayal of a situation. Or some people might not see the harm in describing how those in a privileged position donate their leftovers to those who need them. Also, we would expect a member of a so-called vulnerable community to feel more patronised than someone who does not belong to such a group when reading how others refer to them.
However, after a challenging process, we are in a position to release our dataset to the community. We trained some baseline models to get an idea of how current NLP techniques perform on this task and, although we saw that identifying PCL is feasible, it is still a challenge. You can find the details in our COLING 2020 paper.
I believe patronization and condescension can only harm vulnerable communities, enlarge the distance between people with different backgrounds, feed inequalities and, in summary, pollute our rich, varied society with unfair power relations and ignorance. As always, the worst affected by all this are the underrepresented groups. If NLP is able to properly identify when we are being condescending or patronizing towards others, we will at least have the possibility to change our message to make it more inclusive and constructive and to contribute, that way, to more responsible communication.
REFERENCES:
Lilie Chouliaraki. 2010. Post-humanitarianism: Humanitarian communication beyond a politics of pity. International Journal of Cultural Studies.
Katherine M Bell. 2013. Raising Africa?: Celebrity and the rhetoric of the white saviour. PORTAL Journal of Multidisciplinary International Studies, 10(1).
Susan T Fiske. 1993. Controlling other people: The impact of power on stereotyping. American Psychologist, 48(6):621.
Michel Foucault. 1980. Power/knowledge: Selected interviews and other writings, 1972–1977. Vintage.
Julia Mendelsohn, Yulia Tsvetkov, and Dan Jurafsky. 2020. A framework for the computational linguistic analysis of dehumanization.
David Nolan and Akina Mikami. 2013. ‘The things that we have to do’: Ethics and instrumentality in humanitarian communication. Global Media and Communication, 9(1):53–70.
Rolf Straubhaar. 2015. The stark reality of the ‘white saviour’ complex and the need for critical consciousness: A document analysis of the early journals of a Freirean educator. Compare: A Journal of Comparative and International Education, 45(3):381–400.