Is Wikidata Socially Biased?

  • Many real-world Knowledge Graphs (KGs), such as Wikidata, have unbalanced distributions of people across genders, ethnicities, religions, and nationalities. Biases arising from these unbalanced distributions are then propagated to the deep learning models that use such KGs, for example knowledge graph embeddings (KGEs) trained on them. A recent study has demonstrated that, due to such unequal distributions in Wikidata, profession-related biases appear in KGEs.

    These biases, once encoded in KGEs, are harmful to downstream tasks such as machine translation that rely on such embeddings. The study [1] measures biases in the Wikidata and Freebase KGs considering one relation (gender, ethnicity, religion, or nationality) at a time. For example, in terms of gender bias, the embeddings associate men more strongly with being bankers and women more strongly with being homemakers.
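
    To make the idea concrete, below is a minimal sketch of how one might probe a translational (TransE-style) KGE for gender-profession associations; it is not the measurement procedure of [1], and the entity/relation names and toy vectors are hypothetical stand-ins for embeddings trained on Wikidata.

    ```python
    # Minimal sketch: probing a TransE-style KGE for gender-profession bias.
    # Assumption: entity and relation names below are hypothetical; in practice
    # the vectors would be loaded from a KGE model trained on a Wikidata dump.
    import numpy as np

    rng = np.random.default_rng(0)
    dim = 50

    # Toy stand-ins for pre-trained embeddings.
    entities = {name: rng.normal(size=dim)
                for name in ["male", "female", "banker", "homemaker"]}
    relations = {"has_profession": rng.normal(size=dim)}

    def transe_score(head, relation, tail):
        """TransE plausibility: higher (less negative) means more plausible."""
        return -np.linalg.norm(entities[head] + relations[relation] - entities[tail])

    def profession_gap(profession):
        """Score gap for a profession between male and female entities.
        Positive values mean the embedding favours associating the
        profession with men; negative values favour women."""
        return (transe_score("male", "has_profession", profession)
                - transe_score("female", "has_profession", profession))

    for prof in ["banker", "homemaker"]:
        print(f"{prof}: male-vs-female score gap = {profession_gap(prof):+.3f}")
    ```

    With real embeddings, a systematic gap of this kind across many profession entities would indicate the sort of gender bias the study describes.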