As the world around us becomes more intertwined with statistical algorithms and predictive models, it becomes vital to ask how our inherent biases creep into these systems. We live in an age where the ads we see, the music we listen to, and even the food we eat have been optimized according to specific data patterns. Acknowledging the inseparability of reality and data science is the first step to understanding the scope of technology’s influence. The second is remembering that this technology has human roots.
Tools created by people are shaped by the agendas and contexts that surround their development. Since it is impossible to separate humans from this creation process, it is also impossible to separate human biases from it. Examples of algorithms and technology that display prejudice are endless. Searching “AI Biases” on the internet generates a list of incidents ranging from a Google service that changed image labels depending only on skin tone to Amazon’s AI recruiting process that selected male candidates more frequently than similarly qualified female candidates.
In the field of data science, there are two major schools of thought when it comes to assessing the inherent predispositions within a tool. The leading hypothesis is that programmers are biased, and therefore the tools they build are biased by overt or latent agendas. A second school of thought holds that the data fed into models are inherently unequal: inequality is a product of the data’s context, and these biases are reproduced by the model. Although the truth probably lies somewhere between these two views, it is important to test them systematically.
Bo Cowgill and coauthors present valuable conclusions that add to the debate over AI ethics and inherent bias. Evaluating the performance of about 400 AI engineers in an audit-like scenario, the authors find that biased predictions are largely caused by biased data, while personal demographic characteristics such as gender and ethnicity have little effect on performance.
Three major conclusions remain with the reader at the end of Cowgill and coauthors’ research. First, participants who received unbiased data were more likely to work longer on the project. This “better data” treatment served as a complement to effort and may have functioned as a mechanism that reduced gender inequality measurements in this group’s predictions. Second, biased data caused biased results, but a simple reminder of this prejudice was enough to correct for about half of the error. Taken together, these two facts imply that better data, or even a reminder of poor data, can produce significant improvements in outcomes. Lastly, the authors find that individuals who belong to minority groups do not exhibit less bias in their algorithms; rather, engineers with similar demographics often showed the same direction of error. This suggests that no single type of person may be able to overcome inherent tendencies, but a group of diverse individuals can compensate for one another’s weaknesses. These findings are pertinent to the future of work and provide new understandings of societal biases within AI.
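The “gender inequality measurements” discussed above refer to group-level differences in a model’s outputs. One standard way such a gap is quantified in the fairness literature is the demographic parity gap, the difference in positive-prediction rates between groups. The sketch below is illustrative only: it is not the authors’ code, and the predictions and group labels are made up for the example.

```python
def demographic_parity_gap(predictions, groups):
    """Absolute difference in positive-prediction rate between two groups.

    predictions: list of 0/1 model outputs
    groups: parallel list of group labels (assumed binary here, e.g. "M"/"F")
    """
    labels = sorted(set(groups))
    rates = []
    for g in labels:
        preds = [p for p, grp in zip(predictions, groups) if grp == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])

# Hypothetical outputs from a biased model vs. a corrected one
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
biased = [0, 0, 0, 1, 1, 1, 1, 0]   # positive rate: F 0.25, M 0.75
fairer = [0, 1, 0, 1, 1, 0, 1, 0]   # positive rate: F 0.50, M 0.50

print(demographic_parity_gap(biased, groups))  # 0.5
print(demographic_parity_gap(fairer, groups))  # 0.0
```

A gap of zero means both groups receive positive predictions at the same rate; the study’s treatments (better data, or a reminder of bias) can be read as interventions that shrink this kind of gap.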
Cowgill, B., Dell’Acqua, F., Deng, S., Hsu, D., Verma, N., & Chaintreau, A. (2020). Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3615404