How can we reduce bias in AI?

Prejudice is part of human nature, and because humans create machines, they may unintentionally pass their biases on to them. In our time, it is impossible to ignore the effects and repercussions of human bias against other humans, and now machine bias has been added to the equation, which puts us in a very dangerous position.

The damage and severity of automated bias is a two-fold problem. The first part is trust: people place great faith in the objectivity of artificial intelligence, so if bias introduced during the training process goes undetected, the problem multiplies. The second part relates to automation: artificial intelligence models are often connected to a software function, which automates the bias itself and puts us in a critical situation.

What is AI bias?

Artificial intelligence bias occurs when algorithms, especially those that rely on deep learning, discriminate between people based on race, gender, or other attributes. As we've already said, bias seeps into algorithms through the training data or during the data preparation process.

For example, if a deep learning algorithm is fed more light-skinned faces than darker-skinned faces, it will favor fair-skinned people. Bias can also be introduced during the data preparation stage, including when identifying the features you want the algorithm to take into account.
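
To make the effect concrete, the minimal sketch below shows how per-group accuracy exposes a non-representative training set. Everything here is synthetic and illustrative; a real check would use the actual model's predictions on a held-out evaluation set.

```python
import numpy as np
import pandas as pd

def accuracy_by_group(y_true, y_pred, groups):
    """Accuracy of a model's predictions for each value of a sensitive attribute."""
    df = pd.DataFrame({"correct": np.asarray(y_true) == np.asarray(y_pred),
                       "group": groups})
    return df.groupby("group")["correct"].mean()

# Synthetic stand-in for a real evaluation set (names and rates are assumptions):
rng = np.random.default_rng(0)
groups = rng.choice(["lighter", "darker"], size=1000, p=[0.8, 0.2])
y_true = rng.integers(0, 2, size=1000)
# Simulate a model that errs far more often on the under-represented group:
err_rate = np.where(groups == "darker", 0.30, 0.05)
y_pred = np.where(rng.random(1000) < err_rate, 1 - y_true, y_true)

print(accuracy_by_group(y_true, y_pred, groups))  # a large gap flags the imbalance
```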

Why are the sensitive variables not simply removed from the data?

The obvious question anyone might ask is: why not simply remove the sensitive variables from the data to eliminate bias?

In fact, it is not that simple, and removing the sensitive variables does not solve the problem. Even if they are removed, their associations with the rest of the data remain: a dataset is a web of interrelated information, numbers, and results, and machine learning methods can infer, using probabilities, the hidden variables, i.e., those that were removed. In other words, machine learning algorithms can reconstruct the removed data from the remaining data and thus still base their judgments on biased information.
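
One practical check follows directly from this: try to predict the removed sensitive variable from the remaining features, and if that works well above chance, the "cleaned" data still encodes it. A minimal sketch with scikit-learn, using a synthetic DataFrame in which a hypothetical `zip_code` column acts as a proxy:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def proxy_score(df, sensitive_col):
    """Mean AUC when predicting a removed (binary) sensitive column from the
    other features; scores well above 0.5 mean the data still leaks it."""
    X = df.drop(columns=[sensitive_col])
    y = df[sensitive_col]
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    return cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()

# Synthetic illustration: "zip_code" correlates strongly with the sensitive group.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
df = pd.DataFrame({
    "group": group,
    "zip_code": group * 10 + rng.integers(0, 3, size=1000),  # strong proxy
    "income": rng.normal(50, 10, size=1000),                 # unrelated feature
})
print(proxy_score(df, "group"))  # close to 1.0 => proxies give the group away
```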

“Blind taste test” to eliminate bias.

*Photo by fauxels from Pexels*

The blind taste test has been around for decades: tasting something without knowing what it is and evaluating it based on the sense of taste alone.

In the 1970s, blind taste tests of Coca-Cola and Pepsi were common: participants indicated which drink they preferred without knowing its name (its identity). The result was that the majority preferred Pepsi to Coca-Cola, but once the brand was revealed, the bias swung in favor of Coca-Cola at its competitor's expense.

The upshot is that removing the identity information, in this case the Coca-Cola branding, eliminated the bias and pushed people to rely solely on their sense of taste.

Blind taste testing therefore reduces human bias by removing key identifying information from the evaluation process, and a similar approach can be adopted to reduce bias in machines. But we should not stop there because, as already mentioned, removing identity alone will not solve the problem; additional measures must be adopted in testing, evaluation, and modification.
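
In code, the machine equivalent of a blind taste test is to strip identifying fields from every record before any model or evaluator sees it. A minimal sketch; which fields count as identity is an assumption that varies per use case:

```python
# Hypothetical set of identity fields; choose these per application.
IDENTITY_FIELDS = {"name", "brand", "university", "photo_url"}

def blind(record: dict) -> dict:
    """Return a copy of the record with identifying fields removed."""
    return {k: v for k, v in record.items() if k not in IDENTITY_FIELDS}

sample = {"brand": "Coca-Cola", "sweetness": 7.2, "fizz": 8.1}
print(blind(sample))  # {'sweetness': 7.2, 'fizz': 8.1}
```

As argued above, this blinding step is necessary but not sufficient: correlated features can still leak the removed identity.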

Case Study: Bias in the replication of results from scientific studies

*Photo by fauxels from Pexels*

In medicine and science, the results of clinical trials are published, and a group of scientists then evaluates them to determine whether they can be replicated in practice. The result, according to scientists' assessments, is that 86% of published studies in medicine, biology, and the social sciences are not replicable, which leads to annual losses estimated at $28 billion and delays basic, important, and vital work for humanity, such as the discovery of vaccines and treatments for diseases, as happened during the COVID-19 pandemic.

The main problem here revolves around bias. When scientists review results, the focus is on the quantitative data while the qualitative data is largely disregarded: numbers are given more importance than words. Human reviews are also influenced by identity, such as the name of the university that conducted the research. This pattern is then transmitted to artificial intelligence systems, which learn a bias similar to the human one. To eliminate it, the blind taste test implies removing identity and then relying on the qualitative data to support or contradict the quantitative data, through a process based on experiment, scrutiny, and comparison.

Applying the “blind taste” methodology in the business world

*Photo by Canva Studio from Pexels*

Blind taste tests can be applied in the business world to reduce bias and obtain better results. Take, for example, what happens when the financial performance of large companies is presented to shareholders and analysts. The “audience” here uses the content to predict the company's future performance, which quickly and strongly affects stock prices.

But human recipients are biased: they tend to rely on the numbers presented rather than the qualitative data, and they give great importance to the identity of the person speaking. For example, the bias will favor Jeff Bezos or Elon Musk and what they say compared with lesser-known figures.

When properly trained, artificial intelligence systems can look past the information that triggers bias and weigh textual factors (words, not just numbers) and other signals to provide objective assessments that can be relied upon for decision-making.

Keys to managing bias when building artificial intelligence

Choose the correct learning model for the problem.

There is a reason AI models differ: each problem needs different solutions. There is no single model that can be adopted to avoid bias, but there are several early signs of it.

For example, supervised and unsupervised learning models each have their pros and cons. Unsupervised models that cluster data or reduce its dimensions can learn bias from their data set: if belonging to group A is closely correlated with behavior B, the model can conflate the two, as the toy example below shows. Supervised models allow more control over data selection, but that very control can introduce human bias into the process.
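
That conflation is easy to reproduce: when group membership correlates with a behavioral feature, a clustering model will often split the data along group lines. A toy sketch with entirely synthetic data:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)              # membership in group A vs. B
behavior = group * 2.0 + rng.normal(0, 0.5, 1000)  # behavior correlated with group

clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    behavior.reshape(-1, 1)
)
# The cross-tabulation shows the clusters have effectively rediscovered the groups:
print(pd.crosstab(group, clusters, rownames=["group"], colnames=["cluster"]))
```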

Fairness through unawareness, that is, excluding sensitive information from the model, may seem like a practical solution. However, it still has weaknesses and is therefore not sufficient on its own; it must be combined with other measures.

Therefore, the best model for a particular case must be identified and explored from all sides to find the gaps and fix them. Although this approach requires a long time, it is much better than discovering the problem after the artificial intelligence system is already in use.

Choose a representative data set for training.

There is a fine line between data that leads to bias and data that keeps the analysis process flexible. It must therefore be ensured that the training data is diverse and includes the different population segments. How the data is partitioned can become a big problem later if real-world data is not distributed in the same way, so it is worth checking representation directly, as in the sketch below.
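
A simple guard is to compare each group's share of the training data with its share in the population the model will serve. A minimal sketch; the `age_band` column and all proportions are illustrative assumptions:

```python
import numpy as np
import pandas as pd

def representation_gap(train_df, real_df, group_col):
    """Each group's share in the training data minus its share in real-world data."""
    train_share = train_df[group_col].value_counts(normalize=True)
    real_share = real_df[group_col].value_counts(normalize=True)
    return train_share.sub(real_share, fill_value=0).sort_values()

# Synthetic illustration: group "C" is badly under-represented in training.
rng = np.random.default_rng(0)
train = pd.DataFrame({"age_band": rng.choice(list("ABC"), 1000, p=[0.5, 0.45, 0.05])})
real = pd.DataFrame({"age_band": rng.choice(list("ABC"), 1000, p=[0.4, 0.35, 0.25])})
print(representation_gap(train, real, "age_band"))  # "C" shows a large negative gap
```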

It is not preferable to have different models for different groups. Once it is discovered that there is not enough data for a single group, weighting can increase that group's relevance in training (see the sketch below), but this must be done with great caution, as it can lead to new, unexpected biases.
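
One common way to do that weighting is inverse-frequency sample weights, so that a small group is not drowned out by a large one. A sketch using scikit-learn's `compute_sample_weight` utility on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                            # synthetic features
y = (X[:, 0] > 0).astype(int)                             # synthetic target
groups = rng.choice(["A", "B"], size=1000, p=[0.9, 0.1])  # "B" is under-represented

# "balanced" gives each sample a weight inversely proportional to its group's
# frequency, so the small group is not drowned out during training.
weights = compute_sample_weight(class_weight="balanced", y=groups)

model = LogisticRegression(max_iter=1000)
model.fit(X, y, sample_weight=weights)
# Caution, as noted above: re-check per-group metrics afterwards, since
# reweighting can itself introduce new, unexpected biases.
```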

Monitor performance with real data

*Photo by Christina Morillo from Pexels*

No company makes intentionally biased AI; biased data and processes are what lead to it. This is why real-world applications should be simulated as much as possible when creating algorithms.

What can be done is to rely on statistical data in addition to real data. For example, if the characteristics of people who default on loans are identified, and the statistics disfavor a specific group, it is necessary to work on the “cause” and feed it into the artificial intelligence system. The goal is to create equality of results and to ensure equal opportunities; two simple monitoring metrics are sketched below.
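
Two standard ways to quantify “equality of results” and “equal opportunities” on live predictions are the demographic parity gap and the equal opportunity gap. A minimal sketch in plain NumPy, with synthetic monitoring data standing in for real loan decisions:

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Difference in positive-outcome rates (e.g. loan approvals) between groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equal_opportunity_gap(y_true, y_pred, group):
    """Difference in true-positive rates: among people who actually repay,
    is each group equally likely to be approved?"""
    tprs = [y_pred[(group == g) & (y_true == 1)].mean() for g in np.unique(group)]
    return max(tprs) - min(tprs)

# Synthetic monitoring data; in production these come from the live system.
rng = np.random.default_rng(0)
group = rng.choice(["A", "B"], size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
print(demographic_parity_gap(y_pred, group))        # gaps near 0 are the goal
print(equal_opportunity_gap(y_true, y_pred, group))
```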

Thanks for reading

*Header photo by Alex Knight from Pexels*

---

Be sure to subscribe to the Coil platform to support us!

Coil is building towards a future where anyone who is willing to pay for content can get one subscription from any provider that works everywhere. Supporting creators becomes automatic and no longer depends on an oligopoly of huge content platforms. Users would no longer need a dozen subscriptions just because the content they enjoy is stuck on a multitude of different sites.

Learn more about the Coil platform on the Coil blog:

https://coil.com/u/coil
