Most AI Models Are Trained on Biased Data. How Can We Make AI Fairer?

What happens when chatbots and AI systems give erroneous or even biased responses? These systems are designed to provide quick and accurate responses to a wide range of inquiries and to automate various tasks, but when the underlying algorithms or data sources are flawed, the results can be incorrect, misleading, or even harmful.

For several years now, AI researcher Viktor Miloshevski has been at the forefront of the movement to ensure that artificial intelligence is developed in a responsible and ethical manner. The Macedonian-born scientist has contributed to the advancement of ethical AI with his PhD research, which provides a comprehensive examination of the moral and ethical dilemmas that arise in the development of AI.

Titled “The moral and ethical dilemmas of the infosphere: the Big Data challenge in the process of creation of Artificial Intelligence”, Miloshevski’s thesis offers a fresh perspective on the complex and pressing question of how ethics intertwines with AI. As he explains, the research is one of the very few in Europe that examines the deep social phenomena involved in the creation of AI.

In the same direction, his new project, LivAI, aims to contribute to the ethical digital advancement of European communities through the piloting of AI adult educational pathways. The project also aims to produce a handbook for understanding ethics in the field of data and AI, as well as an e-learning platform with validation and certification of acquired competencies.

In an interview for The Recursive, he reflects on the role of ethics in AI and what this means for the future of the widespread implementation of this technology.

The Recursive: Could you tell us a bit about your background, experience and research in the field of ethical AI?

Viktor Miloshevski: I am currently in the final stage of my Ph.D. studies and I am finishing my Ph.D. thesis at the University of the Balearic Islands. I belong to the Human Cognition and Evolution research group.

The research group was created in 2000, and it focuses on the philosophical, anthropological, psychological, psychopathological, paleontological, and ethical aspects of human nature.

I submitted my research plan to the University in 2017, years before the topic was in the spotlight, so it could be said that my research started ahead of its time. My professional background is in innovation projects, where I work on the digitalization of education and the creation of digital tools and products.

How would you define ethical AI and what are its main principles?

AI is quite a new phenomenon, which makes it really hard to capture in one ultimate definition. Ethics, on the other hand, is a very old phenomenon whose definition evolves depending on the context and environment in which it is practiced or observed.

That is why the AI community is centered more on the definition and adoption of principles rather than a single ultimate definition.

Different policy documents cover the basic principles, and I would say that they center on AI that is human-centric, trustworthy, respectful of human rights and democratic values, socially beneficial, and so on.

Why is ethical AI important and what are the main ethical concerns here?

If we accept the philosophy of Luciano Floridi (an Italian philosopher of information and computer ethics) and establish that we are living in an Infosphere, a world where being online is no longer a choice but a survival instinct, then we recognize that our everyday life is driven by AI and related technologies.

Since AI and related technologies govern our everyday life, the ethical frameworks in which those enablers operate are arguably of utmost importance for our well-being and prosperity.

The exponential growth in computational power and the massive acceptance of AI as a process driver in every field have created space for the fast growth and deployment of models that are not necessarily trained to be ethical or unbiased.

Nowadays it could be claimed that every ML model is biased, since the training data is biased according to one or more indicators. So having biased AI is already an ethical concern, and facing the fact that all AI models are biased in some way is a matter of urgency.

Another point of concern, also recognized by the international community, is the understanding of AI. There are points in the functioning of a model where even its creators are not sure how it works or why the results it produces are quite different from the initial results. The inability to predict long-term scenarios for AI and other agents is an ethical and a security concern at the same time.

How can ethical AI be implemented in practice?

One thing is quite clear: there is no magical formula or algorithm that will solve all the ethical concerns of AI. The creation of such a solution is still far away. I claim this because of the simple fact that 99% of AI models are trained on biased data.

I also claim that 99% of the human agents involved in the training process hold internal biases that they are not necessarily aware of. Biased data equals a biased model, and, quite simply, a biased model cannot in practice be an ethical model.

One data-related solution is the trend of using synthetic data to train models. This type of data is arguably less biased, but it is far from being an ethical solution.

My personal approach, which is part of my experimentation and pilot projects, is a solution that combines an educational framework in the field of AI ethics, the deployment of ethical data pipelines, and the creation of ethical APIs.

What are the main goals and results that you hope the LivAI project will achieve?

The LivAI project is my newest ‘baby’. I am quite happy that the University Jaime I from Spain and their GIANT research group accepted my project, and I am grateful for the co-financing by the Spanish Government and the European Union. I just came back from the kick-off meeting in Spain, and I can say that we have a solid consortium and a very innovative approach.

The overall objective of the project is to contribute to the ethical digital advancement of European communities through the piloting of AI adult educational pathways.

The specific objectives of the project are:

  • To enhance institutional commitment to the implementation of the EU guidelines on ethics in artificial intelligence
  • To create Open Educational Resources (OER) that will provide and validate competencies in the field of data and AI
  • To promote ethical data use in education at all levels

As you can see, the project is part of my three-step strategy for ethical AI (education, data and API), where it falls under the domain of education.

ChatGPT is definitely the most talked about AI product this year – how do you see 2023 in terms of how AI can further develop?

ChatGPT is in the spotlight, but what is more important is that the spotlight is on OpenAI, since ChatGPT is just one model built on the GPT engine, version 3.5. GPT stands for Generative Pre-trained Transformer, and it is classified as a language model. What is less known is that OpenAI applies the same GPT engine to DALL-E, its text-to-image generator, and to other models.

The breakthrough with ChatGPT is just an indicator of how advanced the current state of the art in the field of AI is. If we take into consideration that ChatGPT is an ‘old model’ trained on web data until the end of 2021, and that the same model becomes more and more powerful and advanced with every user, we could conclude that this is just the beginning of an AI mass evolution.

Nowadays we see announcements of other language models, but since I am quite deep into this topic, I can say that there are even more advanced models, as well as all-AI environments where agents are observed and placed in interactions in a totally autonomous mode. In other words, there are all-AI marketplaces and all-AI communities that function through an autonomous exchange of values and services and with a capacity for learning, just as human agents function on a daily basis.

Another important direction is the development of Artificial General Intelligence (AGI), and for me, that is the highlight of AI in the coming years.

https://therecursive.com/author/bojanstojkovski/

Bojan is The Recursive’s Western Balkans Editor, covering tech, innovation, and business for more than a decade. He’s currently exploring blockchain, Industry 4.0, AI, and is always open to covering diverse and exciting topics in the Western Balkans countries. His work has been featured in global media outlets such as Foreign Policy, WSJ, ZDNet, and Balkan Insight.