How to build transparent AI to enable more equitable products by Dipanwita Das

Automatic Summary

Transparent AI And Equitable Products: A Comprehensive Guide

In today's world, artificial intelligence (AI) plays a crucial role in numerous industries, from health to finance. However, one key factor that often gets overlooked is the significance of transparency in AI. In this blog post, we'll look at how transparent AI can enable more equitable products.

Understanding Transparency and Bias in AI

Transparency in the context of AI refers to any effort that lets people inspect and understand how a model arrived at its inference, which is crucial for improving model performance. Explainability, a key aspect of transparency, is the model's capacity to account for its output in human-understandable terms.

Bias, on the other hand, is a set of assumptions baked into the training data that push the model away from the real-world scenario in which it is applied.

Why is Transparency Essential in AI?

Transparency is vital because AI increasingly informs high-stakes decisions, especially in regulated industries like health or finance. A lack of transparency can put health, livelihoods and other critical outcomes at risk. It also makes it impossible to trace mistakes back to their cause and improve.

Retaining The Human Element

Despite the automation that AI brings, involvement of a human expert in the loop is necessary. This practice does not slow the workflow but allows us to attain a high accuracy model and minimize errors, increasing both the scale and speed of operations.
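As a rough illustration of how human involvement can scale with risk without the expert doing all the work, the sketch below routes each model output either to automatic acceptance or to a subject matter expert. The tier names, the `REVIEW_POLICY` table, the `route` function and the 0.9 threshold are all hypothetical choices made for this example; they are not taken from the talk or from any product.

```python
from dataclasses import dataclass

# Hypothetical policy table: how much human involvement each risk tier requires.
REVIEW_POLICY = {
    "high": "human_decides",      # e.g. diagnosis support: the expert always makes the call
    "medium": "human_confirms",   # e.g. investment recommendations: the expert signs off
    "low": "human_on_exception",  # e.g. next best action: the expert sees unusual cases only
}


@dataclass
class Prediction:
    label: str
    confidence: float  # the model's own score in [0, 1]


def route(prediction: Prediction, risk: str, min_confidence: float = 0.9) -> str:
    """Return 'auto_accept' or 'send_to_expert' for one model output."""
    policy = REVIEW_POLICY[risk]
    # Only low-risk, high-confidence outputs skip the expert entirely.
    if policy == "human_on_exception" and prediction.confidence >= min_confidence:
        return "auto_accept"
    return "send_to_expert"


if __name__ == "__main__":
    print(route(Prediction("reorder_stock", 0.95), "low"))
    print(route(Prediction("reorder_stock", 0.60), "low"))
    print(route(Prediction("treatment_a", 0.99), "high"))
```

In this sketch the model still produces every prediction; the expert only adjudicates, which is the distinction between a human in the loop and a human doing the work.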

The Importance of Fixing Data For AI

Cleaning and fixing data is a crucial stage in AI. It's important to train models with domain-specific language and to recognize when data is insufficient for key populations and unmet needs. The responsibility lies in identifying these data gaps and continuously improving the data collection process.
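One simple way to surface a data gap is to compare how often each group appears in a dataset with how often it appears in the population the model will serve. The sketch below is a minimal version of that check. The function name `representation_gaps`, the `group` field, the reference shares and the `tolerance` factor are all assumptions made for illustration.

```python
from collections import Counter


def representation_gaps(records, reference_shares, key="group", tolerance=0.5):
    """Flag groups whose share of the dataset falls well below their reference share.

    records: list of dicts, each carrying a group label under `key`.
    reference_shares: dict mapping group -> expected share (0..1) in the real population.
    tolerance: a group is flagged when observed share < expected share * tolerance.
    """
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if observed < expected * tolerance:
            gaps[group] = (observed, expected)
    return gaps


if __name__ == "__main__":
    # Toy dataset: group B is half of the real population but 10% of the data.
    data = [{"group": "A"}] * 9 + [{"group": "B"}]
    print(representation_gaps(data, {"A": 0.5, "B": 0.5}))
```

A check like this does not fix the data. It only makes the gap visible so that collection can be improved and users can be told where the model's evidence is thin.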

Avoiding Bias and Ensuring Transparency

Having understood the key concepts, here are some tips for avoiding bias and increasing transparency in AI:

  • Break down complex decisions into simpler indicators for easy understanding of the output.
  • Increase redundancy in your AI workflow. Use various models to serve the same decision for a multifaceted view.
  • Present clear information: make sure the output is human-interpretable so that a person can spot and mitigate bias.
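The first two tips can be sketched together: several simple, inspectable indicators each vote on the same decision, every vote is kept for audit, and any disagreement is flagged for a human reviewer. The two keyword indicators and the `decide` function below are toy, hypothetical stand-ins for real models, chosen only to keep the example self-contained.

```python
def mentions_symptom(report: str) -> str:
    """Toy indicator 1: does the report mention a known symptom?"""
    text = report.lower()
    return "adverse_event" if any(w in text for w in ("rash", "nausea")) else "no_event"


def mentions_onset(report: str) -> str:
    """Toy indicator 2: does the report link an effect to taking a product?"""
    return "adverse_event" if "after taking" in report.lower() else "no_event"


def decide(report: str, models: dict) -> dict:
    """Run every model on the same input and keep each vote for inspection."""
    votes = {name: model(report) for name, model in models.items()}
    distinct = set(votes.values())
    agreed = len(distinct) == 1
    return {
        "votes": votes,                                     # per-indicator trail
        "decision": next(iter(distinct)) if agreed else None,
        "needs_review": not agreed,                         # disagreement goes to a person
    }


if __name__ == "__main__":
    models = {"symptom": mentions_symptom, "onset": mentions_onset}
    print(decide("Patient developed a rash after taking the drug.", models))
    print(decide("Patient reported nausea last week.", models))
```

Because each indicator's vote is returned alongside the final decision, a reviewer can see which piece of the ensemble drove the outcome rather than receiving a single opaque score.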

In conclusion, transparency in AI is not just about openness in processes and algorithms, but also about understanding biases and fixing data to ensure representation. The goal is to create equitable, fair, and efficient AI systems that can improve our world.

Video Transcription

Welcome, everyone, to this session on how transparent AI will enable more equitable products. My name is Dipanwita Das, and I am the CEO and co-founder of Sorcero. The agenda for today is as follows. First and foremost, I'm going to start by identifying who I am and what my company does, and, frankly, why we care about transparency and equity in AI beyond the fact that we logically understand it to be important. I will take you through some key definitions of transparency versus explainability and how we define bias, both for the purposes of this session and to give you some context on how Sorcero looks at it.

We'll talk about why transparency is essential, the different risk levels in the application of AI in the world of business, and how to design systems that take into account the risk level for each of these different kinds of applications. We'll discuss how Sorcero retains the human element, but also how retaining the human element, in the form of a subject matter expert or a human in the loop, serves our purpose of making sure that we have transparency and explainability. We'll also discuss a little bit about fixing the data. At the end of the day, what we do in AI is driven by what data we have to work on. We have a commitment to building great algorithms without bias or identifiable bias. At the same time, we have an equal obligation to continuously improve the collection of data and to make sure that the data itself is excellent. And last, as always, I like to end on a positive note: there are lots of ways of addressing bias. So I will take us through how we can avoid bias and increase transparency and, frankly, how Sorcero does it. So, like I said, my name is Dipanwita Das; you can call me D. I've been working at the intersection of analytics, startups, and health data for about the last 12 years. Previously, I built digital scientific platforms for the world's largest public health effort.

Sorcero grew out of the challenges we faced in communicating health science. In particular, I supported teams in about 20 different countries on the largest drivers of lung cancer, of metabolic disease, and for interventional surgery. Despite the challenge of very complex and changing science, we were able to transform, over many years, a billion health outcomes. Today Sorcero serves leading life sciences companies in transforming how they collect and communicate their most critical scientific and clinical product data.

And as we know, we're in the business of making sure that the data that backs the use of a drug or a product to treat patients is explainable, transparent, and auditable, and that it is focused on efficacy and use. So this is really the shift from working in public health to using AI and data science to push for a more equitable use of data and, most important, an effective use of the world's knowledge. So, definitions. It is really, really critical to start here. This is the way we define transparency and explainability, also casually referred to as XAI. Transparency speaks to any efforts that enable people to peek in and inspect how the model made the inference. This enables us to keep improving the performance of the model, because knowing how and why it failed the first time around is really essential to not doing it the second time. Throughout my talk today, I'll be using a bunch of orchestra-related metaphors, so please bear with me as I do that. But let's move on to explainability. A model's ability to explain why it has outputted X when receiving an input A is essential, and it needs to be done in human-understandable terms. And this is the big difference here: explainability is about making sure that the human can understand. It's not a purely mechanical thing.

And explainability also makes it so that the decision of a model, the prediction or suggestion, is fair to the users and the consumers of the result of the model. For the purposes of this talk, I will define bias as a set of assumptions made in the training data that push the model in a direction that is different from the real world in which it is applied. I'll give you an example of how I look at this. Let's say, in the real world, we're applying a data set on social determinants of health. The data set hasn't taken into account a particular demographic or a particular geography, but the model is being applied in a real-world scenario where that geography or population is essential and is included. That's really where bias plays a really dangerous role. It separates what is theoretical from what is real and causes a lot of ramifications in the real world. Let's define AI, and this is really where my orchestra metaphors are going to start. When we talk about AI, we talk about an ensemble of agents that use a variety of linguistic or statistical models to look at a data set or a problem set and derive inferences.

If you really think of an orchestra at this moment in time, you have a viola, you have a violin, you have a piano, all of these different types of instruments that together create that great symphony. Each of these can be looked at as a model that is driving a particular type of inference. The conductor in this scenario is our customer, the subject matter expert who is verifying that the performance of each of these models is as it should be, so that we have a symphony in the end. Last but certainly not least, human in the loop. What is absolutely mission critical for what Sorcero does in life sciences is to have an SME in the loop who is verifying and adjudicating the decisions the AI pushes for before they hit the end customer or the end stakeholder. Our approach is both hybrid and human in the loop, which means that when Sorcero builds a product, it uses both statistical and linguistic approaches to AI and always retains a human in the loop. Next, let's try to understand where transparency is essential and why.

I found this really great image last night on a bit of a Maslow's hierarchy of risk, let's put it that way. When a decision really matters, it is almost always in a regulated industry. Examples could be diagnosis or treatment recommendations; pharmacovigilance, which is actively tracking product performance in market, particularly looking for any deviation from safety or efficacy that could be dangerous for the patient population; and clinical trial designs and the populations that are being targeted or addressed through those clinical trials. The same is true in the financial sphere, where you can have automatic profiling, credit scoring and underwriting, and automated trading and decisions. Often these impact a human being in a very, very major way.

But when we do not have transparency, health and livelihoods can be at stake. These outcomes are often unfair. And because the bias is so deeply embedded, even the company that is using these models isn't always in a place to deal with an appeal from an end user. And last, and in my opinion possibly most damaging, you cannot trace mistakes to improve. This is really important for us to understand. When a human being is making a decision and that decision is wrong, the end user has the room to inform, to give feedback, and that human being will maybe not make the mistake the next time around. With black box applications and a lack of transparency and explainability, it is impossible for us to know why and where a mistake was made, and thus we cannot fix it. So with that in mind, this is how Sorcero thinks about retaining the human element straight out of the gate. Before I talk about the slide itself, let me address a common misconception. Having a human expert in the loop is not equivalent to a human being actually doing the work. And that is a very, very important differentiation.

This human-in-the-loop structure and solution architecture does not in any way slow anything down. What it does is allow us to take a very high accuracy model and ensure that we're minimizing the error bar, we're retaining scale, and we're shifting the focus from purely scoring to really enhancing our customers' workflow. Casually, we like to say that we give our customers superpowers, because we enable them to work at greater scale and greater speed with increased accuracy. Again, I found a great image from Gartner here, which lays out the different levels of human involvement required on the basis of that risk, from low to high. Let's start on the left with decision support for medical diagnosis, an extremely high risk category. Machines here can provide visualizations, explorations, and alerts. But at the end of the day, the decision must be made by a trained physician, taking into account not only their understanding of medicine but also the ethics involved in it.

Financial investments are a bit of a 50/50. AI can certainly be used to generate recommendations and maybe even a certain degree of analysis. But at the end of the day, you can have a real ebb and flow: maybe the machine makes the first suggestion and the human being signs off on it, or a human being starts first and then the machine comes second. So again, this beautiful flow and exchange between machine and human makes it possible for us to realize real business benefits from a transparent, human-in-the-loop AI. And last is a lower risk category of decision making, which is automating something like a next best action, where autonomous decisions are made by machines for forecasts, et cetera, and risks are managed by a human in the loop for exceptional cases. So, particularly in a field like life sciences, in which patient health outcomes are impacted, it is absolutely essential to have the human in the loop be the final adjudicator of the decision. We talked about fixing the data. We are only as good as our data. So in a highly technical industry like life sciences, it is important to train models with domain-specific language and to really appreciate that English, for example, is not the language of life sciences. English is also not the language of financial services.

English is also not the language of manufacturing. So being able to differentiate between a language like English or French or Spanish and that of the domain is extremely important. But it is equally important for us to identify when there is insufficient data for key populations and for unmet needs. You can see in the diagram that the representation of diversity in genomic data is horrendous. When this core data was collected, different populations were not adequately or proportionately represented. But that raw data set is now informing what we do in the world of AI, which means that what we do is inherently flawed. So our approach to this has been to focus on unmet needs and also to help our customers understand where there is a paucity of data. Continuing my theme of music, we take a real ensemble approach to avoiding this bias and increasing transparency. As I begin to wrap up my little talk, I want to leave you with some tips on how to avoid bias and increase transparency in the world that we work in. First and foremost, break down complex decisions into simple indicators. It is easier to understand the final output if we know what steps were involved in getting to that output.

So if I continue on the music metaphor, we are here building an ensemble of musicians where we can understand how the final output is coming to be, because we understand how each of the pieces works and its role in that symphony. Second, increase redundancy in your AI workflow.

One of the things I have learned from working in this sphere for the last several years is the importance of workflow, and how workflows allow us to keep a better handle on transparency and auditability, and also to use fail-safe models, where multiple models and AI techniques are used to serve the same decision. So you're getting a multifaceted view and you're able, again, to make adjustments as required. And last, probably one of my favorites and somewhat counterintuitive, is presenting unequivocal and clear information to the end user. In this case, a dry score which says 0.871 accuracy is hard to understand. But when you have a heat map like this diagram shows, or a document with coloring on all the elements that led to a conclusion, that is human-interpretable and therefore, again, allows a human being to mitigate for bias.

Again, thank you very much for listening to me today. I'm going to stop sharing my screen and engage in some questions. OK. So I am going to address the question from Kathleen Nelson: can you provide additional examples of bias? The example with language is interesting. One of my favorites is actually, again, when you talk about healthcare. Typically we think about protected classifications. We think about age, we think about demographics, we think about income levels and national origin and race.

These are all things that we have seen negatively impact us in the world, where they have been used to deny people access to free and fair services, as a case in point. At the same time, when it comes to regulating AI in the healthcare and life sciences space, considering them is actually absolutely essential. So being able to represent, in data, an ontology of bias, a whole set of relationships that are biased and bad in certain spheres like finance and financial services but that are essential to consider in life sciences or medicine, is a really great example of where language is really important. I'm happy to answer more questions, and I appreciate your very, very nice comments. So I'll hold on for another minute. Any more questions, tips? I would love some feedback as well. I'm glad you found this interesting, Crystal. OK. So in my last minute, a quick wrap-up. Context is everything: where we are looking for transparency, context is absolutely essential. Number two, transparency and explainability. Oh, what motivated me to get into this field?

I like working in spaces that drive our world. Medicine, life sciences, and public health touch every one of us: myself, those I love, and everyone around me. AI and cognitive tooling are some of the most promising tech we have seen that can help a human being in a very, very data-rich field. Health care data is increasing, I think, at a 48% CAGR, which means we have a ton of it. Technically, we should be able to make better decisions about human health. But whether or not we can is driven by our ability to use that data effectively. So that really, really drew me to this. Also, I had a panoply of personal experiences, particularly last year, when both my partner and I became caregivers to our parents and got a front-row seat to what it is to treat complex diseases, which is also a critical area of Sorcero's work and why it is so important to build effective AI that is explainable and transparent.

Thank you very much for your time today. This is a big field. You're right, it is the tip of the iceberg, and I hope that you will continue to engage with it and really, really interrogate the assumptions we make about data and AI as we work in this space. Thank you, everyone.