So, what’s AI anyway? Non-expert users in the face of disruptive technology

Raffaella Panizzon
Language Engineer

Video Transcription

Good, perfect. Oh, thank you so much for this. So I think I can start now. 2022 was the magic year for AI, and as 2023 unfolds, we are witnessing even more impressive developments. We've seen unprecedented advances in text-to-image generation, speech synthesis, and drawing-to-animation conversion. And most notably, we've seen the application of large language models to build conversational agents that can answer virtually any question in a matter of seconds.

Now we can even generate moving pictures from a string of text. This is something we could have hardly imagined only a few years ago. So we are now at a unique point in time where AI truly is a "sufficiently advanced technology indistinguishable from magic", to paraphrase Arthur C. Clarke, the writer of 2001: A Space Odyssey. And on top of this, all of these magic tools were made available overnight, for free, to anyone with an internet connection.

Now, the thing is, all of this powerful and disruptive technology never came with an instruction manual for the non-expert user. And while AI practitioners are aware of the most common risks and limitations of large language models, the average user has no idea, and understandably so, because not everyone can be a machine learning expert, after all. Non-expert users just input a prompt on a web page and receive an output, but they're not really aware of what goes on under the hood. They don't even know whether they can trust the information they receive or how that output was generated.
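To give a feel for what "under the hood" actually means, here is a minimal sketch in Python using the Hugging Face transformers library. It shows the core mechanism behind these chatbots: the model assigns a probability to every possible next token, and generation just repeats that step. The small "gpt2" model here is only an illustrative stand-in, not the system behind any commercial service.

    # What a language model actually does: rank candidate next tokens.
    # "gpt2" is an illustrative small model, not any commercial chatbot.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # one score per vocabulary token
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, 5)
    for p, i in zip(top.values, top.indices):
        print(repr(tokenizer.decode(int(i))), float(p))
    # There is no "knowing" here, only a probability distribution over
    # continuations; generation repeats this step token by token.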

We should also remember that people still think of artificial intelligence as something associated with science fiction: robots suddenly becoming sentient and deciding to destroy the world, for example, or computers rebelling against programmers and becoming some sort of über-humans. And this happens because that's what people have been exposed to for a very long time; that's the AI they've had somehow direct contact with. As an AI practitioner myself, I really see this all the time: people talk to a piece of software as if it were a real person. And the term "artificial intelligence" doesn't really help either. It's actually quite misleading, because the concept of intelligence is typically associated with the ability of a sentient being to elaborate thought.

And our reference is, of course, the intelligence that we're most familiar with, which is human intelligence. Intelligence is also what allows us to reason and tell the difference between what is possible and what is impossible. As we see in this sort of meme that I created, the output from ChatGPT makes no sense, and that's really obvious for a human, but not for a machine. As Noam Chomsky put it, true intelligence is also capable of moral thinking: constraining the otherwise limitless creativity of our minds with a set of ethical principles. And so, if we think of intelligence in all of these ways, artificial intelligence may be misconstrued as a machine's ability to think, reason, and be aware of its own thoughts, which is simply not true. This is not what machines can do; this is what humans can do.

But why should we really care so much about non-expert users? At the end of the day, they're not the ones creating this technology; they're on the receiving end of it. Well, the reason is that non-expert users constitute the overwhelming majority of the users of this disruptive technology. Take ChatGPT alone, not even counting all of the other generative AI services out there.

ChatGPT alone has 25 million daily users. Let that sink in: 25 million daily users. It is very unlikely that all of these are machine learning experts, and quite likely that some of them are, for example, children and minors. So this is our current scenario: we have a very large number of non-expert users with access to tools that seemingly work like magic, and some of them tend to humanize machines and think they may become sentient soon and do a lot of damage to the world. Now, do we see a problem here? Well, I do, and the problem I see is that there are a number of societal and ethical risks associated with releasing disruptive technology without properly educating the people who are going to use it. I believe people should be aware of the eight main aspects I'm going to quickly describe now.

Some of you may have heard of the hallucination problem. This happens when the language generated by these systems is so assertive and convincing that we believe everything they claim. We may not realize that sometimes the content produced is actually made up by the machine, or at least that it contains substantial flaws. For example, there have been cases where ChatGPT made up entire scientific papers to support an unfounded claim; it invented papers that did not exist.
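One practical habit this suggests: when a chatbot cites a paper, check that the paper actually exists before trusting it. Here is a rough sketch of such a check against the public Crossref index, in Python with the requests library; the cited title below is, of course, a hypothetical placeholder.

    # Sanity-check a chatbot-provided citation against the Crossref index.
    import requests

    title = "A hypothetical paper title produced by a chatbot"  # placeholder
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": 3},
        timeout=10,
    )
    for item in resp.json()["message"]["items"]:
        print(item.get("title", ["<no title>"])[0], item.get("DOI"))
    # If nothing close to the cited title and authors appears, treat the
    # citation as a likely hallucination.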

A related issue is the humanization of machines. If we combine the two, we not only completely trust the output of the machine, we also think we're talking to a human. There are cases where people used the AI as some sort of therapist or confidant and trusted the output so much that, when the machine suggested suicide, they went through with it. Now, these are clearly extreme cases, but we should acknowledge that this may happen and has happened, and when users are very young and vulnerable, this is especially risky. This is also the reason why I mentioned that sort of magic effect at the beginning of my talk; this is what I was actually thinking about.

A further consequence of hallucinations and humanization is the redistribution of agency. We need to keep in mind that the responsibility for the output of AI lies not with the software, but with the people who train it, fine-tune it, and decide to release it to the public.

Placing accountability in the hands of the companies that create this technology is extremely important, because it reminds us that these are just tools, nothing more, nothing less. Accountability is always on humans. Another issue is that there are a number of ethical aspects that are not talked about enough. One example is worker exploitation. You see, the output of these systems sometimes requires human intervention to edit responses that are considered unacceptable or inappropriate for a number of reasons. For example, people in Kenya were paid as little as $2 an hour to edit machine output, and they were exposed to sometimes truly disturbing content that had a huge impact on their mental health. And all of this is done to improve services that enrich a small minority of mostly white men who are typically already rich. I think it is really important for people to understand this, and to remember it when they decide to use services like these. Another ethical issue lies in the selection of training data.

The data for large language models were gathered by basically scraping the internet. The issue is that the internet does not represent all languages, cultures, and perspectives equally. It is actually a very imbalanced corpus, filled with misogyny, racism, white supremacism, et cetera.

Now, training a model on this data means that its responses are bound to mimic the input. So the output risks putting even more unwanted content online, perpetuating the stereotypes and worldviews of only part of the population, and likely reproducing systems of oppression that already exist. The same goes for images. For example, stereotypes about women are perpetuated by showing them in overly sexualized depictions, or when they are excluded from some professions but overrepresented in others. If you ask DALL·E for the image of a nurse, it's likely to produce the image of a woman, but if you ask for the image of a CEO, it's likely to produce the image of a man.
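This kind of skew is easy to observe directly. As a minimal sketch, assuming Python with the Hugging Face transformers library and the "bert-base-uncased" fill-mask model (an illustrative choice; results differ across models), we can compare which pronouns a model prefers for different professions:

    # Probe occupational stereotypes absorbed from web-scraped training
    # data; "bert-base-uncased" is an illustrative model choice.
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")
    for job in ("nurse", "CEO"):
        preds = fill(f"The {job} said that [MASK] would be late.")
        top = [(p["token_str"], round(p["score"], 3)) for p in preds[:3]]
        print(job, top)  # compare how strongly "she" vs "he" is preferred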

Connected to this is the issue of copyright violation. As I said, content was simply taken from the internet, and no compensation was given to those who created it. And yet these companies are generating huge profits from materials they obtained free of charge. This too is a serious ethical issue, I believe. The right to privacy is another issue. As you know, you need to create an account in order to use these services, and your prompts are recorded and used to further train the model. This is all done without our explicit consent, I believe; Italy is the only exception in this. And we don't really know exactly how these data are used, who can access them, and in what capacity. Then there are potential malevolent uses of AI, just as there are for any tool. For example, fake news can be easily generated, multiplied, and spread, and this happens much more quickly and in much more convincing ways, because it can be posted on multiple platforms at the same time by what are called troll farms.

And if people and the press don't do accurate fact-checking, it can become increasingly easy for malevolent agents to influence public opinion and even affect the results of elections. It will also become easier to impersonate agencies and other institutions and scam citizens, because the same language used by those institutions can be reproduced, and people could have a hard time telling the difference. Finally, training large language models has a considerable environmental impact that is not discussed very often. It has been calculated that training one large language model produces as many CO2 emissions as a trans-American flight. And we should remember that a model will need to be retrained periodically, because new information is put out there all the time; so it's not just one training session, it's a number of them. We should also remember that different institutions are training different models, so we have a lot of people training a lot of models at the same time, and this has an impact on the environment.
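To give an order-of-magnitude feel, here is a back-of-envelope sketch of the emissions from a single training run, in Python; every number in it is an assumption for illustration, not a measurement of any particular model.

    # Rough CO2 estimate for one training run; all figures are assumed.
    gpu_count = 64          # assumed number of GPUs
    kw_per_gpu = 0.3        # assumed average power draw per GPU (kW)
    hours = 96              # assumed wall-clock training time
    pue = 1.5               # assumed datacenter power usage effectiveness
    kg_co2_per_kwh = 0.4    # assumed grid carbon intensity

    energy_kwh = gpu_count * kw_per_gpu * hours * pue
    co2_kg = energy_kwh * kg_co2_per_kwh
    print(f"{energy_kwh:.0f} kWh -> {co2_kg:.0f} kg CO2e")
    # ~2,765 kWh -> ~1,106 kg CO2e: the order of magnitude of one
    # passenger's trans-American flight, and retraining repeats it.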

So how do we tackle all this? Well, I believe this needs to be a collective effort: from AI practitioners, from the press, from institutions, from anyone who has a voice. We all have a responsibility to dispel false myths about AI and to raise people's awareness of its potential as well as of its risks, so that people can make an informed and conscious use of it. I am not in favor of banning this technology, but I do think it has become clear that it needs to be regulated and popularized, to protect users' privacy, creators' copyright, and everyone from improper usage. And there are already endeavors in this sense. For example, UNESCO recently published a quick guide on how to use ChatGPT, based on three main questions. One: do you care whether the output is true or not? Two: do you have expertise in the field you're asking about? And three: can you take full responsibility for what you're using? I think these are three really important questions that can guide people in a very simple way; it is a very effective flow chart.
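Since the guide is essentially a flow chart, its logic fits in a few lines of Python. The question wording below is my paraphrase of the guide, not UNESCO's exact text.

    # A sketch of the UNESCO-style decision flow; wording is paraphrased.
    def should_use_chatgpt(matters_if_true: bool,
                           can_verify_with_expertise: bool,
                           will_take_responsibility: bool) -> str:
        if not matters_if_true:
            return "Safe to use: accuracy is not critical here."
        if not can_verify_with_expertise:
            return "Unsafe: you cannot check the output yourself."
        if not will_take_responsibility:
            return "Unsafe: don't rely on output you can't stand behind."
        return "Possible to use, but verify every claim before relying on it."

    print(should_use_chatgpt(True, True, False))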

In conclusion, regardless of whether you are an AI expert or not, we all play a role in shaping the future of technology. Our human intelligence and our ability to tell right from wrong are the best tools we have to make the most of artificial intelligence. Thank you. Oh, I see questions in the chat. Oh, there's no time for questions. Oh, thank you. I'm sorry if I went over time.