AI biases, “hallucinations” and the larger implications

This is the second installment in the series exploring the impacts of AI-generative text on the Pagan community, and beyond. 

TWH – There is no shortage of serious issues to unpack when it comes to AI-generative applications: incorrect information such as “hallucinations,” which can even affect the accuracy of web searches; questions about how the information used to train LLMs (large language models) was sourced; and the conflicting and potentially biased answers that AI chatbots offer up.

While AI applications like ChatGPT (Chat Generative Pre-trained Transformer) stand to affect a variety of industries and the people who work within them, the question of bias may be one of the larger concerns.

Damien Patrick Williams, an assistant professor of philosophy and data science at the University of North Carolina at Charlotte, wrote a compelling article, published earlier this year in American Scientist, that examines the issue of baked-in bias in AI-generative systems.

Williams notes the difference in how the word bias is often applied and interpreted, “For some people, the word bias is synonymous with prejudice, a bigoted and closed-minded way of thinking that precludes new understanding. But bias also implies a set of fundamental values and expectations. For an AI system, bias may be a set of rules that allows a system or agent to achieve a biased goal.”

Defining bias is as important as understanding how it figures into the programming that powers entire AI-generative systems and how their algorithms shape the results those systems produce.

[Image credit: David S. Soriano – Own work, CC BY-SA 4.0]

Explaining how these applications work, Williams states, “It is much easier to see through the mystique of ChatGPT and other AI applications once you understand exactly what they are and what they do. The truth about such algorithms is that they’re literally just sets of instructions. You have a set of standardized operations within which particular weights and measures can be adjusted. In so doing, you have to adjust every element of the whole to make sure the final product still turns out the right way.

“Algorithms are often sold as magical, but they are neither unexplainable nor even terribly unfamiliar. The recipe for any food—just as for anything you have to make—is an algorithm, too. My favorite algorithm is pumpkin pie. If you go to make a pumpkin pie, you might decide you’d like less butter, more sugar, or more milk. But you can’t adjust the proportion of the pie’s ingredients without considering the rest, or you’ll end up with a crumbly or spongy mess; it won’t really be a good pie. You must adjust the whole recipe, the whole algorithm.”
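
Williams’s recipe analogy maps neatly onto code. The short Python sketch below (the ingredient names and quantities are invented for illustration, not drawn from Williams’s article) treats a recipe as a set of adjustable weights and shows that changing one value means rebalancing the whole.

```python
# A minimal sketch of the "recipe as algorithm" analogy. The ingredients and
# amounts here are illustrative placeholders, not a real pie recipe.
PIE_RECIPE = {
    "pumpkin": 425,  # grams
    "sugar": 150,
    "butter": 60,
    "milk": 240,
}

def adjust_ingredient(recipe, ingredient, new_amount):
    """Change one ingredient, then rescale the rest so the overall
    proportions (the 'whole algorithm') stay in balance."""
    scale = new_amount / recipe[ingredient]
    return {name: round(amount * scale, 1) for name, amount in recipe.items()}

# Asking for more sugar quietly changes every other quantity too:
# no single weight can be tweaked in isolation.
print(adjust_ingredient(PIE_RECIPE, "sugar", 200))
```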

While all of that is pretty straightforward, the history of programming and how algorithms were created and by whom shines a glaringly bright light on inherent biases. This is not a new problem but seems to be one that developers and the companies that employ them struggle to address in any meaningful way.

Earlier programs that employed basic language models, like Global Vectors for Word Representation (GloVe) and Word2Vec, showed considerable prejudicial bias due to the text databases used to train them, and they were also limited in their ability to map correct associations between words across large segments of text. Those limitations would ultimately lead to the transformer framework behind today’s LLMs, which are trained on trillions of words – the foundation of what we know as ChatGPT.
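
The bias researchers documented in GloVe and Word2Vec is usually measured by comparing distances between word vectors. As a rough illustration, the sketch below loads one of gensim’s published pretrained GloVe models and checks how close a few occupation words sit to gendered pronouns; the word choices are ours and the exact scores will vary by model, so treat it as a probe of the concept rather than a benchmark.

```python
# A hedged sketch of probing a pretrained word-embedding model for gendered
# associations. Requires the gensim package; the model downloads on first use.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # small pretrained GloVe vectors

# If occupation words sit consistently closer to one pronoun than the other,
# the embedding has absorbed that association from its training corpus.
for occupation in ["nurse", "engineer", "homemaker", "programmer"]:
    print(
        occupation,
        "she:", round(float(vectors.similarity(occupation, "she")), 3),
        "he:", round(float(vectors.similarity(occupation, "he")), 3),
    )
```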

Unfortunately, the same prejudicial biases found within GloVe and Word2Vec are not only present in applications like ChatGPT but are reproduced more quickly and at a much larger scale.

Some examples that Williams gives are those found within facial recognition programs or other systems that lack a diversity of data: “Prejudicial bias not only informs the input and output of these systems but the very structures on which they are built. If Google image recognition is trained on more examples of cats than Black people; or if the testing group for a digital camera’s blink detection includes no people of Asian descent; or if the very basis of photographic technology doesn’t see dark skin very well, how can you possibly be surprised at the biased results?”
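
Williams’s image-recognition examples come down to what the training data contains: a system learns least about the people it sees least. The small simulation below uses synthetic data and scikit-learn (the groups, features, and numbers are all invented for illustration, and this is not a real face-recognition pipeline) to show how a classifier trained on an unbalanced dataset can score well on the dominant group and poorly on the underrepresented one.

```python
# A toy demonstration of unbalanced training data producing unequal accuracy.
# Everything here is synthetic; it only illustrates the general mechanism.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(center, n):
    """Two-feature samples for one group; the 'correct' label rule differs by group."""
    X = rng.normal(center, 1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * center).astype(int)
    return X, y

# Group A dominates the training set; group B is barely represented.
Xa, ya = make_group(0.0, 2000)
Xb, yb = make_group(3.0, 50)
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Evaluate on fresh, equally sized test sets for each group.
Xa_test, ya_test = make_group(0.0, 1000)
Xb_test, yb_test = make_group(3.0, 1000)
print("accuracy on well-represented group A:", round(model.score(Xa_test, ya_test), 3))
print("accuracy on underrepresented group B:", round(model.score(Xb_test, yb_test), 3))
```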

It is easy to see how this might translate to law enforcement, healthcare, and beyond when it comes to the impact these systems can have—often the opposite of the originally intended purpose. Williams rightly points out, “Self-evaluating for bias, including implicit bias, is not something that most humans do well. Learning how to design, build, and train an algorithmic system to do it automatically is by no means a small task. Before that work can begin, the builders will also have to confront the fact that even after mitigation, some form of bias will always be present.”

Now imagine how those biases might play out when it comes to Pagan publishing. Michelle Belanger told TWH, “This is a problem for all material produced by AI, but I think it’s particularly antithetical to Pagan works, given our collective aspirations of inclusivity and pluralism. In light of Dr. Williams’ studies about how AI inevitably thinks, a Pagan book generated by AI is almost certainly going to reinforce a white, colonial, binary, and Christianized European take on pretty much everything.” Belanger concluded, “That’s not a Pagan book I’d want to read.”

Ivo Dominguez, Jr. echoed similar concerns, “The other problem for Pagan community with AI is the glut of beginner books. If you flood the entry point with mediocre material you also limit more advanced material.”

He continued, “This has a negative impact on those looking at a point of entry into our communities and often offers incorrect or even harmful information. Do we need to feed the flames around inclusion when AI is using the works of marginalized communities?”

Dominguez also pointed out that written works by actual practitioners could easily end up being crowded out by AI works, especially when one author using AI could potentially produce 100 titles or more within just a year.

Then there is the problem of “hallucinations,” in which AI-generative chatbots fabricate sources that never existed, as WIRED reported last month. And the problem does not stop with the fabricated citations themselves.

When Daniel S. Griffin, Ph.D., whose dissertation was on web searches, published the fabricated chatbot responses on his blog earlier this year, they ended up poisoning Bing’s search results with false information. Not only that, but when Griffin repeated his query, the chatbot actually elevated the hallucination above the rest of the search results, putting it on a par with sites that provide reliable facts.

“It gives no indication to the user that several of these results are actually sending you straight to conversations people have with LLMs,” Griffin said.

[WIRED noted after Griffin’s comment: “Although WIRED could initially replicate the troubling Bing result, it now appears to have been resolved. Caitlin Roulston, director of communications at Microsoft, says the company has adjusted Bing and regularly tweaks the search engine to stop it from showing low authority content.”]

The article also points out the clear danger: this type of hallucination could wreak havoc on web searches for accurate information and mislead millions of people in a very short amount of time. There is also the issue of search engines being unable to automatically determine which text is AI-generated.

[Image credit: Madhav-Malhotra-003 – Own work, CC0, https://commons.wikimedia.org/w/index.php?curid=127185596]

And then there are just the instances of generative AI producing wildly incorrect information that could have devastating effects.

Belanger offered the following assessment with examples, “Bluntly put, AI gets a lot of things wrong — sometimes dangerously so. For example: a New Zealand-based grocery chain offers an AI program to help customers come up with money-saving recipes. This included, in one recent instance, offering an ‘Aromatic Water Mix’ of bleach, water, and ammonia, chirpily describing this as the ‘perfect non-alcoholic beverage to quench your thirst and refresh your senses,’ in language it was certainly trained to regurgitate by its human programmers. But nothing in the AI’s training gave it the ability to recognize that this mix is deadly to humans. A recent spate of AI-generated books on foraging have proven equally as deadly in their misinformation.”

As horrifying as these examples are, more and far worse disasters are certainly possible if no guardrails are put in place to prevent them.

Then there are the implications for spiritual practices. If AI cannot distinguish between a poisonous gas and a safe, refreshing drink, how might that translate to the myriad of magical and spiritual practices?

Belanger offered, “Now, consider these limitations when applied to an incense blend. An herbal remedy. An invocation to a spirit – or a banishing. Pagan writers already struggle for legitimacy. Most mainstream critics respond as if our beliefs and experiences are made up, if not the product of active delusion. Add AI into this mix, and the misinformation will only proliferate. Ultimately, AI-generated Pagan books will erode trust in our research and our practices.”

While this prediction is certainly plausible, there are other factors to consider which will be examined in the next installment of this series.

