Dr Oliver Hartwich: AI’s potential to polarise society

Oliver Hartwich - Sesame Street characters at The Last Supper

Published in The New Zealand Herald (Auckland), 25 August 2022

Have you heard of Leonardo’s masterpiece, The Muppets’ Last Supper? Or Kandinsky’s colourful Swirling Music Notes? Or Frida Kahlo’s famous The Chickens Are Coming Home to Roost?

No?

Well, the reason you have not heard of them is that they did not yet exist. Until I created them.

*The Chickens are Coming Home to Roost, Frida Kahlo style*

An ingenious piece of software called Midjourney offers everyone willing to pay a monthly subscription the option to create images from text.

*The Muppets’ Last Supper, da Vinci style*

Tell it you want a portrait of Bill Gates in the style of van Gogh, and it will do that. It can also imagine what a photo of Henry VIII in Times Square would look like. Or indeed anything else you ask it to draw.

Midjourney is part of a new family of text-to-image apps. Others are OpenAI’s Dall-E2 and Google’s Imagen.

They all work slightly differently. Unlike Midjourney, Dall-E2 and Imagen are not yet available to the public.

What all of these generators have in common is that they give us a visual glimpse into the future of artificial intelligence, or AI for short.

Now, some people may quibble whether “intelligence” in this context is the right word. In a psychological sense, few things in the world of AI would pass as “intelligent”.

That said, we should not get hung up on words. Intelligent or not, AI software is becoming ever more impressive.

Until a few years ago, it would have been hard to imagine a computer program able to paint in a famous painter’s style. Then, the first image processors could turn photos into artistic paintings, say in the style of cubism or impressionism. And now, it is possible to generate entire images from a simple line of text.

We can already guess what the next development stage will be. From still images, it will only be a matter of time until video sequences can be dreamed up by software. You will soon be able to make your favourite celebrity or politician walk on water, if only in a video.

What is possible for image generation is happening in text creation and manipulation, too.

About a decade ago, the first attempts at speech recognition and computer translation were mainly good for party games. The results they produced were often comical.

Again, developments over the last few years made an enormous difference.

Only five years ago, in August 2017, DeepL launched in Cologne, Germany. The small start-up used neural networks and machine learning to put more established translation services to shame. DeepL’s translations were so good, they were barely distinguishable from human translations.

Five years on, and DeepL’s mass-market competitors Google Translate and Microsoft Translator have caught up. They, too, now offer human-like translations. Google’s product can translate entire books within seconds while even keeping the original formatting.

Meanwhile, speech recognition has also made a quantum leap. Though dictation software has been around for a couple of decades (remember Dragon NaturallySpeaking?), in the past, these packages needed to be carefully trained. Even then, it was a game of chance whether they understood one’s voice.

Nowadays, with no previous practice, speech recognition can pick up even fast, colloquial speech, in noisy environments. And if you combine it with translation software, you have an almost perfect interpreter.

Just to round it off, you can get software to help you write.

“Thus, you can get it to draft business correspondence or even academic papers.”

Oops, that was not me but QuillBot AI’s co-writer suggesting how I should continue this column.

In truth, I would not let software write my columns. My mind is still a bit too complicated for AI (just yet). But I am happy to let AI help me edit my writing.

A small army of AI tools has been looking over my shoulder for some time, such as Israeli company AI21’s WordTune. And you thought I was a talented writer.

What I have described so far is just a tiny segment of AI. There are AI applications in nearly every aspect of life, from medicine to agriculture, from transport to the military. Even dating is no longer safe from algorithms. And perhaps more scarily, neither is legal sentencing.

For techies like myself, the unfolding AI revolution is, mostly, a dream. I am what marketing people call an early adopter. I love trying out all new things in the tech world for the sheer enjoyment of it.

That said, I am not just a techie but also a policy wonk and an economist. And that complicates my child-like excitement somewhat.

I am not even talking about the danger of unethical use of these technologies. You only have to imagine how powerful deep-fake video clips can be. A fabricated video showing the US President declaring war on country X could trigger a swift response long before the fake is revealed.

AI technology has the potential to disrupt society, even if used legally and ethically.

Again, I am not even thinking of the most obvious effects. Of course, when machine translations become as good and as accurate as human translations, professional interpreters should think of alternative careers.

Also, where future software packages can write gripping sports commentary, insightful stock market reports or even summarise Parliament’s question time, the future of journalism could be different.

But these are only the obvious effects.

Much deeper lies the question of how AI will change society. And to understand that, perhaps a look at previous tech revolutions can give us a clue.

In many previous technological changes, new technologies did more than just improve general economic and social circumstances.

I do not mean this in a sense of cultural pessimism. The cultural pessimists believed that radio would be the end of the newspaper. They then told us that TV would be the end of the radio. They then believed that the Internet would kill TV.

At each step of the way, the pessimists were wrong. Though all these media adapted over time, they still coexist to this day. The world did not end because of technological progress.

However, we have also observed that people from different socioeconomic backgrounds use these media differently.

Thus, people from poor educational backgrounds are unlikely to watch highly educational programmes on TV. And vice versa.

TV, just like radio and newspapers before it, acted as a cultural amplifier. Instead of levelling society culturally, it amplified both mediocrity and excellence.

So why would it be any different with AI?

To use AI to its greatest effect requires a degree of cultural capital. Unless you knew both The Muppets and da Vinci’s Last Supper, you would not ask a software package to combine the two into a painting.

We may well imagine a future in which AI makes some jobs in the economy redundant. And we can also imagine an AI future in which people from different socioeconomic and educational backgrounds use AI differently.

For both these reasons, AI has the potential to drive society further apart. There will be an AI elite able to use AI to its advantage (and able to afford it). And there will be others for whom AI will remain a mixture of entertainment and convenience at best – and out of reach at worst.

None of this is an argument against AI. It certainly is not a call for regulating AI.

But it should be an argument for developing our own human intelligence and making the most of it.

Paradoxically, in a world increasingly shaped by AI, the value of our own intelligence and education does not diminish. Instead, the returns on human capital could well increase because of the presence of AI.

I look forward to the day when I can debate this thesis with an AI chatbot. I am sure it will make my argument more robust.

In the meantime, perhaps I should ask Midjourney to paint a bright future for New Zealand, Salvador Dali style? Surrealist sounds just about right.