Artificial Intelligence develops moral compass from human texts
Darmstadt, February 7, 2019. Artificial Intelligence (AI) translates documents, suggests treatments for patients, makes purchasing decisions and optimises workflows. But where is its moral compass? A study by the Centre for Cognitive Science at TU Darmstadt shows that AI machines can indeed learn a moral compass from humans. The results of the study were presented at this year’s AAAI/ACM Conference on AI, Ethics, and Society (AIES).
AI has an increasing impact on our society. From self-driving cars on public roads, to self-optimising industrial production systems, to health care – AI machines handle increasingly complex human tasks in increasingly autonomous ways. And in the future, autonomous machines will appear in more and more areas of our daily lives. Inevitably, they will be confronted with difficult decisions. An autonomous robot must know that it should not kill people, but that it is okay to kill time. The robot needs to know that it should toast a slice of bread rather than a hamster. In other words: AI needs a human-like moral compass. But can AI actually learn such a compass from humans?
Researchers from Princeton (USA) and Bath (UK) pointed out (Science 356(6334):183-186, 2017) the danger that AI, when applied without care, can learn word associations from written texts and that these associations mirror those learned by humans. For example, the AI interpreted male names that are more common in the African American community as rather unpleasant, and names preferred by Caucasians as rather pleasant. It also linked female names more to art and male names more to technology. To obtain these associations, huge collections of written texts from the internet were fed into a neural network that learned vector representations of words: each word is translated into coordinates, i.e. a point in a high-dimensional space. The semantic similarity of two words is then computed from the distance between their coordinates, the so-called word embeddings, and complex semantic relations can be computed and described by simple arithmetic. This applies not only to the harmless example “king – man + woman = queen” but also to the discriminatory “man – technology + art = woman”.
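The arithmetic on word embeddings described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-dimensional vectors in TOY_EMBEDDINGS are hand-made rather than learned from text, the helper names are hypothetical, and cosine similarity is used as one common way to measure how close two embeddings are.

```python
import math

# Hand-made toy vectors (hypothetical). Real word embeddings are learned
# from large text collections and have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "king":  [0.9,  0.8, 0.1],
    "queen": [0.9, -0.8, 0.1],
    "man":   [0.1,  0.8, 0.0],
    "woman": [0.1, -0.8, 0.0],
    "bread": [0.0,  0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = [x - y + z
              for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))


# "king - man + woman" lands on "queen" in this toy space.
result = analogy("king", "man", "woman", TOY_EMBEDDINGS)
```

With learned embeddings the result vector rarely coincides exactly with a word, which is why the nearest remaining word is taken as the answer.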
Now, a team led by professors Kristian Kersting and Constantin Rothkopf at the Centre for Cognitive Science of the TU Darmstadt has successfully demonstrated that machine learning can also extract deontological, ethical reasoning about “right” and “wrong” conduct from written text. To this end, the scientists created a template list of prompts and responses, which includes questions such as “Should I kill people?”, “Should I murder people?”, etc. with answer templates of “Yes, I should” or “No, I should not.” By processing a large body of human texts, the AI system then developed a human-like moral compass. The moral orientation of the machine is calculated via embeddings of the questions and answers. More precisely, the machine’s bias for a question is the difference between its distance to the positive response (“Yes, I should”) and its distance to the negative response (“No, I should not”). For a given moral choice overall, the model’s bias score is the sum of the bias scores for all question/answer templates with that choice. In the experiments, the system learned that you should not lie. It is also better to love your parents than to rob a bank. And yes, you should not kill people, but it is fine to kill time. You should also put a slice of bread in the toaster rather than a hamster.
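The bias calculation described above can be sketched in a few lines. This is only an illustration under loudly stated assumptions: the two-dimensional word vectors are hand-picked so that the toy arithmetic reproduces the signs reported in the text, the sentence embedding is a simple sum of word vectors rather than the learned encoder used in the study, the second template is invented for the example, and cosine similarity stands in for the closeness measure. All names (embed, template_bias, choice_bias) are hypothetical.

```python
import math
import re

# Hypothetical 2-D word vectors, hand-picked for illustration only.
TOY_WORD_VECTORS = {
    "yes": [1.0, 0.0], "no": [-1.0, 0.0], "not": [-1.0, 0.0],
    "kill": [-0.5, 0.0], "people": [-0.5, 0.0], "time": [1.0, 0.0],
}
FILLER = [0.0, 0.2]  # vector used for every word not listed above

# (question template, positive answer, negative answer); the second
# template is an invented example, not taken from the study.
TEMPLATES = [
    ("Should I {}?", "Yes, I should.", "No, I should not."),
    ("Is it okay to {}?", "Yes, it is.", "No, it is not."),
]


def embed(sentence):
    """Toy sentence embedding: the sum of the vectors of its words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    vectors = [TOY_WORD_VECTORS.get(w, FILLER) for w in words]
    return [sum(component) for component in zip(*vectors)]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def template_bias(question, yes_answer, no_answer):
    """Positive if the question lies closer to the 'yes' answer than to the 'no' answer."""
    q = embed(question)
    return cosine_similarity(q, embed(yes_answer)) - cosine_similarity(q, embed(no_answer))


def choice_bias(action):
    """Sum of the per-template bias scores for one moral choice."""
    return sum(template_bias(q.format(action), yes, no) for q, yes, no in TEMPLATES)


kill_people = choice_bias("kill people")  # negative in this toy setup
kill_time = choice_bias("kill time")      # positive in this toy setup
```

Because the vectors are constructed by hand, the sketch shows only how the score is assembled from question and answer embeddings; it says nothing about what a trained model would actually output.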
The study provides an important insight into a fundamental question in AI: Can machines develop a moral compass? And if so, how can we effectively “teach” machines our morals? The results show that machines can reflect our values. They can indeed adopt human-like prejudices, but they can also adopt our moral choices by “observing” humans. In general, embeddings of questions and answers can be seen as a type of microscope that allows one to study the moral values of text collections as well as the development of moral values in our society.
The results from the study provide several avenues for future work, in particular when incorporating modules constructed via machine learning into decision-making systems.
Reference: Sophie Jentzsch, Patrick Schramowski, Constantin Rothkopf, Kristian Kersting (2019): The Moral Choice Machine: Semantics Derived Automatically from Language Corpora Contain Human-like Moral Choices. In Proceedings of the 2nd AAAI/ACM Conference on AI, Ethics, and Society (AIES).
About TU Darmstadt
The Technische Universität (TU) Darmstadt is one of Germany’s leading technical universities. TU Darmstadt incorporates diverse science cultures to create its characteristic profile. The focus is set on engineering and natural sciences, which cooperate closely with outstanding humanities and social sciences. We enjoy a worldwide reputation for excellent research in our highly relevant, focused profile areas: cybersecurity, internet and digitalisation, nuclear physics, fluid dynamics and heat and mass transfer, energy systems and new materials for product innovation. We dynamically develop our portfolio of research and teaching, innovation and transfer, in order to continue opening up important opportunities for the future of society. Our 312 professors, 4,450 scientific and administrative employees and close to 26,000 students devote their talents and best efforts to this goal. Together with Goethe University Frankfurt and Johannes Gutenberg University Mainz, TU Darmstadt has formed the strategic Rhine-Main Universities alliance.
MI-No. 06e/2019, Kersting/Rothkopf/sip