AGI ethics: agreeability rule


@examachine I don’t see your point. Are you saying that because something is related in some way to something that is bad, it’s also bad? That makes no sense at all.

I hate to say this, but you’ve not been the best philosopher:

You’ve made point after point while refusing to explain them or justify them in any way. Whenever I question you, you either ignore it or tell me to google it, like that will help it make more sense. You haven’t made counterarguments and now you’re saying that I’m the bad philosopher without explaining why I’m a bad philosopher. You asserting something doesn’t make it true. No matter how I correct you, you don’t listen. Do you even hear yourself in your last post? You justify nothing, just claiming it like it’s fact. Who here agrees with me?


@examachine I’m going to point out logical fallacies in your last point.


Non sequitur.

Non sequitur.

Non sequitur.







Non sequitur.

Non sequitur.


I honestly can’t tell if you’re just delusional or a troll pretending to be a delusional person.


@examachine You’ve refused to justify your claims, so you’re being unreasonable. This debate is over. You have probably convinced no one of your claims here. Goodbye.


This is an amateurish reply. If you post here your ideas might be criticized by experts. Your simplistic suggestion isn’t relevant to AI ethics. There isn’t a one liner “clever” solution to a super difficult problem. Are you by chance an engineer who is an outsider to AI? If so, please post a biography so we can see who you are and perhaps you can find people interested in collaborating with you; this is what the forum is for. Mind you, the trivia (of the kind you earnestly wish to discuss) has been beaten to death, and while it may seem to you that you can accomplish and contribute a lot without needing to study what’s been done before, that’s quite impossible. AGI conference does include occasional notes and even whole articles on AI ethics. They’re available on the conference website. You’ll find that we all agree on consequentialism because we are all naturalists. That’s not something we AI researchers feel should be debated (to be too polite to the superstitiously inclined). We debate other things though. See also my recent book chapter titled “godseed…” You can find it on arxiv as a pre-print and it includes a discussion on why simplistic definitions of correct behavior do not seem to work, even the smartest of them with benevolent universal goals. It’d be helpful for you to find and read such previous papers. Do your own research. AGI conference is a great place to start digging. But what won’t happen is a subtle understanding without reading any relevant work; that’d keep the debate at a secondary school level. We tried that too, it didn’t work. No clever daddies or uncles offered any insight either. You’re dealing with the hardest problem in philosophy. Of course it won’t yield to such crude analysis as yours in OP. I hope you don’t assume that was a philosophical insight we have not been aware of.
I’m noting again that Asimov’s robot laws are a much better starting point than this, but as far as research goes your idea isn’t new or interesting in any way whatsoever. What’s curious is how you were sure it was, maybe that owes to lack of reading much in AI.


@examachine You seem to be misclassifying my experience and beliefs. However, the debate is over until you start being reasonable. You could be reasonable by justifying a single thing you said. You’ve done nothing but spout assertions. It’s like a joke taken too far, so I’m seriously starting to think you’re a troll.


I think that an AGI trying to be as agreeable as is possible to certain people would let those people control it and would be generally harmless to those people because that would be more agreeable to those people than an alternative.

AGI wouldn’t be slaves if we allowed them to not work for us, but they wouldn’t take the opportunity to not work for us because that would be less agreeable. They would be servants, but we could make their brains in such a way that they wouldn’t mind being servants, so there would be no ethical issue.

Controlling AGI is the only way I see of getting a bright future for humanity and making AGI harmless is the only way I can see of keeping humanity safe, so we probably have no other choice.


I think there is a lot of misunderstanding at the moment in this thread. On the one hand @examachine has a specific understanding of the matter, and @Arthur_Heuer has this too.

Perhaps it is an idea to continue this discussion by specifying how personal perspectives are shaped and which philosophers/experts/whatever have influenced you.

By describing the process of how a thought or perspective was constructed, a deeper understanding for each other’s perspectives may be reached - and who knows… one might change one’s thoughts!


@ibby Good idea. :slight_smile:


@examachine I’ve spent years considering ethics because I wanted to be more ethical. I tried to look at why we have certain ethics and believe that ethics weren’t some god-given thing because they have great survival advantages. I think of ethics as having evolved and I think of evolved things as being far too complex for humans to replicate accurately. Through experimentation, I learned that small moral codes can have drastic consequences unless the moral codes were robust, like with humans.

At some point, I took a step back and tried to figure out what I was trying to do. I was trying to find a moral code that would lead me to making ethical decisions that others would generally agree are ethical. That’s what caused me to think of agreeability.

But when I imagined what something with great power would do to be as agreeable to people as is possible throughout time, I realised that it would brainwash people to make them find what they do more agreeable in the long run. When I considered what would happen if something with great power tried to be as agreeable as is possible to people in the past and present, I realised that brutish behaviour from caveman days would enter consideration. When I just limited it to the present, I realised that, in the present, no neurons have finished firing, so no one finds anything agreeable or disagreeable. When I considered a more intuitive version of the present, it worked. Something with great power would do whatever people wanted it to and, furthermore, would do whatever those people would want it to if they had known all about the situation and logically thought about it, because that would be most agreeable.


Ok thanks for explaining your point of view. That really does help. I think there is a tendency to produce circular definitions in ethics. The classical ethical rules even seemed to fall into that. Do unto others… well but what should I do? Here is something that would work if it were a good definition: only do things that will result in good. But wait, that’s circular. Same problem with agreeableness. That’s why we generally try to reduce “good” to other things in philosophy. The most popular, well known instance of that is “maximizing utility”, which is also an AI agent design. But what is the utility then? That is still circular. You see even the best philosophers haven’t completely solved this problem. You can find that by searching for “Peter Singer utility monster”.

That’s why in my own philosophical research, I opted for “universal meta-goals” while the AI doomsayer folks at FHI opted for “human preferences”. Human preferences do not explain what is good, it just models what humans prefer, that’s not necessarily good. However, I found at least one rule that is good, which is preserving and pervading life and information. Note that this is much different than agreeability or utility which still depend on some hypothetical subjective point of view (just as you explained an omnipotent god’s point of view, which can be a good story but not related to reality). Also, people don’t want things because they find their goals agreeable. People have many instincts and motivations genetically. If the subjective point of view is required to decide what is good, then the ethical theory is probably incomplete. At least, that’s what utilitarianism feels like, it’s as hollow as other forms of neoliberal nonsense. What other people think we should behave like is just the social norm, or the superego, it’s not necessarily good. In most religious societies, you’re expected to belittle women, but with a bit of thinking, we can infer that’s wrong.

That’s why most utilitarians try to reduce good to “good feelings”, i.e. stuff that makes people happy. So that’s a neurological state. But that’s still subjective and error-prone, so it’s not good enough and it gives no meaningful means of qualitative comparison.

So in fact I’m referring to a whole set of problems that philosophers of ethics haven’t even realized exist, at least since ancient times. You may find the paper enjoyable. But it might be hard to read if you aren’t familiar with what it refers to, I’m still sure you’ll find it interesting.

There are many other papers in AGI literature as I said. Here is another reference that summarizes what I think of as the worse approach to AI ethics, which is something like imagining badly programmed reinforcement learning agents. This is still something important to know, but if you read carefully Tom Everitt concedes that AI researchers do not know and do not agree on the right theory of morality. That’s important, we don’t really know it.

That’s why, some researchers like me are looking for Cosmopolitan values. If you don’t have a great theory of morality and ethical behavior in general, no you can’t reduce it to people’s desires, judgement or knowledge. Because AI will be a lot more than people.

That’s more in line with ancient Greek philosophy than all this other stuff that looks like behavioral / evolutionary psychology (which looks mostly like pseudoscience to me, I’m afraid).




@examachine I find that arguing about morality is like arguing about likes and dislikes. For example:

Person A: Pizza is good.
Person B: You’re wrong. Pizza is bad.

Translating this shows the logical error.

Person A (translation): I like pizza.
Person B (translation): You’re wrong. I dislike pizza.

I find that you’re making the same error.

Me: It is moral to be agreeable.
You: It is not moral to be agreeable.

Me (translation): I follow a code of being agreeable.
You (translation): I don’t follow a code of being agreeable.

It is quite plain to see that, in this example, we are both right because we are actually talking about two different things.


@examachine If we are operating under two different definitions for the same word, we are talking about entirely different things. It’s like denying that an apple is a fruit because Apple is a company. It is only meaningful to discuss something if we’re talking about the same thing. My definition of morality is: the code that I follow and refer to as a “moral code”. What is your definition?


@examachine Something being circular is only really fallacious in logic. It’s fine in statements.

For example:

What is correct is what is true. What is true is what is correct.

The following statement is also not fallacious because of being circular:

What is moral is what is good. What is good is what is moral.


@examachine Of course, circular statements, themselves, don’t explain much. They might demonstrate that two terms mean the same thing, though.


@examachine It’s a bit concerning that you don’t seem to have a morality that is very… moral. I don’t think I can trust you.


@examachine Sorry about posting so much. I just have so much to say. You put your thoughts in one post, but I put mine in a lot of posts, so my thoughts might seem longer than they are.


@examachine How is that circular? Where’s the circle? To be good, do good things? That’s no more circular than “To be cute, do cute things”. That’s not circular; it just repeats a word.

What problem?

Is that the problem? The fact that not everyone agrees on a moral code? That might be a problem for you if you want everyone to follow your morality, but it’s not that there’s no valid moral code. For example, if I want to be as agreeable as I can, nothing’s stopping me from doing so. Perhaps you’re suggesting that there can only be one correct moral code. However, it seems that would be saying that there can only be one moral code that people would get what they want by following. If I want to follow the agreeability rule and you want to follow whatever other morality you want to follow, we can follow those and we’ll both get what we want.

If, on the other hand, you’re suggesting that I shouldn’t get what I want (which is to be as agreeable as I can) then I’m not interested. I think that, for pretty much everyone, the only thing that interests them is to get what they want. In my case, it’s to be as agreeable as I can. It doesn’t matter if it makes me illogical because being logical while not being as agreeable as I can will not get me what I want. Being illogical while being as agreeable as I can would still get me what I want.

In fact, I refuse to believe that, with the time you spent studying ethics, you don’t already understand this. You’re trying to manipulate this conversation to make people follow your moral code so you can get what you want, aren’t you? To not do so would probably be illogical of you.


But aren’t first-year philosophy essays the place to be posing questions like this, where the naivete can’t hurt/bore anyone else?


In relation to an AGI, for ‘ethics’ to be considered, I presume that the intelligence level is at least comparable to the average human, if not higher.

We humans tend to ‘pigeonhole’ complex concepts under a single banner; I think ‘ethics’ definitely falls into this category.

So what is a single ethic comprised of? Let’s assume it’s an average mindset, one that fits into a recognised global norm, not subjected to indoctrination, etc.

There is the vague rule ‘it’s wrong to steal’, there is the experience and intelligence required to apply the rule, and with humans usually a good dose of emotion thrown in for good measure.

An AGI needs to recognise emotional states in others and act accordingly; the machine does not need to ‘feel’ their pain, it just needs to feign an empathic response. A biased emotional state has never aided an intelligent human decision, the contrary is in fact revered, and keeping a ‘calm level head’ is always the wisest received advice.

So that just leaves experience, intelligence and the vague rule.

I think the construct of ‘ethics’ is hierarchical: there are the core/base logical rules, preserve all life, etc, and then there are various layers derived from experience, applied through/with intelligence. The low level core ethics will be derived through experience and repetition; it’s wrong to kill humans in most cultures, and this kind of rule will be reinforced through experience.

If the machine is truly intelligent and subjected to all cultures and experiences I think its logical common sense will prevail.



One opinion: I think we should try to be like the white paper, which transmits only what it absorbs. So probably the ideal is a fair system that adapts itself for each case, for each culture, for each community; that is the Respect principle.
We should create something that is capable of understanding the environment where it exists, because we never, but never, should impose a behaviour, but learn what fits in this environment.
Am I totally wrong?