An Act's Moral Goodness Depends on How It Affects Your Character

Feb 15, 2025

Acts have obvious direct consequences. For example, shouting at someone in public will make the person you’re shouting at upset. But acts also have second-order consequences: by shouting at someone, you contribute to creating the sort of world where people shout at each other. You probably lower social trust; you probably make shouting in public a bit more acceptable. And you become the sort of person who shouts at people.

I think this last type of second-order consequence, where an act contributes very slightly to you becoming a different type of person, is under-appreciated and actually often dominates the moral calculus. And so it’s actually quite useless to ask, for example, is it good to punch someone? If you’re short-tempered and punch on an angry impulse, it’s probably bad. If you’re a coward and the punch is a step in the direction of you finally standing up for yourself, potentially it’s good.

The tweet which inspired this post.

On one hand this is a bit unsatisfying in that it means perfectly evaluating the moral goodness of an act is computationally intractable. But on the other, things do be like that sometimes. Some meandering thoughts related to this idea now follow.

Blanket good-bad judgements are dumb

Is abortion bad? Well, is punching someone bad? Thinking about this in the second-order-impact-on-character way makes me more comfortable answering these questions with “idk, depends, it’s contextual”.

What kind of person does the act make you become?

Ok, so you punched someone. In what way does this change your character? What kind of person does this make you become?

I think this question is slightly doomed, because your brain’s black-box deep-learning machinery will do this unconsciously for you, and it will do it in a way which is probably not very amenable to conscious interpretation. But attempting anyways, I think two things will be taken into account: the hypothesis’s explanatory power and its simplicity.

So if you punched Jack on Tuesday evening, your character changes such that you’re the sort of person who punches Jack on Tuesday evenings. But it only moves in this direction a tiny bit because this hypothesis is so complicated and awkward, even though it explains your behavior quite well. Your character will probably also update somewhat in the direction of you being a generally violent person, since this is a decent explanation and also much simpler. But the exact real updates are some opaque black box neural-net mess, so you probably shouldn’t think about this consciously too much.
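The “explanatory power × simplicity” weighing above is essentially Bayesian inference with a simplicity prior. A toy sketch (all hypothesis names and numbers are invented for illustration, not a claim about how the brain actually scores things):

```python
# Weight each hypothesis about "what kind of person am I?" by how well it
# explains the observation "I punched Jack on Tuesday" (likelihood), times a
# prior that favors simpler hypotheses, then normalize into a posterior.

hypotheses = {
    # name: (simplicity prior, likelihood of the observed punch)
    "punches Jack on Tuesday evenings": (0.01, 0.99),  # explains perfectly, very contrived
    "generally violent person":         (0.20, 0.60),  # decent explanation, much simpler
    "basically peaceful person":        (0.79, 0.02),  # simple, but explains the punch poorly
}

unnormalized = {h: prior * lik for h, (prior, lik) in hypotheses.items()}
total = sum(unnormalized.values())
posterior = {h: w / total for h, w in unnormalized.items()}

for h, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{p:.3f}  {h}")
```

With these made-up numbers, “generally violent person” ends up with most of the posterior mass: the Jack-specific hypothesis explains the punch best but pays a huge complexity penalty, which matches the claim that the awkward hypothesis only moves your self-model a tiny bit.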

Mandatory predictive-processing digression: I don’t mean ‘character’ and ‘updates’ abstractly and philosophically, I mean these to refer to real changes in your brain’s real world-model. In the predictive processing frame, somewhat speculatively, it makes sense to think of two ‘you’s: you, the bio-physical cellular automaton whose behavior is governed by physical law, and you, the self-image, the ego, i.e. your brain’s model of what you-the-bio-automaton will do. The brain’s world-model is of course part of the bio-automaton, and also your brain’s world-model drives the bio-automaton’s behavior. So when your world-model is predicting the bio-automaton it also has to predict what it itself will do, leading to some funny weird recursion business. I really like this idea and it deserves its own post at some point, but the relevant point for now is that in this frame, when you do something, this is useful evidence about what you-the-bio-automaton do: your brain will do some kind of approximate bayesian inference to update its own theory of how ‘you’ behave. This is e.g. how habits are implemented - you do something regularly, you-the-world-model make the inference that this is the sort of thing you-the-bio-automaton does, and the updated world-model drives the bio-automaton to behave this way even more in the future. The update after punching someone is analogous.

How this relates to Kant’s murderer question

(Do not be fooled, I haven’t actually read Kant, I am a dilettante.)

Consider Kant’s murderer-at-the-door question:

Suppose that an axe murderer comes to your door and demands you tell him where your friend is, so that he can kill her. Your friend in fact is in your basement. You lie and tell the murderer your friend is in the next town over. He heads off to the next town, and while he’s gone you call the police and bring your friend to safety.

Kant says lying here is bad, because it would be self-defeating when universalized, or something. But there are multiple levels at which we could claim the relevant moral rule lives: maybe it’s “do not lie”, but maybe it’s “do not lie to murderers”, or something dumb like “do not speak after opening the door” or “do not lie to people who are holding something”. Scott sees this problem in the linked post and suggests a kind of meta-consideration: thinking about how the way we universalize itself universalizes. This is neat, but I think we can also view this situation through the ‘hypotheses weighed by predictive power and simplicity’ lens: we’re breaking all of these rules, and for each rule, the badness of breaking it is proportional to that rule’s goodness as a hypothesis (accuracy × simplicity). Or in the character-impact frame: we’re making all these updates, but most of them by a negligible amount because the hypothesis sucks. Or in the second-order-impact-on-society frame: we’re contributing to creating a world in which people do all these things (lie, lie to murderers, lie to people who are holding something) in proportion to the goodness of the explanations.

The last framing seems particularly Kant-aligned to me: the universalizability rule sounds to me a bit like a heuristic for taking into account the second-order consequences of contributing to creating a world where people act like you do. (Or maybe this is just the pop-Kant understanding of universalizability, “imagine everyone acted the way you did”, which is too shallow, in which case, I don’t know.) And it is right to consider this, because there is no “true you”, and watching other people act gives you genuinely useful bits of information about what type of person “you” are. And then the predictive-power × simplicity thing helps us do this properly, without getting stuck on the “oh no, which level do we draw the inference at” thing.

Does this have any practical takeaways at all or are we just playing word games?

Mostly it’s just word games, and realistically I am writing this for my own entertainment. But I can imagine learning to think this way could be helpful if you’re confused or paralyzed by the seeming contradictoriness of existing moral rules. Understand that they are contradictory because people are different: what is bad for others may be good for you. Be honest with yourself, so you notice when you’re hiding behind a complex, awkward, self-serving belief. Use your acts thoughtfully to become more good and less bad.

Incredible what kind of intellectual acrobatics one can do to rederive common wisdom.
