Where "the Sequences" Are Wrong II
[Epistemic Status: Perhaps what should have been stated in the first part of this series of essays is that I don’t intend to “disprove” the entire Sequences, nor even to show that they are more wrong than right. Rather, my aim is to point out where certain patterns of thinking that I feel have been a source of stress to the rationality community1 may have originated. It turns out it’s difficult to pin down where this happened.]
Focus more on doing what you’re already doing better, rather than worrying that you’re doing the wrong thing(s).
In the field of self-improvement, there are always two dimensions along which people can be said to improve:
Doing what you’re already doing better.
Not doing the wrong things, or not doing the right things wrongly.
This way of putting it is fairly “discrete,” in the sense that actions are treated as “things,” and “things” can be the right or wrong things to be doing, as well as things that can be done rightly or wrongly.
Actually, the first way is considerably less discrete than the second, and we’ll talk about why that is, as well as why the second way introduces problems (and why those problems have to do with discreteness).
Remember what it feels like when you perceive that you’ve done something wrong: It feels like, “ow!” It is also always partially up to you how you wish to process this “ow.” On one end of the spectrum, an “ow” simply means “don’t do whatever you just did” - the whole neighborhood of that action is off-limits. On the other end of the spectrum, an “ow” means “don’t do exactly what you just did” - only that precise action is off-limits.
However, actions are not truly binary; it isn’t just action A versus action ~A. If taking action A produces an “ow,” you will interpret that to mean one of the following:
not (take action A) - “do nothing”
take action (not A) - “do whatever the opposite of A is”
take action A’ (A prime) - “do something like A but different”
(Negation is a nice little algebraic operator that can be arbitrarily shifted onto different terms, like a multiplicative factor.)
The third option is unlike the other two in that it does not imply that action A was “the wrong thing” per se.
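To make the three readings concrete, here is a minimal toy sketch in Python (my own illustration, not anything from the Sequences; the function name and string labels are made up) showing where the negation attaches in each case:

```python
# Toy sketch: three ways to turn an "ow" from action A into a policy,
# differing only in where the negation attaches.

def next_policy(interpretation: str, action: str = "A") -> str:
    """Map an interpretation of the 'ow' onto what to do next time."""
    if interpretation == "not (take action A)":
        return "do nothing"                                 # negate the taking
    if interpretation == "take action (not A)":
        return f"do the opposite of {action}"               # negate the action itself
    if interpretation == "take action A'":
        return f"do something like {action}, but varied"    # keep the core, tweak the rest
    raise ValueError(f"unknown interpretation: {interpretation}")

for rule in ("not (take action A)", "take action (not A)", "take action A'"):
    print(f"{rule:22} -> {next_policy(rule)}")
```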
Now imagine the situation in which “A and B” are true, and you take action C, and experience an “ow.” Here’s where the “discreteness” comes into play. You could conclude one of the following:
Whenever A and B, do not take action C.
Whenever A and B, take action ~C.
Whenever A and B, do something like C but different.
The first two are like rules, whereas the third one isn’t quite a rule, because it leaves things open to possibilities. Let’s say C is composed of two things, X and Y, so that not-C = C’ = (X xor Y) or Z (neither X nor Y).2
Our brains like to work with semantic meanings, which is why I allow myself some wiggle room to experiment with the logic here. I like formality, but formality usually signals the maturity of a domain, and I don’t reach for it before I’ve mastered the domain. So I am choosing to be informal until that happens (or rather, to become increasingly formal as best I can).
If C is composed of N things, then this negation operator allows C’ to consist of up to N-1 things chosen from the original N components of C.
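To make the combinatorics concrete, here is a minimal sketch (again my own toy illustration; the function name is made up) that enumerates these alternatives. For C composed of X and Y it produces exactly the options from footnote 2: X alone, Y alone, or Z (neither).

```python
from itertools import combinations

def alternatives(components):
    """Yield every way to keep up to N-1 of an action's N components
    (the empty selection corresponds to Z: 'neither')."""
    n = len(components)
    for k in range(n):                            # keep k components, k = 0 .. n-1
        for kept in combinations(components, k):
            yield set(kept)

# For C composed of X and Y: Z (neither), {'X'}, {'Y'} - i.e. (X xor Y) or Z.
for variant in alternatives(["X", "Y"]):
    print(variant if variant else "Z (neither)")
```

For N components this gives 2^N - 1 alternatives in total, which is part of why collapsing all of them into a single, discrete “not C” throws information away.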
One of the reasons I’m doing it this way is that I would like letters to stand for propositions as well as actions and goals. Thus, setting a variable equal to “false” or negating it has to be able to account for several new meanings.
We can consider objects to be fuzzy superpositions of many - potentially infinitely many - other objects, while also noting that how we choose to decompose these objects preserves meanings and relationships fairly well, even when we choose a manageable number of objects in our decomposition. Thus, when we receive an “ow,” we can decompose the “ow” meaningfully, too.
So when you feel an “ow”, you can note that several of the objects you were holding in your mind at the time are at the fore, a bit more than the others. Some of these objects might feel a bit more “owie” than the others, and that may also be why they had been brought to the fore. You can make use of those differences in feeling in order to make your updates.
Instrumental convergence is an attractor basin for humans, too, which means that actions that appear to get you closer to what you want will also start to look more attractive. That means you can look for the “owies” as well as for the components of actions that seem enticing.
So my claim is that this approach isn’t the same thing as “not doing the wrong things, or not doing the right things wrongly,” and that it is also a better alternative to it.
To be clear: the only criticism of the Sequences I can justify is criticism in the form that I am recommending - a form which, as part of this argument, runs against the one the Sequences themselves appear to support.
Where “the Sequences” are wrong: they tend to over-recommend worrying about not doing the wrong things.
First, to show that this criticism is justified, I’m going to provide several excerpts from the Sequences where I believe this is being said or implied.
I laughed a bit when I first started writing this section, thinking to myself “boy, I hope I can find enough evidence showing that the claims I’m claiming are being made actually are!” So, without further ado, I give you:
Predictably Wrong
This, the first book of "Rationality: AI to Zombies" (also known as "The Sequences"), begins with cognitive bias. The rest of the book won’t stick to just this topic; bad habits and bad ideas matter, even when they arise from our minds’ contents as opposed to our minds’ structure.
It is cognitive bias, however, that provides the clearest and most direct glimpse into the stuff of our psychology, into the shape of our heuristics and the logic of our limitations. It is with bias that we will begin.
Almost this entire book is an example of this, so I’ll just list the titles of the essays where it is most obvious:
Scope Insensitivity
Availability
What’s A Bias?
Planning Fallacy
The Lens That Sees Its Flaws
Fake Beliefs
An account of irrationality would be incomplete if it provided no theory about how rationality works—or if its “theory” only consisted of vague truisms, with no precise explanatory mechanism. This sequence asks why it’s useful to base one’s behavior on “rational” expectations, and what it feels like to do so.
“Making Your Beliefs Pay Rent” is not an example of what I’m talking about, and so, I would call that one of the places where the Sequences are right, not wrong.
However, this book describes itself as an “account of irrationality.”
From Belief In Belief:
The rationalist virtue of empiricism is supposed to prevent us from making this class of mistake. We’re supposed to constantly ask our beliefs which experiences they predict, make them pay rent in anticipation. But the dragon-claimant’s problem runs deeper, and cannot be cured with such simple advice. It’s not exactly difficult to connect belief in a dragon to anticipated experience of the garage. If you believe there’s a dragon in your garage, then you can expect to open up the door and see a dragon. If you don’t see a dragon, then that means there’s no dragon in your garage. This is pretty straightforward. You can even try it with your own garage.
No, this invisibility business is a symptom of something much worse.
This chapter is one I’ve focused on heavily before, so I won’t spend too much time here. The reason I focus on it is that it presents us with a cartoon-like caricature of a certain type of irrationality: the invisible dragon in the garage. The problem lies in how that caricature got into people’s heads in the first place. Was it religion? Was it social pressure to conform? Was it honest people who just really wanted to believe in the invisible dragon against the grain of their egregore?
You can, of course, as an honest person, disbelieve in the invisible dragon against the grain of your egregore. But “not doing the wrong things / not doing the right things wrongly” is more typically what our egregores tell us to worry about.
This particular egregore tells us to worry about not accidentally believing in invisible dragons. A book entitled “Fake Beliefs” cannot, judging by its title, mostly be about “why it’s useful to base one’s behavior on “rational” expectations, and what it feels like to do so.” The main thing it’s about is not having any fake beliefs, which we can assume are bad. “Fake beliefs” sound pretty bad, don’t they? Sure wouldn’t want to have any of those!
But it’s kind of implied that the way one ensures one doesn’t have any “fake beliefs” is by basing one’s behavior on rational expectations, to paraphrase considerably. But if fake beliefs are mostly caused by the grain of one’s egregore - pressuring one to conform, to profess belief, to insist that one believes what one says, and to wear one’s beliefs as attire - then I think it’s reasonable to conclude that removing that pressure, or consciously choosing to work against it, would be enough to deal with that worry, as the descriptor “fake” would suggest.
The egregore created by the Sequences is like the ones that tell you to believe in invisible dragons, just in the opposite direction: instead, you profess that you most assuredly do not believe in any invisible dragons.3
The Sequences talk about what it looks like when people believe false things. When they talk about why people believe false things, they tend to recognize that people often do so while acting as members of a group (e.g. a religion), but, generally speaking, they blame individual cognition for why people fall to the “dark side” (epistemologically).
Dark Side Epistemology
This argument, clearly, is a soldier fighting on the other side, which you must defeat. So you say: “I disagree! Not all beliefs require evidence. In particular, beliefs about dragons don’t require evidence. When it comes to dragons, you’re allowed to believe anything you like. So I don’t need evidence to believe there’s a dragon in my garage.”
Having false beliefs isn’t a good thing, but it doesn’t have to be permanently crippling—if, when you discover your mistake, you get over it. The dangerous thing is to have a false belief that you believe should be protected as a belief—a belief-in-belief, whether or not accompanied by actual belief.
Belief in the invisible dragon is just so, so, so much fun that even the negative parts of the belief (like the fact that it produces no constraint on expectations whatsoever) are completely drowned out by the amount of fun it provides, and that’s why the belief must be protected!
But what if it becomes weird to believe in the dragon? How many people do you know who appear to openly believe in “invisible dragon”-type things that are also very socially unacceptable (i.e., there isn’t even a tiny cult they belong to that believes the thing, whatever it may be)?
If there are any beliefs that actually function in this described way - beliefs that appear to consistently provide the believer with a steady stream of positive utility, enough to predictably overcome any amount of negative utility the belief causes, including however much social unacceptability it carries with it - I am interested in learning what those beliefs are (and not in a sarcastic way).
When arguments have become soldiers, and the argument has escalated to a this-side-versus-that-side battle, the other side believes “false” things while your side believes “true” things. Logic becomes more binary than it was before. The Sequences say that this kind of situation amounts to epistemically hazardous weather. When the weather calms down, the people in the outgroup no longer believe “false” things per se; rather, they believe things that are, at worst, sort of “true-esque.” Their beliefs aren’t totally false, but are things that reasonable people, perhaps with different cultural backgrounds and emotional makeup, could believe had they been exposed to a specific stream of input data different from yours.
In war time, that charitable rendition of your opponents’ beliefs becomes far too costly and time-consuming. The inferential distances have been stretched to become insurmountable.
But I mean… is it actually more costly and time-consuming?
It seems like the beliefs that actually need the most “protecting” are the ones that are about your opponents - that they are totally wrong, and that coming to see them as people like you who are partially correct about lots or even most things is detrimental to the war effort.
“We’re right” needs protecting too, but “we’re all basically right” is the default, peace-time view.
“They’re wrong” doesn’t prevent “we’re right” from being true, except to the extent that “they’re wrong” becomes most of what you’re all about.
So given that, and choosing to lean on what I’ve just argued is the better epistemological framing, the Sequences can’t be totally wrong; otherwise they couldn’t have the appeal that they do (to me included).
Another restatement of the thesis: The Sequences basically get this part right; however, they also appear to blame people who “mean well” - specifically, how they actually perform their cognitive steps - for making the bulk of the errors that lead to faulty conclusions.
Consider a scenario in which an AI researcher, working on their own, comes to believe they’ve found a promising avenue towards solving AI alignment.
The rationalists currently appear to actively push down that kind of person’s hopes.4
Just Lose Hope Already
That’s not even the sad part. The sad part is that he still hasn’t given up. Casey Serin does not accept defeat. He refuses to declare bankruptcy, or get a job; he still thinks he can make it big in real estate. He went on spending money on seminars. He tried to take out a mortgage on a ninth house. He hasn’t failed, you see, he’s just had a learning experience.
And yet it seems to me that how to not be stupid has a great deal in common across professions. If you set out to teach someone how to not turn little mistakes into big mistakes, it’s nearly the same art whether in hedge funds or romance, and one of the keys is this: Be ready to admit you lost.
I can understand giving up on the specific angle of attack you are trying. But I probably wouldn’t advise that someone give up hope entirely in whatever they were actually trying to do.
The second of those two quotes, by the way, might be the best example of what I’m looking for thus far.
Why is “admitting you lost” such a great virtue? If I were negotiating a peace treaty and I were on the winning side of the war, that peace treaty would basically be written such that the losing side had to admit they were wrong about as little as possible - save, perhaps, admitting that it was wrong to be an evil aggressor who committed war crimes.
Their gods can totally still officially exist, I don’t mind that at all! So long as they don’t kill anyone for not believing in them.
If admitting you lost is super hard like everyone says it is, perhaps that’s because you don’t want to give up trying to accomplish what may be your terminal (hardest to change) goals. You can always change your strategies.
We also have to take into account that telling your enemies how virtuous it is for them to admit when they lost is a very predictable strategy. It’s fairly easy to do this when you’re making fun of someone else (and this piece is notable in that Eliezer is doing exactly that).
Where’s The Mysterious Third Thing?
When there appear to be two opposing sides to an argument, it is not always the case that one side is right and the other is wrong. Is the truth “in the middle?” Sort of. It’s called the “mysterious third thing.”
The third thing can be a higher-dimensional object than either of the two opposing sides, so it’s not always simply the average of both sides, per se. But it can be.
It’s very hard to weigh Strategy 1 against Strategy 2 if you don’t know about the mysterious third thing.
Actually, I’m going to conclude this chapter on a positive note for the Sequences, since they do mention the third thing:
The Third Alternative
To do better, ask yourself straight out: If I saw that there was a superior alternative to my current policy, would I be glad in the depths of my heart, or would I feel a tiny flash of reluctance before I let go? If the answers are “no” and “yes,” beware that you may not have searched for a Third Alternative.
Strong agree. Especially the part about how it should feel good to accept a new alternative. If you are clinging to something beloved but you can’t seem to make your beloved belief compatible with the rest of reality, consider that there is probably something about the belief that makes it beloved to you, which does not have to be the pieces of it which most contradict reality. In fact, those parts of the belief (the contradictory-with-reality-parts) might feel good to lose.
However, it is possibly the case that I believe a slightly stronger version of this, which is that one needn’t pass through any “pits of despair” while searching for a better alternative.
There is actually a central mystery that I am trying to solve, which has caused me to scour these texts for a possible source of the anomaly in question5. The mystery is still unsolved, because I find that the Sequences are themselves a very “mysterious third thing”: they only seem to prescribe what I consider the “bad kind of rationality” if you read them that way or cherry-pick lots of examples, while in lots of other places they don’t.
So what happened?
To be more specific, patterns of thinking that have led to a “pessimism bias.”
Where “not (X and Y)” usually means not both, so thus either one or the other, or neither (in which case I am allowing some third thing to exist).
Or perhaps even more commonly, to profess that you personally know many people who do believe in invisible dragons and you can assure me that their belief has definitely harmed them in ways that they are oblivious to.
Namely, to footnote 4 above, why MIRI has turned toward so much pessimism of late, why “doomers” have turned towards slight demagoguery, and how this could even be possible while Eliezer is supposedly running the place.