
effective altruism and the tyranny of expected value

i'm broadly sympathetic to effective altruism. the core idea is hard to argue with. if you're going to try to do good, you should think carefully about what actually works. use evidence. compare interventions. allocate resources where they have the most impact. that's just being serious about helping people.

where it gets complicated is when you apply expected value reasoning to low-probability, high-impact scenarios.

the pascal's mugging problem

EA has drifted toward longtermism. the argument goes like this. there could be trillions of future people. the expected value of preventing existential risk is astronomically high even if the probability of any specific risk is low. therefore, working on existential risk reduction is one of the most impactful things you can do.

the math checks out. the reasoning is valid. and yet something feels off. when you multiply a tiny probability by an astronomically large payoff, you can justify almost anything. this is basically pascal's mugging dressed up in utilitarian clothes.
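to see how lopsided the arithmetic gets, here's a toy comparison in python. every number is made up for illustration; the point is the shape of the calculation, not the estimates.

```python
# toy expected-value comparison. every number here is made up for illustration.

# a well-measured intervention: near-certain, modest payoff
p_nets, lives_nets = 0.95, 1_000
ev_nets = p_nets * lives_nets              # 950 expected lives

# a speculative intervention: tiny probability, astronomical payoff
p_xrisk, lives_xrisk = 1e-10, 1e15         # "trillions of future people" territory
ev_xrisk = p_xrisk * lives_xrisk           # 100,000 expected lives

print(f"nets:  {ev_nets:,.0f} expected lives")
print(f"xrisk: {ev_xrisk:,.0f} expected lives")
# the speculative bet wins by ~100x, and it can always be made to win
# harder by inflating the payoff faster than the probability shrinks.
```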

i don't think this means longtermism is wrong. i think it means expected value reasoning breaks down at the extremes, and we need additional tools: moral intuitions, risk aversion, humility about our ability to affect the far future.
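risk aversion, at least, can be made precise. the standard move is a concave (or bounded) utility function, so that the payoff has diminishing returns and the astronomical term stops dominating. a minimal sketch, reusing the made-up numbers from above:

```python
import math

# risk aversion as a concave utility: u(x) = log(1 + x). with diminishing
# returns, a payoff a trillion times larger is not worth a trillion times
# more, so tiny probabilities stop carrying the day.
def utility(lives):
    return math.log1p(lives)

p_nets, lives_nets = 0.95, 1_000
p_xrisk, lives_xrisk = 1e-10, 1e15

eu_nets = p_nets * utility(lives_nets)     # ~6.6
eu_xrisk = p_xrisk * utility(lives_xrisk)  # ~3.5e-9

print(f"nets:  {eu_nets:.2f}")
print(f"xrisk: {eu_xrisk:.2e}")
# the boring intervention now wins decisively
```

this isn't a full answer, since the choice of utility function is doing all the work. but it shows there are formal knobs besides raw expected value.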

what i actually think

the most defensible EA positions are the boring ones. malaria nets work. deworming is cheap and effective. GiveDirectly cash transfers help people. these are empirically validated, near-term interventions with measurable outcomes. they don't require speculative reasoning about the far future.

AI safety is somewhere in the middle. the risk is plausible and the interventions (like mechanistic interpretability research) are concrete and tractable. you don't need to invoke trillions of future lives to justify working on AI alignment. you just need to believe that increasingly capable AI systems could cause real harm if we don't understand them. that seems pretty reasonable.

the community problem

EA's biggest weakness isn't the philosophy. it's the community dynamics. any movement that selects for high-IQ people who think they've figured out the most important thing to work on is going to develop some unhealthy patterns. arrogance. insularity. galaxy-brained reasoning that justifies weird conclusions.

the FTX collapse was the worst-case version of this. but even in its milder forms, the pattern of "i've calculated that X is the most impactful thing, therefore normal rules don't apply to my pursuit of X" is dangerous.

i still identify with the core ideas. i just try to hold them with more humility than the median EA forum poster.


status: evolving note. my views on EA keep shifting. the core insight about taking impact seriously is durable. the specific conclusions change as i learn more.