I want to raise the question of whether we’re again building too much momentum toward what we currently think is best, thereby eroding our ability to react to new insights and adjust our strategy.

A few years ago, 80,000 Hours et al. noticed that the funding gaps in the spaces they prioritized seemed about to shrink below the level of the talent gaps. This meant they had to perform a costly reversal of their outreach strategy, from promoting earning to give (ETG) as an underused opportunity to warning against it anyone whose comparative advantage for it wasn’t unambiguous.

Similarly, the growth of what used to be called the EA “movement” was prioritized until people started warning of the dangers of getting it wrong. But by that point it had already gone wrong, among other ways, in that there are now a lot of people who identify as EAs but don’t have the time to keep up with the bleeding-edge research. In my circles, we’re even unsure whether we’re just seeing more informed people and less informed people who’ll eventually follow the first group, or whether we’re seeing a permanent fragmentation of the community.¹

Now I suspect that the same failure mode is repeating in a third domain.

I’m not aware of any plausible theory according to which AI safety, at least the kind focused on alleviating risks of extreme suffering (s-risks), is not the most pressing problem to work on. But my filter bubble now focuses almost exclusively on education and recruiting for this cause, which seems worrisome to me, especially when compared to the attention given to prioritization research. Some considerations:

  1. Trivially, the considerations that led us to reconsider our approaches to ETG and growth were unknown before they were discovered. More unknown unknowns may become known that will lead us to either deprioritize AI safety a little or change something about our approach to it. But now we’re building such momentum that we’re eroding the value of having the option to make such course adjustments.²
  2. Relatedly, I worry that the enormous momentum that is being built at the moment will lead to a continued increase in available, skilled applicants to AI safety organizations. For many possible underlying skill distributions, the expected qualification of the best applicant increases with the number of applicants. An organization that expects a stronger applicant pool next year has an incentive to wait, so the current development discourages hiring and growth just as deflation discourages investment. This will lead us to overestimate the current talent gap and may also delay research in general.
  3. Very few people are cause neutral. Those who are have a huge comparative advantage for prioritization research, but if they get convinced that AI safety research is more important, they may specialize in it to the point where a switch back later would be very costly.
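
The claim in the second consideration, that the best applicant tends to be better when the pool is larger, can be illustrated with a small simulation. This is only a sketch under stated assumptions: applicant skill is modeled as independent draws from a standard normal distribution, and `mean_best_applicant`, the pool sizes, and the number of trials are hypothetical choices made for illustration, not anything measured.

```python
import random
import statistics


def mean_best_applicant(num_applicants, trials=2000, seed=0):
    """Average skill of the best applicant over many simulated hiring rounds.

    Assumption: skill is an independent standard normal draw per applicant.
    """
    rng = random.Random(seed)
    best_per_round = [
        max(rng.gauss(0.0, 1.0) for _ in range(num_applicants))
        for _ in range(trials)
    ]
    return statistics.mean(best_per_round)


# Hypothetical pool sizes, chosen only to show the trend.
pool_sizes = [1, 10, 100]
results = {n: mean_best_applicant(n) for n in pool_sizes}

for n in pool_sizes:
    print(f"{n:>3} applicants: mean skill of best applicant = {results[n]:.2f}")
```

Under this assumption the average best applicant improves as the pool grows, which is the incentive to postpone hiring described above. Other skill distributions would change the numbers and possibly the strength of the effect.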

These are only considerations, and it is quite possible that the current momentum is actually not yet enough, that I’m mistaken about its degree, or that a number of other countervailing considerations outweigh them. So I have three main suggestions:

  1. Put a stronger focus on prioritization research again, to be prepared in case AI safety is not the most important thing after all or in case there are low limits to its growth.
  2. Spread questions, not answers: A big part of the trouble with ETG, offsetting, and various other “answers” is that they are only called for on the margin in a particular situation. Spreading them is costly, and once that’s done, the situation may change and require spreading the opposite answer.³ All it takes to sidestep these costs and risks is to spread the questions instead. For example, rather than promoting offsetting, focus outreach on making people curious about how they can best reduce animal suffering. The researchers who publish the tentative answers can then maintain maximal fidelity through long blog posts or papers and advertise their conclusions at low cost, whether the conclusion at that time is offsetting, veganism, or working for an animal rights organization.
  3. Marry the generally accepted margin thinking with something like systems theory to gain a better understanding of the above failure mode and how to avoid it. Over longer time scales, we’ll probably also run into issues with exacerbating oscillations if we don’t get our reaction delays right.
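
To make the worry about reaction delays in the third suggestion concrete, here is a toy feedback model. It is a sketch, not a model of the actual community: it assumes a single "gap" (say, between the resources a cause has and what it should have) that we correct in proportion to the gap we observed some steps ago. The function `simulate`, the `gain` of 0.5, and the delay of three steps are hypothetical values picked for illustration.

```python
def simulate(gain, delay, steps=40, initial_gap=1.0):
    """Iterate gap[t+1] = gap[t] - gain * gap[t - delay].

    Assumption: the gap was constant at initial_gap before the first step,
    and the correction reacts only to the delayed observation.
    """
    history = [initial_gap] * (delay + 1)
    for _ in range(steps):
        observed = history[-1 - delay]  # what we knew `delay` steps ago
        history.append(history[-1] - gain * observed)
    return history[delay:]  # drop the padding before t = 0


def sign_changes(series):
    """Count how often the series crosses from over- to under-shooting."""
    signs = [x > 0 for x in series if x != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))


no_delay = simulate(gain=0.5, delay=0)
delayed = simulate(gain=0.5, delay=3)

print("no delay: crossings =", sign_changes(no_delay))
print("delayed:  crossings =", sign_changes(delayed))
print("largest gap with delay =", round(max(abs(x) for x in delayed), 2))
```

With an immediate reaction the gap shrinks smoothly toward zero. With the same correction strength applied to stale information, the gap overshoots repeatedly and the swings grow, which is one simple way the exacerbating oscillations mentioned above can arise.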

I consulted some friends before writing this because s-risk-focused AI safety does seem like the most important thing to me and still seems neglected, and it seemed potentially dangerous to risk slowing down momentum in the right direction. So please don’t update on these considerations too strongly before someone has put out a good quantitative model trading these factors off against the surely many countervailing factors.

  1. This also contributes to the recent concerns about representativeness: I’d be happy to see the variety of opinions that well-informed EAs hold represented fairly, but it seems to me like a waste of everyone’s time to include opinions that majorities hold only because they haven’t had time to catch up on the intellectual process of the past years.

  2. One might argue that AI safety is so enormously neglected compared to AI research that there’s no such risk for decades at least, but EA and ETG are still neglected too when you choose a wide enough reference class: very few people are EAs, and I want to start a social enterprise partially because I expect it to be easier to get larger investments than larger grants. Plus, I’m particularly concerned about the momentum among people with the rare advantage of cause neutrality.

  3. GiveWell and ACE spread answers very deliberately and are careful to measure whether their donors really follow their recommendations or instead become invested in particular charities, e.g., whether they switch their donations to a different charity when the first announces that its funding gap has been filled. This is a tricky process that has to be worth the costs and risks.

