Discussion about this post

Jurgen Gravestein

Great analysis of the (lack of) evidence supporting the claim that we are close to power-seeking AI systems.

I sometimes wonder how much of today's AI doomerism is rooted in literature and, as a culture, in our collective memory.

The true irony would be if a future superintelligence went rogue because it had learned its power-seeking behaviors from the corpus of dystopian robot novels we wrote in past decades.

If AI were ever to seek to rule over us, it would be because we created it in our own image. Only if we give it the will do we give it the power. On that same note: there was a recent paper suggesting GPT-4 developed a theory of mind. I wrote a short analysis of it.

You might find it interesting:

https://jurgengravestein.substack.com/p/did-gpt-4-really-develop-a-theory

Richard Tomsett

The step in the argument/thought experiment that an AI "will fail if anyone ever modifies its goals" has never satisfied me. To get to the "paperclip maximiser" scenario, you have to accept that the AI will be able to do all sorts of outrageously resourceful and intelligent-seeming things in order to achieve its goal, AND ALSO that the AI cannot change its own goal(s). Why would such a resourceful, intelligent AI not simply change its own goal, so that it could maximise its reward extremely easily? That seems a much simpler approach than taking over the universe: taking over the universe is fraught with uncertainties, which a superintelligent AI would presumably recognise, concluding that changing its own goal to one it can maximise with greater certainty would be a better course of action.

I don't think I've seen the fixed-goal assumption seriously challenged, but then I haven't followed the conversations on this topic closely; I've only read [admittedly just half of] Superintelligence and a few other summaries of the general ideas. I'd be interested to see what others have argued about this.
