During a summer fellowship at the Center on Long-Term Risk, I investigated how philosophical and neuroscientific perspectives on agency could inform AI safety research.
This work explored Dennett’s intentional stance, a strategy for predicting a system’s behavior by treating it as a rational agent with beliefs and desires, and examined insights from computational neuroscience about how the human brain implements agency. The goal was to develop better conceptual tools for reasoning about agent-like behavior in AI systems and for understanding what it means for a system to have goals or objectives.
The research aimed to connect philosophical clarity about agency with practical questions about building and aligning AI systems that exhibit goal-directed behavior.
Publications & Writing
- Grokking the Intentional Stance (LessWrong)
- Integrating Three Models of Human Cognition (LessWrong)