The Political Geometry of Ideology in LLMs
Working Paper
Audits of political bias in language models usually end with a single left–right score, yet political science has long shown that ideology is multidimensional, with economic and social conservatism coming apart in both theory and public opinion. So how is political content actually organized inside a model: compressed onto one partisan axis, or kept separate along distinct dimensions? This project opens models up with sparse autoencoders, probing two open-weight model families with policy items across economic and social domains and mapping the geometry of the features that respond. The finding: economic and social policy occupy clearly separated internal structure, and the features involved mark which domain is active rather than a stance within it. Calling a model “left-leaning” or “right-leaning” therefore flattens a geometry that interpretability lets us study directly.