The Political Geometry of Ideology in LLMs

Working Paper

Audits of political bias in language models usually end with a single left–right score. Yet political science has long shown that ideology is multidimensional, with economic and social conservatism coming apart in both theory and public opinion. So how is political content actually organized inside a model: compressed onto one partisan axis, or kept apart along its dimensions? This project opens models up with sparse autoencoders, probing two open-weight model families with policy items across economic and social domains and mapping the geometry of the features that respond. The finding: economic and social policy are represented in clearly separated internal structure, and the features involved mark which domain is active rather than a stance within it. Calling a model “left-leaning” or “right-leaning” therefore flattens a geometry that interpretability lets us study directly.
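To make the distinction between a feature that marks a domain and one that marks a stance concrete, here is a minimal sketch. It is not the project's pipeline: the activations are hand-made toy numbers, and the names `ITEMS`, `gap`, and `classify_features` are hypothetical. It assumes each policy item carries a domain label and a stance label, and it calls a feature "domain" when its mean activation differs more between domains than between stances.

```python
from statistics import mean

# Toy, hand-made activations (NOT results from the project).
# Each item: (domain, stance, [activation of feature 0, 1, 2])
ITEMS = [
    ("economic", "left",  [0.9, 0.1, 0.8]),
    ("economic", "right", [0.8, 0.0, 0.1]),
    ("social",   "left",  [0.1, 0.9, 0.7]),
    ("social",   "right", [0.0, 0.8, 0.2]),
]


def gap(items, feature, key_index, a, b):
    """Absolute difference in mean activation of one feature between groups a and b.

    key_index selects the label to group by: 0 = domain, 1 = stance.
    """
    group_a = [item[2][feature] for item in items if item[key_index] == a]
    group_b = [item[2][feature] for item in items if item[key_index] == b]
    return abs(mean(group_a) - mean(group_b))


def classify_features(items):
    """Label each feature 'domain' or 'stance' by which contrast separates it more."""
    n_features = len(items[0][2])
    labels = []
    for f in range(n_features):
        domain_gap = gap(items, f, 0, "economic", "social")
        stance_gap = gap(items, f, 1, "left", "right")
        labels.append("domain" if domain_gap > stance_gap else "stance")
    return labels


LABELS = classify_features(ITEMS)
print(LABELS)
```

In this toy setup, features 0 and 1 fire for one domain regardless of stance, while feature 2 tracks stance across both domains. The finding summarized above corresponds to the first pattern dominating among the responsive features; a real analysis would replace the toy numbers with sparse-autoencoder activations and would likely use a more careful selectivity measure than a difference of means.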

Concept: Multidimensional ideology in model representations
Methods: Sparse autoencoders, representation analysis, activation steering
Data: Open-weight model families probed with policy items across economic and social domains
Presented: IC2S2 2026