Invited speaker at the NULab Critical Forum — the talk series of Northeastern’s center for digital humanities and computational social science — on strategies for classifying political text with large language models. The talk focused on a recurring problem: LLMs will happily label anything, but a label is only useful if it measures the construct you claim it does. I walked through the validity pitfalls — where prompt-based stance classification quietly drifts from the concept a researcher set out to measure — which ties directly to the “Stance Is Not a Construct” work. See also AI Interpretability.