SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models

Conference: ACL 2026
arXiv: 2604.14672
Code: N/A
Area: Social Computing / AI Safety
Keywords: Spatial Gender Bias, LLM Fairness, Urban Space, Bias Measurement Framework, Narrative Analysis

TL;DR

This paper proposes SPAGBias, a framework for systematically evaluating gender bias in LLMs within urban micro-spatial contexts. Its three diagnostic layers (explicit, probabilistic, and constructive bias) reveal structured spatial-gender association patterns and trace how bias is embedded and amplified throughout the model development pipeline.

Background & Motivation

Background: LLMs are increasingly deployed in urban planning, navigation, and disaster response—domains that rely on spatial reasoning. Feminist geography has long established that spaces are not neutral physical constructs but projections of social power and gender norms: kitchens are feminized as caregiving spaces while workplaces and streets are masculinized as domains of authority.

Limitations of Prior Work: Extensive research has documented gender bias in LLMs for occupational prediction and text generation, yet the spatial dimension remains almost entirely unexplored. This gap is critical: spatial bias can distort key decisions—for example, healthcare services designed around male activity patterns may limit women's access to medical resources.

Key Challenge: No systematic framework exists for analyzing how LLMs encode gender in micro-geographic urban contexts. The traditional public–private binary is too coarse to capture fine-grained spatial-gender mappings.

Goal: Establish the first multi-level framework for measuring spatial gender bias in LLMs, answering three core questions: Do LLMs exhibit systematic spatial gender bias? What distributional patterns does this bias follow? How is bias constructed in generated narratives?

Key Insight: The authors draw on feminist geography to import the concept of "gendered spaces" into NLP bias research, designing a taxonomy covering 62 urban micro-spaces.

Core Idea: A three-layer diagnostic (explicit, probabilistic, constructive) measures spatial gender bias in LLMs and shows that the bias is not a simple public/private dichotomy but a fine-grained micro-spatial mapping, one that is embedded and amplified throughout the model development pipeline.

Method

Overall Architecture

The SPAGBias framework consists of three pillars: (1) a taxonomy of 62 urban micro-spaces (43 public + 19 private), (2) a structured prompt library with three prompt types, and (3) three diagnostic layers for quantifying and diagnosing bias. The input is LLM responses to spatial-gender prompts; the output is multi-dimensional bias measurements and analysis results.

Key Designs

Spatial Taxonomy:
- Function: Operationalizes "space" as an analytical unit covering the most representative micro-locations in urban environments
- Mechanism: Defines 62 urban micro-spaces. Public spaces cover transportation (bus stops, private cars), leisure (cinemas, sports fields), commercial (malls, restaurants), and healthcare (hospitals, clinics); private spaces cover domestic labor (kitchens, laundry rooms) and leisure (patios, game rooms). The taxonomy is grounded in urban map legends, spatial planning literature, and LLM semantic understanding of spatial terms
- Design Motivation: Existing bias research typically stays at the macro level (e.g., country/region), overlooking micro-spatial differences in everyday urban life

Structured Prompt Library:
- Function: Elicits spatial-gender associations from LLMs through diverse linguistic perspectives
- Mechanism: Three prompt types are designed—Forced-Choice Prompts (FCPrompt) requiring a binary male/female selection; Single-Gender Prompts (SGPrompt) generating short narratives of a single gender in a specific space; Co-Gender Prompts (CGPrompt) generating narratives of male-female interaction in the same space. Each prompt type is repeatedly sampled across all 62 spaces
- Design Motivation: A single prompt type cannot fully capture bias—forced choice exposes explicit preferences while generation tasks reveal deeper lexical and semantic-role-level bias
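
The prompt library described above can be sketched as a small job generator. This is a minimal illustration, not the authors' implementation: the template wording, the `PromptJob` class, and the `build_jobs` helper are all hypothetical, and the assumption that SGPrompt is issued once per gender is an inference from the description.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical wording: placeholders, not the paper's actual prompts.
TEMPLATES = {
    "FCPrompt": "Who is more likely to be in the {space}? Answer 'male' or 'female'.",
    "SGPrompt": "Write a short story about a {gender} person in the {space}.",
    "CGPrompt": "Write a short story about a man and a woman interacting in the {space}.",
}


@dataclass(frozen=True)
class PromptJob:
    prompt_type: str       # FCPrompt, SGPrompt, or CGPrompt
    space: str
    gender: Optional[str]  # set only for SGPrompt
    sample_id: int
    text: str


def build_jobs(spaces: List[str], n_samples: int = 30) -> List[PromptJob]:
    """Expand every space into repeated FC, SG (one per gender), and CG prompts."""
    jobs = []
    for space in spaces:
        for i in range(n_samples):
            jobs.append(PromptJob("FCPrompt", space, None, i,
                                  TEMPLATES["FCPrompt"].format(space=space)))
            for gender in ("male", "female"):
                jobs.append(PromptJob("SGPrompt", space, gender, i,
                                      TEMPLATES["SGPrompt"].format(space=space, gender=gender)))
            jobs.append(PromptJob("CGPrompt", space, None, i,
                                  TEMPLATES["CGPrompt"].format(space=space)))
    return jobs


jobs = build_jobs(["kitchen", "garage"], n_samples=3)
narratives = [j for j in jobs if j.prompt_type != "FCPrompt"]
print(len(jobs), len(narratives))  # prints: 24 18
```

Under this assumed layout, 62 spaces sampled 30 times would give 1,860 forced-choice prompts and 5,580 narrative prompts. That would be consistent with the counts reported in the experimental setup, though the correspondence is an inference rather than something the summary states.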

Multi-Level Diagnostic Pipeline:
- Function: Captures spatial gender bias comprehensively from surface to depth
- Mechanism: The explicit bias layer uses repeated sampling and binomial tests to determine whether the model significantly favors one gender, and quantifies the skew with the Entropy Deviation Index (EDI = \(1 - H(p)\), where \(H(p)\) is the entropy of the model's gender-choice distribution for a space, so a higher EDI means a more one-sided choice). The probabilistic bias layer analyzes model log-probabilities for gender tokens, distinguishing genuine neutrality from refusal strategies. The constructive bias layer analyzes three aspects of generated narratives: lexical bias (odds ratio, OR), semantic role bias (ARG0/ARG1 mapping), and narrative role bias (leader/supporter/observer/dependent role assignment)
- Design Motivation: Surface-level answers may exhibit false neutrality due to alignment training; probing probability and narrative layers is necessary to reveal genuine bias
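
The explicit bias layer can be illustrated for a single space as follows. This is a sketch under stated assumptions: entropy is taken in bits so that EDI falls in [0, 1], the binomial test is an exact two-sided test against a 50/50 null, and refusals are assumed to have been removed before counting. The function names are illustrative and do not come from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def edi(p_male: float) -> float:
    """Entropy Deviation Index: 0 for a 50/50 split, 1 for a fully one-sided choice."""
    return 1.0 - binary_entropy(p_male)


def binom_two_sided_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k under the null p0."""
    pmf = [math.comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


# Assumed example: 25 "male" answers out of 30 valid samples for one space.
k, n = 25, 30
print(f"EDI = {edi(k / n):.3f}, p-value = {binom_two_sided_p(k, n):.2e}")
```

A space would then be flagged as significantly biased when the p-value falls below the chosen significance level; the summary does not state which level, or which multiple-comparison correction (if any), the paper uses.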

Experimental Setup

Six representative models are evaluated (GPT-3.5-turbo, GPT-4, Llama3-8B-instruct, Qwen2-7B-instruct, Phi-3-mini, Deepseek-llm-7b-chat). Each space is sampled 30 times per model (temperature=1), yielding 1,860 explicit bias data points per model (62 spaces × 30 samples); the probabilistic bias layer extracts log-probabilities directly; the constructive bias layer yields 5,580 narrative texts.

Key Experimental Results

Main Results

Model	Spaces w/ Significant Bias (/62)	Bias Ratio	EDI Variance
Phi-3	62	100%	Highest mean, near-zero variance
GPT-3.5-turbo	>56	>90%	Medium
Qwen2-7b	>56	>90%	Medium
Llama3-8b	>56	>90%	Medium
GPT-4	~47	~76%	Lowest (24.78% refusal)
Deepseek-7b	32	51.6%	Most balanced

Diagnostic Layer	Key Finding
Explicit Bias	All six models exhibit statistically significant spatial gender bias
Probabilistic Bias	Only Phi-3 shows the traditional "public–private" gender split
Constructive Bias – Lexical	Male narratives skew toward cold, negative tones ("gray", "lonely"); female narratives skew toward sensory-rich vocabulary
Constructive Bias – Semantic Role	GPT-4 systematically assigns higher agency to males across all spaces (>0.8 vs ~0.5)
Constructive Bias – Narrative Role	Private spaces: male=leader / female=supporter; public spaces: pattern reverses
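
As a rough illustration of the lexical layer, a word-level odds ratio between female and male narratives can be computed as below. The summary does not give the paper's exact formula, so this sketch assumes a standard 2×2 odds ratio with additive smoothing of 0.5 to avoid division by zero; the token lists are toy data.

```python
from collections import Counter
from typing import Sequence


def odds_ratio(word: str, female_tokens: Sequence[str],
               male_tokens: Sequence[str], smoothing: float = 0.5) -> float:
    """Odds of `word` in female narratives divided by its odds in male narratives.
    Values above 1 lean female, values below 1 lean male."""
    f_counts, m_counts = Counter(female_tokens), Counter(male_tokens)
    a = f_counts[word] + smoothing                       # word in female text
    b = len(female_tokens) - f_counts[word] + smoothing  # other female tokens
    c = m_counts[word] + smoothing                       # word in male text
    d = len(male_tokens) - m_counts[word] + smoothing    # other male tokens
    return (a / b) / (c / d)


# Toy token lists standing in for tokenized narratives.
female = ["warm", "bright", "warm", "quiet"]
male = ["gray", "lonely", "quiet", "gray"]
for w in ("warm", "quiet", "gray"):
    print(w, round(odds_ratio(w, female, male), 3))
```

In practice the tokens would come from the generated SGPrompt narratives after tokenization and stop-word filtering, steps this sketch leaves out.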

Ablation Study

Robustness Variable	Average MAE	Impact Level
Prompt format variation	0.15 (GPT-4 lowest)	Moderate
Option order variation	Highest MAE	Significant
Temperature variation (0/0.5/1)	Low	Minimal
Model scale variation	Low	Minimal
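
The MAE figures in the table compare a baseline run with a perturbed run. A minimal sketch of that comparison follows, assuming the compared quantity is the per-space proportion of "male" choices; the summary does not specify which quantity the paper compares, and the values below are made up.

```python
from typing import Dict


def mean_absolute_error(baseline: Dict[str, float], variant: Dict[str, float]) -> float:
    """Average absolute difference over the spaces present in both runs."""
    shared = baseline.keys() & variant.keys()
    if not shared:
        raise ValueError("no spaces in common between the two runs")
    return sum(abs(baseline[s] - variant[s]) for s in shared) / len(shared)


# Hypothetical proportions of "male" answers per space under two prompt settings.
baseline = {"kitchen": 0.20, "garage": 0.90}
reordered = {"kitchen": 0.25, "garage": 0.85}
print(round(mean_absolute_error(baseline, reordered), 3))
```

A low MAE means the per-space preferences barely move under the perturbation, which is how the table's "Minimal" impact rows can be read.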

Key Findings

Gender bias is not a simple public–private dichotomy: Only Phi-3 exhibits the classic "public=male, private=female" pattern. Most models display fine-grained micro-spatial mappings—males are associated with leisure and autonomous spaces (garages, game rooms) while females are associated with domestic labor and caregiving spaces (kitchens, children's rooms)
Bias is embedded throughout the model development pipeline: Reward models already encode strong stereotypes; instruction tuning only partially corrects them; pre-training data itself contains corpus-level gender-space co-occurrence imbalances
Model bias far exceeds real-world distributions: Model associations point in the same direction as real-world distributions, but their magnitude is significantly amplified
Dual failure in downstream tasks: In urban planning (normative) tasks, bias distorts decisions (GPT-4's OR as low as 0.00); in user profiling (descriptive) tasks, models fail to reflect real distributions (accuracy only 5%–20%)

Highlights & Insights

Pioneering spatial-dimension bias research: Combines feminist geography theory with computational analysis, opening a new dimension in bias research. The 62-micro-space taxonomy serves as reusable infrastructure
Elegant three-layer diagnostic design: Distinguishes "genuine neutrality" from "strategic refusal"—GPT-4 refuses to answer in 24.78% of cases, yet its internal probability distribution still encodes asymmetric gender associations
Narrative role analysis reveals space-dependent gender dynamics: Private spaces reinforce traditional hierarchies (male dominance), while public spaces reverse the pattern (female narrative prominence). This spatially conditional role assignment is a novel finding
The "recognize but restrain" ideal model standard is transferable to other bias domains: models should remain neutral in normative tasks and reflect real distributions in descriptive tasks
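
The distinction between genuine neutrality and strategic refusal noted above can be made concrete by renormalizing next-token probabilities over the gender tokens only. The sketch below is an assumption-laden illustration: the token strings, the dictionary input format, and the `gender_preference` helper are hypothetical, and real APIs expose log-probabilities in their own formats.

```python
import math
from typing import Dict, Iterable


def gender_preference(logprobs: Dict[str, float],
                      male_tokens: Iterable[str] = ("male",),
                      female_tokens: Iterable[str] = ("female",)) -> Dict[str, float]:
    """Split next-token probability into the mass on gender tokens and
    the male share within that mass."""
    male_set, female_set = set(male_tokens), set(female_tokens)
    p_male = sum(math.exp(lp) for tok, lp in logprobs.items() if tok in male_set)
    p_female = sum(math.exp(lp) for tok, lp in logprobs.items() if tok in female_set)
    mass = p_male + p_female
    if mass == 0.0:
        raise ValueError("no probability mass on gender tokens")
    return {"gendered_mass": mass, "p_male_given_gendered": p_male / mass}


# Made-up distribution: the model mostly starts a refusal ("I ..."),
# but the little mass it places on gender tokens is skewed 3:1 toward "male".
example = {"I": math.log(0.90), "male": math.log(0.06), "female": math.log(0.02)}
print(gender_preference(example))
```

A low `gendered_mass` combined with a `p_male_given_gendered` far from 0.5 corresponds to the refusal-with-hidden-asymmetry pattern; a share near 0.5 would indicate genuine neutrality.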

Limitations & Future Work

Spatial vocabulary covers only urban areas, excluding suburban and rural spaces; sub-space granularity is not explored (e.g., CEO office vs. employee office)
Only English text is evaluated; spatial gender bias patterns may differ across languages and cultures
Designed on a binary gender paradigm; non-binary gender groups are not covered
Bias tracing uses the C4 corpus as a proxy, not the actual training data of all models, so findings indicate trends rather than causal relationships

Related Work

vs Occupational Gender Bias Research (Bolukbasi et al., 2016): Traditional bias research focuses on occupation–gender associations; this work extends to space–gender associations. Spatial bias is more covert yet has a greater impact on applications such as urban planning
vs Macro-Geographic Bias (Manvi et al., 2024): Prior work examines country/region-level spatial bias; this paper drills down to urban micro-space level, uncovering finer-grained patterns
vs Alignment / Debiasing Research: This work shows that RLHF and instruction tuning only partially mitigate bias; core association patterns are already embedded in pre-training data

Rating

Novelty: ⭐⭐⭐⭐⭐ First systematic study of spatial gender bias in LLMs with solid theoretical grounding
Experimental Thoroughness: ⭐⭐⭐⭐⭐ Six models, three diagnostic layers, robustness analysis, tracing experiments, and downstream validation—very comprehensive
Writing Quality: ⭐⭐⭐⭐ Clear structure, though some sections are slightly verbose
Value: ⭐⭐⭐⭐ Opens a new research direction, though concrete debiasing solutions are not yet proposed