Unified District Model

September 2025

To score new plans, we need a statistical model that relates election outcomes to districts’ latent partisanship and candidates’ incumbency status. The model lets us estimate district-level vote shares for a new map and compute the corresponding partisan gerrymandering metrics. This page describes our methodology and how we validate the model's results.

Results for uncontested elections are imputed as described in "The Impact of Partisan Gerrymandering on Political Parties" and its appendix, by Nicholas Stephanopoulos and Christopher Warshaw.

Methodology

The Big Picture

We use the correlation between the presidential vote on the one hand and state legislative or congressional votes on the other to predict how new districts will likely vote and so how biased a plan will be. Our correlations come from the last 14 years of elections and are estimated separately for state legislatures and Congress. They factor in how much each state's and election year's results might differ from others and—where appropriate—any extra advantage incumbents might have. We also allow our predictions to be imperfect by quantifying how much our method missed the actual outcomes of past elections, including the degree to which partisan tides have changed party performance from one election to the next. This enables us to generate the most accurate, data-driven, and transparent prediction we can.

This is the model we use when we have the latest geographic election data for a particular state. When we do not have such data, we use a slightly different model described here. At present we only have fully updated data for Congress, so only congressional results are reported on this methods page. See the companion page for state legislative results, and for the results of a modified version of the congressional model for states that do not have fully updated data.

The Details

We use a Bayesian hierarchical model of district-level election returns, run on either state legislatures or congressional delegations (depending on the outcome of interest), for the elections from 2011 through 2024. Formally, the model is:

where

𝑖 indexes district-level elections
𝑠 indexes states, with 𝑠(𝑖) denoting the state of district election 𝑖
𝑐 indexes election cycles, with 𝑐(𝑖) denoting the election cycle of district election 𝑖
𝑘 ∈ {1, 2} indexes covariates, with 0 identifying intercepts
𝑦_𝑖 is the Democratic share of the two-party vote in district election 𝑖
𝑿_𝑖 is a matrix of covariate values for district election 𝑖
𝛽 is a matrix of population-level intercept and slopes corresponding to covariates 𝑿
𝛽_𝑠(𝑖) and 𝛽_𝑐(𝑖) are matrices of coefficients for the state and election cycle, respectively, of district election 𝑖
𝜎_𝑦 is the residual population-level error term

The model allows the slope for all our covariates—as well as the corresponding intercept—to vary across both states and election cycles. Based on exploration of different model specifications, we allow for correlated random effects across cycles but assume no such correlation across states to facilitate convergence.
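Read together, the definitions and the random-effects structure described above are consistent with a model of roughly the following form. This is a reconstruction from the surrounding text, not a verbatim statement of the fitted model: the normal likelihood, the zero-mean distributions for the state and cycle effects, and the symbols 𝜇_𝑖, 𝐾, and Σ_𝑐 are assumptions introduced here for illustration.

```latex
\begin{aligned}
y_i &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_y\right) \\
\mu_i &= \boldsymbol{X}_i \beta + \boldsymbol{X}_i \beta_{s(i)} + \boldsymbol{X}_i \beta_{c(i)} \\
\beta_{k s} &\sim \mathcal{N}\!\left(0,\ \sigma_{\beta_k s}\right), \qquad k = 0, \dots, K \quad \text{(independent across } k\text{)} \\
\beta_{c} &\sim \operatorname{MVN}\!\left(\boldsymbol{0},\ \Sigma_c\right), \qquad
\left[\Sigma_c\right]_{jk} = \rho_{jk}\, \sigma_{\beta_j c}\, \sigma_{\beta_k c}
\end{aligned}
```

Here 𝑿_𝑖 is taken to include a leading 1 for the intercept, 𝐾 is the number of covariates in the model (1 or 2), and Σ_𝑐 collects the cycle-level standard deviations and correlations of the kind reported in the tables on this page.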

We run separate models for state legislative and congressional outcomes and with and without incumbency as a covariate. PlanScore identifies a plan as state legislative or congressional based on the number of seats in the plan and the state for which it is submitted.

𝑘 ranges between 1 and 2: if a user designates incumbency for any seat in a plan, predictions come from the model that includes both presidential vote and incumbency as covariates; if all seats are left open, predictions come from a model with only presidential vote. Presidential vote is the two-party district-level Democratic presidential vote share, centered around its global mean (0.515), while incumbency status in district election 𝑖 is coded -1 for Republican, 0 for open, and 1 for Democratic. We do not have the 2020 presidential vote for estimating new plans in two states—Kentucky and South Dakota—so we used the 2016 presidential vote in the model for those states. In the small number of remaining state-cycle combinations that were missing presidential vote we used the presidential vote for the same district in the next presidential election (or the previous presidential election where the next one was not available).
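The covariate coding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PlanScore's actual code: the function names, the dictionary keys, and the "R"/"O"/"D" labels are hypothetical, while the centering constant and the -1/0/1 coding come from the text.

```python
# Illustrative sketch only: names below are hypothetical,
# not taken from PlanScore's codebase.
GLOBAL_MEAN_PRES_VOTE = 0.515  # centering constant stated in the text

# Incumbency coding stated in the text: Republican, open, Democratic.
INCUMBENCY_CODES = {"R": -1, "O": 0, "D": 1}


def center_presidential_vote(dem_votes, rep_votes):
    """Two-party Democratic presidential share, centered on the global mean."""
    two_party_total = dem_votes + rep_votes
    if two_party_total <= 0:
        raise ValueError("two-party vote total must be positive")
    return dem_votes / two_party_total - GLOBAL_MEAN_PRES_VOTE


def build_covariates(districts):
    """Return (use_incumbency_model, rows) for a list of district dicts.

    Each dict has 'dem' and 'rep' presidential vote counts and an
    'incumbent' label found in INCUMBENCY_CODES. The incumbency model is
    chosen if any seat has a designated incumbent; otherwise only the
    presidential vote is used.
    """
    codes = [INCUMBENCY_CODES[d["incumbent"]] for d in districts]
    use_incumbency = any(code != 0 for code in codes)
    rows = []
    for d, code in zip(districts, codes):
        pres = center_presidential_vote(d["dem"], d["rep"])
        rows.append((pres, code) if use_incumbency else (pres,))
    return use_incumbency, rows


# Hypothetical two-district plan with one Democratic incumbent.
plan = [
    {"dem": 60_000, "rep": 40_000, "incumbent": "D"},
    {"dem": 45_000, "rep": 55_000, "incumbent": "O"},
]
use_incumbency, rows = build_covariates(plan)
```

Because one seat has a designated incumbent, every district in this hypothetical plan gets both covariates; if both seats were open, each row would carry only the centered presidential vote.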

When generating predictions, PlanScore draws 1000 samples from the posterior distribution of the model parameters and uses them to calculate means and probabilities. To each sample we add the offsets for the 2024 presidential election cycle, and then add random draws generated from the covariance matrix of the cycle random effects, so that the uncertainty of predicting for an unknown election cycle propagates into our predictions. The result is a prediction for an election like 2024 in most respects, but with error bounds that encompass the full range of partisan tides that occurred in the elections from 2011 through 2024.
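The prediction step can be sketched with NumPy as follows. This is a simplified illustration, not the production procedure: the "posterior" draws and the 2024 offsets are synthetic placeholders, the cycle covariance is fixed at the cycle-level point estimates reported for the congressional incumbency model rather than varying by draw, and state effects and residual error are left out. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 1000  # number of posterior samples, as stated in the text

# Placeholder "posterior" draws of [intercept, pres. vote, incumbency].
# Real draws would come from the fitted model; these are synthetic.
beta = rng.normal([0.51, 0.87, 0.04], [0.01, 0.03, 0.01], size=(n_draws, 3))

# Placeholder offsets for the 2024 cycle (hypothetical values).
cycle_2024 = rng.normal([0.0, 0.0, 0.0], 0.005, size=(n_draws, 3))

# Cycle-level covariance built from standard deviations and correlations.
sd = np.array([0.03, 0.07, 0.02])
corr = np.array([
    [1.00, -0.09, -0.37],
    [-0.09, 1.00, -0.59],
    [-0.37, -0.59, 1.00],
])
cov = np.outer(sd, sd) * corr

# Extra zero-mean cycle effects: uncertainty about an unknown future cycle.
unknown_cycle = rng.multivariate_normal(np.zeros(3), cov, size=n_draws)

# Per-draw coefficients, shape (n_draws, 3).
coef = beta + cycle_2024 + unknown_cycle

# Design matrix for two hypothetical districts:
# [1, centered presidential vote, incumbency code].
X = np.array([
    [1.0, 0.085, 1.0],   # leans Democratic, Democratic incumbent
    [1.0, -0.065, 0.0],  # leans Republican, open seat
])

vote_share = coef @ X.T                         # shape (n_draws, n_districts)
mean_share = vote_share.mean(axis=0)            # predicted Democratic share
prob_dem_win = (vote_share > 0.5).mean(axis=0)  # share of draws won
```

Adding the zero-mean cycle draws leaves the expected vote share roughly unchanged but widens its spread, which is what pulls the win probability for a moderately lopsided district away from 0 or 1.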

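Full results for the two congressional models, with and without incumbency, can be found below. Results for the state legislative models are on the companion page.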

Table 1: Congress prediction model with incumbency (𝑘 = 2)
	Estimate	95% Credible Interval
POPULATION-LEVEL
Intercept (𝛽₀)	0.51	[0.49, 0.54]
Presidential vote (𝛽₁)	0.87	[0.80, 0.93]
Incumbency (𝛽₂)	0.04	[0.02, 0.05]
STATE-LEVEL
Standard Deviations
Intercept (𝜎_{𝛽_0𝑠})	0.01	[0.01, 0.01]
Presidential vote (𝜎_{𝛽_1𝑠})	0.08	[0.06, 0.11]
Incumbency (𝜎_{𝛽_2𝑠})	0.01	[0.01, 0.01]
CYCLE-LEVEL
Standard Deviations
Intercept (𝜎_{𝛽_0𝑐})	0.03	[0.01, 0.06]
Presidential vote (𝜎_{𝛽_1𝑐})	0.07	[0.03, 0.14]
Incumbency (𝜎_{𝛽_2𝑐})	0.02	[0.01, 0.03]
Correlations
Intercept - Pres. vote (𝜌𝜎_{𝛽_0𝑐}𝜎_{𝛽_1𝑐})	−0.09	[−0.72, 0.65]
Intercept - Incumbency (𝜌𝜎_{𝛽_0𝑐}𝜎_{𝛽_2𝑐})	−0.37	[−0.87, 0.34]
Pres. vote - Incumbency (𝜌𝜎_{𝛽_1𝑐}𝜎_{𝛽_2𝑐})	−0.59	[−0.94, 0.16]
Note: Model estimated in brms for R. Model based on 4 MCMC chains run for 6000 iterations each with a 2000 iteration warm-up. All model parameters converged well with 𝑅̂ < 1.01.

Table 2: Congress prediction model without incumbency (𝑘 = 1)
	Estimate	95% Credible Interval
POPULATION-LEVEL
Intercept (𝛽₀)	0.51	[0.49, 0.53]
Presidential vote (𝛽₁)	1.04	[0.99, 1.09]
STATE-LEVEL
Standard Deviations
Intercept (𝜎_{𝛽_0𝑠})	0.02	[0.01, 0.02]
Presidential vote (𝜎_{𝛽_1𝑠})	0.08	[0.06, 0.11]
CYCLE-LEVEL
Standard Deviations
Intercept (𝜎_{𝛽_0𝑐})	0.03	[0.01, 0.05]
Presidential vote (𝜎_{𝛽_1𝑐})	0.05	[0.03, 0.11]
Correlations
Intercept - Pres. vote (𝜌𝜎_{𝛽_0𝑐}𝜎_{𝛽_1𝑐})	−0.55	[−0.95, 0.29]
Note: Model estimated in brms for R. Model based on 4 MCMC chains run for 6000 iterations each with a 2000 iteration warm-up. All model parameters converged well with 𝑅̂ < 1.01.
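To make the population-level estimates concrete, the sketch below turns them into a point prediction for a single district. It is a back-of-the-envelope illustration only: it uses the posterior means from Tables 1 and 2, ignores the state-level and cycle-level effects, and carries no uncertainty. The function name and its arguments are hypothetical.

```python
# Population-level point estimates copied from Tables 1 and 2.
WITH_INCUMBENCY = {"intercept": 0.51, "pres_vote": 0.87, "incumbency": 0.04}
WITHOUT_INCUMBENCY = {"intercept": 0.51, "pres_vote": 1.04}
GLOBAL_MEAN_PRES_VOTE = 0.515  # centering constant for presidential vote


def point_prediction(pres_share, incumbency=None):
    """Expected Democratic two-party share, ignoring state and cycle effects.

    pres_share: Democratic two-party presidential share (0 to 1).
    incumbency: -1, 0, or 1 to use the incumbency model (Table 1);
                None to use the presidential-vote-only model (Table 2).
    """
    centered = pres_share - GLOBAL_MEAN_PRES_VOTE
    if incumbency is None:
        m = WITHOUT_INCUMBENCY
        return m["intercept"] + m["pres_vote"] * centered
    m = WITH_INCUMBENCY
    return m["intercept"] + m["pres_vote"] * centered + m["incumbency"] * incumbency


# A hypothetical district with a 60% Democratic presidential vote share.
no_incumbency_model = point_prediction(0.60)
dem_incumbent = point_prediction(0.60, incumbency=1)
rep_incumbent = point_prediction(0.60, incumbency=-1)
```

For this hypothetical district the model without incumbency gives about 0.598, while the incumbency model gives about 0.624 with a Democratic incumbent and about 0.544 with a Republican incumbent.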

Predictions

The charts below show comparisons between this model’s in-sample predictions and observed historical scores for plans with at least 7 districts.

Data Sources

Precinct-level presidential vote data used by this model comes mostly from the Voting and Election Science Team at the University of Florida and Wichita State University.