Arizona Poll Prediction for Remaining Votes using Bayesian Inference

wu am i
Nov 9, 2020

Will Arizona flip? With fewer than 80K votes remaining to be counted and Biden's lead still falling, what will the final outcome be?

Here is the live update: https://alex.github.io/nyt-2020-election-scraper/battleground-state-changes.html. We see the lead falling while the remaining votes also come down to less than 100K, and the gain/loss in the lead follows an almost linear trend.

MS Excel would do the job of fitting a regression line, but where's the fun without complex analytics tools?! So I decided to use Bayesian inference, and as a newbie I thought this would be a good learning experience as well!
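For comparison, here is a minimal sketch of the plain regression line a spreadsheet would fit, written in pure Python. It uses only the five sample rows quoted in this article, so the numbers it prints are illustrative and not the final estimate; the helper name `fit_line` is made up for this example.

```python
# Sample rows quoted in this article: (leading by, to be counted).
# Only five rows, so the fitted values are illustrative, not a forecast.
rows = [
    (16985.0, 81209.0),
    (16952.0, 81536.0),
    (20102.0, 98275.0),
    (20102.0, 110925.0),
    (19348.0, 117787.0),
]


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


to_count = [r[1] for r in rows]  # x: votes still to be counted
lead = [r[0] for r in rows]      # y: current lead
intercept, slope = fit_line(to_count, lead)
print(f"lead at 0 votes left: {intercept:.0f}, slope: {slope:.4f}")
```

The intercept is the fitted lead when no votes remain, which is the same quantity the Bayesian model estimates, except that there it comes with an uncertainty.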

Here we go! First, plot the original data.

import numpy as np
import matplotlib.pyplot as plt

# sample rows of the data: column 0 is "leading by", column 1 is "to be counted"
df = np.array([[ 16985.,  81209.],
               [ 16952.,  81536.],
               [ 20102.,  98275.],
               [ 20102., 110925.],
               [ 19348., 117787.]])
x = df[:, 1]  # to be counted
y = df[:, 0]  # leading by
fig, ax = plt.subplots()
ax.scatter(x, y)
# reverse the axis for better readability
ax.yaxis.set_ticks_position("right")
ax.yaxis.set_label_position("right")
ax.invert_xaxis()
plt.xlabel('To be Counted', fontsize=12)
plt.ylabel('Leading by', fontsize=12)
ax.xaxis.set_tick_params(labelsize=10)
ax.yaxis.set_tick_params(labelsize=10)

There is a nice trend! But where will it end? We use a PyMC3 model to do Bayesian inference, and we are only interested in the 'intercept': the point where the trend line crosses the y-axis, that is, the lead when no votes remain to be counted!

from pymc3 import HalfNormal, Model, Normal

# we just need the top 50 rows to minimize noise/outliers
Y = df[:50, 0]  # leading by
X = df[:50, 1]  # to be counted
with Model() as basic_model:
    # prior mean and sigma; a sigma less than 200 is not accepted by the function!
    sd = 300
    mu = 0
    # Priors for unknown model parameters
    intercept = Normal('intercept', mu=mu, sigma=sd)
    beta = Normal('beta', mu=mu, sigma=sd)
    sigma = HalfNormal('sigma', sigma=sd)
    # Expected value of outcome
    mu_obs = intercept + beta * X
    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu_obs, sigma=sigma, observed=Y)
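Written out as math, the model above reads roughly as follows. This is just a restatement of the code, with the prior mean 0 and scale 300 taken from it; the index i runs over the rows used.

```latex
\begin{aligned}
\text{intercept} &\sim \mathcal{N}(0,\ 300^2) \\
\beta &\sim \mathcal{N}(0,\ 300^2) \\
\sigma &\sim \text{HalfNormal}(300) \\
Y_i &\sim \mathcal{N}(\text{intercept} + \beta X_i,\ \sigma^2)
\end{aligned}
```

Here X_i is the number of votes still to be counted and Y_i is the lead, so the intercept is the expected lead at X = 0.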

Then we run the sampler to generate the posterior samples.

from pymc3 import sample

with basic_model:
    # draw 2000 posterior samples per chain, after 2000 tuning steps
    trace = sample(2000, tune=2000)
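To get a feel for what a sampler does, here is a toy random-walk Metropolis sketch in pure Python. It is not what PyMC3 runs (PyMC3 uses the more efficient NUTS sampler by default for continuous parameters), and the data, the fixed noise level, and the function names are all made up for illustration. It estimates a single mean under a Normal(0, 300) prior.

```python
import math
import random


def log_posterior(mu, data, prior_sd=300.0, noise_sd=2.0):
    """Unnormalized log posterior: Normal(0, prior_sd) prior on mu,
    Normal(mu, noise_sd) likelihood with a known (assumed) noise level."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - mu) / noise_sd) ** 2 for y in data)
    return log_prior + log_lik


def metropolis(data, n_draws=5000, n_tune=1000, step=1.0, seed=0):
    """Random-walk Metropolis: propose a nearby value, accept it with
    probability min(1, posterior ratio), otherwise stay where we are."""
    rng = random.Random(seed)
    current = 0.0
    current_lp = log_posterior(current, data)
    draws = []
    for i in range(n_tune + n_draws):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= n_tune:  # discard the tuning (warm-up) steps
            draws.append(current)
    return draws


toy_data = [9.0, 11.0, 10.0, 12.0, 8.0]  # made-up toy observations
draws = metropolis(toy_data)
posterior_mean = sum(draws) / len(draws)
print(f"posterior mean of mu: {posterior_mean:.2f}")
```

The list of draws plays the same role as the trace: its histogram approximates the posterior, and its mean and spread give the summary statistics.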

Posterior Analysis

In the trace plot, the left column shows a smoothed histogram of the marginal posterior of each parameter ('intercept', 'beta', and so on), while the right column shows the samples of the Markov chain plotted in sequential order.

In addition, here is a summary of common posterior statistics:

import arviz as az
az.summary(trace, round_to=2)

In the above summary, we are only interested in the 'intercept'. Its mean is 668.3 and its SD is 290.26, so, allowing 2 SD on either side, the final lead would range from 668.3 - 2 x 290.26 to 668.3 + 2 x 290.26, i.e. from a low of 87.78 to a high of 1248.82, or, rounding off, 88 to 1249!
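The arithmetic can be checked with a few lines of Python. The mean and SD are the values reported above; the variable names are just for this example.

```python
# Posterior mean and SD of the intercept, as reported in the summary above
intercept_mean = 668.3
intercept_sd = 290.26

# Mean plus or minus two standard deviations
low = intercept_mean - 2 * intercept_sd
high = intercept_mean + 2 * intercept_sd
print(f"final lead between {low:.2f} and {high:.2f}")
# prints: final lead between 87.78 and 1248.82
```

Since the low end of the range stays above zero, the lead is expected to survive the remaining count.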

Yes, Biden still wins!

That's all! And here is a visualization of the same.

from pymc3 import sample_posterior_predictive

with basic_model:
    ppc = sample_posterior_predictive(
        trace, var_names=["intercept", "beta", "Y_obs"]
    )

The plotting code is adapted from the PyMC3 samples:

# let's draw the regression line beyond 0 votes to be counted for better readability
X1 = np.asarray([450000., 300000., 100000., 0., -10000.])
# one regression line per posterior draw, evaluated at each point in X1
mu_pp = (ppc["intercept"] + ppc["beta"] * X1[:, None]).T
fig, ax = plt.subplots()
ax.plot(X, Y, "o", ms=4, alpha=0.4, label="Data")
ax.plot(X1, mu_pp.mean(0), label="Mean outcome", alpha=0.6)
az.plot_hpd(
    X1,
    mu_pp,
    ax=ax,
    fill_kwargs={"alpha": 0.8, "label": "Mean outcome 94% HPD"},
)
az.plot_hpd(
    X,
    ppc["Y_obs"],
    ax=ax,
    fill_kwargs={"alpha": 0.8, "color": "#a1dab4", "label": "Outcome 94% HPD"},
)
# reverse the axis for better readability
ax.yaxis.set_ticks_position("right")
ax.yaxis.set_label_position("right")
ax.invert_xaxis()
plt.xlabel('To be Counted', fontsize=12)
plt.ylabel('Leading by', fontsize=12)
ax.xaxis.set_tick_params(labelsize=10)
ax.yaxis.set_tick_params(labelsize=10)

Visualization of the final outcome!

That's all! Here is the Jupyter notebook.
