Arizona Poll Prediction for Remaining Votes using Bayesian Inference

3 min read · Nov 9, 2020

Will Arizona flip? With fewer than 80K votes remaining to be counted and Biden’s lead still falling, what will the final outcome be?

Here is the live update: https://alex.github.io/nyt-2020-election-scraper/battleground-state-changes.html. We see the lead falling while the remaining votes also come down to less than 100K, and the gain/loss follows an almost linear trend line.

MS Excel would do the job of fitting a regression line, but what’s the fun without using complex analytics tools?! So I decided to use Bayesian inference, and as a newbie I thought this would be a good learning experience as well!
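For reference, the plain regression line that Excel would fit can be sketched in a few lines of standard-library Python. This is only a rough baseline under stated assumptions: it uses just the five sample rows listed later in this post rather than the full data, the variable names are made up for illustration, and it is ordinary least squares, not the Bayesian model built below, so its numbers need not agree with the posterior summary.

```python
# Five sample rows from this post: (leading by, to be counted)
rows = [
    (16985.0, 81209.0),
    (16952.0, 81536.0),
    (20102.0, 98275.0),
    (20102.0, 110925.0),
    (19348.0, 117787.0),
]
lead = [r[0] for r in rows]
remaining = [r[1] for r in rows]

n = len(rows)
mean_x = sum(remaining) / n
mean_y = sum(lead) / n

# Closed-form ordinary least squares for: lead = intercept + slope * remaining
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(remaining, lead))
sxx = sum((x - mean_x) ** 2 for x in remaining)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# The intercept is the fitted lead when zero votes remain to be counted
print(f"slope={slope:.4f}, intercept={intercept:.1f}")
```

The intercept printed here plays the same role as the ‘intercept’ parameter in the model that follows: the extrapolated lead once nothing is left to count.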

Here we go! First, plot the original data:

import numpy as np
import matplotlib.pyplot as plt

# sample rows of df: column 0 is leading by, column 1 is to be counted
df = np.array([[ 16985.,  81209.],
               [ 16952.,  81536.],
               [ 20102.,  98275.],
               [ 20102., 110925.],
               [ 19348., 117787.]])
# x is to be counted and y is leading by, matching the axis labels below
x, y = df[:, 1], df[:, 0]

fig, ax = plt.subplots()
plt.scatter(x, y)
# reverse the x-axis and move the y-axis to the right for better readability
ax.yaxis.set_ticks_position("right")
ax.yaxis.set_label_position("right")
ax.invert_xaxis()
plt.xlabel('To be Counted', fontsize=12)
plt.ylabel('Leading by', fontsize=12)
ax.xaxis.set_tick_params(labelsize=10)
ax.yaxis.set_tick_params(labelsize=10)

There is a nice trend, but where will it end? We use a PyMC3 model to do Bayesian inference, and we are only interested in the ‘intercept’: the point where the trend line crosses the y-axis, i.e. the lead when no votes remain to be counted.

from pymc3 import Model, Normal, HalfNormal, sample

# we just need the top 50 rows to minimize noise/outliers
Y = df[:50, 0]  # leading by
X = df[:50, 1]  # to be counted
with Model() as basic_model:
    # initial values; sigma less than 200 is not accepted by the function!
    sd = 300
    mu = 0
    # Priors for unknown model parameters
    intercept = Normal('intercept', mu=mu, sigma=sd)
    beta = Normal('beta', mu=mu, sigma=sd)
    sigma = HalfNormal('sigma', sigma=sd)
    # Expected value of outcome
    mu = intercept + beta * X
    # Likelihood (sampling distribution) of observations
    Y_obs = Normal('Y_obs', mu=mu, sigma=sigma, observed=Y)
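Written out as math, the model above is a simple linear regression with wide priors. The block below only restates what the code specifies; the subscript i (one row of the data) is notation introduced here for illustration.

```latex
\begin{aligned}
\text{intercept} &\sim \mathcal{N}(0,\ 300^2) \\
\beta &\sim \mathcal{N}(0,\ 300^2) \\
\sigma &\sim \text{HalfNormal}(300) \\
Y_i &\sim \mathcal{N}(\text{intercept} + \beta X_i,\ \sigma^2)
\end{aligned}
```

Here X_i is the number of votes still to be counted and Y_i is the lead at that point, so the ‘intercept’ is the expected lead at X = 0.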

Then we run the sampler to generate the posterior samples:

with basic_model:
    # draw 2000 posterior samples per chain after 2000 tuning steps
    trace = sample(2000, tune=2000)

Posterior Analysis

In the trace plot, the left column shows a smoothed histogram of the marginal posterior of each parameter (‘intercept’, ‘beta’), while the right column shows the samples of the Markov chain plotted in sequential order.

In addition, here is a summary of common posterior statistics:

import arviz as az
az.summary(trace, round_to=2)

In the above summary, we are only interested in the ‘intercept’. Its mean is 668.3 and its SD is 290.26, so, allowing 2 SD either side, the final outcome would range from 668.3 − 2 × 290.26 to 668.3 + 2 × 290.26. That is, the final lead would be between a low of 87.78 and a high of 1248.82, or 88 to 1249 after rounding!
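The arithmetic above can be checked with a short standard-library sketch. The last two lines go one step further and rest on an assumption not made in the post: that the posterior of ‘intercept’ is roughly normal, in which case about 95% of the mass lies within 2 SD of the mean.

```python
from statistics import NormalDist

# Posterior mean and SD of 'intercept', as quoted from the summary above
mean, sd = 668.3, 290.26

low = mean - 2 * sd
high = mean + 2 * sd
print(f"2 SD range: {low:.2f} to {high:.2f}")

# Assumption: the posterior of 'intercept' is approximately normal
coverage = NormalDist().cdf(2) - NormalDist().cdf(-2)  # mass within 2 SD of the mean
p_positive = 1 - NormalDist(mean, sd).cdf(0)           # chance the final lead stays above zero
print(f"coverage={coverage:.3f}, P(lead > 0)={p_positive:.3f}")
```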

Yes, Biden still wins!

And here is a visualization of the same:

from pymc3 import sample_posterior_predictive

with basic_model:
    ppc = sample_posterior_predictive(
        trace, var_names=["intercept", "beta", "Y_obs"]
    )

The plotting code is adapted from the PyMC3 examples:

# let's draw the regression line beyond 0 to be counted for better readability
X1 = np.asarray([450000., 300000., 100000., 0., -10000.])
mu_pp = (ppc["intercept"] + ppc["beta"] * X1[:, None]).T

fig, ax = plt.subplots()
ax.plot(X, Y, "o", ms=4, alpha=0.4, label="Data")
ax.plot(X1, mu_pp.mean(0), label="Mean outcome", alpha=0.6)
az.plot_hpd(
    X1,
    mu_pp,
    ax=ax,
    fill_kwargs={"alpha": 0.8, "label": "Mean outcome 94% HPD"},
)
az.plot_hpd(
    X,
    ppc["Y_obs"],
    ax=ax,
    fill_kwargs={"alpha": 0.8, "color": "#a1dab4", "label": "Outcome 94% HPD"},
)
# reverse the x-axis and move the y-axis to the right for better readability
ax.yaxis.set_ticks_position("right")
ax.yaxis.set_label_position("right")
ax.invert_xaxis()
plt.xlabel('To be Counted', fontsize=12)
plt.ylabel('Leading by', fontsize=12)
ax.xaxis.set_tick_params(labelsize=10)
ax.yaxis.set_tick_params(labelsize=10)
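To see what mu_pp holds in the code above, here is the same computation in plain Python on a toy scale. The three posterior draws are invented purely for illustration and are not taken from the real trace; the variable names are hypothetical too.

```python
# Hypothetical posterior draws, invented for illustration only
intercept_draws = [600.0, 700.0, 800.0]
beta_draws = [0.10, 0.20, 0.30]
grid = [100000.0, 0.0, -10000.0]  # values of "to be counted"

# One regression line per draw, evaluated on the grid
# (rows are draws, columns are grid points, like mu_pp)
lines = [[a + b * x for x in grid] for a, b in zip(intercept_draws, beta_draws)]

# "Mean outcome": average over the draws at each grid point, like mu_pp.mean(0)
mean_line = [sum(col) / len(col) for col in zip(*lines)]
print(mean_line)
```

At the grid point 0 the mean line is simply the average of the intercept draws, which is why the plot and the ‘intercept’ row of the summary tell the same story.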

That’s all! Here is the Jupyter notebook:

dsivakumar/bayesian

github.com

Written by wu am i
