Piet Stam - The math of risk equalization

Meet & greet

What do you expect to learn from this course?
My agenda
- First, an intro to the math
- Second, applying it to some data
R package rvedata based on my PhD thesis

Context

🇳🇱 health care (basic benefits)
🇳🇱 health insurance
🇳🇱 system of risk equalization
Traditional focus on incentives
- NOT actual behavior
- NOT effects (efficiency & equity)

Data collection

Large national data set
Population vs. sample
Weights for insurance period
Multiple records per insured
Pseudonyms for merging data sets

Which are the “acceptable costs”?

“The costs of services that follow from a quality, intensity and price level of treatment that the sponsor considers to be acceptable to be subsidized.” (Van de Ven and Ellis, 2000)

Two extremes:
- Best practice costs
- Actual expenditures
Q: which is more health based?
🇳🇱: Y = actual expenditures with average prices for some services

Which subgroups to compensate?

“The REF equation should only include parameters which equalize cost differences in health status of an insured as a consequence of differences in age, gender and other objective measures of health status.” (Health Insurance Decree:389, p.23)

Compensation for S(olidarity)-type groups

Age
Gender
Health status

No compensation for
N(on-solidarity)-type groups

Propensity to consume
Input prices
Regional overcapacity (supplier-induced demand, SID)
Provider practice style

The regression equation

$$Y = \sum_j \beta_j S_j + \sum_k \gamma_k N_k + \varepsilon$$

with

$Y$ are the health expenses observed during some period in time,
$S_j$ is the $j$th S-type risk factor,
$N_k$ is the $k$th N-type risk factor,
()

Which causal diagram?

Big assumption

Define and rewrite

What to do if assumption fails?

Schokkaert and Van de Voorde () recommend a 2-step method:

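estimate $(\beta, \gamma)$ in a regression with both the S-type and the N-type variables
predict $Y$ with the N-type variables set at their prevalences

The formula then reads as follows:

$$\hat{Y} = S\hat{\beta} + \bar{N}\hat{\gamma}$$

with $\hat{\beta}$ and $\hat{\gamma}$ the estimated coefficients of the S-type and N-type variables, and $\bar{N}$ being a row vector of prevalences instead of a matrix.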
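The two-step method can be sketched on toy data. This is a minimal illustration, not the author's implementation: the data are made up, the helper names `solve` and `ols` are hypothetical, and the N-type variable is assumed to be a single regional overcapacity dummy that is correlated with age.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares without intercept via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)


# Hypothetical toy data: (young, old, overcapacity, number of insured).
# Old insured live more often in the overcapacity region, so S and N are correlated.
cells = [(1, 0, 0, 3), (1, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 3)]
rows, y = [], []
for young, old, n, count in cells:
    for _ in range(count):
        rows.append([young, old, n])
        y.append(1000 * young + 1500 * old + 400 * n)

# Step 1: estimate the coefficients of the S-type and N-type variables together.
b_young, b_old, g_n = ols(rows, y)

# Step 2: predict with the N-type variable set at its prevalence for everyone.
n_bar = sum(r[2] for r in rows) / len(rows)
norm_young = b_young + g_n * n_bar
norm_old = b_old + g_n * n_bar

# For comparison: leave N out and regress on the S-type dummies only.
naive_young, naive_old = ols([r[:2] for r in rows], y)

print(round(norm_young), round(norm_old))    # two-step payments: 1200 1700
print(round(naive_young), round(naive_old))  # S-only regression: 1100 1800
```

In this toy setting the S-only regression shifts part of the overcapacity effect onto the age coefficients, while the two-step prediction gives every insured the same average N-type effect.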

Or… ignore this omitted-variable bias

In practice, we apply this equation:

$$Y = S\beta + \varepsilon$$

and try to extend $S$ with as many (measurable) S-type variables as possible.

Regression without an intercept

Traditional OLS:

include an intercept
omit one category of age/gender
omit one category of each other $X$ (which one?)

OLS w/ risk equalization:

do not include an intercept
include all categories of all other $X$'s
set total effect of age/gender := sum of $Y$
set total effect of each other $X$ := 0
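A small worked example may help to see that the two parametrizations describe the same model. The numbers are hypothetical, and the restriction is read here as a prevalence-weighted sum of coefficients (an assumption about the exact form of the constraint). Suppose traditional OLS, with "young" and "healthy" as omitted categories, yields an intercept of 1000, an "old" effect of 500 and a "sick" effect of 2000, and 10% of the insured are sick.

```latex
\begin{align*}
\text{Traditional OLS:}\quad
  \hat{Y} &= 1000 + 500\,\text{old} + 2000\,\text{sick} \\[4pt]
\text{Re-centre health status:}\quad
  \bar{\delta} &= 0.90 \cdot 0 + 0.10 \cdot 2000 = 200 \\
  \beta_{\text{healthy}} &= 0 - 200 = -200, \qquad
  \beta_{\text{sick}} = 2000 - 200 = 1800 \\
  0.90 \cdot (-200) + 0.10 \cdot 1800 &= 0 \\[4pt]
\text{Age absorbs the intercept:}\quad
  \beta_{\text{young}} &= 1000 + 200 = 1200, \qquad
  \beta_{\text{old}} = 1000 + 500 + 200 = 1700 \\[4pt]
\text{Check (old and sick):}\quad
  1000 + 500 + 2000 &= 3500 = 1700 + 1800 \\
\text{Check (young and healthy):}\quad
  1000 &= 1200 - 200
\end{align*}
```

Predictions are identical in both forms. In the second form the age coefficients carry the full average expenses, and the health status coefficients read directly as surcharges and discounts that cancel out over the population.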

Apply weights

Weights define length of insurance contract
Potential reasons for $W < 1$:
- 2 or more records for 1 individual -> sum Y and X
- babies born
- people deceased

Use aggregation to save computer time

“Vertical aggregation” for each unique combination of X
Total number of rows = number of unique combinations
W := sum of observations for each unique combination
Y := average expenses for each unique combination
X := set of prevalences for each unique combination
OLS estimation using these W, Y and X
See my blog for a simple example
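The aggregation steps above can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's code: the records are made up, the helper names `solve` and `wls` are hypothetical, W is taken as the sum of the insurance-period weights per unique combination, and Y as the weighted average expenses.

```python
from collections import defaultdict


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def wls(rows, y, w):
    """Weighted least squares without intercept: solve (X'WX) b = X'Wy."""
    k = len(rows[0])
    xtwx = [[sum(wi * r[a] * r[b] for r, wi in zip(rows, w)) for b in range(k)]
            for a in range(k)]
    xtwy = [sum(wi * r[a] * yi for r, yi, wi in zip(rows, y, w)) for a in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical records: ((young, old, chronic), weight, expenses).
records = [
    ((1, 0, 0), 1.0, 900.0),
    ((1, 0, 0), 0.5, 1300.0),
    ((1, 0, 1), 1.0, 2500.0),
    ((0, 1, 0), 1.0, 1600.0),
    ((0, 1, 0), 1.0, 1400.0),
    ((0, 1, 1), 0.25, 4000.0),
    ((0, 1, 1), 1.0, 3000.0),
]

# Estimation on the individual records.
beta_individual = wls(
    [list(x) for x, _, _ in records],
    [y for _, _, y in records],
    [w for _, w, _ in records],
)

# Vertical aggregation: one row per unique combination of X.
sum_w = defaultdict(float)
sum_wy = defaultdict(float)
for x, w, y in records:
    sum_w[x] += w
    sum_wy[x] += w * y

agg_x = [list(x) for x in sum_w]
agg_w = [sum_w[x] for x in sum_w]              # W: total weight per combination
agg_y = [sum_wy[x] / sum_w[x] for x in sum_w]  # Y: weighted average expenses

# Estimation on the aggregated rows, weighted by W.
beta_aggregated = wls(agg_x, agg_y, agg_w)

print([round(b, 2) for b in beta_individual])
print([round(b, 2) for b in beta_aggregated])
```

Both estimations lead to the same normal equations, so the coefficients agree up to rounding error while the aggregated data set has only as many rows as there are unique combinations of X.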

Region: individual & zip code data

In 2002 a two-step approach was implemented:
- step 1: (indiv. level)
- step 2: (zip-code level)
As step 2 can be read as:
- step 2:
Implicit restriction: and not correlated
If this assumption is false, the estimators are inconsistent
Therefore, was added to step 2 since the 2006 model
Nowadays, one comprehensive regression at indiv. level

(Ex post) risk sharing

Definition: insurers are retrospectively reimbursed for some of the costs of some of their insurance members (Van de Ven and Ellis 2000)

Table risk sharing methods

Assessment framework

Table assessment framework

Install package `rvedata`

Source: https://github.com/risicoverevening/rvedata
Metadata rvedata

https://pietstam.nl/talks