June 2, 2005

Propensity score method and causal inference

Two days ago, an Ohio high school graduate shot members of his family and friends to death on his graduation day. Two months ago, a Minnesota high school student killed several of his classmates. Why did these tragedies happen? We all want to know.

The May 27 issue of Science published a report suggesting that earlier exposure to firearm violence could cause later serious violent behavior. This is not very surprising, as other studies have already reached similar conclusions. However, the strong word “cause,” which the authors from the University of Michigan used in the title, unnerves many people (and that is probably why the report was accepted by Science, which seldom publishes social science reports).

Given recent lessons from hormone replacement therapy, all researchers are skeptical of any strong conclusions drawn from observational studies. How, then, could the authors claim a “causal effect” from an observational study?

The magic, as advertised in the paper, was the propensity score method.

The propensity score method was proposed by D. B. Rubin and his colleagues about 20 years ago. It was underused for a long time but is becoming popular these days. Here is an outline of the method in the context of the firearm exposure study.

Three assessments were conducted at two-year intervals over five years among adolescents from 78 Chicago neighborhoods.

At assessment 1, demographic, socioeconomic, behavioral and psychological, and health-related factors, together with neighborhood characteristics, were assessed. These covariates were used to develop the propensity score.

At assessment 2, the firearm exposure status of these adolescents was obtained. Stepwise logistic regressions on the covariates from the first assessment were used to predict the probability of exposure. The estimated probability is the propensity score. Thus, hundreds of covariates were reduced to one variable: the propensity score. Participants were then grouped into 12 strata based on the propensity score (that is too many; usually one creates only five groups).
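To make the two steps in this paragraph concrete, here is a minimal sketch in plain Python: fit a logistic regression of exposure on baseline covariates, take the fitted probabilities as propensity scores, and cut them into equal-sized strata. It is not the authors' procedure. It uses ordinary logistic regression fitted by gradient ascent with no stepwise selection, the data are simulated, and every name (`fit_logistic`, `propensity_scores`, `assign_strata`) is hypothetical.

```python
import math
import random


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, z, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient ascent.

    X is a list of covariate rows, z a list of 0/1 exposure indicators.
    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            eta = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = zi - sigmoid(eta)
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated probability of exposure for each row of X."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]


def assign_strata(scores, n_strata=5):
    """Assign each score to a quantile stratum 0..n_strata-1 by rank.

    Tied scores are split by sort order, which is arbitrary.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(scores)
    return strata


# Simulated (hypothetical) data: two baseline covariates, one binary exposure.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(300)]
z = [1 if random.random() < sigmoid(-1 + x1 + 0.5 * x2) else 0 for x1, x2 in X]

beta = fit_logistic(X, z)
scores = propensity_scores(X, beta)
strata = assign_strata(scores, n_strata=5)  # five groups, as suggested above
```

With real data one would normally use a statistics package rather than a hand-written fit; the point here is only that many covariates collapse into one score, which is then cut into a handful of groups.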

At assessment 3, the outcome was ascertained: perpetrators were defined as those who experienced serious firearm violence during the last 12 months. A stratified analysis by propensity score was then conducted to estimate the magnitude of the association (for example, as an odds ratio). One can also use regression to adjust for residual confounding, e.g., a small imbalance of covariates within propensity score strata.
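As an illustration of what a stratified analysis can look like, the sketch below pools stratum-specific 2x2 tables into a single odds ratio with the Mantel-Haenszel estimator. This is one common choice and is my assumption, not necessarily what the paper did. The counts are made up and the function name is hypothetical.

```python
def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator).

    Each table is (a, b, c, d):
      a = exposed with outcome,   b = exposed without outcome,
      c = unexposed with outcome, d = unexposed without outcome.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        num += a * d / n
        den += b * c / n
    if den == 0.0:
        raise ValueError("pooled odds ratio is undefined (denominator is zero)")
    return num / den


# Hypothetical counts for three propensity score strata.
tables = [
    (2, 8, 5, 85),
    (4, 16, 6, 74),
    (6, 14, 8, 72),
]
print(round(mantel_haenszel_or(tables), 2))  # about 3.62
```

Because each stratum compares exposed and unexposed participants with similar propensity scores, the pooled estimate is adjusted for the covariates that went into the score.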

The propensity score method is intuitively appealing and has several advantages over model-based analysis, such as adjusting for all covariates in one regression.

First, the propensity score summarizes many confounding factors (confounders), which are related to both the exposure and the outcome. By explicitly exploring the relationship between exposure and confounders, one may discover imbalance among these variables and rectify it.

Second, by conducting a stratified analysis of the outcome, one does not have to assume any particular form of association (e.g., linear) between the outcome and the confounders (in particular, the joint distribution of the confounders).

Third, one doesn’t have to worry too much about how to adjust for hundreds of covariates in the outcome analysis. In a traditional regression analysis, too many covariates make the data too sparse and may require a large number of outcome events. Using the propensity score, one essentially reduces the number of covariates. Note that in developing the propensity score, we usually have enough exposed participants to serve as “outcomes.”

Fourth, and most importantly, post-stratifying data on the propensity score is analogous to constructing a randomized experiment within an observational study. By balancing the propensity score between the exposed and unexposed groups, one essentially creates comparable groups similar to those in randomized trials. As randomized trials are more valuable than observational studies for causal inference, this feature is certainly desirable.
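Whether stratification really produces comparable groups can be checked covariate by covariate. A minimal sketch follows, assuming the standardized mean difference (difference in means divided by a pooled standard deviation) as the balance measure. That measure is my choice for illustration and is not taken from the paper, and the function names are hypothetical.

```python
import math
import statistics


def standardized_difference(exposed, unexposed):
    """Difference in means divided by the pooled standard deviation."""
    diff = statistics.mean(exposed) - statistics.mean(unexposed)
    pooled_sd = math.sqrt(
        (statistics.variance(exposed) + statistics.variance(unexposed)) / 2.0
    )
    if pooled_sd == 0.0:
        # No spread in either group: balanced only if the means coincide.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


def balance_by_stratum(x, z, strata):
    """Standardized difference of covariate x within each stratum.

    x: covariate values, z: 0/1 exposure, strata: stratum label per person.
    Strata with fewer than two exposed or two unexposed members map to None,
    because a sample variance cannot be computed there.
    """
    result = {}
    for s in sorted(set(strata)):
        exposed = [xi for xi, zi, si in zip(x, z, strata) if si == s and zi == 1]
        unexposed = [xi for xi, zi, si in zip(x, z, strata) if si == s and zi == 0]
        if len(exposed) < 2 or len(unexposed) < 2:
            result[s] = None
        else:
            result[s] = standardized_difference(exposed, unexposed)
    return result
```

If the within-stratum differences are much closer to zero than the overall difference, the strata are behaving like small randomized comparisons for that covariate. Large remaining differences call for finer strata or further regression adjustment.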

Now back to the firearm study. Did this study provide enough evidence to suggest a cause-and-effect link between firearm exposure and subsequent serious violence? There are many standards for assessing causal inference, but let’s examine a few essential criteria relevant to observational studies and to this study.

First, the risk factor must be associated with the outcome. In this study, the statistical significance of the results seems to support this. (Warning: a lack of statistical significance doesn’t mean the relationship is not causal.)

Second, the factor must occur before the outcome. This seems obvious but has often been overlooked. In this study, the exposure did occur before the outcome assessment. However, the exposure is not static in nature. That is, those unexposed at the second assessment could be exposed to firearms during the following years, and vice versa. Nonetheless, because those with the outcome were exposed to firearms by definition, exposure switches that occurred in the unexposed group were more likely to attenuate the association than to strengthen it (if we believe there is a positive association). Therefore, there is no need to worry about this criterion either.

Third, is it consistent with other studies? Yes, the results from this study were consistent with conclusions from many previous studies.

Fourth, are there any experimental studies that can confirm the results? Well, there is no way to conduct experimental studies on this kind of problem. The authors took indirect routes, such as the propensity score method, to construct an “experimental” study. However, the propensity score only balances the known confounders. Unobserved factors are not accounted for by the method. Furthermore, although the paper considered more than one hundred candidate variables for the propensity score model, the authors used stepwise logistic regression to select only 48 covariates (including quadratic terms) for the final model, which is questionable. Unfortunately, they didn’t provide a goodness-of-fit statistic for the final logistic regression, so we don’t know how well the estimated propensity score reflects the true probability of exposure.

In addition, the propensity score method is most useful in large datasets, which can provide sufficient exposed and unexposed observations within each propensity stratum (i.e., overlapping observations across the propensity score). However, because 20% of participants dropped out of the study after the second assessment, there were only 210 exposed participants at the third assessment. Moreover, because this study used too many propensity strata (12), the sample size in each stratum was too small and severely unbalanced in exposure status.
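The overlap concern is easy to check before any outcome analysis: count exposed and unexposed participants in each stratum and flag the thin ones. The sketch below is a minimal version. The function names are hypothetical, and the default threshold of 10 per group is an arbitrary assumption, not a rule from the paper.

```python
def stratum_counts(z, strata):
    """Return {stratum: (n_exposed, n_unexposed)}.

    z: 0/1 exposure per person, strata: stratum label per person.
    """
    counts = {}
    for zi, s in zip(z, strata):
        exposed, unexposed = counts.get(s, (0, 0))
        if zi:
            counts[s] = (exposed + 1, unexposed)
        else:
            counts[s] = (exposed, unexposed + 1)
    return counts


def sparse_strata(counts, min_per_group=10):
    """Strata where either exposure group has fewer than min_per_group members."""
    return sorted(
        s for s, (exposed, unexposed) in counts.items()
        if exposed < min_per_group or unexposed < min_per_group
    )


# Tiny hypothetical example: stratum 1 has no unexposed members at all.
counts = stratum_counts([1, 0, 0, 1, 1], [0, 0, 0, 1, 1])
print(sparse_strata(counts, min_per_group=1))  # [1]
```

With only 210 exposed participants spread over 12 strata, a check like this would likely flag several strata, which is the worry raised above. Using fewer strata is the simplest remedy.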

The outcome analysis in this report is also questionable. It is possible that some covariates were not significantly related to the exposure in the stepwise regressions above but are significantly related to the outcome by themselves. Theoretically, covariates unrelated to the exposure (i.e., orthogonal to it) should not affect the relationship between exposure and outcome. However, reality is far more complicated than statistical theory. Different combinations of variable sets may yield different answers.

Fifth, are there any biological, psychological, or social theories for the association? Well, sort of. For example, social learning theory (learning by observing) is a good candidate. However, given the complexity of the social phenomenon, the propensity score adjustment seems too parsimonious. In fact, because so many socioeconomic and psychological factors are related to violence, and because these relationships are naturally dynamic, the regression methods used to form the propensity score and to assess the outcome are inevitably inadequate.

Overall, although asserting a “causal” relationship between violence exposure and subsequent serious violent behavior somewhat overstretched the truth, the report did advance our knowledge of this arresting social phenomenon. Its methodology was better than that of many previous reports. Nevertheless, for any causal factor whose effects are intertwined with a myriad of others, it is never easy to reach a definite conclusion.

