Comments for Manuscript "Readmission Penalties and Health Insurance Expansions: A Cautionary Dispatch From Massachusetts."

Reviewer Comments
Reviewer Number: 1
The manuscript reports an interesting study of a ‘natural experiment’ resulting from the implementation of a universal health insurance program in Massachusetts. The study raises questions regarding unintended consequences of policy implementation, in particular regarding the interpretation of selective quality indicators before, during, and following implementation of insurance reform.

The precautions raised by the authors are potentially important and bear public scrutiny. However, the authors may wish to consider some of the following suggestions for sharpening the presentation.
•    Introduction
o    The second paragraph is a bit opaque in developing a rationale for the hypothesis. Indeed, it appears that the hypothesis might have been arrived at post hoc.
•    Methods
o    Minimal data are provided to inform the reader how or why the authors chose to divide hospitals into quartiles and to compare one quartile against the other three. Based on information available, this important dichotomization seems somewhat arbitrary. Perhaps a figure illustrating the distribution of hospitals would be more convincing.
o    This reviewer is not trained as a statistician; therefore, the following critique may not be appropriate. However, I wonder if there might be more appropriate methodologies to assess trends over time among different cohorts than the difference-in-differences analysis. At a minimum, the authors should provide a rationale for selecting this method of analysis.
o    The authors might want to consider a descriptor other than ‘intervention group’ for the quartile of hospitals exhibiting the greatest increase in readmission rates. Because the intervention in the state was global, it seems inappropriate to label this subset as the ‘intervention’ group.
•    Results
o    The authors appropriately highlight the limitations of a non-experimental study. The interpretation of results should also be qualified for the potential for ecological fallacies, i.e., the change in readmit rates may have nothing to do with the change in health policy.
o    The argument for an association between policy change and the outcome of interest might be strengthened by including in the analysis additional demographic characteristics of the respective hospitals. For example, does the highest cohort of hospitals serve populations of lower socio-economic status?
o    Refine figures and tables
▪    Figures and tables should ‘speak for themselves.’ Figures 1 and 2 appear to be generated from a spreadsheet. The abscissa in both figures is difficult to read and should be refined with editing. While the vertical, dotted lines in the figures indicate phases of implementation, the interpretation is not immediately apparent. The presentations could be sharpened with simple labeling.
▪    Table 2 reporting the Natural Log Splines Regressions is rather arcane for the general reader. The authors might present the data so that it is more intuitive for the reader.
▪    There was a trend for increasing readmission rates among the ‘control’ hospitals. Does this not have implications in support of the authors’ conclusions?
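
For readers unfamiliar with the difference-in-differences approach questioned under Methods, the basic two-group, two-period calculation can be sketched as follows. This is only an illustration of the general technique, not the authors' model (which uses spline regressions); the function name, the group labels 'high' and 'other', and the rates are all hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Two-group, two-period difference-in-differences on mean rates.

    rows: iterable of (group, period, rate) with group in {'high', 'other'}
    and period in {'pre', 'post'}. Labels and data are hypothetical.
    """
    cells = {}
    for group, period, rate in rows:
        cells.setdefault((group, period), []).append(rate)
    m = {key: mean(values) for key, values in cells.items()}
    # Change in the high-uninsured group minus change in the comparison group.
    change_high = m[("high", "post")] - m[("high", "pre")]
    change_other = m[("other", "post")] - m[("other", "pre")]
    return change_high - change_other


# Hypothetical quarterly readmission rates (percent), for illustration only.
example_rows = [
    ("high", "pre", 10.0), ("high", "pre", 10.2),
    ("high", "post", 11.0), ("high", "post", 11.2),
    ("other", "pre", 9.0), ("other", "pre", 9.2),
    ("other", "post", 9.4), ("other", "post", 9.6),
]
did = did_estimate(example_rows)
```

The estimate is only interpretable as a policy effect if the two groups would have followed parallel trends without the reform, which is the assumption a stated rationale would need to defend.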

Reviewer Number: 2
This paper presents the results of an analysis of Massachusetts data about hospital readmissions before, during, and after implementation of the health care reform law in that state. The authors used quarterly data on readmissions from 2 years before the reform, the approximately 15-month period during implementation, and then for 2 years after the reform. They divided Massachusetts hospitals into quartiles based on their number of uninsured hospital admissions, and then compared the top quartile with the rest. Their results showed that, beginning with the onset of insurance reform, hospital readmissions rose in the upper quartile group, and more modestly in the remainder of hospitals starting in the post-reform period. The authors conclude that using readmission rates as a quality metric may be unfair to hospitals that are currently caring for large numbers of uninsured patients.

The effects of the ACA are, of course, of interest to most people. And the potentially unintended effects of using hospital readmissions as a quality metric are also of interest to people like me, and presumably to hospital administrators. So the topic is of interest. But the questions I have about this paper are: are the results they are reporting real? And if so, what do they mean?

In terms of "is it real?", was a protocol for this analysis developed and made publicly available? One that justifies why the methods used were employed? The reason I ask is that there are many ways to do such an analysis, and there is always the chance that small effects like this may be sensitive to how the analysis was performed. So, in this case, the authors divided up the hospitals into quartiles, and then compared the top quartile to the other quartiles. Why not compare each quartile separately? Why use quartiles? Why not tertiles, or deciles? What would these results look like if this had been done as tertiles or deciles, with the highest tertile compared to the bottom two tertiles, or the highest decile compared to the bottom 9 deciles? The more consistently different analytic approaches yield the same result, the more likely I am to believe the results are real. But as it stands now, there is one analytic method presented, which doesn't have any particular justification that I can see, and this is the only result presented.
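
To make the request concrete, the kind of cut-point sensitivity check I have in mind could look something like the sketch below. It is a simplified stand-in, assuming one pre-reform and one post-reform readmission rate per hospital; the function names, the tuple layout, and the data are hypothetical and are not taken from the manuscript.

```python
from statistics import mean


def top_group_effect(hospitals, n_groups):
    """Change in the top 1/n_groups of hospitals minus change in the rest.

    hospitals: list of (uninsured_share, pre_rate, post_rate) tuples
    (hypothetical layout). Hospitals are ranked by uninsured share.
    """
    ranked = sorted(hospitals, key=lambda h: h[0], reverse=True)
    n_top = max(1, len(ranked) // n_groups)
    top, rest = ranked[:n_top], ranked[n_top:]

    def mean_change(group):
        return mean(post - pre for _, pre, post in group)

    return mean_change(top) - mean_change(rest)


def sensitivity(hospitals, groupings=(3, 4, 10)):
    """Repeat the comparison for tertiles, quartiles, and deciles."""
    return {k: top_group_effect(hospitals, k) for k in groupings}


# Hypothetical data: 20 hospitals whose rate increase grows with uninsured share.
example = [(i / 100, 10.0, 10.0 + i / 10) for i in range(1, 21)]
results = sensitivity(example)
```

If the sign and rough size of the estimate hold across the three groupings, the quartile result is less likely to be an artifact of where the line was drawn.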

The next question under "is it real?": these kinds of data would seem to be a perfect fit for analysis using statistical process control. I suspect if they did so they would find that there were signals for special cause variation in both of their groups. Whether this is higher in their high quartile hospitals or not, I am not sure.
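
As a rough illustration of what I mean, a p-chart on quarterly readmission proportions could be set up along the following lines. This is a minimal sketch that applies only the simplest rule (a point beyond three-sigma limits derived from the pre-reform baseline); the function name and the counts are hypothetical, and a full analysis would also apply run rules.

```python
from math import sqrt


def p_chart_signals(readmits, discharges, n_baseline):
    """Return indices of periods whose readmission proportion falls outside
    3-sigma p-chart limits computed from the first n_baseline periods.

    readmits, discharges: per-period counts (hypothetical inputs).
    """
    # Center line: pooled proportion over the baseline periods.
    p_bar = sum(readmits[:n_baseline]) / sum(discharges[:n_baseline])
    signals = []
    for i, (x, n) in enumerate(zip(readmits, discharges)):
        # Limits vary with each period's denominator.
        sigma = sqrt(p_bar * (1 - p_bar) / n)
        p = x / n
        if p > p_bar + 3 * sigma or p < p_bar - 3 * sigma:
            signals.append(i)
    return signals


# Hypothetical counts: 8 baseline quarters at 10%, then two later quarters.
readmits = [1000] * 8 + [1050, 1150]
discharges = [10000] * 10
flagged = p_chart_signals(readmits, discharges, n_baseline=8)
```

Running such a chart separately for the top quartile and for the remaining hospitals would show whether special cause signals appear in one group, both, or neither.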

But let's assume that it is. The next question I have is, "what does it mean?" The authors posit that what it means is that readmission rates are influenced by factors other than the quality of care. This means they are making the assumption that quality of care is remaining constant across these hospitals over these time points (or, at least that whatever changes may be happening in quality are the same across all hospitals). Are there data to support this? If not, then at least the authors should state that their conclusions rest on an untested assumption about quality. But let's also assume that the quality assumption is sustained. So let's assume that there is a temporal increase in hospital readmissions during this time period, and that it is unrelated to hospital quality. The authors posit that the Massachusetts health reform is the cause. Was there anything else going on during this time period that might also influence hospital admissions and readmissions? This time period also includes the greatest economic havoc since the Great Depression. How might this have been expected to influence hospital readmissions, and is there any reason to think this influence might be different between the economically well off and the economically deprived, who are almost certain to be the uninsured? I don't know the answer to this, but do think it is another factor that is at least worth considering.
Some minor points: while I acknowledge that there is imprecision around a lot of terms used in research, I do not think this study design should be called an experiment, even with the qualification of a quasi-experiment. The term "experiment" implies that there is an intervention which is applied to some groups and not others - preferably at the control of the investigator. That is not the case here. This is a state-wide intervention. This is a time-series analysis comparing different groups.
Table 1 is a waste of time, and it hurts the authors to present this as "patient characteristics…were similar substantively...although some characteristics reached statistical significance because of sample size." Firstly, the data available in table 1 are very scanty - age, sex, race, number of diagnoses - and cannot be considered adequate in terms of making assumptions about the similarity or dissimilarity of the patients between the two groups, in terms of their likelihood for readmission. Secondly, a 50% increase in the number of nonwhite participants in the "intervention" group should hardly be attributed to "some...reached statistical significance because of sample size." Thirdly, the number of diagnoses is almost certainly influenced by contact with the health care system, which the uninsured have less of. Lastly, to try to imply that these two patient populations were more-or-less the same at baseline is ludicrous, since we know the uninsured differ from the insured in more ways than simply having health insurance.
Is reference 7 really different from reference 8? They seem to be identical.