Therapy - Are the Results Valid?

Evaluating the Validity of a Therapy Study

We have now identified current information which can answer our clinical question. The next step is to read the article and evaluate the study. There are three basic questions that need to be answered for every type of study:

Are the results of the study valid?
What are the results?
Will the results help in caring for my patient?

This tutorial will focus on the first question: are the results of the study valid? The issue of validity speaks to the "truthfulness" of the information. The validity criteria should be applied before an extensive analysis of the study data. If the study is not valid, the data may not be useful.

The evidence that supports the validity or truthfulness of the information is found primarily in the study methodology. Here is where the investigators address the issue of bias, both conscious and unconscious. Study methodologies such as randomization, blinding and follow-up of patients help ensure that the study results are not overly influenced by the investigators or the patients.

Evaluating the medical literature is a complex undertaking. This session will provide you with some basic criteria and information to consider when trying to decide if the study methodology is sound. You will find that the answers to the questions of validity may not always be clearly stated in the article and that readers will have to make their own judgments about the importance of each question.

Once you have determined that the study methodology is valid, you must examine the results and their applicability to the patient. Clinicians may have additional concerns, such as whether the study represented patients similar to their own patients, whether the study covered the aspect of the problem that is most important to the patient, or whether the study suggested a clear and useful plan of action.

Note: The questions that we used to test the validity of the evidence are adapted from work done at McMaster University. See the References/Glossary unit: 'Users' Guides to the Medical Literature.'

Critical Review Form for Therapy Study

Read the following article and determine if it meets the validity criteria.

Mingrone G, et al. Bariatric surgery versus conventional medical therapy for type 2 diabetes. N Engl J Med. 2012 Apr 26;366(17):1577-85. doi: 10.1056/NEJMoa1200111. Epub 2012 Mar 26. PubMed PMID: 22449317.

1. Were patients randomized? The assignment of patients to either group (treatment or control) must be done by random allocation. This might include a coin toss (heads to treatment/tails to control) or use of randomization tables, often computer generated. Research has shown that random allocation comes closest to ensuring the creation of groups of patients who will be similar in their risk of the events you hope to prevent. Randomization balances the groups for known prognostic factors (such as age, weight, gender, etc.) and unknown prognostic factors (such as compliance, genetics, socioeconomics, etc.). This reduces the chance of over-representation of any one characteristic within the study groups.
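As a rough illustration of computer-generated random allocation, the Python sketch below assigns patient IDs to two arms using permuted blocks, one common way to keep group sizes balanced. The function name, the block size of 4, the arm labels, and the seed are all assumptions made for this example; they do not describe the method used in any particular trial.

```python
import random

def block_randomize(patient_ids, arms=("treatment", "control"), block_size=4, seed=None):
    """Assign patients to arms using permuted blocks (illustrative sketch only)."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = {}
    block = []
    for pid in patient_ids:
        if not block:
            # Start a new block with equal numbers of each arm, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
        allocation[pid] = block.pop()
    return allocation

# Hypothetical example: 20 patients numbered 1 to 20.
allocation = block_randomize(range(1, 21), seed=2012)
counts = {arm: list(allocation.values()).count(arm) for arm in ("treatment", "control")}
print(counts)
```

Because 20 patients fill exactly five blocks of four, each arm ends up with 10 patients, while the order of assignments within each block remains unpredictable.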

2. Was group allocation concealed? The randomization sequence should be concealed from the clinicians and researchers of the study to further eliminate conscious or unconscious selection bias. Concealment (part of the enrollment process) ensures that the researchers cannot predict or change the assignments of patients to treatment groups. If allocation is not concealed it may be possible to influence the outcome (consciously or unconsciously) by changing the enrollment order or the order of treatment which has been randomly assigned. Concealed allocation can be done by using a remote call center for enrolling patients or the use of opaque envelopes with assignments. This is different from blinding which happens AFTER randomization.

3. Were patients in the study groups similar with respect to known prognostic variables? The treatment and the control group should be similar for all prognostic characteristics except whether or not they received the experimental treatment. This information is usually displayed in Table 1, which outlines the baseline characteristics of both groups. This is a good way to verify that randomization resulted in similar groups.

4. To what extent was the study blinded? Blinding means that the people involved in the study do not know which treatments were given to which patients. Patients, researchers, data collectors and others involved in the study should not know which treatment is being administered. This helps eliminate assessment bias and preconceived notions as to how the treatments should be working. When it is difficult or even unethical to blind patients to a treatment, such as a surgical procedure, then a "blinded" clinician or researcher is needed to interpret the results.

5. Was follow-up complete? The study should begin and end with the same number of patients in each group. Patients lost to follow-up must be accounted for, or the conclusions risk being invalid. Patients may drop out because of the adverse effects of the therapy being tested. If these losses are not accounted for, the study may overstate the efficacy of the therapy. Good studies will have better than 80% follow-up for their patients. When there is a large loss to follow-up, the lost patients should be assigned the "worst-case" outcomes and the results recalculated. If these results still support the original conclusion of the study then the loss may be acceptable.
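The worst-case recalculation can be sketched in a few lines of Python. For a good outcome, the worst case for the treatment assumes that every lost patient in the experimental group did not have the outcome and every lost patient in the control group did. The function name and all of the numbers below are hypothetical and chosen only to show the arithmetic.

```python
def worst_case_rates(events_exp, followed_exp, lost_exp,
                     events_ctl, followed_ctl, lost_ctl):
    """Event rates for a GOOD outcome under the worst case for the treatment.

    Lost experimental patients are counted as not having the outcome;
    lost control patients are counted as having it.
    """
    eer = events_exp / (followed_exp + lost_exp)
    cer = (events_ctl + lost_ctl) / (followed_ctl + lost_ctl)
    return eer, cer

# Hypothetical trial: 50 patients randomized to each group.
# Experimental: 45 followed (40 with the good outcome), 5 lost.
# Control:      46 followed (20 with the good outcome), 4 lost.
eer, cer = worst_case_rates(40, 45, 5, 20, 46, 4)
print(f"worst-case EER = {eer:.2f}, CER = {cer:.2f}, difference = {eer - cer:.2f}")
```

In this made-up example the experimental group still does better under the worst-case assumption (0.80 versus 0.48), so the loss to follow-up would not overturn the conclusion.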

6. Were patients analyzed in the groups to which they were first allocated? Anything that happens after randomization can affect the chances that a patient in a study has an event. Patients who forget or refuse their treatment should not be eliminated from the study results or allowed to “change groups”. Excluding noncompliant patients from a study group may leave only those that may be more likely to have a positive outcome, thus compromising the unbiased comparison that we got from the process of randomization. Therefore all patients must be analyzed within their assigned group. Randomization must be preserved. This is called "intention to treat" analysis.
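To see why excluding noncompliant patients can distort the comparison, the Python sketch below analyzes the same hypothetical data two ways: by assigned group (intention to treat) and after dropping patients who did not adhere (often called a per-protocol analysis). The records and the function name are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (assigned_arm, adhered_to_assignment, good_outcome)
records = (
    [("treatment", True, True)] * 6
    + [("treatment", True, False)] * 2
    + [("treatment", False, False)] * 2   # non-adherent, poor outcome
    + [("control", True, True)] * 4
    + [("control", True, False)] * 6
)

def event_rates(rows):
    """Return the proportion with a good outcome in each assigned arm."""
    totals, events = defaultdict(int), defaultdict(int)
    for arm, _adhered, good in rows:
        totals[arm] += 1
        events[arm] += good
    return {arm: events[arm] / totals[arm] for arm in totals}

# Intention to treat: every patient is analyzed in the arm assigned at randomization.
itt = event_rates(records)
# Per protocol: non-adherent patients are dropped, which breaks randomization.
per_protocol = event_rates([r for r in records if r[1]])

print("intention to treat:", itt)
print("per protocol:      ", per_protocol)
```

With these made-up numbers, dropping the two non-adherent patients raises the treatment group's success rate from 60% to 75% while the control rate stays at 40%, which makes the treatment look better than the randomized comparison supports.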

7. Aside from the experimental intervention, were the groups treated equally? Both groups must be treated the same except for administration of the experimental treatment. If "cointerventions" (interventions other than the study treatment which are applied differently to the two groups) exist they must be described in the methods section of the study.

This article meets most of the validity criteria for a therapy article. The next step is to review the results.

What are the results?

How large was the treatment effect? What was the absolute risk reduction?

Results: At 2 years, diabetes remission had occurred in none of the patients receiving medical therapy, as compared with 15 of 20 (75%) undergoing gastric bypass and 19 of 20 (95%) undergoing biliopancreatic diversion (P<0.001 for both comparisons). There was a significant association between study group and rate of remission. However, since there were no remissions in the medical-therapy group, risk ratios were computed in a more conservative fashion on the assumption that remission had occurred in the 2 patients in the medical-therapy group who dropped out.

	Remission of Diabetes	No remission of Diabetes
Gastric Bypass	15	5
Medical Therapy	2	18

Experimental Event Rate (EER) = 15 / 20 = 75%
outcome present / total in experimental group

Control Event Rate (CER) = 2 / 20 = 10%
outcome present / total in control group

Absolute Benefit Increase (ABI) = 75% - 10% = 65%
is the arithmetic difference between the rates of events in the experimental and control groups. An Absolute Benefit Increase (ABI) refers to the increase of a good event as a result of the intervention. An Absolute Risk Reduction (ARR) refers to the decrease of a bad event as a result of the intervention. [ABI = EER - CER; ARR = CER - EER]

Relative Risk (RR) = .75 / .10 = 7.5
is the ratio of the risk in the experimental group to the risk in the control group. [RR = EER/CER]

Relative Benefit Increase (RBI) = 65% / 10% = 650%
is the proportional increase in benefit between the rates of events in the control group and the experimental group. [RBI = (EER - CER) / CER]

Number Needed to Treat (NNT) = 1 / .65 = 1.54, rounded up to 2
is the number of patients who need to be treated to prevent one bad outcome or produce one good outcome. In other words, it is the number of patients that a clinician would have to treat with the experimental treatment compared to the control treatment to achieve one additional patient with a favorable outcome. [NNT = 1/ARR, or 1/ABI for a good outcome]
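The calculations above can be reproduced with a short Python sketch. The function name and dictionary keys are assumptions made for this example; the counts are the gastric bypass and medical therapy figures from the table, and the formulas assume a good outcome with a nonzero control event rate.

```python
import math

def effect_measures(events_exp, n_exp, events_ctl, n_ctl):
    """Effect measures for a GOOD outcome from a 2x2 table (illustrative sketch)."""
    eer = events_exp / n_exp          # experimental event rate
    cer = events_ctl / n_ctl          # control event rate
    if cer == 0 or eer == cer:
        raise ValueError("RR/RBI need a nonzero CER; NNT needs a nonzero difference")
    abi = eer - cer                   # absolute benefit increase
    return {
        "eer": eer,
        "cer": cer,
        "abi": abi,
        "rr": eer / cer,              # relative risk
        "rbi": abi / cer,             # relative benefit increase
        "nnt": math.ceil(1 / abs(abi)),  # by convention, rounded up to a whole patient
    }

# Gastric bypass: 15 of 20 in remission; medical therapy: 2 of 20 (conservative assumption).
m = effect_measures(15, 20, 2, 20)
print(f"EER = {m['eer']:.0%}, CER = {m['cer']:.0%}, ABI = {m['abi']:.0%}")
print(f"RR = {m['rr']:.1f}, RBI = {m['rbi']:.0%}, NNT = {m['nnt']}")
```

Rounding the NNT up rather than to the nearest whole number is the usual convention, since a fraction of a patient cannot be treated.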

Clinical versus Statistical Significance

"Although it is tempting to equate statistical significance with clinical importance, critical readers should avoid this temptation. To be clinically important requires a substantial change in an outcome that matters. Statistically significant changes, however, can be observed with trivial outcomes. And because statistical significance is powerfully influenced by the number of observations, statistically significant changes can be observed with trivial (small) changes in important outcomes. Large studies can be significant without being clinically important and small studies may be important without being significant." (Effective Clinical Practice, July/August 2001, ACP)

Clinical significance has little to do with statistics and is a matter of judgment. Clinical significance often depends on the magnitude of the effect being studied. It answers the question "Is the difference between groups large enough to be worth achieving?" Studies can be statistically significant yet clinically insignificant.

For example, a large study might find that a new antihypertensive drug lowered BP, on average, 1 mm Hg more than conventional treatments. The results were statistically significant, with a P value of less than .05, because the study was large enough to detect a very small difference. However, most clinicians would not find the 1 mm Hg difference in blood pressure large enough to justify changing to a new drug. This would be a case where the results were statistically significant (P value less than .05) but clinically insignificant.
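The dependence of statistical significance on study size can be illustrated with a small Python sketch. It computes a two-sided P value for the same 1 mm Hg difference at several sample sizes, using a normal approximation. The standard deviation of 10 mm Hg, the sample sizes, and the function name are assumptions chosen for illustration, not figures from any study.

```python
import math

def two_sided_p(diff, sd, n_per_group):
    """Two-sided P value for a difference in means (normal approximation)."""
    se = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))

DIFF_MM_HG = 1.0   # difference in blood pressure from the example in the text
SD_MM_HG = 10.0    # assumed standard deviation, for illustration only

p_values = {n: two_sided_p(DIFF_MM_HG, SD_MM_HG, n) for n in (100, 1000, 10000)}
for n, p in p_values.items():
    print(f"n per group = {n:>6}: P = {p:.3g}")
```

Under these assumptions the difference is not statistically significant with 100 patients per group but is with 1,000 or more, even though the clinical size of the effect, 1 mm Hg, never changes.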

Source:
Guyatt G, Rennie D, Meade MO, Cook DJ. Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. 2nd ed. 2008.

Apply the results to your patient

Were the study patients similar to my population of interest?
Does your population match the study inclusion criteria?
If not, are there compelling reasons why the results should not apply to your population?

Were all clinically important outcomes considered?
What were the primary and secondary endpoints studied?
Were surrogate endpoints used?

Are the likely treatment benefits worth the potential harm and costs?
What is the number needed to treat (NNT) to prevent one adverse outcome or produce one positive outcome?
Is the reduction of clinical endpoints worth the potential harms of the surgery or the cost of surgery?

Mingrone and colleagues measured surrogate markers for clinical outcomes. 2 years seems sufficient to detect effects on glycemic control and diabetes remission. Given the progressive nature of type 2 diabetes, longer follow-up could help characterize with greater precision the extent to which these benefits are sustained over time. Further ascertainment of the nature and frequency of surgical complications associated with different procedures, surgical experience and volume levels, and patient characteristics would be helpful in decision making. Longer, larger multicenter studies measuring such patient-important outcomes as mortality, morbidity, end-organ damage, functional capacity, and quality of life are needed. The findings of Mingrone and colleagues add to the body of evidence favoring bariatric surgery but, alone, should not result in a rush to do more surgeries. (ACP Journal Club. 2012 Jul 17;157:JC1-12.)

Take this information back to your patient and discuss the issues with him and help him decide on a plan of action.

Self-evaluation

6. Evaluate your performance with this patient

Did you ask a relevant, well focused question? Do you have fast and reliable access to the necessary resources? Do you know how to use them efficiently? Did you find a pre-appraised article? If not, was it difficult to critically evaluate the article?

Source:
Guyatt G, Rennie D, Meade MO, Cook DJ. Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. 2nd ed. 2008.