A new RCT of masks for Covid-19 is making heads spin
Critical appraisal is tough, particularly if you've already made up your mind
Randomised controlled trials are considered the gold standard in medical evidence for a reason. They eliminate major sources of bias and confounding through randomisation, and subject an intervention to as close to the actual conditions and outcomes we care about as possible.
Randomised trials of facemasks for Covid-19 will always be controversial because many people have already made up their minds about their use. This doesn’t change the fact they are sorely needed, and a new study comparing the use of N95 vs usual medical masks for health care workers (HCW) is welcome.
Unfortunately, the controversy over the study has spawned a lot of ill-informed commentary. Every cloud has a silver lining, however, and we can turn this into an excellent opportunity to learn some valuable critical appraisal skills.
What is the study?
The study is titled “Medical Masks Versus N95 Respirators for Preventing COVID-19 Among Health Care Workers” and is published in the journal Annals of Internal Medicine.
We often break down randomised trials into a PICO format (Population, Intervention, Comparator, Outcome).
P - HCW at 29 facilities across Canada, Israel, Pakistan and Egypt
I - Use of a medical mask when providing routine care for patients with known or suspected Covid-19 (universal medical masking at other times, N95 for aerosol generating procedures)
C - Use of a fit tested N95 mask when providing routine care for patients with known or suspected Covid-19 (universal medical masking at other times, N95 for aerosol generating procedures)
O - Time from randomisation until a positive rt-PCR result for Covid-19 (measured as a Hazard Ratio (HR))
The study is a “non-inferiority” design, which can sound confusing. Rather than testing whether an intervention is better than the control (known as a “superiority” design), you set a pre-specified threshold of how much worse you are willing for the intervention to be. This sounds strange, but it’s usually because there are other benefits (cheaper, easier, less invasive etc) which means you would tolerate it being a bit less effective.
In this study the non-inferiority margin is a HR of 2, meaning that if the upper bound of the 95% confidence interval of the HR is below 2, we would declare medical masks for routine care of Covid-19 patients non-inferior (we will come back to this).
What did they find?
Over the study period of ~10 weeks (which fell at different time points at the different sites), no significant difference was found in the hazard of Covid-19 rt-PCR positivity between the 2 groups (HR 1.14, 95% CI 0.77 to 1.69), and as the upper bound of the CI was below 2, medical masks were found non-inferior for routine care of patients with Covid-19.
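The non-inferiority check itself is simple arithmetic, and a rough sketch may make it concrete. The HR, confidence interval and margin below are the figures quoted in this article. The function names are hypothetical, and the standard error is back-calculated on the assumption that the reported interval is a standard symmetric interval on the log scale, which the article does not confirm.

```python
import math

# Figures quoted in the article.
HR = 1.14
CI_LOWER = 0.77
CI_UPPER = 1.69
MARGIN = 2.0  # pre-specified non-inferiority margin


def is_non_inferior(ci_upper: float, margin: float) -> bool:
    """Declare non-inferiority when the upper CI bound is below the margin."""
    return ci_upper < margin


def se_log_hr_from_ci(ci_lower: float, ci_upper: float, z: float = 1.959964) -> float:
    """Back-calculate the standard error of log(HR).

    Assumption: the 95% CI is symmetric on the log scale.
    """
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)


def z_against_margin(hr: float, se: float, margin: float) -> float:
    """Number of standard errors between the estimate and the margin (log scale)."""
    return (math.log(margin) - math.log(hr)) / se


se = se_log_hr_from_ci(CI_LOWER, CI_UPPER)
z = z_against_margin(HR, se, MARGIN)

print("Non-inferior:", is_non_inferior(CI_UPPER, MARGIN))
print("Approximate SE of log(HR):", round(se, 3))
print("Standard errors between estimate and margin:", round(z, 2))
```

The further the point estimate sits from the margin, measured in standard errors, the more comfortably the non-inferiority criterion is met.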
This result is most compatible with a small increase in hazard (14%) of acquiring Covid-19 if wearing medical masks vs N95, but with a large degree of uncertainty which is also compatible with no difference, a big increase in hazard (nearly 70%) or a sizeable decrease in hazard (23%).
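The percentages above come straight from the hazard ratio and its confidence bounds. As a minimal sketch (the helper name is hypothetical), the conversion is:

```python
def pct_change_in_hazard(hr: float) -> float:
    """Convert a hazard ratio into a percentage change in hazard.

    Values above 0 are an increase, values below 0 a decrease.
    """
    return (hr - 1.0) * 100.0


# Point estimate and 95% CI bounds quoted in the article.
for label, hr in [("point estimate", 1.14), ("lower bound", 0.77), ("upper bound", 1.69)]:
    print(f"{label}: HR {hr} -> {pct_change_in_hazard(hr):+.0f}% hazard")
```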
What are the misunderstandings?
There are many misunderstandings about this study already, so we will focus on some of the most important.
What question is the study answering?
This study is answering a very specific question:
“Does requiring health care workers (who are already wearing medical masks all day and N95 masks for aerosol generating procedures) to wear medical masks instead of N95 masks for routine care of Covid-19 patients result in less than a doubling of the hazard of acquiring Covid-19?”
This study is not asking, “do masks work”, which is an utterly meaningless question out of context. At the most basic level, we know masks designed to filter particulates work because they are rigorously tested to confirm they do so. This is about their effect in real life implementation in this specific scenario.
This study does a great job of answering this specific question, and suggesting the study is bad, flawed, underpowered, etc. just exposes a lack of understanding about the purpose of clinical trials.
For example, suggesting that the study is flawed because HCWs didn’t wear N95 all day, every day, simply misunderstands the point of the study - this is a completely different question (perhaps worthy of an RCT of its own).
The same goes for criticisms that we don’t know where the HCWs caught the infection. It doesn’t matter, as this wasn’t the chosen outcome, which is total Covid-19 infections (arguably the more important outcome - it matters little if N95s reduce your risk at work but your overall risk of getting infected stays the same).
What about differences in different regions?
The study reports an “unplanned analysis” of the differences between the countries involved. “Unplanned” almost certainly means it was done at the behest of the peer reviewers, and it is a pointless exercise.
There are differences between the different locations, which is precisely what you would expect as the sample sizes are much smaller. This is a post hoc, genuinely underpowered analysis, subject to multiplicity (test enough things and eventually you'll find something significant by chance) and to both type 1 and type 2 errors (a high risk of false positives and false negatives). It is uninformative and should be ignored.
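To see why multiplicity matters, here is a minimal sketch of how the chance of at least one false positive grows with the number of comparisons. It assumes independent tests, each at a 5% significance level. Using four comparisons (one per country) is purely illustrative and is not a description of how the authors ran their analysis.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across n independent tests,
    each run at significance level alpha, when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Illustrative only: one comparison per country in the trial, plus larger counts.
for n in (1, 4, 10, 20):
    print(f"{n} tests -> {familywise_error_rate(n):.1%} chance of a spurious 'significant' result")
```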
What about adherence?
Some have claimed that a flaw in the study is that adherence to the intervention was different in the two groups (self-reported “always” for 80.7% in N95 vs 91.2% in medical masks). This is not a flaw - far from it! It is very informative and one of the important factors the trial is assessing. It would suggest that requiring HCWs to wear N95s for routine care of Covid-19 patients is more difficult than requiring medical masks, and therefore they are less able to do it properly. This would be an inherent problem with N95s which limits their utility.
Imagine asking HCWs just not to breathe at any point - this would be highly effective, but as they would be unable to comply it would be a useless intervention. Ability to adhere is important and is one of the key reasons why RCTs are so useful in scenarios such as this.
What are the real limitations?
The most controversial aspect of non-inferiority trials is the non-inferiority margin. A HR of 2 is quite a big difference - it roughly means that at any one time you are twice as likely to get Covid-19. The justification in the supplement is confusing because they amended the design during the study (not uncommon), but seems largely based on an absolute difference of 5% (i.e. 10% of medical mask group vs 5% of N95 group), due to the expected effectiveness being much greater, at 75% relative risk reduction (i.e. expected result 10% medical mask group vs 2.5% N95 group).
Different people will have very different views on what an acceptable non-inferiority margin would be, but it would dramatically impact the required sample size. For example, to do a superiority trial to show N95s provided a HR of 0.8 (reduced hazard by 20%) you would need around 6,500 participants.
The difference you want to be able to detect is known as the “minimal clinically important difference”, and is one of the most important and overlooked facets of trial design. If you care about very small differences, you need a very big study.
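The link between the size of the difference and the size of the study can be sketched with Schoenfeld's approximation for the number of events needed in a two-arm trial with 1:1 allocation. This is a swapped-in textbook formula, not necessarily the calculation behind the figure quoted above. The two-sided alpha of 0.05, the power of 80% and the 10% overall event proportion are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def required_events(hr: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Schoenfeld's approximation: total events needed to detect a hazard ratio
    `hr` with a two-sided test at level `alpha`, 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hr) ** 2)


def required_participants(hr: float, event_proportion: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Scale events up to participants, given the assumed proportion of
    participants expected to have an event (here, a positive test)."""
    return math.ceil(required_events(hr, alpha, power) / event_proportion)


# Assumed 10% overall event proportion (illustrative, not taken from the trial results).
for hr in (0.5, 0.8, 0.9):
    print(f"HR {hr}: {required_events(hr)} events, "
          f"about {required_participants(hr, 0.10)} participants")
```

Under these assumptions, a HR of 0.8 needs several thousand participants, the same order of magnitude as the figure mentioned above, and halving the detectable effect again (HR 0.9) pushes the requirement far higher.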
Summary
This study can tell us that for HCWs, being made to wear an N95 for routine care of Covid-19 patients vs a medical mask does not make a very large difference to your overall risk of getting Covid-19. It cannot answer other questions that it was not designed to answer, but this is not a flaw of the study. A Ferrari can’t fly - but this is not a flaw, they are not designed to.
If you have a defined period of exposure to someone with Covid-19, properly wearing an N95 mask will certainly provide you with superior protection for that period of exposure. But your cumulative risk is much more than these isolated periods of exposure, and it is difficult to wear them all the time. Wearing N95s in this way may well reduce your total risk of getting Covid-19, but by a small amount which would require a very large trial to elicit with confidence.
If we want the answer to different questions, we need more trials.
Thanks Alasdair -- a strong analysis so clearly written.
Thanks for explaining this study. Your Substacks are really interesting and help to keep me updated. I don't work on a covid ward any longer but it's good to know we did the best for our staff when I did!