Re: Evaluation of Proposed Protocol Changing Statistical Significance From 0.05 to 0.005 in Foot and Ankle Randomized Controlled Trials.

The Journal of Foot and Ankle Surgery: official publication of the American College of Foot and Ankle Surgeons (2023)

Abstract
We read with interest the recent editorial by Reddy et al (1. Reddy AK, Scott JT, Joshua Stephens B, Patel A, Checketts JX, Stotler WM, Hawkins BJ, Vassar M. Evaluation of proposed protocol changing statistical significance from 0.05 to 0.005 in foot and ankle randomized controlled trials. J Foot Ankle Surg. 2022; 61: 925-926). We agree with the authors when they state that changing the threshold for statistical significance from 0.05 to 0.005 in foot and ankle randomized controlled trials (RCTs) would heavily alter literature published in the field. However, their claim that it will increase the quality of RCT methodology is misleading. As with any evaluation or judgment, it is possible for the conclusions from a clinical trial to be incorrect. A type I error (false positive) occurs when a study rejects a null hypothesis that is actually true in the population. Correspondingly, a type II error (false negative) occurs when a study fails to reject a null hypothesis that is actually false in the population. Both are influenced by decisions made regarding trial design and methodology throughout the lifetime of a trial. The choice of the statistical significance threshold in an RCT sample size calculation is one such decision. This controls the likelihood of a type I error in a study and the subsequent incorrect rejection of the null hypothesis. The choice of threshold is a much studied and much misunderstood topic of debate with many opinions and counter-opinions (2. Wasserstein RL, Lazar NA. The ASA statement on p-values: context, process, and purpose. Am Stat. 2016; 70: 129-133; 3. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016; 31: 337-350). We argue that changing this single factor from 0.05 to 0.005 is unlikely to improve the quality of published research.
Instead, we argue that the likely impact will simply be larger trials, longer timeframes and associated higher costs, resulting in a paucity of meaningful research findings. The figure illustrates the relationship between the choice of significance level and the required sample size for a 2-arm parallel-group design with fixed power of 80% and 3 effect sizes: small (green, 0.25), moderate (red, 0.4), and large (blue, 1). To put these effect sizes in context, the median standardized target effect size in a recent systematic review of 107 RCTs funded by the UK National Institute for Health and Care Research (NIHR) was 0.30 (interquartile range 0.20-0.38) (4. Rothwell JC, Julious SA, Cooper CL. A study of target effect sizes in randomised controlled trials published in the Health Technology Assessment journal. Trials. 2018; 19: 544) and we found similar values in trials in hallux valgus (5. Dawson J, Doll H, Coffey J, Jenkinson C; Oxford and Birmingham Foot and Ankle Clinical Research Group. Responsiveness and minimally important change for the Manchester-Oxford foot questionnaire (MOXFQ) compared with AOFAS and SF-36 assessments following surgery for hallux valgus. Osteoarthritis Cartilage. 2007; 15: 918-931) and ankle fractures (6. Kearney R, McKeown R, Parsons H, Haque A, Parsons N, Nwankwo H, Mason J, Underwood M, Redmond AC, Brown J, Kefford S, Costa M; AIR trial collaborators. Use of cast immobilisation versus removable brace in adults with an ankle fracture: multicentre randomised controlled trial. BMJ. 2021; 374: n1506). The impact of the suggested change in statistical significance threshold from 0.05 to 0.005 is most pronounced in studies designed with small effect sizes, where the required sample size almost doubles, from 440 to a target well in excess of 800.
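The relationship between significance level and required sample size described above can be sketched with the standard normal-approximation formula for comparing two means in a 2-arm parallel-group trial, n per arm = 2(z_{1-α/2} + z_{power})² / d². This is a minimal illustration only: the function name `n_per_arm`, the two-sided test, and the equal allocation between arms are assumptions made here, not details taken from the letter, and the totals it prints need not match the values read from the letter's figure.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size: float, alpha: float, power: float = 0.80) -> int:
    """Approximate participants per arm for a two-arm parallel-group trial.

    Uses the normal approximation for a two-sided comparison of two means
    with equal allocation (an assumption of this sketch):
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2
    where d is the standardized effect size.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # Effect sizes quoted in the letter: small, moderate, large.
    for d in (0.25, 0.4, 1.0):
        total_05 = 2 * n_per_arm(d, alpha=0.05)
        total_005 = 2 * n_per_arm(d, alpha=0.005)
        print(
            f"d={d}: total n at alpha=0.05 is {total_05}, "
            f"at alpha=0.005 is {total_005} "
            f"(ratio {total_005 / total_05:.2f})"
        )
```

Under this approximation the proportional increase depends only on the significance level and the power, so it is roughly the same for every effect size, while the absolute number of extra participants is largest when the effect size is small. Exact targets depend on the test, allocation, and software used.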
Even for moderate effect sizes of 0.4 there is an increase of almost 75% in the sample size target. Recruiting these larger samples would require trials to open more recruitment sites, recruit for longer, or both. Both of these strategies require additional staff time to deliver, and since staff costs are one of the largest expenses of a Clinical Trial Unit (CTU) (7. Hind D, Reeves BC, Bathers S, Bray C, Corkhill A, Hayward C, Harper L, Napp V, Norrie J, Speed C, Tremain L, Keat N, Bradburn M. Comparative costs and activity from a sample of UK clinical trials units. Trials. 2017; 18: 203), it is clear that increasing recruitment targets will also add significant costs to a study. This strategy also places additional burden on already stretched hospital resources (8. Mahase E. Address staffing crisis to tackle waiting list backlog, say MPs. BMJ. 2022). There are also ethical issues associated with the suggestion made by the authors. If trials are unnecessarily larger and longer, this will inevitably lead to delays in the evaluation of new treatments. Larger trials are also more expensive, so research funders would be able to fund fewer trials. Taken together, this would mean that patients may have effective treatments withheld for longer than is necessary or, worse, be exposed to inferior or even harmful treatments. While ensuring that trials are appropriately powered is important, many funders are now calling for innovative RCTs to ensure that these scarce resources are prioritized to where they are needed. For example, the NIHR, which funds much of the UK's clinical trial research, is now committed to delivering “more efficient and more innovative clinical research” through the UK-wide Clinical Research Recovery Resilience and Growth program (9. NIHR.
The Future of UK Clinical Research Delivery: Saving and Improving Lives. 2022. Available at: https://sites.google.com/nihr.ac.uk/thefutureofukclinicalresearch/home/programme-updates/increasing-the-use-of-innovative-research-designs?pli=1. Accessed December 14, 2022). Making the threshold for statistical significance more stringent is a step away from these aims. Although identifying each area for improvement is beyond the scope of this letter, there are some excellent recent articles reflecting on the volume of poor research across all areas of healthcare which provide recommendations for system-wide improvement (10. Pirosca S, Shiely F, Clarke M, Treweek S. Tolerating bad health research: the continuing scandal. Trials. 2022; 23: 458; 11. Cook JA, Julious SA, Sones W, Hampson LV, Hewitt C, Berlin JA, et al. DELTA2 guidance on choosing the target difference and undertaking and reporting the sample size calculation for a randomised controlled trial. BMJ. 2018; 19: 606; 12. Cook JA, McCulloch P, Blazeby JM, Beard DJ, Marinac-Dabic D, Sedrakyan A. IDEAL framework for surgical innovation 3: randomised controlled trials in the assessment stage and evaluations in the long term study stage. BMJ. 2013; 346: f2820). We contend that focusing on adjusting this hurdle at the analysis stage of an RCT is insufficient; there are many more hurdles in the design and conduct stages of an RCT where improvements would effect much more meaningful change in the quality of published research in the foot and ankle field.