Methods of Prediction: Main Concepts
Prediction is a fundamental technique in clinical research that can improve patients’ outcomes and medical practice. Prediction research is the process of predicting future outcomes based on certain patterns, markers, and variables (Waljee et al., 2014). Accurate prediction models can indicate the likely course of a treatment or the risk of developing an illness. Note that, in contrast to the more common explanatory research, predictive research does not tackle the problem of causality or test preconceived theoretical concepts. Forecast models simply use statistical methods and data mining to predict future clinical outcomes. More precisely, predictions rely on methods of statistical inference, which draw conclusions from data sets and include procedures such as regression analysis (for example, linear regression) and vector autoregression models.
Predictions can shape not only healthcare practices but societies in general. From weather forecasts to stock market performance, predicting the future is an essential factor for success. In clinical settings, methods of prediction are paramount because they can literally save lives. By implementing accurate methods of prediction and forecasting, scientists can explore the predictive properties of certain biomarkers and patient characteristics, which allows them to predict numerous clinical outcomes (e.g., rehospitalization, demand for patient beds, risk of developing a disease, etc.). Therefore, it’s not surprising that predicting health outcomes is vital to patients and families, as well as to professionals and governments.
2. Types of Methods of Prediction and Validation
Methods of prediction vary between research fields, and there are numerous multivariable prediction models which tackle vital aspects, such as model development, validation, and impact assessment. Note that the most traditional predictive approach is the Bayesian approach. However, with the increasing influence of machine learning and artificial intelligence algorithms, experts can implement other sophisticated models and software programs, such as the powerful random-forest approaches (Waljee et al., 2014). Machine learning algorithms are beneficial in the identification and analysis of potential, unexpected, and marginal predictors.
In addition, an analysis conducted by Bouwmeester and colleagues (2012) identified and classified five types of medical studies with different models of prediction:
- Predictor finding studies: These studies explore in detail which predictors independently contribute to the actual prediction (e.g., medical diagnosis).
- Model development studies without external validation: This approach is based on the development of prediction models (e.g., predictive techniques to guide patient management) by assessing the different weights per predictor.
- Model development studies with external validation: Similar to the model development studies above, these studies are based on models tested across external datasets, with a focus on external validation (e.g., temporal validation).
- External validation studies with or without model updating or adjustment: These studies focus on assessing and adjusting previous models of prediction based on new participant data as well as new validation data.
- Model impact studies: These studies explore the actual effect of the models of prediction on health outcomes and healthcare practices.
Note that the research team concluded that prediction models must involve external validation and impact assessments in order to be successful.
3. Methods of Prediction: Developing, Validating and Assessing Models
Although validation is a vital part of any prediction model, it is only one of several steps that predictive research follows. The first step is the development of a predictive model. Note that the selection of irrelevant predictive variables can lead to poor performance. Missing data is another crucial aspect which experts must consider. Validation, as explained above, is one of the most important factors for success. The model performance should undergo internal validation (e.g., splitting the dataset) and external validation (e.g., new patients’ data). Beyond the distinction between internal and external validation, there are several types of validation practices: temporal (including new subjects from the same institute), geographical (focusing on another institute, city, etc.), and transmural (exploring different levels of care, e.g., primary care) (Janssen et al., 2009). Sadly, when validation shows poor performance, researchers usually develop a new predictive model instead of updating the existing one. By not updating old models of prediction, prior knowledge is left behind, and predictive research often becomes particularistic.
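The two simplest validation splits mentioned here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`id`, `enrolled`, `outcome`), the 30% hold-out fraction, and the function names are hypothetical and do not come from the cited studies.

```python
import random

def split_internal(records, test_fraction=0.3, seed=0):
    """Internal validation: randomly split one dataset into a
    development part and a held-out validation part."""
    shuffled = list(records)               # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def split_temporal(records, cutoff_year):
    """Temporal validation: develop on earlier patients and validate
    on later patients from the same institute."""
    development = [r for r in records if r["enrolled"] < cutoff_year]
    validation = [r for r in records if r["enrolled"] >= cutoff_year]
    return development, validation

# Hypothetical records: enrollment year and a binary outcome per patient.
patients = [
    {"id": i, "enrolled": 2015 + i % 5, "outcome": i % 3 == 0}
    for i in range(20)
]

dev, val = split_internal(patients)
dev_t, val_t = split_temporal(patients, cutoff_year=2018)
print(len(dev), len(val), len(dev_t), len(val_t))
```

A model would then be fitted on the development part only and scored on the validation part. Geographical and transmural validation follow the same pattern, with the split made on institute or level of care rather than on enrollment year.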
The final step of predictive research is assessment: researchers must assess the performance of their predictive model. This goal can be achieved through numerous additional tests, such as calibration, discrimination or reclassification (Waljee et al., 2014).
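As a rough illustration of two of these assessments, the sketch below computes a discrimination measure (the c-statistic, equivalent to the area under the ROC curve for a binary outcome) and two simple calibration summaries. The choice of these particular measures, the function names, and the example numbers are assumptions made for illustration; reclassification is not shown.

```python
def c_statistic(y_true, y_prob):
    """Discrimination: the share of (event, non-event) pairs in which the
    event case received the higher predicted risk; ties count one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def calibration_in_the_large(y_true, y_prob):
    """Calibration: mean observed outcome minus mean predicted risk.
    A value of 0 means predictions are right on average."""
    n = len(y_true)
    return sum(y_true) / n - sum(y_prob) / n

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical validation data: observed outcomes and predicted risks.
observed = [0, 0, 1, 1, 0, 1]
predicted = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

print(c_statistic(observed, predicted))
print(calibration_in_the_large(observed, predicted))
print(brier_score(observed, predicted))
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. A model can discriminate well and still be poorly calibrated, which is why both kinds of measure are usually reported.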
II. Predictions in Observational Studies and Randomized Controlled Trials
- Observational Studies and Randomized Controlled Trials
The abundance of research methods and study designs can help experts explore a wide range of settings, populations, and phenomena. Nevertheless, observational studies and randomized controlled trials are the most popular and effective types of studies which help professionals evaluate treatment outcomes. Note that in observational studies, researchers observe outcomes without assigning the intervention themselves. In randomized controlled trials, on the other hand, randomization is used to reduce bias and measurement errors (Braun et al., 2014).
Consequently, researchers claim that randomized trials are more effective than observational studies because randomization procedures can eliminate bias and nuisance variables (Trotta, 2012). At the same time, some research topics cannot be tested via a randomized trial as such studies would be unethical. Imagine randomizing non-smokers to smoking groups! Whichever design is chosen, the quality of the study design is the most important factor for research success.
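To make the randomization step concrete, here is a minimal sketch of random allocation to study arms. It assumes a simple shuffle-and-deal scheme with two arms; the function name, arm labels, and participant IDs are hypothetical, and real trials often use more elaborate schemes (for example, blocked or stratified randomization).

```python
import random

def randomize(participant_ids, arms=("treatment", "control"), seed=None):
    """Shuffle participants, then deal them round-robin into arms,
    so that group sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    allocation = {arm: [] for arm in arms}
    for index, pid in enumerate(shuffled):
        allocation[arms[index % len(arms)]].append(pid)
    return allocation

# Hypothetical participant IDs 0..9, seeded so the allocation is reproducible.
groups = randomize(range(10), seed=42)
print({arm: sorted(ids) for arm, ids in groups.items()})
```

Because chance alone decides who lands in which arm, known and unknown patient characteristics tend to balance out between groups as the sample grows.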
- Survival Analysis and Censoring
Since medical research and epidemiological studies involve the measurement of the occurrence of an outcome, prediction models become paramount. In particular, survival analysis, or lifetime data analysis, focuses on the measurement of time to event (e.g., outcome). The event can be fatal (e.g., death), a clinical endpoint (e.g., onset of disease), positive (e.g., discharge from hospital), or neutral (e.g., cessation of breastfeeding) (Prinja et al., 2010). Note that time to event can be measured in days, months, years, etc.
Nevertheless, in survival data, the event will not have occurred for all patients by the end of the follow-up period (Altman & Bland, 1998). This phenomenon is known as censoring, and researchers must account for it. Censoring occurs when information on time to outcome event is not available (e.g., due to loss to follow-up or an accident). We should mention that there are three types of censoring: right, left, and interval. Let’s have a look at a study about breastfeeding mothers (Ishaya & Dikko, 2013). If participants are still breastfeeding at their last survey, this is known as right censoring: the event happens, if at all, after the last observation. Left censoring occurs when mothers enter the study after they’ve already stopped breastfeeding, so the event is known only to have happened before the first observation. Interval censoring, on the other hand, is when mothers stop breastfeeding in between two successive check-ups.
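To show how right-censored observations still contribute information, the sketch below implements the Kaplan-Meier estimator, a standard survival estimator that is not named in the text above and is used here only as an illustration. The durations (months of breastfeeding) and censoring flags are made-up numbers, not data from the cited study.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject
    observed:  True if the event was seen, False if right-censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)      # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            events += int(data[i][1])
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= leaving
    return curve

# Hypothetical data: months until breastfeeding stopped (True) or
# until the last survey while still breastfeeding (False, right-censored).
months = [2, 3, 3, 5, 6, 6, 8]
stopped = [True, True, False, True, False, True, False]

for time, prob in kaplan_meier(months, stopped):
    print(time, round(prob, 3))
```

Censored mothers count in the denominator for as long as they are followed, so their partial follow-up is used rather than discarded. Left and interval censoring need other methods and are not handled by this sketch.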