Scientific studies that investigate the effects of environmental exposures on human health often must rely on imperfect measurement processes to quantify the concentration of environmental contaminants in biosamples. One consequence of these imperfect measurement processes is the limit of detection (LOD), defined as the lowest concentration of an environmental contaminant that can be reliably distinguished from an uncontaminated reference. PROTECT biostatisticians Bhramar Mukherjee and Jonathan Boss recently published their work with Project 1 researchers on the development of a statistical tool that accounts for re-calibrated LODs (LODs that differ from one batch of biosamples to the next) to improve the estimation of health effects.

The reasoning behind the proposed methodology can be illustrated by the following hypothetical example. Suppose we are conducting a case-control study to evaluate whether women with preterm deliveries have higher average bisphenol A (BPA) concentrations than women with full-term deliveries. A well-established recruitment strategy for case-control studies is to recruit cases with preterm birth outcomes first and then recruit controls with full-term deliveries whose demographic characteristics match those of the cases. If the lab analyzes batches of biosamples on a rolling basis, then the earlier batches will be composed mostly of biosamples from women with preterm deliveries and the later batches will be composed mostly of biosamples from women with full-term deliveries. Now suppose that the re-calibrated LOD for the later batches is different from the LOD for the earlier batches. Because delivery outcome largely determines which batch, and therefore which LOD, a biosample is subject to, the measurement process itself is associated with the health outcome we are investigating. Therefore, we need a statistical method that can tease out whether BPA is associated with preterm delivery while simultaneously accounting for the fact that the measurement process is also associated with preterm delivery.
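
The hypothetical scenario above can be sketched with a small simulation. This is only an illustration under stated assumptions, not the method developed in the publication: the function name `simulate_batch_lod_bias`, the log-normal exposure distribution, the sample size, and the two LOD values are all made up for the example. Values below the LOD are replaced with LOD/√2, a common ad hoc substitution that is used here as the naive analysis. Cases and controls are drawn from the same distribution, so any difference that appears after substitution is produced by the batch-specific LODs alone.

```python
import numpy as np


def simulate_batch_lod_bias(n_per_group=5000, lod_early=0.5, lod_late=1.0, seed=0):
    """Illustrate how batch-specific LODs can create a spurious group difference.

    All numbers are hypothetical. Cases are assumed to be measured in early
    batches (LOD = lod_early) and controls in later batches (LOD = lod_late).
    """
    rng = np.random.default_rng(seed)

    # Same true exposure distribution in both groups: no real association.
    cases = rng.lognormal(mean=0.0, sigma=1.0, size=n_per_group)
    controls = rng.lognormal(mean=0.0, sigma=1.0, size=n_per_group)

    # Naive handling: replace values below each batch's LOD with LOD / sqrt(2).
    cases_obs = np.where(cases < lod_early, lod_early / np.sqrt(2), cases)
    controls_obs = np.where(controls < lod_late, lod_late / np.sqrt(2), controls)

    # Difference in mean log-concentration (cases minus controls).
    true_diff = np.log(cases).mean() - np.log(controls).mean()
    observed_diff = np.log(cases_obs).mean() - np.log(controls_obs).mean()
    return true_diff, observed_diff


if __name__ == "__main__":
    true_diff, observed_diff = simulate_batch_lod_bias()
    print(f"difference using true values:        {true_diff:.3f}")
    print(f"difference after LOD substitution:   {observed_diff:.3f}")
```

In this setup the difference computed from the true values stays close to zero, while the difference computed after substitution moves away from zero, because the higher LOD in the later batches alters more of the controls' low measurements than the lower LOD alters among the cases. A method that models the batch-specific LODs directly is meant to avoid this kind of artifact.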

In environmental health studies with large numbers of participants, biosamples are often run in separate batches as they accrue, resulting in a different LOD for each batch. Depending on the order in which the biosamples are assigned to batches, re-calibration of the LOD across batches can substantially influence the conclusions drawn from statistical models. Some of the environmental exposure data collected through PROTECT are subject to multiple batch-specific LODs, so downstream statistical analyses must account for re-calibrated LODs. “Our publication assists PROTECT by providing statistical methods that are customized to the structure of PROTECT’s exposure data so that numerical conclusions are as rigorous as possible,” says author Jonathan Boss. “The new statistical methodology in combination with open-source software will allow PROTECT researchers to have an easily accessible, tailored statistical method to use going forward.”

Moving forward, we hope that these new statistical methods will continue to improve our analyses despite the measurement challenges inherent in environmental health research.