ALGAE Testing Part 5: Exposure Features

An automated protocol for assigning early life exposures to longitudinal cohort studies

Testing Part 6: Accommodating Bad Address Periods

by Kevin Garwood


Background

One of the major challenges in developing the protocol was accommodating bad address periods. These can occur in the exposure time frame of some study members and result in periods where their exposure values are either unknown or of poor quality.

ALGAE now accommodates bad address periods, but their contributions can be hidden when daily exposure records are aggregated to produce life stage exposure results.

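To allow poor exposure values in analysis while still indicating their significance to researchers, ALGAE associates the following data quality day counters with each pollutant:

  • invalid address days
  • out of bounds address days
  • poor address days
  • missing exposure days
  • good address days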

The sum of all these count values equals the duration of the life stage. Researchers can use the totals to create their own data quality thresholds. The tests in this section describe scenarios whose effect on results should be verified by these totals.
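The relationship between the counters and the life stage duration can be sketched in a few lines of Python. This is only an illustration: the `DayCounts` class, its field names and the 0.75 threshold are assumptions made for the example and are not part of ALGAE. The sample values are the T2 counts from the multiple exposure quality test at the end of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DayCounts:
    """Hypothetical container for one pollutant's counters in one life stage."""
    invalid_address_days: int
    out_of_bounds_days: int
    poor_address_days: int
    missing_exposure_days: int
    good_address_days: int

    def total(self) -> int:
        # Sum of all five data quality day counters.
        return (self.invalid_address_days + self.out_of_bounds_days
                + self.poor_address_days + self.missing_exposure_days
                + self.good_address_days)


def counts_match_duration(counts: DayCounts, life_stage_duration: int) -> bool:
    """True when the counters add up to the life stage duration."""
    return counts.total() == life_stage_duration


def meets_threshold(counts: DayCounts, min_good_fraction: float) -> bool:
    """Example of a researcher-defined rule: enough of the days must be good."""
    total = counts.total()
    if total == 0:
        return False
    return counts.good_address_days / total >= min_good_fraction


# T2 counts from the multiple exposure quality test case.
t2 = DayCounts(25, 2, 26, 4, 35)
t2_is_consistent = counts_match_duration(t2, 92)
t2_is_usable = meets_threshold(t2, 0.75)  # 0.75 is an arbitrary example threshold
```

Here the counters add up to 92 days, but only 35 of them are good address days, so a study that required 75% good days would exclude this life stage.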

Coverage

The coverage in this area is a bit complicated. We first test that when bad address periods are fixed, they don't contribute to the data quality rating of the life stage exposures. Next, we introduce two "all or nothing" test cases, which verify that the data quality variables show correct values when the entire exposure time frame is spent either at a good address or at an invalid address.

Next, we check that in the birth address assessment, the data quality measures reflect whether the birth address is a good address or an invalid one. The test cases then check that the same idea applies to the locations occupied on the first days of life stages. As we move through the test cases, we switch between invalid and out of bounds causes of bad address periods.

Testing then moves on from the life stage assessment to the cleaned and uncleaned assessments. Both use exactly the same address history data, but in the uncleaned assessment:

  • days involved with gaps and overlaps are ignored
  • the number of gap and overlap days is also subtracted from the life stage duration.

Tests that verify fixed bad address periods don't affect exposure data quality

Test Case: Fixable bad address period 1

If a bad address period can be fixed, then it should make no contribution to the data quality indicators. Here, the study member has three address periods: two good address periods that sandwich an invalid one. ALGAE should correct the invalid address period, resulting in only good address days.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92   92   76  365
Life Stage Duration            92   92   76  365

Test Case: Fixable bad address period 2

This test case again confirms that a fixable bad address period makes no contribution to the data quality of a life stage exposure. In this example, the bad address period is fixed, and then a gap is filled. Regardless of whether gaps or overlaps are corrected afterwards, the fixed bad address period should not influence data quality for life stage exposures.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92   92   76  365
Life Stage Duration            92   92   76  365

Tests for "all or nothing" scenarios of all good or all bad address time frames

Test Case: No exposure loss

In this test case, a study member lives at one valid address period which has exposure values for every day of her exposure time frame. The results should show that the total number of good address days equals the total days in the life stage.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92   92   76  365
Life Stage Duration            92   92   76  365

Test Case: Total exposure loss

A study member has lived at only one address period, which spans the entire exposure time frame. It has an invalid geocode, so we expect the invalid address days to equal the life stage duration in every life stage.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days           92   92   76  365
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days               0    0    0    0
Life Stage Duration            92   92   76  365

Tests for bad address periods in birth address assessments

Test Case: Birth address assessment spent at good address period

In this scenario about the birth address exposure assessment, the study member occupies a good address period that ends on the day she was born. Afterwards, she lives at an invalid address period. Because the birth address mobility assessment uses the birth address to represent the location for the entire early life analysis, the entire exposure time will appear to be spent at the good address period.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92   92   76  365
Life Stage Duration            92   92   76  365
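The rule this test case exercises, that the address occupied on the birth date stands in for every day of the exposure time frame, can be sketched as follows. This is a minimal illustration and not ALGAE's implementation: `AddressPeriod`, the quality labels, the function names and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AddressPeriod:
    """Hypothetical address period with an inclusive end date."""
    start: date
    end: date
    quality: str  # assumed labels: "good", "invalid", "out_of_bounds"


def birth_address_quality(periods, birth_date):
    """Return the quality of the address period occupied on the birth date."""
    for period in periods:
        if period.start <= birth_date <= period.end:
            return period.quality
    raise ValueError(f"no address period covers {birth_date}")


def birth_assessment_counts(periods, birth_date, stage_durations):
    """Attribute every day of every life stage to the birth address quality."""
    quality = birth_address_quality(periods, birth_date)
    return {stage: {quality: days} for stage, days in stage_durations.items()}


birth = date(2001, 1, 1)  # made-up birth date
periods = [
    # Good address period that ends on the day of birth.
    AddressPeriod(birth - timedelta(days=260), birth, "good"),
    # Invalid address period occupied afterwards.
    AddressPeriod(birth + timedelta(days=1), birth + timedelta(days=365), "invalid"),
]
durations = {"T1": 92, "T2": 92, "T3": 76, "EL": 365}
result = birth_assessment_counts(periods, birth, durations)
```

Under these assumptions every life stage, including EL, is counted entirely as good address days, even though the study member lives at the invalid address for nearly all of her first year.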

Test Case: Birth address assessment spent at invalid address

The study member occupies an invalid address period that starts at conception and ends within the first few days of the EL period. Afterwards, she lives at a good address period. Again, because of the way the birth address assessment works, the results will show that she lived at an invalid address for the entire early life analysis.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days           92   92   76  365
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days               0    0    0    0
Life Stage Duration            92   92   76  365

Tests for bad address periods in life stage assessments

Test Case: Life stage mobility assessment with invalid address

In the life stage mobility assessment, a study member occupies an invalid address period that starts on the first day of T2. She then lives at a good address period that spans the rest of that life stage. Because of the way the life stage assessment works, she will appear to have occupied an invalid address period for all of T2.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0   92    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92    0   76  365
Life Stage Duration            92   92   76  365
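The behaviour tested here, where the address occupied on the first day of a life stage represents the whole stage, can be sketched as below. It is a simplified illustration under stated assumptions: the `AddressPeriod` type, the quality labels, the function names and the dates are hypothetical, and the one-day length of the invalid period is chosen only to make the effect obvious.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AddressPeriod:
    """Hypothetical address period with an inclusive end date."""
    start: date
    end: date
    quality: str  # assumed labels: "good", "invalid", "out_of_bounds"


def quality_on(periods, day):
    """Return the quality of the address period that covers the given day."""
    for period in periods:
        if period.start <= day <= period.end:
            return period.quality
    raise ValueError(f"no address period covers {day}")


def life_stage_assessment(periods, stages):
    """Attribute each whole life stage to the address occupied on its first day."""
    return {
        name: {quality_on(periods, first_day): duration}
        for name, (first_day, duration) in stages.items()
    }


conception = date(2000, 1, 1)  # made-up conception date
t2_start = conception + timedelta(days=92)
stages = {
    "T1": (conception, 92),
    "T2": (t2_start, 92),
    "T3": (conception + timedelta(days=184), 76),
    "EL": (conception + timedelta(days=260), 365),
}
periods = [
    AddressPeriod(conception, t2_start - timedelta(days=1), "good"),
    AddressPeriod(t2_start, t2_start, "invalid"),  # only the first day of T2
    AddressPeriod(t2_start + timedelta(days=1),
                  conception + timedelta(days=624), "good"),
]
result = life_stage_assessment(periods, stages)
```

With these inputs the single invalid day at the start of T2 causes all 92 days of T2 to be counted as invalid address days, while T1, T3 and EL are counted as good.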

Test Case: Life stage mobility assessment with out of bounds address

In the life stage mobility assessment, a study member lives at a good address period until the first day of T2. Afterwards, she occupies an out of bounds address. The assessment should place her at the good address period for all of T2, even though she spent only one of its 92 days there.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0   76  365
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              92   92    0    0
Life Stage Duration            92   92   76  365

Tests for bad address periods in cleaned and uncleaned mobility assessments

In these test cases, we verify that ALGAE's facilities for fixing gaps and overlaps treat an unfixable bad address period like any other record in the residential address history. In each test case, we report expected results for both cleaned and uncleaned assessments. The two main differences are that in the uncleaned assessment:
  • days involved with gaps and overlaps are ignored
  • the total number of days involved with gaps and overlaps is subtracted from the life stage duration.
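The two rules above can be illustrated with a short Python sketch that finds the days lying in gaps or overlaps between consecutive address periods and removes them from a life stage. The function names, the representation of periods as `(start, end)` tuples with inclusive ends, and the dates are assumptions made for this example. They do not describe how ALGAE implements the uncleaned assessment.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def day_range(start, end):
    """All dates from start to end, inclusive."""
    return {start + timedelta(days=i) for i in range((end - start).days + 1)}


def gap_and_overlap_days(periods):
    """Days that fall in a gap or an overlap between consecutive periods."""
    days = set()
    ordered = sorted(periods)
    for (_, end1), (start2, end2) in zip(ordered, ordered[1:]):
        if start2 > end1 + ONE_DAY:   # gap: days covered by neither period
            days |= day_range(end1 + ONE_DAY, start2 - ONE_DAY)
        elif start2 <= end1:          # overlap: days covered by both periods
            days |= day_range(start2, min(end1, end2))
    return days


def uncleaned_duration(stage_start, stage_end, periods):
    """Life stage duration after removing its gap and overlap days."""
    stage_days = day_range(stage_start, stage_end)
    return len(stage_days - gap_and_overlap_days(periods))


stage_start = date(2000, 1, 1)                  # made-up life stage of 92 days
stage_end = stage_start + timedelta(days=91)
periods = [
    (stage_start, stage_start + timedelta(days=41)),              # 42 days
    (stage_start + timedelta(days=52), stage_end),                # 40 days
]
adjusted = uncleaned_duration(stage_start, stage_end, periods)    # 10-day gap removed
```

In this made-up example the 92-day life stage contains a 10-day gap, so the adjusted duration is 82 days.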

As we work through the handling of gaps and overlaps, we sometimes change the exposure data quality type used in the test case.

Test Case: Bad address periods and gaps

In this scenario, there is a gap between a good address period and an invalid address period that follows it.

Expected Results for Cleaned Assessment

Exposure Category              T1   T2   T3   EL
Invalid Address Days           50    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              42   92   76  365
Life Stage Duration            92   92   76  365

Expected Results for Uncleaned Assessment

Exposure Category              T1   T2   T3   EL
Invalid Address Days           40    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0    0    0
Missing Exposure Days           0    0    0    0
Good Address Days              42   92   76  365
Life Stage Duration            92   92   76  365
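The T1 counts in this test case (50 invalid address days when cleaned, 40 when uncleaned, with 42 good address days in both) are consistent with the gap days being attributed to the later, invalid address period. The sketch below assumes that rule, which this page does not state explicitly. The `close_gap` function, the tuple representation of periods and the dates are hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def close_gap(earlier, later):
    """Return `later` with its start moved back to the day after `earlier` ends.

    Periods are (start, end) tuples with inclusive ends. If there is no gap,
    `later` is returned unchanged. This is an assumed cleaning rule.
    """
    _, earlier_end = earlier
    later_start, later_end = later
    expected_start = earlier_end + ONE_DAY
    if later_start > expected_start:
        return (expected_start, later_end)
    return later


def length_in_days(period):
    """Number of days in a period, counting both end points."""
    start, end = period
    return (end - start).days + 1


t1_start = date(2000, 1, 1)                                           # made-up date
good = (t1_start, t1_start + timedelta(days=41))                      # 42 days
invalid = (t1_start + timedelta(days=52), t1_start + timedelta(days=91))  # 40 days

cleaned_invalid = close_gap(good, invalid)
uncleaned_invalid_days = length_in_days(invalid)
cleaned_invalid_days = length_in_days(cleaned_invalid)
```

Under this assumption the invalid address period grows from 40 to 50 days once the 10-day gap is closed, while the good address period keeps its 42 days.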

Test Case: Poor address periods and combinations of gaps and overlaps

In this case, we alter the temporal boundaries of a poor address period twice: once to fix a gap and once to fix an overlap.

Expected Results for Cleaned Assessment

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0   35    0
Missing Exposure Days           0    0    0    0
Good Address Days              42   92   41  365
Life Stage Duration            92   92   76  365

Expected Results for Uncleaned Assessment

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0    0    0    0
Out of Bounds Address Days      0    0    0    0
Poor Address Days               0    0   25    0
Missing Exposure Days           0    0    0    0
Good Address Days              42   92   26  365
Life Stage Duration            92   92   76  365

Testing multiple types of exposure quality in the same life stages

In this test case, a study member has each type of exposure quality in the same life stages. We want to check that ALGAE assesses each exposure quality type independently of the others within a life stage.

Expected Results

Exposure Category              T1   T2   T3   EL
Invalid Address Days            0   25   20   92
Out of Bounds Address Days      0    2    4    6
Poor Address Days               0   26   21   94
Missing Exposure Days           0    4    8   12
Good Address Days              92   35   23  161
Life Stage Duration            92   92   76  365