Testing Part 4: Address History Features
by Kevin Garwood
Background
These features describe the temporal aspects of cleaning address periods. They ensure that study members live at exactly one address for each day of their exposure time frame. This test area mainly focuses on how the protocol handles temporal gaps, overlaps and deletions in a chronologically ordered sequence of address periods.
Tests examine cleaned address variables that appear in the res_early_cleaned_addr result file and the sensitivity variables that appear in the res_sens_variables and res_early_stage_sens result files.
If all the tests in both address history and geocode feature areas pass, then we may assume that all the address periods used in the Exposures test area will be valid.
Coverage
Input Fields Covered by Test Cases
Table | Field |
---|---|
original_geocode_data | geocode |
original_geocode_data | has_valid_geocode |
Test Case Design
This area covers many variables, many of which capture the extent of changes made when cleaning address periods:-
- out_of_bounds_geocodes
- invalid_geocodes
- fixed_geocodes
- total_addr_periods
- over_laps
- gaps
- gap_and_overlap_same_period
- deletions
- imp_blank_start_dates
- imp_blank_end_dates
- imp_blank_both_dates
- imp_last_dates
- days_changed
- has_bad_geocode_within_time_frame
- total_contention_days
Test for a study member who has no address periods
The protocol needs to anticipate errors that may occur when the input data sets are linked. If the original_study_member_data and original_address_history_data files are prepared by separate groups, then it is possible that study members mentioned in one file may not appear in the other.
If a study member has no address periods, should we use NULL or 0 to indicate the total number of address periods they occupied during their exposure time frame? On one hand, we know that the study member must have lived somewhere, so zero would seem incorrect. On the other hand, if the protocol is counting the number of available address periods that cover the exposure time frame, then the answer would be 0.
In this area, if study members have no associated address periods, we will assign zero rather than NULL to variables that count different types of changes.
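The zero-rather-than-NULL decision can be illustrated with a minimal Python sketch. It is not the protocol's own code: the lists below are hypothetical stand-ins for the two linked input files, and `count_address_periods` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical stand-ins for the study member and address history inputs.
study_members = ["p1", "p2", "p3"]
address_periods = [
    ("p1", "2001-01-01", "2001-06-30"),
    ("p1", "2001-07-01", "2001-12-31"),
    ("p3", "2001-01-01", "2001-12-31"),
]


def count_address_periods(members, periods):
    """Return total_addr_periods per study member.

    Members with no address periods get 0 rather than None, mirroring
    the choice of zero over NULL described above.
    """
    counts = Counter(person_id for person_id, _start, _end in periods)
    # A Counter yields 0 for keys it has never seen.
    return {person_id: counts[person_id] for person_id in members}


totals = count_address_periods(study_members, address_periods)
```

Here the study member `p2` appears in one input but not the other, so the count is 0 and a test can compare it with an integer without special handling for missing values.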
Test for a study member who has one address period
If a study member experiences no address changes, then many fields will have predictable values:-
- fixed_geocodes = 0
- total_addr_periods = 1
- over_laps = 0
- gaps = 0
- gap_and_overlap_same_period = 0
- deletions = 0
- imp_blank_start_dates = 0
- imp_blank_end_dates = 0
- imp_blank_both_dates = 0
- imp_last_dates = 0
- days_changed = 0
- total_contention_days = 0
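A test for this case could compare a result row against the predictable values listed above. The sketch below assumes a result row is available as a plain dictionary keyed by field name; `find_mismatches` is a hypothetical helper, not part of the protocol.

```python
# Expected values for a study member with exactly one address period,
# copied from the list above.
EXPECTED_SINGLE_PERIOD = {
    "fixed_geocodes": 0,
    "total_addr_periods": 1,
    "over_laps": 0,
    "gaps": 0,
    "gap_and_overlap_same_period": 0,
    "deletions": 0,
    "imp_blank_start_dates": 0,
    "imp_blank_end_dates": 0,
    "imp_blank_both_dates": 0,
    "imp_last_dates": 0,
    "days_changed": 0,
    "total_contention_days": 0,
}


def find_mismatches(actual_row, expected=EXPECTED_SINGLE_PERIOD):
    """Return {field: (expected, actual)} for every field that differs."""
    return {
        field: (want, actual_row.get(field))
        for field, want in expected.items()
        if actual_row.get(field) != want
    }


# Hypothetical result row in which a gap was wrongly reported.
bad_row = dict(EXPECTED_SINGLE_PERIOD, gaps=1)
mismatches = find_mismatches(bad_row)
```

Reporting every differing field at once, rather than stopping at the first failure, makes it easier to see which cleaning step misbehaved.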
Test for a study member who has two or more address periods
Once study members have at least two address periods, they are able to have gaps, overlaps, deletions, contention days, and fixed bad geocodes.
Test exhaustively for gaps and overlaps using two successive address periods that are 1, 2 and 3 days in length
One of the most important aspects of the protocol is that it is able to create a temporally continuous address history that spans the entire exposure time frame of the study member. Fixing gaps and overlaps that may exist between successive address periods is a critical part of that process. Study members are unlikely to have address periods that are so short. However, periods of 1, 2 and 3 days capture many of the edge cases, and we felt that if the protocol could handle them, it could handle combinations of address periods of arbitrary length.
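The combinations described above can be enumerated mechanically. The following sketch only generates and labels pairs of successive periods; it does not reproduce how the protocol repairs them. It assumes that start and end dates are inclusive, and the names `classify_boundary`, `generate_pairs` and the `shifts` parameter are illustrative.

```python
from datetime import date, timedelta
from itertools import product


def classify_boundary(prev_end, next_start):
    """Label the join between two successive periods (inclusive dates)."""
    delta = (next_start - prev_end).days
    if delta > 1:
        return "gap"          # at least one uncovered day between the periods
    if delta < 1:
        return "overlap"      # at least one day claimed by both periods
    return "contiguous"       # second period starts the day after the first ends


def generate_pairs(base=date(2000, 1, 1), lengths=(1, 2, 3), shifts=(-1, 0, 1)):
    """Yield (first, second, label) for each length pair and boundary shift."""
    for len_a, len_b, shift in product(lengths, lengths, shifts):
        a_start = base
        a_end = a_start + timedelta(days=len_a - 1)
        # A shift of 0 places the second period on the day after the first ends.
        b_start = a_end + timedelta(days=1 + shift)
        b_end = b_start + timedelta(days=len_b - 1)
        yield (a_start, a_end), (b_start, b_end), classify_boundary(a_end, b_start)


cases = list(generate_pairs())
```

With three lengths for each period and three boundary shifts, this yields 27 pairs, split evenly between gaps, overlaps and contiguous joins. Larger shifts could be added to `shifts` to cover multi-day gaps and overlaps.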
Include a test case where the same address period is involved in cleaning both a gap and an overlap
It is possible that the temporal boundaries of an address period could be changed twice: once in response to fixing a gap and once in response to fixing an overlap. The diagram below illustrates this case: