ALGAE Protocol: Terminology

An automated protocol for assigning early life exposures to longitudinal cohort studies

Terminology

This section defines concepts that are used in the ALGAE protocol.

by Kevin Garwood

Term Definition
Address period Describes the period where a study member was at a given residential address. An address period comprises a study_id, a geocode, a start date and an end date.
Assigned geocode The location which the ALGAE data cleaning algorithm assigns to a study member for a given day. The location is determined by how the software adjusts the start and end dates of address periods to fix any temporal gap and overlap errors.
Birth Address Assessment A method of assessing exposures that is only used in the early life analysis. The assessment method has the following characteristics:
  1. is based on the cleaned temporal boundaries of residential address periods
  2. uses the location study members occupied at birth as their location for their entire early life exposure period (e.g. T1.start_date to EL.end_date inclusive). Any other moves they may have made during T1, T2, T3 or EL are ignored.
  3. is not accompanied by exposure measurement error values.
Cleaned Mobility Assessment A method of assessing exposures which:
  1. is based on the cleaned temporal boundaries of residential address periods
  2. considers the contributions of exposures from all address periods which overlap a study member's exposure period
  3. is accompanied by exposure measurement error assessments for days that were fixed through data cleaning efforts (See Calculations and Algorithms for information about how errors are assessed).
Conception date The date when life begins. By default, the protocol defines conception date to be: birth date - (7 x gestation age at birth in weeks). See Life Stage Calculations.
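
The default conception date rule can be illustrated with a minimal Python sketch. The function name conception_date and its parameter names are assumptions made for this example; they are not identifiers from the ALGAE code base.

```python
from datetime import date, timedelta


def conception_date(birth_date: date, gestation_age_weeks: int) -> date:
    """Default rule from the definition above:
    birth date - (7 x gestation age at birth in weeks), counted in days.
    Hypothetical helper, not part of ALGAE itself."""
    return birth_date - timedelta(days=7 * gestation_age_weeks)


# A study member born on 1 March 2000 after 38 weeks of gestation
# is assigned a conception date 266 days earlier.
example = conception_date(date(2000, 3, 1), 38)
print(example)  # 1999-06-09
```
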
Contention day A day where a study member could be placed at more than one address. A contention day occurs when a temporal gap or overlap exists between successive address periods.
EL The "Early Life" stage of the early life analysis. By default, it has a time frame that is defined as [birth_date, birth_date + 1 year - 1 day]. See Life Stage Calculations.
Gestation age at birth The estimated age of a foetus at birth, measured in weeks. This is often assessed from foetal scans, from the estimated date of the pregnant mother's last menstrual period, or both.
Good address day One of the categories used to assess the quality of a daily exposure value. Within the context of a given pollutant, a daily exposure value is considered a good address day if
  1. its geocode is valid
  2. there is a non-empty exposure value for that day.
Invalid address day One of the categories used to assess the quality of a daily exposure value. Within the context of a given pollutant, a daily exposure value is considered an invalid address day if
  1. its geocode is invalid
  2. there are no exposure values for that pollutant
Invalid geocode A geocode is invalid if it satisfies either of the following criteria:
  1. It has a blank value. This means that you should see a blank value in the geocode column of the original_address_history_data table.
  2. It has a non-blank value, but its entry in the original_geocode_data table has an "N" for the field has_valid_geocode.

The second criterion would suggest that your geocoding software tried its best to give a match for a poorly specified residential address, but the quality of the guess was still unacceptably low.

Life Stage Mobility Assessment A method of assessing exposures which:
  1. is based on the cleaned temporal boundaries of residential address periods
  2. uses the location study members occupied on the first day of each life stage as their location for the entirety of that life stage. Therefore moves made during the life stage are ignored for exposure assessment.
  3. is not accompanied by exposure measurement error values.
Missing exposure day One of the categories used to assess the quality of a daily exposure value. Within the context of a given pollutant, a daily exposure value is considered a missing exposure day if:
  1. its geocode is valid
  2. it is associated with exposure values for some days for that pollutant
  3. for the given day, there is no exposure value available.
NAME High altitude pollution that comes from outside the exposure area.
NOX_rd Nitrogen oxide pollution from roads.
Opportunity geocode The alternative location that a study member could have occupied had data cleaning not been applied to the address periods.

Consider two successive address periods a_n and a_{n+1}. If a gap exists between them, then the assigned geocode is that of a_{n+1}, because ALGAE will include the gap days in a_{n+1}. However, if cleaning had not happened, then the opportunity geocode would have been that of a_n.

If an overlap exists between them, then the assigned geocode is again that of a_{n+1}, because ALGAE favours preserving the start date of a_{n+1} at the expense of shrinking the end date coverage of a_n. In this case, a_n again supplies the opportunity geocode.
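
The gap and overlap cases above can be sketched as follows. This is an illustrative Python sketch, not the ALGAE implementation: the AddressPeriod class and the contention_days function are hypothetical names, end dates are treated as inclusive, and the two periods are assumed to be ordered by start date with the later one not nested inside the earlier one.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Tuple


@dataclass
class AddressPeriod:
    geocode: str
    start_date: date
    end_date: date  # treated as inclusive in this sketch


def contention_days(a_n: AddressPeriod,
                    a_next: AddressPeriod) -> List[Tuple[date, str, str]]:
    """Return (day, assigned_geocode, opportunity_geocode) for every
    contention day between two successive address periods."""
    one_day = timedelta(days=1)
    if a_next.start_date > a_n.end_date + one_day:
        # Gap: the days strictly between the two periods.
        first, last = a_n.end_date + one_day, a_next.start_date - one_day
    elif a_next.start_date <= a_n.end_date:
        # Overlap: the days covered by both periods.
        first, last = a_next.start_date, a_n.end_date
    else:
        # The periods are contiguous, so there is nothing to fix.
        return []

    result = []
    day = first
    while day <= last:
        # In both cases the later period supplies the assigned geocode
        # and the earlier period supplies the opportunity geocode.
        result.append((day, a_next.geocode, a_n.geocode))
        day += one_day
    return result
```

For example, a period ending on 10 January followed by one starting on 14 January yields three contention days (11 to 13 January), each assigned to the later address.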

Out of bounds day One of the categories used to assess the quality of a daily exposure value. Within the context of a given pollutant, a daily exposure value is considered an out of bounds day if
  1. its geocode is valid
  2. there are no exposure values for that pollutant
Out-of-bounds geocode A geocode is out-of-bounds if it satisfies both of the following criteria:
  1. It has a valid geocode. This means the geocode has a non-blank value and that its entry in the original_geocode_data table has a "Y" value for has_valid_geocode
  2. It is not associated with any exposure values. This means that the geocode, while valid, should not appear anywhere in the original_exposure_data table.
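
The invalid and out-of-bounds criteria can be combined into one check, sketched below in Python. The function classify_geocode, its parameters, and the "valid" label for a geocode that passes both checks are assumptions made for illustration. The has_valid_geocode flag is assumed to arrive as the "Y"/"N" string described above, and geocodes_with_exposures stands in for the set of geocodes that appear in original_exposure_data.

```python
from typing import Optional, Set


def classify_geocode(geocode: Optional[str],
                     has_valid_geocode: Optional[str],
                     geocodes_with_exposures: Set[str]) -> str:
    """Classify one geocode as 'invalid', 'out-of-bounds' or 'valid'.
    Hypothetical helper, not part of ALGAE itself."""
    # Invalid, criterion 1: the geocode is blank.
    if geocode is None or geocode.strip() == "":
        return "invalid"
    # Invalid, criterion 2: non-blank, but flagged "N" by the geocoding step.
    if has_valid_geocode == "N":
        return "invalid"
    # Out-of-bounds: valid, but absent from the exposure data.
    if geocode not in geocodes_with_exposures:
        return "out-of-bounds"
    return "valid"
```
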
PM10_gr PM10 particulate matter coming from sources other than roads.
PM10_rd PM10 particulate matter coming from road sources.
PM10_tot PM10 particulate matter coming from all sources.
Poor address day One of the categories used to assess the quality of a daily exposure value. Within the context of a given pollutant, a daily exposure value is considered a poor address day if:
  1. its geocode is invalid
  2. there is a non-empty exposure value for that day.
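
The five daily quality categories (good address day, invalid address day, missing exposure day, out of bounds day and poor address day) partition on geocode validity and exposure availability. The Python sketch below shows one way to express that partition for a single pollutant. The function classify_day and its parameters are hypothetical. It also assumes that a day with an invalid geocode and no exposure value for that day counts as an invalid address day, which is one reading of the definition.

```python
from typing import Optional


def classify_day(geocode_is_valid: bool,
                 geocode_has_any_exposure: bool,
                 exposure_value: Optional[float]) -> str:
    """Assign a quality category to one daily exposure value for one
    pollutant. Hypothetical helper, not part of ALGAE itself."""
    has_value_today = exposure_value is not None
    if geocode_is_valid:
        if has_value_today:
            return "good address day"
        if geocode_has_any_exposure:
            # Values exist for this geocode on other days, but not today.
            return "missing exposure day"
        # A valid geocode that never appears in the exposure data.
        return "out of bounds day"
    if has_value_today:
        return "poor address day"
    # Assumption: invalid geocode and no value for this day.
    return "invalid address day"
```
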
T1 Trimester 1 of pregnancy. By default, the early life analysis defines the time frame for Trimester 1 as being [conception date, conception date + 92 days]. See Life Stage Calculations.
T2 Trimester 2 of pregnancy. By default, the early life analysis defines the time frame for Trimester 2 as being [conception date + 93 days, conception date + 183 days]. See Life Stage Calculations.
T3 Trimester 3 of pregnancy. By default, the early life analysis defines the time frame for Trimester 3 as being [conception date + 184 days, birth date - 1 day]. Remember that for premature babies, T3 can be shortened and in very premature cases it can be missing entirely from the exposure results. See Life Stage Calculations.
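
The default life stage boundaries for T1, T2, T3 and EL can be sketched in Python as shown below. The names default_life_stages and add_one_year are hypothetical. The sketch assumes that T3 ends on the day before birth, so that it does not overlap EL, which starts on the birth date. It also assumes that a 29 February birth date rolls to 28 February when one year is added. It returns None for T3 when a premature birth leaves no days for that stage, and it does not attempt to truncate T1 or T2.

```python
from datetime import date, timedelta
from typing import Dict, Optional, Tuple

Interval = Tuple[date, date]  # inclusive start and end dates


def add_one_year(d: date) -> date:
    """Add one calendar year to a date."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # 29 February has no counterpart next year; assume 28 February.
        return d.replace(year=d.year + 1, day=28)


def default_life_stages(conception: date,
                        birth: date) -> Dict[str, Optional[Interval]]:
    """Default stage boundaries as described in the T1, T2, T3 and EL
    entries. Hypothetical helper, not part of ALGAE itself."""
    day = timedelta(days=1)
    t3_start = conception + 184 * day
    t3_end = birth - day  # assumption: T3 stops the day before birth
    return {
        "T1": (conception, conception + 92 * day),
        "T2": (conception + 93 * day, conception + 183 * day),
        # T3 is missing entirely for very premature births.
        "T3": (t3_start, t3_end) if t3_start <= t3_end else None,
        "EL": (birth, add_one_year(birth) - day),
    }
```
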
Uncleaned Mobility Assessment A method of assessing exposures which:
  1. is based on the cleaned temporal boundaries of residential address periods
  2. considers the contributions of exposures from all address periods which overlap a study member's exposure period
  3. excludes contention days from exposure calculations
  4. is not accompanied by exposure measurement error values