Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Age for buddy-matching #240

Open
MPagel opened this issue Dec 11, 2024 · 1 comment
Open

Age for buddy-matching #240

MPagel opened this issue Dec 11, 2024 · 1 comment

Comments

@MPagel
Copy link
Contributor

MPagel commented Dec 11, 2024

Currently the age of patients and their matched control is determined from the date-of-query and then binned into 5 year wide groupings.

However, this allows for a somewhat more non-deterministic pairing of patients and their matched negative controls than is likely intended, as people will age into a new age bracket at a different time than their negative controls. Buddies that are produced one year for a patient may not be eligible buddies in the subsequent year (and conversely, a new set of buddies may be made available).

Suggestion: use a fixed date as a reference for age determination for buddy matching. Study start date or projected study end date would be "logical" reference dates.

@MPagel
Copy link
Contributor Author

MPagel commented Dec 11, 2024

If similar code is ported to a phenotype selection for future projects, I would personally find

        -- FLOOR shouldn't be needed in the CASE statement below, as integer divided by integer = truncated/floored integer. But inclusion of FLOOR makes intent more clear
        CASE FLOOR(datediff(year, d.birth_datetime, @arbitraryDate) / 5) 
		WHEN  0 THEN '0-4'
		WHEN  1 THEN '5-9'
		WHEN  2 THEN '10-14'
		WHEN  3 THEN '15-19'
		WHEN  4 THEN '20-24'
		WHEN  5 THEN '25-29'
		WHEN  6 THEN '30-34'
		WHEN  7 THEN '35-39'
		WHEN  8 THEN '40-44'
		WHEN  9 THEN '45-49'
		WHEN 10 THEN '50-54'
		WHEN 11 THEN '55-59'
		WHEN 12 THEN '60-64'
		WHEN 13 THEN '65-69'
		WHEN 14 THEN '70-74'
		WHEN 15 THEN '75-79'
		WHEN 16 THEN '80-84'
		WHEN 17 THEN '85-89'
		ELSE
			CASE SIGN(FLOOR(datediff(year, d.birth_datetime, @arbitraryDate)))
				WHEN 1 THEN '90+'
				ELSE NULL -- or 'not yet born'
			END
	END AS pt_age

cleaner, more legible and potentially less computationally intense as compared with

,CASE
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 0
AND 4
THEN '0-4'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 5
AND 9
THEN '5-9'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 10
AND 14
THEN '10-14'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 15
AND 19
THEN '15-19'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 20
AND 24
THEN '20-24'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 25
AND 29
THEN '25-29'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 30
AND 34
THEN '30-34'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 35
AND 39
THEN '35-39'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 40
AND 44
THEN '40-44'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 45
AND 49
THEN '45-49'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 50
AND 54
THEN '50-54'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 55
AND 59
THEN '55-59'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 60
AND 64
THEN '60-64'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 65
AND 69
THEN '65-69'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 70
AND 74
THEN '70-74'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 75
AND 79
THEN '75-79'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 80
AND 84
THEN '80-84'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) BETWEEN 85
AND 89
THEN '85-89'
WHEN datediff(year, d.birth_datetime, CURRENT_DATE) >= 90
THEN '90+'
END AS pt_age

it would be further "optimized" by eliminating the text labels entirely and just collapsing division result greater than 18 into 18

...but I acknowledge that the current syntax gets the job done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant