Statistics for MPH: 7

Statistics for MPH: 7
3 November 2011
www.biostat.ku.dk/~pka/mph11
Attributable risk, sample size determination (Silva: 333-365, 381-383)
Per Kragh Andersen

From the 6th week of statistics teaching, I should:
1. understand that the parameters estimated by logistic regression can be interpreted as odds ratios (or, respectively, ln(odds ratio)),
2. understand that for a categorical explanatory variable these ORs are calculated relative to a chosen reference category,
3. understand that for a quantitative explanatory variable these ORs describe how much the odds increase when the explanatory variable increases by 1 unit,
4. understand that ORs from models with several explanatory variables are mutually adjusted,
5. understand that when several explanatory variables are in play, there are many possible ways of choosing the model.

From the 6th week of statistics teaching, I do not, however, necessarily need to:
1. have understood how the logistic regression program obtains the estimated ORs and their SD/confidence interval from the data set and the model,
2. have understood what the precise assumptions of the models are and how they are checked,
3. have understood how interaction/effect modification is handled using logistic regression.

Attributable risks, AR (excess fractions)
Example: Lung cancer
Exposure A (cigarette smoking): $RR_A = 10$
Exposure B (uranium mining): $RR_B = 20$
Which exposure has the greatest public health impact?
Suppose that
$Q_A = 40\%$ of the population smokes,
$Q_B = 0.04\%$ of the population mines uranium.
Attributable risks are measures which combine relative risk and exposure prevalence.
Two types of AR (or excess fractions) (Silva, pp. 97-99, 356-62, 381-83):
1. AR among exposed (Silva: excess fraction %)
2. AR in the total population (Silva: population excess fraction %)

[Figure: the population divided into an exposed group (A cases, B non-cases) and a non-exposed group (C cases, D non-cases).]

Notation
$T = A + B + C + D$ (total population size)
$Q = \frac{A+B}{T}$ (proportion exposed = "exposure prevalence")
$P_0 = \frac{C}{C+D}$ (risk among non-exposed)
$P_e = \frac{A}{A+B}$ (risk among exposed)
$RR = \frac{P_e}{P_0}$ (relative risk)
$P_T = \frac{A+C}{T} = Q\,P_e + (1-Q)\,P_0$ (risk in total population)
(because $Q\,P_e + (1-Q)\,P_0 = \frac{A+B}{T}\cdot\frac{A}{A+B} + \frac{C+D}{T}\cdot\frac{C}{C+D} = \frac{A+C}{T}$)

AR among exposed
For some of those in the exposed group, disease occurrence will not be "due to exposure":
$P_e = \text{risk due to exposure} + P_0$ (if $P_e \ge P_0$, i.e., if $RR \ge 1$).
AR (among exposed) = proportion of the risk among exposed which is due to exposure:
$AR = \frac{P_e - P_0}{P_e} = \frac{RR - 1}{RR}$.
That is, the number of cases among exposed which is due to exposure is:
total number of cases among exposed $\times$ AR (among exposed) $= T\,Q\,P_e \cdot \frac{RR - 1}{RR}$,
where $T\,Q\,P_e$ = total number of cases among exposed.

AR in total population (PAR) (the more important concept of the two)
$PAR = \frac{\text{number of cases due to exposure}}{\text{total number of cases}} = \frac{T\,Q\,P_e \left(\frac{RR-1}{RR}\right)}{T\,Q\,P_e + T\,(1-Q)\,P_0}$
(divide numerator and denominator by $T P_0$)
$= \frac{Q\,(RR-1)}{Q\,RR + (1-Q)} = \frac{Q\,(RR-1)}{1 + Q\,(RR-1)}$
Alternative formula: $PAR = \frac{P_T - P_0}{P_T}$.
Estimation: estimate $Q$ by $q = \frac{a+b}{n}$, estimate $RR$ (or use $OR$).
Confidence limits: exist.
See Table 16.2 in Silva, p. 361: PAR for combinations of $Q$ and $RR$.
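As an illustrative sketch (not part of the original slides), the two formulas can be applied to the opening lung cancer example, taking the stated values $RR_A = 10$, $Q_A = 40\%$ and $RR_B = 20$, $Q_B = 0.04\%$ as given. The function names `ar_exposed` and `par` are hypothetical.

```python
def ar_exposed(rr: float) -> float:
    """AR among exposed: (RR - 1) / RR."""
    return (rr - 1.0) / rr


def par(q: float, rr: float) -> float:
    """Population attributable risk: Q(RR - 1) / (1 + Q(RR - 1))."""
    excess = q * (rr - 1.0)
    return excess / (1.0 + excess)


# Values from the lung cancer example (prevalences written as proportions).
par_smoking = par(q=0.40, rr=10.0)    # exposure A
par_uranium = par(q=0.0004, rr=20.0)  # exposure B

print(f"AR among exposed: smoking {ar_exposed(10.0):.2f}, "
      f"uranium mining {ar_exposed(20.0):.2f}")
print(f"PAR: smoking {par_smoking:.3f}, uranium mining {par_uranium:.4f}")
```

With these inputs the PAR for smoking is roughly 0.78, against less than 0.01 for uranium mining, even though uranium mining has the larger relative risk.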


Current cigarette smoking and lung cancer mortality among US veterans.

| Smoking           | Events | No-events | Total   |
|-------------------|--------|-----------|---------|
| Current cigarette | 1116   | 700652    | 701768  |
| All others        | 426    | 1015573   | 1015999 |
| Total             | 1542   | 1716225   | 1717767 |

Solution:
Prevalence of smoking: $q = \frac{701768}{1717767} = 0.41$
$AR = \frac{p_e - p_0}{p_e} = \frac{\frac{1116}{701768} - \frac{426}{1015999}}{\frac{1116}{701768}} = 0.736$
$RR = \frac{p_e}{p_0} = 3.79$
$PAR = \frac{0.41\,(3.79 - 1)}{1 + 0.41\,(3.79 - 1)} = 0.53$
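The arithmetic of this solution can be checked with a short script. This is a sketch only: the function name `attributable_risks` and the argument names `a`, `b`, `c`, `d` (exposed cases, exposed non-cases, non-exposed cases, non-exposed non-cases) are chosen here for illustration, following the notation slide.

```python
def attributable_risks(a: int, b: int, c: int, d: int) -> dict:
    """Compute q, RR, AR among exposed and PAR from a 2x2 table.

    a, b: cases and non-cases among exposed
    c, d: cases and non-cases among non-exposed
    """
    n = a + b + c + d
    q = (a + b) / n                # exposure prevalence
    p_e = a / (a + b)              # risk among exposed
    p_0 = c / (c + d)              # risk among non-exposed
    p_t = (a + c) / n              # risk in the total population
    rr = p_e / p_0
    return {
        "q": q,
        "rr": rr,
        "ar": (p_e - p_0) / p_e,
        "par": q * (rr - 1) / (1 + q * (rr - 1)),
        "par_alt": (p_t - p_0) / p_t,   # alternative formula
    }


# US veterans data: current cigarette smokers vs. all others.
result = attributable_risks(a=1116, b=700652, c=426, d=1015573)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The two PAR formulas agree, and without intermediate rounding the values match the slide's 0.41, 3.79, 0.736 and 0.53 to the digits shown.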

Sample size determination
When planning investigations: How many persons are needed? For what purpose?
(1) To obtain a given precision of an estimate: Silva, Section 15.3.
(2) To obtain a given power of a test (the most common situation): Silva, Section 15.2.
(1) is rarely used in practice and will be skipped here. For (2), we use a slightly different approach than in Silva's book, but one leading to the same results.

Example: Testing; power
We study pregnant women with pre-eclampsia and wish to compare two treatments with respect to the risk of some pregnancy outcome, e.g. preterm birth.
We want to be "pretty certain" to detect a treatment (exposure) effect of $D$ (a risk difference). What do we mean by "pretty certain"?
We need the statistical concept of the power of a test.
If we test using a given level of significance $\alpha$ (e.g. 5%) and if the true treatment difference is $D$, then we want to have a large probability of rejecting the null hypothesis $D = 0$.
This probability, $1 - \beta$, is the power, often set to at least 80%.
Note: $\beta$ is called the Type 2 error risk and $\alpha$ is called the Type 1 error risk.

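|               | Accept $H_0$            | Reject $H_0$             |
|---------------|-------------------------|--------------------------|
| $H_0$ correct | correct decision        | Type 1 error ($\alpha$)  |
| $H_0$ wrong   | Type 2 error ($\beta$)  | power ($1 - \beta$)      |

In general: the larger the power we want and the smaller the $\alpha$ we use, the larger $n$ needs to be. The smaller $D$ is, the larger $n$ needs to be.
To find $n$, a good guess of the risk in the control group ($p_1$) is needed. Letting $p_2 = p_1 - D$, then
$n = \frac{p_1(1 - p_1) + p_2(1 - p_2)}{D^2} \cdot f(\alpha, \beta)$
is the number of women needed in each group. (Only the size of the difference matters here, since $D$ enters as $D^2$.)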

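Here, $f(\alpha, \beta)$ is given by:

| $\beta$ | $\alpha = 0.01$ | $\alpha = 0.05$ | $\alpha = 0.10$ |
|---------|-----------------|-----------------|-----------------|
| 0.05    | 17.8            | 13.0            | 10.8            |
| 0.10    | 14.9            | 10.5            | 8.6             |
| 0.15    | 13.0            | 9.0             | 7.2             |
| 0.20    | 11.7            | 7.9             | 6.2             |
| 0.25    | 10.6            | 6.9             | 5.4             |

Example: $p_1 = 0.15$, $D = 0.07$, $\alpha = 0.05$, $\beta = 0.20$.
Then, in each group we need:
$n = \frac{0.15 \cdot 0.85 + 0.08 \cdot 0.92}{0.07^2} \cdot 7.9 = 324$.
Example: $p_1 = 0.1$, $RR = 1.5$, $\alpha = 0.05$, $\beta = 0.20$.
Then $p_2 = p_1 \cdot RR = 0.15$, $D = 0.05$ and
$n = \frac{0.1 \cdot 0.9 + 0.15 \cdot 0.85}{0.05^2} \cdot 7.9 = 687$.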
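The sample size formula is easy to script. The sketch below makes one assumption that the slides do not state: that the tabulated factor is $f(\alpha, \beta) = (z_{1-\alpha/2} + z_{1-\beta})^2$, with $z$ the standard normal quantile. This expression reproduces every table entry to within about 0.05, but it is offered as a plausible reading, not as the author's definition. The function names `f_factor` and `sample_size` are hypothetical.

```python
from statistics import NormalDist


def f_factor(alpha: float, beta: float) -> float:
    """ASSUMED form of f(alpha, beta): (z_{1-alpha/2} + z_{1-beta})^2."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2


def sample_size(p1: float, p2: float, f_value: float) -> float:
    """Number needed in each group for risks p1 and p2 and a given f value."""
    d = p1 - p2  # only d**2 is used, so the sign does not matter
    return (p1 * (1 - p1) + p2 * (1 - p2)) / d ** 2 * f_value


# The two examples from the slide, using the tabulated value f = 7.9.
n_example_1 = sample_size(p1=0.15, p2=0.08, f_value=7.9)
n_example_2 = sample_size(p1=0.10, p2=0.15, f_value=7.9)
print(round(n_example_1), round(n_example_2))

# The same examples with the assumed closed form instead of the table.
f_exact = f_factor(alpha=0.05, beta=0.20)
print(f"{f_exact:.3f}", round(sample_size(0.15, 0.08, f_exact)))
```

With the unrounded factor the first example gives a few persons fewer per group than the slide's 324, because the table rounds $f$ to one decimal.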

Finding the power based on the sample size
Sometimes the maximally obtainable sample size is given, and we wish to assess how large the power is for some given value of the treatment difference $D$. The relationship is still given by:
$n = \frac{p_1(1 - p_1) + p_2(1 - p_2)}{D^2} \cdot f(\alpha, \beta)$.
E.g. $n = 500$ in each group and $p_1 = 0.05$, $D = 0.05$ (i.e., $p_2 = 0.1$) gives
$500 = \frac{0.05 \cdot 0.95 + 0.1 \cdot 0.9}{0.05^2} \cdot f(\alpha, \beta)$,
or $f(\alpha, \beta) = 500/55 = 9.09$, or $\beta \approx 0.15$ if $\alpha = 0.05$ (because the number in the $\alpha = 0.05$ column of the table closest to 9.09 is 9.0, corresponding to $\beta = 0.15$). That is, the power is 0.85.
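Instead of looking up the closest table entry, the power can be computed directly. The sketch below assumes, as a reading not stated on the slides, that $f(\alpha, \beta) = (z_{1-\alpha/2} + z_{1-\beta})^2$ with $z$ the standard normal quantile, and solves this for $1 - \beta$. The function name `power_for_n` is hypothetical.

```python
from statistics import NormalDist


def power_for_n(n: float, p1: float, p2: float, alpha: float = 0.05) -> float:
    """Approximate power for n persons per group.

    ASSUMPTION: f(alpha, beta) = (z_{1-alpha/2} + z_{1-beta})^2, so that
    z_{1-beta} = sqrt(f) - z_{1-alpha/2} and power = Phi(z_{1-beta}).
    """
    normal = NormalDist()
    # Solve n = (p1(1-p1) + p2(1-p2)) / D^2 * f for f.
    f_value = n * (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))
    z_power = f_value ** 0.5 - normal.inv_cdf(1 - alpha / 2)
    return normal.cdf(z_power)


# The example from the slide: 500 per group, p1 = 0.05, p2 = 0.10.
power = power_for_n(n=500, p1=0.05, p2=0.10, alpha=0.05)
print(f"power = {power:.2f}")
```

Under this assumption the computed power agrees with the table lookup of about 0.85.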

Unequal group sizes
If the two groups do not have the same size:
first compute the total size $N = 2n$ as if the two groups were equally large,
then compute $k = n_1/n_2$ = the ratio between the group sizes.
The total number needed is then $N' = N \cdot \frac{(1+k)^2}{4k}$.
Example. If, in the first example above, group 1 is twice as big as group 2:
$N = 2 \cdot 324 = 648$
$k = 2$
$N' = N \cdot \frac{(1+k)^2}{4k} = 648 \cdot \frac{9}{8} = 729$,
i.e. $n_1 = 486$, $n_2 = 243$.
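The adjustment for unequal group sizes can be written as a small helper. This is a minimal sketch of the rule above; the function name `unequal_group_sizes` and its return layout are chosen here for illustration.

```python
def unequal_group_sizes(n_equal: float, k: float) -> tuple:
    """Adjust an equal-allocation sample size for allocation ratio k = n1/n2.

    n_equal: number needed in each group under equal allocation.
    Returns (total, n1, n2) under the unequal allocation.
    """
    total_equal = 2 * n_equal                       # N = 2n
    total = total_equal * (1 + k) ** 2 / (4 * k)    # N' = N (1+k)^2 / (4k)
    n2 = total / (1 + k)
    n1 = k * n2
    return total, n1, n2


# First sample size example (n = 324 per group), group 1 twice as big.
total, n1, n2 = unequal_group_sizes(n_equal=324, k=2)
print(total, n1, n2)
```

For $k = 1$ the factor $(1+k)^2/(4k)$ equals 1, so the helper returns the equal-allocation total unchanged; any other ratio increases the total number needed.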

From the 7th week of statistics teaching, I should:
1. understand how the population attributable risk
$PAR = \frac{Q\,(RR - 1)}{1 + Q\,(RR - 1)}$
combines the frequency, $Q$, of a risk factor and its effect, $RR$, into a measure relevant to public health of how large a proportion of an observed number of disease cases can be attributed to the risk factor,
2. be able to assess how large a sample is needed to detect a given difference between two frequencies with a given power.