Statistics
Per Bruun Brockhoff, Professor of Statistics, DTU Informatics
10 March 2009
http://www.imm.dtu.dk/~pbb

1. What is statistics? Medicine, social science, engineering, etc.
2. Humans as measurement instruments: introduction to (course) statistics

What is statistics?
  Danmarks Statistik (Statistics Denmark): http://www.dst.dk/
  Danish school grades: http://www.karakter.dk/
  NetStat: http://statmaster.sdu.dk/netstat/
  Misguided interpretations: http://statmaster.sdu.dk/netstat/kap_5/afsnit5_5_2.htm
  "The Øresund Bridge has not removed the eels from Øresund." IMM news: http://www.imm.dtu.dk/nyheder/nyheder_imm.aspx?guid=%7b4467b8e4-b670-4355-9eb8-815203c4f9d2%7d
  "Øresund Bridge cleared of disturbing fish." DR News, 24 March 2007, 17:32: http://www.dr.dk/nyheder/udland/2007/03/24/172642.htm?wbc_purpose=update
  "Research on mobile phone radiation is shoddy work!" Ingeniøren, 8 January 2007: http://ing.dk/artikel/76295-eksperter-dansk-forskning-i-mobilstraaling-er-makvaerk
  Mobile telephony and cancer. Abstract of the Danish article in JNCI Journal of the National Cancer Institute: http://jnci.oxfordjournals.org/cgi/content/abstract/98/23/1707
  "More children with birth defects in Denmark." DR News, 14 September 2007, 10:38, domestic news: http://www.dr.dk/nyheder/indland/2007/09/14/102344.htm?wbc_purpose=update+-+81k+-
  Lotto error! "Lotto error reduced online players' chance of winning the jackpot": http://ing.dk/artikel/81974

Statistics: one of the 11 most important contributions to medicine in 1000 years!
http://content.nejm.org/cgi/content/extract/342/1/42

The 11 most important contributions to medical research in the past millennium
  Elucidation of Human Anatomy and Physiology
  Discovery of Cells and Their Substructures
  Elucidation of the Chemistry of Life
  Application of Statistics to Medicine
  Development of Anesthesia
  Discovery of the Relation of Microbes to Disease
  Elucidation of Inheritance and Genetics
  Knowledge of the Immune System
  Development of Body Imaging
  Discovery of Antimicrobial Agents
  Development of Molecular Pharmacotherapy

Application of Statistics to Medicine
  1747: James Lind treated 12 scurvy-stricken seafarers with lemon juice in the world's first clinical trial.
  1854: John Snow discovered the link between cholera and contaminated water in London (the pump in Broad Street).
Bang & Olufsen
  Interfaces such as touch screens are becoming more and more common.

Switch investigation
  The relative importance of visual, auditory and haptic information for the user's experience of mechanical switches
  Ditte Hvas Mortensen, Department of Psychology, Aarhus University, Denmark, and Bang & Olufsen a/s, Struer, Denmark
  Søren Bech, Bang & Olufsen a/s, Struer, Denmark

Ranking of switches
Numbers, Statistics, Decisions

What is sensory?
  Trained assessors
  A sensory measurement instrument
  Evaluates products under controlled and designed conditions
  NOT preference evaluations
  NOT consumer studies

What is sensometrics?
  Sensometrics is the scientific area that applies mathematical and statistical methods to model data from sensory and consumer science.
  Related fields: Neurophysiology, Anatomy, Process, Psychophysics, Technology, Quality control, Chemistry, Instrumental measurements, Sensory Science, Marketing, Consumer Science, Statistics, Data analysis, Chemometrics, Psychology, Cognitive science

History: Since 1992!
Conferences:
  Sensometrics (100-200)
    1992 Leiden, NL
    1994 Edinburgh, Scotland
    1996 Nantes, France
    1998 Copenhagen, Denmark
    2000 Columbia, MO, USA
    2002 Dortmund, Germany
    2004 Davis, CA, USA
    2006 Ås, Norway
    2008 Ontario, Canada
  Pangborn Sensory (500-700-900)
    1992 Järvenpää, Finland
    1995 Davis, USA
    1998 Ålesund, Norway
    2001 Dijon, France
    2003 Boston, USA
    2005 Harrogate, UK
    2007 Minneapolis, USA

The Sensometric Society: www.sensometric.org
The role of sensory
Industry:
  Product development to fit consumer needs
  Quality control in production
  Sensory-calibrated, instrument-based quality assessment methods
Research:
  Advanced measurement instrument
  Understanding the instrument and its outcomes
  Development of new sensory methodology

Discrimination test
  A subject is presented with 3 samples, two of which are equal (a triangle trial).
  Question: Which is different?
  Binomial data: the answer is correct or incorrect.

Binomial models:
  $X$ = number of correct answers
  $X$ follows a binomial distribution with some unknown $p$:
  $P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$

  [Figure: $n = 12$, $p = 1/3$] Expected pattern when performing many triangle trials with $n = 12$ and NO detectable difference.
  [Figure: $n = 12$, $p = 2/3$] Expected pattern when performing many triangle trials with $n = 12$ and a LARGE detectable difference.

Two situations in practice:
1. Analysis phase:
   1. Observed $x$ out of $n$ correct responses
   2. Are the products similar or different?
2. Planning phase:
   1. How many tests should we carry out?
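The two "expected patterns" for n = 12 can be reproduced directly from the binomial formula. The following is a minimal sketch, not part of the original slides; the helper name `binom_pmf` and the text histogram are illustrative choices.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 12  # number of triangle trials, as on the slide

# p = 1/3 corresponds to pure guessing (no detectable difference),
# p = 2/3 is the slide's "large detectable difference" case.
patterns = {p: [binom_pmf(k, n, p) for k in range(n + 1)] for p in (1 / 3, 2 / 3)}

for p, pmf in patterns.items():
    print(f"p = {p:.3f}")
    for k, prob in enumerate(pmf):
        # Crude text histogram: one '#' per percentage point.
        print(f"  k = {k:2d}  P(X = k) = {prob:.4f}  {'#' * round(100 * prob)}")
```

Under guessing the distribution peaks around 4 correct answers out of 12; with p = 2/3 the peak moves to around 8.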
Analysis situation 1
Miss Anna Sens: 28 triangle tests, 14 correct responses
  Hypothesis: the products are similar, $H_0: p = 1/3$
  Alternative: the products are different, $H_1: p > 1/3$

Hypothesis test:
1. Find the P-value and compare it with the nominal level (alpha).
2. How odd do the data look (IF the hypothesis is true)?

P-value ($n = 28$, $p = 1/3$):
  Choose Type I risk: $\alpha = 0.05$
  P-value = P(observing 14 or more correct "at random") $= P(X \ge 14) \approx 0.05 = 5\%$, where $X \sim \mathrm{bin}(28, 1/3)$ (table or computer)
  Decision: Products are different

Statistical significance
Type I risk:
  Probability of WRONGLY claiming a difference ($\alpha = 0.05$)
  Probability of rejecting the null hypothesis even though it is true!
  Most commonly used levels: 0.05, 0.01, 0.001

Type I and II risks
                                      Products are similar     Products are different
  Claiming difference ("rejection")   TYPE I ERROR (alpha)     Correct decision
  Claiming similarity ("acceptance")  Correct decision         TYPE II ERROR (beta)

Type II risk:
  Probability of WRONGLY accepting similarity
  Probability of accepting the null hypothesis even though it is false!
  Used primarily in the planning phase!
  Difficulty: a deviation from $p = 1/3$ can be small or large (a continuum of possibilities)

Power:
  Power $= 1 - \beta$
  Probability of rejecting when we should reject!
  The more power the better
  A property of the test procedure, NOT depending on the actual data!

In practice: trade-off between type I and type II risks!
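The "table or computer" step for the Anna Sens example can be sketched as follows. This is an illustrative calculation, not the author's own code: the function names are hypothetical, and the alternative p = 2/3 used for the power calculation is an assumed value chosen only for demonstration.

```python
from math import comb


def binom_tail(x: int, n: int, p: float) -> float:
    """Return P(X >= x) for X ~ bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def critical_value(n: int, p0: float, alpha: float) -> int:
    """Smallest x such that P(X >= x | p0) <= alpha (one-sided test)."""
    for x in range(n + 2):  # x = n + 1 gives an empty tail, i.e. probability 0
        if binom_tail(x, n, p0) <= alpha:
            return x
    raise ValueError("no critical value found")  # unreachable for alpha >= 0


def power(n: int, p0: float, p1: float, alpha: float) -> float:
    """Probability of rejecting H0: p = p0 when the true value is p1."""
    return binom_tail(critical_value(n, p0, alpha), n, p1)


# Data from the slide: 14 correct answers in 28 triangle tests.
p_value = binom_tail(14, 28, 1 / 3)
print(f"P(X >= 14) = {p_value:.4f}")
print(f"critical value at alpha = 0.05: {critical_value(28, 1 / 3, 0.05)}")

# Assumed alternative p1 = 2/3 (hypothetical, for illustration only).
print(f"power against p = 2/3: {power(28, 1 / 3, 2 / 3, 0.05):.3f}")
```

With these numbers the exact tail probability comes out at about 0.0503, marginally above 0.05, so the example sits right at the 5% boundary and a strict comparison with alpha would treat it as borderline. The power function depends only on n, alpha and the assumed alternative, not on the observed data.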
Risk of not finding a large difference: beta risk = 10% (power = 90%)
Risk of not finding a medium difference: beta risk = 37% (power = 63%)
Risk of not finding a small difference: beta risk > 50% (power < 50%)

Consequences of errors
Example: a new and better recipe for an existing product is sought!
  Type I error: a perfect recipe is rejected (the production department is unhappy!)
  Type II error: a wrong product is accepted and sent to the market (the marketing department is VERY unhappy)

Confidence intervals
  Intervals are used instead of single numbers
  Good tool to summarize information
  Example: $\hat{p} \pm 1.96 \sqrt{\hat{p}(1-\hat{p})/n}$, where $\hat{p}$ is the observed proportion of correct answers
  [Figure: δ]
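As a worked instance of the interval formula, the sketch below applies it to the 14 correct answers out of 28 from the earlier analysis example. It is an illustration under the normal approximation (often called the Wald interval); the function name `wald_interval` is an illustrative choice, not something from the slides.

```python
from math import sqrt


def wald_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a proportion: p_hat +/- z * SE."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# 14 correct responses in 28 triangle tests (the Anna Sens example).
low, high = wald_interval(14, 28)
print(f"p_hat = {14 / 28:.2f}, 95% interval: {low:.3f} to {high:.3f}")
```

The interval runs from roughly 0.31 to 0.69. Its lower end lies close to the guessing probability 1/3, which summarizes in one glance how weak the evidence from 28 tests is.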
3-AFC test: Which is most sweet?
  The paradox is resolved!
  [Figure: δ]

Statistics: DTU profile
Examples:
  Climatology
  Energy (wind)
  Pharmaceutics
  Finance
  Food science
  Environmental research
  Marine biology
  Microbiology
Characteristics:
  Working with the use and development of statistics in several areas
  SHORT path from course knowledge to research and/or business

Bachelor in Mathematics and Technology (Bachelor i Matematik og Teknologi): http://www.dtu.dk/uddannelse/civilingenioer/bacheloruddannelsen/matematik_og_teknologi.aspx