HUB19 - Intuistics

In conjunction with the 4th Heidelberg Forum for Young Life Scientists, HUB brings you an evening of dice, playing cards and werewolf hunting, showing when intuition can go wrong and how statistics can help.

HUB WordPress page reporting on the event

Sign up for the event: let us know if you plan to come by filling in this doodle

When? Thursday 21st May, 2015, 20:00-21:30

Where? DKFZ Communications Center, Im Neuenheimer Feld 280, 69120 Heidelberg.

Who? A team of people working at several different life-science organisations within Heidelberg

To keep track of people you've met at HUB, join the HUB LinkedIn group

Get involved in organising this and other HUBs!

Announcing the HUB to HFYLS

Aidan gave a 5 min invitation/announcement about the meeting to the HFYLS before lunch - here are the notes on that in case they're useful for us some other time.

Programme

Starting at 20:00:

  • Welcome and introduction - there's a werewolf amongst us! (5mins)
  • Flash talk from Bernd Klaus (5mins)
  • Ice-breaker - the birthday problem (10-20mins)
  • Playing cards and probabilities (10mins)
  • Game show (20mins)
  • We tend to be overconfident (20mins)
  • Final words and farewell: unmasking the werewolf (5mins)
  • To cafe Botanik (ca. 21:30)

Report

Another successful evening! With around 60 people* and several helping hands on deck, people were intrigued from the beginning with their werewolf diagnoses and testing kits, with several groups chatting quite happily. It seemed almost a shame to interrupt this with a welcome and intro to the evening's programme. Bernd Klaus then gave a fun flash talk on 'cheating with charts', warming people up to the theme of the evening and getting people laughing at graphing (hats off to him for that). For the ice-breaker, several people guessed that there was a high chance of pairs of people with the same birthday. This was partly because there were twice as many people as we expected and partly because it's a well-known example... but of course this didn't matter - people liked getting off their feet to move around and talk to each other. Holger's magic trick was intriguing, even to him I think - I still don't know how he managed to guess people's cards every time (perhaps parapsychology could be a topic for a future HUB, as long as we cross Holger's palms with silver). Sam Caddick was a natural host for the game show. People seemed to intuitively know the right answer but for the wrong reason; that was fine, as whatever people said stimulated discussion (or heated debate). The simulation with playing cards was enjoyed by all and (remarkably) gave the right answer, which was handy as none of us were that confident in explaining it analytically... The confidence quiz showed that people are indeed overconfident, with most groups giving correct ranges for between three and six of the answers. We then found the werewolf, with a brief chat and cartoon explanation of Bayesian statistics, disease prevalence and false-positive rates. It just remained to make some closing remarks, thank our guest presenters (Bernd and Sam) with wine, apologise for not buying wine for our in-house magician (Holger), and then retire to the foyer for wine and snacks kindly provided by HFYLS.

Thanks to everyone who made it happen. I'm looking forward to the next one.

(Explanatory slides used on the night.)

(* - about twice as many as put their names on the doodle.)

Task List

Before the evening:

What | Person responsible
Pre-lunch advert to HFYLS on May 21 (inc. HUB video?) | Aidan
List of registered people | Matt
Create the poster | Matt
Print and display poster | BioQuant: Matt; EMBL: Holger; DKFZ: Iman; ZMBH: (unassigned); Mensa: Iman; HITS: Max
Advertise on LinkedIn | Matt
Email advertising the event | Matt
Send reminder email on the day of the event | Matt

Things for organisers to bring:

What | Person responsible
Participant list | Matt
stickers for name badges | Matt
explanatory slides | Matt
werewolf paper | Matt
dice | Matt
playing cards | Matt
HUB signs + tape |
Drinks and snacks | HFYLS


To do on the evening:

What | Person responsible
Arrange tables/chairs |
Hang up signs to the room |
Setup and welcome attendees | Holger
MC | Matt
Flash talk | Bernd
Ice-breaker | Matt
Playing cards and probabilities | Holger
Game show | Sam Caddick
'We tend to be overconfident' activity | Matt
Tidy and close room | All
Take participants to Cafe Botanik |

Notes

(These were originally collected on piratepad to avoid revealing the answer to some of the activities before the event.)

HUB on 'statistics and intuition', in association with HFYLS

At the HFYLS conference general meeting on February 17, we discussed / agreed on

  • May 21, 20:00 - 22:00
  • In DKFZ conference centre (are K1 + K2 the upstairs rooms? yes!)
  • HFYLS said they would provide the drinks
  • HFYLS said they'd estimate ca. 50 people from their side
  • We estimate between 10 and 40 people from HUB

=> How are we going to know the exact final number? Send an email to the people registered for HFYLS as an advertisement?

  • Keep in mind that the HUB takes the place of what would have been a social event

Ice-breaker (ca. 20 mins)

How likely is it that two people in this room have the same birthday?

(An adaptation of the classic 'standing on a line' HUB ice breaker)

  • imagine a line from January 1 to December 31 across the room
  • get everyone to order themselves on the line (in silence?) by their birthday
  • ask them to talk to their two neighbours: are you in the right place in the line? Should you be before or after your neighbour?
  • if not, move (bubble sort!)
  • repeat until everyone is in the right order with respect to their neighbours
  • ask if there are any pairs with the same birthday
  • ask everyone what the likelihood of that is by chance

If we have at least 23 people, there is a greater than 50% chance that at least two of them will share the same birthday: http://en.wikipedia.org/wiki/Birthday_problem. This is a good example of the common problem of multiple testing, and will hopefully get people chatting and thinking.
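To put numbers on the intro question, here is a minimal sketch of the standard calculation. It assumes birthdays are independent and spread evenly over 365 days (no leap years, no seasonal effects); the function name is just for illustration.

```python
def shared_birthday_probability(n_people: int, days_in_year: int = 365) -> float:
    """Probability that at least two of n_people share a birthday.

    Assumes independent birthdays, uniform over days_in_year.
    """
    if n_people > days_in_year:
        return 1.0  # pigeonhole: a shared birthday is guaranteed
    p_all_distinct = 1.0
    for k in range(n_people):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_distinct *= (days_in_year - k) / days_in_year
    return 1.0 - p_all_distinct


for n in (10, 23, 30, 60):
    print(f"{n:>3} people: {shared_birthday_probability(n):.3f}")
```

On the night, plug in the real head count to see how likely a shared birthday was before asking the room.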

Monty Hall problem

Also an example I like: What is also known as the "goat problem" or the "Monty Hall problem" (http://en.wikipedia.org/wiki/Monty_Hall_problem) - based on some American TV show. Quoting from wikipedia:

"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice? Vos Savant's response was that the contestant should switch to the other door (vos Savant 1990a). Under the standard assumptions, contestants who switch have a 2/3 chance of winning the car, while contestants who stick to their choice have only a 1/3 chance."


It's a really counterintuitive problem and I guess it would lead to lively discussions. Although for the icebreaker I think the birthday thing is nicer as it is less controversial and more of a chat topic. (again quoting from Wikipedia: " Even when given explanations, simulations, and formal mathematical proofs, many people still do not accept that switching is the best strategy (vos Savant 1991a)")


It could be fun to run this (or similar - eg. 'Deal or No Deal') as a real game show. If we had several rounds then it should become obvious that the odds are better if you switch.

We could first have a fake game show in front of everyone:

  • have three boxes on a table at the front, one covering up a 'good' prize, eg. some chocolate or a HUB T-shirt (if we make one!), the other two each covering up a cabbage (our stand-in for the goat)
  • ask for a volunteer from the audience to come to the front
  • get them to pick a box but don't show them what's under it
  • remove one of the other boxes, one hiding a cabbage, and show them that it hides a cabbage
  • now say they can pick again
  • ask the audience whether they think he/she should stick or switch
  • reveal what's under their final chosen box
  • give them the good prize anyway

Now explain to them that if you switch you'll have a 2/3 chance of winning. Most people won't believe it, so we now run a simulation:

  • get everyone to split up into groups of 3-5 people
  • give each group three cards from the same suit: one ace and two other cards (eg. a two and a three)
  • get them to play the game in their groups ca. 10 times, taking turns being the host and the contestant
  • the contestant should always switch on their second choice
  • record the number of times the contestant wins and loses
  • at the end, they come up to the front and add their totals to the black-board
  • we show them that indeed the contestant won 2/3rds of the time when they switched (hopefully!)
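In case the live simulation misbehaves, the same experiment can be run on a laptop. This is only a sketch under the standard assumptions (the host knows where the prize is and always removes a losing card the contestant didn't pick); the function names and the seed are made up for illustration.

```python
import random


def play_round(switch: bool, rng: random.Random) -> bool:
    """Play one round with three cards; return True if the contestant wins."""
    cards = [0, 1, 2]
    prize = rng.choice(cards)
    first_pick = rng.choice(cards)
    # The host removes a losing card that the contestant did not pick.
    removable = [c for c in cards if c != first_pick and c != prize]
    removed = rng.choice(removable)
    if switch:
        # Exactly one card is neither the first pick nor the removed one.
        final_pick = next(c for c in cards if c not in (first_pick, removed))
    else:
        final_pick = first_pick
    return final_pick == prize


def win_rate(switch: bool, rounds: int = 10_000, seed: int = 0) -> float:
    """Fraction of rounds won when always switching (or always sticking)."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(rounds))
    return wins / rounds


print(f"always switch: {win_rate(True):.3f}")
print(f"always stick:  {win_rate(False):.3f}")
```

With only ca. 10 rounds per group the individual totals will be noisy, which is why summing everyone's results on the board matters.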

Discussion

Ask people to split into two groups depending on their statistics skills: fair or good vs. bad

  1. split people across tables, mixing fair and good on each table
  2. ask people to discuss their statistics skills and try to get input from the "good" guys.
  3. also, people should discuss and write down how they overcome their statistical analysis problems (websites, good books, forums...)
  4. go table by table and ask people to summarise what they discussed and give their recommendations (websites, good books to read, other suggestions)

5 min talk

Invite someone to talk about the importance of stats in science, eg:

  • Thomas Lemberger (Chief Editor of MSB) - has shown an interest in HUB before
  • Alexis Maizel (group leader at COS) - interested in stats (I gave an intro to stats course for MSc students at his request).

Games tables / casino

Lots of games benefit from a statistical analysis... we could have several tables, each with a different game: poker tables, roulette wheels, etc... It would be nice if at least some of them had a relevance to biology. Any suggestions?

How to lie with Statistics

Examples where misuse of statistics can lead you astray - not sure how to turn this in to an activity though...

Sample selection

set up a biased and unbiased survey, get people to ask each other questions and then think whether the results are significant or not.

Graphs / data visualisation

scatter vs. bar vs. box plot.

'We tend to be overconfident' exercise

(Example from Motulsky, 'Intuitive Biostatistics')

Ask people in groups to answer the following ten questions with ranges that they are 90% sure contain the correct answer:

  1. Martin Luther King Jr.’s age at death: 39
  2. Length of the Nile in kilometers: 6738
  3. Number of countries in OPEC: 13
  4. Number of books in the Old Testament: 39
  5. Diameter of the moon in kilometers: 3476
  6. Weight of an empty Boeing 747 in kilograms: 176901
  7. Year Mozart was born: 1756
  8. Gestation period of an Asian elephant in days: 645
  9. Distance from London to Tokyo in kilometers: 9638
  10. Depth of deepest known point in the ocean in kilometers: 11

Then give the correct answers and get each group to say how many they got right (write these numbers on the board)

Did each group give ranges that covered the right answer nine out of ten times? (When more than 1000 people were tested, 99% of them were overconfident, with most giving ranges that were too narrow and included only 30-60% of the right answers.)
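A rough sketch for scoring the groups and for showing how unlikely a low score would be if the ranges really were 90% ranges. It treats the ten questions as independent, which is an assumption; the example ranges and the function names are invented for illustration.

```python
from math import comb


def count_hits(ranges, answers):
    """Count how many (low, high) ranges contain the matching true answer."""
    return sum(low <= truth <= high for (low, high), truth in zip(ranges, answers))


def prob_at_most(hits: int, n_questions: int = 10, confidence: float = 0.9) -> float:
    """P(X <= hits) for X ~ Binomial(n_questions, confidence)."""
    return sum(
        comb(n_questions, k) * confidence**k * (1 - confidence) ** (n_questions - k)
        for k in range(hits + 1)
    )


# Hypothetical ranges from one group for the first three questions above.
answers = [39, 6738, 13]
group_ranges = [(35, 45), (4000, 6000), (10, 20)]
print("hits:", count_hits(group_ranges, answers))

# Chance that a genuinely well-calibrated group gets six or fewer out of ten.
print(f"P(hits <= 6) = {prob_at_most(6):.4f}")
```

If a well-calibrated group would rarely score six or fewer, a room full of such scores is good evidence that the ranges were too narrow.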

More biological questions, suggested by Sam Caddick:

  1. Year of invention of PCR: 1983
  2. Year of death of Charles Darwin: 1882
  3. Date of publication of On the Origin of Species: Nov 24, 1859
  4. Date of publication of Watson & Crick paper: April 25, 1953
  5. Diameter of average blood cell
  6. Number of genes in fruit fly genome
  7. Age of Alexander Fleming when he discovered penicillin: 47
  8. Number of different antibiotics in current use worldwide
  9. Number of species that go extinct per year
  10. Number of active scientists worldwide
  11. Number of biology research papers published in 2014

We just need to find answers for them!

Other questions:

  1. Number of times the 1994 ClustalW paper had been cited as of Oct 2014: 40289
  2. Year of publication of the Sanger sequencing method: 1977
  3. Number of citations of the Sanger sequencing paper as of Oct 2014: 65335
  4. Number of citations for the most cited paper as of Oct 2014: 305148
  5. Number of named authors on the Leung et al March 2015 fruit fly paper: 1014

(Some numbers taken from list linked here: http://www.nature.com/news/the-top-100-papers-1.16224)

Find the werewolf

This could go on at the same time as everything else, with the werewolf revealed at the end. It's a nice example of the intuitive difficulty of combining probabilities / doing Bayesian statistics.

give out to everyone as they enter

  • a folded piece of paper with 'werewolf' or 'not werewolf' written on it (only one werewolf!), and how they should respond to the blood test (below). Tell them not to tell anyone else what they are
  • a die to run the test
  • a pen and a piece of paper to record the results of all the times they were tested

during the intro to the evening say something like:

  • one of you is a werewolf but it's not a full moon, so the rest of us don't know who. (don't tell anyone!)
  • our test for lycanthropy is quite accurate but not perfect
  • it correctly identifies 83% (5/6) of infected blood samples
  • but it also concludes that 17% (1/6) of noninfected blood samples have come from werewolves

during the evening, test anyone you talk to by rolling a die

  • if you're not a werewolf but someone rolls a six, tell them you're a werewolf (False positive). Otherwise tell them you're not (True negative).
  • if you're the werewolf and someone rolls a one, tell them you're not a werewolf (False negative). Otherwise tell them you're a werewolf (True positive).

At the end of the evening, ask everyone what they think a positive test result means. Then ask the werewolf to reveal themselves, and to tell everyone else how many times they tested positive and how many times they tested negative. Write this on the board. Then ask anyone else who ever tested positive to put up their hands. Point out that the meaning of a positive test depends partly on the prevalence of the disease, not just on the accuracy of the test.

Say that most people intuitively think that a positive test almost certainly means that the disease is present, but that the interpretation of the result greatly depends on the prevalence of the disease (as well as the accuracy of the test). This has obvious implications for testing for real diseases, eg. what if a false positive result causes you to have surgery, or to quit your job and travel the world, etc etc.

Perhaps then explain briefly the stats (simplified to the case of everyone being tested only once): if we have n people (use the real number from the evening), one of whom is a werewolf (but it's not a full moon, so we don't know who).

  • prevalence of disease = 1 / n. (So if we have eg 50 people, prevalence = 2%.)
  • test for lycanthropy is quite accurate but not perfect:
    • it correctly identifies 83% (5/6) of infected blood samples
    • but it also concludes that 17% (1/6) of noninfected blood samples have come from werewolves
  • the one person who is a werewolf will (likely) have a positive test
  • the remaining 49 are not werewolves but 1/6 of them will have a positive result, = 8 people (49 / 6 ≈ 8.2, rounded)
  • so there are 1 + 8 positive tests, with
    • 1 / 9 (11%) true positives
    • 8 / 9 (89%) false positives

(Prepare a slide in advance to which the real numbers can be added - even better if it will do the maths for us - safer than doing it on the night ;-) )
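Something like the following could do the maths for that slide. It is only a sketch, assuming everyone is tested exactly once and using the 5/6 and 1/6 rates from the die; the function name is made up. Because it uses the 5/6 detection rate rather than assuming the werewolf always tests positive, and doesn't round 49/6 to 8, it gives a slightly lower figure than the 1/9 in the back-of-envelope version above.

```python
from fractions import Fraction


def positive_predictive_value(
    n_people: int,
    sensitivity: Fraction = Fraction(5, 6),
    false_positive_rate: Fraction = Fraction(1, 6),
    n_werewolves: int = 1,
) -> Fraction:
    """P(werewolf | positive test) when everyone is tested exactly once."""
    prevalence = Fraction(n_werewolves, n_people)
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    # Bayes' rule: share of all positive tests that come from real werewolves.
    return true_positives / (true_positives + false_positives)


for n in (20, 50, 60):
    ppv = positive_predictive_value(n)
    print(f"{n} people: P(werewolf | positive) = {float(ppv):.1%}")
```

Swap in the real number of attendees on the night; the more people in the room, the less a single positive test means.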

Planning Meetings

(These were originally collected on piratepad to avoid revealing the answer to some of the activities before the event.)

Wednesday 29/04/2015

Date chosen by this doodle: http://doodle.com/26mh438wz3iuirek

Location: Cafe Fresco

Present: Matt, Holger

Minutes:

  • The main idea is to
    • a) have fun, by
    • b) emphasising when stats can help you overcome your inbuilt bias / intuition.
    • c) focus on 'why do we need statistics?'
  • Birthday problem as ice breaker. Intro should ask 'what are the chances', then run activity, then explain the stats
  • Can we have the session earlier, eg. during finger-food? (Matt to ask HFYLS) (Matt: I've asked HFYLS, but since dinner is scheduled for 18:35 it's unlikely.)
  • Don't have to connect everything to biology, can just be about statistical thinking / probabilities.
  • quiz show: German quiz show; der Zonk https://de.wikipedia.org/wiki/Geh_aufs_Ganze!
  • can HFYLS sponsor small prizes for games? (Matt to ask) [Matt: I think HUB will be able to cover this, but if necessary we can ask HFYLS when we know what we would like. i.e. after the next planning meeting.]
  • 'balls to statistics' - ask organisers for props: coloured balls, dice etc. that we can use to demo statistics / probabilities.
  • everything can be an 'ice-breaker'
  • general stats discussions might be a bit dry, better to tie these to particular games
  • any other quiz shows with a statistical basis? card games?
  • what's the best way to play roulette? etc. strategies to beat the bank? Is that possible? (No!)
  • how would we organise a casino as more than just tables with games? But it probably doesn't matter if that's all it is.
  • possible tables
    • roulette
    • poker
    • Holger to look at EMBL staff association games for anything appropriate. (unfortunately, no appropriate games such as roulette)
  • maybe the card tricks / magic society at EMBL have some relevant ideas? Holger to ask the organiser if they have a good trick with a stats background. Perhaps they can give a demo and/or teach people how to do the trick (either on the night or to a few of us in advance, who will then demo / teach it on the night). (contacted)
  • graphing and data visualisation - every scientist lies by selecting data. Does Toby Gibson have any examples of this in fraud or otherwise? (Holger to ask.) People could discuss what is fair and what is not.
  • two ways to show the same data, advantages and disadvantages of each
  • Holger to ask Bernd Klaus (centre for stats at EMBL) for examples about how to lie with stats. Maybe as intro 5 min talk? If that doesn't work out, Matt will ask Alexis Maizel (COS group leader). Matt / one of us could do such an intro too if necessary, but we think it would be better if it was an outside speaker. (contacted Bernd)
  • 'we tend to be overconfident' exercise in groups. adapt questions to biological questions. Ask them to plot their answers but don't say how. This could easily take 30 mins.
  • even if the whole thing is just an hour, that's OK (better than dragging it out to two hours just for the sake of it)

Thursday May 7

Date chosen by this doodle: http://doodle.com/upq83ztirzug56kx

Location: Cafe Botanik, Im Neuenheimer Feld: http://www.yelp.com/map/caf%C3%A9-botanik-heidelberg

Present: Matt, Holger, Iman (& Jazz the dog), Florian, Thomas Wolf

Agenda:

  • is the following programme OK? (more details above)
    • hand out materials for the 'find the werewolf' game as people arrive
    • welcome: intro to HUB, 'find a werewolf' and the rest of the evening's programme (5mins)
    • flash talk: why do we need statistics? / how to lie with statistics (5 mins)
    • birthday problem ice-breaker (10-20 mins)
    • monty-hall problem: intro as a game show, then a live simulation (20 mins)
    • another game / activity, perhaps with a biological angle (eg. the 'we tend to be overconfident' exercise but with biological questions) (20 mins)
    • final words
      • unmask the werewolf (maybe give them the cure, i.e. a prize?) and explain the stats
      • one line summary of stats vs. intuition
      • present a gift (wine?) to the guest speaker
      • advertise HUB website, mailing list and coming events
    • off to Cafe Botanik
  • if so, flesh out the details: for each point:
    • what will we do,
    • what we need to be able to do it
    • who will do it
  • are there any new ideas for activities?
  • decide on the schedule
  • should we fix a date for the barbecue so we can announce it at this HUB?
  • will it be possible to start the HUB earlier than 20:00?
  • Iman/Chiara have suggested that HUB give a brief presentation before lunch to encourage more people to attend in the evening, perhaps using our video. Who would like to do this?
  • HUB organisers need to help set up rooms K1/K2
  • wiki page
  • poster
  • business cards to hand out on the night? Since we will potentially have a large audience of people who've never been to HUB before, it would be good to make it easy for them to know where to get more info and how to sign up for the mailing list (they may not bother to write it down if it's shown briefly on a slide).
  • name for the evening? maybe people will be put off by the word 'statistics'?... Suggestions:
    • statistics and intuition (current working title)
    • statistics vs. intuition
    • statistics: for when intuition goes wrong
    • HUBonomics
    • Data beating intuition
  • any other business

Minutes:

  • Iman: HFYLS will provide 24 bottles of white wine, 12 red, 12 apple juice, 10 bottles of sparkling water, 10 still, also snacks
  • Iman: we need to help HFYLS set up and clear up (we always do this anyway)
  • Monty Hall problem might be too well known, but it's probably OK, and should work OK with the demo and the simulation
    • demo: do the first choice, then the host takes one away, then ask the audience if he/she should switch
    • then do the simulation in groups (at least three rounds)
    • then sum the results of the simulation, then ask the audience again, then get the contestant to make their second choice
  • Flash talk from Bernd Klaus
  • Birthday problem ice-breaker, but in two dimensions (months and days)
  • Find a werewolf should work
  • 'We tend to be overconfident' exercise - we should try to come up with ten good questions then ask the other HUB organisers for comments. We also need to emphasise on the night that the ranges should be as narrow as possible, while still being 90% confident
  • use various 'obscure' papers that have high number of citations (two or three of the questions)
  • Holger will do a card trick dealing with probabilities; call it 'an introduction to probability'; have it immediately after the ice-breaker because people will already be standing
  • Intro should talk about why stats is important
  • Requirements:
    • Monty hall - mars bars, pictures of goats, playing cards (Matt)
    • Werewolf: werewolf cards and instructions, and lots of dice (Matt)
    • biological questions (everyone)
    • explanatory slides (one for each solution explanation) (Matt)
    • card trick: cards
    • Who: ask Sam to be Monty Hall? (Matt to ask)
  • Matt to MC mostly
  • Matt to set up a doodle for BBQ dates
  • Intro to HUB at lunchtime: one sentence intro then show the video (put this as a task on the wiki)
  • Matt to do the wiki page
  • Poster: waiting for answer re: can we use the logo. Matt to have a go at the poster in lieu of anyone else (put it on the task list).
  • Business cards: use vistaprint with adverts on the back - quite cheap (Holger to investigate)
  • Event title: intuistics (Holger)
  • AOB:
    • Matt to write the email advert and send it to others for comments, then send it out (after wiki page is up)
    • send the HUB email advert to the HFYLS participant list too.
    • Can we come up with an overfitting exercise? eg. in groups, come up with a classifier, then move to another group. cf. 'find something unique' ice-breaker