HUB19: Intuistics
In conjunction with the 4th Heidelberg Forum for Young Life Scientists, HUB brings you an evening of dice, playing cards and werewolf hunting, showing when intuition can go wrong and how statistics can help.
HUB WordPress page reporting on the event
Sign up for the event: let us know if you plan to come by filling in this doodle
When? Thursday 21st May, 2015, 20:00-21:30
Where? DKFZ Communications Center, Im Neuenheimer Feld 280, 69120 Heidelberg.
Who? A team of people working at several different life-science organisations within Heidelberg
To keep track of people you've met at HUB, join the HUB LinkedIn group
Get involved in organising this and other HUBs!
Contents
 1 Announcing the HUB to HFYLS
 2 Programme
 3 Report
 4 Task List
 5 Notes
 5.1 HUB on 'statistics and intuition', in association with HFYLS
 5.2 Icebreaker (ca. 20 mins)
 5.3 Monty Hall problem
 5.4 Discussion
 5.5 5 min talk
 5.6 Games tables / casino
 5.7 How to lie with Statistics
 5.8 Sample selection
 5.9 Graphs / data visualisation
 5.10 'We tend to be overconfident' exercise
 5.11 Find the werewolf
 6 Planning Meetings
Announcing the HUB to HFYLS
Aidan gave a 5 min invitation/announcement about the meeting to the HFYLS before lunch. Here are the notes on that in case they're useful for us some other time.
Programme
Starting at 20:00:
 Welcome and introduction: there's a werewolf amongst us! (5 mins)
 Flash talk from Bernd Klaus (5 mins)
 Icebreaker: the birthday problem (10-20 mins)
 Playing cards and probabilities (10mins)
 Game show (20mins)
 We tend to be overconfident (20mins)
 Final words and farewell: unmasking the werewolf (5mins)
 To cafe Botanik (ca. 21:30)
Report
Another successful evening! With around 60 people* and several helping hands on deck, people were intrigued from the beginning with their werewolf diagnoses and testing kits, with several groups chatting quite happily. It seemed almost a shame to interrupt this with a welcome and intro to the evening's programme. Bernd Klaus then gave a fun flash talk on 'cheating with charts', warming people up to the theme of the evening and getting people laughing at graphing (hats off to him for that).
For the icebreaker, several people guessed that there was a high chance of pairs of people with the same birthday. This was partly because there were twice as many people as we expected and partly because it's a well-known example... but of course this didn't matter: people liked getting on their feet to move around and talk to each other. Holger's magic trick was intriguing, even to him, I think; I still don't know how he managed to guess people's cards every time (perhaps parapsychology could be a topic for a future HUB, as long as we cross Holger's palms with silver).
Sam Caddick was a natural host for the game show. People seemed to intuitively know the right answer but for the wrong reason, which was fine, as whatever people said stimulated discussion (or heated debate). The simulation with playing cards was enjoyed by all and (remarkably) gave the right answer, which was handy as none of us were that confident in explaining it analytically... The confidence quiz showed that people are indeed overconfident, with most groups giving correct ranges for between three and six of the answers.
We then found the werewolf, with a brief chat and cartoon explanation of Bayesian statistics, disease prevalence and false-positive rates. It just remained to make some closing remarks, thank our guest presenters (Bernd and Sam) with wine, apologise for not buying wine for our in-house magician (Holger), and then retire to the foyer for wine and snacks kindly provided by HFYLS.
Thanks to everyone who made it happen. I'm looking forward to the next one.
(Explanatory slides used on the night.)
(* About twice as many as put their names on the doodle.)
Task List
Before the evening:
What  Person responsible 

Pre-lunch advert to HFYLS on May 21 (inc. HUB video?)  Aidan 
List of registered people  Matt 
Create the poster  Matt 
Print and display poster 

Advertise on LinkedIn  Matt 
Email advertising the event  Matt 
Send reminder email on the day of the event  Matt 
Things for organisers to bring:
What  Person responsible 

Participant list  Matt 
stickers for name badges  Matt 
explanatory slides  Matt 
werewolf paper  Matt 
dice  Matt 
playing cards  Matt 
HUB signs + tape  
Drinks and snacks  HFYLS 
To do on the evening:
What  Person responsible 

Arrange tables/chairs  
Hang up signs to the room  
Setup and welcome attendees  Holger 
MC  Matt 
Flash talk  Bernd 
Icebreaker  Matt 
Playing cards and probabilities  Holger 
Game show  Sam Caddick 
'We tend to be overconfident' activity  Matt 
Tidy and close room  All 
Take participants to Cafe Botanik 
Notes
(These were originally collected on piratepad to avoid revealing the answer to some of the activities before the event.)
HUB on 'statistics and intuition', in association with HFYLS
At the HFYLS conference general meeting on February 17, we discussed / agreed on
 May 21, 20:00 - 22:00
 In DKFZ conference centre (are K1 + K2 the upstairs rooms? yes!)
 HFYLS said they would provide the drinks
 HFYLS said they'd estimate ca. 50 people from their side
 We estimate between 10 and 40 people from HUB
=> How are we going to know the exact final number? Send an email to people registered for HFYLS as an advertisement?
 To keep in mind that the HUB is in place of what would have been a social event
Icebreaker (ca. 20 mins)
How likely is it that two people in this room have the same birthday?
(An adaptation of the classic 'standing on a line' HUB ice breaker)
 imagine a line from January 1 to December 31 across the room
 get everyone to order themselves on the line (in silence?) by their birthday
 ask them to talk to their two neighbours: are you in the right place in the line? Should you be before or after your neighbour?
 if not, move (bubble sort!)
 repeat until everyone is in the right order with respect to their neighbours
 ask if there are any pairs with the same birthday
 ask everyone what the likelihood of that is by chance
If we have at least 23 people, there is a greater than 50% chance that at least two of them will share the same birthday: http://en.wikipedia.org/wiki/Birthday_problem. This is a good example of the common problem of multiple testing, and will hopefully get people chatting and thinking.
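To back up the 23-people figure on the night, here is a minimal Python sketch of the calculation. It assumes 365 equally likely birthdays and ignores leap years and seasonal effects; the function name `prob_shared_birthday` is just a label chosen for this example.

```python
def prob_shared_birthday(n_people, days=365):
    """Chance that at least two of n_people share a birthday.

    Assumes every day of the year is equally likely (no leap years).
    """
    prob_all_different = 1.0
    for i in range(n_people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        prob_all_different *= (days - i) / days
    return 1.0 - prob_all_different


if __name__ == "__main__":
    for n in (10, 23, 30, 60):
        print(f"{n:3d} people: {prob_shared_birthday(n):.3f}")
```

Under these assumptions the chance first passes 50% at 23 people and is over 99% for a room of 60.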
Monty Hall problem
Also an example I like: what is known as the "goat problem" or the "Monty Hall problem" (http://en.wikipedia.org/wiki/Monty_Hall_problem), based on an American TV show. Quoting from Wikipedia:
"Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice? Vos Savant's response was that the contestant should switch to the other door (vos Savant 1990a). Under the standard assumptions, contestants who switch have a 2/3 chance of winning the car, while contestants who stick to their choice have only a 1/3 chance."
It's a really counterintuitive problem and I guess it would lead to lively discussions. For the icebreaker, though, I think the birthday thing is nicer as it is less controversial and more of a chat topic. (Again quoting from Wikipedia: "Even when given explanations, simulations, and formal mathematical proofs, many people still do not accept that switching is the best strategy (vos Savant 1991a)".)
It could be fun to run this (or something similar, eg. 'Deal or No Deal') as a real game show. If we had several rounds then it should become obvious that the odds are better if you switch.
We could first have a fake game show in front of everyone:
 have three boxes on a table at the front, one covering up a 'good' prize, eg. some chocolate or a HUB T-shirt (if we make one!), the other two each covering up a cabbage (our 'goat')
 ask for a volunteer from the audience to come to the front
 get them to pick a box but don't show them what's under it
 remove one of the other boxes, one hiding a 'goat', and show them what was under it
 now say they can pick again
 ask the audience whether they think he/she should stick or switch
 reveal what's under their final chosen box
 give them the good prize anyway
Now explain to them that if you switch you'll have a 2/3 chance of winning. Most people won't believe it, so we now run a simulation:
 get everyone to split up into groups of 3-5 people
 give each group three cards from the same suit, one ace, one one and one two.
 get them to play the game in their groups ca. 10 times, taking turns being the host and the contestant
 the contestant should always switch on their second choice
 record the number of times the contestant wins and loses
 at the end, they come up to the front and add their totals to the blackboard
 we show them that indeed the contestant won 2/3rds of the time when they switched (hopefully!)
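In case the card simulation needs a backup, the same experiment can be sketched in a few lines of Python. This assumes the standard rules (the host knows where the prize is and always opens a losing door that the contestant did not pick); `play_round` and `win_rate` are names made up for this sketch.

```python
import random


def play_round(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, rounds=10000, seed=0):
    """Fraction of games won over many rounds with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(rounds))
    return wins / rounds


if __name__ == "__main__":
    print("always switch:", win_rate(switch=True))
    print("always stick: ", win_rate(switch=False))
```

With enough rounds the switching strategy should settle near 2/3 and sticking near 1/3. With only ca. 10 rounds per group the totals will be noisier, which is why pooling the results on the blackboard helps.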
Discussion
 ask people to split into 2 groups depending on their statistics skills: fair or good vs. bad
 split people across tables, mixing skill levels on each table
 ask people to discuss their statistics skills and try to get input from the "good" guys.
 also, people should discuss and write down how they overcome their statistical analysis problems (websites, good books, forums...)
 go table by table and ask people to summarize what they discussed and give their recommendations (websites, good books to read, other suggestions)
5 min talk
Invite someone to talk about the importance of stats in science, eg:
 Thomas Lemberger (Chief Editor of MSB): has shown an interest in HUB before
 Alexis Maizel (group leader at COS): interested in stats (I gave an intro to stats course for MSc students at his request).
Games tables / casino
Lots of games benefit from a statistical analysis... we could have several tables, each with a different game: poker tables, roulette wheels, etc... Would be nice if at least some of them had a relevance to biology. Any suggestions?
How to lie with Statistics
Examples where misuse of statistics can lead you astray. Not sure how to turn this into an activity though...
Sample selection
Set up a biased and an unbiased survey, get people to ask each other questions and then think about whether the results are significant or not.
Graphs / data visualisation
scatter vs. bar vs. box plot.
'We tend to be overconfident' exercise
(Example from Motulsky, 'Intuitive Biostatistics')
Ask people, in groups, to answer the following ten questions with ranges that they are 90% sure contain the correct answer:
 Martin Luther King Jr.’s age at death: 39
 Length of the Nile in kilometers: 6738
 Number of countries in OPEC: 13
 Number of books in the Old Testament: 39
 Diameter of the moon in kilometers: 3476
 Weight of an empty Boeing 747 in kilograms: 176901
 Year Mozart was born: 1756
 Gestation period of an Asian elephant in days: 645
 Distance from London to Tokyo in kilometers: 9638
 Depth of deepest known point in the ocean in kilometers: 11
Then give the correct answers and get each group to say how many they got right (write these numbers on the board)
Did each group give ranges that covered the right answer nine out of ten times? (When more than 1000 people were tested, 99% of them were overconfident, with most giving ranges that were too narrow, inc. only 30-60% of the right answers.)
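To put a number on 'overconfident': if a group's ranges really were 90% ranges, the number of correct answers out of ten would follow a binomial distribution. The sketch below works out how unlikely a low score would then be. It assumes the ten questions are independent, and `prob_at_most` is a name invented for this example.

```python
from math import comb


def prob_at_most(k, n=10, p=0.9):
    """Chance of k or fewer correct ranges out of n, if each range
    independently contains the right answer with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


if __name__ == "__main__":
    for k in (3, 6, 8):
        print(f"{k} or fewer correct out of 10: {prob_at_most(k):.4f}")
```

Under these assumptions a well-calibrated group would score six or fewer only about 1.3% of the time, so scores in that region point to ranges that were too narrow rather than to bad luck.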
More biological questions, suggested by Sam Caddick:
 Year of invention of PCR: 1983
 Year of death of Charles Darwin: 1882
 Date of publication of On the Origin of Species: Nov 24, 1859
 Date of publication of Watson & Crick paper: April 25, 1953
 Diameter of average blood cell
 Number of genes in fruit fly genome
 Age of Alexander Fleming when he discovered penicillin: 47
 Number of different antibiotics in current use worldwide
 Number of species that go extinct per year
 Number of active scientists worldwide
 Number of biology research papers published in 2014
We just need to find answers for them!
Other questions:
 Number of times the 1994 ClustalW paper had been cited as of Oct 2014: 40289
 Year of publication of the Sanger sequencing method: 1977
 Number of citations of the Sanger sequencing paper as of Oct 2014: 65335
 Number of citations for the most cited paper as of Oct 2014: 305148
 Number of named authors on the Leung et al March 2015 fruit fly paper: 1014
(Some numbers taken from list linked here: http://www.nature.com/news/thetop100papers1.16224)
Find the werewolf
This could go on at the same time as everything else, with the werewolf revealed at the end. It's a nice example of the intuitive difficulty of combining probabilities / doing Bayesian statistics.
give out to everyone as they enter
 a folded piece of paper with 'werewolf' or 'not werewolf' written on it (only one werewolf!), and how they should respond to the blood test (below). Tell them not to tell anyone else what they are
 a die to run the test
 a pen and a piece of paper to record the results of all the times they were tested
during the intro to the evening say something like:
 one of you is a werewolf but it's not a full moon, so the rest of us don't know who. (don't tell anyone!)
 our test for lycanthropy is quite accurate but not perfect
 it correctly identifies 83% (5/6) of infected blood samples
 but it also concludes that 17% (1/6) of non-infected blood samples have come from werewolves
during the evening, test anyone you talk to by rolling a die
 if you're not a werewolf but someone rolls a six, tell them you're a werewolf (False positive). Otherwise tell them you're not (True negative).
 if you're the werewolf and someone rolls a one, tell them you're not a werewolf (False negative). Otherwise tell them you're a werewolf (True positive).
At the end of the evening, ask everyone what they think a positive test result means. Then ask the werewolf to reveal themselves, and to tell everyone else how many times they tested positive and how many times they tested negative. Write this on the board. Then ask anyone else who ever tested positive to put up their hands. Point out that the meaning of a positive test depends partly on the prevalence of the disease, not just on the accuracy of the test.
Say that most people intuitively think that a positive test almost certainly means that the disease is present, but that the interpretation of the result greatly depends on the prevalence of the disease (as well as the accuracy of the test). This has obvious implications for testing for real diseases, eg. what if a false positive result causes you to have surgery, or to quit your job and travel the world, etc.
Perhaps then explain briefly the stats (simplified to the case of everyone being tested only once): if we have n people (use the real number from the evening), one of whom is a werewolf (but it's not a full moon, so we don't know who).
 prevalence of disease = 1 / n. (So if we have eg 50 people, prevalence = 2%.)
 test for lycanthropy is quite accurate but not perfect:
 it correctly identifies 83% (5/6) of infected blood samples
 but it also concludes that 17% (1/6) of non-infected blood samples have come from werewolves
 the one person who is a werewolf will (likely) have a positive test
 the remaining 49 are not werewolves, but 1/6 of them (about 8 people) will have a positive result
 so there are 1 + 8 positive tests, with
 1 / 9 (11%) true positives
 8 / 9 (89%) false positives
(Prepare a slide in advance to which the real numbers can be added. Even better if it will do the maths for us: safer than doing it on the night ;) )
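One way to have the maths done for us is a few lines of Python that take the real head count. This is only a sketch under the same simplification as above (everyone tested exactly once, exactly one werewolf); the function name and its defaults are choices made for this example, using the 5/6 and 1/6 rates from the dice rules.

```python
from fractions import Fraction


def positive_predictive_value(n_people,
                              sensitivity=Fraction(5, 6),
                              false_positive_rate=Fraction(1, 6)):
    """Chance that someone who tests positive really is the werewolf,
    assuming exactly one werewolf among n_people and one test each."""
    prevalence = Fraction(1, n_people)
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    for n in (20, 50, 60):
        ppv = positive_predictive_value(n)
        print(f"{n} people: {ppv} = {float(ppv):.1%}")
```

For 50 people this gives 5/54 (about 9%), slightly below the 1/9 in the list above, because the list assumes the werewolf definitely tests positive while the calculation uses the 5/6 detection rate.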
Planning Meetings
(These were originally collected on piratepad to avoid revealing the answer to some of the activities before the event.)
Wednesday 29/04/2015
Date chosen by this doodle: http://doodle.com/26mh438wz3iuirek
Location: Cafe Fresco
Present: Matt, Holger
Minutes:
 The main idea is to
 a) have fun, by
 b) emphasising when stats can help you overcome your inbuilt bias / intuition.
 c) focus on 'why do we need statistics?'
 Birthday problem as ice breaker. Intro should ask 'what are the chances?', then run the activity, then explain the stats
 Can we have the session earlier, eg. during fingerfood? (Matt to ask HFYLS) (Matt: I've asked HFYLS, but since dinner is scheduled for 18:35 it's unlikely.)
 Don't have to connect everything to biology, can just be about statistical thinking / probabilities.
 quiz show: German quiz show; der Zonk https://de.wikipedia.org/wiki/Geh_aufs_Ganze!
 can HFYLS sponsor small prizes for games? (Matt to ask) [Matt: I think HUB will be able to cover this, but if necessary we can ask HFYLS when we know what we would like, i.e. after the next planning meeting.]
 'balls to statistics'  ask organisers for props: coloured balls, dice etc. that we can use to demo statistics / probabilities.
 everything can be an 'icebreaker'
 general stats discussions might be a bit dry, better to tie these to particular games
 any other quiz shows with a statistical basis? card games?
 what's the best way to play roulette? etc. strategies to beat the bank? Is that possible? (No!)
 how would we organise a casino as more than just tables with games? But it probably doesn't matter if that's all it is.
 possible tables
 roulette
 poker
 Holger to look at EMBL staff association games for anything appropriate. (unfortunately, no appropriate games such as roulette)
 'we tend to be overconfident' exercise in groups. adapt questions to biological questions. Ask them to plot their answers but don't say how. This could easily take 30 mins.
 even if it's just an hour, that's OK
 maybe the card tricks / magic society at EMBL have some relevant ideas? Holger to ask the organiser if they have a good trick with a stats background. Perhaps they can give a demo and/or teach people how to do the trick (either on the night or to a few of us in advance, who will then demo / teach it on the night). (contacted)
 graphing and data visualisation: every scientist lies by selecting data. Does Toby Gibson have any examples of this in fraud or otherwise? (Holger to ask.) People could discuss what is fair and what is not.
 two ways to show the same data, advantages and disadvantages of each
 Holger to ask Bernd Klaus (centre for stats at EMBL) for examples about how to lie with stats. Maybe as intro 5 min talk? If that doesn't work out, Matt will ask Alexis Maizel (COS group leader). Matt / one of us could do such an intro too if necessary, but we think it would be better if it was an outside speaker. (contacted Bernd)
 even if the whole thing is just an hour, that's OK (better than dragging it out to two hours just for the sake of it)
Thursday May 7
Date chosen by this doodle: http://doodle.com/upq83ztirzug56kx
Location: Cafe Botanik, Im Neuenheimer Feld: http://www.yelp.com/map/caf%C3%A9botanikheidelberg
Present: Matt, Holger, Iman (& Jazz the dog), Florian, Thomas Wolf
Agenda:
 is the following programme OK? (more details above)
 hand out materials for the 'find the werewolf' game as people arrive
 welcome: intro to HUB, 'find a werewolf' and the rest of the evening's programme (5mins)
 flash talk: why do we need statistics? / how to lie with statistics (5 mins)
 birthday problem icebreaker (10-20 mins)
 Monty Hall problem: intro as a game show, then a live simulation (20 mins)
 another game / activity, perhaps with a biological angle (eg. the 'we tend to be overconfident' exercise but with biological questions) (20 mins)
 final words
 unmask the werewolf (maybe give them the cure, i.e. a prize?) and explain the stats
 one line summary of stats vs. intuition
 present a gift (wine?) to the guest speaker
 advertise HUB website, mailing list and coming events
 off to Cafe Botanik
 if so, flesh out the details: for each point:
 what will we do,
 what we need to be able to do it
 who will do it
 are there any new ideas for activities?
 decide on the schedule
 should we fix a date for the barbecue so we can announce it at this HUB?
 will it be possible to start the HUB earlier than 20:00?
 Iman/Chiara have suggested that HUB give a brief presentation before lunch to encourage more people to attend in the evening, perhaps using our video. Who would like to do this?
 HUB organisers need to help set up the room K1/K2
 wiki page
 poster
 business cards to hand out on the night? Since we will potentially have a large audience of people who've never been to HUB before, it would be good to make it easy for them to know where to get more info and how to sign up for the mailing list (they may not bother to write it down if it's shown briefly on a slide).
 name for the evening? maybe people will be put off by the word 'statistics'?... Suggestions:
 statistics and intuition (current working title)
 statistics vs. intuition
 statistics: for when intuition goes wrong
 HUBonomics
 Data beating intuition
 any other business
Minutes:
 Iman: HFYLS will provide 24 bottles of white wine, 12 red, 12 apple juice, 10 bottles sparkling water, 10 still, also snacks
 Iman: we need to help HFYLS set up and clear up (we always do this anyway)
 Monty Hall problem might be too well known, but probably OK, and should work OK with the demo and the simulation
 demo: do the first choice, then the host takes one away, then ask the audience if he/she should switch
 then do the simulation in groups (at least three rounds)
 then sum the results of the simulation, then ask the audience again, then get the contestant to make their second choice
 Flash talk from Bernd Klaus
 Birthday problem icebreaker, but in two dimensions (months and days)
 Find a werewolf should work
 'We tend to be overconfident' exercise: we should try to come up with ten good questions, then ask the other HUB organisers for comments. We also need to emphasise on the night that the ranges should be as narrow as possible, but still 90% confident
 use various 'obscure' papers that have high number of citations (two or three of the questions)
 Holger will do a card trick dealing with probabilities; call it 'an introduction to probability'; have it immediately after the icebreaker because people will already be standing
 Intro should talk about why stats is important
 Requirements:
 Monty Hall: Mars bars, pictures of goats, playing cards (Matt)
 Werewolf: werewolf cards and instructions, and lots of dice (Matt)
 biological questions (everyone)
 explanatory slides (one for each solution explanation) (Matt)
 card trick: cards
 Who: ask Sam to be Monty Hall? (Matt to ask)
 Matt to MC mostly
 Matt to set up a doodle for BBQ dates
 Intro to HUB at lunchtime: one sentence intro then show the video (put this as a task on the wiki)
 Matt to do the wiki page
 Poster: waiting for answer re: can we use the logo. Matt to have a go at the poster in lieu of anyone else (put it on the task list).
 Business cards: use vistaprint with adverts on the back  quite cheap (Holger to investigate)
 Event title: intuistics (Holger)
 AOB:
 Matt to write the email advert and send it to others for comments, then send it out (after wiki page is up)
 send the HUB email advert to the HFYLS participant list too.
 Can we come up with an overfitting exercise? eg. in groups, come up with a classifier, then move to another group. cf. 'find something unique' icebreaker