{"id":582,"date":"2020-11-02T01:54:56","date_gmt":"2020-11-02T01:54:56","guid":{"rendered":"https:\/\/garycornell.com\/?p=582"},"modified":"2020-11-02T01:54:56","modified_gmt":"2020-11-02T01:54:56","slug":"just-what-is-a-margin-of-error-anyway-sampling-1","status":"publish","type":"post","link":"https:\/\/garycornell.com\/2020\/11\/02\/just-what-is-a-margin-of-error-anyway-sampling-1\/","title":{"rendered":"Just what is a “margin of error” anyway-Sampling 1"},"content":{"rendered":"\n

You have probably been seeing a lot of polls lately. They all end by saying something like “we sampled 1,000 people and our margin of error is 3.8% or 4.5%” or some other weird percentage. I thought I would take some time to explain where this number comes from and what it means. I want to start by saying that the technical term isn\u2019t \u201cmargin of error\u201d but rather \u201cmargin of sampling error,\u201d and the key words are \u201csampling error.\u201d Although it may not seem directly connected to the pandemic, \u201csampling error\u201d is a fundamental concept in statistics that comes up whenever you try to estimate Covid-19 prevalence, for example. This post won\u2019t get too much into the math, but eventually math will rear its head when discussing sampling error, so I will have some future posts that are a little more math-centric.<\/p>\n\n\n\n

Anyway, statisticians like to talk about a \u201cpopulation\u201d: that\u2019s what you are trying to understand by taking a \u201csample.\u201d We can\u2019t test everybody in the United States for antibodies to the virus that causes Covid, so we test a \u201csample.\u201d From the results of that sample we try to estimate the \u201ctrue\u201d result, the actual number of people who have caught the disease. For example, suppose we find that 10% of our sample tests positive and we \u201cjump\u201d to the conclusion that, heck, probably 10% of the whole population is positive. Are we really jumping?<\/p>\n\n\n\n

The answer depends on how the sample was taken! But if it was taken \u201crandomly\u201d (and I will have to have a post on just what that means, because it\u2019s actually a tricky concept), the answer is \u201cprobably no, we are really not jumping to conclusions,\u201d and this is true <\/em>even if the sample seems tiny compared to the actual population size. And yes, it seems like magic that a sample of 1,000 or so allows one to make reasoned judgements about populations in the hundreds of millions, i.e. that you can poll 1,000 people and make reasoned judgements about the 210 million adults of the United States.<\/p>\n\n\n\n

But it is true. A fundamental result, perhaps the fundamental result in statistics, says that the results from relatively small random samples come pretty close to the true result for the whole population under some pretty general and very reasonable assumptions. And this is true no matter how large the population you are sampling. <\/em>And yes, I\u2019ll repeat it: it does seem like magic that what matters is the size of the sample rather than, say, the size of the sample relative to the size of the population.<\/p>\n\n\n\n
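To make this a little more concrete, here is a minimal simulation sketch in Python. It is only an illustration, not data from any real survey: the two population sizes, the 50% true share, the 400 repeated polls, and the function names poll_once and share_of_polls_within are all assumptions chosen for the demo. It polls 1,000 people at random from two made-up populations of very different sizes and counts how often the poll lands within about 3 points of the truth.

```python
import random

def poll_once(rng, population_size, true_share, sample_size):
    # Treat people 0 .. yes_count-1 as the ones who would answer yes.
    yes_count = int(population_size * true_share)
    # Draw distinct people uniformly at random (nobody is polled twice).
    chosen = rng.sample(range(population_size), sample_size)
    yes_in_sample = sum(1 for person in chosen if person < yes_count)
    return yes_in_sample / sample_size

def share_of_polls_within(rng, population_size, true_share=0.5,
                          sample_size=1000, tolerance=0.031, polls=400):
    # Repeat the poll many times and count how often it lands near the truth.
    hits = 0
    for _ in range(polls):
        estimate = poll_once(rng, population_size, true_share, sample_size)
        if abs(estimate - true_share) <= tolerance:
            hits += 1
    return hits / polls

rng = random.Random(2020)  # fixed seed so the run is repeatable
for population_size in (100_000, 10_000_000):
    print(population_size, share_of_polls_within(rng, population_size))
```

If the result described above holds, both printed shares should come out near 0.95, even though one population is 100 times larger than the other.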

In fact, if you take a \u201crandom\u201d sample of about 1,000 people from the adult population of the United States (about 210 million people), the odds of being off by more than 3% in either direction are, roughly speaking, 1\/20. Go to about 2,400 people and then 19\/20 times you are within about 2% of the correct answer; to get within about 1% you need roughly 9,600 people. All this means is that if you have the time and money to increase the sample size to what still seems ridiculously small relative to the population size, you can make the chance of being wrong ridiculously small too. So I hope you can see why sampling can be so powerful in determining the hidden occurrence of Covid-19 infections, for example, and why polling, if done properly, can work.<\/p>\n\n\n\n
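For readers who want to check numbers like these themselves, here is a short sketch using the standard normal-approximation formula for a simple random sample, margin = z * sqrt(p * (1 - p) / n). The assumptions are mine: a 19-in-20 (95%) confidence level with z = 1.96, an evenly split population (p = 0.5), and the function names margin_of_error and sample_size_for, which are just labels for this example.

```python
import math

Z_95 = 1.96  # normal quantile for a 19-in-20 (95%) confidence level

def margin_of_error(n, p=0.5, z=Z_95):
    # Approximate margin of sampling error for a simple random sample of size n.
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, p=0.5, z=Z_95):
    # Smallest sample size whose approximate margin is at most the given margin.
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

for n in (500, 1000, 2400, 10000):
    # Print the sample size and its margin of error in percentage points.
    print(n, round(100 * margin_of_error(n), 1))

print(sample_size_for(0.03))
```

The population size never appears in either function, which is the point of the whole post: under these assumptions only the sample size and the split matter.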

OK, as a mathematician I need to say this: mathematics isn\u2019t magic, it just seems that way sometimes. <\/em>And for what it is worth, if I had to pick a single result in all of mathematics whose proof any mathematician can follow relatively easily and yet still have trouble believing, it is this result.<\/p>\n\n\n\n

But I need to reiterate that when looking at the results of any survey: (a) you need to be sure they did a random, unbiased sample, and (b) even if they did that, you need to keep in mind that almost all reported sampling results allow a 1\/20 chance of being off by more than their \u201cmargin of error.\u201d Finally, (c) it\u2019s worth keeping in mind that if they did a random unbiased sample,<\/em> there is only a very, very small chance of them being off by more than twice their margin of error.<\/p>\n\n\n\n
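How small is very, very small? Here is a rough sketch, assuming the sampling error is approximately normally distributed and the reported margin corresponds to z = 1.96 (the usual 19-in-20 convention). The function name chance_off_by_more_than is hypothetical; math.erfc is the complementary error function from the Python standard library.

```python
import math

def chance_off_by_more_than(multiple_of_margin, z=1.96):
    # Two-sided normal tail: P(|Z| > multiple * z) for a standard normal Z.
    return math.erfc(multiple_of_margin * z / math.sqrt(2))

for multiple in (1, 2):
    # multiple = 1 is the reported margin, multiple = 2 is twice the margin.
    print(multiple, chance_off_by_more_than(multiple))
```

Under that approximation the chance of missing by more than one margin is about 1 in 20, while the chance of missing by more than two margins is on the order of 1 in 10,000. Real polls have other sources of error besides sampling, so treat this as a statement about random sampling only.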

(Technical note: These calculations assume a more or less equally split population. The numbers would change slightly for a more extreme split, such as 75%-25%. But, roughly speaking, the error is inversely proportional to the square root of the sample size, and the population size doesn\u2019t figure into it!)<\/p>\n\n\n\n
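The technical note can be sketched in a few lines of Python. This again uses the normal approximation with z = 1.96, and it adds the textbook finite population correction, sqrt((N - n) / (N - 1)), to show how little the population size N matters. The function name margin_of_error and the example population sizes are assumptions for illustration only.

```python
import math

def margin_of_error(n, p=0.5, population_size=None, z=1.96):
    # Basic normal-approximation margin for a simple random sample.
    margin = z * math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        # Finite population correction: shrinks the margin when the
        # sample is a noticeable fraction of the population.
        margin *= math.sqrt((population_size - n) / (population_size - 1))
    return margin

print(margin_of_error(1000, p=0.5))    # evenly split population
print(margin_of_error(1000, p=0.75))   # 75-25 split: somewhat smaller
print(margin_of_error(4000, p=0.5))    # four times the sample: half the margin
print(margin_of_error(1000, population_size=210_000_000))
print(margin_of_error(1000, population_size=20_000))
```

The first and fourth lines should agree to several decimal places, since the correction is essentially 1 for a population of 210 million. It only starts to matter when the sample is a sizeable fraction of the population, as in the last line.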

So stay tuned for more posts that go deeper into the magic and mystery of how sampling works! <\/p>\n","protected":false},"excerpt":{"rendered":"

You have been probably seeing a lot of polls lately. They all end by saying something like “we sampled 1,000 people and our margin of error is 3.8% or 4.5%” or some other weird percentage. I thought I would take some time to explain where this number comes from and what it means. I want … Continue reading “Just what is a “margin of error” anyway-Sampling 1″<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7,6],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/posts\/582"}],"collection":[{"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/comments?post=582"}],"version-history":[{"count":1,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/posts\/582\/revisions"}],"predecessor-version":[{"id":583,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/posts\/582\/revisions\/583"}],"wp:attachment":[{"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/media?parent=582"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/categories?post=582"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/garycornell.com\/wp-json\/wp\/v2\/tags?post=582"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}