Blog posts

2024

Why Learn Code

9 minute read

Published:

If you are an undergraduate or graduate student in the social sciences, you will probably be asked at some point to learn some statistical software. You might be surprised to learn that Microsoft Excel usually does not count. Instead, you’ll be asked to learn Stata, R, Python, or maybe SPSS or SAS.

Counting for adults III: balls/bins, partitions, inclusion-exclusion, Stirling numbers

21 minute read

Published:

In previous posts about counting, I did not directly tie the questions to the ubiquitous “balls into bins” framing. In this post, I connect what I have discussed in past posts to this framing, which leads to some important ideas that are new to this blog: the inclusion-exclusion principle, Stirling numbers of the second kind, and Bell numbers.
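As a rough illustration of how these ideas fit together, here is a minimal Python sketch. It assumes the standard inclusion-exclusion formula for Stirling numbers of the second kind, \(S(n, k) = \frac{1}{k!} \sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n\), which counts the ways to place \(n\) distinguishable balls into \(k\) indistinguishable bins with no bin left empty. The function names `stirling2` and `bell` are illustrative and not taken from the post.

```python
from math import comb, factorial


def stirling2(n: int, k: int) -> int:
    """Stirling number of the second kind, computed by inclusion-exclusion."""
    # Count surjections from n balls onto k labeled bins, then divide by k!
    # to forget the bin labels.
    surjections = sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))
    return surjections // factorial(k)


def bell(n: int) -> int:
    """Bell number: total number of partitions of an n-element set."""
    return sum(stirling2(n, k) for k in range(n + 1))


print(stirling2(4, 2))  # 7
print(bell(4))          # 15
```

Summing \(S(n, k)\) over all \(k\) gives the Bell number, the number of ways to partition \(n\) balls into any number of non-empty unlabeled bins.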

Some useful division tricks

17 minute read

Published:

Working out which multiple-digit numbers are divisible by the single-digit numbers is a very handy arrow to have in one’s quiver. In this post, I show tricks for testing whether multi-digit numbers are divisible by \(2, 3, 4, 5, 6, 7, 8, 9,\) and \(11\). I also use this occasion to introduce a bit of modular arithmetic.
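To give a flavor of such tricks, here is a small Python sketch of three well-known rules: the digit-sum tests for \(3\) and \(9\), and the alternating digit-sum test for \(11\). These follow from \(10 \equiv 1 \pmod{3}\), \(10 \equiv 1 \pmod{9}\), and \(10 \equiv -1 \pmod{11}\). The helper names are hypothetical, and the post itself may present the rules differently.

```python
def digits(n: int) -> list[int]:
    """Decimal digits of |n|, most significant first."""
    return [int(d) for d in str(abs(n))]


def divisible_by_3(n: int) -> bool:
    # 10 ≡ 1 (mod 3), so n is congruent to its digit sum mod 3.
    return sum(digits(n)) % 3 == 0


def divisible_by_9(n: int) -> bool:
    # 10 ≡ 1 (mod 9), so the same digit-sum argument applies.
    return sum(digits(n)) % 9 == 0


def divisible_by_11(n: int) -> bool:
    # 10 ≡ -1 (mod 11), so n is congruent to the alternating sum of its
    # digits, starting with a plus sign at the units digit.
    reversed_digits = digits(n)[::-1]
    alternating = sum(d if i % 2 == 0 else -d for i, d in enumerate(reversed_digits))
    return alternating % 11 == 0


print(divisible_by_3(123), divisible_by_9(729), divisible_by_11(918082))
```

Each rule replaces a division of the full number by a division of a much smaller digit sum.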

The universality of NAND gates

4 minute read

Published:

Let’s show a useful theorem in computer science: \(\text{NAND}\) gates are all one needs to express any Boolean function, which makes them extremely useful in computing.
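One common way to see this, sketched below in Python, is to build \(\text{NOT}\), \(\text{AND}\), and \(\text{OR}\) out of \(\text{NAND}\) alone. Since those three gates suffice to express any Boolean function, so does \(\text{NAND}\). The function names are illustrative, and the post's own proof may proceed differently.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def not_(a: bool) -> bool:
    # NOT a = a NAND a
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    # a AND b = NOT (a NAND b)
    return nand(nand(a, b), nand(a, b))


def or_(a: bool, b: bool) -> bool:
    # a OR b = (NOT a) NAND (NOT b), by De Morgan's law
    return nand(nand(a, a), nand(b, b))


# Check every row of the truth tables against Python's built-in operators.
for a, b in product([False, True], repeat=2):
    assert not_(a) == (not a)
    assert and_(a, b) == (a and b)
    assert or_(a, b) == (a or b)
```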

A straightforward discussion of the Josephus problem

6 minute read

Published:

In this post, I offer a straightforward discussion of the Josephus problem as treated in Graham, Knuth, and Patashnik’s Concrete Mathematics. I follow their discussion but present it more directly, foregrounding a simpler and brilliant proof found in one of the infamous marginal comments.
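As a quick illustration, the Python sketch below assumes the variant in which \(n\) people stand in a circle and every second person is eliminated. For that variant, writing \(n = 2^m + l\) with \(0 \leq l < 2^m\) gives the survivor \(J(n) = 2l + 1\). The function names are hypothetical, and the brute-force simulation is included only as a check on the closed form.

```python
from collections import deque


def josephus_closed_form(n: int) -> int:
    """Survivor's position (1-indexed) when every second person is eliminated."""
    m = n.bit_length() - 1  # 2**m is the largest power of two not exceeding n
    l = n - (1 << m)
    return 2 * l + 1


def josephus_simulate(n: int) -> int:
    """Brute-force simulation of the same process."""
    circle = deque(range(1, n + 1))
    while len(circle) > 1:
        circle.rotate(-1)  # the person at the front is skipped and moves to the back
        circle.popleft()   # the next person is eliminated
    return circle[0]


print(josephus_closed_form(10))  # 5
assert all(josephus_closed_form(n) == josephus_simulate(n) for n in range(1, 200))
```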

Counting for adults II: the binomial and multinomial distribution

17 minute read

Published:

In this post, I introduce two very useful distributions in statistics, the binomial and the multinomial. What follows is not a full lecture on these topics; it merely introduces what is most important about these distributions for the ongoing series on survey statistics.

Counting for adults: introduction to combinatorics

51 minute read

Published:

In this post, I introduce the basic combinatorics that anyone who is serious about statistics beyond the most basic level should know. This part of math also, I think, happens to be exceptionally beautiful.

Complex surveys III: cluster random sampling

15 minute read

Published:

In this post, I briefly discuss the benefits and drawbacks of cluster random sampling. Much of the post is dedicated to some interesting transformations of the sampling variance of the cluster sample mean. (Updated 2024-05-10.)

Complex surveys I: introduction to finite population statistics

26 minute read

Published:

In the following series of posts, I want to provide a short introduction to complex survey design. I have not found, anywhere on the internet, a short (\(\leq 30\) page) document that clearly introduces stratified and cluster random sampling from a finite population and derives point estimators for the mean, along with the true and estimated sampling variance of the mean. In principle, all of this can be developed without much advanced math, and Kish (1965); Cochran (1977); Särndal, Swensson, and Wretman (SSW) (1992); and Lohr (1999) all provide good, accessible treatments. That said, each of these graduate-level textbooks needs most of its 300 or 400 pages to bring readers to the point of fully understanding a complex survey. For many readers, this is simply too much detail, and I have found that each of the textbooks above, while individually useful, typically omits certain important assumptions that are fully spelled out only in one of the others.

The handshake puzzle: introduction to counting

25 minute read

Published:

Here is my inaugural blog post. I wrote it so that a talented middle-school student could make sense of it, but it might also be useful to people who haven’t needed this beautiful math in the long interval between high school and graduate school.