Random thoughts on primes

Recently, we saw in the headlines that a new world record has been set for the largest prime (not surprisingly, it’s a Mersenne prime, which means it’s of the form 2^n - 1 for some n); see here, or for more details, see here. Also, StewJo started a blog on Guesstimation and Benford’s law; Benford’s law says that in many sequences (such as the Fibonacci numbers or values of the Riemann zeta function near the critical line) there is a preponderance for having a small leading digit (1, 2 or 3) over a high leading digit (7, 8 or 9); in fact, the probability of a first digit of d base 10 is just log_{10}(1 + 1/d) — see here for more on Benford’s law. For our purposes, instead of looking at just the first digit we will want to look at the first k digits.

What’s the connection between these posts? One can ask about the distribution of leading digits of these large primes. In base 10 it will almost surely follow Benford’s law; however, in base 2 these numbers will definitely NOT follow Benford’s law (the first k digits will all be 1s for sufficiently large Mersenne primes). It takes a little bit of time to compute the first digit of these numbers, as the 46th Mersenne prime has 12,978,189 digits! But a little patience shows 28% have a first digit of 1, quite close to the 30% from Benford’s law, and the chi-square value, with 8 degrees of freedom, is 3.2, indicating a good fit. (For definiteness, here are the first digits of the first 46 known Mersenne primes: 3, 7, 3, 1, 8, 1, 5, 2, 2, 6, 1, 1, 6, 5, 1, 1, 4, 2, 1, 2, 4, 3, 2, 4, 4, 4, 8, 5, 5, 5, 7, 1, 1, 4, 8, 6, 1, 4, 9, 1, 2, 1, 3, 1, 2, 3 — hmm, only one 9!)

This leads to the following interesting question: given an explicit sequence of primes, what can you say about the distribution of leading digits? If you look at ALL the primes, the answer is that there is no limiting distribution! The reason is if you look at all primes up to 10^N or 2*10^N, you’ll see a very different answer; the percentage of leading digits of 1 oscillates between basically 5/9 (if we truncate at 10^N) and 1/9 (if we truncate at 2*10^N). One way around this is to use analytic density and not the natural density to study the primes (we’ll talk about both of these concepts in Math 406; in Dirichlet’s proof of the infinitude of primes it is the analytic density that naturally arises). (In fact, there is an aside in Serre’s A course on arithmetic where he states that Bombieri showed that, using analytic density, the first digit of primes is 1 with probability log_{10}(1 + 1/d), though no explicit mention of Benford’s law is made.)

But let’s say we don’t want to use analytic density, but we want to just look at the first N primes in some sequence and study their leading digits. What will we see? Will the answer depend on the base (a big yes if our sequence is the Mersenne primes and the base is two!)? It’s a very nice question to give explicit sequences of primes in some natural way. The following are some of my favorites, and are related to proofs of the infinitude of primes.

The following proof, due to Euclid, is almost surely well known to most of the readers here. To prove there are infinitely many primes, assume not. Thus there are finitely many primes, say p_1,…,p_n. Consider the product p_1*…*p_n+1; it is either prime, or divisible by a prime not on the list. Let a_{n+1} be the smallest prime dividing p_1 * … * p_n + 1/$ that is not already on our list. Thus the sequence a_i is {2,3,7,43,13, 53, 5, 6221671, 38709183810571, 139, 2801, 11, 17, 5471, …}; this is sequence A000945 in the On-line Encylopedia of Integer Sequences (a GREAT website, which should be a subject of a future post). A quick examination of the first 43 terms in this sequence show remarkable agreement with Benford’s law base 10 (again about 28% have leading digit of 1, and the chi-square value is about 2).

While there are lots of proofs of the infinitude of primes (see for example, Proofs from THE Book), most of them do not give infinite sequences of primes. Another proof that does is the following: for n = 0, 1, 2, … let F_n = 2^{2^n}+1 be the nth Fermat number; then one can show that these numbers are all relatively prime, and thus if we take the lowest prime dividing each, we get a new sequence of primes: (see sequence A093179 in the OEIS).

To end, here’s the gauntlet question: can you generate any other interesting sequences that prove there are infinitely many primes? For fame and fortune, show that your sequences are Benford for some base; this will be VERY hard, I think. I don’t believe this has been shown for any of the sequences above; in fact, I believe it’s open as to whether or not every prime is contained in Euclid’s sequence.