Rational fractal distributions

Up among the peaksMost of the time we encounter probability distributions over the reals, the positive reals, or integers. But one can use the rational numbers as a probability space too.

Recently I found the paper Vladimir Trifonov, Laura Pasqualucci, Riccardo Dalla-Favera & Raul Rabadan. Fractal-like Distributions over the Rational Numbers in High-throughput Biological and Clinical Data. Scientific Reports 1:191 DOI: 10.1038/srep00191. They discuss the distribution of ratios of the number of reads from the same spot of DNA that come from each chromosome in a pair: the number of reads is an integer, so the ratio is rational. They get a peaky, self-similar distribution empirically, and the paper explains why. 

If you take positive independent integers from some distribution f(n) and generate ratios q=a/(a+b), then those ratios will have a distribution that is a convolution over the rational numbers: g(q) = g(a/(a+b)) = \sum_{m=0}^\infty \sum_{n=0}^\infty f(m) g(n) \delta \left(\frac{a}{a+b} - \frac{m}{m+n} \right ) = \sum_{t=0}^\infty f(ta)f(tb)

One can of course do the same for non-independent and different distributions of the integers. Oh, and by the way: this whole thing has little to do with ratio distributions (alias slash distributions), which is what happens in the real case.

The authors found closed form solutions for integers distributed as a power-law with an exponential cut-off and for the uniform distribution; unfortunately the really interesting case, the Poisson distribution, doesn’t seem to have a neat closed form solution.

In the case of a uniform distributions on the set \{1,2,\ldots , L\} they get g(a/(a+b)) = (1/L^2) \lfloor L/\max(a,b) \rfloor .

The rational distribution g(a/(a+b))=1/max(a,b) of Trifonov et al.
The rational distribution g(a/(a+b))=1/max(a,b) of Trifonov et al.

They note that this is similar to Thomae’s function, a somewhat well-known (and multiply named) counterexample in real analysis. That function is defined as f(p/q)=1/q (where the fraction is in lowest terms). In fact, both graphs have the same fractal dimension of 1.5.

It is easy to generate other rational distributions this way. Using a power law as an input produces a sparser pattern, since the integers going into the ratio tend to be small numbers, putting more probability at simple ratios:

The rational distribution g(a/(a+b))=C(ab)^-2 of Trifonov et al.
The rational distribution g(a/(a+b))=C(ab)^-2 (rational convolution of two index -2 power-law distributed integers).

If we use exponential distributions the pattern is fairly similar, but we can of course change the exponent to get something that ranges over a lot of numbers, putting more probability at nonsimple ratios p/q where p+q \gg 1:

The rational distribution of two convolved Exp[0.1] distributions.
The rational distribution of two convolved Exp[0.1] distributions.
Not everything has to be neat and symmetric. Taking the ratio of two unequal Poisson distributions can produce a rather appealing pattern:

Rational distribution of ratio between a Poisson[10] and a Poisson[5] variable.
Rational distribution of ratio between a Poisson[10] and a Poisson[5] variable.
Of course, full generality would include ratios of non-positive numbers. Taking ratios of normal variates rounded to the nearest integer produces a fairly sparse distribution since high numerators or denominators are rare.

Rational distribution of normal variates rounded to nearest integer.
Rational distribution of a/(a+b) ratios of normal variates rounded to nearest integer.

But multiplying the variates by 10 produces a nice distribution.

Rational distribution of ratios of normal variates multiplied by 10 and rounded.
Rational distribution of a/(a+b) ratios of normal variates that have been multiplied by 10 and rounded.

This approaches the Chauchy distribution as the discretisation gets finer. But note the fun microstructure (very visible in the Poisson case above too), where each peak at a simple ratio is surrounded by a “moat” of low probability. This is reminiscent of the behaviour of roots of random polynomials with integer coefficients (see also John Baez page on the topic).

The rational numbers do tend to induce a fractal recursive structure on things, since most measures on them will tend to put more mass at simple ratios than at complex ratios, but when plotting the value of the ratio everything gets neatly folded together. The lower approximability of numbers near the simple ratios produce moats. Which also suggests a question to ponder further: what role does the über-unapproximable golden ratio have in distributions like these?

Mathematical anti-beauty

Browsing Mindfuck Math I came across a humorous Venn diagram originally from Spikedmath.com. It got me to look up the Borwein integral. Wow. A kind of mathematical anti-beauty.

“As we all know”, sinc(x)=sin(x)/x for x\neq 0 and defined to be 1 for x=0. It is not that uncommon as a function. Now look at the following series of integrals:

\int_0^{\infty} sinc(x) dx = \pi/2 ,

\int_0^{\infty} sinc(x) sinc(x/3) dx = \pi/2 ,

\int_0^\infty sinc(x) sinc(x/3) sinc(x/5) dx = \pi/2 .

The pattern continues:

\int_0^\infty sinc(x) sinc(x/3) sinc(x/5) \cdots sinc(x/13) dx = \pi/2 .

And then…

\int_0^\infty sinc(x) sinc(x/3) sinc(x/5) \cdots sinc(x/13) sinc(x/15) dx
=\frac{467807924713440738696537864469}{935615849440640907310521750000}\pi

It is around 0.499999999992646 \pi – nearly a half, but not quite.

What is going on here? Borwein & Borwein give proofs, but they are not entirely transparent. Basically the reason is that 1/3+1/5+…1/13 < 1, while 1/3+1/5+…1/13 + 1/15 > 1, but why this changes things is clear as mud. Thankfully Hanspeter Schmid has a very intuitive explanation that makes what is going on possible to visualize. At least if you like convolutions.

In any case, there is something simultaneously ugly and exciting when neat patterns in math just ends for no apparent reason.

Another good example is the story of the Doomsday conjecture. Gwern tells the story well, based on Klarreich: a certain kind of object is found in dimension 2, 6, 14, 30 and 62… aha! They are conjectured to occur in all  2^n-2  dimensions. A branch of math was built on this conjecture… and then the pattern failed in dimension 254. Oops. 

It is a bit like the opposite case of the number of regular convex polytopes in different dimensions: 1, infinity, 5, 6, 3, 3, 3, 3… Here the series start out crazy, and then becomes very regular.

Another dimensional “imperfection” is the behaviour of the volume of the n-sphere: V_n(r)=\frac{\pi^{n/2}r^n}{\Gamma(1+n/2)}

Volume of unit hyperspheres as a function of dimension

The volume of a unit sphere increases with dimension until n \approx 5.25 , and then decreases. Leaving the non-intuitiveness of why volumes would shrink aside, the real oddness is that the maximum is for a non-integer dimension. We might argue that the formula is needlessly general and only the integer values count, but many derivations naturally bring in the Gamma function and hence the possibility of non-integer values.

Another association is to this integral problem: given a set of integers x_i, is the integral \int_0^\pi \prod_i \sin(x_i \theta) d\theta = 0 ? As shown in Moore and Mertens, this is NP-complete. Here the strangeness is that integrals normally are pretty well behaved. It seems absurd that a particular not very scary trigonometric integral should require exponential work to analyze. But in fact, multivariate integrals are NP-hard to approximate, and calculating the volume of a n-dimensional polytope is actually #P-complete.

We tend to assume that mathematics is smoother and more regular than reality. Everything is regular and exceptionless because it is generated by universal rules… except when it isn’t. The rules often act as constraints, and when they do not mesh exactly odd things happen. Similarly we may assume that we know what problems are hard or not, but this is an intuition built in our own world rather than the world of mathematics. Finally, some mathematical truths maybe just are. As Gregory Chaitin has argued, some things in math are irreducible; there is no real reason (at least in the sense of a comprehensive explanation) for why they are true.

Mathematical anti-beauty can be very deep. Maybe it is like the insects, rot and other memento mori in classical still life paintings: a deviation from pleasantness and harmony that adds poignancy and a bit of drama. Or perhaps more accurately, it is wabi-sabi.