p-adic numbers
Ready to blow your mind?
In my explorations of the Collatz Conjecture, I recently encountered entirely new number systems that have blown my mind, and with this post, dear reader, I hope to blow your mind as well. I begin with some general comments about the origins and significance of p-adic numbers, followed by an avenue into understanding them for those readers with virtually no experience in mathematics beyond elementary arithmetic and just a wee bit of algebra.
Over the past century or so since their discovery, p-adic numbers and p-adic analysis have become a cornerstone of modern number theory. They emerged from innovative work by Kurt Hensel in 1897. His groundbreaking idea was to introduce methods of power series into the realm of number theory. Don’t worry about that for now, as all will become clearer below. Suffice it to say that p-adic numbers have provided new understandings and insights into problems involving numbers.
Each prime number, p, opens up an entirely new space of numbers. There isn’t just one space of p-adic numbers, there are infinitely many. Not only that: just as the real numbers are uncountable, in that they cannot be put into one-to-one correspondence with the natural numbers the way the rational numbers can, there are also uncountably many p-adic numbers in every p-adic space.
An important aspect of p-adic numbers is that they offer a natural and powerful language for talking about congruences between integers. They have also enabled the use of methods borrowed from calculus and analysis for studying such problems, and have begun to manifest themselves in other areas of mathematics and in physics. Whenever new number systems are discovered, the implications are manifestly profound. I’ve only just scratched the surface, but I’m already enthralled.
My main interest in p-adic numbers is that they are helping me make sense of the Inverse Collatz Map. In what ways, exactly, will have to await a future post. Consider this an intermediate post for understanding what will follow in that regard. In a more general sense, I’m fascinated by the fact that p-adic numbers, by defining completely different notions of distance and convergence, provide deep insights into sequences and series governed by arithmetic properties (divisibility by primes) rather than magnitude. I’ve been on a fascinating journey that I’m delighted to share.
Let’s begin with something very important about numbers: base representation.
The most important invention of the past two thousand years is base representation. It has become so ubiquitous in its use that many, if not most, are completely unaware of it. Consider, for instance, the number 4073. That is actually an abbreviation, shorthand, so to speak, of its actual representation in base 10:
$$4073 = (4 \times 10^{3}) + (0 \times 10^{2}) + (7 \times 10^{1}) + (3 \times 10^{0})$$
Much is laid bare here. Notice that 4073 is just an ordered sequence of symbols (coefficients) of the base 10 representation for that number. In base 10, ten symbols are required, 0, 1, 2, …, 9 (in base 2, only two symbols are required, 0 and 1). The beauty of base representation (unlike other number systems like Roman numerals), is that ten symbols are all we need using base 10 representation to represent a number of any magnitude we wish (subject to available resources for so doing, of course).
Another thing to notice about base 10 representation is the “power” of 10 associated with each coefficient. For example, 10 to the 3rd power means three 10s multiplied together: 10 x 10 x 10 = 1000; 10 to the 2nd power means two 10s multiplied together: 10 x 10 = 100; 10 to the first power is just 10; and then the tricky bit, 10 to the zeroth power is just 1 (any nonzero number to the zeroth power is defined as 1).
Multiply the coefficients by the powers of 10 and then add it all up, and voilà: 4073. I’ve often emphasized that next to learning how to add 1 + 1 to get 2, the second most important addition to teach children is 9 + 1 = 10. Can you see why? Without some understanding of base representation, such an addition seems arbitrary. It’s not.
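For readers who like to tinker, here is a minimal Python sketch of the idea above: peel a number apart into its coefficients and then multiply and add to put it back together. The function names `to_digits` and `from_digits` are my own, chosen just for illustration.

```python
def to_digits(n: int, base: int = 10) -> list[int]:
    """Return the coefficients of a non-negative integer, most significant first."""
    if n < 0 or base < 2:
        raise ValueError("expects n >= 0 and base >= 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # the remainder is the next coefficient
        digits.append(remainder)
    return digits[::-1]


def from_digits(digits: list[int], base: int = 10) -> int:
    """Multiply each coefficient by its power of the base and add everything up."""
    total = 0
    for d in digits:
        total = total * base + d
    return total


print(to_digits(4073))            # [4, 0, 7, 3]
print(from_digits([4, 0, 7, 3]))  # 4073
print(to_digits(9 + 1))           # [1, 0]
```

The last line is the 9 + 1 = 10 example: once the units run out of symbols, the result needs a second coefficient.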
It is what happens when we run out of symbols. We don’t have a symbol for 10. Thankfully, base 10 comes to our rescue, and herein lies its true “power.” With nine units, we can add one. Instead of having ten units, for which we have no symbol, we group those ten units into a single unit of 10, which we represent by 10 to the first power. We no longer have any units, so the coefficient for 10 to the zeroth power is 0, but we have one unit of 10, so the coefficient for 10 to the first power is 1. Therefore, (1 x 10) + (0 x 1) = 10 + 0 = 10.
Bottom line, when we run out of symbols, we regroup into a higher power of ten, and that regrouping is designated by the humble “carry” in simple addition. Simple, in the sense of profoundly simple. The carry in addition (as with the borrow in subtraction), is like a quantum jump from smaller groups to larger groups (and vice versa).
With this basic understanding behind us, why stop now? Let’s keep rolling…
How do we represent fractions using base 10 representation? It seems easy enough: all we require is a “decimal point.” But there are some subtleties that are worth pointing out and keeping in mind going forward. The first thing to note is that base representation with powers of ten to the right of the decimal point turns out to be a way to represent all possible fractions (and more, as we shall see) using an important subset of fractions, which in base 10 is: {1/10, 1/100, 1/1000, …}. Thus, for a number like 4073.25, the representation in powers of 10 is:
$$4073.25 = (4 \times 10^{3}) + (0 \times 10^{2}) + (7 \times 10^{1}) + (3 \times 10^{0}) + (2 \times 10^{-1}) + (5 \times 10^{-2})$$
We use the decimal point to indicate when we transition from non-negative powers of 10 to negative powers of 10. This is an important part of base representation, as it is the order of the coefficients that lets us determine what powers of the base they correspond to. In our example, .25 = 2/10 + 5/100 = 20/100 + 5/100 = 25/100 = 1/4.
You can now see why the .25 component of 4073.25 is referred to as the “fractional component” (or “non-integral component”) of 4073.25. That is because .25 is, quite literally, a fraction (4073 is the “integral component” because it is an integer).
When it comes to fractional components, there are only three possibilities.
(1) it terminates at some negative power of 10.
(2) it begins to infinitely repeat at some point.
(3) it neither terminates nor ever settles into a repeating pattern, but continues on to infinity.
In the first case, the fractions are “friendly” in the sense that, when written in lowest terms, as I have done above with .25 = 1/4, their denominators have only prime factors that are also prime factors of the base. In the case at hand, 2 is the only prime factor of the denominator 4, and 2 is also a prime factor of the base, 10.
In the second case, it gets a bit more messy. Take 1/3, for instance. How to represent that fraction in base 10? Well, 3/10ths is close to 1/3, and 3/10ths plus 3/100ths is even closer. The only way to represent 1/3 exactly in base ten is with coefficients that repeat to infinity in the base 10 expansion, e.g., 0.3333…. We can indicate that 3 repeats to infinity by placing a bar over the first 3 after the decimal point: $0.\overline{3}$.
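The first two cases can be told apart by plain long division: the expansion stops if a remainder hits zero, and starts repeating the moment a remainder shows up a second time. Here is a rough sketch of that idea. The name `expand_fraction` is hypothetical, and the code assumes a non-negative fraction and only looks at the part after the decimal point.

```python
from fractions import Fraction


def expand_fraction(frac: Fraction, base: int = 10) -> tuple[list[int], list[int]]:
    """Long division on the fractional part: return (leading digits, repeating digits)."""
    den = frac.denominator
    remainder = frac.numerator % den  # drop the integer part
    digits = []
    seen = {}  # remainder -> position at which it first appeared
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= base
        digits.append(remainder // den)
        remainder %= den
    if remainder == 0:
        return digits, []  # case (1): the expansion terminates
    start = seen[remainder]
    return digits[:start], digits[start:]  # case (2): the tail repeats forever


print(expand_fraction(Fraction(1, 4)))  # ([2, 5], [])
print(expand_fraction(Fraction(1, 3)))  # ([], [3])
print(expand_fraction(Fraction(1, 6)))  # ([1], [6])
```

Since there are only finitely many possible remainders for a given denominator, the loop always ends, which is exactly why every fraction lands in case (1) or case (2).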
By now, dear reader, if not earlier, you may be wondering just what the heck any of this has to do with p-adic numbers. Rest assured, we’re getting there. But first, let’s take stock of where we are. We’ve used base representation as an amazing way to represent all integers, and now, with the decimal point, we’ve expanded this representation to include all rational numbers. What about the reals?
That brings us to the third case, where the base expansion neither stops nor ever settles into a repeating pattern, but rather continues on through ever more negative powers of the base. The ancient Greeks discovered that not all quantities could be represented as ratios of whole numbers (fractions), most notably the square root of 2. Turns out there are countless examples of such numbers. In fact, “uncountably” many. These numbers came to be known as irrational numbers. Cantor’s diagonal argument shows that the real numbers are uncountable, whereas the rational numbers are countable, so the irrational numbers vastly outnumber the rational ones.
What is important to note from all of this, is that the base (or “radix”) expansion of a real number like π = 3.14159… , for instance, extends through ever more negative powers of 10 without end…
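For any real number, x (taking x ≥ 0 for simplicity, since a negative number just carries a minus sign in front):
$$x = a_n 10^{n} + a_{n-1} 10^{n-1} + \cdots + a_1 10^{1} + a_0 10^{0} + a_{-1} 10^{-1} + a_{-2} 10^{-2} + \cdots$$
Or in a much more condensed ∑ (“sigma” for “sum”) notation:
$$x = \sum_{k=-\infty}^{n} a_k 10^{k}$$
Where each coefficient a_k is one of the digits 0 through 9, n is the highest power of 10 for the integer part of x (i.e., the place value of the leftmost digit), and we can use the ∑ summation notation to collapse all the addends from k = n down to k = -∞. I think you will agree, ∑ notation is an aesthetically pleasing abbreviation (although quite intimidating if not introduced carefully).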
Finally, note that we are not restricted to just using base 10. With base representation, we can use any base, b, with b ≥ 2 to represent the real numbers:
$$x = \sum_{k=-\infty}^{n} a_k b^{k}, \qquad a_k \in \{0, 1, \ldots, b-1\}$$
For instance, with the advent of electronic computers, numbers are represented using base 2 (binary), where the only two symbols are 0 and 1. Base 2 can be hard to read, so programmers condense binary strings into hexadecimal (base 16, the 4th power of 2), where the 16 symbols used are {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. Hence, a long binary string like: 1011 0101 0111 1101 0001 1111 (base 2) reduces to B57D1F (base 16).
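If you want to check that conversion yourself, here is a small sketch using only Python’s built-in `int` and `format`. Because 16 is the 4th power of 2, each group of four binary symbols maps to exactly one hexadecimal symbol, which the second half of the snippet shows group by group.

```python
bits = "1011 0101 0111 1101 0001 1111"

# Read the whole string as one base 2 number, then write it out in base 16.
value = int(bits.replace(" ", ""), 2)
hex_text = format(value, "X")
print(hex_text)  # B57D1F

# Same result one group at a time: four binary symbols per hexadecimal symbol.
groups = [format(int(group, 2), "X") for group in bits.split()]
print(groups)  # ['B', '5', '7', 'D', '1', 'F']
```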
Prior to moving on to the p-adic numbers, it will be helpful to first consider how the real numbers “complete” the rational numbers. The reason for this is, as we shall see, the p-adic numbers also “complete” the rational numbers, in a very different yet complementary way. In both cases, completing the rational numbers depends on what we mean by “distance” and then by “closeness.”
Let’s start with the more familiar notion of the absolute value of a number: |x|. Most of us have come to know the absolute value as a way to ensure that a number is never negative:
$$|x| = \sqrt{x^{2}}$$
This works because the square of a negative number is positive. The absolute value basically tells us the distance of a number from 0. More generally, the distance between two numbers x and y is:
$$d(x, y) = |x - y|$$
The absolute value gives us a “metric” (or measure) of how close two numbers are. When y = 0, we just get the absolute value of x, because x - 0 = x; when x = y, |x - y| = 0, because the distance of a number from itself is always zero; and |x - y| = |y - x|, because the distance from x to y is the same as the distance from y to x. Most importantly, it satisfies the triangle inequality:
$$|x - z| \le |x - y| + |y - z|$$
Note that the triangle inequality basically states that any one side of a triangle with vertices x, y, and z is no longer than the sum of the other two sides, and it is only equal in the “degenerate” case where x, y, and z are collinear (all on the same line).
This simple and very intuitive idea of absolute value, as a way to define distance between any two rational numbers, gives the set of rational numbers a structure that enables us to measure the distance between any two points. With this distance metric we can analyse sequences of rational numbers. Consider, for instance, the sequence of rational numbers that approximates the square root of 2:
$$1,\ 1.4,\ 1.41,\ 1.414,\ 1.4142,\ 1.41421,\ \ldots$$
Using the distance metric |x - y|, we see that consecutive terms in this sequence are getting closer and closer and the distance between them becomes vanishingly small. Sequences like these, where the terms get arbitrarily close to each other, are called Cauchy sequences.
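To watch this happen with exact arithmetic, here is a small sketch that builds one such sequence: the decimal truncations of the square root of 2, each stored as an exact fraction. This is just one convenient choice of sequence, and the helper name `sqrt2_truncation` is my own.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(k: int) -> Fraction:
    """Largest fraction with denominator 10**k whose square does not exceed 2."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)


terms = [sqrt2_truncation(k) for k in range(8)]
gaps = [abs(terms[k + 1] - terms[k]) for k in range(7)]

for k, term in enumerate(terms):
    # Each term is rational, and its square creeps up on 2 without reaching it.
    print(k, term, float(2 - term * term))

# Consecutive terms differ by less than 1/10**k, so the gaps shrink toward zero.
print(all(gaps[k] < Fraction(1, 10 ** k) for k in range(7)))
```

Every term is a perfectly good rational number, yet no term squares to exactly 2, which is the whole point of what comes next.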
Here’s the problem: Cauchy sequences look like they are converging to something. They are zeroing in (pun partially intended) on specific locations on the number line. But there are no rational numbers at those locations. The rational number line is, therefore, incomplete. It has “gaps” or “holes.”
This is where the completion process begins, by first recognizing there are gaps in the set of rational numbers that would be nice to be able to fill (who likes potholes?). The completion process ends by showing that, for every gap in the rational numbers, there exists a Cauchy sequence of rational numbers that converges to the irrational number filling that gap. In this way, the set of real numbers is seen as a metric completion of the rational numbers with respect to the distance metric defined above.
If you’ve made it to this point, congratulations. You have both a notational and a conceptual foundation for understanding p-adic numbers. I must warn you, though, things are gonna start getting weird. Buckle up. We’re going down a deep rabbit hole.
The most important thing to grasp is that p-adic spaces use an entirely different metric for how “close” numbers are to each other. Instead of using a distance metric that enables the reals to complete the rationals, a divisibility “ultrametric” is used based on the divisibility of numbers with respect to a given prime, p.
Here’s how it works when we let p = 2. Numbers whose differences are divisible by increasingly large powers of 2 are increasingly close. So, for instance, because the difference between 1210 and 186 is 1024, which is 2 to the power of 10, they are considered very much “closer” to each other than, say, 33 and 31, where the difference between 33 and 31 is only 2 to the first power. Any given number, n, is defined as being infinitely close to itself, which satisfies our intuition that n - n = 0.
On the other hand, any two consecutive numbers, n + 1 and n, are as far apart, or “distant” from each other as can be because (n + 1) - n = 1 which isn’t divisible by 2 at all. In fact, because the difference between any odd and even number is always odd, and as such never has a factor of 2, they are also as far apart as far apart can be. All else falls between these extremes. How are these ideas to be formalized?
The first thing we need to define is the “p-adic valuation” of a number, n ≠ 0, with respect to a given prime p, which means the exponent of the largest power of p that divides n. Note that in what follows, these definitions hold for all rational numbers a/b with b ≠ 0, but it can be more intuitive to think about them just in terms of integers or natural numbers.
$$v_p(n) = \max\{k \ge 0 : p^{k} \text{ divides } n\}, \qquad v_p(a/b) = v_p(a) - v_p(b)$$
If n = 0, then the p-adic valuation is defined to be infinity. With this p-adic valuation in hand, we can now define the p-adic counterpart to the absolute value over the real numbers, or the “p-adic absolute value”:
$$|n|_p = p^{-v_p(n)} \text{ for } n \ne 0, \qquad |0|_p = 0$$
For example, we saw above that 1024 is 2 to the power of 10, hence its 2-adic absolute value:
$$|1024|_2 = 2^{-10} = \frac{1}{1024}$$
We can now define the p-adic divisibility ultrametric noted above as:
$$d_p(n, m) = |n - m|_p$$
We can see now that applying this ultrametric with p = 2 to n = 1210 and m = 186 gives us 1/1024. The smaller the result, the closer together n and m are. Whenever the difference n - m is not divisible by p, the ultrametric is always 1, and it is always 0 when n = m.
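Here is a minimal sketch of these three definitions in Python, so you can try other pairs of numbers. The function names `valuation`, `padic_abs`, and `padic_distance` are my own labels rather than any standard library, and `math.inf` stands in for the infinite valuation of 0.

```python
import math
from fractions import Fraction


def valuation(x, p: int):
    """Exponent of the prime p in the rational number x (math.inf for x = 0)."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:  # factors of p in the numerator count up
        num //= p
        v += 1
    while den % p == 0:  # factors of p in the denominator count down
        den //= p
        v -= 1
    return v


def padic_abs(x, p: int) -> Fraction:
    """p-adic absolute value: p raised to minus the valuation, and 0 for x = 0."""
    v = valuation(x, p)
    return Fraction(0) if v == math.inf else Fraction(p) ** (-v)


def padic_distance(n, m, p: int) -> Fraction:
    """The p-adic distance between n and m."""
    return padic_abs(Fraction(n) - Fraction(m), p)


print(padic_distance(1210, 186, 2))  # 1/1024
print(padic_distance(33, 31, 2))     # 1/2
print(padic_distance(7, 6, 2))       # 1
```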
The reason why this metric is referred to as an ultrametric (and “non-Archimedean”) is because a stronger version of the triangle inequality holds true (from which the weaker version noted above can be seen to also hold true):
$$|n - m|_p \le \max\left(|n|_p, |m|_p\right)$$
This means the “distance” from n to m is never more than the larger of the two p-adic absolute values, that is, the absolute value of whichever of the two numbers is less divisible by p, rather than the sum of the two. When the two absolute values differ, the distance is exactly the larger one. This property gives a fundamentally different geometric structure (every triangle is isosceles, with its two longest sides equal) compared to the standard (Archimedean) metric like the Euclidean distance we saw earlier over the real numbers.
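The isosceles claim is easy to test by brute force. The sketch below checks every triple of integers in a small range, with the 2-adic distance written out from scratch so the snippet stands on its own. The names `dist` and `every_triangle_is_isosceles` are hypothetical, and a finite check like this is an illustration, not a proof.

```python
from fractions import Fraction
from itertools import combinations


def dist(n: int, m: int, p: int = 2) -> Fraction:
    """p-adic distance between two integers."""
    diff = abs(n - m)
    if diff == 0:
        return Fraction(0)
    k = 0
    while diff % p == 0:
        diff //= p
        k += 1
    return Fraction(1, p ** k)


def every_triangle_is_isosceles(points, p: int = 2) -> bool:
    """True if, in every triple, the two longest sides have equal length."""
    for x, y, z in combinations(points, 3):
        sides = sorted([dist(x, y, p), dist(y, z, p), dist(x, z, p)])
        if sides[1] != sides[2]:
            return False
    return True


print(sorted([dist(1210, 186), dist(186, 31), dist(1210, 31)]))
print(every_triangle_is_isosceles(range(-20, 21), p=2))  # True
print(every_triangle_is_isosceles(range(-20, 21), p=3))  # True
```

If the two longest sides are always equal, then no side can exceed the larger of the other two, which is the stronger triangle inequality in action.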
Just as the Euclidean metric enabled holes in the rational numbers to be filled with real numbers using Cauchy sequences, so too does the ultrametric enable holes to be filled in the rational numbers for each of the infinitely many p-adic spaces (I’ll spare you the details on that). What I find particularly profound about all this is Ostrowski’s Theorem, which is a fundamental result in number theory that classifies all possible absolute values on the rational numbers. It states that every non-trivial absolute value on the rational numbers ℚ is equivalent to either the standard real absolute value or one of the p-adic absolute values for some prime p. Accordingly, with the absolute value for the real numbers and the absolute value for the p-adic numbers, we’ve covered all the possibilities for completing (filling the holes in) the rational numbers.
What I find particularly amazing is that when one multiplies all these different absolute values for a given nonzero rational number, for the reals and for all the p-adics, their product is 1. This is because the product of the p-adic absolute values of that number over all primes p is just the reciprocal of its absolute value in the reals.
$$|x|_\infty \cdot \prod_{p \text{ prime}} |x|_p = 1$$
Where the absolute value for the real numbers is indicated by the ∞ subscript. As an example, recall that for 1024 (which only has powers of 2 in its prime decomposition), the 2-adic absolute value was 1/1024, while the p-adic absolute value for every other prime p is 1. The real absolute value is just 1024. It is easy to see that multiplying them all together gives us 1. The same holds true for the prime decomposition of any nonzero rational number.
This product formula holds because the real and p-adic absolute values measure magnitude in complementary ways. The real absolute value captures the “size” in the usual sense, while each p-adic absolute value measures divisibility by a specific prime. The exponents in the factorization of any given rational number are balanced across these valuations, ensuring their product is always 1, reflecting a deep arithmetic harmony in the completions of the rational numbers.
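Here is a rough sketch for checking the product formula on any nonzero rational number. Only the primes that actually divide the numerator or denominator matter, since every other prime contributes a factor of 1. The helper names are my own, and the trial-division factoring is only meant for small numbers.

```python
from fractions import Fraction


def prime_factors(n: int) -> set[int]:
    """Set of primes dividing a nonzero integer, found by trial division."""
    n = abs(n)
    primes = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes


def abs_p(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value of a nonzero rational number."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)


def product_of_absolute_values(x) -> Fraction:
    """Real absolute value times every p-adic absolute value that differs from 1."""
    x = Fraction(x)
    product = abs(x)
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        product *= abs_p(x, p)
    return product


print(product_of_absolute_values(1024))               # 1
print(product_of_absolute_values(Fraction(-45, 28)))  # 1
```

For -45/28, the real absolute value is 45/28, while the 2-adic, 3-adic, 5-adic, and 7-adic absolute values are 4, 1/9, 1/5, and 7, and those multiply out to exactly 28/45.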
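Finally, here they are, for p-adic integers, where p is any prime number and each coefficient a_k is one of the digits 0, 1, …, p - 1:
$$x = \sum_{k=0}^{\infty} a_k p^{k} = a_0 + a_1 p + a_2 p^{2} + \cdots$$
For a general p-adic number, including fractional components (m < 0):
$$x = \sum_{k=m}^{\infty} a_k p^{k}$$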
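To get a feel for these expansions, here is a minimal sketch that computes the first few coefficients for a rational number whose denominator is not divisible by p (such a number counts as a p-adic integer). The name `padic_digits` is hypothetical, and the code relies on Python 3.8 or later, where `pow(b, -1, p)` returns the inverse of b modulo p.

```python
from fractions import Fraction


def padic_digits(x, p: int, count: int) -> list[int]:
    """First `count` coefficients a_0, a_1, ... of the p-adic expansion of x."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("denominator is divisible by p: not a p-adic integer")
    digits = []
    for _ in range(count):
        # a_0 is x reduced modulo p; dividing by the denominator means
        # multiplying by its inverse modulo p.
        d = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(d)
        x = (x - d) / p  # strip off a_0 and shift every power of p down by one
    return digits


print(padic_digits(25, 5, 4))              # [0, 0, 1, 0]
print(padic_digits(-1, 2, 8))              # [1, 1, 1, 1, 1, 1, 1, 1]
print(padic_digits(Fraction(1, 3), 2, 9))  # [1, 1, 0, 1, 0, 1, 0, 1, 0]
```

Note that the coefficients are listed starting from the zeroth power of p, so they read in the opposite direction from ordinary digits. In the 2-adic numbers, -1 turns out to be an endless string of 1s, and 1/3 needs no “fractional” part at all.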
This post is already longer than I had anticipated, so I’m going to leave my treatment of p-adic numbers at this for now. In my next post, we’ll take a deeper dive into p-adic numbers, particularly in regard to the Inverse Collatz Map.
In the meantime, to help whet yer whistle, here are some visualizations.



Nice introductions to p-adic numbers
https://youtu.be/3gyHKCDq1YA
https://youtu.be/vdjYiU6skgE
v_2(13981013, 3495253) = ? ;^)