Using Math to Make Underwater Plants

I was bored, so I decided to generate some “plants” in HTML5 Canvas, using Lindenmayer systems (I can go into greater detail later), with a dash of pseudo-randomness sprinkled in, so that each plant is unique, like a snowflake. I like to imagine that the plants are underwater, as part of a coral reef (yes, I know, coral is an animal, but I feel they belong there, visually) Here’s a few of them. Aren’t they pretty?

download (5).png

download (6).png

download (7).png

download (9).png

Understanding Machine Learning

Machine Learning allows computers to carry out complex tasks without being specifically programmed to do so. One way that this is possible is through the process known as Supervised Learning, in which the computer is given a set of training data – samples of input and output of the desired function – and from that data is able to generate a function which, given the input needed to perform the desired task, is able to return the correct output. There are two main kinds of tasks that are done with supervised learning: classification and regression. With classification, the computer takes some input data, usually as a vector, and returns one of a finite set of possible outcomes. For example, tell the computer that something is a reptile, that it has a dome-shaped shell, and that its average lifespan is over 100 years, and a properly “learned” machine could tell you that what you are describing is a tortoise. With regression, the computer takes a vector of input data and maps it to the predicted result within a continuous solution space. For example, tell the computer how large your house is in square feet, how many bedrooms/bathrooms it has, how old it is, etc. and it could return the predicted value of your house in dollars.  But how do these processes work? Regression and classification both commonly use the same methods at their core: using gradient descent to achieve linear or logistic regression. For the sake of simplicity, let’s stick to linear regression for now. Linear regression is the process of taking a set of data-points and finding the line that best fits the data points. How do we determine how well a line “fits” the data? Consider a line in 2-dimensional Euclidean Space, whose function is given by:

y = mx + b

Where x is an independent variable, y is a dependent variable, and m and b are constant coefficients. Now consider a sample data point, p0 , which is located at the point (x0, y0) in 2-dimensional space. We can calculate the Cost Function at this point to be | y0 – y |, where y = mx0 + b. So, given all of our data points, we can calculate sum of the cost for all of these points. Actually, it is more useful to calculate something called the mean squared error, which is 1/2 times the average squared cost for all of the points, that is, the average of the squares of the cost divided by 2. We can actually represent this as a function of our coefficients m and b. So, we want to find the m and b which gives us the smallest average square cost. How we find the optimal m and b brings us to the crux of the problem: gradient descent. This average squared cost function forms a paraboloid in 3-dimensional space. The minimum of the paraboloid represents our m and b values which minimize the average cost. So how do we find this minimum? Pick any two values to start with for m and b. Now think back to calculus. If we can perpetually find the direction of greatest descent at every point in time, ie, the direction with the lowest directional derivative, and continuously move in that direction, we will eventually reach the bottom of the paraboloid. And if you think back to calculus again, it is easy to find this direction, to find the gradient of a function at a point, just calculate the directional derivatives for each vector in a basis for your vector space and multiply those directional derivatives by their respective element in the basis. This gives us the steepest positive gradient. To get the steepest negative gradient, just go in the exact opposite direction. So that’s the direction we move in. And if we do this and move at small enough decreasing increments, we can approximate doing it continuously.

Kanter’s Zeta Function – Part 2: an exploration with Python

Continuing from Part 1.

(note to the reader: this was posted without being proofread. I will proofread/check my work later because I am too tired right now. I might have also forgotten to include a few points) 

Who want’s to do some math?

Let’s pick up right where we left off. I’m not going to go over anything from Part 1, because honestly, for the sake of digestibility, this should not have even been broken up into 2 parts, and I will probably just repost it as a single, combined post later anyway.

Ok. We have our gamma, psi, and zeta functions from part 1. We’re going to want to graph some of that data with python, but in order to do that, we ought to modify a lot of that code from Part 1 in order to make the whole process go faster/smoother. First, we’ll want to quickly generate a file with all of the gamma values which we can reference to quickly get the psi values (this way we can cut out many steps of recursion).

def gfilemaker(n):
    glist = [0]
    gglist = str(glist)
    gfile = open('gammafile.txt','w')
    gfile.write(gglist)
    def gammaloader(s,k):
        d = len(glist)
        if s < d:
            return glist[s][k-1]
        elif k==1:
            return 1
        elif k==s:
            return 1
        else:
            y = 0
            for x in range(1,min(s-k+1,k+1)):
                y+=gammaloader(s-k,x)
            return y
    for a in range(1,n+1):
        alist = []
        for k in range(1,a+1):
            alist.append(gammaloader(a,k))
        glist.append(alist)
        aglist = str(alist)
        gfile.write('\n')
        gfile.write(aglist)
    gfile.close()
    return glist

gammalist = gfilemaker(100)

notice that the function gammaloader is basically our gamma function from part one, but modified to cut down on the amount of recursion we have to go through.

Now let’s make a similar function to quickly generate a file containing all of our zeta values. I won’t bother making a file with the psi values, because the psi values will be so easy to get (since psi(n) is just the sum of the list in the n+1-th row of gammafile.txt).

def zetalister(n):
    gammafile = open('gammafile.txt','r')
    lines = gammafile.readlines()
    def gline(a):
        gline=lines[a].replace('[','').replace(']','').replace(',','')
        gline = gline.split()
        for x in range(len(gline)):
            gline[x]=int(gline[x])
        return gline
    def quickpsi(num):
        guesses=primegenerator(num**0.5)
        def quickprimer(trial):
            primedict = {}
            def factor(p, s):
                if not(p in primedict):
                    primedict[p]=1
                else:
                    primedict[p]+=1
            for p in guesses:
                while trial % p == 0:
                    factor(p, trial)
                    trial = int(trial/p)
                if trial == 1:
                    return primedict
            if trial != 1:
                primedict[trial] = 1
            return primedict
        psifactlist=[0]
        for z in range(1,num+1):
            b=1
            for a in quickprimer(z):
                c = sum(gline(quickprimer(z)[a])) # here is psi
                b*=c # here is our zeta function
            psifactlist.append(b)
        return psifactlist
    zetafile = open('zetafile.txt','w')
    zetalist = quickpsi(n)
    zetatext = str(zetalist)
    gammafile.close()
    zetafile.write(zetatext)
    zetafile.close()
    return zetalist

zetalist = zetalister(1000000)

notice that quickpsi calls the primegenerator function from part 1. notice that quickprimer is essentially our primefactorizer function from part 1.  notice that our psi function is basically just expressed as c = sum(gline(quickprimer(z)[a])), and that our zeta function shows up in the line just below our psi function, ie, b*=c.

Now we have a file with the first million zeta values. I plotted out the data in multiple ways using matplotlib. I will spare you the details, since it was probably fairly routine, and not unique to this particular study. (If you really want, dear reader, I’m willing to share the code, I just assume you’d rather figure it out on your own). Take a look at the results below:

zetalinear
zeta(x) for x up to one million (x-axis is linear scale)

Um. yeah. let’s look at that with a logarithmic x-axis.

zetalog
zeta(x) for x up to one million (x-axis is logarithmic scale)

okay, so zeta(x) is kind of weird to look at. We see a much thicker blue closer to the bottom, because that’s where the majority of data points are. I think the logarithmic scale on the x-axis makes it a bit easier to take in the data. Okay, now let’s take a look at the arithmetic means of the partial sums of zeta(x), which we call sigma(x), that is:

def sigma(x):
    b=0
    for a in range(1,x+1):
        b+=zeta(a)
    return b/float(x)

I didn’t actually write the code like this. I used the zetafile.txt which I generated. I just wanted it to be easier for you to read.

REMEMBER: Our main goal is to determine whether or not sigma(x) is Cesàro Summable. In other words, we want to know whether or not sigma(x) approaches some finite number y as x approaches infinity. Alright, let’s take a look at sigma(x) plotted out with linear x-axis and logarithmic x-axis.

sigmalinear
sigma(x) for x up to one million with linear x-axis
sigmalog
sigma(x) for x up to one million with logarithmic x-axis

Take a look at the graph with logarithmic x-axis. Notice how the data points seem to show up in clusters. Now count the number of data points in each of the first few clusters. 4, 8, 16…notice the pattern? Isn’t it obvious why we see clusters in increasing powers of 2? It’s because zeta(2n) is going to be substantially higher than zeta(x) for any x less than 2n. At least, this is the case for all x up to one million. This is because 2 is the smallest prime, so 2n will be the smallest number to “experience” psi(n). Consequentially, zeta(2n) pushes sigma higher than the other numbers around it do.

Here’s another observation: when we graph sigma with the logarithmic x-axis, the graph resembles something that looks “less than” a linear curve on a graph with a linear x-axis. This would lead us to believe that sigma is possibly “less than” logarithmic. Maybe it is. That doesn’t really help us much though, because even a logarithmic function is unbounded, so sigma still might not have a limit. The problem only becomes more pronounced when we chop the first 999 data points off the graph and zoom in accordingly:

sigmalog1000
sigma(x) for 1000 ≤ x ≤ 1000000 with logarithmic x-axis

Is it just me or is it starting to look almost linear toward the right end of that graph? Crap. No doubt this is a slow-moving function, but it looks like it’s getting harder to prove convergence (if it indeed converges). There’s another issue too. I plotted out the psi(x) for x up to 100. Take a look:

psilinear
psi(x) for x up to 100 with linear y-axis

Is that approximately exponential? Sure looks like it. Let’s take another look, this time with a logarithmic y-axis.

psilog
psi(x) for x up to 100 with logarithmic y-axis

Well this is weird. It kind of looks like it might be exponential, but when we look at it with a logarithmic y-axis, we would expect to see something a bit more linear. It’s close, but not quite. Seems maybe a bit “less than” that. Maybe if we had more data-points we would be able to see something more clearly. This is getting kind of difficult, but here’s the worst part. I don’t think we have enough data in our list of zeta values to see the full “long-term” behavior of the function. I’d be much more comfortable with a few more dozen orders of magnitude of data. I mean, I’d like to get to at least 280. But really, I’d like to get to x80 for some high enough prime x, because we need to know what happens when the prime factorization contains multiple primes, and zeta is the product of a bunch of psi values of sufficiently high exponents. But do we really? One really important step would be to figure out whether or not psi actually is approximately exponential, and if so, then we really want to know the base of the exponent. Because that could probably tell us about how zeta behaves in the long run, and therefore how sigma behaves as well. Here’s yet one more issue: are we going to have to relate this back to the prime number theorem? I don’t think so, because as we observed earlier (with the clusters of powers of 2), powers of small primes are going to be more important in driving sigma since they are easier to come by. And this difference in “importance” only becomes more pronounced as the exponents increase. In fact, if I wasn’t so damn sleepy right now, I might be up for some kind of make-shift proof of something using algebra. But I am very sleepy, so any kind of rigorous examination of this problem will have to wait until next time. I never expected to prove anything with all of this graphing, I just wanted to point myself in the right direction, and I think I have. For now let me leave you with this beautiful gif of our gamma values plotted in 3D, which i created using matplotlib, mplot3d, and ffmpeg (sorry for the bad rendering on the top part of the background of the graph. not sure what that’s all about).

animation.gif
gamma(k,n) for n up to 100. So pretty.

Here, I pause. If you wish to walk no farther with me, reader, I cannot blame you. It is no easy road.

Kanter’s Zeta Function – Part 1: An Introduction with Python

Hello Friends.

This is Part 1 of a 2-part post. I’m going to try to keep things fun and relatively brief. Not too much rigor (please acquaint yourself with the phrase “can be easily shown”), but with enough explanation for the layperson to hopefully figure out what I’m saying if he/she so chooses.

Today I will be introducing a function on the natural numbers (ie, positive integers). I call it my Zeta function. Obviously, if at any point I’m talking about something and I lose you, feel free to look up whatever concept/term you are unfamiliar with. Here we go. For any positive integer x, we will describe:

ζ(x) The number of Abelian Groups, unique up to isomorphism, with x elements.

Ok, so our goal is to come up with a definition of ζ(x). For any of you who don’t know, or cannot remember what an Abelian Group is, I’ll just copy and paste from wikipedia:

An abelian group is a set, A, together with an operation • that combines any two elements and b to form another element denoted a • b. The symbol • is a general placeholder for a concretely given operation. To qualify as an abelian group, the set and operation, (A, •), must satisfy five requirements known as the abelian group axioms:

Closure
For all ab in A, the result of the operation a • b is also in A.
Associativity
For all ab and c in A, the equation (a • b) • c = a • (b • c) holds.
Identity element
There exists an element e in A, such that for all elements a in A, the equation e • a = a • e = a holds.
Inverse element
For each a in A, there exists an element b in A such that a • b = b • a = e, where e is the identity element.
Commutativity
For all ab in Aa • b = b • a.

A group in which the group operation is not commutative is called a “non-abelian group” or “non-commutative group”.

So a Finite Abelian Group is just an Abelian Group with a finite number of elements. Got it? Great. Now, our goal is to define ζ(x) in a way that allows us to study the function, using python. Now, recall that according to the Fundamental Theorem of Finitely Generated Abelian Groups (also from wikipedia):

The primary decomposition formulation states that every finitely generated abelian group G is isomorphic to a direct sum of primary cyclic groups and infinite cyclic groups. A primary cyclic group is one whose order is a power of a prime. That is, every finitely generated abelian group is isomorphic to a group of the form:

\mathbb {Z} ^{n}\oplus \mathbb {Z} _{q_{1}}\oplus \cdots \oplus \mathbb {Z} _{q_{t}},

where the rank n ≥ 0, and the numbers q1, …, qt are powers of (not necessarily distinct) prime numbers. In particular, G is finite if and only if n = 0. The values of nq1, …, qt are (up to rearranging the indices) uniquely determined by G.

Right now would be a good time to remind you that for any positive integer n, the group ℤn is just modular arithmetic where n is the modulus. From this theorem, it can easily be shown that every non-trivial finite abelian group can be expressed in the form:

p1k1\oplus \!\, · · · \oplus \!\, ℤpnkn

where the pi are primes (not necessarily distinct), and the ki are positive integers.

Let’s consider abelian groups with pn elements, where p is any prime, and n is a positive integer. Well, clearly, every such group can be expressed in the form:

pk1\oplus \!\,· · · \oplus \!\,pkm

where k1 + . . . + km = n

to make these expressions unique, we need only to add the rule that:

k1 ≥ . . . ≥km

Ok. so we now have a way to uniquely express every finite abelian group with pn elements, and, by extension, a way to uniquely express every finite abelian group with x elements for any integer x. Clearly, all we would need in order to do this is the prime factorization of x. We are now going to define a function called psi.

ψ(n) The number of abelian groups, unique up to isomorphism, with pn elements for any prime p

Notice that ψ(n) = ζ(pn) for any prime p. Clearly also, if we get a prime factorization for any positive integer x, that is,

x = p1k1 · · · pmkm, where each pi is a distinct prime, and each ki is a positive integer,

then ζ(x) = ψ(k1) · · · ψ(km).

Ok, so it looks like the key to cracking zeta is to crack psi. In order to do that, we’re going to think about psi visually.

Let’s use columns of red dots as a way to visually represent any abelian group with pn for any prime p. Ready? Let’s use, as an example:
p3 \oplus \!\,p2 and represent it like this:

z3xz2

Get it? Given ℤpk1 \oplus \!\, . . .  \oplus \!\, ℤpkn , we can represent this group with a sequence of n columns of red dots, where, from left to right, the j-th column is kj dots tall. And remember, we can guarantee that these visual representations are unique up to isomorphism by saying that kj ≥ kj+1 for all j

We are going to introduce here yet another function. I call it gamma. Here’s how it works:

γk(n) = The number of abelian groups with pn elements for any prime p of which the largest cyclic subgroup is of order pk.

In other words γk(n) = the number of ways to combine n dots into columns as described above, where the tallest (leftmost) column contains k dots. If you have not figured it out by now, let me tell you right now that:

This can be a lot to think about in terms of equations, so let’s use the visualizations from earlier. Let’s look at visualizations of the abelian groups with p5 elements:

psi5

We can see from the above example that ψ(5) = 7 and we can see γk(5) for each k. Rather than give a long-winded explanation of how to calculate gamma, I am just going to introduce the basic python code to calculate gamma and psi, and then if you look at it long enough, I think you will be able to reverse engineer my thought process. Ready?

def gamma(k,n):
   if k==1:
      return 1
   elif k==n:
      return 1
   else:
      y = 0
      for x in range(1,min(n-k+1,k+1)):
         y+=gamma(x,n-k)
      return y

def psi(n):
   y=0
   for k in range(1,n+1):
      y+=gamma(k,n)
   return y

(Note how gamma is defined recursively here. This can really slow things down if we’re calculating psi for large enough numbers, or if we’re repeatedly calculating psi on large data sets, so don’t get carried away! Try to keep things under 100 for now. You really won’t need anything higher than that. In my next post, we will modify the code to make it run faster, but for now let’s just keep things simple readable).

Now we can calculate gamma and psi for any numbers we might need. Cool! We’re almost there! Recall that:

if we get a prime factorization for any positive integer x, that is,

x = p1k1 · · · pmkm, where each pi is a distinct prime, and each ki is a positive integer,

then ζ(x) = ψ(k1) · · · ψ(km).

Therefore, with psi defined, all we need now in order to calculate zeta for any x is the prime factorization for x. And in order to get the prime factorization for x, we’ll need to generate primes. Really, we could always just get a list of primes off the internet, but what’s the fun in that? Here’s some (sloppy) code I wrote to generate primes and get prime factorizations:

def primegenerator(ceil):
    t=6
    primes=[2,3,5]
    while t<=ceil:
        z=0
        a=0
        k=primes[a]
        y=(t**(0.5)+1)
        h=len(primes)-1
        while k < y:
            if t%primes[a] == 0:
                z+=1
                k=y+1
            else:
                a+=1
                k=primes[a]
        if z==0:
            primes.append(t)
        t+=1
    return primes
def primefactorizer(x):
    guesses = primegenerator(x**.5)
    primedict = {}
    def factor(p, n):
        if not(p in primedict):
            primedict[p]=1
        else:
            primedict[p]+=1
    for p in guesses:
        while num % p == 0:
            factor(p, x)
            x = int(x/p)
        if x == 1:
            return primedict
    if x != 1:
        primedict[x] = 1
    return primedict

(Notice that primefactorizer calls primegenerator. Notice also that primegenerator returns a list of primes, and that primefactorizer returns a dictionary where the keys are the pi and the values are the ki from the prime factorization)

And with that, we can define Kanter’s Zeta Function:

def zeta(x):
    b = 1
    for a in primefactorizer(x):
        b*=psi(primefactorizer(x)[a])
    return b

(I always thought it would be cool to name something after myself. Children are expensive and they only last, what, 80 years? Functions are free and they last forever.)

There you have it. We can now easily calculate the number of Abelian Groups with x elements for any positive integer x. Wow! That’s SoOoOoOo COOOOOL! And with that we conclude Part 1.

In Part 2, we will modify much of the code written here in order to make it run more efficiently, so that we can study the behavior of Kanter’s Zeta Function on a somewhat large data set. In particular, the question that drives Part 2 will be the following: If we let {ak} be an infinite sequence, where ak = zeta(k), and let:

be the k-th partial sum of the series

is this series Cesàro Summable?

Part 2 coming soon.