The probability of finding a string of n characters

I tried to google this today but I only found a lot of unnecessarily complicated stuff. I wanted something simple but mathematically exact so I had to figure out this myself.

 // p(n,m,b) - the probability of finding at least one occurence of a string of n characters (or digits/symbols)
 // (not starting and ending with the same character) in a larger random string of m characters
 // b - is the number of different types of characters
 double p(int n, int m, int b) { return f(n,m,b) / Math.Pow(b,m); }
 double f(int n, int m, int b) { return m<n ? 0 : b * f(n,m-1,b) + Math.Pow(b,m-n) - f(n,m-n,b); }

Compare this with the standard function for calculating an approximate probability…

double pa(int n, int m, int b) { return 1 - Math.Pow((Math.Pow(b,n)-1) / Math.Pow(b,n), m-n+1); }

The f-function above calculates the total number of possible ways that the the sought for string can appear in the larger random string. Although it’s pretty short it’s not particularly fast. I made a much faster algorithm based on this idea…


For example: Let’s say you want to calculate the probability of finding a particular 5 digit number in a random sequence of 18 digits. We first use the f-function to calculate the number of possible ways it can occur:


The the probability is given by:


Here’s my faster algorithm

double f(int n, int m, int b)
    int nn = n; int sg = 1; int p = 0;
    int q = 0; int a = 1; double sum = 0;
    do {
        q = m-nn;
        p = q+a;
        sum += sg * Combin(p,q) * Math.Pow(b,q);
        sg = -sg;
        nn += n;
    } while (m-nn >= 0);
    return sum;

”Combin(p,q)” is of course the combination-function – ”p over q”.



Fyll i dina uppgifter nedan eller klicka på en ikon för att logga in: Logo

Du kommenterar med ditt Logga ut / Ändra )


Du kommenterar med ditt Twitter-konto. Logga ut / Ändra )


Du kommenterar med ditt Facebook-konto. Logga ut / Ändra )


Du kommenterar med ditt Google+-konto. Logga ut / Ändra )

Ansluter till %s