
Uploaded by Michelotto Paulo

ADVANCED MATHEMATICS
FOR THOSE WHO DON'T KNOW THAT MUCH MATHEMATICS


Designing Non-repeating Patterns with Prime Numbers

Low-Complexity Art

Random Psychedelic Art

Seam-carving for Content-Aware Image Scaling

The Cellular Automaton Method for Procedural Cave Generation

Bezier Curves and Picasso

Making Hybrid Images

Signal Processing

The Fast Fourier Transform Algorithm, and Denoising a Sound Clip

The Two-Dimensional Fourier Transform and Digital Watermarking

Making Hybrid Images

The Welch-Berlekamp Algorithm for Correcting Errors in Data

Introduction

K-Nearest-Neighbors and Handwritten Digit Classification

The Perceptron, and All the Things it Can't Perceive

Decision Trees and Political Party Classification

Neural Networks and the Backpropagation Algorithm

K-Means Clustering and Birth Rates

Linear Regression

Eigenfaces, for Facial Recognition (via Principal Component Analysis)

Bandit Learning: the UCB1 Algorithm

Bandit Learning: the EXP3 Algorithm

Bandits and Stocks

Weak Learning, Boosting, and the AdaBoost Algorithm

Fairness in machine learning (introduction, statistical parity)

The Johnson-Lindenstrauss Transform

Singular Value Decomposition (motivation, algorithms)

Support Vector Machines (inner products, primal problem, dual problem)

Trees and Tree Traversal

Breadth-First and Depth-First Search

The Erdős–Rényi Random Graph

The Giant Component and Explosive Percolation

Zero-One Laws for Random Graphs

Community Detection in Graphs, a Casual Tour

Google's PageRank Algorithm:

Introduction,

A First Attempt,

The Final Product,

Why It Doesn't Work Anymore

Combinatorial Optimization

When Greedy is Good Enough: Submodularity and the 1 − 1/e Approximation

When Greedy Algorithms are Perfect: the Matroid

Linear Programming and the Most Affordable Healthy Diet Part 1

Linear Programming and the Simplex Algorithm

The Many Faces of Set Cover

Stable Marriages and Designing Markets

Serial Dictatorships and Housing Allocation

Quantum Computing

A Motivation for Quantum Computing

The Quantum Bit

Multiple Qubits and the Quantum Circuit

Concrete Examples of Quantum Gates

Cryptography

RSA

Elliptic Curves Introduction

Elliptic Curves as Elementary Equations

Elliptic Curves as Algebraic Structures

Elliptic Curves as Python Objects (over the rational numbers)

Programming with Finite Fields

Connecting Elliptic Curves with Finite Fields

Elliptic Curve Diffie-Hellman

Sending and Authenticating Messages with Elliptic Curves (Shamir-Massey-Omura and

ElGamal)

The Mathematics of Secret Sharing

Zero Knowledge Proofs (primer, zero knowledge proofs for NP, definitions and theory)

Load Balancing and the Power of Hashing

Program Gallery entries

Natural Language

Metrics on Words

Word Segmentation, or Makingsenseofthis

Cryptanalysis with N-Grams

A Sample of Standard ML (and a Preview of Monoids)

Categories, What's the Point?

Introducing Categories

Categories as Types

Properties of Morphisms

Universal Properties

Functoriality

The Universal Properties of Map, Fold, and Filter

Computational Topology

Computing Homology

Fixing Bugs in Computing Homology

The Čech complex and the Vietoris–Rips complex

Games

The Wild World of Cellular Automata

Turing Machines and Conway's Dreams

Conway's Game of Life in Conway's Game of Life

Optimally Stacking the Deck: Kicsi Poker

Optimally Stacking the Deck: Texas Hold 'Em

Want to make a great puzzle game? Get inspired by theoretical computer science.

Miscellaneous

The Reasonable Effectiveness of the Multiplicative Weights Update Algorithm

Hunting Serial Killers

Learning to Love Complex Numbers

Holidays and Homicide

Numerical Integration

Busy Beavers, and the Quest for Big Numbers

Row Reduction over a Field

Complete Sequences and Magic Tricks

Well Orderings and Search

Education

Teaching Mathematics - Graph Theory

Learning Programming - Finger-Painting and Killing Zombies

How to Take a Calculus Test

Why There's no Hitchhiker's Guide to Mathematics

Deconstructing the Common Core Mathematical Standard

Guest Posts

Math and Baking: Torus-Knotted Baklava

With High Probability: What's up with Graph Laplacians?



Prime Design

Posted on June 13, 2011 by j2kun

The goal of this post is to use prime numbers to make interesting and asymmetric graphics,

and to do so in the context of the web design language CSS.

Number Patterns

For the longest time numbers have fascinated mathematicians and laymen alike. Patterns in numbers are decidedly simple to recognize, and the proofs of these patterns range from trivially elegant to Fields Medal worthy. Here's an example of a simple one that computer science geeks will love:

$1 + \sum_{i=0}^{n-1} 2^i = 2^n$

If you're a mathematician, you might be tempted to use induction, but if you're a computer scientist, you might think of using neat representations for powers of 2.

Proof: Consider the base 2 representation of $2^n$, which is a 1 in the $n$th place and zeros everywhere else. Then we may write the summation as

$\sum_{i=0}^{n-1} 2^i = \underbrace{11 \cdots 1}_{n \text{ ones}}$

And clearly adding one to this sum gives the next largest power of 2.

This proof extends quite naturally to all powers, giving the following identity. Try to prove it yourself using base $q$ number representations!

$1 + \sum_{i=0}^{n-1} (q-1) q^i = q^n$

The only other straightforward proof of this fact would require induction on $n$, and as a

reader points out in the comments (and I repeat in the proof gallery), it's not so bad. But it was refreshing to discover this little piece of art on my own (and it dispelled my boredom during a number theory class). Number theory is full of such treasures.
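The two identities discussed above are easy to sanity-check numerically. Here is a quick Python sketch (the helper name is ours, not part of the proof):

```python
# Numerically check 1 + sum_{i=0}^{n-1} (q-1) * q^i == q^n.
# The base-2 case is the computer-science favorite: adding 1 to a
# string of n ones gives the next power of 2.
def sum_of_powers(q, n):
    """Compute 1 + sum_{i=0}^{n-1} (q-1) * q^i."""
    return 1 + sum((q - 1) * q**i for i in range(n))

# Base 2: 1 + (1 + 2 + 4 + ... + 2^(n-1)) == 2^n
for n in range(1, 20):
    assert sum_of_powers(2, n) == 2**n

# The general identity, for several bases
for q in range(2, 10):
    for n in range(1, 15):
        assert sum_of_powers(q, n) == q**n
```

Of course a loop over small cases proves nothing, but it is a comforting companion to the base-$q$ digit argument.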

Primes

Though there are many exciting ways in which number patterns overlap, there seems to be one grand, overarching undiscovered territory that drives research and popular culture's fascination with numbers: the primes.

Many attempts to characterize the prime numbers admit results implying intractability. Here are a few:

For any natural number $n$, there exist two consecutive primes $p < q$ with no primes between them and $q - p > n$. (there are arbitrarily large gaps between primes)

It is conjectured that for any natural number $n$, there exist two primes $p, q$ larger than $n$ with $q - p = 2$. (no matter how far out you go, there are still primes that are as close together as they can possibly be)
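The first statement has a classic constructive proof that fits in a few lines of code: for any $n$, the numbers $n!+2, n!+3, \dots, n!+n$ are all composite, since $k$ divides $n!+k$. A small Python sketch (function names ours):

```python
# A constructive witness for arbitrarily large prime gaps:
# n!+k is divisible by k for 2 <= k <= n, so these n-1 consecutive
# numbers contain no primes at all.
from math import factorial

def gap_witness(n):
    """Return the n-1 consecutive composite numbers n!+2, ..., n!+n."""
    return [factorial(n) + k for k in range(2, n + 1)]

def is_prime(m):
    """Naive trial-division primality test."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

run = gap_witness(8)  # seven consecutive composites starting at 8!+2
assert not any(is_prime(m) for m in run)
```

The gap produced this way is wildly inefficient (the first gap of length 7 appears long before 8!), but it proves gaps of every length exist.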

Certainly then, these mysterious primes must be impossible to precisely characterize with some sort of formula. Indeed, it is simple to prove that there exists no polynomial formula with rational coefficients that always yields primes*, so the problem of generating primes via some formula is very hard. Even then, much interest has gone into finding polynomials which generate many primes (the first significant such example was $x^2 + x + 41$, due to Euler, which yields primes for $0 \le x \le 39$), and this was one of the driving forces behind algebraic number theory, the study of special number rings called integral domains.
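Euler's polynomial is easy to test directly. A short Python check (helper names ours):

```python
# Check Euler's prime-generating polynomial x^2 + x + 41.
def is_prime(m):
    """Naive trial-division primality test."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def euler(x):
    return x * x + x + 41

# Prime for every x from 0 to 39...
assert all(is_prime(euler(x)) for x in range(40))
# ...but it must fail eventually: 40^2 + 40 + 41 = 41^2.
assert euler(40) == 41 * 41
```

The failure at $x = 40$ is forced: any polynomial takes a composite value somewhere, which is the heart of the starred claim above.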

*Aside: considering the amazing way that the closed formula for the Fibonacci numbers uses irrational numbers to arrive at integers, I cannot immediately conclude whether the same holds for polynomials with arbitrary coefficients, or elementary/smooth functions in general. This question could be closely related to the Riemann hypothesis, and I'd expect a proof either way to be difficult. If any readers are more knowledgeable about this, please feel free to drop a comment.

However, the work of many great mathematicians over thousands of years is certainly not in

vain. Despite their seeming randomness, the pattern in primes lies in their distribution, not

in their values.

Theorem: Let $\pi(x)$ be the number of primes less than or equal to $x$ (called the prime counting function). Then

$\lim_{x \to \infty} \dfrac{\pi(x)}{x / \ln x} = 1$

Intuitively, this means that $\pi(x)$ is about $x / \ln x$ for large $x$, or more specifically that if one picks a random number near $x$, the chance of it being prime is about $1 / \ln x$. Much

of the work on prime numbers (including equivalent statements to the Riemann hypothesis)

deals with these prime counting functions and their growth rates. But stepping back, this is a

fascinatingly counterintuitive result: we can say with confidence how many primes there are

in any given range, but determining what they are is exponentially harder!
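The estimate is easy to see empirically with a sieve. Here is a Python sketch (names ours) comparing $\pi(x)$ to $x/\ln x$ at $x = 10^5$:

```python
# Compare the prime counting function pi(x) with the x/ln(x) estimate
# from the prime number theorem, using a sieve of Eratosthenes.
from math import log

def prime_count(x):
    """Count primes <= x with a sieve of Eratosthenes."""
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    for d in range(2, int(x**0.5) + 1):
        if sieve[d]:
            for m in range(d * d, x + 1, d):
                sieve[m] = False
    return sum(sieve)

x = 10**5
print(prime_count(x), round(x / log(x)))  # pi(10^5) = 9592, estimate ~8686
```

The ratio at $10^5$ is about 1.10 and creeps toward 1 as $x$ grows, which is exactly what the theorem promises (the convergence is famously slow).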

And what's more, many interesting features of the prime numbers have simply been stumbled upon by accident. Unsurprisingly, these results are among the most confounding. Take, for

instance, the following construction. Draw a square spiral starting with 1 in the center, and

going counter-clockwise as below:

If you circle all the prime numbers you'll notice many of them spectacularly lie on common diagonals! If you continue this process for a long time, you'll see that the primes continue to lie on diagonals, producing a puzzling pattern of dashed cross-hatches. This Ulam Spiral was named after its discoverer, Stanislaw Ulam, and the reasons for its appearance are still unknown (though conjectured).

All of this wonderful mathematics aside, our interest in the primes is in their apparent lack of patterns.

Primes in Design

One very simple but useful property of primes is in least common denominators. The product of two numbers is well known to equal the product of their least common multiple and greatest common divisor. In symbols:

$mn = \operatorname{lcm}(m,n) \cdot \gcd(m,n)$

We are particularly interested in the case when $p$ and $q$ are prime, because then their greatest (and only) common divisor is 1, making this equation

$pq = \operatorname{lcm}(p,q)$
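These relationships can be checked in a couple of lines; a Python sketch using the standard library's gcd:

```python
# lcm(m, n) * gcd(m, n) == m * n, and for distinct primes the lcm
# is simply the product, since the gcd is 1.
from math import gcd

def lcm(m, n):
    return m * n // gcd(m, n)

assert lcm(6, 8) * gcd(6, 8) == 6 * 8      # lcm = 24, gcd = 2
assert gcd(5, 7) == 1 and lcm(5, 7) == 35  # distinct primes: lcm = product
```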

The least common multiple manifests itself concretely in patterns. Using the numbers six and eight, draw two rows of 0s and 1s, with a 1 every sixth character in the first row and every eighth character in the second. You'll quickly notice that the ones line up every twenty-fourth character, the lcm of six and eight:

000001000001000001000001000001000001000001000001

000000010000000100000001000000010000000100000001

Using two numbers which are coprime (their greatest common divisor is 1, but they are not necessarily prime; say, 9 and 16), the 1s in their two rows would line up every $9 \cdot 16 = 144$ characters. Now for pretty numbers like six and eight, there still appears to be a mirror symmetry in the distribution of 1s and 0s above. However, if the two numbers are prime, this symmetry is much harder to see. Try 5 and 7:

0000100001000010000100001000010000100001000010000100001000010000100001

0000001000000100000010000001000000100000010000001000000100000010000001
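As a quick check on the rows above, a Python sketch (variable names ours) confirms that the 1s in the two rows coincide exactly every lcm(5, 7) = 35 characters:

```python
# Rebuild the two rows above and verify the coincidence period.
from math import gcd

p, q, n = 5, 7, 70
row1 = "".join("1" if (i + 1) % p == 0 else "0" for i in range(n))
row2 = "".join("1" if (i + 1) % q == 0 else "0" for i in range(n))

# positions where both rows have a 1
matches = [i for i in range(n) if row1[i] == row2[i] == "1"]

lcm = p * q // gcd(p, q)  # 35, since 5 and 7 are coprime
assert matches == [i for i in range(n) if (i + 1) % lcm == 0]
```

In 70 characters the rows only agree at positions 35 and 70, which is why the pattern looks so irregular up close.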

There is much less obvious symmetry, and with larger primes it becomes even harder to tell that the choice of match-ups isn't random.

The same reasoning carries over to visual patterns, provided we use large enough primes. However, patterns in strings of 1s and 0s are not quite visually appealing enough, so we will resort to overlaying multiple backgrounds in CSS. Consider the following three images, which have widths 23, 41, and 61 pixels, respectively.

23

41

61

Each has a prime width, semi-transparent color, and a portion of the image is deleted to achieve stripes when the image is x-repeated. Applying our reasoning from the 1s and 0s, this pattern will only repeat once every $23 \cdot 41 \cdot 61 = 57{,}523$ pixels!
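The repeat period of the combined background is the lcm of the three stripe widths, which for distinct primes is just their product. A short Python check:

```python
# The combined background repeats only when all three stripe widths
# realign, i.e. every lcm(23, 41, 61) pixels. For distinct primes the
# lcm is the plain product.
from math import gcd
from functools import reduce

def lcm(a, b):
    return a * b // gcd(a, b)

period = reduce(lcm, [23, 41, 61])
assert period == 23 * 41 * 61 == 57523
```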

As designers, this gives us a naturally non-repeating pattern of stripes, and we can control

the frequency of repetition in our choice of numbers.

html {

background-image: url(23.png), url(41.png), url(61.png);

}

I'm using Google Chrome, so this is all the CSS that's needed. With other browsers you may need a few additional lines like height: 100% or margin: 0, but I'm not going to worry too much about that, because any browser which supports multiple background images should get the rest right. Here's the result of applying the CSS to a blank HTML webpage:

As a warning to the reader, using these three particular colors may result in an eyesore more devastating than an 80s preteen bedroom, but it illustrates the point of the primes: on my mere 1440×900 display, the pattern never repeats itself. So brace yourself, and click the thumbnail to see the full image.

Now, to try something at least a little more visually appealing, we do the same process with circles of various sizes on square canvases with prime side lengths ranging from 157×157 pixels to 419×419. Further, I included a little bash script to generate a CSS file with randomized background-image coordinates. Here is the CSS file I settled on:

html {
    background-image: url(443.png), url(419.png), url(359.png),
        url(347.png), url(157.png), url(193.png), url(257.png),
        url(283.png);
    background-position: 29224 10426, 25224 24938, 8631 32461,
        22271 15929, 13201 7320, 30772 13876, 11482 15854,
        31716 21968;
}

#! /bin/bash
# generate a css rule with randomized background positions
echo "html {"
echo -n "    background-image: url(443.png), url(419.png), "
echo -n "url(359.png), url(347.png), url(157.png), url(193.png), "
echo "url(257.png), url(283.png);"
echo -n "    background-position: "
for i in {1..7}
do
    echo -n "$RANDOM $RANDOM, "
done
echo "$RANDOM $RANDOM;"
echo "}"

Prime Circles

And here is the result. Again, this is not a serious attempt at a work of art. But while you might not call it visually beautiful, nobody can deny that its simplicity and its elegant mathematical undercurrent carry their own aesthetic beauty. This method, sometimes called the cicada principle, has recently attracted a following, and the Design Festival blog has a gallery of images, and a few that stood out. These submissions are the true works of art, though upon closer inspection many of them seem to use such large image sizes that there is only one tile on a small display, which means the interesting part (the patterns) can't be seen without a ridiculously large screen or contrived HTML page widths.

So there you have it. Prime numbers contribute to interesting, unique designs that in their simplest form require very little planning. Designs become organic; they grow from just a few prime seedlings to a lush, screen-filling ecosystem. Of course, for those graphical savants out there, the possibilities are endless. But for the rest of us, we can use these principles to quickly build large-scale, visually appealing designs, leaving math-phobic designers in the dust.

It would make me extremely happy if any readers who play around and come up with a cool design submit them. Just send a link to a place where your design is posted.


This entry was tagged bash, css, graphics, html, mathematics, patterns, primes, programming. Bookmark the permalink.


1. Pingback: Kolmogorov Complexity - A Primer | Math ∩ Programming

2. Nik Coughlin

I know this post is quite old now, but I just came across it. If you haven't seen it already, there is an article (and gallery) regarding this principle that you might like to look at:

http://designfestival.com/the-cicada-principle-and-why-it-matters-to-web-designers/

http://designfestival.com/cicada/

o j2kun

Yup! That was the inspiration for my post, because unfortunately back when I started this blog I didn't have any original ideas. Interestingly, I find applications for this concept all over the place, not just in design. For instance, I went to a salsa club recently, and I noticed that the stage lights moved in patterns with very tiny periods, the whole pattern repeating twice a minute at least. Why not make it more random, and give each light's erratic path a prime number period? Then the lights wouldn't repeat the same pattern for the entire night at least!

3. Jeremy

Only a single variable of induction is required. In fact, if you want to use a little abstract algebra, then things get even easier (and more general). And this will obviously expand out to exactly the sum you want. And this will hold for an arbitrary ring.

o j2kun

I believe it could work (and your TeX is a bit wonky), but it certainly is messy. I

revisited this problem in the proof gallery to give the shortest possible proof. And I

also used polynomials in an alternate proof. Indeed, it holds for

arbitrary commutative rings with unity, but I was relaxed enough to leave it at

polynomials over a ring.


Jeremy

How do I edit the post? I'm not that experienced with TeX, and I made several mistakes.

But indeed the first proof does not require induction over two variables, and

the second holds in any ring where the sum is defined (commutativity not

required, and if there are no zero divisors, then the division will give a

unique answer rather than an equivalency class).


j2kun

I fixed the latex issues, and worked it out myself, and I agree that you can just prove it by induction on one variable, and that the proof is not so bad. But I think you do have to worry about commutativity. Take for instance the same idea and extend it to two variables: xy = yx as polynomials, but plugging in random values for x and y doesn't make this still hold.


Jeremy

Haha, thanks for fixing that. There's still a duplicate line, and some minor mistakes, if you care. I definitely agree that your proof is cool; I suppose I just commented because I like induction. Certainly your proof offers more *insight*.

The reason I don't think you have to worry about commutativity is that the only values you deal with are 1 and powers of x, which automatically commute. Once you start adding multi-variable polynomials, then yeah, I agree you have to be careful about how you manipulate them.


5. Pingback: Matemáticas y programación | CyberHades

6. Pingback: Kolmogorov Complexity - A Primer | Another Word For It

7. HarryPotter5777

For polynomials with arbitrary coefficients, I think the proof isn't actually too bad: the only polynomials which produce integer outputs for integer inputs are those with rational coefficients. This can be seen by taking the series $f(0), f(1), f(2), \dots$ and taking successive differences between terms. After $n$ differences for an $n^\textrm{th}$ degree polynomial, the sequence will go to 0; then for each first term $d_i$ in the differences compute the sum $\sum_{i=0}^{n}{d_i} {x \choose i}$. For instance, with $f(x)=x^2$, the series is $0,1,4,9,16,\dots$, the first differences are $1,3,5,7,\dots$, the second differences are $2,2,2,2,2,\dots$, and all successive differences are $0$. If we look at $0{x\choose 0}+1{x \choose 1}+2{x \choose 2}$, we find that it's equal to $x^2$. Since this provides a rational polynomial expression for any polynomial-generated series of integers, we can conclude that all polynomials mapping $\mathbb{N}$ to $\mathbb{N}$ have rational coefficients.

As for smooth functions in general, I think that such functions do exist; it shouldn't be too hard to prove that by adding together an infinite series of smooth functions which decrease rapidly enough away from a single spike, one can create a function that over time matches more and more of the primes and converges at every real value.

Low Complexity Art

Posted on July 6, 2011 by j2kun

Whether in painting, fiction, film, landscape architecture, or paper folding, art is often said to be the art of omission. Simplicity breeds elegance, and engages the reader at a deep, aesthetic level.

Consider the famous six-word story attributed to Ernest Hemingway: "For sale: baby shoes, never worn." He called it his best work, and rightfully so. To say so much with so simple a piece is a monumental feat that authors have been trying to recreate since Hemingway's day.

Unsurprisingly, some mathematicians (for whom the art of proof had better not omit

anything!) want to apply their principles to describe elegance.

This study of artistic elegance will be from a computational perspective, and it will be based loosely on the paper of the same name. While we include the main content of the paper in a condensed form, we will deviate in two important ways: we alter an axiom with justification, and we provide a working implementation for the reader's use. We do not require extensive working knowledge of theoretical computation, but the informed reader should be aware that everything here is theoretically performed on a Turing machine, though the details are unimportant.

Given our own lack of knowledge of the subject, we will overlook the underlying details and take them for granted. [At some point in the future, we will provide a primer on Kolmogorov complexity. We just ordered a wonderful book on it, and can't wait to dig into it!]

Here we recognize that all digital images are strings of bits, and so when we speak of the

complexity of a string, in addition to meaning strings in general, we specifically mean the

complexity of an image.

Definition: The Kolmogorov complexity of a string is the length of the shortest program

which generates it.

In order to specify length appropriately, we must fix some universal description language, so that all programs have the same frame of reference. Any Turing-complete programming language will do, so let us choose Python for the following examples. More specifically, there exists a universal Turing machine $U$, for which any program on any machine may be translated (compiled) into an equivalent program for $U$ by a program of fixed size. Hence, the measure of Kolmogorov complexity, when a fixed machine is specified (in this case Python), is objective over the class of all outputs.

Here is a simple example illustrating Kolmogorov complexity: consider the string of one hundred zeros. This string is obviously not very complex, in the sense that one could write a very short program to generate it. In Python:

print "0" * 100

One can imagine that a compiler which optimizes for brevity would output rather short

assembly code as well, with a single print instruction and a conditional branch, and some

constants. On the other hand, we want to call a string like

00111010010000101101001110101000111101

complex, because it follows no apparent pattern. Indeed, in Python the shortest program to

output this string is just to print the string itself:

print "00111010010000101101001110101000111101"

And so we see that this random string of ones and zeros has a higher Kolmogorov

complexity than the string of all zeros. In other words, the boring string of all zeros is

simple, while the other is complicated.

Unfortunately, there is no program which can compute the Kolmogorov complexity of a given string (the number itself) for any input. In other words, the problem of determining exact Kolmogorov complexity is undecidable (by reduction from the halting problem; see the Turing machines primer). So we will not try in vain to actually get a number for the Kolmogorov complexity of arbitrary programs, although it is easy to count the lengths of these provably short examples, and instead we speak of complexity in terms of bounds and relativity.

To apply this to art, we want to ask, for a given picture, what is the length of the shortest

program that outputs it? This will tell us whether a picture is simple or complex.

Unfortunately for us, most pictures are neither generated by programs, nor do they have

obvious programmatic representations. More feasibly, we can ask, can we come up with

pictures which have low Kolmogorov complexity and are also beautiful? This is truly a

tough task.

To do so, we must first invent an encoding for pictures, and write a program to interpret the encoding. That's the easy part. Then, the true test: we must paint a beautiful picture.

We don't pretend to be capable of such artistry. However, there are some who have created an encoding based on circles and drawn very nice pictures with it. Here we will present those pictures as motivation, and then develop a very similar encoding method, providing the code and examples for the reader to play with.

Jürgen Schmidhuber, a long time proponent of low-complexity art, spent a very long time

(on the order of thousands of sketches) creating drawings using his circle encoding method,

and here are some of his results:

Marvelous. Our creations will be much uglier. But we admit, one must start somewhere,

and it might as well be where we feel most comfortable: mathematics and programming.

There are many possible encodings for drawings. We will choose one which is fairly easy to

implement, and based on intersecting circles. The strokes in a drawing are arcs of these

circles. We call the circles used to generate drawings legal circles, while the arcs are legal

arcs. Here is an axiomatic specification of how to generate legal circles:

1. Arbitrarily define a circle $C$ with radius 1 as legal. All other circles are generated with respect to this circle. Define a second legal circle whose center is on $C$, and which also has radius 1.

2. Wherever two legal circles of equal radius intersect, a third circle of equal radius is centered at the point of intersection.

3. Every legal circle of radius $r$ has at its center another legal circle of radius $r/2$.

A legal arc is then simply any arc of a legal circle, and a legal drawing is any list of legal arcs, where each arc has a width corresponding to some fixed set of values. Now we generate all circles which intersect the interior of the base circle $C$, and sort them first by radius, then by $x$ coordinate, then by $y$ coordinate. Now given a specified order on the circles, we may number them from 1 to $n$, and specify a particular circle by its index in the list. In this way, we have defined a coordinate space of arcs, with points of the form (center, thickness, arc-start, arc-end), where the arc-start and arc-end coordinates are measured in radians.

We describe the programmatic construction of these circles later. For now, here is the

generated picture of all circles which intersect the unit circle up to radius :

And another animation displaying the list of circles sorted by index in increasing order. For an

animated GIF, this file has a large size (5MB), and so we link to it separately.

As we construct smaller and smaller circles, the interior of the base circle is covered up by a

larger proportion of legally usable area. By using obscenely small circles, we may

theoretically construct any drawing. On the other hand, what we care about is how much

information is needed to do so.

Because of our nice well-ordering on circles, those circles with very small radii will have huge indices! Indeed, there are about four circles of radius $r/2$ for each circle of radius $r$ in any fixed area. Then, we can measure the complexity of a drawing by how many characters its list of legal arcs requires. Clearly, a rendition of Starry Night would have a large number of high-indexed circles, and hence have high Kolmogorov complexity.

(On second thought, I wonder how hard it would be to get a rough sketch of a Starry-Night-esque picture in this circle encoding; it might not be all that complex.)

Note that Schmidhuber defines things slightly differently. In particular, he requires that the

endpoints of a legal arc must be the intersection points of two other legal arcs, making the

arc-start and arc-end coordinates integers instead of radian measures. We respectfully

disagree with this axiom, and we explain why here:

Which of the two arcs is more complex?

Of the two arcs in the picture to the left, which would you say is more complex: the larger or the smaller? We observe that two arcs of the same circle, regardless of how long or short they are, should not be significantly different in complexity.

Schmidhuber, on the other hand, implicitly claims that arcs which begin or terminate at non-standard locations (locations which only correspond to the intersections of sufficiently small circles) should be deemed more complex. But this can be a difference as small as one likes, and it drastically alters the complexity. We consider this specification unrealistic, at least to the extent to which human beings consider complexity in art. So we stick to radians.

Indeed, our model does alter the complexity for some radian measures, simply because

finely specifying fractions requires more bits than integral values. But the change in

complexity is hardly as drastic.

In addition, Schmidhuber allows for region shading between legal arcs. Since we did not

find an easy way to implement this in Mathematica, we skipped it as extraneous.

We implemented this circle encoding in Mathematica. The reader is encouraged to download and experiment with the full notebook, available from this blog's Github page. We will explain the important bits here.

First, we have a function to compute all the circles whose centers lie on a given circle:

borderCircleCenters[{x_, y_}, r_] :=

Table[{x + r Cos[i 2 Pi/6], y + r Sin[i 2 Pi/6]}, {i, 0, 5}];
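For readers without Mathematica, the same function is a one-liner in Python. Here is our translation (names ours), along with a check that the six generated centers sit at distance $r$ from the original center:

```python
# Python port of borderCircleCenters: the six circle centers lying on
# a circle of radius r centered at (x, y), spaced every 60 degrees.
from math import cos, sin, pi, hypot

def border_circle_centers(x, y, r):
    return [(x + r * cos(i * 2 * pi / 6), y + r * sin(i * 2 * pi / 6))
            for i in range(6)]

centers = border_circle_centers(0.0, 0.0, 1.0)
# each generated center is at distance r = 1 from the original center
assert all(abs(hypot(cx, cy) - 1.0) < 1e-9 for cx, cy in centers)
```

The 60-degree spacing is what makes the construction consistent: two unit circles whose centers are distance 1 apart intersect at points that are themselves distance 1 from both centers, so repeated application tiles the plane with a triangular lattice of centers.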

We arbitrarily picked the first legal circle to be the unit circle, defined with center (0,0),

while the second has center (1,0). This made generating all legal circles a relatively simple

search task. In addition, we recognize that any arbitrary second chosen circle is simply a

rotation of this chosen configuration, so one may rotate their final drawing to accommodate a different initialization step.

Second, we have the brute-force search of all circles. We loop through all circles in a list,

generating the six border circles appropriately, and then filtering out the ones we need,

repeating until we have all the circles which intersect the interior of the unit circle. Note our

inefficiencies: we search out as far as radius 2 to find small circles which do not necessarily

intersect the unit circle, and we calculate the border circles of each circle many times. On

the other hand, finding all circles as small as radius takes about a minute on an Intel Atom processor, which is not so slow as to need excessive tuning for a prototype's sake.

legalCircleCenters[r_] := Module[{centers, borderCenters, ord, rt},
   (* collect the centers of all legal circles of radius r *)
   ord[{a_, b_}, {c_, d_}] := If[a < c, True, b < d];
   centers = {{0, 0}};
   rt = Power[r, 1/2];
   While[Norm[centers[[-1]]] <= Min[2, 1 + rt],
    borderCenters = Map[borderCircleCenters[#, r] &, centers];
    centers = centers \[Union] Flatten[borderCenters, 1]];
   Sort[centers, ord]
  ];

Finally, we have a function to extract from the resulting list of all centers the center and

radius of a given index, and a function to convert a coordinate to its graphical

representation:

(* extract the center and radius of the circle at a given index *)
indexToCenterRadius[layeredCenters_, index_] :=
  Module[{row, length, counter},
   row = 1;
   length = Length[layeredCenters[[row]]];
   counter = index;
   While[counter > length,
    counter -= length;
    row++;
    length = Length[layeredCenters[[row]]]];
   {layeredCenters[[row, counter]], radii[[row]]}
  ];

(* convert one arc coordinate to its graphical representation *)
drawArc[allCenters_, {index_, thickness_, arcStart_, arcEnd_}] :=
  Module[{center, radius},
   {center, radius} = indexToCenterRadius[allCenters, index];
   Graphics[{Thickness[thickness],
     Circle[center, radius, {arcStart, arcEnd}]},
    ImagePadding -> 5, PlotRange -> {{-1, 1}, {-1, 1}},
    ImageSize -> {400, 400}]
  ];

And a front-end style function, which takes a list of coordinates and draws the resulting

picture:

Any omitted details (at least one global variable name) are clarified in the notebook.

Now, with our paintbrush in hand, we unveil our very first low-complexity piece of art.

Behold! Surprised Mr. Moustache Witnessing a Collapsing Soufflé:

{299, 0.002, 0, 2 Pi}, {783, 0.002, 0, 2 Pi},

{2140, 0.001, 0, 2 Pi}, {3592, 0.001, 0, 2 Pi},

{22, 0.004, 8 Pi/6, 10 Pi/6}, {29, 0.004, 4 Pi/3, 5 Pi/3},

{21, 0.004, Pi/3, 2 Pi/3}, {28, 0.004, Pi/3, 2 Pi/3}}

Okay, so it's lame, and took all of ten minutes to create (guess-and-check on the indices is quick, thanks to Mathematica's interpreter). But it has low Kolmogorov complexity! And that's got to count for something, right?

Even if you disagree with our obviously inspired artistic genius, the

Mathematica framework for creating such drawings is free and available for anyone to play

with. So please, should you have any artistic talent at all (and access to Mathematica), we

would love to see your low-complexity art! If we somehow come across three days of being

locked in a room with access to nothing but a computer and a picture of Starry Night, we

might attempt to recreate a sketch of it for this blog. But until then, we will explore other

avenues.

Happy sketching!

Addendum: Note that the outstanding problem here is how to algorithmically take a given

picture (or specification of what one wants to draw), and translate it into this system of

coordinates. As of now, no such algorithm is known, and hence we call the process of

making a drawing art. We may attempt to find such a method in the future, but it is likely

hard, and if we produced an algorithm even a quarter as good as we might hope, we would

likely publish a paper first, and blog about it second.


This entry was posted in Algorithms, Design, Geometry, Logic and tagged art, computational

complexity, kolmogorov complexity, low-complexity

art, mathematica, mathematics, patterns, programming, turing machines. Bookmark the permalink.


1. paxinum


j2kun

Technically this circle construction is a fractal (if we drew larger circles and smaller

circles ad infinitum), but we are selecting pieces of the fractal with the goal of

constructing a specific picture. The difference here is that we have well-defined

curves to choose from, whereas in something like the Mandelbrot set, it's a gradient.

Of course, there are other fractals which draw specific pictures, like the Barnsley

fern, but these sorts of constructions have a disadvantage for our purposes because

each algorithm is specific to the object being created.

With this construction we have a distinct analytical advantage. Any drawing can be

drawn, so our framework is universal. And when we restrict our attention to this

particular style of art, any drawing can be compared to any other drawing in terms

of complexity. We could theoretically construct both the Mandelbrot set and the

Barnsley fern using the coordinate system, but our real problem is to find those

drawings which have very low complexity in this framework and are still beautiful.


2. erniejunior

This is a nice attempt to compare the Kolmogorov complexity of images, but what happens if you try to compare the complexities of the rather simple geometric figures of a square and a circle?

Your system will tell you that the square has an infinite complexity (which is not true) and

the circle is rather simple. Your results are biased by the way you encode your information.

If you encoded your pictures with parts of straight lines instead of parts of circles, the comparison of a square and a circle would give you the opposite results (square rather simple but the circle infinitely complex).

Now saying that you could just combine the circles-system and the lines-system would not

get you anywhere: now circles and squares are both simple (as they are supposed to be) but

for example a Mandelbrot fractal (with low Kolmogorov complexity) would still be graded

infinitely complex.

If you want any usable results you need to use math or any equivalent strong language to

encode your picture information. And then again the encoding of a picture is not unique any

more and you need to make sure that any image you draw/construct is encoded in the

simplest way possible which is equally hard as computing the Kolmogorov complexity.

This is not meant as a rant on your article. I love it and your whole blog. It makes me think

and teaches me! Thanks a lot for the effort you put into your articles.

Maximilian


j2kun

You make some very good points, and the square counterexample clearly came

from a mathematical mind! And I think the only rebuttal is this: the circle

framework is not designed to be useful. In fact, it is not hard to see that determining

the correct Kolmogorov complexity of any image is undecidable, since any string

can be interpreted as the pixel information of an image.

So it is not fruitful to search for such a system, because no system exists. Here's a relatively simple example, however, of a more expressive system that encapsulates

both circles and squares: Bezier curves. However, this system is just more complex,

and it sidesteps the point of the article.

That is, this is a question about aesthetics: are designs with provably low

Kolmogorov complexity more beautiful than those with higher Kolmogorov

complexity (with respect to a universal encoding system)?

I am planning to look at it in some more depth in the coming months. In particular it has

shown up in a number of machine learning applications.


erniejunior

I agree that some of my arguments stated the obvious. I just like to write down my thought process and make everything as easy to understand as possible.

A system based on Bezier curves would certainly be worth looking into, especially since they are not that hard to implement.

I am looking forward to your future articles about Kolmogorov complexity

because I feel that it is very important even though I do not (yet) see how it

could be at least approximated and used in any way.


3. MSM

I'm a bit late to the party, but if you want some /really/ good low-complexity art, check old,

good demoscene stuff.

Random Psychedelic Art

Posted on January 1, 2012 by j2kun

Next semester I am a lab TA for an introductory programming course, and it's taught in Python. My Python experience has a number of gaps in it, so we'll have the opportunity for a few more Python primers, and small exercises to go along with them. This time, we'll be investigating the basics of objects and classes, and have some fun with image construction using the Python Imaging Library. Disappointingly, the folks who maintain the PIL are slow to update it for any relatively recent version of Python (it's been a few years since 3.x, honestly!), so this post requires one use Python 2.x (we're using 2.7). As usual, the full source code for this post is available on this blog's GitHub page, and we encourage the reader to follow along and create his own randomized pieces of art! Finally, we include a gallery of generated pictures at the end of this post. Enjoy!

An image is a two-dimensional grid of pixels, and each pixel is a tiny dot of color displayed on the screen. In a computer, one represents each pixel as a triple of numbers (r, g, b), where r represents the red content, g the green content, and b the blue content. Each of these is a nonnegative integer between 0 and 255. Note that this gives us a total of 256^3 distinct colors, which is nearly 17 million. Some estimates of how much color the eye can see range as high as 10 million (depending on the definition of color) but usually stick around 2.4 million, so it's generally agreed that we don't need more.
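As a quick sanity check of that count:

```python
# Each of the three channels independently takes one of 256 values.
total_colors = 256 ** 3
print(total_colors)  # 16777216, i.e. nearly 17 million
```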

The general idea behind our random psychedelic art is that we will generate three randomized functions, each with domain [-1, 1] × [-1, 1] and codomain [-1, 1], and at each pixel (x, y) we will determine the color at that pixel by the triple of the three function values there. This will require some translation between pixel coordinates and plane coordinates, but we'll get to that soon enough. As an example, for one simple choice of such functions, the resulting image is:

We use the extra factor of π because without it the oscillation is just too slow, and the resulting picture is decidedly boring. Of course, the goal is to randomly generate such functions, so we should pick a few functions on [-1, 1] and nest them appropriately. The first which come to mind are sin(πx), cos(πx), and simple multiplication, and with these we can create arbitrarily convoluted functions. We could randomly generate these functions two ways, but both require randomness, so let's familiarize ourselves with the capabilities of Python's random library.

Random Numbers

Pseudorandom number generators are a fascinating topic in number theory, and one of these

days we plan to cover it on this blog. Until then, we will simply note the basics. First,

contemporary computers cannot generate truly random numbers. Everything on a computer

is deterministic, meaning that if one completely determines a situation in a computer, the

following action will always be the same. With the complexity of modern operating systems

(and the aggravating nuances of individual systems), some might facetiously disagree.

For an entire computer the determined situation can be as drastic as choosing every single

bit in memory and the hard drive. In a pseudorandom number generator the determined

situation is a single number called a seed. This initializes the random number generator,

which then proceeds to compute a sequence of bits via some complicated arithmetic. The

point is that one may choose the seed, and choosing the same seed twice will result in the

same sequence of randomly generated numbers. The default seed (which is what one uses

when one is not testing for correctness) is usually some sort of time-stamp which is

guaranteed to never repeat. Flaws in random number generator design (hubris, off-by-one errors, and even using time-stamps!) have allowed humans to take advantage of people who

try to rely on random number generators. The interested reader will find a detailed

account of how a group of software engineers wrote a program to cheat at online poker,

simply by reverse-engineering the random number generator used to shuffle the deck.

In any event, Python makes generating random numbers quite easy:

import random

random.seed()

print(random.random())

print(random.choice(["clubs", "hearts", "diamonds", "spades"]))

We import the random library, we seed it with the default seed, we print out a random

number in [0, 1), and then we randomly pick one element from a list. For a full list of the

functions in Pythons random library, see the documentation. As it turns out, we will only

need the choice() function.
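As a quick aside (a snippet of ours, not from the original post), the seed behavior described above is easy to verify:

```python
import random

# Seeding twice with the same value reproduces the same sequence.
random.seed(2012)
first = [random.random() for _ in range(3)]

random.seed(2012)
second = [random.random() for _ in range(3)]

print(first == second)  # True
```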

One neat way to represent a mathematical function is via a function! In other words, just

like Racket and Mathematica and a whole host of other languages, Python functions are

first-class objects, meaning they can be passed around like variables. (Indeed, they are

objects in another sense, but we will get to that later). Further, Python has support for

anonymous functions, or lambda expressions. For instance, we could construct a randomized function with nested lambdas:

import math
import random

def makeExpr():
   if random.random() < 0.5:
      return lambda x: math.sin(math.pi * makeExpr()(x))
   else:
      return lambda x: x

Note that we need to import the math library, which has support for all of the necessary

mathematical functions and constants. One could easily extend this to support two

variables, cosines, etc., but there is one flaw with the approach: once we've constructed the function, we have no idea what it is. Here's what happens:

>>> x = lambda y: y + 1

>>> str(x)

'<function <lambda> at 0xb782b144>'

There's no way for Python to know the textual contents of a lambda expression at

runtime! In order to remedy this, we turn to classes.

The inquisitive reader may have noticed by now that lots of things in Python have

associated things, which roughly correspond to what you can type after suffixing an

expression with a dot. Lists have methods like [1,2,3,4].append(5), dictionaries have

associated lists of keys and values, and even numbers have some secretive methods:

>>> 45.7.is_integer()

False

Many languages draw a sharp distinction between primitive types and objects, and numbers usually fall into the former category. However, in Python everything is an object. This means the dot operator may be used after any type, and as we see above this includes literals.

A class, then, is just a more transparent way of creating an object with certain associated

pieces of data (the fancy word is encapsulation). For instance, if I wanted to have a type

that represents a dog, I might write the following Python program:

class Dog:
   age = 0
   name = ""

   def bark(self):
      print("Ruff ruff! (I'm %s)" % self.name)

Then to use the new Dog class, I could create it and set its attributes appropriately:

fido = Dog()

fido.age = 4

fido.name = "Fido"

fido.weight = 100

fido.bark()

The details of the class construction require a bit of explanation. First, we note that the

indented block of code is arbitrary, and one need not initialize the member variables.

Indeed, they simply pop into existence once they are referenced, as in the creation of the

weight attribute. To make it more clear, Python provides a special function called

__init__() (with two underscores on each side of init; heaven knows why they decided

it should be so ugly), which is called upon the creation of a new object, in this case the

expression Dog(). For instance, one could by default name their dogs Fido as follows:

class Dog:
   def __init__(self):
      self.name = "Fido"

d = Dog()
d.name # contains "Fido"

This brings up another point: all methods of a class that wish to access the attributes of the class require an additional argument. The first argument passed to any method is always the object representing the owning instance. In Java, this is usually hidden from view, but available by the keyword this. In Python, one must represent it explicitly, and it is standard to name the variable self.
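One way to see that self is nothing magical (a snippet of ours, not from the post): calling the method through the class and passing the instance explicitly behaves identically to the usual call.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def bark(self):
        return "Ruff ruff! (I'm %s)" % self.name

fido = Dog("Fido")
# The usual call fido.bark() just fills in self = fido for us.
print(fido.bark() == Dog.bark(fido))  # True
```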

If we wanted to give the user a choice when instantiating their dog, we could include an

extra argument for the name like this:

class Dog:
   def __init__(self, name = 'Fido'):
      self.name = name

d = Dog()
d.name # contains "Fido"

e = Dog("Manfred")
e.name # contains "Manfred"

Here we made it so the name argument is not required, and if it is excluded we default to

Fido.

Returning to our mathematical functions, we can represent the function f(x, y) = x by the following class:

class X:
   def eval(self, x, y):
      return x

expr = X()
expr.eval(3,4) # returns 3

That's simple enough. But we still have the problem of not being able to print anything sensibly. Trying to do so gives the following output:

>>> str(X)

'__main__.X'

In other words, all it does is print the name of the class, which is not enough if we want to

have complicated nested expressions. It turns out that the str function is quite special.

When one calls str() of something, Python first checks to see if the object being called has

a method called __str__(), and if so, calls that. The awkward __main__.X is a default

behavior. So if we soup up our class by adding a definition for __str__(), we can define

the behavior of string conversion. For the X class this is simple enough:

class X:
   def eval(self, x, y):
      return x

   def __str__(self):
      return "x"

For nested functions we could recursively convert the argument, as in the following

definition for a SinPi class:

class SinPi:
   def __str__(self):
      return "sin(pi*" + str(self.arg) + ")"

   def eval(self, x, y):
      return math.sin(math.pi * self.arg.eval(x,y))

Of course, this requires we set the arg attribute before calling these functions, and since

we will only use these classes for random generation, we could include that sort of logic in

the __init__() function.

Now we can write a function which randomly picks whether to terminate or continue nesting things:

def buildExpr(prob = 0.99):
   if random.random() < prob:
      return random.choice([SinPi, CosPi, Times])(prob)
   else:
      return random.choice([X, Y])()

Here we have classes for cosine, sine, and multiplication, and the two variables. The reason

for the interesting syntax (picking the class name from a list and then instantiating it, noting

that these classes are objects even before instantiation and may be passed around as well!),

is so that we can do the following trick, and avoid unnecessary recursion:

class SinPi:
   def __init__(self, prob):
      self.arg = buildExpr(prob * prob)

   ...

In words, each time we nest further, we exponentially decrease the probability that we will

continue nesting in the future, and all the nesting logic is contained in the initialization of

the object. We're building an expression tree, and then when we evaluate an expression we

have to walk down the tree and recursively evaluate the branches appropriately.
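For concreteness, here is a self-contained sketch of such an expression tree in plain Python. Note one deliberate difference from the post's version: the Times constructor here takes its children directly, whereas in the post __init__ takes a probability and builds the children via buildExpr.

```python
class X:
    def eval(self, x, y):
        return x
    def __str__(self):
        return "x"

class Y:
    def eval(self, x, y):
        return y
    def __str__(self):
        return "y"

class Times:
    # In the post's code, __init__ takes a probability and calls buildExpr;
    # here we pass the subtrees in directly to keep the sketch self-contained.
    def __init__(self, left, right):
        self.left = left
        self.right = right
    def eval(self, x, y):
        return self.left.eval(x, y) * self.right.eval(x, y)
    def __str__(self):
        return str(self.left) + "*" + str(self.right)

expr = Times(X(), Y())
print(str(expr))        # x*y
print(expr.eval(3, 4))  # 12
```

Evaluating walks the tree recursively, while printing reconstructs the textual formula, which is exactly what the lambdas could not do.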

Implementing the remaining classes is a quick exercise, and we remind the reader that the

entire source code is available from this blog's GitHub page. Printing out such expressions

results in some nice long trees, but also some short ones:

>>> str(buildExpr())
'cos(pi*y)*sin(pi*y)'
>>> str(buildExpr())
'cos(pi*cos(pi*y*y*x)*cos(pi*sin(pi*x))*cos(pi*sin(pi*sin(pi*x)))*sin(pi*x))'
>>> str(buildExpr())
'cos(pi*cos(pi*y))*sin(pi*sin(pi*x*x))*cos(pi*y*cos(pi*sin(pi*sin(pi*x))))*sin(pi*cos(pi*sin(pi*x*x*cos(pi*y)))*cos(pi*y))'
>>> str(buildExpr())
'cos(pi*cos(pi*sin(pi*cos(pi*y)))*cos(pi*cos(pi*x)*y)*sin(pi*sin(pi*x)))'
>>> str(buildExpr())
'sin(pi*cos(pi*sin(pi*cos(pi*cos(pi*y)*x))*sin(pi*y)))'
>>> str(buildExpr())
'cos(pi*sin(pi*cos(pi*x)))*y*cos(pi*cos(pi*y)*y)*cos(pi*x)*sin(pi*sin(pi*y*y*x)*y*cos(pi*x))*sin(pi*sin(pi*x*y))'

This should work well for our goals. The rest is constructing the images.

The Python Imaging Library is not part of the standard Python installation, but once it is installed we can access the part we need by adding the line "import Image" to our header.

Now we can construct a new canvas, and start setting some pixels.

canvas = Image.new("L", (300,300))
canvas.putpixel((150,150), 255)
canvas.save("test.png", "PNG")

This gives us a nice black square with a single white pixel in the center. The "L" argument to Image.new() says we're working in grayscale, so that each pixel is a single 0-255 integer representing intensity. We can do this for three images, and merge them into a single color image using Image.merge, where the three grayscale canvases are generated as above, but with the appropriate intensities. The rest of the details in the Python code are left

for the reader to explore, but we dare say it is just bookkeeping and converting between

image coordinate representations. At the end of this post, we provide a gallery of the

randomly generated images, and a text file containing the corresponding expression trees is

packaged with the source code on this blog's GitHub page.
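The coordinate bookkeeping mentioned above can be sketched as follows. This is one reasonable convention of our own (not necessarily the one in the repository) for mapping pixels to the square [-1, 1] × [-1, 1] and function values back to intensities:

```python
def pixel_to_plane(px, py, width, height):
    """Map a pixel coordinate to a point in [-1, 1] x [-1, 1]."""
    return (2.0 * px / (width - 1) - 1, 2.0 * py / (height - 1) - 1)

def to_intensity(value):
    """Map a function value in [-1, 1] to a 0-255 grayscale intensity."""
    return int(round((value + 1) / 2.0 * 255))

print(pixel_to_plane(0, 0, 300, 300))      # (-1.0, -1.0)
print(pixel_to_plane(299, 299, 300, 300))  # (1.0, 1.0)
print(to_intensity(-1.0), to_intensity(1.0))  # 0 255
```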

There is decidedly little mathematics in this project, but there are some things we can discuss. First, we note that there are many, many functions on the interval [-1, 1] that we could include in our random trees. A few examples are: the average of two numbers in that range, the absolute value, certain exponentials, and reciprocals of interesting sequences of numbers. We leave it as an exercise to the reader to add new functions to our existing code, and to further describe which functions achieve coherent effects.

Indeed, the designs are all rather psychedelic, and the layers of color are completely

unrelated. It would be an interesting venture to write a program which, given an image of

something (pretend it's a simple image containing some shapes), constructs expression trees

that are consistent with the curves and lines in the image. This follows suit with our goal of

constructing low-complexity pictures from a while back, and indeed, these pictures have

rather low Kolmogorov complexity. This method is another framework in which to describe

their complexity, in that smaller expression trees correspond to simpler pictures. We leave

this for future work. Until then, enjoy these pictures!

Gallery


This entry was posted in Analysis, Design, Primers, Programming Languages and

tagged art, kolmogorov complexity, primer, python, random number generators. Bookmark

the permalink.


1. jakesprinter

Brilliant colors


2. Axio

Funny that this idea shows up on planet scheme. It had already appeared three years ago (can't find the original link, though), and I had given it a try as well: http://fp.bakarika.net/index.cgi?show=5 (with ugly code).

P!


3. sudonhim

expovariate, gammavariate, gauss, getrandbits, getstate,

jumpahead, lognormvariate, normalvariate, paretovariate,

randint, random, randrange, sample, seed, setstate,

shuffle, triangular, uniform, vonmisesvariate,

weibullvariate, SystemRandom

Awesome images btw!

http://metahub-remote.no-ip.info


j2kun

I can't access your server. Did you set up a thing for users to create Python images

online?


sudonhim

Python image server is up again, however we are porting it (have a 95% working version!) to client-side JavaScript. Although I do prefer Python, it's just too much load on the server.

If you want to go ahead anyway, any images you submit to the Python server will still appear with source in the next version, people just won't be able to reuse the code.


4. EmoryM

it http://www.cs.ucf.edu/complex/papers/stanley_gpem07.pdf


5. anorthhare

Thanks for posting this, I've just begun to scratch the surface of what Python can do. It's a wonderful language. I'm working on a program to turn images into sound. I enjoyed your examples, they are very useful.


6. Bidobido

Can't believe I've been having so much fun with this piece of code since yesterday!!

Just added a Plus class, coded in the same fashion as the Times class, that adds its

arguments and trims the result to be in [-1,1]. Results are breathtaking (much more

so than averaging) ! Thank you for these genuinely pythonic ideas about expression

nesting by the way.

Seam-carving for Content-Aware Image Scaling

Posted on March 4, 2013 by j2kun

Every programmer or graphic designer with some web development experience can attest to

the fact that finding good images that have an exactly specified size is a pain. Since the

dimensions of the sought picture are usually inflexible, an uncomfortable compromise can

come in the form of cropping a large image down to size or scaling the image to have

appropriate dimensions.

Both of these solutions are undesirable. In the example below, the caterpillar looks distorted

in the scaled versions (top right and bottom left), and in the cropped version (bottom right)

it's more difficult to tell that the caterpillar is on a leaf; we have lost the surrounding

context.

In this post we'll look at a nice heuristic method for rescaling images called seam-carving, which pays attention to the contents of the image as it rescales. In particular, it only removes or adds pixels to the image that the viewer is least likely to notice. In all but the most extreme cases it will avoid the ugly artifacts introduced by cropping and scaling, and with a bit of additional scaffolding it becomes a very useful addition to a graphic designer's repertoire. At first we will focus on scaling an image down, and then we will see that the same technique can be used to enlarge an image.

Before we begin, we should motivate the reader with some examples of its use.

It's clear that the caterpillar is far less distorted in all versions, and even in the harshly rescaled version, parts of the green background are preserved. Although the leaf is warped a little, it is still present, and it's not obvious that the image was manipulated.

Now that the reader's appetite has been whetted, let's jump into the mathematics of it. This method was pioneered by Avidan and Shamir, and the impatient reader can jump straight to their paper (which contains many more examples). In this post we hope to fill in the background and show a working implementation.

Images as Functions

One common way to view an image is as an approximation to a function of two real variables. Suppose we have an m × n-pixel image (m rows and n columns of pixels). For simplicity (during the next few paragraphs), we will also assume that the pixel values of an image are grayscale intensity values between 0 and 255. Then we can imagine the pixel values as known integer values of a function f(x, y). That is, if we take two integers 0 ≤ x < n and 0 ≤ y < m, then we know the value f(x, y); it's just the intensity value at the corresponding pixel. For values outside these ranges, we can impose arbitrary values for f (we don't care what's happening outside the image).

Moreover, it makes sense to assume that f is a well-behaved function in between the pixels (i.e. it is differentiable). And so we can make reasonable guesses as to the true derivative of f by looking at the differences between adjacent pixels. There are many ways to get a good approximation of the derivative of an image function, but we should pause a moment to realize why this is important to nail down for the purpose of resizing images.

A good rule of thumb with images is that regions of an image which are most important to

the viewer are those which contain drastic changes in intensity or color. For instance,

consider this portrait of Albert Einstein.

Which parts of this image first catch the eye? The unkempt hair, the wrinkled eyes, the

bushy mustache? Certainly not the misty background, or the subtle shadows on his chin.

Indeed, one could even claim that an image having a large derivative at a certain pixel

corresponds to high information content there (of course this is not true of all images, but

perhaps its reasonable to claim this for photographs). And if we want to scale an image

down in size, we are interested in eliminating those regions which have the smallest

information content. Of course we cannot avoid losing some information: the image after

resizing is smaller than the original, and a reasonable algorithm should not add any new

information. But we can minimize the damage by intelligently picking which parts to

remove; our naive assumption is that a small derivative at a pixel implies a small amount of

information.

Of course we can't just remove regions of an image to change its proportions. We have to remove the same number of pixels in each row or column to reduce the corresponding dimension (width or height, resp.). Before we get to that, though, let's write a program to compute the gradient. For this program and the rest of the post we will use the Processing programming language, and our demonstrations will use the JavaScript cross-compiler processing.js. The nice thing about Processing is that if you know Java then you know Processing. All the basic language features are the same, and it's just got an extra few native types and libraries to make graphics rendering and image displaying easier. As usual, all of the code used in this blog post is available on this blog's GitHub page.

Let's compute the gradient of this picture, and call the picture f:

A very nice picture whose gradient we can compute. It was taken by the artist Ria

Czichotzki.

Since this is a color image, we will call it a function f: R² → R³, in the sense that the input is a plane coordinate (x, y), and the output is a triple (r, g, b) of color intensity values. We will approximate the image's partial derivative ∂f/∂x at (x, y) by inspecting values of f in a neighborhood of the point:

|f(x + 1, y) − f(x − 1, y)| / 2

This is the partial in the x direction, and the partial in the y direction is analogous. Note that the values f(x, y) are vectors, so the norm signs here are really computing the distance between the two values of f.

There are two ways to see why this makes sense as an approximation. The first is analytic: by definition, the partial derivative ∂f/∂x is a limit:

lim_{h → 0} (f(x + h, y) − f(x, y)) / h

It turns out that this limit is equivalent to

lim_{h → 0} (f(x + h, y) − f(x − h, y)) / 2h

And the closer h gets to zero the better the approximation of the limit is. Since the closest we can make h is h = 1 (we don't know any other values of f with nonzero h), we plug in the corresponding values for neighboring pixels. The partial in the y direction is similar.
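In one variable, this central-difference approximation is easy to check numerically (a quick snippet of ours, not from the post):

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) by the slope of the secant through x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

# The derivative of sin at 0 is cos(0) = 1.
print(abs(central_difference(math.sin, 0.0) - 1.0) < 1e-8)  # True
```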

The slope of the blue secant line is not a bad approximation to the derivative at x, provided

the resolution is fine enough.

The salient fact here is that a nicely-behaved curve at x will have a derivative close to the slope of the secant line between the points (x − h, f(x − h)) and (x + h, f(x + h)). Indeed, this idea inspires the original definition of the derivative. The slope of the secant line is just (f(x + h) − f(x − h)) / 2h. As we saw in our post on numerical integration, we can do much better than a linear guess (specifically, we can use any order of polynomial interpolation we wish), but for the purposes of displaying the concept of seam-carving, a linear guess will suffice.

And so with this intuitive understanding of how to approximate the gradient, the algorithm

to actually do it is a straightforward loop. Here we compute the horizontal gradient (that is,

the derivative ∂f/∂x).

PImage horizontalGradient(PImage img) {
   color left, right;
   int center;
   PImage newImage = createImage(img.width, img.height, RGB);

   for (int y = 0; y < img.height; y++) {
      for (int x = 0; x < img.width; x++) {
         center = x + y*img.width;
         left = x == 0 ? img.pixels[center] : img.pixels[(x-1) + y*img.width];
         right = x == img.width-1 ? img.pixels[center] : img.pixels[(x+1) + y*img.width];
         newImage.pixels[center] = color(colorDistance(left, right));
      }
   }

   return newImage;
}

The details are a bit nit-picky, but the idea is simple. If we're inspecting a non-edge pixel, then we can use the formula directly and compute the values of the neighboring left and right pixels. Otherwise, the left pixel or the right pixel will be outside the bounds of the image, and so we replace it with the pixel we're inspecting. Mathematically, we'd be computing the difference |f(x + 1, y) − f(x, y)| or |f(x, y) − f(x − 1, y)|. Additionally, since we'll later only be interested in the relative sizes of the gradient, we can ignore the factor of 1/2 in the formula we derived.

The parts of this code that are specific to Processing also deserve some attention.

Specifically, we use the built-in types PImage and color, for representing images and colors,

respectively. The createImage function creates an empty image of the specified size. And

peculiarly, the pixels of a PImage are stored as a one-dimensional array. So as we're

iterating through the rows and columns, we must compute the correct location of the sought

pixel in the pixel array (this is why we have a variable called center). Finally, as in Java,

the ternary if notation is used to keep the syntax short, and those two lines simply check for

the boundary conditions we stated above.

The last unexplained bit of the above code is the colorDistance function. As our image

function f has triples of numbers as values, we need to compute the distance between

two values via the standard distance formula. We have encapsulated this in a separate

function. Note that because (in this section of the blog) we are displaying the results in an

image, we have to convert to an integer at the end.

int colorDistance(color c1, color c2) {
   float r = red(c1) - red(c2);
   float g = green(c1) - green(c2);
   float b = blue(c1) - blue(c2);
   return (int)sqrt(r*r + g*g + b*b);
}

The reader who

is interested in comparing the two more closely may visit this interactive page. Note that we

only compute the horizontal gradient, so certain locations in the image have a large

derivative but are still dark in this image. For instance, the top of the door in the background

and the wooden bars supporting the bottom of the chair are dark despite the vertical color

variations.

The vertical gradient computation is entirely analogous, and is left as an exercise to the

reader.

Since we want to inspect both vertical and horizontal gradients, we will call the total
gradient matrix the matrix whose entries are the sums of the magnitudes of the
horizontal and vertical gradients at each pixel (x, y):

g(x, y) = |horizontal gradient at (x, y)| + |vertical gradient at (x, y)|

The function g is often called an energy function for the image. We will mention now
that there are other energy functions one can consider, and use this energy function for the
remainder of this post.
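To make the energy function concrete in code, here is a minimal Javascript sketch (Javascript because the blog's interactive examples are cross-compiled to it). The array layout here is a hypothetical one chosen for illustration: the image is an array of [r, g, b] triples indexed img[x][y], not the PImage type used above.

```javascript
// Euclidean distance between two [r, g, b] color triples, mirroring
// the colorDistance function above (without the cast to int).
function colorDistance(c1, c2) {
  const dr = c1[0] - c2[0], dg = c1[1] - c2[1], db = c1[2] - c2[2];
  return Math.sqrt(dr * dr + dg * dg + db * db);
}

// Energy of pixel (x, y): the magnitude of the horizontal color
// difference plus the magnitude of the vertical one. At the image
// boundary we clamp indices, so the difference degenerates gracefully.
function energy(img, x, y) {
  const w = img.length, h = img[0].length;
  const left = img[Math.max(x - 1, 0)][y], right = img[Math.min(x + 1, w - 1)][y];
  const down = img[x][Math.max(y - 1, 0)], up = img[x][Math.min(y + 1, h - 1)];
  return colorDistance(left, right) + colorDistance(down, up);
}
```

For instance, a pure black pixel next to a pure white one contributes about 441.7 to the horizontal term, the distance from (0, 0, 0) to (255, 255, 255).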

Back to the problem of resizing, we want a way to remove only those regions of an image

that have low total gradient across all of the pixels in the region removed. But of course

when resizing an image we must maintain the rectangular shape, and so we have to add or

remove the same number of pixels in each column or row.

For the purpose of scaling an image down in width (and the other cases are similar), we

have a few options. We could find the pixel in each row with minimal total gradient and

remove it. More conservatively, we could remove those columns with minimal gradient (as

a sum of the total gradient of each pixel in the column). More brashly, we could just remove

pixels of lowest gradient willy-nilly from the image, and slide the rows left.

If none of these ideas sound like they would work, it's because they don't. We encourage

the unpersuaded reader to try out each possibility on a variety of images to see just how

poorly they perform. But of these options, removing an entire column happens to distort the

image less than the others. Indeed, the idea of a seam in an image is just a slight

generalization of a column. Intuitively, a seam is a trail of pixels traversing the image

from the bottom to the top, and at each step the pixel trail can veer to the right or left by at

most one pixel.

Definition: Let A be an image with n rows of pixels, each pixel carrying an energy value at least
zero. A vertical seam in A is a list of coordinates (x_i, y_i), one for each row i = 1, ..., n, with the
following properties:

(x_1, y_1) is at the bottom of the image, i.e. y_1 = 1.
(x_n, y_n) is at the top of the image, i.e. y_n = n.
y_i is strictly increasing.
|x_i - x_{i+1}| <= 1 for all 1 <= i < n.

These conditions simply formalize what we mean by a seam. The first and second impose

that the seam traverses from top to bottom. The third requires the seam to always go up,

so that there is only one pixel in each row. The last requires the seam to be connected in

the sense that it doesn't veer too far at any given step.
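The conditions above are mechanical enough to check in a few lines. Here is a Javascript sketch (an illustration, not code from the post; the seam is represented as a list of [x, y] pairs, one per row, with y = 0 as the bottom row):

```javascript
// Check the vertical-seam conditions: one pixel per row, starting at
// the bottom and ending at the top, rows strictly increasing, and each
// step veering left or right by at most one column.
function isVerticalSeam(seam, width, height) {
  if (seam.length !== height) return false;
  if (seam[0][1] !== 0) return false;                        // starts at the bottom
  if (seam[seam.length - 1][1] !== height - 1) return false; // ends at the top
  for (let i = 0; i < seam.length; i++) {
    const [x, y] = seam[i];
    if (x < 0 || x >= width) return false;
    if (i > 0) {
      if (y !== seam[i - 1][1] + 1) return false;            // strictly increasing rows
      if (Math.abs(x - seam[i - 1][0]) > 1) return false;    // connectedness
    }
  }
  return true;
}
```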

Here are some examples of some vertical seams. One can easily define horizontal seams by
swapping the placement of x_i and y_i in the above list of conditions.

So the goal is now to remove the seams of lowest total gradient. Here the total gradient of a

seam is just the sum of the energy values of the pixels in the seam.

Unfortunately there are many more seams to choose from than columns (or even individual

pixels). It might seem difficult at first to find the seam with the minimal total gradient.

Luckily, if we're only interested in minima, we can use dynamic programming to compute

the minimal seam ending at any given pixel in linear time.

We point the reader unfamiliar with dynamic programming to our Python primer on this

topic. In this case, the sub-problem we're working with is the minimal total gradient value
of all seams from the bottom of the image to a fixed pixel (a, b). Let's call this value v(a, b). If
we know v(a, b) for all pixels below, say, row b, then we can compute the value v(a, b) for
the entire row b by taking pixel (a, b), and adding its gradient value to the
minimum of the values of its possible predecessors in a
seam, v(a - 1, b - 1), v(a, b - 1), and v(a + 1, b - 1) (respecting the appropriate boundary conditions).

Once we've computed v for the entire matrix, we can look at the minimal value at the
top of the image, min_a v(a, n), and work backwards down the image to compute which
seam gave us this minimum.

Let's make this concrete and compute the function v as a two-dimensional array called

seamFitness.

void computeVerticalSeams() {
  seamFitness = new float[img.width][img.height];

  for (int i = 0; i < img.width; i++) {
    seamFitness[i][0] = gradientMagnitude[i][0];
  }

  for (int y = 1; y < img.height; y++) {
    for (int x = 0; x < img.width; x++) {
      seamFitness[x][y] = gradientMagnitude[x][y];

      if (x == 0) {
        seamFitness[x][y] += min(seamFitness[x][y-1], seamFitness[x+1][y-1]);
      } else if (x == img.width-1) {
        seamFitness[x][y] += min(seamFitness[x][y-1], seamFitness[x-1][y-1]);
      } else {
        seamFitness[x][y] += min(seamFitness[x-1][y-1], seamFitness[x][y-1], seamFitness[x+1][y-1]);
      }
    }
  }
}

We have two global variables at work here (global state is bad, I know, but it's Processing; it's
made for prototyping): the seamFitness array and the gradientMagnitude array. We

assume at the start of this function that the gradientMagnitude array is filled with sensible

values.

Here we first initialize the zeroth row of the seamFitness array to have the same values as

the gradient of the image. This is simply because a seam of length 1 has only one gradient

value. Note here the coordinates are a bit backwards: the first coordinate represents the

choice of a column, and the second represents the choice of a row. We can think of the

coordinate axes of our image function having the origin in the bottom-left, the same as we

might do mathematically.

Then we iterate over the remaining rows in the matrix, and in each column we compute the
fitness value as described above.

To actually remove a seam, we need to create a new image of the right size, and shift the

pixels to the right (or left) of the image into place. The details are technically important, but

tedious to describe fully. So we leave the inspection of the code as an exercise to the reader.
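For readers who want a starting point before looking at the full source, here is a minimal Javascript sketch of the backtracking and removal steps (an illustration with a hypothetical array-of-arrays layout indexed [x][y], y = 0 at the bottom; not the actual Processing code from the Github page):

```javascript
// Trace the minimal seam from the top row of the fitness array down to
// the bottom, always stepping to the cheapest valid predecessor.
function findMinimalSeam(fitness) {
  const w = fitness.length, h = fitness[0].length;
  let x = 0;
  for (let i = 1; i < w; i++) if (fitness[i][h - 1] < fitness[x][h - 1]) x = i;
  const seam = new Array(h);
  seam[h - 1] = x;
  for (let y = h - 2; y >= 0; y--) {
    let best = x;
    for (const nx of [x - 1, x + 1]) {
      if (nx >= 0 && nx < w && fitness[nx][y] < fitness[best][y]) best = nx;
    }
    x = best;
    seam[y] = x; // seam[y] = column removed in row y
  }
  return seam;
}

// Copy each row into a one-column-narrower image, skipping the seam
// pixel and shifting everything to its right one step left.
function removeSeam(img, seam) {
  const w = img.length, h = img[0].length;
  const out = Array.from({ length: w - 1 }, () => new Array(h));
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w - 1; x++) {
      out[x][y] = x < seam[y] ? img[x][y] : img[x + 1][y];
    }
  }
  return out;
}
```

Removing several seams is then a loop: find the minimal seam, remove it, recompute the gradient and fitness arrays, and repeat.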

We provide the Processing code on this blog's Github page, and show an example of its use
below. Note the image resizes every time the user clicks within it.

It's interesting (and indeed the goal) to see how at first nothing is warped, and then the lines
on the walls curve around the woman's foot, and then finally the woman's body is distorted
before she gets smushed into a tiny box by the oppressive mouse.

We would have liked to run this program online in the same way we did for the gradient
computation example. Processing is
quite nice in that any Processing program (which doesn't use any fancy Java libraries) can
be cross-compiled to Javascript via the processing.js library. This is what we did for the
gradient example. But in doing so for the (admittedly inefficient and memory-leaky) seam-carving
program, it appeared to run an order of magnitude slower in the browser than
locally. This was this author's first time using Processing, so the reason for the drastic jump
in runtime is unclear. If any readers are familiar with processing.js, a clarification would be
very welcome in the comments.

In addition to removing seams to scale an image down, one can just as easily insert seams to

make an image larger. To insert a seam, just double each pixel in the seam and push the rest

of the pixels on the row to the right. The process is not hard, but it requires avoiding one

pitfall: if we just add a single seam at a time, then the seam with minimum total energy will

never change! So we'll just add the same seam over and over again. Instead, if we want to
add k seams, one should compute the k minimal seams and insert them all. If the desired

resize is too large, then the programmer should pick an appropriate batch size and add

seams in batches.
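In code, inserting a single seam is the mirror image of removing one. Here is a Javascript sketch (same hypothetical array-of-arrays layout indexed [x][y] as an illustration; a real implementation would also average the duplicated pixel with its neighbors to hide the doubling):

```javascript
// Insert a seam by duplicating each seam pixel and shifting the rest
// of the row one step to the right. seam[y] is the column to double in row y.
function insertSeam(img, seam) {
  const w = img.length, h = img[0].length;
  const out = Array.from({ length: w + 1 }, () => new Array(h));
  for (let y = 0; y < h; y++) {
    for (let x = 0; x <= w; x++) {
      // Columns up to and including the seam are copied as-is; past it,
      // everything comes from one column to the left (the shift).
      out[x][y] = img[x <= seam[y] ? x : x - 1][y];
    }
  }
  return out;
}
```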

Another nice technique that comes from the seam-carving algorithm is to intelligently

protect or destroy specific regions in the image. To do this requires a minor modification of

the gradient computation, but the rest of the algorithm is identical. To protect a region,

provide some way of user input specifying which pixels in the image are important, and

give those pixels an artificially large gradient value (e.g., the maximum value of an integer).

If the down-scaling is not too extreme, the seam computations will be guaranteed not to use

any of those pixels, and inserted seams will never repeat those pixels. To remove a region,

we just give the desired pixels an arbitrarily low gradient value. Then these pixels will be

guaranteed to occur in the minimal seams, and will be removed from the picture.
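This masking trick amounts to a one-line bias applied to the energy before the seam computation. A Javascript sketch (the mask representation here is hypothetical, standing in for whatever user input the program collects):

```javascript
// Bias the energy before computing seams: a huge value for protected
// pixels (seams will avoid them), a hugely negative one for pixels to
// destroy (seams will be drawn to them). `mask` maps "x,y" keys to
// "protect" or "remove".
function biasedEnergy(rawEnergy, mask, x, y) {
  const tag = mask[x + "," + y];
  if (tag === "protect") return 1e9;
  if (tag === "remove") return -1e9;
  return rawEnergy;
}
```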

The technique of seam-carving is a very nice tool, and as we just saw it can be extended to a

variety of other techniques. In fact, seam-carving and its applications to object removal and

image resizing are implemented in all of the recent versions of Photoshop. The techniques

are used to adapt applications to environments with limited screen space, such as a mobile

phone or tablet. Seam carving can even be adapted for use in videos. This involves an

extension of the dynamic program to work across multiple frames, formally finding a

minimal graph cut between two frames so that each piece of the cut is a seam in the

corresponding frame. Of course there is a lot more detail to it (and the paper linked above

uses this detail to improve the basic image-resizing algorithm), but that's the rough idea.

We've done precious little on this blog with images, but we'd like to get more into graphics
programming. There's a wealth of linear algebra, computational geometry, and artificial
intelligence hiding behind most of the computer games we like to play, and it would be fun
to dive deeper into these topics. Of course, with every new post this author suggests ten new
directions for this blog to go. It's a curse and a blessing.

Until next time!


1. k

Note: GIMP too has had this with the resynthesizer (liquid rescale) plugin for some time.


2. jakeanq

I would love to see some of these concepts integrated into some form of game. As
mentioned, games use so much math that to see the process of game design from more of a
mathematical angle, instead of the all-programming approach employed in other tutorials and
blogs, would be very interesting.


o j2kun

Games do use a lot of math, and it's usually in the form of vector calculus to
emulate physics. The most sophisticated math usually goes into the graphics engine
itself: shading, lighting, texturing, etc. all require a ton of linear algebra, and things
like particle flow (for water, wind, flowing cloth) require differential equations
which are discretized and attacked with linear algebra.

I would be really interested to get into these sorts of things, but to be honest I've
never done any sort of graphics programming outside of basic 2d games.


3. Rafael Carrascosa

I have one question and one request:

- Why dynamic programming and not Dijkstra's graph search? Or more advanced stuff like
A* and other informed searches?

- When inserting seams you suggest picking the k minimal ones; afaik this is a non-trivial task.
Could you do a post sometime on the k-shortest-path algorithms?

Thank you!


o j2kun

Finding the k shortest *seams* in the seam carving example is not hard: just find
the k smallest entries of the bottom row of the seam fitness array, and compute the
seams starting at those positions. Since we don't want to remove two seams with the
same starting position, we don't care if the two shortest paths overlap at their base.

The problem is when those seams overlap even when they have different starting
points (and the overlap can't be avoided). That's why I remove the seams one by
one in the above code, and a priori it seems this problem would occur in any search
algorithm. I'm honestly not sure how this is overcome in practice, but at least for
adding seams it could simply be ignored.

As for Dijkstra's algorithm: it is essentially equivalent to the dynamic
programming, and the complexity is only off by a (probably small) constant factor. I
don't think A* search would be as useful since we have multiple starting points
(though there are probably variants of A* that account for this, I think dynamic
programming is a simpler solution).


4. Andy Bennett

Have you tried swapping the order of your for loops? That will make it more cache-friendly
and may speed it up by a few orders of magnitude! At the moment you're indexing your
array in such a way that it gives you bad locality, as you're sequentially plucking elements out
from each column (in which you're storing the x values) rather than running along the
rows of the array (where you're storing y).

I've been having trouble with the k-smallest seams as well. I can see how you can get the
smallest, but I can't see how you avoid the complexity exploding again on the way back
down the image.

Consider this matrix:

4 4 2 1 4 4 4 4 4 4
4 2 1 2 4 4 4 4 4 4
4 1 2 9 1 4 4 4 4 4
4 3 3 3 9 1 4 4 4 4
4 1 4 4 4 9 2 1 1 1

The two smallest seams land in column 7. How do you get the algorithm to select the
right-hand path? Using min on the way down will always cause this path to be missed.

Moreover, one of the replies in the comments mentions that the seams are removed one by
one. Surely you have to recompute the weights and paths each time you remove a seam,
otherwise you are prone to minute distortions? Do you shift the pixels hard-right, hard-left,
or do you perform some other kind of row-averaging when you remove a pixel from an
arbitrary position in a row?

Thanks for an interesting article. I wasn't previously aware of this particular method!

PS: your comments box doesn't resize in a friendly way when one pastes a bunch of lines
in.


o j2kun

It's been a long time since I thought about locality; I'll give that a try.

I think the matrix you gave is the matrix for gradient values (because the matrix for
seam fitness values is monotonically increasing). Before we try to compute seams,
we transform the gradient matrix into the seam fitness matrix, and from there
finding the minimal paths is much easier. In any event, we certainly don't use min
on the gradient matrix: that would give non-optimal paths.

Yes, I do recompute everything after each seam removal. I shift pixels hard-right,

and actually construct a new image of the right size to store the shifted image.


5. Vince P.

Hey, where are your permalinks? Doofuses like me like to add your articles to Pocket or the
like, and currently I have to go to your comments page just to get a URL I can add.

Just a thought.


o j2kun

They're at the bottom of each post. And I believe the permalinks aren't any
different from the regular URLs.


Vince P.

Well, if you put them at the top of the post, then doofuses like me will never
ask that question again. Just a thought.


6. Severyn Kozak

Fantastic post. Sums up seam carving really well from start to finish, provides working code

examples, *and* contains a formal mathematical angle.

The Cellular Automaton Method for Cave Generation

Posted on July 29, 2012 by j2kun

Dear reader, this post has an interactive simulation! We encourage you to play with it as

you read the article below.

In our series of posts on cellular automata, we explored Conway's classic Game of Life and

discovered some interesting patterns therein. And then in our primers on computing theory,

we built up a theoretical foundation for similar kinds of machines, including a discussion

of Turing machines and the various computational complexity classes surrounding them.

But cellular automata served us pretty exclusively as a toy. It was a basic model of

computation, which we were interested in only for its theoretical universality. One wouldn't
expect too many immediately practical (and efficient) applications of something which
needs a ridiculous scale to perform basic logic. In fact, it's amazing that there are as many
as there are.

In this post well look at one particular application of cellular automata to procedural level

generation in games.

An example of a non-randomly generated cave level from Bethesda's The Elder Scrolls

series.

Level design in video games is a time-consuming and difficult task. It's extremely difficult
for humans to hand-craft areas that both look natural and are simultaneously fun to play in.

This is particularly true of the multitude of contemporary role-playing games modeled

after Dungeons and Dragons, in which players move through a series of areas defeating

enemies, collecting items, and developing their character. With a high demand for such

games and so many levels in each game, it would save an unfathomable amount of money
to have computers generate the levels on the fly. Perhaps more importantly, a game with

randomly generated levels inherently has a much higher replay value.

The idea of randomized content generation (often called procedural generation) is not

particularly new. It has been around at least since the 1980s. Back then, computers simply
didn't have enough space to store large, complex levels in memory. To circumvent this

problem, video game designers simply generated the world as the player moved through it.

This opened up an infinitude of possible worlds for the user to play in, and the seminal

example of this is a game called Rogue, which has since inspired series such

as Diablo, Dwarf Fortress, and many many others. The techniques used to design these

levels have since been refined and expanded into a toolbox of techniques which have

become ubiquitous in computer graphics and game development.

We'll explore more of these techniques in the future, but for now we'll see how a cellular

automaton can be used to procedurally generate two-dimensional cave-like maps.

While the interested reader can read more about cellular automata on this blog, we will give

a quick refresher here.

For our purposes here, a 2-dimensional cellular automaton is a grid G of cells, where each
cell is in one of a fixed number of states, and has a pre-determined and fixed set of
neighbors. Then G is updated by applying a fixed rule to each cell simultaneously, and the

process is repeated until something interesting happens or boredom strikes the observer.

The most common kind of cellular automaton, called a Life-like automaton, has only two

states, dead and alive (for us, 0 and 1), and the rule applied to each cell is given as

conditions to be born or survive based on the number of adjacent live cells. This is often

denoted Bx/Sy where x and y are lists of single digit numbers. Furthermore, the choice of

neighborhood is the eight nearest cells (i.e., including the diagonally-adjacent ones). For

instance, B3/S23 is the cellular automaton rule where a cell is born if it has three living

neighbors, and it survives if it has either two or three living neighbors, and dies otherwise.

Technically, these are called Life-like automata, because they are modest generalizations

of Conway's original Game of Life. We give an example of a B3/S23 cellular automaton

initialized by a finite grid of randomly populated cells below. Note that each of the black
(live) cells in the resulting stationary objects satisfies the S23 part of the rule, but none of the
neighboring white (dead) cells satisfy the B3 condition.
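A rule in Bx/Sy form is straightforward to implement directly. Here is a self-contained Javascript sketch of one synchronous update step (an illustration, not the Mathematica or Javascript code from the post; the treatment of cells beyond the edge is a parameter, since the cave generation described below uses a fixed live border while plain Life uses a dead one):

```javascript
// One synchronous update of a Life-like automaton Bb/Ss.
// grid[x][y] is 0 (dead) or 1 (alive); `born` and `survive` are the
// neighbor counts from the B and S parts of the rule; cells beyond the
// edge count as `border` (0 for a dead border, 1 for a live one).
function step(grid, born, survive, border = 0) {
  const w = grid.length, h = grid[0].length;
  const next = Array.from({ length: w }, () => new Array(h).fill(0));
  for (let x = 0; x < w; x++) {
    for (let y = 0; y < h; y++) {
      let n = 0; // count the eight nearest neighbors
      for (let dx = -1; dx <= 1; dx++) {
        for (let dy = -1; dy <= 1; dy++) {
          if (dx === 0 && dy === 0) continue;
          const nx = x + dx, ny = y + dy;
          n += (nx < 0 || ny < 0 || nx >= w || ny >= h) ? border : grid[nx][ny];
        }
      }
      next[x][y] = grid[x][y] ? (survive.includes(n) ? 1 : 0)
                              : (born.includes(n) ? 1 : 0);
    }
  }
  return next;
}
```

Conway's Game of Life is then step(grid, [3], [2, 3]), and the cave rule discussed below would be step(grid, [6, 7, 8], [3, 4, 5, 6, 7, 8], 1).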

A cellular automaton should really be defined for an arbitrary graph (or more generally, an

arbitrary state space). There is really nothing special about a grid other than that it's easy to

visualize. Indeed, some cellular automata are designed for hexagonal grids, others are

embedded on a torus, and still others are one- or three-dimensional. Of course, nothing

stops automata from existing in arbitrary dimension, or from operating with arbitrary (albeit

deterministic) rules, but to avoid pedantry we won't delve into a general definition here. It

would take us into a discussion of discrete dynamical systems (of which there are many,

often with interesting pictures).

Now the particular cellular automaton we will use for cave generation is simply

B678/S345678, applied to a random initial grid with a fixed live border. We interpret the

live cells as walls, and the dead cells as open space. This rule should intuitively work: walls

will stay walls even if more cells are born nearby, but isolated or near-isolated cells will

often be removed. In other words, this cellular automaton should smooth out a grid

arrangement to some extent. Here is an example animation quickly sketched up in

Mathematica to witness the automaton in action:

An example cave generated via the automaton rule B678/S345678. The black cells are

alive, and the white cells are dead.

As usual, the code to generate this animation (which is only a slight alteration to the code

used in our post on cellular automata) is available on this blog's Github page.

This map is already pretty great! It has a number of large open caverns, and they are

connected by relatively small passageways. With a bit of imagination, it looks absolutely

cavelike!

We should immediately note that there is no guarantee that the resulting regions of

whitespace will be connected. We got lucky with this animation, in that there are only two

disconnected components, and one is quite small. But in fact one can be left with multiple

large caves which have no connecting paths.

Furthermore, we should note the automaton's rapid convergence to a stable state. Unlike
Conway's Game of Life, in practice this automaton almost always converges within 15

steps, and this author has yet to see any oscillatory patterns. Indeed, they are unlikely to

exist because the survival rate is so high, and our initial grid has an even proportion of live

and dead cells. There is no overpopulation that causes cells to die off, so once a cell is born

it will always survive. The only cells that do not survive are those that begin isolated. In a

sense, B678/S345678 is designed to prune the sparse areas of the grid, and fill in the dense

areas by patching up holes.

We should also note that the initial proportion of cells which are alive has a strong effect on

the density of the resulting picture. For the animation we displayed above, we initially

chose that 45% of the cells would be live. If we increase that a mere 5%, we get a picture

like the following.

A cave generated with the initial proportion of live cells equal to 0.5

As expected, there are many more disconnected caverns. Some game designers prefer a

denser grid combined with heuristic methods to connect the caverns. Since our goal is just

to explore the mathematical ideas, we will leave this as a parameter in our final program.

One important thing to note is that B678/S345678 doesn't scale well to fine grid sizes. For
instance, if we increase the grid size to 200 × 200, we get something resembling an
awkward camouflage pattern.

A 200 × 200 grid cave generation. Click the image to enlarge it.

What we really want is a way to achieve the major features of the low-resolution image on a

larger grid. Since cellular automata are inherently local manipulations, we should not

expect any modification of B678/S345678 to do this for us. Instead, we will use
B678/S345678 to create a low-resolution image, increase its resolution manually, and smooth
it out with (you guessed it) another cellular automaton! We'll design this automaton
specifically for the purpose of smoothing out corners.

To increase the resolution, we may simply divide the cells into four pieces. The picture
doesn't change, but the total number of cells increases fourfold. There are a few ways to do
this programmatically, but the way we chose simply uses the smallest resolution possible,
and simulates higher resolution by doing block computations. The interested programmer
can view our Javascript program available on this blog's Github page to see this directly (or
view the page source of this post's interactive simulator).
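The naive version of this resolution increase is tiny; here is a Javascript sketch (the actual program simulates this with block computations instead of reallocating the grid, as noted above):

```javascript
// Double the resolution: each cell becomes a 2x2 block of cells with
// the same state. The picture is unchanged, but the grid is finer, so
// a subsequent smoothing pass operates at a smaller scale.
function doubleResolution(grid) {
  const w = grid.length, h = grid[0].length;
  const out = Array.from({ length: 2 * w }, () => new Array(2 * h));
  for (let x = 0; x < 2 * w; x++) {
    for (let y = 0; y < 2 * h; y++) {
      out[x][y] = grid[x >> 1][y >> 1]; // integer-halve both indices
    }
  }
  return out;
}
```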

There is one feature we would like to
improve on in the above examples. In particular, once we increase the resolution, we will
have a lot of undesirable convex and concave corners. Since a corner is simply a block
satisfying certain local properties, we can single those out to be removed by an automaton.

It's easy to see that convex corners have exactly 3 live neighbors, so we should not allow

those cells to survive. Similarly, the white cell just outside a concave corner has 5 live

neighbors, so we should allow that cell to be born. On the other hand, we still want the

major properties of our old B678/S345678 to still apply, so we can simply add 5 to the B

part and remove 3 from the S part. Lastly, for empirical reasons, we also decide to kill off

cells with 4 live neighbors.

We present this application as an interactive javascript program. Some basic instructions:

The Apply B678/S345678 button applies B678/S345678 to the currently displayed grid. It
iterates the automaton 20 times in an animation.

The Apply B5678/S5678 button applies the smoothing automaton, but it does so
only once, allowing the user to control the degree of smoothing at the specific
resolution level.

The Increase Resolution button splits each cell into four, and may be applied until
the cell size is down to a single pixel.

The Reset button resets the entire application, creating a new random grid.

We used this program to generate a few interesting looking pictures by varying the order in
which we pressed the various buttons (it sounds silly, but it's an exploration!). First, a nice

cave:

An example of a higher resolution cave created with our program. In order to achieve

similar results, first apply B678/S345678, and then alternate increasing the resolution and

applying B5678/S5678 1-3 times.

We note that this is not perfect. There are some obvious and awkward geometric artifacts

lingering in this map, mostly in the form of awkwardly straight diagonal lines and

awkwardly flawless circles. Perhaps one might imagine the circles are the bases of

stalactites or stalagmites. But on the whole, in terms of keeping the major features of the

original automaton present while smoothing out corners, this author thinks B5678/S5678
has done a phenomenal job. Further to the cellular automaton's defense, when the local
properties are applied uniformly across the entire grid, such regularities are bound to occur.
That's just another statement of the non-chaotic nature of B5678/S5678 (in stark contrast to
Conway's Game of Life).

There are various modifications one could perform (or choose not to, depending on the type

of game) to make the result more accessible for the player. For instance, one could remove

all regions which fit inside a sufficiently small circle, or add connections between the

disconnected components at some level of resolution. This would require some sort of

connected-component labeling, which is a nontrivial task; current research goes into

optimizing connected-component algorithms for large-scale grids. We plan to cover such

topics on this blog in the future.

Another example of a cool picture we created with this application might be considered a

more retro style of cave.

Apply B678/S345678 once, and increase the resolution as much as possible before applying

B5678/S5678 as many times as desired.

We encourage the reader to play around with the program to see what other sorts of

creations one can make. As of the time of this writing, changing the initial proportion of

live cells (50%) or changing the automaton rules cannot be done in the browser; it requires

one to modify the source code. We may implement the ability to control these in the

browser given popular demand, but (of course) it would be a wonderful exercise for the

intermediate Javascript programmer.

It's clear that this same method can be extended to a three-dimensional model for
generating caverns in a game like Minecraft. While we haven't personally experimented
with three-dimensional cellular automata here on this blog, it's far from a new idea. Once

we reach graphics programming on this blog (think: distant future) we plan to revisit the

topic and see what we can do.

Until then!


1. paxinum


2. mortoray

What's awkward about circles? They exist in nature, especially in caves (things dripping to
create round stalagmites and eventually form columns). I'm actually upset that in video
games the open chambers are always empty, never having columns or protrusions.


o j2kun

Perhaps I should clarify. It's not that I don't like circles, or that they don't belong in
caves. The problem is their regularity in this particular model. The picture I gave
above was lucky in that it did not have too many circles, but if you experiment with
the interactive simulation you'll notice that they show up frequently, and in every
single run. These perfect circles are simply a persistent side effect when we'd rather
they be a randomly occurring feature. I suppose it's largely a matter of perspective,
but at least it gives some insight into the nature of the automaton.


mortoray


j2kun

You could. Now we're getting into the realm of more general discrete
dynamical systems (which I know absolutely nothing about). I don't quite
have the intuition to design such a system.


That is a great idea; thinking I might look into generating stalagmites or 'tites
using the basic code I used to generate waterfalls in this
cave: http://www.avanderw.co.za/making-a-cave-like-structure-with-worms/
Basic idea being to create half-waterfalls and reverse-waterfalls and call them
stalagmites / stalactites. Will put aside some time to see what results.


3. Kris

As an aside, the Cahn-Hilliard equation describes phase separation from a mixture, and
results in a type of tortuous distribution of caves like your CA arrives at. Although I don't
think there is a guarantee that there will be a connected path from here to there.

http://en.wikipedia.org/wiki/Cahn%E2%80%93Hilliard_equation

http://www.ctcms.nist.gov/fipy/examples/cahnHilliard/generated/examples.cahnHilliard.mesh2D.html

http://www.ctcms.nist.gov/fipy/examples/cahnHilliard/generated/examples.cahnHilliard.sphere.html


o j2kun


4. codiecollinge

Great post. Did you ever think about this being used for procedural textures as well as map
generation? I feel that B678/S345678, when straight scaled up, would be a nice function to
start off with for a procedural texture; the smoothing would also come in handy.

Once again, thanks for the post. I'm reading it in the early hours and easily understood it!
Although I think I may read your other posts on cellular automata; seems very interesting.


o j2kun

I've been meaning to derive and implement Perlin noise for a while on this blog,
and use it to do cool textures. Alas, work and research must come first (and I'm a
bit of a newbie to graphics, despite my extensive experience in both linear algebra
and C++). So textures are definitely on my list, and I'll keep your comment in mind.


5. Sascha


6. xot

Nice to see an article about this. I posted a very brief suggestion in a CA topic on my forums
that Gérard Vichniac's Vote CA could be used for cave generation. It's very similar to
these, but has a couple of interesting features. It's quite dynamic looking as it runs, and the
longer it runs, the more homogeneous it gets. Rather than scaling and smoothing, you just
let it keep running until the features are the size you desire. It does mean a good deal more
computation, but it also results in structures with fewer lattice artifacts.


o xot

Whoops, I messed up my notation. What I was shooting for was the Vote variant

called Anneal or Vote 4/5.

B4678/S35678


7. YetAnotherPortfolio (@yetanotherportf)

Great article!

I have made a simple editor in javascript to play with cellular automata the way you
describe them here: http://www.yetanotherportfolio.fr/tmp/cellular/index.html

I first made it for myself to better understand how the automata work, so it's maybe obscure
in the way it works. Let me know what you think.


o j2kun


YetAnotherPortfolio (@yetanotherportf)

Applying a B/S345678 one time at the end removes all lone cells. It's pretty
useful to clean the map.

I'm also trying different sets of rules based on your article and what I found in
the comments, to see how I can control the overall shapes of the blobs.

I'm adding a blob recognition system, to be able to add doors (or bridges)
between blobs, but it's not quite finished.

Like

8. gekkostate

Amazing post! I would like to try something like this in Java. I was hoping you could point me in the right direction. What are the first steps I should take in learning this?

o j2kun

If you already know some Java, check out the JFrame GUI class in Swing. That will get you started on drawing things. The JavaScript code I used on the demo page might help you out with the logic.

9. rusyninventions

I know that this post was written quite some time ago now, but I want to say how much I love it. Several months ago, it really pointed me in the right direction for an idea I was tinkering with. I started blogging about the experience, which is chiefly based on this article. At present, there is only the first part of the series, but I have already extended the demonstration in this first article to a 3D environment with sprawling, randomly generated terrain, to be featured in the followup articles.

http://blog.rusyninventions.com/2014/01/cellular-automated-caves-for-game-design/

o j2kun

Very impressive! I've been getting contacted a lot about doing 3D versions of this idea. I know for a fact it's possible; I just lack enough 3D-graphics knowledge and time to do it. I'm interested to see your game as it progresses.

rusyninventions

mention in this post. I used this same idea, but in a 3D world so that it creates walls. I have not yet blogged about it, but here is the demo video I recorded last night showing its current state: http://www.youtube.com/watch?v=c5ZWNxQQxr8

j2kun

Still very cool. I think I may have to do my 3D experiments using Unity like


10. Dave S.

j2kun Many years ago I played around with Al Hensel's Life program (v1.06) and, like you, came up with some interesting rules for cave/maze generation. Once I have deciphered my scribblings I'll put something up on my web site.

smoothing) https://www.mediafire.com/convkey/6e68/psnouz8r58ngk066g.jpg

11. Dave S.

rusyninventions I read your name wrong, sorry about that. Anyway, I have looked at my notes and experimented a little. I will update my website, hopefully sometime next week.

Bezier Curves and Picasso

Posted on May 11, 2013 by j2kun

Pablo Picasso in front of The Kitchen, photo by Herbert List.

Some of my favorites among Pablo Picasso's works are his line drawings. He did a number of them about animals: an owl, a camel, a butterfly, etc. This piece, called Dog, is on my wall:

(Jump to interactive demo where we recreate Dog using the math in this post)

These paintings are extremely simple but somehow strike the viewer as deeply profound. They give the impression of being quite simple to design and draw: a single stroke of the hand and a scribbled signature, but what a masterpiece! It simultaneously feels like a hasty afterthought and a carefully tuned overture to a symphony of elegance. In fact, we know that Picasso's process was deep. For example, in 1945-1946, Picasso made a series of eleven drawings (lithographs, actually) showing the progression of his rendition of a bull. The first few are more or less lifelike, but as the series progresses we see the bull boiled down to its essence, the final drawing requiring a mere ten lines. Along the way we see drawings of a bull that resemble some of Picasso's other works (number 9 reminding me of the sculpture at Daley Center Plaza in Chicago). Read more about the series of lithographs here.

Picasso's The Bull. Photo taken by Jeremy Kun at the Art Institute of Chicago in 2013.

Click to enlarge.

Now I don't pretend to be a qualified artist (I couldn't draw a bull to save my life), but I can recognize the mathematical aspects of his paintings, and I can write a damn fine program.

There is one obvious way to consider Picasso-style line drawings as a mathematical object, and it is essentially the Bezier curve. Let's study the theory behind Bezier curves, and then write a program to draw them. The mathematics involved requires no background knowledge beyond basic algebra with polynomials, and we'll do our best to keep the discussion low-tech. Then we'll explore a very simple algorithm for drawing Bezier curves, implement it in JavaScript, and recreate one of Picasso's line drawings as a sequence of Bezier curves.

When asked to conjure a curve, most people (perhaps plagued by their elementary mathematics education) will either convulse in fear or draw part of the graph of a polynomial. While these are fine and dandy curves, they only represent a small fraction of the world of curves. We are particularly interested in curves which are not part of the graph of any function.

For instance, a French curve is a physical template used in (manual) sketching to aid the hand in drawing smooth curves. Tracing the edges of any part of these curves will usually give you something that is not the graph of a function. It's obvious that we need to generalize our idea of what a curve is a bit. The problem is that many fields of mathematics define a curve to mean different things. The curves we'll be looking at, called Bezier curves, are a special case of single-parameter polynomial plane curves. This sounds like a mouthful, but what it means is that the entire curve can be evaluated with two polynomials: one for the x values and one for the y values. Both polynomials share the same variable, which we'll call t, and t is evaluated at real numbers.

An example should make this clear. Let's pick two simple polynomials in t, say x(t) = 1 - t^2 and y(t) = t - t^3. If we want to find points on this curve, we can just choose values of t and plug them into both equations. For instance, plugging in t = 2 gives the point (-3, -6) on our curve. Plotting all such values gives a curve that is definitely not the graph of a function:

But it's clear that we can write any single-variable function y = f(x) in this parametric form: just choose x(t) = t and y(t) = f(t). So these are really more general objects than regular old functions (although we'll only be working with polynomials in this post).
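To make the plug-in-a-value-of-t recipe concrete, here is a tiny JavaScript sketch; the helper name and the particular polynomials are my own choices for illustration:

```javascript
// Sample a single-parameter plane curve by plugging values of t
// into its x and y polynomials.
function samplePoints(x, y, ts) {
  return ts.map(function (t) { return [x(t), y(t)]; });
}

// An example pair: x(t) = 1 - t^2, y(t) = t - t^3.
var x = function (t) { return 1 - t * t; };
var y = function (t) { return t - t * t * t; };

// samplePoints(x, y, [0, 2]) gives [[1, 0], [-3, -6]]
```

Note that t = 0.5 and t = -0.5 both give x = 0.75 but different y values, so this curve really does fail the vertical line test: it is not the graph of any function.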

So a curve is a pair of polynomials in the same variable t. Sometimes, if we want to express the whole gadget in one piece, we can take the coefficients of common powers of t and write them as vectors in the x and y parts. For a cubic pair, that looks like

f(t) = (a_0, b_0) + (a_1, b_1)t + (a_2, b_2)t^2 + (a_3, b_3)t^3,

where the a_i are the coefficients of x(t) and the b_i are the coefficients of y(t). Here the coefficients are points (which are the same as vectors) in the plane, and we represent the function f in boldface to emphasize that the output is a point. The linear-algebraist might recognize that pairs of polynomials form a vector space, and further combine them as f(t) = (x(t), y(t)). But for us, thinking of points as coefficients of a single polynomial is actually better.

We will also restrict our attention to single-parameter polynomial plane curves for which the variable t is allowed to range from zero to one. This might seem like an awkward restriction, but in fact every finite single-parameter polynomial plane curve can be written this way (we won't bother too much with the details of how this is done). For the sake of brevity, we will henceforth call a single-parameter polynomial plane curve where t ranges from zero to one simply a curve.

Now there are some very nice things we can do with curves. For instance, given any two points P_0, P_1 in the plane we can describe the straight line between them as a curve: L(t) = (1 - t)P_0 + tP_1. Indeed, at t = 0 the value is exactly P_0, at t = 1 it's exactly P_1, and the equation is a linear polynomial in t. Moreover (without getting too much into the calculus details), the line travels at unit speed from P_0 to P_1. In other words, we can think of L as describing the motion of a particle from P_0 to P_1 over time, and at time 1/4 the particle is a quarter of the way there, at time 1/2 it's halfway, etc. (An example of a straight line which doesn't have unit speed is, e.g., g(t) = (1 - t^2)P_0 + t^2 P_1.)
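In code this line-as-a-curve is a one-liner. A minimal sketch (the function name is mine):

```javascript
// The straight-line curve L(t) = (1 - t)*p + t*q, evaluated componentwise
// on two-dimensional points represented as [x, y] arrays.
function line(p, q, t) {
  return [(1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1]];
}

// line([0, 0], [4, 2], 0.25) gives [1, 0.5]: a quarter of the way there.
```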

More generally, let's add a third point P_2. We can describe a path which goes from P_0 to P_2, and is guided by P_1 in the middle. This idea of a guiding point is a bit abstract, but computationally no more difficult. Instead of travelling from one point to another at constant speed, we want to travel from one line to another at constant speed. That is, call L_1(t), L_2(t) the two curves describing the lines from P_0 to P_1 and from P_1 to P_2, respectively. Then the curve guided by P_1 can be written as the curve

F(t) = (1 - t)L_1(t) + tL_2(t).

We can interpret this again in terms of a particle moving. At the beginning of our curve the value of t is small, and so we're sticking quite close to the line L_1. As time goes on the point F(t) moves along the line between the points L_1(t) and L_2(t), which are themselves moving. This traces out a curve which looks like this

This screenshot was taken from a wonderful demo by data visualization consultant Jason

Davies. It expresses the mathematical idea quite superbly, and one can drag the three points

around to see how it changes the resulting curve. One should play with it for at least five

minutes.

The entire idea of a Bezier curve is a generalization of this principle: given a list of n points P_0, ..., P_{n-1} in the plane, we want to describe a curve which travels from the first point to the last, and is guided in between by the remaining points. A Bezier curve is a realization of such a curve (a single-parameter polynomial plane curve) which is the inductive continuation of what we described above: we travel at unit speed from a Bezier curve defined by the first n - 1 points in the list to the curve defined by the last n - 1 points. The base case is the straight-line segment (or the single point, if you wish). Formally, writing B(P_0, ..., P_{n-1}) for the Bezier curve with control points P_0, ..., P_{n-1}, we define the degree n - 1 Bezier curve recursively as

B(P_0, ..., P_{n-1})(t) = (1 - t) B(P_0, ..., P_{n-2})(t) + t B(P_1, ..., P_{n-1})(t),

with the base case B(P_0)(t) = P_0.

While the concept of travelling at unit speed between two lower-order Bezier curves is the real heart of the matter (and allows us true computational insight), one can multiply all of this out (using the formula for binomial coefficients) and get an explicit formula. For the degree n curve with control points P_0, ..., P_n, it is:

B(t) = sum from k = 0 to n of (n choose k) (1 - t)^{n-k} t^k P_k

And for example, a cubic Bezier curve with control points P_0, P_1, P_2, P_3 would have equation

B(t) = (1 - t)^3 P_0 + 3(1 - t)^2 t P_1 + 3(1 - t) t^2 P_2 + t^3 P_3
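The recursive definition translates directly into a short evaluation routine, often called de Casteljau's algorithm: repeatedly interpolate between consecutive control points until a single point remains. Here's a sketch in JavaScript (the function name is my own):

```javascript
// Evaluate a Bezier curve of any degree at time t using the recursive
// (de Casteljau) definition: each pass replaces the list of points with
// the interpolated points (1 - t)*P[i] + t*P[i+1], until one remains.
function bezierPoint(controlPoints, t) {
  var points = controlPoints;
  while (points.length > 1) {
    var next = [];
    for (var i = 0; i < points.length - 1; i++) {
      next.push([
        (1 - t) * points[i][0] + t * points[i + 1][0],
        (1 - t) * points[i][1] + t * points[i + 1][1]
      ]);
    }
    points = next;
  }
  return points[0];
}
```

Evaluating at t = 0 or t = 1 returns the first or last control point, matching the requirement that the curve travels from the first point to the last.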

Higher-degree Bezier curves can be quite complicated to picture geometrically. For instance, the following is a fifth-degree Bezier curve (with six control points).

A degree five Bezier curve, credit Wikipedia.

The additional line segments drawn show the recursive nature of the curve. The simplest are

the green points, which travel from control point to control point. Then the blue points

travel on the line segments between green points, the pink travel along the line segments

between blue, the orange between pink, and finally the red point travels along the line

segment between the orange points.

Without the recursive structure of the problem (just seeing the curve) it would be a wonder how one could actually compute with these things. But as we'll see, the algorithm for drawing a Bezier curve is very natural.

Let's derive and implement the algorithm for painting a Bezier curve to a screen using only the ability to draw straight lines. For simplicity, we'll restrict our attention to degree-three (cubic) Bezier curves. Indeed, every Bezier curve can be written as a combination of cubic curves via the recursive definition, and in practice cubic curves balance computational efficiency and expressiveness. All of the code we present in this post will be in JavaScript, and is available on this blog's Github page.

So then a cubic Bezier curve is represented in a program by a list of four points. For example:
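Here's one such list in JavaScript (the coordinates are arbitrary, chosen just for illustration):

```javascript
// A cubic Bezier curve: four [x, y] control points. The curve starts at
// the first point, ends at the last, and is guided by the middle two.
var curve = [[1, 2], [5, 5], [4, 0], [9, 3]];
```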

Most graphics libraries (including the HTML5 canvas standard) provide a drawing primitive that can output Bezier curves given a list of four points. But suppose we aren't given such a function. Suppose that we only have the ability to draw straight lines. How would one go about drawing an approximation to a Bezier curve? If such an algorithm exists (it does, and we're about to see it) then we could make the approximation so fine that it is visually indistinguishable from a true Bezier curve.

The key property of Bezier curves that allows us to come up with such an algorithm is the

following:

Any cubic Bezier curve B can be split into two cubic Bezier curves B_1, B_2, end to end, which together trace out the same curve as B.

Let's see exactly how this is done. Let B(t) be a cubic Bezier curve with control points P_0, P_1, P_2, P_3, and let's say we want to split it exactly in half. We notice that the split point is given by the formula for the curve when we plug in t = 1/2, which is

B(1/2) = (1/8)P_0 + (3/8)P_1 + (3/8)P_2 + (1/8)P_3.

Moreover, our recursive definition gave us a way to evaluate the point in terms of smaller-degree curves. But when these are evaluated at 1/2 their formulae are similarly easy to write down. The picture looks like this:

The green points are the degree one curves, the pink points are the degree two curves, and the blue point is the cubic curve. We notice that, since each of the curves is evaluated at t = 1/2, each of these points can be described as the midpoint of points we already know. So the first green point is m_0 = (P_0 + P_1)/2, the first pink point is q_0 = (m_0 + m_1)/2, etc.

In fact, the splitting of the two curves we want is precisely given by these points. Writing m_0, m_1, m_2 for the midpoints of consecutive control points, q_0, q_1 for the midpoints of consecutive m's, and r_0 = B(1/2) for the midpoint of q_0 and q_1, the left half of the curve is given by the cubic Bezier curve with control points P_0, m_0, q_0, r_0, while the right half has control points r_0, q_1, m_2, P_3.

How can we be completely sure these are the same Bezier curves? Well, they're just polynomials. We can compare them for equality by doing a bunch of messy algebra. But note, since the left half B_1 only travels halfway along B, to check they are the same is to equate B_1(t) with B(t/2), since as t ranges from zero to one, t/2 ranges from zero to one half. Likewise, we can compare the right half B_2(t) with B((t + 1)/2).

The algebra is very messy, but doable. As a test of this blog's newest tools, here's a screencast of me doing the algebra involved in proving the two curves are identical.
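If you'd rather not wade through the algebra, the identity is also easy to check numerically. Below is a self-contained sketch (helper names are my own) that evaluates the left half at time t and the original curve at time t/2, and compares the results:

```javascript
// Evaluate a cubic Bezier with control points P via repeated interpolation.
function evalBezier(P, t) {
  var pts = P;
  while (pts.length > 1) {
    var next = [];
    for (var i = 0; i < pts.length - 1; i++) {
      next.push([(1 - t) * pts[i][0] + t * pts[i + 1][0],
                 (1 - t) * pts[i][1] + t * pts[i + 1][1]]);
    }
    pts = next;
  }
  return pts[0];
}

// Control points of the left half: P0, m0, q0, B(1/2), built from
// repeated midpoints as in the text.
function leftHalf(P) {
  var mid = function (p, q) { return [(p[0] + q[0]) / 2, (p[1] + q[1]) / 2]; };
  var m0 = mid(P[0], P[1]), m1 = mid(P[1], P[2]), m2 = mid(P[2], P[3]);
  var q0 = mid(m0, m1), q1 = mid(m1, m2);
  return [P[0], m0, q0, mid(q0, q1)];
}

// Check that the left half at time t equals the original curve at t/2,
// for a handful of t values (up to floating-point error).
function halvesAgree(P) {
  return [0, 0.25, 0.5, 0.75, 1].every(function (t) {
    var a = evalBezier(leftHalf(P), t);
    var b = evalBezier(P, t / 2);
    return Math.abs(a[0] - b[0]) < 1e-9 && Math.abs(a[1] - b[1]) < 1e-9;
  });
}
```

For any cubic control polygon, halvesAgree returns true.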

Now that that's settled, we have a nice algorithm for splitting a cubic Bezier (or any Bezier) into two pieces. In JavaScript,

function subdivide(curve) {
  var firstMidpoints = midpoints(curve);
  var secondMidpoints = midpoints(firstMidpoints);
  var thirdMidpoints = midpoints(secondMidpoints);

  return [[curve[0], firstMidpoints[0], secondMidpoints[0],
           thirdMidpoints[0]],
          [thirdMidpoints[0], secondMidpoints[1], firstMidpoints[2],
           curve[3]]];
}

Here curve is a list of four points, as described at the beginning of this section, and the output is a list of two curves with the correct control points. The midpoints function used is quite simple, and we include it here for completeness:

function midpoints(pointList) {
  var midpoint = function(p, q) {
    return [(p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0];
  };

  var midpointList = new Array(pointList.length - 1);
  for (var i = 0; i < midpointList.length; i++) {
    midpointList[i] = midpoint(pointList[i], pointList[i+1]);
  }

  return midpointList;
}

It just accepts as input a list of points and computes their sequential midpoints. So a list of n points is turned into a list of n - 1 points. As we saw, we need to call this function three times to compute the subdivision of a cubic Bezier curve.

As explained earlier, we can keep subdividing our curve over and over until each of the tiny pieces is basically a line. That is, our function to draw a Bezier curve from the beginning will be as follows:

function drawCurve(curve, context) {

if (isFlat(curve)) {

drawSegments(curve, context);

} else {

var pieces = subdivide(curve);

drawCurve(pieces[0], context);

drawCurve(pieces[1], context);

}

}

In words, as long as the curve isn't flat, we want to subdivide and draw each piece recursively. If it is flat, then we can simply draw the three line segments of the curve and be reasonably sure that it will be a good approximation. The context variable sitting there represents the canvas to be painted to; it must be passed through to the drawSegments function, which simply paints a straight line to the canvas.

Of course this raises the obvious question: how can we tell if a Bezier curve is flat? There are many ways to do so. One could compute the angles of deviation (from a straight line) at each interior control point and add them up. Or one could compute the area of the enclosed quadrilateral. However, computing angles and areas is usually not very nice: angles take a long time to compute and areas have stability issues, and the algorithms which are stable are not very simple. We want a measurement which requires only basic arithmetic and perhaps a few logical conditions to check.

It turns out there is such a measurement. It's originally attributed to Roger Willcocks, but it's quite simple to derive by hand.

Essentially, we want to measure the flatness of a cubic Bezier curve by computing the distance of the actual curve at time t from where the curve would be at time t if the curve were a straight line.

Formally, given a curve B(t) with control points P_0, P_1, P_2, P_3, we can define the straight-line Bezier cubic as the colossal sum

S(t) = (1 - t)^3 P_0 + 3(1 - t)^2 t ((2/3)P_0 + (1/3)P_3) + 3(1 - t) t^2 ((1/3)P_0 + (2/3)P_3) + t^3 P_3.

There's nothing magical going on here. We're simply giving the Bezier curve with control points P_0, (2/3)P_0 + (1/3)P_3, (1/3)P_0 + (2/3)P_3, P_3. One should think about this as points which are a 0, 1/3, 2/3, and 1 fraction of the way from P_0 to P_3 on a straight line.

Then we define the distance d(t) = |B(t) - S(t)| between the two curves at the same time t. The flatness value of B is the maximum of d over all values of t in [0, 1]. If this flatness value is below a certain tolerance level, then we call the curve flat.

With a bit of algebra we can simplify this expression. First, the value of t for which the distance is maximized is the same as the one for which its square is maximized, so we can omit the square root computation at the end and take that into account when choosing a flatness tolerance.

Now let's actually write out the difference as a single polynomial. First, we can cancel the 3's in S(t) and write the difference as

B(t) - S(t) = (1 - t)^2 t (3P_1 - 2P_0 - P_3) + (1 - t) t^2 (3P_2 - P_0 - 2P_3),

and setting u = 3P_1 - 2P_0 - P_3, v = 3P_2 - P_0 - 2P_3, we get

|B(t) - S(t)|^2 = ((1 - t)t)^2 |(1 - t)u + tv|^2.

Since the maximum of a product is at most the product of the maxima, we can bound the above quantity by the product of the two maxes. The reason we want to do this is that we can easily compute the two maxes separately. It wouldn't be hard to compute the maximum without splitting things up, but this way ends up with fewer computational steps for our final algorithm, and the visual result is equally good.

The maximum of ((1 - t)t)^2 for t in [0, 1] turns out to be 1/16. And the norm of a vector is just the sum of squares of its components. If u = (u_x, u_y) and v = (v_x, v_y), then the norm above is exactly

((1 - t)u_x + t v_x)^2 + ((1 - t)u_y + t v_y)^2

And notice: for any real numbers a, b the quantity (1 - t)a + tb is exactly the straight line from a to b we know so well. The maximum of its square over all t between zero and one is obviously the maximum of the squares of the endpoints, max(a^2, b^2). So the max of our distance function is bounded by

(1/16)(max(u_x^2, v_x^2) + max(u_y^2, v_y^2))

And so our condition for being flat is that this bound is smaller than some allowable

tolerance. We may safely factor the 1/16 into this tolerance bound, and so this is enough to

write a function.

function isFlat(curve) {
  var tol = 10; // anything below 50 is roughly good-looking

  var ax = 3.0*curve[1][0] - 2.0*curve[0][0] - curve[3][0]; ax *= ax;
  var ay = 3.0*curve[1][1] - 2.0*curve[0][1] - curve[3][1]; ay *= ay;
  var bx = 3.0*curve[2][0] - curve[0][0] - 2.0*curve[3][0]; bx *= bx;
  var by = 3.0*curve[2][1] - curve[0][1] - 2.0*curve[3][1]; by *= by;

  return (Math.max(ax, bx) + Math.max(ay, by) <= tol);
}

And there we have it. We write a simple HTML page to access a canvas element and a few

extra helper functions to draw the line segments when the curve is flat enough, and

present the final result in this interactive demonstration (you can perturb the control points).

The picture you see on that page (given below) is my rendition of Picasso's Dog drawing. While we didn't invent the drawing itself (and hence shouldn't attach our signature to it), we did come up with the representation as a sequence of Bezier curves. It only seems fitting to present that as the work of art. Here we've distilled the representation down to a single file: the first line is the dimension of the canvas, and each subsequent line represents a cubic Bezier curve. Comments are included for readability.

"Dog," Jeremy Kun, 2013. Click to enlarge.

Because standardizing things seems important, we define a new filetype .bezier, which

has the format given above:

int int

(int) curve

(int) curve

...

Where the first two ints specify the size of the canvas, the first (optional) int on each line

specifies the width of the stroke, and a curve has the form

If an int is omitted at the beginning of a line, this specifies a width of three pixels.

In a general .bezier file we allow a curve to have arbitrarily many control points, though the

code we gave above does not draw them that generally. As an exercise, write a program

which accepts as input a .bezier file and produces as output an image of the drawing. This

will require an extension of the algorithm above for drawing arbitrary Bezier curves, which

loops its computation of the midpoints and keeps track of which ones end up in the resulting

subdivision. Alternatively, one could write a program which accepts as input a .bezier file

with only cubic Bezier curves, and produces as output an SVG file of the drawing (SVG

only supports cubic Bezier curves). So a .bezier file is a simplification (fewer features) and

an extension (Bezier curves of arbitrary degree) of an SVG file.

We didn't go as deep into the theory of Bezier curves as we could have. If the reader is itching for more (and a more calculus-based approach), see this lengthy primer. It contains

practically everything one could want to know about Bezier curves, with nice interactive

demos written in Processing.

Low-Complexity Art

There are some philosophical implications of what we've done today with Picasso's Dog. Previously on this blog we've investigated the idea of low-complexity art, and it's quite relevant here. The thesis is that beautiful art has a small description length, and more formally the complexity of some object (represented by text) is the length of the shortest program that outputs that object given no inputs. More on that in our primer on Kolmogorov complexity. The fact that we can describe Picasso's line drawings with a small number of Bezier curves (and a relatively short program to output the curves) is supposed to be a deep statement about the beauty of the art itself. Obviously this is very subjective, but not without its proponents.

There has been a bit of recent interest in computers generating art. For instance, this recent programming competition (in Dutch) gave the task of generating art similar to the work of Piet Mondrian. The idea is that the more elegant the algorithm, the higher it would be scored. The winner used MD5 hashes to generate Mondrian pieces, and there were many other impressive examples (the link above has a gallery of submissions).

In our earlier post on low-complexity art, we explored the possibility of representing all images within a coordinate system involving circles with shaded interiors. But it's obvious that such a coordinate system wouldn't be able to represent Dog with very low complexity. It seems that Bezier curves are a much more natural system of coordinates. Some of the advantages include that the length of lines and slight perturbations don't affect the resulting complexity. A cubic Bezier curve can be described by any set of four points, and more intricate (higher-complexity) descriptions of curves require a larger number of points. Bezier curves can be scaled up arbitrarily, and this doesn't significantly change the complexity of the curve (although scaling by many orders of magnitude will introduce a logarithmic-factor complexity increase, this is quite small). Curves with larger stroke are slightly more complex than those with smaller stroke, and representing many small sharp bends requires more curves than long, smooth arcs.

On the downside, it's not so easy to represent a circle as a Bezier curve. In fact, it is impossible to do so exactly. Despite the simplicity of this object (it's even defined by a single polynomial equation, albeit in two variables), the best one can do is approximate it. The same goes for ellipses. There are actually ways to overcome this (the concept of rational Bezier curves, which are quotients of polynomials), but they add to the inherent complexity of the drawing algorithm, and the approximations using regular Bezier curves are good enough.

And so we define the complexity of a drawing to be the number of bits in its .bezier file

representation. Comments are ignored in this calculation.

The real prize, and what well explore next time, is to find a way to generate art

automatically. That is to do one of two things:

1. Given some sort of seed, write a program that produces a pseudo-random line

drawing.

2. Given an image, produce a .bezier file which accurately depicts the image as a line drawing.

We will attempt to explore these possibilities in the follow-up to this post. Depending on

how things go, this may involve some local search algorithms, genetic algorithms, or other

methods.

Until then!

Addendum: want to buy a framed print of the source code for Dog? Head over to our

page on Society6.


This entry was posted in Algorithms, Design and tagged art, bezier curves, de

Casteljau, graphics, javascript, low-complexity art, math, picasso, programming, svg. Bookmark

the permalink.


An ever so slightly simpler yet still robust test for flatness was discussed here and on comp.graphics.algorithms: http://antigrain.com/research/adaptive_bezier/index.html#toc0013

Also of interest, Pyramid Algorithms by Ron Goldman covers Wang's formula (chap. 5.6.3) for determining in advance how many levels of subdivision you need to achieve a specified degree of flatness. Wang's formula is also discussed in: DEC Paris Research Laboratory report #1, May 1989. Clearly this approach will be more conservative than testing each segment.

o j2kun

That's a lot of great info! From what I understand, the metric I presented here is what's used in the PostScript language. Not to say whether that makes it good or not, but at least it's stood the test of time (and engineers).

Pixel I/O (@pixelio)

[ http://kowon.dongseo.ac.kr/~lbg/cagd/history1.pdf ].


2. cpress


3. mbaz

What tools did you use to make the video? It's great!

o j2kun

Sketchbook for drawing with a Wacom Bamboo tablet, and Screenflow to capture it.

mbaz

Thanks for your answer, and congratulations on getting enough out of the blog to afford such awesome tools. Screenflow especially looks great; too bad I don't use Macs.

4. stephanwehner

I tried to find out in which year Picasso made the dog drawing. Do you know (since you

have a copy)? A few years after 1957? See http://en.wikipedia.org/wiki/Lump_(dog)

So my guess is that Bezier curves came after Picasso's lines, namely in the sixties; see http://en.wikipedia.org/wiki/B%C3%A9zier_curve

pages.blogspot.ca/2011/04/picasso-dog-in-one-line.html

Cheers,

Stephan


o j2kun

1942. http://sapergalleries.com/PicassoLeChienDetail.html

The drawing is quite stable. The nature of Bezier curves makes them stable to small perturbations of the control points.

In fact, the original inventor of Bezier curves was Paul de Casteljau, and he published (or made public) his work on Bezier curves in 1959. So it's quite amazing how close together these two ideas are in history.

stephanwehner

Thanks, I saw that page, but couldn't make out that it also related to the simple line drawing. So you think it is not a drawing of Lump, the 1957 dog.

There's another sense of closeness: Picasso living in France during those years, as did, I take it, Bézier and de Casteljau.

Cheers,

5. Frere Loup

It seems you left out the 3 coefficients in the equation of the cubic Bezier curve?


o j2kun


6. Peter Gorgson

Picasso couldn't have used Bezier curves because they hadn't been invented in the 14th century when Picasso was painting.

o j2kun


Fantastic article! It's wonderful to see different disciplines being explored at the same time.

8. Tomas

It took me more time than I wanted, but now anyone can do this in DrRacket:

#lang s-exp (planet tomcoiro/doodle-draw:1:0/lang)

500 500

(180 280 183 268 186 256 189 244)

(191 244 290 244 300 230 339 245)

(340 246 350 290 360 300 355 210)

(353 210 370 207 380 196 375 193)

(375 193 310 220 190 220 164 205)

(164 205 135 194 135 265 153 275)

(153 275 168 275 170 180 150 190)

(149 190 122 214 142 204 85 240)

(86 240 100 247 125 233 140 238)

(show Picassos Dog)


o j2kun


9. Ahmed Hossam

This is amazing! Now I know what to do in order to split a Bezier curve into two pieces! Could you please recommend some clear explanations of B-splines too? Thanks!

Making Hybrid Images

Posted on September 29, 2014 by j2kun

The Mona Lisa

Leonardo da Vinci's Mona Lisa is one of the most famous paintings of all time, and there has always been a discussion around her enigmatic smile. Da Vinci used a trademark Renaissance technique called sfumato, which involves many thin layers of glaze mixed with subtle pigments. The striking result is that when you look directly at Mona Lisa's smile, it seems to disappear. But when you look at the background, your peripheral vision sees a smiling face.

One could spend decades studying the works of these masters from various perspectives, but if we want to hone in on the disappearing nature of that smile, mathematics can provide valuable insights. Indeed, though he may not have known the relationship between his work and da Vinci's, hundreds of years later Salvador Dali did the artist's equivalent of mathematically isolating the problem with his painting, Gala Contemplating the Mediterranean Sea.

Gala Contemplating the Mediterranean Sea (Salvador Dali, 1976)

Here you see a woman in the foreground, but step back quite far from the picture and there is a (more or less) clear image of Abraham Lincoln. Here the question of gaze is the blaring focus of the work. Now of course Dali and da Vinci weren't scribbling down equations and computing integrals; their artistic expression was much less well-defined. But we the artistically challenged have tools of our own: mathematics, science, and programming.

In 2006 Aude Oliva, Antonio Torralba, and Philippe G. Schyns used those tools to merge the distance of Dali and the faded smiles of da Vinci into one cohesive idea. In their 2006 paper they presented the notion of a "hybrid image," presented below.

The Mona Lisas of Science

If you look closely, you'll see three women, each of whom looks the teensiest bit strange, like they might be trying to suppress a smile, but none of them are smiling. Blur your eyes or step back a few meters, and they clearly look happy. The effect is quite dramatic. At the risk of being overly dramatic, these three women are modern-day versions of Mona Lisa: the Mona Lisas of Science, if you will.

Another, perhaps more famous version of their technique, since it was more widely publicized, is their "Marilyn Einstein," which up close is Albert Einstein and from far away is Marilyn Monroe.

Marilyn Einstein

This one gets to the heart of the question of what the eye sees at close range versus long

range. And it turns out that you can address this question (and create brilliant works of art

like the ones above) with some basic Fourier analysis.

The basic idea of Fourier analysis is that smooth functions are hard to understand, and that it would be great if we could decompose them into simpler pieces. Decomposing complex things into simpler parts is one of the main tools in all of mathematics, and Fourier analysis is one of the clearest examples of its application.

In particular, the things we care about are functions f(x) with specific properties I won't detail here, like smoothness and finiteness. And the building blocks are the complex exponential functions

e_k(x) = e^(2 pi i k x),

where k can be any integer. If you have done some linear algebra (and ignore this if you haven't), then I can summarize the idea succinctly by saying the complex exponentials form an orthonormal basis for the vector space of square-integrable functions.

Back in colloquial language, what the Fourier theorem says is that any function of the kind we care about can be broken down into (perhaps infinitely many) pieces of this form, called Fourier coefficients (I'm abusing the word "coefficient" here). The way it's broken down is also pleasingly simple: it's a linear combination. Informally, that means you're just adding up all the complex exponentials with specific weights for each one. Mathematically, the conversion from the function to its Fourier coefficients is called the Fourier transform, and the set of all Fourier coefficients together is called the Fourier spectrum. So if you want to learn about your function f, or more importantly modify it in some way, you can inspect and modify its spectrum instead. The reason this is useful is that Fourier coefficients have very natural interpretations in sound and images, as we'll see for the latter.

We wrote f and the complex exponentials as functions of one real variable, but you can do the same thing for two variables (or a hundred!). And, if you're willing to do some abusing and ignore the complexness of complex numbers, then you can visualize complex exponentials in two variables as images of stripes whose orientation and thickness correspond to two parameters (i.e., the k in the exponent becomes two coefficients).

The video below shows how such complex exponentials can be used to build up an image

of striking detail. The left frame shows which complex exponential is currently being

added, and the right frame shows the layers all put together. I think the result is quite

beautiful.

This just goes to show how powerful da Vinci's idea of fine layering is: it's as powerful as possible, because it can create any image!

Now for digital images like the one above, everything is finite. So rather than have an

infinitely precise function and a corresponding infinite set of Fourier coefficients, you get a

finite list of sampled values (pixels) and a corresponding grid of Fourier

coefficients. But the important and beautiful theorem is, and I want to emphasize how

groundbreakingly important this is:

If you give me an image (or any function!) I can compute the decomposition

very efficiently.

And the same theorem lets you go the other way: if you give me the decomposition, I can compute the original function's samples quite easily. The algorithm to do this is called the Fast Fourier transform, and if any piece of mathematics or computer science has a legitimate claim to changing the world, it's the Fast Fourier transform. It's hard to pinpoint specific applications, because the transform is so ubiquitous across science and engineering, but we definitely would not have cell phones, satellites, the internet, or electronics anywhere near as small as we do without the Fourier transform and the ability to compute it quickly.
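To make the round-trip claim concrete, here is a minimal numpy sketch (not from the original post) that transforms a toy signal and recovers it exactly:

```python
import numpy as np

# A toy signal: a low-frequency wave plus a weaker high-frequency wave.
n = 256
t = np.arange(n)
signal = np.sin(2 * np.pi * 3 * t / n) + 0.5 * np.sin(2 * np.pi * 40 * t / n)

# Forward transform: from 256 samples to 256 Fourier coefficients.
spectrum = np.fft.fft(signal)

# Only the two frequencies we put in (and their conjugate mirrors) carry
# significant weight; every other coefficient is numerically zero.
print(np.flatnonzero(np.abs(spectrum) > 1))  # indices 3, 40, 216, and 253

# Inverse transform: the round trip recovers the original samples.
print(np.allclose(signal, np.fft.ifft(spectrum).real))  # True
```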

Constructing hybrid images is one particularly nice example of manipulating the Fourier spectra of two images, and then combining them back into a single image. That's what we'll do now.

As a side note: by the nature of brevity, the discussion above is a big disservice to the mathematics involved. I summarized and abused in ways that mathematicians would object to. If you want to see a much better treatment of the material, this blog has a long series of posts developing Fourier transforms and their discrete analogues from scratch. See our four primers, which lead into the main content posts where we implement the Fast Fourier transform in Python and use it to apply digital watermarks to an image. Note that in those posts, as in this one, all of the materials and code used are posted on this blog's Github page.

For images, interpreting ranges of Fourier coefficients is easy to do. You can imagine the

coefficients lying on a grid in the plane like so:

Each dot in this grid corresponds to how intense the Fourier coefficient is. That is, it's the magnitude of the (complex) coefficient of the corresponding complex exponential. Now the points that are closer to the origin correspond, informally, to the broad, smooth changes in the image. These are called low frequency coefficients. And points that are further away correspond to sharp changes and edges, and are likewise called high frequency components. So if you wanted to hybridize two images, you'd pick ones with complementary intensities in these regions. That's why Einstein (with all his wiry hair and wrinkles) and Monroe (with smooth features) are such good candidates. That's also why, when we layered the Fourier components one by one in the video from earlier, we saw the fuzzy shapes emerge before the fine details.

Moreover, we can extract the high frequency Fourier components by simply removing the low frequency ones. It's a bit more complicated than that, since you want the transition from "something" to "nothing" to be smooth in some sense. A proper discussion of this would go into sampling and the Nyquist frequency, but that's beyond the scope of this post. Rather, we'll just define a family of filtering functions without motivation and observe that they work well.

Definition: The Gaussian filter function with variance σ and center c is the function

g(x) = e^(−‖x − c‖² / (2σ²))

In particular, at the center the function is 1, and it gradually drops to zero as you get farther away. The parameter σ controls the rate at which it vanishes, and in the picture above the center is set to c = 0.

Now what we'll do is take our image, compute its spectrum, and multiply coordinatewise with a certain Gaussian function. If we're trying to get rid of high-frequency components (called a low-pass filter, because it lets the low frequencies through), we can just multiply the Fourier coefficients directly by the Gaussian filter values, and if we're doing a high-pass filter we multiply by one minus the filter values.

Before we get to the code, here's an example of a low-pass filter. First, take this image of Marilyn Monroe, compute its spectrum, and multiply the spectrum coordinatewise by a low-pass Gaussian filter. Then reverse the Fourier transform to get an image:

In fact, this is a common operation in programs like Photoshop for blurring an image (it's called a Gaussian blur, for obvious reasons). Here's the Python code to do this. You can download it, along with all of the other resources used in making this post, on this blog's Github page.

import numpy
from numpy.fft import fft2, ifft2, fftshift, ifftshift
from scipy import misc
from scipy import ndimage
import math

def makeGaussianFilter(numRows, numCols, sigma, highPass=True):
   # Center of the frequency grid (after fftshift).
   centerI = int(numRows/2) + 1 if numRows % 2 == 1 else int(numRows/2)
   centerJ = int(numCols/2) + 1 if numCols % 2 == 1 else int(numCols/2)

   def gaussian(i,j):
      coefficient = math.exp(-1.0 * ((i - centerI)**2 + (j - centerJ)**2) /
                             (2 * sigma**2))
      return 1 - coefficient if highPass else coefficient

   return numpy.array([[gaussian(i,j) for j in range(numCols)] for i in
                       range(numRows)])

def filterDFT(imageMatrix, filterMatrix):
   # Shift the DFT so the low frequencies sit at the center, multiply
   # coordinatewise by the filter, then undo the shift and invert.
   shiftedDFT = fftshift(fft2(imageMatrix))
   filteredDFT = shiftedDFT * filterMatrix
   return ifft2(ifftshift(filteredDFT))

def lowPass(imageMatrix, sigma):
   n,m = imageMatrix.shape
   return filterDFT(imageMatrix, makeGaussianFilter(n, m, sigma,
                    highPass=False))

def highPass(imageMatrix, sigma):
   n,m = imageMatrix.shape
   return filterDFT(imageMatrix, makeGaussianFilter(n, m, sigma,
                    highPass=True))

if __name__ == "__main__":
   marilyn = ndimage.imread("marilyn.png", flatten=True)
   lowPassedMarilyn = lowPass(marilyn, 20)
   misc.imsave("low-passed-marilyn.png", numpy.real(lowPassedMarilyn))

The first function, makeGaussianFilter, samples values from a Gaussian function with the specified parameters, discretizing the function and storing the values in a matrix. Then the filterDFT function applies the filter by doing coordinatewise multiplication (note these are all numpy arrays). We can do the same thing with a high-pass filter, producing the edgy image below:

And if we compute the average of these two images, we basically get back to the original.
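That reconstruction is no accident: a low-pass Gaussian filter g and its high-pass complement 1 − g sum pointwise to 1, so the two filtered pieces sum back to the original up to floating-point error. A self-contained numpy sketch (with hypothetical helper names, not the post's code) verifies this on a random stand-in image:

```python
import numpy as np

def gaussian_filter(shape, sigma, high_pass):
    # Discretize a Gaussian centered on the (shifted) frequency grid.
    rows, cols = shape
    ci, cj = rows // 2, cols // 2
    i, j = np.ogrid[:rows, :cols]
    g = np.exp(-((i - ci) ** 2 + (j - cj) ** 2) / (2 * sigma ** 2))
    return 1 - g if high_pass else g

def filter_image(image, filt):
    # Multiply the shifted spectrum coordinatewise and invert.
    shifted = np.fft.fftshift(np.fft.fft2(image))
    return np.fft.ifft2(np.fft.ifftshift(shifted * filt)).real

image = np.random.rand(64, 64)  # stand-in for a grayscale photo
low = filter_image(image, gaussian_filter(image.shape, 10, high_pass=False))
high = filter_image(image, gaussian_filter(image.shape, 10, high_pass=True))

# The two filters sum to the all-ones filter, so the pieces sum to the image.
print(np.allclose(low + high, image))  # True
```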

So the only difference between this and a hybrid image is that you take the low-passed part

of one image and the high-passed part of another. Then the art is in balancing the

parameters so as to make the averaged image look right. Indeed, with the following picture

of Einstein and the above shot of Monroe, we can get a pretty good recreation of the Oliva-

Torralba-Schyns piece. I think with more tinkering it could be even better (I did barely any

centering/aligning/resizing to the original images).

highPassed = highPass(highFreqImg, sigmaHigh)
lowPassed = lowPass(lowFreqImg, sigmaLow)
hybrid = highPassed + lowPassed

Interestingly enough, doing it in reverse doesn't give quite as pleasing results, but it still technically works. So it's particularly important that the high-passed image actually have a lot of high-frequency components, and vice versa for the low-passed one.

You can see some of the other hybrid images Oliva et al. constructed over at their web gallery.

Next Steps

How can we take this idea further? There are a few avenues I can think of. The most

obvious one would be to see how this extends to video. Could one come up with

generic parameters so that when two videos are hybridized (frame by frame, using this

technique) it is only easy to see one at close distance? Or else, could we apply a three-

dimensional transform to a video and modify that in some principled way? I think one

would not likely find anything astounding, but who knows?

Second would be to look at the many other transforms we have at our disposal. How

does manipulating the spectra of these transforms affect the original image, and can you

make images that are hybridized in senses other than this one?

And finally, can we bring this idea down in dimension to work with one-dimensional

signals? In particular, can we hybridize music? It could usher in a new generation of mashup songs that sound different depending on whether you wear earmuffs.


This entry was posted in Design, Linear Algebra and tagged albert einstein, art, design, fourier

analysis, hybrid images, image manipulation, marilyn monroe, mathematics, mona

lisa, programming, python, salvador dali, signal processing. Bookmark the permalink.


1. Jonathan

In sound, this is an awful lot like what a vocoder does, when it's used in music. The low-frequency envelope is the performer's voice; the high-frequency signal comes from the instrument.


o Flo Vouin

@Jonathan: In a vocoder, the spectrum of one signal is used as a filter to alter the other signal, so it's slightly different. Mixing the low frequencies of one song with the high frequencies of another sounds more like what a DJ does when transitioning between two songs.

A slightly more complex image processing technique, but one which is still a lot of fun, is Poisson editing: http://www.cs.jhu.edu/~misha/Fall07/Papers/Perez03.pdf


2. Helder

Have you published the code used to reconstruct the image in the video?

If so, where?

Thank you.


o j2kun

post: https://jeremykun.com/2013/12/30/the-two-dimensional-fourier-transform-

and-digital-watermarking/



4. Umair Jameel

Just finished my UWP windows 10 hybrid image illusion app. You combine two

images and from the combined image, you see first image when seen from some

distance and see second image at a closer look. Have a look at it.

https://www.microsoft.com/en-us/store/apps/imagine-pic/9nblggh4x04c

Markov Chain Monte Carlo Without all the

Bullshit

Posted on April 6, 2015 by j2kun

I have a little secret: I don't like the terminology, notation, and style of writing in statistics. I find it unnecessarily complicated. This shows up when trying to read about Markov Chain Monte Carlo methods. Take, for example, the abstract to the Markov Chain Monte Carlo article in the Encyclopedia of Biostatistics.

Markov chain Monte Carlo (MCMC) is a technique for estimating by simulation the

expectation of a statistic in a complex model. Successive random selections form a Markov

chain, the stationary distribution of which is the target distribution. It is particularly useful

for the evaluation of posterior distributions in complex Bayesian models. In the Metropolis-Hastings algorithm, items are selected from an arbitrary proposal distribution and are

retained or not according to an acceptance rule. The Gibbs sampler is a special case in

which the proposal distributions are conditional distributions of single components of a

vector parameter. Various special cases and applications are considered.

I can only vaguely understand what the author is saying here (and really only because I know ahead of time what MCMC is). There are certainly references to more advanced things than what I'm going to cover in this post. But it seems very difficult to find an explanation of Markov Chain Monte Carlo without superfluous jargon. The "bullshit" here is the implicit claim of an author that such jargon is needed. Maybe it is to explain advanced applications (like attempts to do inference in Bayesian networks), but it is certainly not needed to define or analyze the basic ideas.

So to counter, here's my own explanation of Markov Chain Monte Carlo, inspired by the treatment of John Hopcroft and Ravi Kannan.

Markov Chain Monte Carlo is a technique to solve the problem of sampling from a complicated distribution. Let me explain by the following imaginary scenario. Say I have a magic box which can estimate probabilities of baby names very well. I can give it a string like "Malcolm" and it will tell me the exact probability p(x) that you will choose this name for your next child. So there's a distribution D over all names, it's very specific to your preferences, and for the sake of argument say this distribution is fixed and you don't get to tamper with it.

Now comes the problem: I want to efficiently draw a name from this distribution D. This is the problem that Markov Chain Monte Carlo aims to solve. Why is it a problem? Because I have no idea what process you use to pick a name, so I can't simulate that process myself.

Here's another method you could try: generate a name x uniformly at random, ask the machine for p(x), and then flip a biased coin with probability p(x) and use x if the coin lands heads. The problem with this is that there are exponentially many names! The variable n here is the number of bits needed to write down a name, so there are roughly 2^n candidate names. So either the probabilities p(x) will be exponentially small and I'll be flipping for a very long time to get a single name, or else there will only be a few names with nonzero probability and it will take me exponentially many draws to find them. Inefficiency is the death of me.

So this is a serious problem! Let's restate it formally just to be clear.

Definition (The sampling problem): Let D be a distribution over a finite set X. You are given black-box access to the probability distribution function p(x) which outputs the probability of drawing x ∈ X according to D. Design an efficient randomized algorithm A which outputs an element of X so that the probability of outputting x is approximately p(x). More generally, output a sample of elements from X drawn according to p(x).

Assume that A has access only to fair random coins, though this allows one to efficiently simulate flipping a biased coin of any desired probability.
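The uniform-proposal-plus-biased-coin scheme described above (a form of rejection sampling) is easy to code. Here is a minimal sketch with a made-up four-name distribution, where it works fine precisely because the universe is tiny:

```python
import random

# A made-up "black box": the probability of each name in a tiny universe.
universe = ["ada", "bob", "eve", "mal"]
weights = {"ada": 0.4, "bob": 0.3, "eve": 0.2, "mal": 0.1}

def rejection_sample():
    # Propose a name uniformly at random; accept it with probability p(x).
    while True:
        x = random.choice(universe)
        if random.random() < weights[x]:
            return x

random.seed(0)
counts = {name: 0 for name in universe}
for _ in range(10000):
    counts[rejection_sample()] += 1
print(counts)  # frequencies roughly proportional to the weights

# With four names the acceptance rate is fine; with 2^n names, nearly every
# proposal has vanishingly small probability and the loop rarely terminates.
```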

Notice that with such an algorithm we'd be able to do things like estimate the expected value of some random variable f : X → R. We could take a large sample S ⊆ X via the solution to the sampling problem, and then compute the average value of f on that sample. This is what a Monte Carlo method does when sampling is easy. In fact, the Markov Chain solution to the sampling problem will allow us to do the sampling and the estimation of E[f] in one fell swoop, if you want.

But the core problem is really a sampling problem, and "Markov Chain Monte Carlo" would be more accurately called the "Markov Chain Sampling Method." So let's see why a Markov Chain could possibly help us.

A Markov chain is essentially a fancy term for a random walk on a graph.

You give me a directed graph G = (V, E), and for each edge e = (i, j) you give me a number p_{i,j}. In order to make a random walk make sense, the p_{i,j} need to satisfy the following constraint:

For any vertex i, the set of all values p_{i,j} on outgoing edges (i, j) must sum to 1, i.e. form a probability distribution.

If this is satisfied then we can take a random walk on G according to the probabilities as follows: start at some vertex i_0. Then pick an outgoing edge at random according to the probabilities on the outgoing edges, and follow it to the next vertex i_1. Repeat if possible.

I say "if possible" because an arbitrary graph will not necessarily have any outgoing edges from a given vertex. We'll need to impose some additional conditions on the graph in order to apply random walks to Markov Chain Monte Carlo, but in any case the idea of randomly walking is well-defined, and we call the whole object a Markov chain.

Here is an example where the vertices in the graph correspond to emotional states.

An example Markov chain; image source http://www.mathcs.emory.edu/~cheung/

In statistics land, they take the state interpretation of a random walk very seriously. They

call the edge probabilities state-to-state transitions.

The main theorem we need to do anything useful with Markov chains is the stationary

distribution theorem (sometimes called the Fundamental Theorem of Markov Chains, and

for good reason). What it says intuitively is that for a very long random walk, the

probability that you end at some vertex is independent of where you started! All of these

probabilities taken together is called the stationary distribution of the random walk, and it is

uniquely determined by the Markov chain.

However, for the reasons we stated above (if possible), the stationary distribution theorem

is not true of every Markov chain. The main property we need is that the

graph is strongly connected. Recall that a directed graph is called connected if, when you

ignore direction, there is a path from every vertex to every other vertex. It is called strongly

connected if you still get paths everywhere when considering direction. If we additionally

require the stupid edge-case-catcher that no edge can have zero probability, then strong

connectivity (of one component of a graph) is equivalent to the following property:

For every vertex v, an infinite random walk started at v will return to v with probability 1.

In fact it will return infinitely often. This property is called the persistence of the state by

statisticians. I dislike this term because it appears to describe a property of a vertex, when to

me it describes a property of the connected component containing that vertex. In any case,

since in Markov Chain Monte Carlo we'll be picking the graph to walk on (spoiler!) we will

ensure the graph is strongly connected by design.

Finally, in order to describe the stationary distribution in a more familiar manner (using linear algebra), we will write the transition probabilities as a matrix A, where entry A_{j,i} = p_{i,j} if there is an edge (i, j), and zero otherwise. Here the rows and columns correspond to vertices of G, and each column i forms the probability distribution of going from state i to some other state in one step of the random walk. Note A is the transpose of the weighted adjacency matrix of the directed weighted graph G where the weights are the transition probabilities (the reason I do it this way is because matrix-vector multiplication will have the matrix on the left instead of the right; see below).

This matrix allows me to describe things nicely using the language of linear algebra. In particular, if you give me a basis vector e_i, interpreted as "the random walk is currently at vertex i," then A e_i gives a vector whose j-th coordinate is the probability that the random walk would be at vertex j after one more step in the random walk. Likewise, if you give me a probability distribution q over the vertices, then A q gives a probability vector interpreted as follows:

If a random walk is in state i with probability q_i, then the j-th entry of A q is the probability that after one more step in the random walk you get to vertex j.

Interpreted this way, the stationary distribution is a probability vector π with A π = π; in other words, π is an eigenvector of A with eigenvalue 1.

A quick side note for avid readers of this blog: this analysis of a random walk is exactly

what we did back in the early days of this blog when we studied the PageRank algorithm for

ranking webpages. There we called the matrix A a "web matrix," did random walks on it, and found a special eigenvalue whose eigenvector was a stationary distribution that we used to rank web pages (this used something called the Perron-Frobenius theorem, which says a random-walk matrix has that special eigenvector). There we described an algorithm to actually find that eigenvector by iteratively multiplying by A. The following theorem is essentially a variant of this algorithm, but works under weaker conditions; for the web matrix we added additional fake edges that give the needed stronger conditions.

Theorem: Let G be a strongly connected graph with associated edge probabilities {p_{i,j}} forming a Markov chain, and let A be the corresponding transition matrix. For a probability vector x_0, define x_{t+1} = A x_t for all t ≥ 0, and let v_t be the long-term average v_t = (1/t)(x_1 + x_2 + … + x_t). Then:

1. There is a unique probability vector π with A π = π.
2. For all x_0, the limit of v_t as t → ∞ is π.

To see why the limit is a stationary vector, expand the quantity

A v_t − v_t = (1/t)(x_{t+1} − x_1) = (1/t)(A^{t+1} x_0 − A x_0).

Since the entries of probability vectors are bounded, the right-hand side tends to zero as t grows. Now it's clear that this does not depend on x_0. For uniqueness we will cop out and appeal to the Perron-Frobenius theorem, which says any matrix of this form has a unique such (normalized) eigenvector.
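The theorem is easy to check numerically. Below is a small sketch with a made-up 3-state transition matrix (columns summing to 1, as in the text) that computes the long-term average of the x_t and checks that it is a fixed point of A:

```python
import numpy as np

# A made-up column-stochastic transition matrix: entry A[j, i] is the
# probability of stepping from vertex i to vertex j (each column sums to 1).
A = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])

x = np.array([1.0, 0.0, 0.0])   # start the walk at vertex 0
running_sum = np.zeros(3)
T = 5000
for _ in range(T):
    running_sum += x
    x = A @ x
pi = running_sum / T            # the long-term average of x_0, ..., x_{T-1}

# pi is (approximately) the unique eigenvector of A with eigenvalue 1.
print(np.allclose(A @ pi, pi, atol=1e-3))  # True
print(abs(pi.sum() - 1) < 1e-9)            # True: still a probability vector
```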

One additional remark is that, in addition to computing the stationary distribution by

actually computing this average or using an eigensolver, one can analytically solve for it as

the inverse of a particular matrix. Define B = A − I, where I is the identity matrix, and let B' be B with a row of ones appended to the bottom and the topmost row removed. Then one can show (quite opaquely) that the last column of B'^(−1) is π. We leave this as an exercise to the reader, because I'm pretty sure nobody uses this method in practice.

One final remark is about why we need to take an average over all the x_t in the theorem above. There is an extra technical condition one can add to strong connectivity, called aperiodicity, which allows one to beef up the theorem so that x_t itself converges to the stationary distribution. Rigorously, aperiodicity is the property that, regardless of where you start your random walk, after some sufficiently large number of steps the random walk has a positive probability of being at every vertex at every subsequent step. As an example of a graph where aperiodicity fails: an undirected cycle on an even number of vertices. In that case there will only be a positive probability of being at certain vertices every other step, and averaging those two long-term sequences gives the actual stationary distribution.
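The even-cycle example can be seen numerically. In this sketch on a 4-cycle, x_t oscillates between the two parity classes forever, while the average of two consecutive steps is the uniform stationary distribution:

```python
import numpy as np

# A 4-cycle: from each vertex, step to either neighbor with probability 1/2.
A = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])

x = np.array([1.0, 0.0, 0.0, 0.0])  # start at vertex 0
for _ in range(1000):                # an even number of steps
    x = A @ x

# After any large even number of steps, all mass sits on vertices 0 and 2;
# one more step moves it all to vertices 1 and 3.  No convergence.
print(x)       # mass 0.5 on each of vertices 0 and 2
print(A @ x)   # mass 0.5 on each of vertices 1 and 3

# Averaging two consecutive steps gives the true stationary distribution.
print((x + A @ x) / 2)  # uniform: 0.25 everywhere
```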

One way to guarantee that your Markov chain is aperiodic is to ensure there is a positive probability of staying at any vertex, i.e., that every vertex of your graph has a self-loop. This is what we'll do in the next section.

Recall that the problem we're trying to solve is to draw from a distribution D over a finite set X with probability function p(x). The MCMC method is to construct a Markov chain whose stationary distribution is exactly p, even when you just have black-box access to evaluating p(x). That is, you (implicitly) pick a graph G and (implicitly) choose transition probabilities for the edges to make the stationary distribution p. Then you take a long enough random walk on G and output the x corresponding to whatever state you land on.

The easy part is coming up with a graph that has the right stationary distribution (in fact, "most" graphs will work). The hard part is to come up with a graph where you can prove that the convergence of a random walk to the stationary distribution is fast in comparison to the size of X. Such a proof is beyond the scope of this post, but the "right" choice of a graph is not hard to understand.

The one we'll pick for this post is called the Metropolis-Hastings algorithm. The input is your black-box access to p(x), and the output is a set of rules that implicitly define a random walk on a graph whose vertex set is X.

It works as follows: you pick some way to put X on a lattice, so that each state corresponds to some vector in {0, 1, …, n}^d. Then you add (two-way directed) edges to all neighboring lattice points. For d = 2 it would look like this:

You have to be careful here to ensure the vertices you choose for X are not disconnected, but in many applications X is naturally already a lattice.

Now we have to describe the transition probabilities. Let r be the maximum degree of a vertex in this lattice (r = 2d). Suppose we're at vertex i and we want to know where to go next. We do the following:

1. Pick a neighbor j with probability 1/r (there is some chance to stay at i).
2. If you picked neighbor j and p(j) ≥ p(i), then deterministically go to j.
3. Otherwise, p(j) < p(i), and you go to j with probability p(j)/p(i) (otherwise you stay put at i).

It is easy to check that this is indeed a probability distribution for each vertex i. So we just have to show that p is the stationary distribution for this random walk.

Here's a fact to do that: if a probability distribution v with entries v(i) for each i has the property that v(i) p_{i,j} = v(j) p_{j,i} for all pairs of vertices i, j, then v is the stationary distribution. To prove it, fix i and take the sum of both sides of that equation over all j. The result is exactly the equation v(i) = Σ_j v(j) p_{j,i}, which is the same as v = A v. Since the stationary distribution is the unique vector satisfying this equation, v has to be it.

Doing this with our chosen p(i) is easy, since p(i) p_{i,j} and p(j) p_{j,i} are both equal to (1/r) min(p(i), p(j)), by applying a tiny bit of algebra to the definition. So we're done! One can just randomly walk according to these probabilities and get a sample.
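Putting the three rules together, here is a minimal sketch of the walk on a one-dimensional lattice (d = 1, so r = 2) with a made-up unnormalized target; after many steps, the visit frequencies approximate p:

```python
import random

# A made-up unnormalized target distribution on the lattice {0, ..., 9}.
# Only the ratios p(j)/p(i) are ever needed, so normalization is irrelevant.
weights = [1, 2, 4, 8, 16, 16, 8, 4, 2, 1]
n = len(weights)

def metropolis_step(i):
    # Rule 1: pick one of the r = 2 lattice directions uniformly.  A proposal
    # that falls off the lattice is rejected, i.e. we stay put (a self-loop).
    j = i + random.choice([-1, 1])
    if not 0 <= j < n:
        return i
    # Rules 2 and 3: go to j if p(j) >= p(i), else go with probability p(j)/p(i).
    if random.random() < min(1.0, weights[j] / weights[i]):
        return j
    return i

random.seed(1)
steps = 200000
counts = [0] * n
state = 0
for _ in range(steps):
    state = metropolis_step(state)
    counts[state] += 1

# Compare empirical visit frequencies against the normalized target.
total = sum(weights)
for i in range(n):
    print(i, round(counts[i] / steps, 3), round(weights[i] / total, 3))
```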

Last words

The last thing I want to say about MCMC is to show that you can estimate the expected value of a function f simultaneously while random-walking through your Metropolis-Hastings graph (or any graph whose stationary distribution is p). By definition, the expected value of f is Σ_x f(x) p(x).

Now what we can do is compute the average value of f just among those states we've visited during our random walk. With a little bit of extra work you can show that this quantity will converge to the true expected value of f at about the same time that the random walk converges to the stationary distribution. (Here the "about" means we're off by a constant factor depending on f.) In order to prove this you need some extra tools I'm too lazy to write about in this post, but the point is that it works.

The reason I did not start by describing MCMC in terms of estimating the expected value of

a function is because the core problem is a sampling problem. Moreover, there are many

applications of MCMC that need nothing more than a sample. For example, MCMC can be

used to estimate the volume of an arbitrary (maybe high dimensional) convex set. See these

lecture notes of Alistair Sinclair for more.

If demand is popular enough, I could implement the Metropolis-Hastings algorithm in code (it wouldn't be industry-strength, but perhaps illuminating? I'm not so sure).


This entry was posted in Algorithms, Graph Theory, Linear Algebra, Probability Theory and

tagged markov chain, mathematics, MCMC, monte carlo, random walk. Bookmark the permalink.


1. Ben Buckley

It's not immediately obvious to me how this helps with our baby name blackbox. I assume I'm missing something important.

My understanding is that, in the graph, each state would correspond to some name, where n = 26 (letters in the alphabet) and d = 7 (just to keep things simple) so that MALCOLM is one of the states. Won't the state's neighbours be crazy strings like JALCOLM and MALCZLM, for which the blackbox should return zero, and p(j)/p(i) is always zero? So, if I do a walk on the graph, how am I supposed to leave the state MALCOLM?


o j2kun

dependent issue, and in particular for names there is nothing stopping someone from making up a name like Jalcom. So the issue is finding a way to map names to grid vertices in a sensible way. I don't know of a simple way to do that off the top of my head without a given enumeration of all legal names.


o paulie

Thank you and God bless you for the inspiration we named our sweet baby boy

Jalcolm


2. ZL

code yes please


o Hugle (@wulong3)

python..http://python4mpia.github.io/fitting_data/Metropolis-Hastings.html


3. gt

Strictly speaking your theorem also requires the state space to be finite: a simple M/M/1

exploding queue will serve as a counter example. Having said that, your original

motivation was MCMC on a finite state set X, so perhaps this is implicit.


4. Amnon Harel

That's quite a straw man, in the introduction. Not only full of jargon, but after two badly

not belong in an introduction, without spelling out, e.g. what is a Monte Carlo?. This web

page gives a very nice introduction to Markov Chain sampling. But the title is Markov

Chain Monte Carlo, and all the basic concepts of the Monte Carlo method are missing. To

be sure, they are readily available elsewhere:

http://en.wikipedia.org/wiki/Monte_Carlo_method

Still, I would get rid of expectation values. Integration is more accurate, basic, general,

and communicative to everyone who went through a calculus course and encountered the

fact that sometimes integrals are hard.


o j2kun

I'm not sure what to say. Any search for "Markov chain sampling method" gives

you results for MCMC, or a scientific paper about dirichlet distributions. And the

core of any Monte Carlo method is sampling, regardless of whether you use the

sample to estimate an integral.


5. Richard

One small comment, though: In your definition of the sampling problem you use f both as a

probability density function and as a random variable, and that was a little confusing. It

would be very helpful (at least for me) if you used different symbols here (assuming they

are meant to be different?).


6. Josh

Nice post! And I've found many of your other primers very helpful as well.

Quick question: you wrote that a Markov chain is essentially a random walk on a graph. In many important situations, however, we define Markov chains on continuous state spaces, and I'm not sure I see how that fits into the framework you described. Can Markov chains on continuous state spaces be interpreted as random walks on (implicit) graphs?

Also, a perhaps clearer introduction to MCMC than the one you cited is in chapter 29 of

David MacKays book: http://www.inference.phy.cam.ac.uk/itprnn/book.html


o j2kun

You can define graphs on continuous state spaces just fine, and just as you would

for a usual Markov chain you can define the transitions implicitly and talk about

densities as integrals, etc.


o j2kun

And yes that text does appear to have a great treatment of the subject.


7. Ian Mallett

One minor clarification: you go to j with probability p_j / p_i. |-> you go to j with

probability p(j) / p(i).


o j2kun

Fixed, thanks!


8. Tyson Williams

Nice post. I completely agree that most explanations of MCMC are too jargon dense. Your

treatment here is great.

Given that your opening example was picking baby names, I was anxiously looking forward to how you were going to define that two names are adjacent in the state graph. I became


o j2kun

For what it's worth, I'm pretty sure the set of baby names is sparse in the set of all strings, but yeah, it's a cop-out.


9. Andreas Eckleder

I think from your description it is not immediately clear that your black box does not have a

finite vocabulary of names but the name is really an arbitrary string. I think it would make

understanding this great article a lot easier if you explicitly mentioned that at the beginning.


o j2kun

It is finite.


10. Nick

A bit of a thought about the statement that a graph being strongly connected is equivalent to

For every vertex v \in V(G), an infinite random walk started at v will return to v with

probability 1.

I can see how the former implies the latter, but without also requiring at least connectedness already, the latter does not seem to imply the former. Consider a graph with more than one

vertex, where each vertex has exactly one edge which connects to itself with probability 1.

It definitely satisfies the latter property, but is also not strongly connected, or even

connected at all.

The property for every pair of vertices u,v \in V(G), and infinite random walk started at u

will pass through v with probability 1. would imply strong connectedness, and I think,

though I havent worked out the proof, that strong connectedness and all edge probabilities

positive would imply it.

Am I going horribly wrong here? Is equivalent only being used as a one directional

implication rather than as an iff?

Like

o j2kun

You're right, I was being unclear. The confusion is because what the statistics community calls "persistent" (which is the definition you quoted) really means "the connected component containing a vertex is strongly connected." It's sort of silly, because for any Markov chain you assume you're working with a connected graph (in which case persistent means the whole graph is strongly connected), because to analyze a graph which is a union of connected components you just analyze the connected components one by one. I have updated the text to reflect this.

11. Lorand

It is not clear to me how to choose the transition probabilities from one state (name) to another state.

Also, is it important which neighbors a state has, or can I just randomly assign names to vertices in the lattice (let's say I have 100 names and assign them randomly to a 10×10 lattice)?

12. isomorphismes

o isomorphismes

13. Evan

In the last paragraph of the "Constructing a Graph to Walk On" section, I think there's a small error in this sentence:

"Doing this with our chosen p(i) is easy, since p(i)p_{i,j} and p(i)p_{j,i} are both equal to \frac{1}{r} \min(p(i), p(j)) by applying a tiny bit of algebra to the definition."

I think "p(i)p_{i,j} and p(i)p_{j,i}" is meant to be "p(i)p_{i,j} and p(j)p_{j,i}" (note the i replaced by j as the argument to the second p()).

14. asmageddon

> mathematical notation everywhere

No thanks.


15. Samchappelle


16. Tann

January 12, 2016 at 3:37 am

Glad I didn't get the version *with* all the bullshit. This is hard enough.

17. Navaneethan

I had a question about MCMC that I'm finding hard to answer. If the idea is to sample from a complicated distribution, how do you know that you've produced a representative sample? Is there a property of MCMC that ensures that the sample is representative? If not, isn't that a huge weakness of this framework?

o j2kun

Yes, in fact that is the entire point, and I gave a mathematical proof that it works in the post.

18. mariusagm


19. thweealc


Thank you, I needed help to understand what MCMC is, after listening to Ed Vul's talk about cognitive biases and trying to model how reasoning works and seeing what biases it explains. I thought it was a very interesting talk; it's here: https://www.youtube.com/watch?v=eSq_80TfUO0 I plan to read more of your blog posts. Thanks very much.

"Doing this with our chosen p(i) is easy, since p(i)p_{i,j} and p(i)p_{j,i} are both equal to \frac{1}{r} \min(p(i), p(j))"

Don't you mean p(i)p_{i,j} and p(j)p_{j,i}? Or am I misunderstanding something?

22. Richard

It is more general than this, right? The black box is some constant k times p(x)? So you only need to know the proportions of the probabilities via the black box.

23. compostbox

However, there is one thing not obvious to me. Why is the lattice thing necessary at all in Metropolis-Hastings? Imagine a fully connected graph where you can move to any state from any state. It seems to me everything would still work. Put another way, it's just that r would now equal the size of X minus 1, and notice that in your proof the exact value of r does not matter.

o j2kun

This is correct; however, often the size of X is exponentially large, and so a complete graph will not be tractable.

For example, suppose your state space is the integer grid \{1, \dots, n\}^d. You may want your algorithm to run in time polynomial in n and d, but there are n^d states. This is why I brought up the example of names, since it is also an exponentially large state space.

24. allenhw

I first wanted to thank you; great job getting ideas across clearly. It was super helpful!

I have one question: what's the advantage of MCMC over simpler algorithms using an RNG? For example, you can simulate a die roll by dividing [0,1] into 6 subintervals, generating a random number x in [0,1], and outputting a result based on which subinterval x falls into. This can be done as long as we have p(x) for all x in X.

o j2kun

MCMC is used when the number of possible outputs is exponentially large. To see this, imagine your proposed die had 2^50 sides, each of which had a slightly different probability of occurring (and there is no discernible pattern or formula to tell you p(x) for a given x; you just have to ask the black box p(x) to give you a value). How does your algorithm break down in this scenario? How long does it take to simulate a single die roll on average?
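To make the 2^50-sided die concrete, here is a minimal sketch (my own, not code from the post) of the Metropolis rule on an exponentially large state space: states are n-bit integers, neighbors differ in one bit, and the black box is only ever queried at the current and proposed states, so no step does anything close to 2^n work. The names (`metropolis_walk`, the toy weight function) are illustrative assumptions.

```python
import random

def metropolis_walk(p, n_bits, steps, seed=0):
    """Random walk whose stationary distribution is proportional to p.

    p: black-box nonnegative weight function; it need not be normalized,
    since only the ratio p(j) / p(x) is used (Richard's point above).
    States are the integers in [0, 2**n_bits), and neighbors differ in
    exactly one bit, so the walk never enumerates the state space.
    """
    rng = random.Random(seed)
    x = rng.randrange(2 ** n_bits)  # arbitrary starting state
    for _ in range(steps):
        j = x ^ (1 << rng.randrange(n_bits))  # propose: flip one random bit
        # Accept with probability min(1, p(j) / p(x)); otherwise stay put.
        if p(j) >= p(x) or rng.random() < p(j) / p(x):
            x = j
    return x

# Toy black box on a 2**10-sided "die": weight grows with the number of 1-bits.
p = lambda x: 1 + bin(x).count("1")
sample = metropolis_walk(p, n_bits=10, steps=10_000)
```

Because only ratios of p enter the accept step, rescaling the black box by any constant k leaves the walk unchanged.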

Thanks for writing this! Do you mind explaining why x_t and x_0 are unit vectors in your proof of the eigenvector theorem? I was under the impression that each x vector is a probability vector, which means it would sum to 1 but not necessarily have length 1. Any help/clarification is appreciated, thanks!

26. Matt

Thanks for writing this. It was very helpful for understanding the basic idea. Unfortunately, I still have some trouble understanding the last step (Constructing a graph to walk on). Could you maybe explain, with a simple example, how you build up the correct graph?

Let us assume X = {A, B, C} with p(A) = 2/10, p(B) = 5/10, p(C) = 3/10.

This means our stationary distribution should be p = [2/10, 5/10, 3/10] (correct me if I am wrong).

What would in this case be the n and d you used to build the lattice? What would the lattice look like?

o j2kun

In your example, the number of nodes in the graph is only 3, which is so small that any connected graph should work, if I'm not mistaken.

In general, there is no "correct" answer. You want a graph which is sparse but also has high connectivity, meaning you need to remove many edges in order to disconnect the graph. Graphs that contain only one or two shortest paths between two nodes would have bad mixing. Grid graphs tend to do well, but I don't know how to pick the parameters in a principled way. There is also a theory of expander graphs that is closely related, and you may want to look up that literature.

Matt

Maybe I am a little bit more confused than I thought. Have I understood the procedure correctly?

p = [2/10, 5/10, 3/10]

This is the same as saying: we need to find a matrix A with eigenvalue 1 and eigenvector p.

If we have this, we could walk on the graph represented by matrix A. If we do this for a long time, the vertex we end on can be used as a generator for random variables with probability distribution p.

Is this correct?
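Matt's example above can be checked numerically. The sketch below (my own, not code from the post) builds the Metropolis transition matrix on the complete graph over {A, B, C} and power-iterates: the walk's distribution converges to p = [2/10, 5/10, 3/10], i.e. p is an eigenvector of the transition matrix with eigenvalue 1.

```python
import itertools

# Target distribution from Matt's example, states A, B, C.
p = [2 / 10, 5 / 10, 3 / 10]
n = len(p)
r = n - 1  # each state has r neighbors on the complete graph K_3

# Metropolis transitions: propose a neighbor uniformly (probability 1/r),
# accept with probability min(1, p[j] / p[i]); leftover mass stays put.
P = [[0.0] * n for _ in range(n)]
for i, j in itertools.permutations(range(n), 2):
    P[i][j] = min(1.0, p[j] / p[i]) / r
for i in range(n):
    P[i][i] = 1.0 - sum(P[i])

# Power iteration: the distribution of a long walk converges to p.
x = [1.0, 0.0, 0.0]  # start at A with certainty
for _ in range(100):
    x = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

print([round(v, 6) for v in x])  # -> [0.2, 0.5, 0.3]
```

Detailed balance is what makes this work: p[i] * P[i][j] = min(p[i], p[j]) / r is symmetric in i and j, which is exactly the condition derived in the post.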
