**Pure NKS lecture series**

Fritz Obermeyer discusses mining the space of programming languages

Given a simple programming language family like lambda calculus, Fritz asks: what are the simplest few program behaviors? These behaviors form a basis that seems big but really isn't all that big. In an more abstract view, we can consider the behavior space of programs, and define a bigger map, and Fritz argues that this is the worthwhile space to map.

Since programs are elements of an algebra, like constants, binary operations, and equations and/or inequalities, a first starting place for this map is to try computational algebra methods. Fritz gave an example of making a map with the Todd-Coxeter algorithm. In theory, equivalence is undecidable, but in practice, given enough computation, equivalence is largely decidable (Fritz cites 96% in one particular case).

Then in the space of programs, Fritz can calculate the shape of an algebra of programs using Kolmogorov complexity. In order to visualize the space, Fritz posed the space as an eigenvector problem and is able to visually illustrate the difference in different programming styles in a program space. Through proper definition, the space of programs is a Reimannian manifold.

Potential applications of thes techniques include program simplification, using a database of simple rewritings, or software analysis, refactoring based on spatial proximity. It is also possible to program by searching.

Fritz's analysis shows that the average over all languages is simpler than any one.

## Comments