Saturday, April 29, 2006

Thoughts on programming languages

This article is reconstructed from some notes I made to myself on GMail (yeah, sadly, I write myself emails because I forget things). This is the sort of thing that starts happening to you when you spend six weeks not sleeping. You start to wonder if you will ever get to sleep again. Then it dawns on you: if you could only get more done, you would get to sleep more than 45 minutes every 48 hours. What happens next looks like the rants of a madman.

madman rant starts here:

Languages manipulate meaning.

Languages are a leaky abstraction for meaning. [note: see Joel Spolsky's article on leaky abstractions.]

Therefore, programming languages force the user to think in terms of the language [note: Is this a form of the Sapir-Whorf effect?] because they are by definition a leaky abstraction that has to be worked around.

What is the easiest way to manipulate units of meaning?

How can we increase the density with which we manipulate meanings?

Mostly, we assign a meaning to a symbol, and then use that symbol to stand for that meaning. [circular logic when you are tired, anyone?] Examples:

14 + 35 = 49

the "+" means "add them together", and the "=" means "the result is". Most of the human race has agreed on this symbology [okay, "symbolism" but you should all see Boondock Saints].

This could be rewritten a million different ways (see the sketch after these variants):

+ 14 35

or

(+ 14 35)

or

(add 14 35)

or

(apply + '(14 35))
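[note: the parenthesized variants are just Lisp. A minimal Scheme sketch of the same idea ("add" is my own name, not a built-in, so I define it):

(+ 14 35)            ; => 49, prefix notation, runs as-is
(apply + '(14 35))   ; => 49, the operator applied to a list of data

(define (add a b)    ; bind the same meaning to a new symbol
  (+ a b))
(add 14 35)          ; => 49

All three evaluate to the same thing; only the surface symbols change.]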

Because of the ingrained nature of the add operation in our minds, we have no trouble coupling that symbol to that specific meaning. It is when we get into something like English that we have trouble:

Like: "Jane is blue."

Does that mean Jane is an alien, or does it mean she is depressed? This ambiguity doesn't work well for computers. They want specific instructions. Therefore we need a language that minimizes the ambiguity (eliminating it entirely would make it math, not language).

How can we minimize ambiguity? What tools can we give the developer to minimize ambiguity?

We need a structured way to write down algorithms and compositions of algorithms. In English and in programming languages, we use context to carry some of the information and to reduce the possibility of ambiguity. [You like how punctuation and capitalization go out the window when you are tired enough?]

Normally, we build a hierarchy of concepts that gets us from very simple starting concepts to very complex systems like TCP/IP.

These layers of abstraction further obscure the actual manipulation of atomic units of meaning, creating further leaky abstractions.

All of programming is based on this concept. Create an operator out of the operators that you have already been given and work from there, all the way up the chain to whatever abstraction you need to stop at in order to get your job done.

If we cannot get rid of ambiguity, and we realize that all programming languages are based on composition of basic operators, then it seems a language becomes more and more useful as it becomes easier and easier to compose the operators together, along with their data. The ease of manipulating these compositions is also a factor (think macros).
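[note: a rough Scheme sketch of both halves of this claim. "compose" and "swap-args" are toy names I am defining here, not standard library pieces:

(define (compose f g)            ; build a new operator out of two old ones
  (lambda (x) (f (g x))))

(define add-then-double
  (compose (lambda (x) (* 2 x))
           (lambda (x) (+ x 14))))
(add-then-double 35)             ; => 98

(define-syntax swap-args         ; a macro manipulates the composition
  (syntax-rules ()               ; itself, rewriting code before it runs
    ((_ op a b) (op b a))))
(swap-args - 14 35)              ; => 21, expands to (- 35 14)

The function composes operators; the macro reshapes the composition as if it were data.]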

The more meta-information we have about an operator, the less the operator has to spell out in code. Back to "+": everybody knows what it means, so general meta-knowledge about "+" is very, very rich in individuals. We either have to educate individuals about each operator to the level at which they understand "+", or we need a formal language for describing the operator's metadata.

How can we minimize code? By identifying patterns in meaning, and using a single operator to express that pattern.

End of madman rant


Alright, I have slept some since I wrote this. I have let the ideas roll around in my head. I am rereading it as I write this article, over and over. I admit these assertions rest on accepting a string of postulate-based arguments (as in, I can't prove any of the above statements is undeniably true). However, I do think it brings us to an important point. The point is hidden in "identifying patterns in meaning, and using a single operator to express that pattern."

Most modern programming languages (I cannot speak for all, but the ones I am familiar with) concentrate on the expression of algorithms. There is even an algorithms class in most undergraduate computer science programs. It seems algorithms are central to the creation and discussion of programming. Algorithms are made up of units of meaning. We take those units of meaning for granted because our human brains are used to assigning meaning to a symbol (written) or a sound (verbal).

To my knowledge, nobody has given any thought to finding patterns in units of meaning. If, as my sleepless mind postulated, we can find regular patterns in the use of units of meaning, then we can assign a symbol to those more generalized patterns, allowing us to write more succinct programs in the new symbolic language. Think of it as writing a function that takes another function as an argument, along with the arguments to be operated on. We find the general patterns of execution for types of meanings in human language, assign them a symbol, and then use those patterns with specific meanings to generate actual code. This would significantly increase the density of the units of meaning, because you are implicitly placing a single meaning into the context of a more generalized pattern. It requires fewer symbols to express more meaning. More succinct expression is more expressive power. Paul Graham agrees.
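To sketch what I mean (plain Scheme again, and a well-worn example rather than the hypothetical new language): fold captures the general pattern "combine a sequence pairwise", and the specific meaning is handed in as the operator argument.

(define (fold op init lst)       ; the pattern, with a hole for the meaning
  (if (null? lst)
      init
      (fold op (op init (car lst)) (cdr lst))))

(fold + 0 '(14 35 49))    ; => 98, the pattern read as "sum"
(fold * 1 '(2 3 4))       ; => 24, the same pattern read as "product"
(fold max 0 '(14 35 49))  ; => 49, the same pattern read as "maximum"

One symbol, fold, stands for the whole pattern; each call site only has to supply the meaning that fills the hole.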

Is this possible? I have no idea. Am I going to try? Absolutely.

[Note: If I am completely daft and somebody has already gone down this road and come back again, please email me with a reference to their work. I doubt that I am the first person to consider programming languages from this angle, but I am ignorant of others' attempts.]
