Wednesday, April 9, 2014

Six programming paradigms that will change how you think about coding

Every now and then, I stumble across a programming language that does something so different that it changes how I think about coding. In this post, I want to share some of my favorite finds.

This is not your grandma's "functional programming will change the world!" blog post: this list is much more esoteric. I'd wager most readers haven't heard of the majority of the languages and paradigms below, so I hope you have as much fun learning about these new concepts as I did.

Note: I have only minimal experience with most of the languages below: I find the ideas behind them fascinating, but claim no expertise in them, so please point out any errors and send corrections. Also, if you've found any new paradigms and ideas not covered here, please share them!

Update: this post hit the front page of r/programming and HN. Thank you for the great feedback! I've added some corrections below.

Concurrent by default

Example languages: ANI, Plaid

Let's kick things off with a real mind bender: there are programming languages out there that are concurrent by default. That is, every line of code is executed in parallel!

For example, imagine you wrote three lines of code, A, B, and C:
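
  A
  B
  C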

In most programming languages, A would execute first, then B, and then C. In a language like ANI, A, B, and C would all execute at the same time!

Control flow or ordering between lines of code in ANI is merely a side effect of the explicit dependencies between them. For example, if B had a reference to a variable defined in A, then A and C would execute at the same time, and B would execute only after A finished.

Let's look at an example in ANI. As described in the tutorial, ANI programs consist of "pipes" and "latches" that are used to manipulate streams and data flows. The unusual syntax is tough to parse, and the language seems dead, but the concepts are pretty interesting.

Here's a "Hello World" example in ANI:
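
  "Hello, World!" ->std.out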

In ANI terminology, we are sending the "Hello, World!" object (a string) to the std.out stream. What happens if we send another string to std.out?
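
  "Hello, World!" ->std.out
  "Goodbye, World!" ->std.out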

Both of these lines of code execute in parallel, so they could end up in any order in the console. Now, look what happens when we introduce a variable on one line and reference it later:
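
  s = [string\];
  "Hello, World!" ->s;
  \s ->std.out;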

The first line declares a "latch" (latches are a bit like variables) called s that contains a string; the second line sends the text "Hello, World!" to s; the third line "unlatches" s and sends the contents to std.out. Here, you can see ANI's implicit program sequencing: since each line depends on the previous one, this code will execute in the order it is written.

The Plaid language also claims to support concurrency by default, but uses a permissions model, as described in this paper, to set up control flow. Plaid also explores other interesting concepts, such as Typestate-Oriented Programming, where state changes become a first-class citizen of the language: you define objects not as classes, but as a series of states and transitions that can be checked by the compiler. This seems like an interesting take on exposing time as a first-class language construct as discussed in Rich Hickey's "Are We There Yet?" talk.
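
To make the typestate idea concrete, here is the classic file example from the typestate-oriented programming literature, written as illustrative pseudocode in the spirit of Plaid (not exact Plaid syntax): a file is modeled not as one class, but as two states with compiler-checked transitions between them.

  state File {
      val filename;
  }

  state OpenFile case of File {
      val filePtr;
      method read() { /* legal only in the OpenFile state */ }
      method close() [OpenFile >> ClosedFile] { /* releases filePtr */ }
  }

  state ClosedFile case of File {
      method open() [ClosedFile >> OpenFile] { /* acquires filePtr */ }
  }

The [OpenFile >> ClosedFile] annotation declares that close() transitions the object from OpenFile to ClosedFile, so calling read() on a closed file becomes a compile-time error rather than a runtime one.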

Multicore is on the rise and concurrency is still harder than it should be in most languages. ANI and Plaid offer a fresh take on this problem that could lead to amazing performance gains; the question is whether "parallel by default" makes concurrency easier or harder to manage.

Update: the description above captures the basic essence of ANI and Plaid, but I used the terms "concurrent" and "parallel" interchangeably, even though they have different meanings. See Concurrency Is Not Parallelism for more info.

Dependent types


Example languages: Idris, Agda, Coq

You're probably used to type systems in languages like C and Java, where the compiler can check that a variable is an integer, list, or string. But what if your compiler could check that a variable is "a positive integer", "a list of length 2", or "a string that is a palindrome"?

This is the idea behind languages that support dependent types: you can specify types that can check the value of your variables at compile time. The shapeless library for Scala adds partial, experimental support (read: probably not ready for primetime) for dependent types to Scala and offers an easy way to see some examples.

Here is how you can declare a Vector that contains the values 1, 2, 3 with the shapeless library:
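
  // Syntax from the shapeless sized-vector demos of the time; treat the exact
  // operators as approximate, since shapeless's API has changed since 2014.
  val l1 = 1 :#: 2 :#: 3 :#: VNil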

This creates a variable l1 whose type signature specifies not only that it's a Vector that contains Ints, but also that it is a Vector of length 3. The compiler can use this information to catch errors. Let's use the vAdd method in Vector to perform a pairwise addition between two Vectors:
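
  val l2 = 1 :#: 2 :#: 3 :#: VNil
  l1 vAdd l2    // result: 2 :#: 4 :#: 6 :#: VNil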

The example above works fine because the type system knows both Vectors have length 3. However, if we tried to vAdd two Vectors of different lengths, we'd get an error at compile time instead of having to wait until run time!
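
  val l3 = 1 :#: 2 :#: VNil    // a Vector of length 2
  l1 vAdd l3                   // rejected by the compiler: the lengths don't match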

Shapeless is an amazing library, but from what I've seen, it's still a bit rough, only supports a subset of dependent typing, and leads to fairly verbose code and type signatures. Idris, on the other hand, makes types a first-class member of the programming language, so the dependent type system seems much more powerful and clean. For a comparison, check out the Scala vs Idris: Dependent Types, Now and in the Future talk.


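To get a quick taste of Idris, here's the canonical length-indexed vector example, a small sketch using the Vect type from Idris's Data.Vect library: the type of app promises that appending a vector of length n to a vector of length m yields a vector of length n + m, and the compiler holds you to it.

  import Data.Vect

  -- The lengths are part of the types, so the compiler checks that the
  -- result really has length n + m.
  app : Vect n a -> Vect m a -> Vect (n + m) a
  app Nil       ys = ys
  app (x :: xs) ys = x :: app xs ys
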
Formal verification methods have been around for a long time, but were often too cumbersome to be usable for general-purpose programming. Dependent types in languages like Idris, and perhaps even Scala in the future, may offer lighter-weight and more practical alternatives that still dramatically increase the power of the type system in catching errors. Of course, no dependent type system can catch all errors due to inherent limitations from the halting problem, but if done well, dependent types may be the next big leap for static type systems.

Concatenative languages

Example languages: Forth, cat, Joy

Ever wonder what it would be like to program without variables and function application? No? Me neither. But apparently some folks did, and they came up with concatenative programming. The idea is that everything in the language is a function that pushes data onto a stack or pops data off the stack; programs are built up almost exclusively through functional composition (concatenation is composition).

This sounds pretty abstract, so let's look at a simple example in cat:
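
  2 3 +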

Here, we push two numbers onto the stack and then call the + function, which pops both numbers off the stack and pushes the result of adding them back onto the stack: the output of the code is 5. Here's a slightly more interesting example:
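
  define foo {
      10 <
      [ 0 ]
      [ 42 ]
      if
  }

  20 foo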

Let's walk through this line by line:
  1. First, we declare a function foo. Note that functions in cat specify no input parameters: all parameters are implicitly read from the stack. 
  2. foo pushes 10 onto the stack, then calls the < function, which pops the top two items (the 10 and foo's implicit argument), checks whether the argument is less than 10, and pushes either True or False back onto the stack. 
  3. Next, we push the values 0 and 42 onto the stack: we wrap them in brackets to ensure they get pushed onto the stack unevaluated. This is because they will be used as the "then" and "else" branches (respectively) for the call to the if function on the next line. 
  4. The if function pops 3 items off the stack: the boolean condition, the "then" branch, and the "else" branch. Depending on the value of the boolean condition, it'll push the result of either the "then" or "else" branch back onto the stack. 
  5. Finally, we push 20 onto the stack and call the foo function.
  6. When all is said and done, we'll end up with the number 42.

For a much more detailed introduction, check out The Joy of Concatenative Languages.

This style of programming has some interesting properties: programs can be split and concatenated in countless ways to create new programs; the syntax is remarkably minimal (even more minimal than LISP), which leads to very concise programs; and there is strong metaprogramming support. I found concatenative programming to be an eye-opening thought experiment, but I'm not sold on its practicality. It seems like you have to remember or imagine the current state of the stack instead of being able to read it from the variable names in the code, which can make it hard to reason about the code.

Declarative programming

Example languages: Prolog, SQL

Declarative programming has been around for many years, but most programmers are still unaware of it as a concept. Here's the gist: in most mainstream languages, you describe how to solve a particular problem; in declarative languages, you merely describe the result you want, and the language itself figures out how to get there.

For example, if you're writing a sorting algorithm from scratch in C, you might write the instructions for merge sort, which describes, step by step, how to recursively split the data set in half and merge it back together in sorted order: here's an example. If you were sorting numbers in a declarative language like Prolog, you'd instead describe the output you want: "I want the same list of values, but each item at index i should be less than or equal to the item at index i + 1". Compare the previous C solution to this Prolog code:
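
  % Output is correct if it is a permutation of Input whose elements are in order.
  sort_list(Input, Output) :-
    permutation(Input, Output),
    check_order(Output).

  % A list is ordered if each item is less than or equal to the one after it.
  check_order([]).
  check_order([_]).
  check_order([First, Second | Tail]) :-
    First =< Second,
    check_order([Second | Tail]).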

If you've used SQL, you've done a form of declarative programming and may not have realized it: when you issue a query like select X from Y where Z, you are describing the data set you'd like to get back; it's the database engine that actually figures out how to execute the query. You can use the explain command in most databases to see the execution plan and figure out what happened under the hood.
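
For example, in most databases you could run something like this (the table and column names here are made up):

  EXPLAIN
  SELECT name FROM users WHERE age > 30;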

The beauty of declarative languages is that they allow you to work at a much higher level of abstraction: your job is just to describe the specification for the output you want. For example, the code for a simple sudoku solver in Prolog just lists out what each row, column, and 3x3 square of a solved sudoku puzzle should look like:
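
(The version below is adapted from the classic sudoku example in the SWI-Prolog library(clpfd) documentation.)

  :- use_module(library(clpfd)).

  sudoku(Rows) :-
      length(Rows, 9),
      maplist(same_length(Rows), Rows),
      append(Rows, Cells), Cells ins 1..9,
      maplist(all_distinct, Rows),                   % rows
      transpose(Rows, Columns),
      maplist(all_distinct, Columns),                % columns
      Rows = [As, Bs, Cs, Ds, Es, Fs, Gs, Hs, Is],
      blocks(As, Bs, Cs), blocks(Ds, Es, Fs),        % 3x3 squares
      blocks(Gs, Hs, Is).

  blocks([], [], []).
  blocks([N1, N2, N3 | Ns1], [N4, N5, N6 | Ns2], [N7, N8, N9 | Ns3]) :-
      all_distinct([N1, N2, N3, N4, N5, N6, N7, N8, N9]),
      blocks(Ns1, Ns2, Ns3).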

Here is how you would run the sudoku solver above:
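
  % A representative query for the solver above: bind Rows to a 9x9 grid that
  % uses _ for the blanks, or leave it unbound to generate any valid board.
  ?- sudoku(Rows),
     maplist(label, Rows),
     maplist(portray_clause, Rows).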

The downside, unfortunately, is that declarative programming languages can easily hit performance bottlenecks. The naive sorting algorithm above is likely O(n!); the sudoku solver above does a brute force search; and most developers have had to provide database hints and extra indices to avoid expensive and inefficient plans when executing SQL queries.

Symbolic programming


Example languages: Aurora

The Aurora language is an example of symbolic programming: the "code" you write in these languages can include not only plain text, but also images, math equations, graphs, charts, and more. This allows you to manipulate and describe a large variety of data in the format native to that data, instead of describing it all in text. Aurora is also completely interactive, showing you the results from each line of code instantly, like a REPL on steroids.


The Aurora language was created by Chris Granger, who also built the Light Table IDE. Chris outlines the motivation for Aurora in his post Toward a better programming: some of the goals are to make programming more observable, direct, and reduce incidental complexity. For more info, be sure to see Bret Victor's incredible talks: Inventing on Principle, Media for Thinking the Unthinkable, and Learnable Programming.

Update: "symbolic programming" is probably not the right term to use for Aurora. See the Symbolic programming wiki for more info.

Knowledge-based programming


Example languages: Wolfram Language

Much like the Aurora language mentioned above, the Wolfram Language is also based on symbolic programming. However, the symbolic layer is merely a way to provide a consistent interface to the core of the Wolfram Language, which is knowledge-based programming: built into the language is a vast array of libraries, algorithms, and data. This makes it easy to do everything from graphing your Facebook connections to manipulating images, looking up the weather, processing natural-language queries, plotting directions on a map, solving mathematical equations, and much more.
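
For example, each line below is a complete program, leaning on the built-in data and algorithms (WeatherData fetches live data over the network):

  WeatherData["Chicago", "Temperature"]    (* look up the current temperature in Chicago *)
  Integrate[Sin[x]^2, x]                   (* solve the integral symbolically *)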


I suspect the Wolfram Language has the largest "standard library" and data set of any language in existence. I'm also excited by the idea that Internet connectivity is an inherent part of writing the code: it's almost like an IDE where the auto-complete function does a Google search. It'll be very interesting to see if the symbolic programming model is as flexible as Wolfram claims and can truly take advantage of all of this data.

Update: although Wolfram claims the Wolfram Language supports "symbolic programming" and "knowledge programming", these terms have slightly different definitions. See the Knowledge level and Symbolic Programming wikis for more info. 


27 comments:

Nathan Wailes said...

Great write-up!

doug said...

loved this. the python library sympy, i suppose, is a limited example under the symbolic rubric

Asim said...

Good essay. Another language that is parallel by default like ANI in some ways is Pig, which runs on top of Hadoop.

anta40 said...

I had some experiences with declarative programming in university: logic programming (Prolog) and functional programming (Haskell). Even though I don't write applications in those languages nowadays, they very much influence how I like to think about solving problems :D

Unknown said...

Another concatenative (or stack based) language is PostScript which might be powering a printer close to you.

Samuel Bosch said...

And yet another concatenative language is Factor

Praveen C said...

Perl's Moose supports Dependent Types.

Uncompetative said...

No mention of Actors? You might want to add Erlang, which supports fault-tolerant distributed concurrency and modules that can be "hot-swapped" at runtime. It was used to write the server for Black Ops - slides:

http://www.erlang-factory.com/upload/presentations/395/ErlangandFirst-PersonShooters.pdf

Chaosape said...

I would highly recommend looking at lambda prolog. Lambda prolog permits simply typed lambda terms, which allows a natural encoding of binding notions. Additionally, it permits a subset of higher-order unification (i.e., logic variables can represent predicates). A powerful aspect of lambda prolog is that it is based on hereditary Harrop formulas instead of Horn clauses. HH formulas permit implication and universal quantifiers in goals. Intuitively, this allows programs to be extended and new, fresh variables to be introduced, respectively. More information can be found here: http://www.lix.polytechnique.fr/~dale/lProlog/

John Seals said...

Great article!

Paul W said...

As Samuel Bosch said, Factor is a concatenative language whose stack effect checker alleviates the problem you mention of having to keep the state of the stack in your head. Stack effects look a little bit like function signatures, and the Factor compiler will verify that the various 'words' in your program have the effect on the stack that their signature says they do.

Warbo said...

I found the description of the concatenative code a bit all over the place. Two specific issues: you say "concatenation is composition", but don't show any concatenation. Maybe a better phrase would be "juxtaposition is concatenation" or even "whitespace is concatenation". Also, you keep saying that we "call" functions, which is misleading (it makes the control flow sound very complicated: "why do functions get called but numbers get pushed?"). In fact concatenative languages are very uniform: everything is a function from stacks to stacks.

A more uniform description might be something like this:

1. First we declare a variable "foo". This will be a function, which we will define as the composition of existing functions. This may be familiar from functional programming, eg. the "." function in Haskell, but in Cat we can compose by just using whitespace.
2. The first component of "foo" will be "10", which is a function that takes a stack and returns a new stack with the number 10 pushed on top. We compose this with the "<" function, which pops two numbers off a stack and pushes the boolean result (True or False) of comparing them.
3. We compose this with two 'anonymous functions', denoted by the brackets "[...]". These functions push their bodies, unevaluated, on to a stack. Those bodies just consist of the "0" and "42" functions, which (when evaluated) will push the numbers 0 and 42 on to a stack respectively.
4. We compose this with the "if" function, which pops two functions and a boolean off the stack and applies one of those functions (depending on the boolean) to the resulting stack. The functions act as the "then" and "else" branches of the "if".
5. We've now finished defining "foo", so we start defining our 'main' (top-level) function. This consists of the "20" function, which will push the number 20 on to a stack, composed with the "foo" function we just defined.
6. When we run this program, the top-level function is applied to an empty stack. The result will be a stack containing the number 42.

Warbo said...

Another paradigm that's interesting is term-rewriting systems. Examples are Pure and Maude.

In these systems we define a bunch of rewrite rules (a -> b) and/or equations (c = d, equivalent to c -> d and d -> c). The system executes a program by trying to match the input against any of the rewrite rules. If a match is found, the rewrite is applied and the system starts looking for another match. For example, here's the declaration of a couple of Maude operators (ops):

--- Declare some variables to use in our patterns
vars A B : Nat .

--- Clears B of the least-significant-bits in A
op _above_ : Nat Nat -> Nat .
eq (A) above (B) = (A >> B) << B .

--- Clears all but the B least-significant-bits of A
op _below_ : Nat Nat -> Nat .
eq A below 0 = 0 .
eq A below B = A rem (1 << B) .

Robby said...

You might enjoy Icon, noted for its string scanning and goal-directed evaluation: http://www.cs.arizona.edu/icon/

bp said...

You might enjoy looking at LabVIEW. It is dataflow-based, concurrent by default, almost fully concatenative, graphical - and most importantly, it is huge. Used in production. Not some obscure thing, but really big in science.

And if you master these concepts (which most LabVIEW programmers don't - you can usually see when they are trying to write Java or C), your way of programming will be forever changed.

Mike Hunt said...

Wolfram reminds me of REBOL.

Zack Morris said...

Fantastic article, thank you. I had never made the connection between Prolog and O(n!) solving time. But I stumbled onto the issue when I was writing a program that when given an isometric view of blocks on a table (represented as a graph of the vertices), would solve how imaginary blocks were stacked to result in the same image. The naive method is to try every permutation (which is n!) but I ended up speeding it up using a hill climbing algorithm that tried swapping two branches of the tree and keeping the answer with least error. Worst case was still n! I believe, but in practice it found the solution in under a second. My dream would be a language like Prolog that can evolve solutions under the hood with a time less than n! perhaps by referencing previous solutions from the web. Also, n! sounds bad until we consider multiprocessing and quantum computing which could try many possibilities simultaneously.

Ryan Mitchell said...

On concurrent by default: Verilog and VHDL are analogous examples. Coming from a software background, it's a very different paradigm -- hardware is hardware and every circuit is always "executing".

Vasudev Ram said...

Icon is another interesting language.

http://en.wikipedia.org/wiki/Icon_(programming_language)

https://www.cs.arizona.edu/icon/docs/ipd266.htm

David Warman said...

Interesting that you think FORTH is a curiosity. It is actually pretty pervasive: it has been to space many times (and still is), and PostScript (mentioned above) is not only in the same family, it is actually derived from an early version of FORTH.

It has many interesting properties you missed, in particular that it is a language in which you define your domain-specific language to solve your problem. Every aspect of the base language is extensible, including all math and syntax. And it is very small and very tight.

I use it as an embedded debug scripting tool in pretty much everything I've written over the past 15 years or so. Real-time embedded systems have few bugs that can be found with breakpoint debuggers; one has to be able to observe them _behaving_ to catch the many issues that do not lead to crashes. With FORTH I can interactively explore the workings with a minimally intrusive, small-footprint code prober.

akuhn said...

Amazing article, really loved reading this! There's even a 7th paradigm for your list…

Liveness

Programs that are always running, even while you are writing them. If you do web development you get a hint of it, but working with one of these systems is just mind blowing. Take the quick edit-and-reload cycle of web development and turn it into a keystroke-and-automagic-update. Boom!

One example is CSS editing in Chrome's dev tools. But there are other systems - systems that don't even have a textual representation of the source code, because their only way of being is as a running program that was started in the '70s and whose memory dump has been forked by thousands of programmers and users since.

Example languages

Squeak — http://www.squeak.org
Pharo — http://www.pharo-project.org
Self — http://selflanguage.org
Newspeak — http://www.newspeaklanguage.org
SuperCollider — http://supercollider.sourceforge.net
Extempore — http://extempore.moso.com.au
ixilang — https://github.com/ixi-thor/ixilang
Conception — https://github.com/shurcooL/Conception
LightTable — http://www.lighttable.com/

Today, most of these languages are niche, but before the rise of open source, many industrial-strength systems were written in Smalltalk. Alas, the commercial vendors of Smalltalk missed the open-source bandwagon (with licenses running as high as $35,000 per developer), and Java capitalized on their mistake as open source exploded with the internet. It seems, however, that recently there is growing interest in bringing this paradigm back, sparked by Bret Victor's talks but also by the amazing work of the Live Coding community, which uses these systems for musical performances and algoraves. Yes, that's a thing! See http://algorave.com and http://toplap.org

Christophe de Dinechin said...

Another language that may interest you is XL and its real-time 3D derivative Tao.

They are based on concept programming, which focuses on the transformation of ideas into code.

Tao is also a reactive language, i.e. if you write a time or mouse dependency in the program, the corresponding section re-evaluates automatically over time or mouse events. There are a few examples in the tutorials section of the Taodyne web site.

MCAndre said...

Don't forget Erlang, a modern descendant of Prolog.

http://www.erlang.org/

Javin Paul said...

Indeed a great paradigm - a mix of both functional and object-oriented. Loved this post

Thanks
Javin
