Sunday, August 28, 2011

Reaction to "Clojure: Towards The Essence Of Programming" from a Scalaperspective

How Scala addresses those problems


Just watched Howard Lewis Ship's Clojure: Towards The Essence Of Programming, which is an excellent introduction to Clojure and how it addresses the problems in Java. As a Scala fan, I thought it would be a good idea to see how one could use Scala to do the same.

Ceremony vs Essence


Howard uses the example below to show how much ceremony (code with little real info) is required, hiding the essence (the real intention) when using Java to sort a list of Stock objects (a portfolio) either using different properties such as open values or last trade values:
 public static void sortByOpen(List<Stock> portfolio) {
    //ceremony: a lot of code but very little real info
    Comparator<Stock> c = new Comparator<Stock>() {
       public int compare(Stock o1, Stock o2) {
          return o1.getOpen() - o2.getOpen(); //essence: the beef is here
       }
    };
    Collections.sort(portfolio, c);
}

The Clojure version is:
//a Stock object is just a map with each field as a key. ":open" is the key to get the open value.
(sort-by :open portfolio)

In Scala, it can be as simple as the Clojure version:
//full version: using an anonymous function instead of a comparator object
portfolio.sortBy((s: Stock) => s.open)

//simpified version: sortBy is expecting the argument to be a function
// (Stock) => SOME-RESULT-TYPE, so we don't need to declare the 
//type for "s".
portfolio.sortBy((s) => s.open)

//final version: as "s" only appears once in the function body, just use
//an underscore and get rid of the parameter declaration.
portfolio.sortBy(_.open)

The Clojure version is like the final Scala version in the first attempt because it doesn't need typing information in the code. In Scala, the compiler infer it from the code.

In addition, in Clojure the object is just a map so people can refer to a property directly with its name without referring to the object. In Scala, we must use something to denote the object in the form of obj.property. However, the underscore syntax eliminates the need to declare that object, so the code is still very succinct.
The benefits of the Scala approach are:

  • Compile time checking: if the property name mismatches the Stock class, in the Scala the compiler will tell us immediately.

  • Runtime performance: it is a lot faster to access the property by offset than by looking up a key.

The cost we pay is mainly the learning curve: we must learn how Scala infers the typing information so that we know what to type and what to NOT type and learn the meaning of the underscore in an anonymous function.

Form is structure


I may not be fully understanding this example. Howard uses the Fahrenheit conversion, f =9*c/5+32, to show that in Clojure the form (the look on the surface) is the same as the structure (the actual relationship):
//the look is the same as the actual syntax tree
(+
  (/
    (* 9 c)
    5)
  32)

I definitely think the Java or Scala version is much simpler and more concise:
9*c/5+32

This is, after all, what we call DSL. Arithmetics has its rules of precedence and associativity, allowing us to write in a very succinct form to express more complex structure.

Read Eval Print Loop


Scala has it too. I like it very much too.

Functional programming


Howard that moves onto functional programming with examples using filter, reduce, map, for comprehension, stream. Obviously, Scala can do all these in a similar fashion. For example, to get the last trade total value of the portfolio, we can map each stock to its last trade value and then use reduce to sum them all up. In Clojure:
//#() means anonymous function. % refers to the one and only parameter (the stock).
(reduce + (map #(* (% :last-trade) (% :shares)) portfolio))

In Scala:
//+ in Scala is not a function, but a method. So we can't use it directly; 
//instead, use an anonymous function with underscores again.
portfolio.map((s)=>s.lastTrade*s.shares).reduceLeft(_+_)

There is not much difference. However, I think the significance is much deeper. The question is not whether functional programming is supported or not, but how much it is intended to be used in place of procedural programming. Scala allows both styles of programming, which may or may not be a good thing, depending on how you view it. If you're committed to the functional programming style for concurrency, testability, predictability, then using a language that is less capable of procedural programming is actually a good thing. If you're more comfortable with procedural programming and just using functional programming in a smaller scale, then a hybrid language would be better.

Extending the language with DSL


Then Howard moves onto extending Clojure with DSL. One example he uses is building a DOM tree with a DSL.
(defview root-index
  [env]
  :html [
    :head [
      :title [ "Cascade Blog" ]
    ]
    :body [
      :h1 [ "Cascade Blog" ]
      :ul { :class "recent-postings" } [
        (template-for [posting (recent-postings env)]
          :li [
             (render-link env show-posting (posting :id) (posting :title))
           ])
      ]
    ]
  ])

As an exercise I implemented a similar DOM DSL in Scala:
//the DSL is implemented in the Html object
import Html._

//using the DSL to define a DOM tree
p @@ ("style" -> "color: red") containing (
        "This is an ",
        b containing ("interesting"),
        "DOM tree"
        )

They aren't that different except that in Clojure each list item is separate from its neighbors by being enclosed in its own () but in Scala we must use a comma to separate the list items. So, choose your own poison: either tolerate a million ()'s or the ugly commas in the DSL.
Another difference is that in Clojure text can be freely mixed in the DOM tree as there is no typing restriction, while in Scala text is magically converted into a TextNode object. Again, the latter provides structure, compile-time protection and runtime performance, but we must learn the implicit conversion mechanism to understand what is going on.
Just to be complete, the DOM DSL is implemented in Scala as below:
//a DOM node
class Node

//a text node
case class TextNode(val text: String) extends Node

//an element node
case class Element(val name: String,
                   val attrs: Map[String, String],
                   val children: Seq[Node]) extends Node {
  def this(name: String) = this (name, Map(), List())

  //we can use symbols as method names, but @ is a reserved keyword in Scala,
  //so I used @@ as the name of the method that adds attributes to the element.
  def @@(attrPairs: (String, String)*) =
    //just add each pair to the map
    Element(name, attrPairs.foldLeft[Map[String, String]](Map())(_ + _), children)

  //add the child nodes (elements or text nodes) to this element
  def containing(children: Node*) = Element(name, attrs, children)
}

object Html {
  //convert strings into text nodes automatically
  implicit def string2TextNode(s: String) = TextNode(s)
  def p = new Element("p")
  def b = new Element("b")
}

Howard also shows the need for lazy evaluation so that some code is only executed on demand. For example, the DSL code to generate each <li> is repeatedly called in a loop. In Clojure lazy evaluation is implemented with macros which generate well-form code instead of arbitrary text. This seems to be more powerful (able to do more) than the call-by-name mechanism Scala, although I don't know if/when this extra power is really needed.


Sunday, August 21, 2011

Revealing the Scala magician’s code: method vs function

How's a method different from a function in Scala?


A method can appear in an expression as an internal value (to be called with arguments) but it can't be the final value, while a function can:

//a simple method
scala> def m(x: Int) = 2*x
m: (x: Int)Int

//a simple function
scala> val f = (x: Int) => 2*x
f: (Int) => Int = <function1>

//a method can't be the final value
scala> m
<console>:6: error: missing arguments for method m in object $iw;
follow this method with `_' if you want to treat it as a partially applied function
m
^

//a function can be the final value
scala> f
res11: (Int) => Int = <function1>

Parameter list is optional for methods but mandatory for functions


A method can have no parameter list or have one (empty or not), but a function must have one (empty or not):

//a method can have no parameter list
scala> def m1 = 100
m1: Int

//a method can have an empty parameter list
scala> def m2() = 100
m2: ()Int

//a function must have a parameter list
scala> val f1 = => 100
<console>:1: error: illegal start of simple expression
val f1 = => 100
^
//a function's parameter list could be empty
scala> val f2 = () => 100
f2: () => Int = <function0>

Why a method can have no parameter list? See below.

Method name means invocation while function name means the function itself


Because methods can't be the final value of an expression, so if you write a method name and if it doesn't take any argument (no argument list or an empty argument list), the expression is meant to call that method to get the final value. Because functions can be the final value, if you just write the function name, no invocation will occur and you will get the function as the final value. To force the invocation, you must write ():

//it doesn't have a parameter list
scala> m1
res25: Int = 100

//it has an empty parameter list
scala> m2
res26: Int = 100

//get the function itself as the value. No invocation.
scala> f2
res27: () => Int = <function0>

//invoke the function
scala> f2()
res28: Int = 100

Why we can provide a method when a function is expected?


Many Scala methods such as map() and filter() take functions arguments, but why can we provide methods to them like:

scala> val myList = List(3, 56, 1, 4, 72)
myList: List[Int] = List(3, 56, 1, 4, 72)

//the argument is a function
scala> myList.map((x)=>2*x)
res29: List[Int] = List(6, 112, 2, 8, 144)

//try to pass a method as the argument instead
scala> def m3(x: Int) = 3*x
m3: (x: Int)Int

//still works
scala> myList.map(m3)
res30: List[Int] = List(9, 168, 3, 12, 216)

This is because when a function is expected but a method is provided, it will be automatically converted into a function. This is called the ETA expansion. This makes it a lot easier to use the methods we created. You can verify this behavior with the tests below:

//expecting a function
scala> val f3: (Int)=>Int = m3
f3: (Int) => Int = <function1>

//not expecting a function, so the method won't be converted.
scala> val v3 = m3
<console>:5: error: missing arguments for method m3 in object $iw;
follow this method with `_' if you want to treat it as a partially applied function
val v3 = m3
^

With this automatic conversion, we can write concise code like:

//10.< is interpreted as obj.method so is still a method. Then it is converted to a function.
scala> myList.filter(10.<)
res31: List[Int] = List(56, 72)

Because in Scala operators are interpreted as methods:

  • prefix: op obj is interpreted as obj.op.

  • infix: obj1 op obj2 is interpreted as obj1.op(obj2).

  • postfix: obj op is interpreted as obj.op.


You could write 10< instead of 10.<:

scala> myList.filter(10<)
res33: List[Int] = List(56, 72)

How to force a method to become a function?


When a function is not expected, you can still explicitly convert a method into a function (ETA expansion) by writing an underscore after the method name:

scala> def m4(x: Int) = 4*x
m4: (x: Int)Int

//explicitly convert the method into a function
scala> val f4 = m4 _
f4: (Int) => Int = <function1>

scala> f4(2)
res34: Int = 8

A call by name parameter is just a method


A call by name parameter is just a method without a parameter list. That's why you can invoke it by writing its name without using ():

//use "x" twice, meaning that the method is invoked twice.
scala> def m1(x: => Int) = List(x, x)
m1: (x: => Int)List[Int]

scala> import util.Random
import util.Random

scala> val r = new Random()
r: scala.util.Random = scala.util.Random@ad662c

//as the method is invoked twice, the two values are different.
scala> m1(r.nextInt)
res37: List[Int] = List(1317293255, 1268355315)

If you "cache" the method in the body, you'll cache the value:

//cache the method into y
scala> def m1(x: => Int) = { val y=x; List(y, y) }
m1: (x: => Int)List[Int]

//get the same values
scala> m1(r.nextInt)
res38: List[Int] = List(-527844076, -527844076)

Is it possible to maintain the dynamic nature of x in the body? You could cache it as a function by explicitly converting it:

//explicit conversion, but then you must invoke the function with ().
scala> def m1(x: => Int) = { val y=x _; List(y(), y()) }
m1: (x: => Int)List[Int]

scala> m1(r.nextInt)
res39: List[Int] = List(1413818885, 958861293)

Saturday, August 6, 2011

equals() and Scala

Problem with equals()


The problem with equals() is that it is difficult to get it right. For example, for a simple class Foo, you may write the equals() method as:

class Foo(val i: Int) {
override def equals(that: Any) = {
that match {
case f: Foo => f.i == i
case _ => false
}
}
}

The problem is that what happens if the other object ("that") belongs to a subclass of Foo, which may have its own fields? As the equals() method is only comparing the "i" field, it will ignore the other fields and return true prematurely. Below is such a subclass Bar:

class Bar(i: Int, val j: Int) extends Foo(i) {
override def equals(that: Any) = {
that match {
case b: Bar => super.equals(b) && b.j == j
case _ => false
}
}
}

With the erroneous equals() method in Foo, we could get incorrect results:

scala> val f1 = new Foo(2)
f1: Foo = Foo@14c0275

scala> val b1 = new Bar(2, 3)
b1: Bar = Bar@171bc3f

scala> f1.equals(b1)
res0: Boolean = true

scala> b1.equals(f1)
res1: Boolean = false

A solution to the problem


The problem is that the equals() method in Foo is treating the Bar object exactly as a base Foo object, but the equality contract in Bar has changed from that in Foo. Of course, not every subclass of Foo will use a different equality contract; some do and some don't (by default, we should assume that they don't). Therefore, the equals() method in Foo should make sure that the "that" object uses the same equality contract as "this":

object FooEqualityContract {
}

class Foo(val i: Int) {
//by default all Foo objects and subclass objects use this equality contract
val equalityContract: Any = FooEqualityContract

override def equals(that: Any) = {
that match {
//make sure the two objects are using the same equality contract
case f: Foo => f.equalityContract == this.equalityContract && f.i == i
case _ => false
}
}
}

Now, as Bar is using its own equality contract, it should say so:

class Bar(i: Int, val j: Int) extends Foo(i) {
//tell others that we're using our own equality contract
override val equalityContract: Any = BarEqualityContract

override def equals(that: Any) = {
that match {
case b: Bar => super.equals(b) && b.j == j
case _ => false
}
}
}

Now, the equals() method in Foo will rightly determine that a Bar object is using a different equality contract and thus will never be equal to a bare Foo object:

scala> val f1 = new Foo(2)
f1: Foo = Foo@34b350

scala> val b1 = new Bar(2, 3)
b1: Bar = Bar@7c28c

scala> f1.equals(b1)
res2: Boolean = false

scala> b1.equals(f1)
res3: Boolean = false

scala> val b2 = new Bar(2, 3)
b2: Bar = Bar@5dd915

scala> b1.equals(b2)
res6: Boolean = true

Of course, it should also work for subclasses that use the same equality contract:

scala> val f2 = new Foo(2) { }
f2: Foo = $anon$1@a594e1

scala> f1.equals(f2)
res9: Boolean = true

scala> f2.equals(f1)
res10: Boolean = true