Kent Tong's personal thoughts on information technology

Sunday, August 21, 2011

Revealing the Scala magician’s code: method vs function

How's a method different from a function in Scala?

A method can appear inside an expression as something to be called with arguments, but it can't be the final value of the expression, while a function can:


//a simple method
scala> def m(x: Int) = 2*x
m: (x: Int)Int

//a simple function
scala> val f = (x: Int) => 2*x
f: (Int) => Int = <function1>

//a method can't be the final value
scala> m
<console>:6: error: missing arguments for method m in object $iw;
follow this method with `_' if you want to treat it as a partially applied function
       m
       ^

//a function can be the final value
scala> f
res11: (Int) => Int = <function1>
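As a rough sketch of why this is so (the object name MethodVsFunction and the value g are made up for illustration): a function is an ordinary object with an apply method, so it can be stored and composed like any other value, while a method only exists as a member of something else.

```scala
object MethodVsFunction {
  // a method: a member of the enclosing object, not a value by itself
  def m(x: Int): Int = 2 * x

  // a function: an instance of Function1[Int, Int], i.e. an ordinary value
  val f: Int => Int = (x: Int) => 2 * x

  // because f is a value, it can be combined with other functions
  val g: Int => Int = f.andThen(_ + 1)
}

// MethodVsFunction.m(3)        calls the method
// MethodVsFunction.f(3)        is sugar for MethodVsFunction.f.apply(3)
// MethodVsFunction.g(3)        applies f and then adds 1
```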

Parameter list is optional for methods but mandatory for functions

A method can have no parameter list or have one (empty or not), but a function must have one (empty or not):


//a method can have no parameter list
scala> def m1 = 100           
m1: Int

//a method can have an empty parameter list
scala> def m2() = 100
m2: ()Int

//a function must have a parameter list
scala> val f1 = => 100   
<console>:1: error: illegal start of simple expression
       val f1 = => 100
                ^
//a function's parameter list could be empty
scala> val f2 = () => 100
f2: () => Int = <function0>

Why can a method have no parameter list? See below.

Method name means invocation while function name means the function itself

Because a method can't be the final value of an expression, writing the name of a method that takes no arguments (no parameter list or an empty one) means calling that method to get the final value. Because a function can be the final value, writing just the function name causes no invocation and you get the function itself as the final value. To force the invocation, you must write ():


//it doesn't have a parameter list
scala> m1
res25: Int = 100

//it has an empty parameter list
scala> m2
res26: Int = 100

//get the function itself as the value. No invocation.
scala> f2
res27: () => Int = <function0>

//invoke the function
scala> f2()
res28: Int = 100
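One way to watch this happen is to count invocations. This is only a sketch; the object name NameVsInvocation, the counter and the demo() helper are made up for illustration.

```scala
object NameVsInvocation {
  // returns the number of invocations observed after each of the three steps
  def demo(): (Int, Int, Int) = {
    var calls = 0
    def m1: Int = { calls += 1; 100 }              // method without a parameter list
    val f2: () => Int = () => { calls += 1; 100 }  // function with an empty parameter list

    val a = m1             // the name alone invokes the method
    val afterMethod = calls
    val g = f2             // the name alone only yields the function value
    val afterName = calls
    val b = g()            // () forces the invocation
    val afterCall = calls
    (afterMethod, afterName, afterCall)
  }
}
```

Only the first and the third step increase the counter.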

Why can we provide a method when a function is expected?

Many Scala methods such as map() and filter() take function arguments, but why can we provide methods to them like:


scala> val myList = List(3, 56, 1, 4, 72)
myList: List[Int] = List(3, 56, 1, 4, 72)

//the argument is a function
scala> myList.map((x)=>2*x)
res29: List[Int] = List(6, 112, 2, 8, 144)

//try to pass a method as the argument instead
scala> def m3(x: Int) = 3*x
m3: (x: Int)Int

//still works
scala> myList.map(m3)      
res30: List[Int] = List(9, 168, 3, 12, 216)

This is because when a function is expected but a method is provided, the method will be automatically converted into a function. This is called eta expansion. It makes it a lot easier to use the methods we created. You can verify this behavior with the tests below:


//expecting a function
scala> val f3: (Int)=>Int = m3   
f3: (Int) => Int = <function1>

//not expecting a function, so the method won't be converted.
scala> val v3 = m3
<console>:5: error: missing arguments for method m3 in object $iw;
follow this method with `_' if you want to treat it as a partially applied function
       val v3 = m3
                ^
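Roughly speaking, the conversion wraps the method in a function that forwards its argument. The sketch below writes that wrapper by hand next to the automatic version; the object name and the value names are made up for illustration. Note that the error shown above comes from the Scala version used in this post; as far as I know, newer versions (Scala 3) also convert the method when no function type is expected.

```scala
object EtaExpansionSketch {
  def m3(x: Int): Int = 3 * x

  // a function is expected, so the compiler converts the method for us
  val f3: Int => Int = m3

  // roughly what the conversion amounts to: a function forwarding to the method
  val byHand: Int => Int = (x: Int) => m3(x)

  val viaMethod: List[Int] = List(3, 56, 1, 4, 72).map(m3)
  val viaFunction: List[Int] = List(3, 56, 1, 4, 72).map(byHand)
}
```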

With this automatic conversion, we can write concise code like:


//10.< is interpreted as obj.method so is still a method. Then it is converted to a function.
scala> myList.filter(10.<)
res31: List[Int] = List(56, 72)

This works because in Scala operators are interpreted as methods:

prefix: op obj is interpreted as obj.unary_op.

infix: obj1 op obj2 is interpreted as obj1.op(obj2).

postfix: obj op is interpreted as obj.op.

You could write 10< instead of 10.<:


scala> myList.filter(10<) 
res33: List[Int] = List(56, 72)
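The same rules apply to your own classes. Below is a small sketch with a hypothetical Vec class (not part of any library) that defines an infix and a prefix operator as plain methods:

```scala
object OperatorsAsMethods {
  // hypothetical class, for illustration only
  final case class Vec(x: Int, y: Int) {
    def +(that: Vec): Vec = Vec(x + that.x, y + that.y)  // used as a + b
    def unary_- : Vec = Vec(-x, -y)                      // used as -a
  }

  val a = Vec(1, 2)
  val b = Vec(3, 4)

  val infixSugar = a + b        // operator notation
  val infixExplicit = a.+(b)    // the method call it stands for
  val prefixSugar = -a          // operator notation
  val prefixExplicit = a.unary_-
}
```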

How to force a method to become a function?

When a function is not expected, you can still explicitly convert a method into a function (eta expansion) by writing an underscore after the method name:


scala> def m4(x: Int) = 4*x      
m4: (x: Int)Int

//explicitly convert the method into a function
scala> val f4 = m4 _
f4: (Int) => Int = <function1>

scala> f4(2)
res34: Int = 8

A call by name parameter is just a method

A call by name parameter is just a method without a parameter list. That's why you can invoke it by writing its name without using ():


//use "x" twice, meaning that the method is invoked twice.
scala> def m1(x: => Int) = List(x, x)
m1: (x: => Int)List[Int]

scala> import util.Random
import util.Random

scala> val r = new Random()
r: scala.util.Random = scala.util.Random@ad662c

//as the method is invoked twice, the two values are different.
scala> m1(r.nextInt)           
res37: List[Int] = List(1317293255, 1268355315)

If you "cache" the method in the body, you'll cache the value:


//cache the method into y
scala> def m1(x: => Int) = { val y=x; List(y, y) }
m1: (x: => Int)List[Int]

//get the same values
scala> m1(r.nextInt)                              
res38: List[Int] = List(-527844076, -527844076)

Is it possible to maintain the dynamic nature of x in the body? You could cache it as a function by explicitly converting it:


//explicit conversion, but then you must invoke the function with ().
scala> def m1(x: => Int) = { val y=x _; List(y(), y()) }
m1: (x: => Int)List[Int]

scala> m1(r.nextInt)                                    
res39: List[Int] = List(1413818885, 958861293)
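Because random numbers make the effect hard to check, here is a deterministic sketch of the three variants using a counter. The names ByNameDemo, twice, cached, deferred and demo are made up for illustration, and the last variant uses an explicit () => x wrapper in place of x _ as an alternative spelling of the same idea.

```scala
object ByNameDemo {
  // x is evaluated each time it is mentioned
  def twice(x: => Int): List[Int] = List(x, x)

  // x is evaluated once and the value is cached in y
  def cached(x: => Int): List[Int] = { val y = x; List(y, y) }

  // x is wrapped in a function, so every y() evaluates it again
  def deferred(x: => Int): List[Int] = { val y = () => x; List(y(), y()) }

  def demo(): (List[Int], List[Int], List[Int]) = {
    var n = 0
    def next(): Int = { n += 1; n }  // returns 1, 2, 3, ... on successive calls
    (twice(next()), cached(next()), deferred(next()))
  }
}
```

Calling ByNameDemo.demo() gives (List(1, 2), List(3, 3), List(4, 5)): two evaluations, then one, then two again.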

Saturday, August 6, 2011

equals() and Scala

Problem with equals()

The problem with equals() is that it is difficult to get it right. For example, for a simple class Foo, you may write the equals() method as:


class Foo(val i: Int) {
  override def equals(that: Any) = {
    that match {
      case f: Foo => f.i == i
      case _ => false
    }
  }
}

The problem is what happens if the other object ("that") belongs to a subclass of Foo, which may have its own fields. As the equals() method is only comparing the "i" field, it will ignore the other fields and return true prematurely. Below is such a subclass Bar:


class Bar(i: Int, val j: Int) extends Foo(i) {
  override def equals(that: Any) = {
    that match {
      case b: Bar => super.equals(b) && b.j == j
      case _ => false
    }
  }
}

With the erroneous equals() method in Foo, we could get incorrect results:


scala> val f1 = new Foo(2)
f1: Foo = Foo@14c0275

scala> val b1 = new Bar(2, 3)
b1: Bar = Bar@171bc3f

scala> f1.equals(b1)
res0: Boolean = true

scala> b1.equals(f1)
res1: Boolean = false

A solution to the problem

The problem is that the equals() method in Foo is treating the Bar object exactly as a base Foo object, but the equality contract in Bar has changed from that in Foo. Of course, not every subclass of Foo will use a different equality contract; some do and some don't (by default, we should assume that they don't). Therefore, the equals() method in Foo should make sure that the "that" object uses the same equality contract as "this":


object FooEqualityContract {
}

class Foo(val i: Int) {
  //by default all Foo objects and subclass objects use this equality contract
  val equalityContract: Any = FooEqualityContract

  override def equals(that: Any) = {
    that match {
      //make sure the two objects are using the same equality contract
      case f: Foo => f.equalityContract == this.equalityContract && f.i == i
      case _ => false
    }
  }
}

Now, as Bar is using its own equality contract, it should say so:


object BarEqualityContract {
}

class Bar(i: Int, val j: Int) extends Foo(i) {
  //tell others that we're using our own equality contract
  override val equalityContract: Any = BarEqualityContract

  override def equals(that: Any) = {
    that match {
      case b: Bar => super.equals(b) && b.j == j
      case _ => false
    }
  }
}

Now, the equals() method in Foo will rightly determine that a Bar object is using a different equality contract and thus will never be equal to a bare Foo object:


scala> val f1 = new Foo(2)
f1: Foo = Foo@34b350

scala> val b1 = new Bar(2, 3)
b1: Bar = Bar@7c28c

scala> f1.equals(b1)
res2: Boolean = false

scala> b1.equals(f1)
res3: Boolean = false

scala> val b2 = new Bar(2, 3)
b2: Bar = Bar@5dd915

scala> b1.equals(b2)
res6: Boolean = true

Of course, it should also work for subclasses that use the same equality contract:


scala> val f2 = new Foo(2) { }
f2: Foo = $anon$1@a594e1

scala> f1.equals(f2)
res9: Boolean = true

scala> f2.equals(f1)
res10: Boolean = true
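For comparison, a closely related technique (swapped in here, not the equality contract approach above) is the canEqual idiom: each class states which objects it is willing to be compared with, and equals() asks the other object for permission. The sketch below reuses the names Foo and Bar; the wrapper object CanEqualSketch and the hashCode overrides are additions for illustration.

```scala
object CanEqualSketch {
  class Foo(val i: Int) {
    // by default a Foo accepts comparison with any Foo
    def canEqual(other: Any): Boolean = other.isInstanceOf[Foo]

    override def equals(that: Any): Boolean = that match {
      // ask the other object whether it agrees to be compared with this one
      case f: Foo => f.canEqual(this) && f.i == i
      case _ => false
    }

    // keep hashCode consistent with equals
    override def hashCode(): Int = i.hashCode
  }

  class Bar(i: Int, val j: Int) extends Foo(i) {
    // a Bar only accepts comparison with another Bar
    override def canEqual(other: Any): Boolean = other.isInstanceOf[Bar]

    override def equals(that: Any): Boolean = that match {
      case b: Bar => b.canEqual(this) && super.equals(b) && b.j == j
      case _ => false
    }

    override def hashCode(): Int = 31 * super.hashCode() + j
  }
}
```

With these definitions a Foo and a Bar are never equal in either direction, two Bars with the same fields are equal, and an anonymous subclass such as new Foo(2) { } still equals a plain Foo(2).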

Friday, July 15, 2011

A simple but highly useful feature request for DNS

Most people believe that having two Windows domain controllers provides transparent failover, i.e., if one DC fails, the clients will automatically use the other. However, this is not true. The client will simply use the first DC returned by the DNS server. Similarly, if you use DNS to load-balance between multiple web servers, when one of them fails, some clients will still be directed to it.
To fix the problem, there is a very simple solution: enhance the DNS server to perform a health check against the host that a resource record points to. For example, the administrator could specify the TCP port to connect to as in the imaginary syntax below:

  www        A      1.1.1.1    80     ; return this record only if we can connect to its TCP port 80
  www        A      1.1.1.2    80
  www        A      1.1.1.3    80

Of course, the health check could be more general; for example, you could use a script:

  www        A      1.1.1.1    web-check.sh  ; return this record only if the script returns true

where the IP would be passed to that script as an argument for checking.
It works for domain controllers too:

  _ldap._tcp.dc._msdcs.foo.com.   SRV  1.1.1.1  dc-check.sh
  _ldap._tcp.dc._msdcs.foo.com.   SRV  1.1.1.2  dc-check.sh

Finally, one might ask why implement this checking in the DNS server instead of the clients? The idea is that problems should be detected as early as possible to avoid bad effects downstream. In concrete terms, if a server is down but the DNS server (broker) still refers the clients to it, many clients will need to perform this health check themselves. But if the DNS server performs this health check, the checking is only done once, saving a lot of trouble downstream.
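To make the idea a bit more concrete, here is a minimal sketch of the filtering step such a server might perform. Everything in it is hypothetical: the ARecord type, the function names and the 500 ms timeout are made up, and no existing DNS server is claimed to work this way. The TCP check uses the standard java.net.Socket API.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object HealthCheckSketch {
  // hypothetical representation of an A record with a port to probe
  final case class ARecord(name: String, ip: String, port: Int)

  // true if a TCP connection to ip:port can be opened within the timeout
  def tcpAlive(ip: String, port: Int, timeoutMs: Int = 500): Boolean =
    Try {
      val s = new Socket()
      try s.connect(new InetSocketAddress(ip, port), timeoutMs)
      finally s.close()
    }.isSuccess

  // return only the records whose host passes the given check
  def healthy(records: List[ARecord], check: ARecord => Boolean): List[ARecord] =
    records.filter(check)

  // the check from the imaginary syntax: connect to the listed TCP port
  def healthyByTcp(records: List[ARecord]): List[ARecord] =
    healthy(records, r => tcpAlive(r.ip, r.port))
}
```

Passing the check as a function mirrors the script variant: a script-based check would simply be another ARecord => Boolean.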

Sunday, May 1, 2011

Revealing the Scala magician's code: expression

Scala is truly magical. However, sometimes it is not easy to understand how it performs the magic. Below are some common questions and their answers.

Why do some of the expressions below work (can be evaluated and printed) but the others don't?


Math.min  //doesn't work
Math.min _  //works
val f: (Int, Int)=>Int = Math.min  //works

The first expression doesn't work because Math.min is a method, but a method is not a value in Scala. The second expression works because the underscore asks Scala to convert the method to a function, which is indeed a value in Scala. The third expression also works because when Scala is expecting a function value from the expression but finds a method, it will convert it to a function automatically.
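As a small sketch of the third case, the code below uses math.min (the lower-case math package object of current Scala versions, used here on the assumption that the old Math object is no longer available). The object and value names are made up for illustration.

```scala
object MinAsFunction {
  // a function type is expected, so the method is converted automatically
  val f: (Int, Int) => Int = math.min

  // the same function written out by hand
  val g: (Int, Int) => Int = (a, b) => math.min(a, b)

  // once it is a function value, it can be passed around like any other value
  val smallest: Int = List(7, 3, 9).reduce(f)
}
```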

Why do some of the expressions below work but the others don't?


List(2, 3, 5) foreach println  //works
List(2, 3, 5) foreach println(_) //doesn't work
List(2, 3, 5) foreach (println(_)) //works

The first one works because foreach is expecting a function. println (of the Predef object, which is imported automatically) is a method, not a function, but as mentioned above Scala will convert it into a function automatically because a function is expected.
The second one is trying to create an anonymous function:


List(2, 3, 5) foreach ((x)=>println(x))

However, in Scala only a "top level" expression can be a function. In this case, println(_) is only a "term" in an expression (foreach is the operator) but not a top level expression, so the compiler won't try to make it an anonymous function. Instead, it will search further to find the first enclosing top level expression (in this case, the whole expression you entered) and turn it into an anonymous function:


(x)=>List(2, 3, 5) foreach println(x)

But then the type of x can't be inferred, so it is an error. Also, println(x) returns a value of (), the only value of the class Unit, which is not what foreach wants anyway (a function taking an Int).
With this knowledge, you can see why the third expression works:


List(2, 3, 5) foreach (println(_)) //works

This is because with the parentheses, a top level expression is expected, so Scala will make println(_) an anonymous function:


List(2, 3, 5) foreach ((x)=>println(x)) //works

Why do some of the expressions below work but the others don't?


List(2, 3, 5) foreach (println("hi"+_))  //doesn't work
List(2, 3, 5) foreach (println "hi"+_)  //doesn't work
List(2, 3, 5) foreach (Predef println "hi"+_)  //works

The first expression doesn't work because the first top level expression found is "hi"+_ due to the parentheses. So, Scala will treat it as:


List(2, 3, 5) foreach (println((x)=>"hi"+x))  //doesn't work

So you're printing a function to the console and returning a unit value () to foreach. In order to fix the problem, you may try to get rid of the parentheses so "hi"+_ is no longer a top level expression:


List(2, 3, 5) foreach (println "hi"+_)

The problem is that Scala will now try to parse:


println "hi"+_

as:


expr1 op1 expr2 op2 ...

println is assumed to be an expression instead of a prefix operator because only !, ~, +, - can be prefix operators. So, println will be treated as an expression while the String "hi" will be treated as an operator, which is obviously incorrect.
To fix this problem, you can provide a real object as expr1, which is the Predef object:


List(2, 3, 5) foreach (Predef println "hi"+_)

Note that there are two operators: println and +. Because all symbolic operators have higher precedence than identifier operators, + will be applied first.
So, the first enclosing top level expression is turned into an anonymous function:


List(2, 3, 5) foreach ((x)=>Predef println "hi"+x)

Because in Scala "e1 op e2" is treated as e1.op(e2), the code is treated as:


List(2, 3, 5) foreach ((x)=>Predef.println("hi".+(x)))
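Both rules, the scope of the underscore and the precedence of symbolic over identifier operators, can be checked without printing anything. In the sketch below the object name, the helper show and the value names are made up for illustration; mkString is the standard collection method.

```scala
object PlaceholderAndPrecedence {
  val xs = List(2, 3, 5)

  // the underscore turns the enclosing argument expression into a function
  val viaPlaceholder: List[String] = xs.map("hi" + _)
  val viaLambda: List[String] = xs.map(x => "hi" + x)

  // hypothetical helper that expects a function and applies it to 7
  def show(g: Int => String): String = g(7)

  // inside the call parentheses, "hi" + _ alone becomes the function
  val nested: String = show("hi" + _)          // same as show(x => "hi" + x)

  // + binds tighter than the identifier operator mkString
  val joined: String = xs mkString "-" + "!"   // same as xs.mkString("-" + "!")
}
```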

Friday, April 22, 2011

Great way to learn the Scala API: Scala interpreter

If you have already learned the language constructs of Scala, the next step is to learn its API. If you're a Java programmer, usually you'll try to do that by writing small Scala programs in an IDE. However, there is a much better way: using the Scala interactive interpreter. This way you can inspect the effect of each line of code as soon as you press Enter. This is quite non-obvious for Java programmers because there is no such thing in the Java tool set.
For example, you'd like to learn about the Seq trait in Scala. So you open its API page and find a long list of methods. Let's say you're interested in learning the behavior of the following method:


/** Appends all elements of this sequence to a string builder. The written text consists of the string representations (w.r.t. the method toString) of all elements of this sequence without any separator string. */
def addString(b: StringBuilder): StringBuilder

To do that, you issue the "scala" command from a shell/command prompt and then explore the method like:


kent@dragon:~$ scala
Welcome to Scala version 2.8.0.r20327-b20091231020112 (Java HotSpot(TM) Server VM, Java 1.6.0_20).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val a = List("a", "b", "c")
a: List[java.lang.String] = List(a, b, c)

scala> val b = new StringBuilder
b: StringBuilder = 

scala> b.append("hi")
res0: StringBuilder = hi

scala> a.addString(b)
res1: StringBuilder = hiabc

From the experiment, it is clear that the addString() method will append all the elements of the Seq to the string builder.
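The same kind of experiment works for the overloads of addString() that take a separator, or a start, separator and end. A short sketch (the object and method names are made up for illustration):

```scala
object AddStringDemo {
  val a = List("a", "b", "c")

  // no separator: elements are appended directly after the existing content
  def plain: String = a.addString(new StringBuilder("hi")).toString

  // with a separator between the elements
  def withSep: String = a.addString(new StringBuilder, ", ").toString

  // with a start, a separator and an end
  def wrapped: String = a.addString(new StringBuilder, "[", ", ", "]").toString
}
```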
Let's consider another. You're interested in:


/** Multiplies up the elements of this collection. num is an implicit parameter defining a set of numeric operations which includes the * operator to be used in forming the product.
*/
def product[B >: A](implicit num: Numeric[B]): B

So, try it in the Scala interpreter:


scala> val a = List(3, 5, 2)
a: List[Int] = List(3, 5, 2)

scala> a.product
res2: Int = 30

So it seems to work fine. Is it possible to override the * operator? From the API doc it is clear that the default being used is the IntIsIntegral object:


package scala.math

object Numeric {
  object IntIsIntegral extends IntIsIntegral with IntOrdering {
     ...
  }
}

So, the first try is:


scala> import scala.math.Numeric._              
import scala.math.Numeric._

scala> val x = new IntIsIntegral {              
     | override def times(x: Int, y: Int) = x+y;
     | }
<console>:9: error: object creation impossible, since method compare in trait Ordering of type (x: Int,y: Int)Int is not defined

Oops, forgot to specify the IntOrdering trait which defines the compare() method. So, do it now:


scala> import scala.math.Ordering._
import scala.math.Ordering._

scala> val x = new IntIsIntegral with IntOrdering {
     | override def times(x: Int, y: Int) = x+y;
     | }
x: java.lang.Object with math.Numeric.IntIsIntegral with math.Ordering.IntOrdering = $anon$1@1e5d007

We have successfully created a new object to redefine the * operator as just addition. Now, pass it to the product() method:


scala> a
res5: List[Int] = List(3, 5, 2)

scala> a.product(x)
res6: Int = 11

It should be 3+5+2=10, so why is the result 11? Recall that it is doing multiplication, so it is using 1 as the seed for the calculation (1+3+5+2). To change the seed from 1 to, say, 0, we can override the one() method:


scala> val x = new IntIsIntegral with IntOrdering {
     | override def times(x: Int, y: Int) = x+y;
     | override def one = 0;
     | }
x: java.lang.Object with math.Numeric.IntIsIntegral with math.Ordering.IntOrdering = $anon$1@11a9f20

scala> a.product(x)
res7: Int = 10

Obviously now it works.
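The effect of the seed can be reproduced with an explicit fold. This is only a sketch under the assumption that product() behaves like a left fold starting from one, which matches the results in the session above; other library versions may compute it differently. The object and value names are made up.

```scala
object SeedDemo {
  val a = List(3, 5, 2)

  // ordinary product: seed 1, combine with *
  val realProduct: Int = a.foldLeft(1)(_ * _)

  // "times" replaced by +, but the seed is still 1
  val sumSeededWithOne: Int = a.foldLeft(1)(_ + _)

  // "times" replaced by + and the seed changed to 0
  val sumSeededWithZero: Int = a.foldLeft(0)(_ + _)
}
```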

Tuesday, April 5, 2011

Five signs that your talents are not being appreciated

Here are the five signs:

Late arrival to the office. Your boss is too blind to see your great contributions to the company, so you are so de-motivated that you come in to the office late every day.

Micro-management. Your boss is always trying to micro-manage you by telling you the "right" ways to do things such as agile methodologies, but actually you know those just aren't as good as your way because you've been in the trenches long before your boss.

Rejecting your good suggestions. You have been pushing a complete conversion to a cutting edge technology to double the productivity of the team, but your boss just won't listen.

Incompetent peers. Your peers are so incompetent and have been hampering the product launch. Even though you have been telling them their problems, they just don't get it.

Being blamed. Your peers are so jealous of your great abilities that they try to isolate you and blame you for everything that has gone wrong.

These signs seemingly indicate that people aren't appreciating your talents, but the fact is you may be just living in your own little world and will probably be fired in a few months! What you may not understand right now is:

Late arrival to the office (Lack of commitment). Even if you aren't happy with your job or the way you're treated, you should still demonstrate your commitment to your duties. Arriving late is an obvious way to say that you have no commitment.

Micro-management (Not accepting advice for improvement). Seeing the difficulties you face, your boss is trying to help you by giving you good advice. But you are so attached to your ego and too stubborn to see any value in any new approaches.

Rejecting your good suggestions (Not understanding the priorities of the company). Your suggested technology may seem great to you, but actually it may be just too premature or not among the top priorities of the company. Everyone is trying to tell you that but you just won't listen.

Incompetent peers (Destroying harmony). You may be good or not, but it is never a good idea to pick on the mistakes of others. You will appear like an asshole to all your peers. Instead, you should help others improve.

Being blamed (Not seeing your own mistakes). Be brave and admit it, those mistakes were yours. If you don't see your own mistakes, you'll never grow.

Fortunately, it is not too late to realize your own mistakes. Stop denying and you'll have a much brighter future.
ps, if you’d like to learn more about IT management and governance, check out IT Governance by Examples.

Sunday, March 27, 2011

A story for development outsourcing clients

Once upon a time there was a client who had contracted a carpenter to make a chair for him. In order to protect his own interests, he spent a lot of time specifying the dimensions of the chair, the material, etc. Then he negotiated with the carpenter to settle on the cost and duration. In addition, he set up a monetary reward for early delivery and a daily penalty for each day late.
Then there was another client who spent a lot of time looking for and selecting a carpenter with a good reputation for excellent customer satisfaction. He only told the carpenter that he needed a chair that was comfortable to sit on and then drafted a sketch of the chair. Then he negotiated with the carpenter to settle on the cost and duration. There was no reward for early delivery and no penalty for late delivery. They agreed to discuss adjusting the cost and duration if the need arose.
Then what was the result? The first client got the chair earlier than the deadline and the carpenter got the reward. However, when the client sat on the chair, it was not that comfortable. Because the carpenter rushed to finish it, the craftsmanship was poor and as a result, the chair broke within a year. Then, it was found that low-grade wood had been used in the inner, hidden part of the chair.
In contrast, the other client found during the making that the chair was not that comfortable. So he discussed the necessary change with the carpenter. The cost increased a little and the duration was a little longer, but the carpenter had enough time to make a quality chair that he was proud of, which finally lasted for 10 years.

The moral of the story

The moral of the story is that in any project there are four factors to be considered: cost, time, scope and quality. If any of those changes, at least one of the others must change accordingly. The problem is that most people try to fix/specify all of them during planning, but in reality only cost and time can be easily fixed, while it is very difficult to fix/specify the scope and quality (they are fuzzy and dynamic). So, from the view of the contractor, to meet the fixed cost and time, he has every incentive to minimize the scope (only make what is explicitly told and nothing else) and reduce the quality (uncomfortable to sit on, poor craftsmanship, low-grade wood).
The reward and the penalty make the matter worse. They double the importance of meeting the time factor, so they double the incentive for the contractor to minimize the scope and reduce the quality.
To solve this problem, many people try to make the scope and quality even better specified. However, this effort is futile as scope and quality are related to human needs, which are fuzzy and dynamic. In the example of the chair, what is the quality? Perhaps that it should be comfortable to sit on, that it should last long, etc. But these are neither objective (what is comfortable?) nor immediately measurable (how long will it last?), so they can't be used to bind the behavior of the contractor.
Therefore, there must be a second level of defense to deal with this fuzziness and dynamism. In addition to specifying the scope and quality as best we can, we must also identify a contractor who really cares about quality. The idea is that he will work for quality on a higher level (what we really care about) instead of focusing on meeting lower-level but objective metrics which do not necessarily fully reflect quality. That is the best insurance that we can get. Of course, to allow him to work for the fuzzy and dynamic scope and quality, we must be prepared to renegotiate with him to adjust the cost and duration as necessary.