0%

Ideas of Week5

Type Inference

SML has a static type system, i.e., it determines all the types for its bindings before ever executing the program. But unlike most statically typed programming languages, like C, Java, SML is implicitly typed. We don’t write down the type of our bindings ever since Week3. Inside the SML is an algorithm called type inference that figures out the type of the bindings for us. It may sound magic for the first time, but in this week, Dan explained this mechanics for us.

The central idea of this algorithm is do assumptions and deductions as we go. It exams the bindings from top to bottom and every time a new thing comes up it first assumes that it is a new type and then infers its type by collecting facts associated with it. Here is an example:

1
2
3
4
fun f x =
let val (y, z) = x in
(abs y) + z
end

The whole inference process goes like this:

  1. Looking at the first line, this is a function binding. So f has type T1->T2 for being a function. And x has type T1 for being the only parameter of f.
  2. Looking at the second line, x must have type T3*T4 for the pattern matching to type check. Also, y has type T3 and z has type T4.
  3. Examing the third line. We know from the standard library abs has type int->int. So T3=int. z must also have type int because only int can be added to int. So T4=int.
  4. Looking at the last line, the whole let expression ends with an addition. So the type of the let expression is int. Thus the type of the returned value of f is int.
  5. Combining all the facts together and concluding that f has type int*int->int.

Now type inference may seem more easy and simple. But to have type inference in a programming language requires delicate design. SML is well designed to have a good type inference algorithm. Some of the programming languages does not even have a feasible type inference algorithm. A feasible type inference algorithm must do the following three things:

  • Always terminate
  • Accept all the programs that type check and give the right type
  • Reject all the programs that does not type check

The question of how difficult the type inference algorithm is does not have a clear relation with how permissive the type system of the language is. For example, the type system that accepts everything and the type system that rejects everything are both easy to type check and type infer. Actually, to make the type inference in SML more difficult, we only need small modification to the type system of SML.

  1. If we admit subtyping(e.g., if every triple can also be a pair), the type inference will be much more difficult because we cannot reach the conclusion x has type T1*T2 with the pattern matching (y, z) = x. x may have two, three or more fields.
  2. If we disallow polymophic types, the type inference will also be more difficult because the type inference algorithm has to choose one concrete type which depends on the use of the binding.
    阅读全文 »

Ideas of Week4

Unnecessary Function Wrapping

Unnecessary function wrapping is a low-level mistake. In some languages like C/C++, Java, this is easy to find. But in SML, this becomes a little harder due to the existence of anonymous functions:

1
fn x => tl x

The code above is a typical unnecessary function wrapping. It is hard to find out because SML makes the calling process less observable. And unnecessary function wrapping is easy to confuse with function reusing( though I make up this term, its meaning is straightforward ).

1
2
3
int abs( int a ) {
diff( a, 0 );
}

The code above is function reusing instead of unnecessary function wrapping. One way to distinguish between them, I believe, is to check the parameter list. Function reusing typically uses a different parameter list( like ( a, 0 ) and ( int a ) in the above code ).

Functional Programming

The term functional programming may refer to distinct concepts according to the context. But two most common and important are:

  1. immutable data
  2. treat functions like values

And there are some more terms that relate to the second point–first-class functions, higher-order functions and function closures. These terms are often used unprecisely or interchangably, but in this course they have the following meaning respectively:

  • First-class functions: Those functions which can be passed into or returned from other functions, computed, stored, etc. like values.
  • Higher-order functions: From the name we can tell that it is sort of opposite to first-class functions. This term refers to those functions that take or return other functions
  • Function closures: This term refers to those functions that use variables defined outside them. Another two closely-related terms are Lexical Scope and Dynamic Scope.

Lexical Scope vs. Dynamic Scope

Before we tell the difference between these two scopes, we need to clarify some points about functions. A function actually contains two parts–the code and the environment. The code part is easy to understand. But where does the environment part come from?

Consider the code part of functions. They are nothing more than just a bunch of symbols. And among the symbols there are arguments, local variables( defined by let in end in SML ) and variables defined outside the function body. The arguments, which are passed to the functions when calling them, have clear meaning–just look them up when calling them. The local variables, defined inside the function body, also have clear meaning because when we define it in the function body we must be clear enough to pass the compiler. So the code will reveal the meaning of any local variables.

But here comes the problem–the last one, variables defined outside the function body. Looking at the code tells us nothing about this kind of variables. No definition in the body. No trace of them when calling the function. For example:

1
2
3
4
5
6
7
8
fun f( a, b, c ) =
let
q = a + b
in
c > q andalso a > d andalso b > e
end

val x = f( 1, 2, 3 )

Now the meaning of a, b, c are clear–they are arguments, by looking them up in the call-site we know a = 1, b = 2, c = 3. The meaning of q is also clear–it is a local variable, by looking it up in the code we know q = a + b. Combined with the infomation of arguments we know q = 3. But what are d and e? The code makes no contribution here. And that’s where the environment part takes control. The environment contains all the definitions of variables defined outside the functions.

Now here comes the combat of lexical scope and dynamic scope. Lexical scope insists that the environment of a function should be the environment where the function get defined while the dynamic scope argues the environment of a function should be that where the function get called. For example:

1
2
3
4
5
6
7
8
9
10
fun f g =
let
val x = 3
in
g 2
end

val x = 2
fun h y = x + y
val z = f h

In the code above, if we take the lexical scope, functionh always increment its argument by 2. And z=4. If we take the dynamic scope, function h increments its argument by 3 when called inside f because in f, x has a new definition 3. And z=5.

After decades of trials and errors, lexical scope stands out and dynamic scope are now considered inferior style. The reason, I believe, is that lexical scope locks the variables in the function definition and protects them from the changing current environment. So in lexical scope, once a function is defined, the function of it is settled down. In the subsequent calling we know its effect by just seeing its name. No need to combine the current environment to get the effect clear.

Lexical scope is widely used and many programming idioms are bound with it. Dan presents us some in Week4.

阅读全文 »

Ideas of Week3

Conceptual ways to build compound type

In week2, I have seen many ways to build compound types in SML—lists, options and tuples. And in other language I have learned, like C, there are structs, unions, etc. But in Week3, Dan extract the core ideas behind building compound types and categorize them into three different groups:

  • “One of”, the compound type descripts data that contains value which is one of the basic types that consist of this compound type. Typically, we use “or” in the description, like an int option contains an int or nothing.

  • “Each of”, the compound type descripts data which contains value of each of the basic types that consist of the compound type. Typically, we use “and” in the description, like, a (int * bool) tuple contains an int value and a bool value.

  • “Self-Reference”, the compound type refers to itself in the definition to define a recursive data, like an int list is empty list or an int and another int list(Note that list actually contains all the concepts mentions above—one-of, each-of and self-reference.)

Naturally, we can have any of them and nest them as deep as we want.

阅读全文 »

History Between PL and Me

PL will stand for the open course Programming Language produced by Dan Grossman on Coursera platform in my posts.

Actually, the first time I met this course was the winter vacation of 2017. I heard about this course on Zhihu and got interested because the people there described how amazing and magic this course can be. But the reason I did not formally take the course until now is that I have tons of bad habits, delaying anything that involves thinking included.

The first time I (informally) took this course was the summer vacation of 2018. I had an impression that this course is a little bit hard-core due to the amount and level of the material. But this impression cannot overshadow the value of the course. So now I am back to take the course formally.

Structrue of My Notes on PL

The course, as Dan stated, aims at teaching core ideas and concepts behind all the programming languages with several specific programming languages being the vehicle. So contents of every week can be divided into two parts—the contents specific to one selected language and the general idea behind all of them. So will my notes. But typically, I will leave out the language-features part and only present the ideas, concepts or philosophy part.

阅读全文 »

Finally, I have established this site as my personal blog!

In the past one year, I have tried on many sites and by many means. Originally, I viewed this(building a personal blog) as both a way to share knowledge and a way to learn more about frond-end and back-end development. Consequently, I wanted to write all the code(JS, CSS, HTML) myself. But now I see that it is impossible because I have to spend most of my time finishing my hws and exams in NJU. So this is just a place for me to share knowledge with others.

There will be at least one post every week. The posts may be in English or Chinese, and some may even include both of them, depending on the material I have and the mood I am in.

Hope that you will all be happy and healthy!