Kitten

Introduction

This is a quick introduction to Kitten. It glosses over some nuances of using the language in order to provide a fast, high-level overview. For a more complete treatment of how to use Kitten, see the tutorial.

Comments

Comments are C-style. Multi-line comments are allowed to nest.

// Single-line comment.

/*
  Multi-line
  comment.
*/

/*
  Nested

  /* multi-line */

  comment.
*/

Literals

Integers consist of decimal digits, or a base prefix and digits in the corresponding base. They have type int.

6       // decimal
0b1010  // binary
0o755   // octal
0x10    // hexadecimal

Floating-point numbers consist of decimal digits, a decimal point ., and a fractional part. They have type float.

6.28

String literals use double-quotes, and are syntactic sugar for vectors of characters. They have type [char]. Common string escapes are allowed using a backslash \.

""
"Meow!\n"

Containers

Vectors are surrounded by square brackets [] with comma-separated elements of the same type. They have type [a] where a is the type of the elements. A trailing comma is allowed.

[]

[1, 2, 3]  // [int]

// [[char]]
[
  "four",
  "five",
  "six",
]

Tuples are similar to vectors, using parentheses () instead. Tuples may contain elements of different types. A one-element tuple is just the value itself.

(1, "two", 3.0)  // (int & [char] & float)
(21, 31)         // (int & int)
("silly",)       // [char]

Expressions and Stack-based Evaluation

Kitten is an expression language; there are no statements, only expressions that compute a result. Expressions may use infix operators with different precedences, as in many other languages. Kitten operators are not built into the language—they are ordinary functions with symbolic names.

2 * 2 + 3 * 3  // (2 * 2) + (3 * 3)

The biggest syntactic difference between Kitten and C-like languages is that function calls are postfix: the function is named after the arguments. Instead of f(x, y), we write x y f. We can think of this like a generalization of the method-call syntax x.f() in many object-oriented languages.

"Meow!\n" print

Operators can also be called in postfix form by wrapping them in parentheses.

2 2 (*) 3 3 (*) (+)

Here we start to see Kitten’s stack-based evaluation model. When computing 2 2 (*), we push 2 to the data stack, push another 2, then call * to multiply them, leaving the result on the stack. So one way to read postfix code is as a sequence of imperative commands operating on the stack.

2    // push 2   => 2
2    // push 2   => 2 2
(*)  // multiply => 4
3    // push 3   => 4 3
3    // push 3   => 4 3 3
(*)  // multiply => 4 9
(+)  // add      => 13

However, this is the low-level way to think about Kitten programs. At a high level, postfix syntax can be thought of as representing a data flow graph.

2233
2233
**
49
+
13

Conditionals and Blocks

Conditional execution uses the if (…) {…} else {…} function, which is mixfix for the sake of similarity with C-like languages. Blocks can be surrounded by curly braces {} like in C, or prefixed with a colon : and delimited by indentation like in Python. The : form desugars into the {} form.

if (foodDish isEmpty) {
  ":(" say
  begForFood
} else {
  purr
}

if (foodDish isEmpty):
  ":(" say
  begForFood
else:
  purr

Built-in Data Types

In addition to int, float, char, vectors, and tuples, Kitten has two built-in algebraic data types: option ? and choice |.

A value of type a? (“optional a”) is either empty with a value of none or full with a value of x some, where x has type a.

none    // a?
5 some  // int?

An instance of a | b (“either a or b”) is either x left or y right, where x has type a and y has type b.

// int | char
if (this = that):
  1 left
else:
  '!' right

The option (…) {…} else {…} function matches on options, and the choice (…) {…} else {…} function matches on choices.

option (xs head):
  sayInt
else:
  "empty vector" say

choice (x):
  sayInt
else:
  sayBool

Function Definitions

New functions are defined with the define keyword. They have a name, a type signature, and a body. Type signatures are written in parentheses; function types use an ASCII rightwards arrow -> or its Unicode equivalent (U+2192). If no signature is specified, the type can be inferred.

define square (int -> int):
  dup (*)

A function is called when we mention its name.

4 square  // 16

To push a function to the stack instead, wrap it in a block.

[1, 2, 3] {square} map

The backslash syntax \f is a slightly more succinct equivalent, with the advantage that an operator can be referred to as simply \+ instead of {(+)}. Mnemonically, this is “escaping” the function to prevent it from being evaluated.

[1, 2, 3] \square map           // [1, 4, 9]
[1, 2, 3] [4, 5, 6] \* zipWith  // [4, 10, 18]

Locals

Local variables can be introduced with the -> name; syntax (also → name;). This takes a value from the stack and stores it in a local variable for the remainder of scope. Mentioning a local has the effect of pushing its value to the stack.

define square:
  -> x;
  x * x

We can also introduce multiple variables at once; they are taken from the stack in the order they were pushed.

1 2 3 -> x y z;
x sayInt  // 1
y sayInt  // 2
z sayInt  // 3

Local variables are always immutable. Pushing a local to the stack produces a copy of its value; as an optimization, non-primitive values are copied lazily.

Whereas Kitten decouples the concepts of local variable bindings (-> …;) and anonymous functions ({…}), in practice we often want to couple these things together. For example, in the option (…) {…} else {…} function, we often want to name the value we extracted, if one was present. We can use a local variable binding inside a block:

option (xs head):
  -> x;
  x sayInt
else:
  "nothing" say

Or we can use the combined form, -> … { … }, which is compatible with the usual layout rule:

option (xs head) -> x {
  x sayInt
} else {
  "nothing" say
}

option (xs head) -> x:
  x sayInt
else:
  "nothing" say

-> x y z { x + y + z } is exactly equivalent to { -> x y z; x + y + z }. Note that this also works on definitions.

define square (int -> int)
  -> x
{
  x * x
}

Multiple Inputs and Outputs

Functions can take multiple inputs. In the type signature, their types are separated by spaces. At present there is no overloading: the * operator is for int multiplication, and *. for float.

define multiply3 (float float float -> float):
  -> x y z;
  x *. y *. z

Notice what happens when we write this function in postfix:

define multiply3:
  -> x y z;
  x y z (*.) (*.)

We’re pulling x, y, and z off the stack and pushing them right back! This can of course be optimized away, but it’s redundant. This is a good place to apply Kitten’s common practice of chaining functions without using local variables. We can just pass arguments and return values implicitly on the stack.

define multiply3:
  (*.) (*.)

Functions may also have multiple return values. Wrapping in a tuple is not required, and it would simply interfere with the implicit dataflow style.

define incrementBoth (int int -> int int):
  -> x y;
  (x + 1) (y + 1)

The low-precedence semicolon ; operator may be used as an expression separator, to avoid many side-by-side parentheses (…) (…) as above.

define incrementBoth (int int -> int int):
  -> x y;
  x + 1; y + 1

Polymorphism

Type signatures often include type variables, which enable polymorphism. For example, here is a simple recursive function to reverse a vector:

define rev<a> ([a] -> [a]):
  -> xs;
  option (xs last) -> x:
    xs init rev
    x prepend
  else:
    []

The signature reads:

<a>  // For any type 'a'…
(
  [a]  // given a vector of values of that type…
  ->   // this function produces…
  [a]  // another vector of values of that type.
)

Functions may be generic in any number of types. For example, the type of the map function is <a, b> ([a] (a -> b) -> [b]). It takes a vector of a values, and a function from a to b, and returns a vector of b values. a and b may of course refer to the same type.

[3, 6, 9, 12] \showInt map  // ["3", "6", "9", "12"]
[3, 6, 9, 12] {(/ 3)} map   // [1, 2, 3, 4]

Notice the syntax (/ 3), which is shorthand for -> x; x / 3. This is called an operator section. Sections are infix operators applied to only one of their arguments, like (/ 3) (divide by three) or (1 +) (add one).

User-defined Data Types

You can also define your own data types with a data definition. There are two “simple” kinds of data types you may be familiar with from C. Enumerations:

data Bool:
  case True
  case False
  case FileNotFound

And structures:

data Point:
  case Point (float, float)

You can construct an instance of your data type by calling its constructor with all the fields as parameters.

1.5 2.5 Point -> p;

In order to use the values of these fields, you can deconstruct an instance of your data type with a match expression:

match (p):
  case Point -> x y:
    x sayFloat
    y sayFloat

A data type may include multiple constructors with fields; this is like a tagged union in C:

data IntOrString:
  case Nothing
  case Int (int)
  case String ([char])

When matching on a value of such a type, you simply include multiple case branches.

match (intOrString):
  case Nothing:
    "nothing" say
  case Int -> i:
    i sayInt
  case String:  // Names are never required!
    say

If you do not match all possible cases, pattern matching may fail at runtime. To change the default behavior of aborting the task, you can specify a default branch.

match (intOrString):
  case Int: sayInt
  default:
    "I only care about integers." say