Introduction
This is a quick introduction to Kitten. It glosses over some nuances of the language in order to provide a fast, high-level overview so you can quickly get started with using Kitten. I’m working on a more complete treatment of how to use Kitten in the form of a tutorial.
Comments
Comments are C-style. Multi-line comments are allowed to nest.
// Single-line comment.
/*
Multi-line
comment.
*/
/*
Nested
/* multi-line */
comment.
*/
Literals
Integers consist of decimal digits, or a base prefix and digits in the
corresponding base. They have type
Int32
by default; this can be
overridden with a suffix.
6 // Decimal
0b1010 // Binary
0o755 // Octal
0x10 // Hexadecimal
5i64 // Decimal Int64
0xFFu8 // Hexadecimal UInt8
Floating-point numbers consist of decimal digits, a decimal point
(.
), a fractional part, and an optional exponent. They have type
Float64
by default, and this can
also be overridden with a suffix.
6.28 // Regular notation
.5 // Leading digit omitted
1.0e+9 // Scientific notation
2.0f32 // Float32
Text literals use double-quotes
(""
), and are syntactic sugar
for lists of characters. They have type
List<Char>
.
Common character escapes are allowed using a backslash \
. Character
literals use single quotes (''
).
// []
""
// ['M', 'e', 'o', 'w', '!', '\n']
"Meow!\n"
Ordinary text literals can’t contain newlines; a multi-line text literal is called a paragraph and consists of text wrapped in three double-quote marks. Paragraphs may be indented, and the indentation is stripped from each line.
"""
Here’s a paragraph.
It contains multiple lines.
(Haiku optional.)
"""
"""
1 2 3 4 5
6 7 8 9 10
11 12
"""
// Equivalent to:
"Here’s a paragraph.\nIt contains multiple lines.\n(Haiku optional.)"
"1 2 3 4 5\n6 7 8 9 10\n\n11 12"
Lists
Lists are surrounded by square brackets ([]
) with comma-separated
elements of the same type. They have type
List<T>
where T
is the type of the
elements. A trailing comma is allowed after the last element.
// <T> List<T>
[]
// List<Int32>
[1, 2, 3]
// List<List<Char>>
[
"four",
"five",
"six",
]
Note that the type of []
is
<T> List<T>
:
an empty list may be used where a list of any type is expected.
Expressions and Stack-based Evaluation
Kitten is an expression language; there are no statements, only expressions that compute a result. Expressions may use infix operators with different precedences, as in many other languages. Kitten operators are not built into the language—they are ordinary functions with symbolic names.
// (2 * 2) + (3 * 3)
2 * 2 + 3 * 3
The biggest syntactic difference between Kitten and C-like languages is that
function calls are postfix: the function name is written after
the arguments. Instead of f(x, y)
, we write x y f
. You
can think of this like a generalization of the method-call syntax
x.f()
in many object-oriented languages, and it easily permits
chaining multiple functions together in a “pipeline”.
"Meow!\n" print
// Prints "3-" followed by a newline.
-3 abs neg show reverse say
You can call an operator in postfix form by wrapping it in parentheses. You can
also partially apply one operand:
(/ 3)
(“divide by three”) is
shorthand for
-> x; x / 3
; likewise,
(1 -)
(“one minus”) is short for
-> x; 1 - x
.
2 2 (*) 3 3 (*) (+)
// (x * 2)
x (* 2)
// (2 - x)
x (2 -)
Here we start to see Kitten’s stack-based evaluation model. When computing
2 2 (*)
,
we push 2
to the
data stack, push another
2
, then call *
to
multiply them, leaving the result on the stack. So one way to read postfix code
is as a sequence of imperative commands operating on the stack.
2 // push 2 => 2
2 // push 2 => 2 2
(*) // multiply => 4
3 // push 3 => 4 3
3 // push 3 => 4 3 3
(*) // multiply => 4 9
(+) // add => 13
However, this is the low-level way to think about Kitten programs. At a high level, postfix syntax can be thought of as representing a data flow graph.
2 | 2 | 3 | 3 |
↓ 2 | ↓ 2 | ↓ 3 | ↓ 3 |
* | * | ||
↓ 4 | ↓ 9 | ||
+ | |||
↓ 13 |
Locals
Local variables can be introduced with the -> name;
syntax
(or Unicode → name;
). This takes a value from the stack and
stores it in a local variable for the remainder of scope. Mentioning a local
has the effect of pushing its value to the stack.
// 25
5 -> x; x * x
We can also introduce multiple variables at once; they are taken from the stack in the order they were pushed.
1 2 3 -> x, y, z;
x say // 1
y say // 2
z say // 3
Local variables are always immutable. Pushing a local to the stack once simply
moves its value (as if by memcpy
); pushing the variable
multiple times copies its value, as long as the type is copyable.
Conditionals and Blocks
Conditional execution uses the
if
…elif
…else
expression. Blocks can be surrounded by curly braces {}
like in C,
or prefixed with a colon :
and delimited by indentation like in
Python. The :
form is called “layout” and desugars into the
{}
form.
if (food_dish is_empty) {
":<" say
beg_for_food
} elif (mouse_nearby) {
eat_mouse
} else {
purr
}
if (food_dish is_empty):
":<" say
beg_for_food
elif (mouse_nearby):
eat_mouse
else:
purr
Layout syntax is generally preferred for multi-line blocks.
You can omit the condition part of an if
expression to take it from
the stack. For example, here’s how the standard not
word is defined.
// Start with a local variable…
-> x; if (x) { false } else { true }
// …take it from the stack…
-> x; x if { false } else { true }
// …and remove the redundancy.
if { false } else { true }
An if
without an
else
is equivalent to an
if
with an empty
else
block. In other words, the
else
branch is the identity
function.
if (have_food) { purr }
if (have_food) { purr } else {}
Algebraic Data Types
Kitten has algebraic data types (ADTs) defined with the
type
keyword. A type may have any number of constructors,
introduced with the case
keyword, and each constructor may have
any number of fields, with the field types given in parentheses.
// Enumeration
// Multiple constructors with no fields
type Bool:
case false
case true
// Enumeration
// Explicitly indicating no fields
type Bool:
case false ()
case true ()
// Structure
// One constructor with multiple fields
type Pair<A, B>:
case pair (A, B)
// A pair of type Pair<Float64, Char>
2.0 'x' pair
// Tagged Union
// Multiple constructors
// with different numbers of fields
type Optional<T>:
case none
case some (T)
// An empty Optional
// of type <T> Optional<T>
none
// A full Optional
// of type Optional<Int32>
5 some
A match
expression takes an
instance of an ADT, matches on its tag, and unpacks its fields (if any) onto
the stack so they can be manipulated. You can think of
match
as the inverse
of a constructor: whereas pair
has the type
A, B -> Pair<A, B>
, the expression
-> p; match (p) case pair {}
has the type Pair<A, B> -> A, B
—the inputs and outputs
are swapped.
match (xs head)
case some:
-> x;
"Non-empty list; first value: " print
x say
case none:
"Empty list." say
The value being pattern-matched (x
in match (x)
) is
called the scrutinee. Similarly to
if
, you can omit this
from a match
expression, in
which case the value is taken from the stack. That is,
match (x) case /* ... */
is equivalent to x match case /* ... */
.
This code constructs a pair, deconstructs it, swaps its elements, and makes a new swapped pair.
// (1, 2)
1 2 pair
// (2, 1)
match case pair { swap pair }
A match
expression may include
an else
branch, which
is executed if none of the cases matched. If a
match
expression
does not include a case
for
every constructor of a data type, and has no
else
branch, it’s called
non-exhaustive. In that case, the compiler produces an implicit
else
branch, which calls
fail
to abort the program with an error message. This also causes
the match
expression to require
the +Fail
permission, which will
be explained later on.
Function Definitions
You can define new functions with the
define
keyword. Every definition
has a name, a type signature, and a body. Type signatures are written in
parentheses, with the input and output types to the left and right,
respectively, of an ASCII rightwards arrow ->
or its Unicode
equivalent →
(U+2192).
This definition squares a number by copying it and multiplying it with itself.
define square (Int32 -> Int32):
dup (*)
A user-defined function, just like any other function, is called when we mention its name.
// 16
4 square
To push a function to the stack instead of calling it, wrap it in a block. This
denotes an anonymous function, called a quotation. Quotations are often
used with higher-order functions which take other functions as
arguments, such as map
, filter
,
fold_left
, and zip_with
.
// [1, 4, 9]
[1, 2, 3] { square } map
// [10, 20, 30]
[1, 2, 3] { (* 10) } map
The backslash syntax \f
is a slightly more succinct equivalent,
with the advantage that an operator can be referred to as simply
\+
instead of { (+) }
. Mnemonically, this is
“escaping” the function to prevent it from being evaluated. This also works with
partially applied operators.
// [1, 4, 9]
[1, 2, 3] \square map
// [2, 3, 4]
[1, 2, 3] \(+ 1) map
// [4, 10, 18]
[1, 2, 3] [4, 5, 6] \* zip_with
Lambdas
Whereas Kitten decouples the concepts of local variable bindings
(->
…;
…) and anonymous functions
({
…}
) by default, in practice we often want to couple these
things together. For example, in a match
expression, we often want
to name the values we extracted, if present. We could use a local
variable binding inside a block:
match (xs head)
case some:
-> x;
x say
else:
"nothing" say
But we can also use the combined “lambda” form,
->
…{
…}
,
which is compatible with the usual layout syntax for blocks:
match (xs head)
case some -> x {
x say
} else {
"nothing" say
}
match (xs head)
case some -> x:
x say
else:
"nothing" say
-> x y z { x + y + z }
is exactly equivalent to
{ -> x y z; x + y + z }
. Note that this also works on
definitions.
define square
(Int32 -> Int32)
-> x:
x * x
Multiple Inputs and Outputs
As in most languages, functions can take multiple inputs. In the type signature, their types are separated by commas.
define multiply3
(Float64, Float64, Float64 -> Float64):
-> x, y, z;
x * y * z
Notice what happens when we write this function in postfix:
define multiply3
(Float64, Float64, Float64 -> Float64):
-> x, y, z;
x y z (*) (*)
We’re pulling x
, y
, and z
off the stack
and pushing them right back! Of course, the compiler can trivially optimize this
away, but it’s redundant. This is a good place to apply Kitten’s common practice
of chaining functions without using local variables. We can just pass
arguments and return values implicitly on the stack.
define multiply3
(Float64, Float64, Float64 -> Float64):
(*) (*)
Unlike most languages, Kitten also allows functions to have multiple results, which is very useful for writing in a dataflow style.
define increment_both
(Int32, Int32 -> Int32, Int32):
-> x, y;
(x + 1) (y + 1)
define increment_both
(Int32, Int32 -> Int32, Int32):
\(+ 1) to_both
to_both
is a handy combinator: it takes one function and two
values, and applies the function to both of the values.
Polymorphism
Type signatures often include type variables, which enable polymorphism. For example, here is a simple recursive function to reverse a list:
define reverse<T>
(List<T> -> List<T>):
-> xs;
match (xs head_tail)
case some:
unpair -> h, t;
t reverse h append
case none:
[]
The signature reads:
// For any type 'T'…
<T>
(
// …given a list of values of that type…
List<T>
// …this function produces…
->
// …another list of values of that type.
List<T>
)
Functions may be generic in any number of types. For example, the type of the
map
function can be written as
<A, B> (List<A>, (A -> B) -> List<B>)
.
It takes a list of A
values, and a function from A
to
B
, and returns a list of B
values. A
and
B
may refer to the same type or different types.
// Same type (Int32 -> Int32)
// [1, 2, 3, 4]
[3, 6, 9, 12] { (/ 3) } map
// Different types (Int32 -> List<Char>)
// ["3", "6", "9", "12"]
[3, 6, 9, 12] \show map
Permissions
We’ve seen functions such as say
that have side
effects. Kitten’s type system includes a concept of permissions,
which track the side effects that a function is allowed to perform.
say
, print
, newline
, and other such
output functions require the +IO
permission because they perform
I/O, and this is reflected in their types.
// Takes no inputs,
// produces no outputs,
// and has permission to do I/O.
define newline (-> +IO):
"\n" print
Some of the standard permissions in Kitten are:
+IO
- Permission to do I/O, or more generally, break referential transparency. A function requiring
+IO
is allowed to produce side effects and return different values each time it’s called on the same inputs, e.g.,"Enter a number: " ask "Enter a number: " ask
returns whatever the user entered each time, despite being called with the same argument. +Fail
- Permission to abort the program with an assertion failure. This is required by the
fail
andabort
words. It also appears in the implicitelse
branch ofmatch
expressions that don’t include acase
for all constructors and don’t have an explicitelse
. +Unsafe
- Permission to break memory safety. This can be used for low-level programming in Kitten, such as accessing the memory representations of Kitten values or working with raw pointers.