A dependently-typed programming language with static memory management

栏目: IT技术 · 发布时间: 4年前

内容简介:Neut is a dependently-typed programming language based onTheoretically, this is made possible by translating the source language into a dependent variant ofPractically, this means that you can write your program in the ordinary lambda-calculus without any

Neut

Neut is a dependently-typed programming language based on the Calculus of Constructions (CoC) . The interesting point is that this language determines how to allocate/deallocate memory at compile-time, without extra annotations to the type system. In other words, Neut in its source language is the ordinary lambda-calculus after all (imagine something like Haskell, OCaml, Idris, Coq, Agda, or Lean), and at the same time it handles memory without using, for example:

  • explicit malloc/free,
  • garbage collection,
  • region-based memory management.

Theoretically, this is made possible by translating the source language into a dependent variant of Call-By-Push-Value . The crucial point of this translation is to see that knowing the type of a term in the target calculus is knowing how to eta-expand the term, which in turn means knowing how to copy/discard the term. In the translation, a type is not erased, but translated into a computationally-meaningful term that copies/discards the terms of the type .

Practically, this means that you can write your program in the ordinary lambda-calculus without any extra restrictions or annotations, and at the same time are allowed to control how resources are used in the program. If you want an ordinary-lambda-calculus based programming language with a human-predictable semantics (including its memory management), this might be for you.

Here I briefly summarize the basic properties of Neut before diving into the details:

  • A compiled language
  • The output is LLVM IR / assembly / executable binary (the last two are via clang)
  • The type system is CoC + fix + int + float + enum + array + struct - universe hierarchy
  • The type inference algorithm is based on the one of Lean
  • The evaluation strategy is call-by-value
  • Every tail call is optimized into a loop
  • Memory allocation/deallocation is statically determined at compile time

Table of Contents

    • Proof-Theoretic Memory Management
    • Manual Optimization via Borrowing
    • Automatic Optimization via malloc/free Cancellation
  • Highly-Accelerated Tutorial
  • Syntax and Semantics of Statements
  • Syntax and Semantics of Terms
    • LLVM IR for the First Example
    • LLVM IR for the Second Example

Examples

Proof-Theoretic Memory Management

Basics

Let’s see how Neut manages resources. The following code prints the string “a” for the 3 times:

; download the core library
(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

(include "core/0.1.0.0/core.neut")

(with identity.bind
  (let str "a")
  (let _ (string.print str))
  (let _ (string.print str))
  (string.print str))

The (with identity.bind (...)) is the same as the do-notation in Haskell or other languages, specialized to the identity monad.

By running $ neut build --no-alloc-cancellation --emit llvm filename.neut , we obtain schematically the following LLVM IR (the --no-alloc-cancellation is to disable certain optimization that we will see later):

declare void @free(i8*)
declare i8* @write(i8*, i8*, i8*)
declare i8* @malloc(i64)
define i64 @main() {
  ; Repeat the following for the 3 times:
  ;   <memory allocation for the string>
  ;   <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ;   <write the string into stdout>
  ;   <memory deallocation for the string>
}

The non-schematic, actual output can also be found in the. I chose not to write the IR here since the output is a bit long (59 lines including comments).

The resulting LLVM IR creates a string in memory, print it, and free it for the 3 times. The point here is that the resulting code creates 2 copies of the original string. This is because the variable str is used for the 3 times; The content of a variable is copied to create n instances when the variable is used for the n times (n > 1). If the variable isn’t used at all (n < 1), the content of the variable is discarded (deallocated). If the variable is used exactly once, or linearly (n = 1), the content of the variable is used without any discarding/copying operation.

By translating the source calculus in the way sketched above, every variable is ensured to be used linearly, except for the ones in the functions that realize those copying/discarding operations. Thus, by checking that those exponentializers use resources properly, we can ensure that the target calculus of this translation handles memory properly too. Then we check that those exponentializers are indeed sane - This is the basic storyline.

The way how a variable is copied/discarded is determined by the type of the variable. For example, if the type of a variable is the array type as in the example above, the copying operation is something that you would do in C to copy an array (memory allocation followed by value insertion). If the type is an immediate type like an integer type, the “copying” operation is the one that uses the original value for the cloned value. Every type is translated into a term that copies/discards the terms of the type, including the type of the types.

Types as Exponentializers

Let us go a little deeper. Consider the following schematic user input:

let str : string := "a";
{- A CODE THAT USES `str` for the 3 times -}

Intuitively, what the compiler does is to translate the term above into the following term:

let str : string := "a";
let (discard-string, copy-string) := TRANSLATE(string);
let (str1, tmp) := copy-string(str);
let (str2, str3) := copy-string(tmp);
{- THE CODE THAT USES `str1`, `str2`, AND `str3` LINEARLY -}

Note that the second line extracts discard-string and copy-string from the tuple TRANSLATE(string) . Every type X is translated in this manner; the pair of discard-X and copy-X . These exponentializers - something that allows us to create n copies of x from a single x - are used in its continuation so that every variable of this type ( str in this example) is used linearly.

Incidentally, in the actual implementation, the result of TRANSLATE(string) is not a pair, but a function as in the pseudo-code below:

cartesian-string i e :=
  if i == 0
  then discard-string(e)
  else copy-string(e)

which is used in the following manner:

let str : string := "a";
let cartesian-string := TRANSLATE(string);
let (str1, tmp) := cartesian-string(1, tmp);
let (str2, str3) := cartesian-string(1, tmp);
{- THE CODE THAT USES `str1`, `str2`, AND `str3` LINEARLY -}

This alternative translation frees us from having to create a tuple every time when we translate a type. Thus, in the actual implementation, every type is translated into a closed function, which is then lowered to a pointer (1 word).

It would be worth noting here that these functions like cartesian-string are β-reduced (inlined) aggressively; These are ordinary functions that can be defined in the target language, after all.

Notes on Closures

You may be wondering now: “How can we copy/discard a closure? In ordinary closure conversion, a lambda-abstraction is translated into a pair consists of (1) all the free variables in the abstraction, and (2) a pointer to an appropriately-arranged closed function. How can that tuple be copied/discarded just by using type information like i64 -> bool , which is seemingly useless here? How should we translate the type i64 -> bool ?”

That is a valid question. The key to give the answer to this question is generalizing the concept of “all the free variables”. Consider the following term:

λ (a : type). λ (x : a). λ (y : i64). (x, y)

In ordinary closure conversion, the free variables of λ (y : i64). (x, y) is calculated to be [x] without making a fuss. Here, however, we generalize the concept so that we “trace” all the free variables included in the type of every free variable. In this case, for example, note that the type of x is a , which is again a free variable if it occurs in λ (y : i64). (...) , and thus this a is also considered to be a free variable. Since the type of a is type , which has no free variables, our tracing stops here, resulting a chain of the free variables [a : type, x : a] .

Note that every result of this procedure is necessarily “closed”. That is, if the list [x1 : A1, ..., xn : An] is a result of this tracing process, then the set of free variables in A{i} is a subset of {x1, ..., x{i-1}} . In this way, we extract a closed chain from a lambda-abstraction.

Let us continue the example of λ (y : i64), (x, y) . Using the closed chain we have just calculated, the compiler translates this lambda-abstraction conceptually as follows:

(∑ [a : type, x : a], (a, x), LABEL_OF_A_CLOSED_FUNCTION)

That is, a lambda-abstraction is translated into a tuple consists of (0) the type of its closed chain , (1) its closed chain, and (2) a pointer to an appropriately-arranged closed function. Now, remember that every type is translated into a term that copies/discards the terms of the type. Also remember that every type is translated into a function pointer, which can be copied/discarded without any malloc/free operations. Thus, we can generate a function that copies/discards a closure conceptually as follows:

cartesian-closure i closure = do
  -- extract the element of the closure, freeing the outer 3-word tuple
  let (typeOfChain, chain, label) := closure;
  if i == 0
  then do
    -- discard the chain using the type information
    let _ := typeOfChain(0, chain);
    -- note that both typeOfChain and label are immediate
    return ()
  else do
    -- copy the chain using the type information
    let (chainA, chainB) := typeOfChain(1, chain);
    -- construct the 2 closures, and the wrapper tuple (thus do malloc for the 3 times)
    return ((typeOfChain, chainA, label), (typeOfChain, chainB, label))

Thus, we can translate every Π-type into the function cartesian-closure . Every Closure is copied/discarded in the same way, regardless of its actual details. So, information like i64 or bool in i64 -> bool is indeed useless here; It simply isn’t necessary since every closure knows how to copy/discard itself.

The remaining piece is how a type of a closed chain like ∑ [a : type, x : a] is translated. This is where eta-expansion plays its role. Suppose we have a term e of type ∑ [a : type, x : a] . Since we already know the type of e , we can eta-expand this term as follows:

let (a, x) := e in
(a, x)

Now, at this point, note that if we can copy both a : type and x : a , we can then copy e as follows:

let (a, x) := e in
let (a1, a2) := {COPY_a} a in
let (x1, x2) := {COPY_x} x in
((a1, x1), (a2, x2))

Thus, thanks to eta-expansion, the problem of copying/discarding the terms of type ∑ [a : type, x : a] is reduced into the one of copying/discarding the terms of type a : type and x : a .

The actual copying function is constructed inductively as follows. The starting point is the following term:

let (a, x) := e in
((a, x), (a, x))

Firstly we copy x - using its type a - so that x is used linearly:

let (a, x) := e in
let (x1, x2) := a(1, x) in
((a, x1), (a, x2))

This creates a term that uses x linearly. Then we copy the term a - using its type type - so that a is used linearly:

let (a, x) := e in
let (a1, tmp) := type(1, a) in
let (a2, a3) := type(1, tmp) in
let (x1, x2) := a1(1, x) in
((a2, x1), (a3, x2))

where the type is a term defined as follows:

type i a =
  if i == 0
  then ()     -- non-linear (affine) use of `a`
  else (a, a) -- non-linear (relevant) use of `a`

which is resource-safe since a type is translated into a function pointer. This creates a term that uses both a and x linearly.

Note that, by its construction, the first element of a closed chain doesn’t contain any free variables, and therefore can be copied without using any free variables.

In conclusion, the copying part of ∑ [a : type, x : a] is defined by the following term:

copy-sigma sig :=
  let (a, x) := sig in
  let (a1, tmp) := type(1, a) in
  let (a2, a3) := type(1, tmp) in
  let (x1, x2) := a1(1, x) in
  ((a2, x1), (a3, x2))

By using this term, we can now copy the tuple (a, x) in the closure (∑ [a : type, x : a], (a, x), LABEL) . The discarding function is defined similarly. That is, we change the starting point to

let (a, x) := e in
()

and do the same inductive procedure. Now we just have to construct the following term:

cartesian-sigma i sig :=
  if i == 0
  then discard-sigma sig
  else copy-sigma sig

and translate ∑ [a : type, x : a] into the cartesian-sigma above. In this way we can copy/discard a closure.

Notes on Polymorphic Functions

Or you may be wondering: “What if a function is polymorphic? If the size of an argument is not fixed, how can that function copy the term?”

That is again a valid question, and here comes dependent-type. Firstly, remember that a polymorphic function in dependent-type theory is nothing but an ordinary function with an argument of type tau , where tau is the type of types. For example, the following is a polymorphic function that creates a pair of any type:

; to-tuple : Π (a : tau, x : a). a * a
(define to-tuple ((a tau) (x a))
  (tuple x x))

This function to-tuple is, for example, used as follows:

(to-tuple i64 1)          ; ~> (tuple 1 1)
(to-tuple bool bool.true) ; ~> (tuple bool.true bool.true)
(to-tuple string "a")     ; ~> (tuple "a" "a")

Note that the type i64 is used in exactly the same way as 1 ; A type is nothing but an ordinary term of type tau . And these very terms i64 , bool , and string in the example are translated into ordinary closed functions that copies/discards the terms of the types. The to-tuple function can therefore copy the resource x of type a conceptually as follows:

to-tuple :: Π (a : tau, x : a). a * a
to-tuple a x :=
  let (x1, x2) := a(1, x) in
  (x1, x2)

Thus the answer to the question is: Polymorphic functions can copy/discard its polymorphic argument since the type, which is guaranteed to be passed as an argument, contains information on how to copy/discard the terms of the type.

Summary So Far

  • A variable is copied/discarded so that the variable is used linearly
  • A type is lowered into a function pointer that copies/discards the terms of the type
  • Closures can be copied/discarded since they know how to copy/discard itself
  • Polymorphic function can copy/discard its polymorphic arguments thanks to the information provided by its type argument

This is the basic behavior of Neut’s proof-theoretic memory management. As you might already be aware, this naive copying/discarding can result in an inefficient object code. We often use a variable more than once, as in the example of str :

(with identity.bind
  (let str "a")
  (let _ (string.print str))
  (let _ (string.print str))
  (string.print str))

We can’t say the resulting LLVM IR of this code is efficient enough; We can’t ignore those redundant copy operations.

Fortunately, there is a workaround for this performance problem.

Manual Optimization via Borrowing

The point of the workaround is straightforward: If those copying/discarding operations result from using variables non-linearly, we simply have to use variables linearly. Let’s go back to the first example code:

(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

(include "core/0.1.0.0/core.neut")

(with identity.bind
  (let str "a")
  (let _ (string.print str))
  (let _ (string.print str))
  (string.print str))

We would like to use the variable str linearly. To this end, we can request string.print to include the argument str in its return value. So, the type of string.print shouldn’t be something like string -> top - where the top is the unit type - but should be string -> string * top , where the A * B means the product type of A and B . More specifically, the implementation of string.print should be something like (in pseudo-code):

string.print :: string -> string * top
string.print str = do
  {print the string `str`}
  return (str, unit)

With that definition of string.print , we can use the variable str linearly (again, in pseudo-code):

let str1 = "a";
let (str2, _) := string.print str1;
let (str3, _) := string.print str2;
let (str4, _) := string.print str3;
unit

Note that the variables str1 , str2 , and str3 are used exactly once, and str4 for the 0 time. Therefore, the copying operation doesn’t occur in the code above. Also, since the str4 is defined but not used, the str4 is discarded immediately after its definition.

Now we have seen that those redundant copying/discarding operations can be avoided by writing the code in the manner above. There still remains a problem: code cluttering. It would be much nicer to have more sophisticated notation of that code pattern. Towards that end, firstly note that we can use the same name for the variables str1 , str2 , str3 , and str4 thanks to variable shadowing:

let str = "a";
let (str, _) := string.print str;
let (str, _) := string.print str;
let (str, _) := string.print str;
unit

Now, we just have to introduce a notation that translates:

let foo := string.print &str;

into:

let (str, foo) := string.print str;

With this notation, our running example is rewritten as follows:

let str = "a";
let _ := string.print &str;
let _ := string.print &str;
let _ := string.print &str;
unit

And this is the notation that is implemented in Neut. Indeed, the following is a valid code of Neut:

(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

(include "core/0.1.0.0/core.neut")

(with identity.bind
  (let str "a")
  (let _ (string.print &str))
  (let _ (string.print &str))
  (let _ (string.print &str))
  top.unit) ; ~> top.unit

Or,

(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

(include "core/0.1.0.0/core.neut")

(with identity.bind
  (let str "a")
  (let _ (string.print &str))
  (let _ (string.print &str))
  (string.print str)) ; ~> (unit, "a")

This notation is “borrowing” in Neut. Note that borrowing in Neut is nothing but a syntactic translation. Borrowing has nothing to do with, for example, the type system, or the operational semantics, of Neut. Indeed, this syntactic translation is processed at the stage of parsing in the compiler.

Let’s see how the resulting LLVM IR changes. Is it faster now? We can compile the code above by running $ neut build --no-alloc-cancellation --emit llvm filename.neut . The output is schematically as follows:

declare void @free(i8*)
declare i8* @write(i8*, i8*, i8*)
declare i8* @malloc(i64)
define i64 @main() {
  ; <memory allocation for the string>
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout for the three times>
  ; <memory deallocation for the string>
  ; <return 0>
}

Again, the non-schematic output can be found in the. The output in this time is actually short enough to include it here (36 lines including comments), though I chose not to. The point here is that the string “a” is reused without copying, as expected.

The resulting assembly code, which can be obtained by --emit asm , is reasonably small too (works on macOS; when you compile the same code on Linux, the write operations are lowered into the corresponding syscalls):

	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 12, 15
	.globl	_main                   ## -- Begin function main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbx
	.cfi_def_cfa_offset 16
	.cfi_offset %rbx, -15
	movl	$1, %edi                 # 1-byte memory allocation (you can ignore the lines above this line)
	callq	_malloc
	movq	%rax, %rbx
	movb	$97, (%rax)              # write 'a' (= 97) to the allocated memory
	movl	$1, %edi                 # set the arguments for `_write`
	movl	$1, %edx
	movq	%rax, %rsi
	callq	_write                   # ... and call `_write` (i.e. print 'a')
	movl	$1, %edi                 # (repeat)
	movl	$1, %edx
	movq	%rbx, %rsi
	callq	_write
	movl	$1, %edi                 # (repeat)
	movl	$1, %edx
	movq	%rbx, %rsi
	callq	_write
	movq	%rbx, %rdi               # free the allocated memory
	callq	_free
	xorl	%eax, %eax
	popq	%rbx
	retq
	.cfi_endproc
                                        ## -- End function
.subsections_via_symbols

In short: the resulting code is faster in that it is free from the redundant copying operations we saw in the first example.

This is how Neut controls resources efficiently, without modifying the type system of the source language.

Automatic Optimization via malloc/free Cancellation

Neut’s static memory management enables not only the “manual” optimization we have just seen, but also another “automatic” optimization. Remember the first example:

(with identity.bind
  (let str "a")
  (let _ (string.print str))
  (let _ (string.print str))
  (let _ (string.print str))
  (i64 0))

and the output IR of this example code:

declare void @free(i8*)
declare i8* @write(i8*, i8*, i8*)
declare i8* @malloc(i64)
define i64 @main() {
  ; Repeat the following for the 3 times:
  ;   <memory allocation for the string>
  ;   <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ;   <write the string into stdout>
  ;   <memory deallocation for the string>
}

The code is already judged to be inefficient in that it allocates/deallocates memory unnecessarily. More specifically, it is inefficient in that it deallocates the memory that can actually be reused.

Now you might think: If the sizes of allocations/deallocations are known at compile-time, isn’t it possible to compare the sizes of them at compile-time and emit a code that reuses the allocated memory?

It is indeed possible. When the option --no-alloc-cancellation is not passed, the compiler translates code pieces something like this:

define TYPE @FUNCTION_NAME(...) {
  (...)
  ; <memory allocation for the string>
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout>
  ; <memory deallocation for the string>    -- (*1)
  ; <memory allocation for the string>      -- (*2)
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout>
  ; <memory deallocation for the string>
  (...)
}

into something like this:

define TYPE @FUNCTION_NAME(...) {
  (...)
  ; <memory allocation for the string>
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout>
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout>
  ; <memory deallocation for the string>
  (...)
}

In other words, the compiler can cancel the memory deallocation at (*1) and the allocation at (*2) , reusing the allocated memory in its continuation. This is automatic malloc/free cancellation. By this fallback optimization, the compiler can emit somewhat more performant code even if a user wrote code in an inefficient way.

Note that the “create the string” parts are not optimized away from the resulting LLVM IR, in contrast to the one of borrowing:

define i64 @main() {
  ; <memory allocation for the string>
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  ; <write the string into stdout>
  ; <write the string into stdout>
  ; <write the string into stdout>
  ; <memory deallocation for the string>
  ; <return 0>
}

Although the compiler can cancel memory allocations/deallocations, it cannot cancel their accompanying initialization processes (at least for now). If you do need performance, you need to write code in the linear/borrowing style.

Summary

  • Neut statically determines malloc/free at compile-time via type information
  • The content of a variable is
    • discarded if and only if the variable isn’t used at all
    • untouched if and only if the variable is used exactly once (i.e. used linearly)
    • copied if and only if the variable is used more than once
  • Linearity tends to result in an efficient code
  • Non-linearity tends to result in an inefficient code
  • Borrowing can be used as a convenient syntactic tool when accomplishing linearity
  • Redundant malloc/free can be reduced by automatic malloc/free cancellation
  • Borrowing-based, or “manually” optimized code is faster than cancellation-based, or “automatically” optimized code

Installation

The currently supported platforms are: Linux (x64), macOS (x64).

Make sure that you have already installed zlib (>= 1.2.11), libssl (>= 1.1.1), stack (>= 2.3.0) and clang (>= 10.0.0). On Debian, for example, the first three can be installed as follows:

$ sudo apt install zlib1g-dev libssl-dev haskell-stack
$ stack upgrade --binary-only # not required if your stack is already up-to-date

clang can be installed in the way described here .

Also make sure that you have ~/.local/bin in your $PATH .

Then, clone the repository and build it:

$ git clone https://github.com/u2zv1wx/neut
$ cd neut
$ git checkout 0.1.0.0
$ stack test    # this builds the project and tests its behavior
$ stack install # this installs the executable `neut` into `~/.local/bin`

To uninstall, you just have to remove the binary ~/.local/bin/neut and the directory ~/.local/share/neut .

As for editor support, you can currently try neut-mode and flycheck-neut if you’re using Emacs. The former package is for syntax highlighting, and the latter one for linting.

Highly-Accelerated Tutorial

You can find a detailed description of the syntax, the logic, and the semantics of Neut in the succeeding sections. Those should suffice to read/write a program of Neut. Having said that though, some might prefer learning from actual source code after taking a brief look at this and that basic stuff. So here I introduce you some necessities that would be required to understand - or guess the meaning of - a program of Neut.

Let’s start. Notes on programs. (0) A program of Neut is a list of statements, processed one by one. (1)andare the ones that use codes written in other files. I recommend you to read the linked notes; both of them are not so long. (2)andare the ones that handle namespace. Again I recommend you to read them.

Next. Notes on terms. (0) Note that what follows presupposes the notation.neut in the core library ; Some of these won’t work without including the file. (1) tau is the type of types. (2) Π is the universal quantification. Note that Π (x : A). B is the same as the arrow type A -> B if x ∉ freevar(B) . Also note that a lambda-abstraction in Neut is n-ary; (λ ((x A) (y B)) e) is not the same as (λ ((x A)) (λ ((y B)) e)) . (3) If you want to create a tuple, you can use (sigma-introduction e1 ... en) or (tuple e1 ... en) . You can destruct a tuple by (sigma-elimination (x1 ... xn) e cont) . (4) You can write (question e) or ?x when you want the compiler to show the type of e or x . (5) You can write * to have the compiler infer the specified part. For example, assuming (define id ((a tau) (x a)) x) , you can write (id * bool.true) instead of (id bool bool.true) . (6) You will notice that (witness t e) is used here and there. This is a notation defined by (notation (witness t e) ((λ ((x t)) x) e)) ; A notation that tells the compiler that the term of e is t .

Next. Notes on primitives: (0) The following primitive types are available: i1 , i2 , i3 , …, i64 . These are the same as the corresponding integer types in LLVM. (1) You can also use f16 , f32 , and f64 . These are LLVM’s half , float , and double , respectively. (2) You will soon come to want primitive instructions - like add , mul , or xor - that can operate on terms of these types. You can find information on them. (3) In the section you can also find notes on unsafe casting, array accessing, and syscall. I think these should also count as necessities. (4) When you use an effectful primitive like os.write in a type, the behavior of the resulting executable is undefined.

Next. Notes on the compiler subcommands. (0) You can build a program with $ neut build path/to/file.neut . (1) You can create an tar.gz archive of a project via $ neut archive path/to/dir . Then you can upload the archive to somewhere, allowing others to ensure and include it.

Now I think you are basically ready to, for example, start reading the files in the test directory , or the files in the core library , referring the sections below as necessary. After that you should know how to write lambdas, recursive functions, inductive types, tuples. You need a state? You can use the state monad. Multiple effects? The free monad. You have the full power of lambda calculus.

That pretty much should do it. I hope you enjoy this language.

Syntax and Semantics of Statements

A program of Neut is a list of statements. Each statement is one of the following statements described in this section.

Please note that the descriptions in the followings are not that formal yet.

In the following, I use the symbols LEAF and TREE defined as follows:

LEAF ::= {a sequence of character that doesn't contain '(', ')', ' ', '\n', or '"'}
TREE ::= LEAF | (TREE ... TREE)

I also define the symbol STRING to be a double-quoted string.

Note that the examples codes below usually assumes that the core library is already included.

constant

constant declares an external constant.

Example

(constant unsafe.cast
  (Π ((A tau)
      (B tau)
      (x A))
     B))

(let foo (unsafe.cast i64 string 1)) ; foo : string

Syntax

(constant LEAF TREE)

Semantics

(constant x t) modifies the state of the compiler so that the x of type t is available in the succeeding code. Declared constants can then be available for use in the same way as ordinary.

All the constants must have distinct names.

This statement is currently intended only for internal use (e.g. implementation of a syscall). FFI might also be supported in future release via this statement.

ensure

ensure fetches the content of the specified URL for later use.

Example

(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

(include "core/0.1.0.0/core.neut")

Syntax

(ensure LEAF STRING)

Semantics

(ensure path URL) fetches the content of the specified URL, extracts the content into ~/.local/share/neut/NEUT_VERSION/path , assuming that the format of the archive is tar.gz . The path must be a valid path string. Every path separator in path is treated “literally”. For example, if the path is some-library/0.2.0.0 , the content of the archive is extracted into ~/.local/share/neut/NEUT_VERSION/some-library/0.2.0.0 .

If the target directory of archive extraction already exists, ensure does nothing.

This statement is intended to be used in harmony with.

Archives specified in ensure is expected to be the ones created via.

enum

enum declares a new enum-type and its values.

Example

(enum choice left right) ; defines choice : tau, choice.left : choice, and choice.right : choice

(let x choice.left)

(i64.print
  (enum-elimination x
    (choice.left
      (i64 1))
     choice.right
      (i64 2))) ; ~> 1

(i64.print (unsafe.cast choice i64 choice.left)) ; ~> 0

(i64.print (unsafe.cast choice i64 choice.right)) ; ~> 1

(enum foo
  (a 100)
  b
  (c 20)
  d
  e
  (f 103))

(i64.print (unsafe.cast foo i64 foo.a)) ; ~> 100
(i64.print (unsafe.cast foo i64 foo.b)) ; ~> 101
(i64.print (unsafe.cast foo i64 foo.c)) ; ~> 20
(i64.print (unsafe.cast foo i64 foo.d)) ; ~> 21
(i64.print (unsafe.cast foo i64 foo.e)) ; ~> 22
(i64.print (unsafe.cast foo i64 foo.f)) ; ~> 103

Syntax

(enum LEAF LEAF_INT ... LEAF_INT)

LEAF_INT := LEAF | (LEAF INT)

Semantics

(enum x a1 ... an) updates the state of the compiler so that the specified enum-type x : tau and the enum-values x.a1, ..., x.an : x can be used in its continuation.

Every enum-value has its internal i64 value (discriminant). Those discriminant values can be extracted by using the constant unsafe.cast , though usually not recommended.

Discriminant value starts from 0 by default, and increments one by one. The “current” value of this process can be modified by writing, e.g. (enum foo a (b 100) c d) . In this example, the discriminant value of c is set to be 101 .

All the discriminant values of an enum-type must be distinct.

erase

erase makes given variable invisible in its continuation.

Example

(define helper (x)
  (add-i64 x 1))

(define f (x)
  (mul-i64 (helper x) 2))

(f 10)        ; ~> 22

(helper 10)   ; ~> 11

(erase helper)

(f 10)        ; ~> 22

; (helper 10) ; ~> undefined variable: helper

Syntax

(erase LEAF)

Semantics

erase makes given variable invisible in its continuation.

include

include “pastes” the content of the specified file.

Example

(include "core/0.1.0.0/free.neut")

(include "./relative/path/from/the/dir/path/of/this/file.neut")

Syntax

(include STRING)

Semantics

{CODE_1}

(include STRING)

{CODE_2}

~>

{CODE_1}

{THE_CONTENT_OF_THE_FILE_SPECIFIED_BY_THE_STRING}

{CODE_2}

With the following notes:

(1) If the first character of the path is dot (“.”), the path is interpreted as a relative one. That is, the path is calculated using the current file’s directory as the base path. Otherwise, the base path is set to be the library path (i.e. ~/.local/share/neut/NEUT_VERSION/library ).

(2) If the file is already included, include does nothing.

(3) When including a file,the prefix environmentmust be empty.

(4) Cyclic inclusion is invalid.

inductive

inductive defines an inductive type, its introduction rules (constructors), and its pattern match function.

Example

(inductive list ((a tau))
  (nil ()
    (list a))
  (cons ((_ a) (_ (list a)))
    (list a)))

(define length
  ((a tau)
   (xs (list a)))
  (list.case i64 a xs
    (λ ()
      0)
    (λ (_ ys)
      (add-i64 1 (length a ys)))))

(let xs (list.cons * 10 (list.cons * 20 (list.nil i64))))

(i64.print (length xs)) ; ~> 2

; mutually inductive types
(inductive
  (even ((_ (nat)))
    (zero-is-even
      ()
      (even (nat.zero)))
    (succ-of-odd-is-even
      ((n (nat))
       (_ (odd n)))
      (even (nat.succ n))))
  (odd ((_ (nat)))
    (succ-of-even-is-odd
      ((n (nat))
       (_ (even n)))
      (odd (nat.succ n)))))

Syntax

(inductive LEAF ((LEAF TREE) ... (LEAF TREE))
  (LEAF ((LEAF TREE) ... (LEAF TREE))
    TREE)
  ...
  (LEAF ((LEAF TREE) ... (LEAF TREE))
    TREE))

; n-mutual inductive type
(inductive
  (LEAF ((LEAF TREE) ... (LEAF TREE))
    (LEAF ((LEAF TREE) ... (LEAF TREE))
      TREE)
    ...
    (LEAF ((LEAF TREE) ... (LEAF TREE))
      TREE))
  ...
  (LEAF ((LEAF TREE) ... (LEAF TREE))
    (LEAF ((LEAF TREE) ... (LEAF TREE))
      TREE)
    ...
    (LEAF ((LEAF TREE) ... (LEAF TREE))
      TREE)))

Semantics

When parsed, the inductive statement is translated into the let statements that defines (1) the inductive type, (2) the introduction rules (or the constructors of the inductive type), and (3) the pattern match function. For example, consider the following inductive statement:

(inductive list ((a tau))
  (nil ()
    (list a))
  (cons ((_ a) (_ (list a)))
    (list a)))

Given this statement, the compiler generates the let statements that define the followings:

  1. The inductive type list : Pi (a : tau). tau
  2. The introduction rules (constructors) of the type:
    list.nil : Pi (a : tau). list a
    list.cons : Pi (a : tau, _ : a, _ : list a). list a
    
  3. The pattern matching function of the type:
    • list.case : Π (z : tau, a : tau, _ : list a, on-nil : Π (). z, on-cons : Π (_ : a, _ : list a). z). z

The structure of a pattern matching function is: Π (RESULT_TYPE : tau, {ARGUMENTS_OF_THE_INDUCTIVE_TYPE}, {THE_TERM_BEING_MATCHED}, {LIST_OF_CLAUSES}). RESULT_TYPE .

The codomain of each constructor must be of the form (a e1 ... en) , where the a is the inductive type being defined. For example,

(inductive foo ((x bool))
  (bar () i64))

is an invalid inductive statement, since the codomain of bar is not of the form (foo _) , but i64 .

introspect

introspect introspects the state of the compiler and selects statements by those information.

Example

(introspect OS
  (linux
    (include library "constant/linux.neut"))
  (darwin
    (include library "constant/darwin.neut")))

Syntax

(introspect LEAF (LEAF TREE ... TREE) ... (LEAF TREE ... TREE))

Semantics

(introspect VAR
  (VAR-1 stmt-1-1 ... stmt-1-n{1})
  ...
  (VAR-m stmt-m-1 ... stmt-m-n{m}))

~>

(stmt-i-1)
...
(stmt-i-n)

[where VAR == VAR-i]

If the corresponding value is not found in the clause list, this statement does nothing.

The var in (introspect var (...)) must be a valid compile-time variable. The valid compile-time variables and its possible values are currently as in the table below:

compile-time variable possible values
OS linux, darwin
architecture x64

notation

The notation statement registers a notation (macro).

Example

(notation λ Π-introduction)

((λ (x) (add-i64 x 1)) 10) ; ~> 11

(notation switch enum-elimination)

(notation (if b e1 e2)
  (switch b
    (bool.true e1)
    (bool.false e2)))

Syntax

(notation TREE TREE)

Semantics

(notation FROM TO) modifies the state of the compiler so that the mapping FROM ~> TO is recognized as a notation (macro). This affects the result of macro expansion in the succeeding code.

Notes on Macro Expansion

Macro-expansion in Neut is a process of applying step-expansion recursively.

I believe this step-expansion process is best explained by example. Consider we have defined a notation as follows:

(notation (if b e1 e2)
  (switch b
    (bool.true e1)
    (bool.false e2)))

Also suppose that we have the following AST:

(if some-condition
  (i64.print 10)
  (string.print "hello"))

In this situation, the compiler creates the following substitution:

b  ~> some-condition
e1 ~> (i64.print 10)
e2 ~> (string.print "hello")

and then replaces b , e1 , and e2 in the codomain of the notation according to this substitution. The resulting tree is:

(switch some-condition
  (bool.true (i64.print 10))
  (bool.false (string.print "hello")))

Given this step-expansion, macro-expansion is schematically defined as follows.

macroExpand :: Tree -> Tree
macroExpand inputTree =
  case inputTree of
    Leaf x ->
      recurseOrReturn (Leaf x)
    Node t1 ... tn ->
      recurseOrReturn (Node (macroExpand t1) ... (macroExpand tn))

recurseOrReturn :: Tree -> Tree
recurseOrReturn t =
  case stepExpand t of
    -- if there exists a notation that step-expands `t`, with resulting subtitution `sub`
    Just sub ->
      macroExpand (applySubstitution sub t)
    Nothing ->
      t

(The pseudo-code above is intended to be call-by-value)

You may want to use this statement withto accomplish (a certain sort of) safety.

let

let evaluates given term and binds the result to the specified variable.

Example

(let foo (i64 10)) ; define a variable `foo` to be `10`

(i64.print foo) ; ~> 10 (this is equivalent to `(let _ (i64.print foo))`)

(let (bar i64) 20) ; `let` with type annotation

(i64.print bar) ; ~> 20

; `define` is defined in `core/<VERSION>/notation.neut` as follows:
; (notation (define f xts e)
;   (let f (fix f xts e)))
;
; (notation (define f e)
;   (let f e))
;

; ordinary definition (i.e. 1-mutual definition)
(define fact ((x i64))
  (if (icmp-sle-i64 x 0) ; compare (by less-than-or-equal) two `i64`s as signed integers
    1
    (mul-i64 x (fact (sub-i64 x 1)))))

(i64.print (fact foo)) ; ~> 3628800 (= 10!)

; mutual recursion can be realized as in the ordinary way:
(define even-f ((f (hom i64 bool)) (n i64))
  (if (icmp-eq-i64 n 0)
    true
    (f (sub-i64 n 1))))

(define odd ((n i64))
  (if (icmp-eq-i64 n 0)
    false
    (even-f odd (sub-i64 n 1))))

(define even ((n i64))
  (even-f odd n))

(i64.print
  (if (even 10)
    100
    1000)) ; ~> 100

Syntax

(let LEAF_PLUS TREE)

LEAF_PLUS ::= LEAF | (LEAF TREE)

Semantics

(let x e) checks the type of the term e , evaluates the term e , then defines a variable x with e as its content. x is available in the continuation. The type of x can be annotated to be t by writing (let (x t) e) .

If a user input e is not parsed as a statement when it is supposed to be, the compiler interprets it as (let _ e) .

record

The record statement creates a record type, its elimination rules (destructors), and its introduction rule.

Example

(record my-record ((a tau))
  (item-1
    a)
  (item-2
    i64)
  (item-3
    top))

(let item
  (my-record.new
    i64
    10
    20
    top.unit))

(i64.print (my-record.item-1 i64 item)) ; ~> 10

(i64.print (my-record.item-2 i64 item)) ; ~> 20

Syntax

(inductive LEAF ((LEAF TREE) ... (LEAF TREE))
  (LEAF TREE)
  ...
  (LEAF TREE))

Semantics

When parsed, the record statement is translated into let statements that defines (1) the record type, (2) the elimination rules (or the destructors of the record type), and (3) the introduction rule. For example, consider the following record statement:

(record my-record ((a tau))
  (item-1
    a)
  (item-2
    i64)
  (item-3
    top))

Given this statement, the compiler generatesstatements that define the followings:

  1. The record type my-record : Pi (a : tau). tau
  2. The elimination rules (destructors) of the type:
    my-record.item-1 : Pi (a : tau, _ : my-record a). a
    my-record.item-2 : Pi (a : tau, _ : my-record a). i64
    my-record.item-3 : Pi (a : tau, _ : my-record a). top
    
  3. The introduction rule of the type:
    • my-record.new : Π (a : tau, item-1 : a, item-2 : i64, item-3 : top). my-record a

Internally, every record statement is translated into the correspondingstatement. For example, given the record statement above, the compiler translates it into the following inductive statement:

(inductive my-record ((a tau))
  (new
    ((item-1 a)
     (item-2 i64)
     (item-3 top))
    (my-record a)))

The compiler then processes this statement in the same way ordinary inductive statement, generating the record type my-record and the introduction rule my-record.new , and the pattern matching function my-record.case . After that, the compiler automatically generates destructors from this definition via my-record.case . For example, the generated definition of my-record.item-1 is as follows:

(define my-record.item-1 ((a tau) (record-value (my-record a)))
  (my-record.case a record-value
    (λ (item-1 item-2 item-3) item-1)))

The other destructors are defined similarly.

use/unuse

use inserts a prefix to the prefix environment. unuse removes a prefix from the prefix environment.

Example

(let foo.bar.buz (i64 10))

(i64.print buz)         ; ~> undefined variable: `buz`
(i64.print bar.buz)     ; ~> undefined variable: `bar.buz`
(i64.print foo.bar.buz) ; ~> 10

(use foo)

(i64.print buz)         ; ~> undefined variable: `buz`
(i64.print bar.buz)     ; ~> 10
(i64.print foo.bar.buz) ; ~> 10

(use foo.bar)

(i64.print buz)         ; ~> 10
(i64.print bar.buz)     ; ~> 10
(i64.print foo.bar.buz) ; ~> 10

(unuse foo)

(i64.print buz)         ; ~> 10
(i64.print bar.buz)     ; ~> undefined variable: `bar.buz`
(i64.print foo.bar.buz) ; ~> 10

Syntax

(use LEAF)

(unuse LEAF)

Semantics

When parsed, the statement (use PREFIX) inserts PREFIX at the head of the prefix environment, which is a list of prefixes.

When parsed, the statement (unuse PREFIX) removes PREFIX from the prefix environment. If the PREFIX is not contained in the environment, the unuse statement does nothing.

The prefix environment modifieshow the compiler interprets a variable.

section/end

section - end automatically adds the specified prefix to the variables defined by let .

Example

(section pohe)

(let foo (i64 10))

(section qux)

(let bar (i64 20))

(i64.print foo)          ; ~> 10
(i64.print pohe.foo)     ; ~> 10

(i64.print bar)          ; ~> 20
(i64.print qux.bar)      ; ~> 20
(i64.print pohe.qux.bar) ; ~> 20

(end qux)

(i64.print foo)          ; ~> 10
(i64.print pohe.foo)     ; ~> 10

(i64.print bar)          ; ~> undefined variable
(i64.print qux.bar)      ; ~> 20
(i64.print pohe.qux.bar) ; ~> 20

(end pohe)

(i64.print foo)          ; ~> undefined variable
(i64.print pohe.foo)     ; ~> 10

(i64.print bar)          ; ~> undefined variable
(i64.print qux.bar)      ; ~> undefined variable
(i64.print pohe.qux.bar) ; ~> 20

Syntax

(section LEAF)

(end LEAF)

Semantics

The list of statement

(section FOO)
(let x1 e1)
...
(let xn en)
(end FOO)

is equivalent to:

(use FOO)
(let FOO.x1 e1)
...
(let FOO.xn en)
(unuse FOO)

In other words, the section - end statement

  • insertsat the beginning and the end of the section
  • adds the name of the section as a prefix of the variables defined by let
  • keeps all the other statements in the section intact

Each section must be paired with an end with the corresponding name.

Syntax and Semantics of Terms

tau

tau is the type of types.

Example

(define id ((a tau) (x a)) x)

(id i64 10)

Syntax

tau

Inference Rule

-------------------(empty)
well-formed(EMPTY)


well-formed(Γ)
--------------- (tau)
Γ |- tau : tau

Semantics

tau doesn’t have any operational semantics.

variable

Example

(let x (i64 10))

(i64.print x) ; ~> 10

(let _ (i64 20)) ; anonymous variable

Syntax

LEAF

Inference Rule

Γ |- A : tau
------------------------- (ext)
well-formed(Γ, x : A)


well-formed(Γ)  (x : A) ∈ Γ
---------------------------- (var)
       Γ |- x : A

Semantics

A variable doesn’t have any operational semantics by itself.

Notes

If a leaf is not parsed into any other syntactic construct, the leaf x is regarded as a variable.

The name of a variable is interpreted in relation with the keyword environment. For example, if the keyword environment is ["foo", "bar", "buz"] , the name qux is interpreted in the following way:

qux
foo.qux
bar.qux
buz.qux

Π

Π is the universal quantification.

Example

; unary Π-introduction
(let f2
  (Π-introduction ((x i64)) x))

; Π-elimination
(Π-elimination i64.print (Π-elimination f2 2))

; Π-elimination with the familar (or, implicit) syntax
(i64.print (f2 2))

; nullary Π-introduction
(let f1
  (lambda () (i64 1))) ; 'lambda' can be used instead of 'Π-introduction'

; binary Π-introduction
(let f3
  (λ ((x i64)   ; an argument with type annotation
      y)        ; an argument without type annotation
    (i64.add x y)))

(i64.print (f3 1 2))

Syntax

(Π (LEAF_PLUS*) B)
(Π-introduction (LEAF_PLUS*) e)
(Π-elimination TREE+)
(TREE+)

LEAF_PLUS ::= LEAF | (LEAF TREE)

Inference Rule

Γ |- A1 : tau    Γ, x1 : A1 |- A2 : tau    (...)    Γ, x1 : A1, ..., xn : An |- B : tau
---------------------------------------------------------------------------------------- (Π)
                    Γ |- (Π ((x1 A1) ... (xn An)) B) : tau


             Γ, x1 : A1, ..., xn : An |- e : B
------------------------------------------------------------------------------- (Π-introduction)
 Γ |- (Π-introduction ((x1 A1) ... (xn An)) e) : (Π ((x1 A1) ... (xn An)) B)


Γ |- e : (Π ((x1 A1) ... (xn An)) B)   Γ |- e1 : A1   (...)   Γ |- en : An {xi := ei}
-------------------------------------------------------------------------------------- (Π-elimination)
              Γ |- (Π-elimination e e1 ... en) : B {xi := ei}

Semantics

(Π-elimination e e1 ... en)
~> (Π-elimination v v1 ... vn)  [i.e. reduce e and ei into the values v and vi, from left to right]

(Π-elimination (Π-introduction ((x1 A1) ... (xn An)) e) v1 ... vn)
~> e {x1 := v1, ..., xn := vn}

Notes

If a tree (e e1 ... en) is not parsed into any other terms, the tree is regarded as (Π-elimination e e1 ... en) .

If the name of an argument of a Π-introduction is “_”, the compiler automatically generates a fresh name so that the variable cannot be used in its scope.

Note that the arguments of a lambda-abstraction is generalized from unary to n-ary. This enables the compiler to emit more performant code when a lambda-abstraction receives multiple arguments; Without that generalization, the arguments must be represented as a tuple, discarding the possibility to pass the arguments of a function using multiple registers.

Some additional notations for Π are defined in core/VERSION/notation.neut :

(notation forall Π)

(notation Pi Π)

(notation lambda Π-introduction)

(notation λ Π-introduction)

; tells the compiler that the type of `e` is `t`
(notation (witness t e)
  ((λ ((x t)) x) e))

Incidentally, I personally recommend you to use the witness notation above when defining a function to write the resulting type of the function explicitly. For example, the code

(define fact ((x i64))
  (witness i64
    (if (icmp-sle-i64 x 0)
      1
      (mul-i64 x (fact (sub-i64 x 1))))))

is preferred to:

(define fact ((x i64))
  (if (icmp-sle-i64 x 0)
    1
    (mul-i64 x (fact (sub-i64 x 1)))))

fix

fix is for recursion.

Example

(let fact
  (fix self ((x i64))
    (if (icmp-sle-i64 x 0) ; i.e. if x <= 0
      1
      (mul-i64 x (self (sub-i64 x 1))))))

Syntax

(fix LEAF_PLUS (LEAF_PLUS ... LEAF_PLUS) TREE)

LEAF_PLUS ::= LEAF | (LEAF TREE)

Inference Rule

Γ, f : (Π ((x1 A1) ... (xn An)) B), x1 : A1, ..., xn : An |- e : B
-------------------------------------------------------------------- (fix)
Γ |- (fix f ((x1 A1) ... (xn An)) e) : (Π ((x1 A1) ... (xn An)) B)

Semantics

(Π-elimination (fix self ((x1 A1) ... (xn An)) e) v1 ... vn)
~> e {x1 := v1,
      ...,
      xn := vn,
      self := (fix self ((x1 A1) ... (xn An)) e)}

Notes

Every tail call is optimized into a loop.

The logic of Neut doesn’t adopt the universe hierarchy, and thus inconsistent. This means that the Z combinator can be written in the source language as an ordinary term. In other words, from the viewpoint of expressive power, fix is simply redundant. The existence of fix is just for optimization purpose.

constant

constant is for external constants.

Example

((constant add-i64) 1 3) ; ~> 4
(add-i64 1 3)            ; ~> 4

Syntax

(constant LEAF)
LEAF

Inference Rule

Γ |- A : tau          {`c` is declared to be a constant of type `A`}
--------------------------------------------------------------------- (constant)
         Γ |- (constant c) : A

Semantics

The constant rule doesn’t have any operational semantics by itself; Each constant has its own dedicated semantics.

Notes

If a leaf is declared to be a constant using thestatement beforehand, the leaf is interpreted as (constant LEAF) .

int

i{n} is the integer type in LLVM.

Example

(add-i64 (i64 1) 2)
(i32 10)
23456789
(mul-i2 (i2 100) 3)

Syntax

; the integer type i{n}
i{n} [where n is one of 1, 2, ..., 64]

; an integer of type i{n}
(i{n} LEAF)  [where n is one of 1, 2, ..., 64]

; an integer without explicit type info
LEAF [where this LEAF can be parsed as integer]

Inference Rule

well-formed(Γ)     {`l` is an integer}     {i{n} is a valid integer type}
--------------------------------------------------------------------------- (integer)
                  Γ |- (i{n} l) : i{n}

Semantics

The terms of an integer type don’t have any operational semantics by themselves.

Notes

The int type in Neut is the same as the one of LLVM, restricted into i1 , i2 , …, i64 .

Every integer of type i{n} is interpreted modulo 2^n , just as in the same way of LLVM. For example, (i2 10) is the same as (i2 6) , (i2 2) , (i2 -2) , or (i2 -6) , since all of these are equivalent modulo 2^2 .

An integer without explicit type information is overloaded; it’s type is firstly set to be unknown, and then inferred.

float

f{n} is the float type in LLVM.

Example

(f16 3.8)
(f32 9.22888)
(f64 1.23456789)
(fadd-f64 1.23456 (f64 7.89))

Syntax

; the float type f{n}
f{n}   [where n is one of 16, 32, 64]

; a float of type f{n}
(f{n} LEAF) [where n is one of 16, 32, 64]

LEAF [where this LEAF can be parsed as float]

Inference Rule

well-formed(Γ)     {`l` is a float}      {f{n} is a valid float type}
------------------------------------------------------------------------ (float)
                  Γ |- l : f{n}

Semantics

The terms of a float type don’t have any operational semantics by themselves.

Notes

The float type in Neut is the same as the one of LLVM. Specifically, LLVM’s half corresponds to Neut’s f16 , float to f32 , and double to f64 .

An float without explicit type information is overloaded; it’s type is firstly set to be unknown, and then inferred.

enum

enum is the enumeration type.

Example

See the example inthe section about the enum statement.

Syntax

(enum LEAF)

(enum-introduction LEAF)
LEAF

(enum-elimination TREE
  (ENUM_CASE TREE)
  ...
  (ENUM_CASE TREE))

ENUM_CASE ::= LEAF | default

Inference Rule

well-formed(Γ)  {`E` is declared to be an enum type}
----------------------------------------------------- (enum)
                 Γ |- (enum E) : tau


well-formed(Γ)   {`l` is a value of enum-type `(enum E)`}
----------------------------------------------------------- (enum-introduction)
       Γ |- (enum-introduction l) : (enum E)


Γ |- e : (enum E)       (Γ |- e_l : A) for all l ∈ E = {l1, ..., ln}
---------------------------------------------------------------------- (enum-elimination)
       Γ |- (enum-elimination e (l1 e1) ... (ln en)) : A

Semantics

(enum-elimination (enum-introduction c)
  (c1 e1)
  ...
  (cn en))
~> ei [where c = ci]

(enum-elimination (enum-introduction c)
  (c1 e1)
  ...
  (cn en)
  (default e)
  ...)
~> e [where e != e1, ..., en]

Notes

The cases of an enum-elimination must be exhaustive.

array

array is the array type in LLVM.

Example

; float array
(let _
  (array-introduction f16 3.28 2.14 2.0 9.82))

(let xs
  (array-introduction i64 1 2 -30 20))

(let k
  (with identity.bind
    (let a (i64.array-access 0 4 &xs))
    (let b (i64.array-access 1 4 &xs))
    (let c (i64.array-access 2 4 &xs))
    (let d (i64.array-access 3 4 &xs))
    (erase xs)
    (i64.add a (i64.add b (i64.add c d)))))

(i64.print k) ; -7

(i64.print
  (array-elimination i64 (a b c d) xs
    (i64.add a (i64.add b (i64.add c d))))) ; -7

Syntax

; the array type
(array TREE LOWTYPE)

(array-introduction LOWTYPE TREE ... TREE)

(array-elimination LOWTYPE (LEAF_PLUS ... LEAF_PLUS) TREE TREE)

LOWTYPE ::= i{n} | f{n}

LEAF_PLUS ::= LEAF | (LEAF TREE)

Inference Rule

Γ |- len : i64    {`k` is an integer type or a float type}
----------------------------------------------------------- (array)
              Γ |- (array len k) : tau


Γ |- l1 : k   ...  Γ |- ln : k      {`k` is an integer type or a float type}
----------------------------------------------------------------------------- (array-introduction)
       Γ |- (array-introduction k l1 ... ln) : (array (i64 n) k)


Γ |- e1 : (array (i64 n) k)     Γ, x1 : k, ..., xn : k |- e2 : B
------------------------------------------------------------------- (array-elimination)
 Γ |- (array-elimination k (x1 ... xn) e1 e2) : B

Semantics

(array-elimination _ (x1 ... xn) (array-introduction _ e1 ... en) e)
~> e {x1 := e1,
      ...,
      xn := en}

Notes

The type of elements of an array must be the integer type, or the float type.

The memory layout of an array is the same as the one of LLVM; For example, an array of type i8 is aligned as in the array [0 x i8] in LLVM.

The array type can be used, for example, to implement string.

You won’t use array-elimination in actual code; You would use the constant array-access instead.

struct

struct is the struct type in LLVM.

Example

(let st
  (struct-introduction
    (f16 3.8)
    (i8 8)
    (i16 -300)
    (f32 33.0)
    (i64 30)
    (i64 10)
    (f64 -329444.4444444)
    (i8 9)))

(i64.print
  (struct-elimination
    ((_ f16) (_ i8) (_ i16) (_ f32) (z i64) (w i64) (_ f64) (_ i8))
    st
    (i64.add z w))) ; ~> 40

Syntax

; the struct type
(struct LOWTYPE ... LOWTYPE)

(struct-introduction (LOWTYPE TREE) ... (LOWTYPE TREE))

(struct-elimination ((LEAF LOWTYPE) ... (LEAF LOWTYPE)) TREE TREE)

LOWTYPE ::= i{n} | f{n}

Inference Rule

well-formed(Γ)    {Every `ki` in {`k1`, ..., `kn`} is either an integer type or a float type}
---------------------------------------------------------------------------------------------- (struct)
                         Γ |- (struct k1 ... kn) : tau


Γ |- e1 : k1   (...)   Γ |- en : kn      Γ |- (struct k1 ... kn) : tau
-------------------------------------------------------------------------- (struct-introduction)
   Γ |- (struct-introduction (e1 k1) ... (en kn)) : (struct k1 ... kn)


Γ |- e1 : (struct k1 ... kn)       Γ, x1 : k1, ..., xn : kn |- e2 : B
--------------------------------------------------------------------------- (struct-elimination)
    Γ |- (struct-elimination ((x1 k1) ... (xn kn)) e1 e2) : B

Semantics

(struct-elimination (x1 _) ... (xn _) (struct-introduction (_ e1) ... (_ en)) e)
~> e {x1 := e1,
      ...,
      xn := en}

Notes

Note that an element of a struct type in Neut is restricted into an integer or a float. This means, among others, that you can’t write a struct that contains another struct, or a memory region. This might get in the way when you implement an interface of a syscall; You might want to use the constant unsafe.cast in that case, or “flatten” the struct.

The memory layout of a struct is the same as the one of LLVM; For example, a struct of type (struct i8 f16 i32) is aligned as in a struct of type {i8, f16, f32} in LLVM.

The struct type is intended to be used for implementation of interfaces of syscalls.

question

question requests the compiler to show the type of a term.

Example

(let x top.unit)

(question x)

Syntax

(question TREE)

?TREE

Inference Rule

Γ |- e : A
----------------------- (question)
Γ |- (question e) : A

Semantics

(question e)
~> e

Notes

The type of a term wrapped by question is reported by the compiler. This might be useful when used in harmony with a linter like flycheck.

erase

erase makes given variables invisible in its scope.

Example

((λ (x y) x) bool.true bool.false)                        ; ~> bool.true

((λ (x y) (erase (x) x)) bool.true bool.false)            ; ~> undefined variable: x

((λ (x y) (erase (y) x)) bool.true bool.false)            ; ~> bool.true

((λ (x y) (erase (x y) x)) bool.true bool.false)          ; ~> undefined variable: x

; `with` has a dedicated notation for `erase`
(with identity.bind
  (let x bool.true)
  (let y x)
  (erase x)
  y) ; ~> undefined variable: y

Syntax

(erase (LEAF ... LEAF) TREE)

Inference Rule

Γ |- e : A    (x1 : A1, ..., xn : An) ∈ Γ    (x1 : A1, ..., xn : An) ∉ freevar(e)
-------------------------------------------------------------------------------------- (erase)
                   Γ |- (erase (x1 ... xn) e) : A

Semantics

(erase (x1 ... xn) e)
~> e

Notes

erase can be used, for example, to ensure that a variable is not used after a certain point of a program:

; `with` has a dedicated notation for `erase`
(with identity.bind
  (let VAR (some-computation))
  {COMPUTATION_WITH_VAR}
  (erase VAR)
  {COMPUTATION_WITHOUT_VAR}
  ...)

erase can also be used to ensure that a variable introduced in adoesn’t cause name collision with a free variable. For example, consider the following notation:

(notation (foo e1 e2)
  (λ
    ((z tau)
     (k (Π ((_ i64) (_ bool)) z)))
    (k e1 e2)))

(let p (foo 10 bool.true))

Though the notation foo works in many cases, it can cause an unintuitive behavior if the e1 or e2 in (foo e1 e2) contains k as a free variable:

(let k 100)

(let p (foo k bool.true))
; ~> (let p
;      (λ
;        ((z tau)
;         (k (Π ((_ i64) (_ bool)))))
;        (k k bool.true)))
; ~> type error

Though we luckily got a type error in the example above, generally this sort of variable capturing can result in a more hard-to-debug runtime error. erase can be used here to prevent this kind of situation as follows:

(notation (foo e1 e2)
  (λ
    ((z tau)
     (k (Π ((_ i64) (_ bool)) z)))
    (k (erase (z k) e1) e2)))

(let k 100)

(let p (foo k bool.true))
; ~> (let p
;      (λ
;        ((z tau)
;         (k (Π ((_ i64) (_ bool)))))
;        (k (erase (z k) k) bool.true)))
; ~> undefined variable: k

That is, using erase , we can ensure that a variable introduced by notation does not collide with a free variable contained in an argument of the notation.

Note that erase is only a partial solution to the problem of hygienic macros . Consider the following code:

(notation (unless condition on-false on-true)
  (if (bool.not condition) on-true on-false)

The notation above presupposes that the free variable bool.not is bound in a sane way; bool.not can’t be something like (λ (x) x) , or (string.print "hello, world!") . Or, even the very symbol erase can theoretically be bound to a variable. Thus erase has no use in this situation.

Auxiliary Syntax

asterisk

* is a placeholder that must be inferred.

Example

(define id ((a tau) (x a)) x)

(enum foo value)

(id foo foo.value) ; ~> foo.value

(id * foo.value) ; ~> foo.value (`*` is inferred to be `foo`)

Syntax

*

Semantics

* doesn’t have any operational semantics.

Notes

* can be used as a placeholder that must be resolved by the compiler using the constraints generated in its type inference procedure.

If the type is not determined, the compiler raises an error; For example, the type of x in the following code is not determined:

(let x 10)

since the 10 cannot be determined to be i32 , i16 , or i64 , etc.

sigma

sigma is the existential quantification.

Example

; binary sigma-intro without dependence
(let pair
  (sigma-introduction (i64 2) (λ ((x tau)) x)))

; binary sigma-elim without dependence
(sigma-elimination (x _) pair
  (i64.print x))

(let n-pair
  (sigma-introduction
    (i64 1)
    (λ ((x tau)) x)
    (f32 10.82)
    top.unit
    top.unit
    top
    tau))

(let 0-pair
  (sigma-introduction)

Syntax

(sigma ((x1 A1) ... (xn An)) B)

(sigma-introduction e1 ... en)

(sigma-elimination (LEAF_PLUS ... LEAF_PLUS) e1 e2)

LEAF_PLUS ::= LEAF | (LEAF TREE)

Semantics

(sigma ((x1 A1) ... (xn An)) B)
~> (Π ((Z tau)
        (_ (Π ((x1 A1) ... (xn An) (_ B)) Z)))
       Z))

(sigma-introduction e1 ... en)
~> (Π-introduction
     ((Z tau)
      (k (Π ((x1 hole) ... (xn hole)) Z)))
     (k e1 ... en))

(sigma-elimination ((x1 A1) ... (xn An)) e1 e2)
~> (e1 hole (lambda ((x1 A1) ... (xn An)) e2))

Notes

Actual definition of sigma is found in core/VERSION/notation.neut ; sigma is just a convenient notation of a certain use of Π . This encoding is the ordinary one in CoC.

Sigma-type with dependence can be used, for example, to realize the type of string as follows:

(define string
  (sigma ((len i64)) (array len i8)))

In this case, a term of this string type is something like (3, [10, 20, 30]) . The first element is the len : i64 part. This part contains the length information of a string. The second element is the actual content of the string.

A sigma-type without dependence is the familiar product type; (sigma ((_ A1) ... (_ An)) B) is (product A1 ... An B) .

with

with is a do-notation with the bind operation made explicit.

Example

(with identity.bind
  (let str "foo")
  (let _ (string.print &str))
  (string.print &str) ; the same as (let _ (string.print &str))
  (let x (i64 10))
  (let y (add-i64 100 x))
  (i64.print y))

(with identity.bind
  (let str "foo")
  (let _
    (let _ (i64 100)) ; each `e` in `(let x e)` is implicitly wrapped by `with`
    (string.print &str)
    (string.print &str))
  (string.print &str)
  (let len (string.print &str))
  len)

Syntax

(with TREE TREE ... TREE)

Semantics

(with bind (let x (e e1 ... en)) rest+)
~> (bind * * (with (e e1' ... en'))
     (lambda (sig)
       (sigma-elimination (x1 ... xj) sig (with rest+))))
where:
  ei' := if ei == &x then x else ei
  x1, ..., xj := (all the "borrowed" variables in e1, ..., en)
  sig : a fresh variable

(with bind (let x e) rest+)
~> (bind * * (with e)
     (lambda (x) (with rest+)))

(with bind e rest+)
~> (with bind (let _ e) rest+)

(with bind (erase x1 ... xn) rest+)
~> (erase (x1 ... xn) (with bind rest+))

(with e)
~> e

Notes

with can be understood as a generalization of begin in Scheme, or an explicit version of the do notation in Haskell.

The “borrowing” is covered by the first rule of the semantics; As you can see from the definition, this realization of borrowing works for any bind operation.

Note that the bind operator is not restricted to monadic bind; You can set any term there as long as the resulting term of this syntactic translation is well-typed.

The e in (let x e) is automatically surrounded by with .

Partial Application

Example

(define id ((a tau) (x a)) x)

; ordinary application
(id bool bool.true)

; partial application
((id bool _) bool.true)

; of course you can bind the partially-applied function to a variable
(let id-bool (id bool _))

(id-bool bool.true)

Sematics

(e e1 ... en)  [where e_{i1} = _, ..., e_{im} = _]
~> (λ (x1 ... xm) (e e1 ... en)) [replacing e_{ik} with xk]

Primitives

LLVM-Based Primitives

Neut currently supports the following LLVM instructions: fneg, add, fadd, sub, fsub, mul, fmul, udiv, sdiv, fdiv, urem, srem, frem, shl, lshr, ashr, and, or, xor, trunc, zext, sext, fptrunc, fpext, fptoui, fptosi, uitofp, sitofp, icmp, fcmp .

Each instruction is “separated” according to its type. Let us take the instruction fneg for example. This instruction can be applied to half , float , and double in LLVM. Thus, Neut has the following three constants as primitives: fneg-f16, fneg-f32, fneg-f64 . For the instruction add , the constants add-i1, add-i2, ..., add-i64 is available.

The instructions with two type arguments like fptosi is named as fptosi-f64-i32, fptosi-f32-i1, fptosi-f16-i5 , etc.

As for the comparison operators icmp and fcmp , the names are of the form {icmp, fcmp}-{CONDITION}-{TYPE} . For example, icmp-eq-i16, icmp-ult-i32, icmp-sge-i64, fcmp-ogt-f32, fcmp-ord-f16, fcmp-ueq-f64 are all valid.

The semantics of these instructions is as specified in the LLVM Language Reference Manual .

Syscalls

Neut supports syscalls, although its current support is far from complete. As you can see in core/VERSION/constant/{linux, darwin}.neut , only basic syscalls like read, write, exit, open, close are supported. This is not something that is caused by theoretical difficulties; Rather, this is simply because of my limited resource of time.

It would be worth noting that a syscall can’t produce/consume a resource. For example, the syscall write is declared to have the following type in Neut:

(constant os.write
  (Π
    ((len i64)
     (out file-descriptor)
     (buf (array len i8))
     (nbyte i64))
    (product (array len i8) i64)))

Compare the type above with the C signature of write :

ssize_t write(int fildes, const void *buf, size_t nbyte);

Ignoring the (len i64) part that is necessary to receive an array of arbitrary length, the crucial difference here is the types of the codomains. While the latter one returns ordinary ssize_t (integer), the former one also returns the original array. This is because the syscall write doesn’t consume (i.e. deallocate) given string. If we write in Neut doesn’t return the original string, the string is never freed in the succeeding program, causing space leak.

Regarding macOS: You may note that the “syscalls” are lowered to some external interface functions on macOS. This is because macOS doesn’t support a direct use of a syscall; Indeed, for example, if we were to use the syscall fork directly (0x2000002), a succeeding malloc causes a fatal error, saying something like mach_vm_map(size=1048576) failed (error code=268435459) .

Implementation note: A syscall can be added via the following procedure:

  1. Declare corresponding constant (like os.write ) using thestatement
  2. Register its argument information and the syscall number in src/Data/Syscall.hs
  3. Rebuild the compiler and run some test codes to check if it really works

Here, the “argument information” is a list consists of one of the following tags: “immediate”, “struct”, “array”, and “unused”. This information is necessary to generate a corresponding syscall without writing each function definition explicitly. For example, consider you add a syscall foo with its argument information [“unused”, “immediate”, “struct”, “array”]. This creates a function with 4 arguments. Let us call these arguments x1 , x2 , x3 , and x4 . The compiler then generates a function that calls the syscall, assuming that the “struct”- and “array”-tagged arguments are borrowed one. That is, the compiler generates a function that is defined schematically as follows:

syscall-foo x1 x2 x3 x4 :=
  let result := CALL_SYSCALL(foo, x2, x3, x4) in
  return (x2, x3, result)

Using this mechanism, for example, the actual implementation of the syscall os.write above can be generated from [“unused”, “immediate”, “array”, “immediate”]:

syscall-os.write A out buf nbyte :=
  let result := CALL_SYSCALL(write, out, buf, nbyte) in
  return (buf, result)

Array Indexing

Neut offers a family of constants that allows us to access an element of an array. The names of these constants are of the form {LOWTYPE}.array-access , where the {LOWTYPE} is either i{n} or f{n} . For example, i8.array-access can be used for indexing of an array of type (array N i8) .

The types of these accesser are of the following form:

(Π
  ((i i64)
   (n i64)
   (_ (array n LOWTYPE)))
  (product (array n LOWTYPE) LOWTYPE))

where the first argument i specifies the index of the array, and the second argument n specifies the length of the array. Using this constant, one can use an element of an array without consuming the array as follows:

(let xs
  (array-introduction i64 1 2 -30 20))

(with identity.bind
  (let a (i64.array-access 0 * &xs)) ; the `*` is inferred to be `4`, since the length of `xs` is 4
  (let b (i64.array-access 1 * &xs))
  (let c (i64.array-access 2 * &xs))
  (let d (i64.array-access 3 * &xs))
  (i64.add a (i64.add b (i64.add c d)))) ; 1 + 2 + (-30) + 20

Unsafe Cast

The constant unsafe.cast : Pi (A : tau, B : tau, x : A). B is available (assuming that the constant is declared using the statement). The semantics of this constant is the no-op cast:

(unsafe.cast t1 t2 e)
~> e

You can easily break the resource management system of Neut using this constant. Consider the following example:

(ensure core/1.0.0.0
  "https://github.com/u2zv1wx/neut-core/raw/master/release/0.1.0.0.tar.gz")

; (note that this declares the constant unsafe.cast)
(include "core/0.1.0.0/core.neut")

(let z (unsafe.cast i64 (Π ((_ i64)) i64) 1))

(let foo z) ; (*1)

(let bar z) ; (*2)

The code above uses the variable z for the two times. Since z is casted to a Π-type, The resulting code tries to copy the integer 1 as if it were a closure. Since the internal representation of a closure at least uses 3 words, this causes an access to an invalid memory region, that is, a segmentation fault.

Compiler Subcommands

The neut binary provides the subcommands in this section.

archive

Example

$ neut archive path/to/source/directory -o path/to/release/directory/0.1.0.0.tar.gz

Notes

The archive subcommand creates a tar.gz archive from the specified directory. Thestatement expects an archive created by this subcommand.

build

Example

$ neut build path/to/file.neut             # create an executable ./file
$ neut build -o output path/to/file.neut   # create an executable ./output
$ neut build --emit llvm path/to/file.neut # create a LLVM IR file ./file.ll
$ neut build --emit asm path/to/file.neut  # create an assembly code ./file.s

Notes

The build subcommand builds given source code and emits resulting code.

Internally, these command firstly creates an LLVM IR, then passes it to clang (if necessary).

check

Example

$ neut check path/to/file.neut
$ neut check --no-color path/to/file.neut
$ neut check --end-of-entry EOE path/to/file.neut

Notes

The check subcommand type-checks given file.

If --no-color option is specified, the result of type checking is printed without console color.

If --end-of-entry SEPARATOR is specified, each entry of the result of type checking is followed by SEPARATOR .

This subcommand is intended to be used with an editor-side syntax checker like flycheck .

Appendix

LLVM IR for the First Example

declare void @free(i8*)
declare i8* @write(i8*, i8*, i8*)
declare i8* @malloc(i64)
define i64 @main() {
  ; <REPETITION 1>
  ; <memory allocation for the string>
  %_11875 = getelementptr i8, i8* null, i64 1
  %_11876 = ptrtoint i8* %_11875 to i64
  %_11877 = call fastcc i8* @malloc(i64 %_11876)
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  %_11878 = bitcast i8* %_11877 to [1 x i8]*
  %_11879 = inttoptr i8 97 to i8*
  %_11880 = ptrtoint i8* %_11879 to i8
  %_11881 = getelementptr [1 x i8], [1 x i8]* %_11878, i32 0, i64 0
  store i8 %_11880, i8* %_11881
  ; <write the string into stdout>
  %_11882 = inttoptr i64 1 to i8*
  %_11883 = inttoptr i64 1 to i8*
  %_11884 = call fastcc i8* @write(i8* %_11882, i8* %_11877, i8* %_11883)
  %_11885 = bitcast i8* %_11877 to [1 x i8]*
  %_11886 = getelementptr [1 x i8], [1 x i8]* %_11885, i32 0, i32 0
  %_11887 = load i8, i8* %_11886
  %_11888 = bitcast [1 x i8]* %_11885 to i8*
  ; <memory deallocation for the string>
  call fastcc void @free(i8* %_11888)
  %_11889 = inttoptr i8 %_11887 to i8*
  ; <REPETITION 2>
  ; <memory allocation for the string>
  %_11890 = getelementptr i8, i8* null, i64 1
  %_11891 = ptrtoint i8* %_11890 to i64
  %_11892 = call fastcc i8* @malloc(i64 %_11891)
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  %_11893 = bitcast i8* %_11892 to [1 x i8]*
  %_11894 = inttoptr i8 97 to i8*
  %_11895 = ptrtoint i8* %_11894 to i8
  %_11896 = getelementptr [1 x i8], [1 x i8]* %_11893, i32 0, i64 0
  store i8 %_11895, i8* %_11896
  ; <write the string into stdout>
  %_11897 = inttoptr i64 1 to i8*
  %_11898 = inttoptr i64 1 to i8*
  %_11899 = call fastcc i8* @write(i8* %_11897, i8* %_11892, i8* %_11898)
  %_11900 = bitcast i8* %_11892 to [1 x i8]*
  %_11901 = getelementptr [1 x i8], [1 x i8]* %_11900, i32 0, i32 0
  %_11902 = load i8, i8* %_11901
  %_11903 = bitcast [1 x i8]* %_11900 to i8*
  ; <memory deallocation for the string>
  call fastcc void @free(i8* %_11903)
  %_11904 = inttoptr i8 %_11902 to i8*
  ; <REPETITION 3>
  ; <memory allocation for the string>
  %_11905 = getelementptr i8, i8* null, i64 1
  %_11906 = ptrtoint i8* %_11905 to i64
  %_11907 = call fastcc i8* @malloc(i64 %_11906)
  ; <create the string "a" (i.e. write the character 'a' to the allocated memory)>
  %_11908 = bitcast i8* %_11907 to [1 x i8]*
  %_11909 = inttoptr i8 97 to i8*
  %_11910 = ptrtoint i8* %_11909 to i8
  %_11911 = getelementptr [1 x i8], [1 x i8]* %_11908, i32 0, i64 0
  store i8 %_11910, i8* %_11911
  ; <write the string into stdout>
  %_11912 = inttoptr i64 1 to i8*
  %_11913 = inttoptr i64 1 to i8*
  %_11914 = call fastcc i8* @write(i8* %_11912, i8* %_11907, i8* %_11913)
  %_11915 = bitcast i8* %_11907 to [1 x i8]*
  %_11916 = getelementptr [1 x i8], [1 x i8]* %_11915, i32 0, i32 0
  %_11917 = load i8, i8* %_11916
  %_11918 = bitcast [1 x i8]* %_11915 to i8*
  ; <memory deallocation for the string>
  call fastcc void @free(i8* %_11918)
  %_11919 = inttoptr i8 %_11917 to i8*
  ; <return 0>
  %_11920 = inttoptr i64 0 to i8*
  %_11921 = ptrtoint i8* %_11920 to i64
  ret i64 %_11921
}

LLVM IR for the Second Example

declare void @free(i8*)
declare i8* @write(i8*, i8*, i8*)
declare i8* @malloc(i64)
define i64 @main() {
  ; memory allocation
  %_12034 = getelementptr i8, i8* null, i64 1
  %_12035 = ptrtoint i8* %_12034 to i64
  %_12036 = call fastcc i8* @malloc(i64 %_12035)
  ; create the string "a" (i.e. write the character 'a' to the allocated memory)
  %_12037 = bitcast i8* %_12036 to [1 x i8]*
  %_12038 = inttoptr i8 97 to i8* ; a = 97
  %_12039 = ptrtoint i8* %_12038 to i8
  %_12040 = getelementptr [1 x i8], [1 x i8]* %_12037, i32 0, i64 0 ; where to write 'a'
  store i8 %_12039, i8* %_12040 ; write a
  ; write the string into stdout for the three times
  %_12041 = inttoptr i64 1 to i8* ; this `1` stands for stdout
  %_12042 = inttoptr i64 1 to i8* ; this `1` is the length of the string in bytes
  %_12043 = call fastcc i8* @write(i8* %_12041, i8* %_12036, i8* %_12042)
  %_12044 = inttoptr i64 1 to i8*
  %_12045 = inttoptr i64 1 to i8*
  %_12046 = call fastcc i8* @write(i8* %_12044, i8* %_12036, i8* %_12045)
  %_12047 = inttoptr i64 1 to i8*
  %_12048 = inttoptr i64 1 to i8*
  %_12049 = call fastcc i8* @write(i8* %_12047, i8* %_12036, i8* %_12048)
  ; memory deallocation
  %_12050 = bitcast i8* %_12036 to [1 x i8]*
  %_12051 = getelementptr [1 x i8], [1 x i8]* %_12050, i32 0, i32 0
  %_12052 = load i8, i8* %_12051
  %_12053 = bitcast [1 x i8]* %_12050 to i8*
  call fastcc void @free(i8* %_12053)
  %_12054 = inttoptr i8 %_12052 to i8*
  ; return 0
  %_12055 = inttoptr i64 0 to i8*
  %_12056 = ptrtoint i8* %_12055 to i64
  ret i64 %_12056
}

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

算法设计与分析

算法设计与分析

屈婉玲、刘田、张立昂、王捍贫 / 清华大学 / 2011-5 / 25.00元

《算法设计与分析》为计算机科学技术专业核心课程“算法设计与分析”教材.全书以算法设计技术和分析方法为主线来组织各知识单元,主要内容包括基础知识、分治策略、动态规划、贪心法、回溯与分支限界、算法分析与问题的计算复杂度、NP完全性、近似算法、随机算法、处理难解问题的策略等。书中突出对问题本身的分析和求解方法的阐述,从问题建模、算法设计与分析、改进措施等方面给出适当的建议,同时也简要介绍了计算复杂性理论......一起来看看 《算法设计与分析》 这本书的介绍吧!

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具

随机密码生成器
随机密码生成器

多种字符组合密码

Base64 编码/解码
Base64 编码/解码

Base64 编码/解码