A story told by Type Errors

栏目: IT技术 · 发布时间: 4年前

内容简介:Custom type errors is an extremely powerful tool for improving the UX of Haskell libraries. However, they are not used frequently enough. Partially because this technique requires the usage of some advanced Haskell concepts like type families, data kinds a

Custom type errors is an extremely powerful tool for improving the UX of Haskell libraries. However, they are not used frequently enough. Partially because this technique requires the usage of some advanced Haskell concepts like type families, data kinds and kind polymorphism. And partially because not everyone is aware of such a valuable piece of standard Haskell library.

In this blog post I’m going to show that using custom type errors is a simple task. I will present a lot of very different usage examples and teach you how to flavour your Haskell code with useful compile-time error messages.

Motivation

I can not express how important it is to have lucid error messages. An ideal error message should not only point out an incorrect piece of code but also suggest how to fix it. Unfortunately, not all standard error messages are that helpful. Moreover, GHC cannot know in advance about all possible usages for various functions. That’s why error messages are not attached to particular use cases. Often they are vague due to the fact how the type system in GHC is implemented and this makes errors hard to understand sometimes. However, most of the common types and functions became standard idioms in day-to-day Haskell programming. So why not give Haskell users a helping hand and guide them how to use the language efficiently by exploiting the power of the type system itself?

What is TypeError?

Custom type errors mechanism allows Haskell developers to introduce their own compile-time error messages about usages of their functions without a need to fork GHC and patch it for the particular use cases. It provides a user-level way for extending the capabilities of the compiler. Custom user error messages can use the information only about types. With such type errors, you can’t introduce new parse errors about Haskell syntax (for that you actually need to fork GHC and patch it). But using type errors you can guide users of your library in the right direction of using your functions and types.

To use custom type errors you need to perform two steps:

ErrorMessage
TypeError

The following sections explain what is a type family, how TypeError and ErrorMessage look like and how to use them.

For a deeper understanding of type-level computations in Haskell I recommend reading Thinking with Types by Sandy Maguire.

Short intro to type families

In simple words, type family is a type-level function from types to types. Below you can see the example of some simple type family that for a given type of unsigned numeric values returns signed type that contains every unsigned value of the corresponding type:

type family Signed (t :: Type) :: Type where
    Signed Word8   = Int16
    Signed Natural = Integer

Defining a type family is as simple as defining an ordinary function. The only difference is that type families take types as arguments and return types as their result. You can use :k or :kind command in GHCi to see the type of any type family. And you can use :kind! to apply type family and evaluate it to see the result:

ghci> :kind Signed
Signed :: * -> *

ghci> :kind! Signed Natural
Signed Natural :: *
= Integer

ghci> :kind! Signed Int
Signed Int :: *
= Signed Int

NOTE:GHC is moving towards renaming * to Type and you already can use Type in your code to specify kinds. But GHCi still displays the Type kind as * .

Our Signed type family is not defined for the Int type that’s why we see in GHCi that the result of Signed Int is not evaluated. Above we’ve implemented so-called closed type family — it is defined only for those types that we specified under where clause. Just like familiar term-level functions. There also exist open type families that can be extended externally. But we are going to talk only about closed type families in this blog post.

The Signed type family can be useful to define a safe interface like this one:

class ToSigned a where
    toSigned :: a -> Signed a

In the ToSigned typeclass the result type depends on the argument type and it is possible to have instances of this typeclasses only for types handled by the Signed type family.

TypeError type family

Let’s look at the implementation details of the TypeError type family first. Below I provide its full definition from base :

type family TypeError (a :: ErrorMessage) :: b where

TypeError is a type family that takes a type of ErrorMessage kind and returns a type of a polymorphic kind (not just Type as in the example from the previous section). The fact that TypeError has polymorphic kind of the result means that the result can be used in many places, like constraint context or return value of any function (though usually it’s used inside constraints).

You can notice that the TypeError type family doesn’t have a body at the language-level. There is nothing after the where keyword. And it is also a closed type family which means that you can’t extend it externally. The implementation of TypeError is baked into GHC internals.

Short intro to DataKinds

In Haskell, every type can be typed. The type of type is called kind . You can inspect the value type in GHCi using :t command. Similarly, you can inspect the kind of type using :k command.

ghci> :t True
True :: Bool

ghci> :t Bool
    error: Data constructor not in scope: Bool

ghci> :k Bool
Bool :: *

ghci> :k True
    Not in scope: type constructor or classTrue
    A data constructor of that name is in scope; did you mean DataKinds?

You can see that you can’t inspect a type of type and you can’t inspect a kind of value. However, the last error message is intriguing. What does it mean and what is DataKinds ?

Turns out that GHC provides an ability to promote value-level data constructors to type-level. To be able to use promoted data constructors you need to enable the DataKinds language extension and add a single quote in front of the constructor:

ghci> :set -XDataKinds
ghci> :k 'True
'True :: Bool

True is a value that has type Bool and Bool has kind Type . But 'True is a type that has kind Bool . So we promoted constructors to types and types to kinds. You may ask if 'True is a type then what values it has? The answer is that it doesn’t have any values, this type is uninhabited. Not every type has values. Though, such promoted types still can be useful for type-level computations.

ErrorMessage data type

Now, let’s have a look at the ErrorMessage data type which is used as an argument in the TypeError type family:

data ErrorMessage
    = Text Symbol
    | forall t. ShowType t
    | ErrorMessage :<>: ErrorMessage
    | ErrorMessage :$$: ErrorMessage

This data type looks attractive. What we can notice first is that it has several constructors that are defined as operators ( :<>: and :$$: ). Second, this data type is intended to be used on the type-level, not value-level. That’s why the Text constructor stores a type-level string of kind Symbol , not just String . Symbol is a kind of type-level strings in Haskell:

ghci> :t "Ordinary string"
"Ordinary string" :: String

ghci> :k "Type-level string"
"Type-level string" :: Symbol

The ErrorMessage constructors have the following meaning:

Text
ShowType
:<>:
:$$:

Let’s try to build our first error message!

type FooMessage =
    'Text "First line of the foo message" ':$$:
    'Text "Second line of the foo message: " ':<>: 'ShowType ErrorMessage

Note how we prepend every constructor (even operators) with a single quote. This is because we are using DataKinds to create type-level values from promoted constructors. You can check that FooMessage is indeed a type-level value that has ErrorMessage kind:

ghci> :k FooMessage
FooMessage :: ErrorMessage

After creating an error message we can finally use it!

foo :: TypeError FooMessage
foo = error "unreachable"

Now, if you will try to compile the module which has this foo function, you will see the following error message:

TypeErrors.hs:19:8: error:
    • First line of the foo message
      Second line of the foo message: ErrorMessage
    • In the type signature: foo :: TypeError FooMessage

19 | foo :: TypeError FooMessage
   |        ^^^^^^^^^^^^^^^^^^^^

This use case is not particularly useful but it should demonstrate how to construct and use custom type errors. In the following sections I’m going to explain how to implement a lot of cool and useful stuff with type errors.

You can construct text of error messages much easier by using the type-errors-pretty library.

Motivating example: adding two lists

One of the most common Haskell typeclasses is the Num typeclass. It contains a lot of arithmetic operations, including number addition:

ghci> :t (+)
(+) :: Num a => a -> a -> a

This operator is used to add two numbers. But what will happen if we try to add two lists with this operator?

ghci> [1,2] + [3,4]

<interactive>:4:1: error:
Non type-variable argument in the constraint: Num [a]
      (Use FlexibleContexts to permit this)
When checking the inferred type
        it :: forall a. (Num a, Num [a]) => [a]

As you can see, we can’t add two lists. It is a compile-time error. And this is a good thing: no implicit casts, no undefined and unexpected behaviour. But the unfavourable part of it is the error message itself. I can imagine how horrified and frustrated Haskell beginners could be after looking at this error message because from the beginner’s point of view the text of the message doesn’t make any sense at all! Sure, it is understandable if you are using Haskell long enough and you learned how to decode error messages. But not at the start of your Haskell adventure…

Imagine if the error message could look like this instead:

ghci> [1,2] + [3,4]

    • You've tried to perform an arithmetic operation on lists.
      Possibly one of those: (+), (-), (*), fromInteger, negate, abs

      If you tried to add two lists like this:

          ghci> [5, 10] + [1, 2, 3]

      Then this is probably a typo and you wanted to append two lists.
      Use (++) operator to append two lists.

          ghci> [5, 10] ++ [1, 2, 3]
          [5, 10, 1, 2, 3]

      If you want to combine a list of numbers with an arithmetic operation,
      you can either use 'zipWith' for index-wise application:

          ghci> zipWith (*) [5, 10] [1, 2, 3]
          [5, 20]

      or 'liftA2' for pairwise application:

          ghci> liftA2 (*) [5, 10] [1, 2, 3]
          [5, 10, 15, 10, 20, 30]

      If you want to apply unary function to each element of the list, use 'map':

          ghci> map negate [2, -1, 0, -5]
          [-2, 1, 0, 5]


    • In the expression: [1, 2] + [3, 4]
      In an equation for ‘it’: it = [1, 2] + [3, 4]

Much better! Now it is more clear what went wrong and how to fix the error. Good news is that it is actually possible to provide such a neat error message. You just need to do the following:

type ListNumMessage
    = ... the above text constructed using earlier explained syntax ...

instance TypeError ListNumMessage => Num [a]

NOTE:If you compile your code with -Wredundant-constraints flag you see a lot of warnings about unused constraints in your code when using custom type errors. This is an unfortunate drawback but it’s not that bad.

We’ve implemented the Num instance for the list. So every time this instance is used, GHC will kindly tell why we can’t use it and how to fix the error. If, instead, there is no instance for some data type then all you see is the error message saying “There is no such instance”. However, if you know that there is no reasonable instance of this typeclass and it can be used in the wrong context and confuse users of your library then you can forbid this instance and provide helpful error message explaining your motivation and guiding users in the right direction when this instance is used.

I think that UX of Haskell beginners can be improved a lot if such instances were added not only for lists but also for Bool , Char , String , tuples, IO and many more data types. It will take time to write all the messages. But you only need to do this once and from now on till the end of times all developers would benefit from these instances.

A similar approach is heavily used by the silica library which implements lenses but has a lot of nice custom error messages that help developers to understand the concept of lenses better.

Create the restricted instance: Eq instance for the function

In the previous example, we’ve forbidden the instance completely. But sometimes we want to forbid it partially (wat?). I’ll explain what I mean.

Let’s say that we want to check whether two functions are equal (this might be useful for property-based testing or during refactoring). It is a difficult question: what does the function equality mean in programming? In math we have a formal definition . This means that in Haskell we can apply this definition in the context of pure computations. We can say that two functions are equal if they produce the same output for all possible values of their arguments. This property can be encoded via the following instance:

instance (Bounded a, Enum a, Eq b) => Eq (a -> b) where
    (==) :: (a -> b) -> (a -> b) -> Bool
    f == g = let universe = [minBound .. maxBound]
             in map f universe == map g universe

And we can verify that the instance works:

boolId1, boolId2, boolId3 :: Bool -> Bool
boolId1 = id
boolId2 = not . not
boolId3 = not

ghci> boolId1 == boolId2
True
ghci> boolId1 == boolId3
False

However, this instance is dangerous because we can accidentally compare two functions that have an argument of type Int (or some other type with a lot of values) and checking that two functions produce the same result for every Int might take half of the Universe lifetime. So, what we eventually want is to allow function equality but only when function argument is a “small” data type. Fortunately, this can be achieved by the combination of type families and custom error types. The idea is to implement a type family that pattern-matches on a type and returns type error constraint only for non-small types. Otherwise, it should return empty constraint. See the code snippet below for the implementation:

type FunEqMessage (arg :: Type) (res :: Type) = ... message ...

type family CheckFunArg (arg :: Type) (res :: Type) :: Constraint where
    CheckFunArg Bool  _ = ()
    CheckFunArg Int8  _ = ()
    CheckFunArg Word8 _ = ()
    CheckFunArg arg   r = TypeError (FunEqMessage arg r)

instance (CheckFunArg a b, Bounded a, Enum a, Eq b) => Eq (a -> b) where
   ... implementation stays the same ...

We’ve implemented CheckFunArg type family manually. The type-errors library can help with more complicated functions.

And now we can safely use it!

inc1, inc2 :: Int -> Int
inc1 = (+1)
inc2 = succ
ghci> inc1 == inc2

    • You've attempted to compare two functions of the type:

          Int -> Int

      To compare functions their argument should be one of the following types:

          Bool, Int8, Word8

      However, the functions have the following argument type:

          Int

    • In the expression: inc1 == inc2
      In an equation for ‘it’: it = inc1 == inc2

NOTE:With this instance it is still possible to hang function comparison if you will try to compare two functions of type like:

Int8 -> Int8 -> Int8 -> Int8 -> Bool

But this is just an implementation detail how to patch this instance to take such cases into consideration.

Restrict instance externally: Foldable

This use case is similar to the previous one, but now we want to restrict some functions from the already implemented instance. Some instances are written in the external libraries. It is not possible in Haskell to not import some instances. But sometimes you really want to not have all of them. Or, alternatively, the instance itself is useful except a couple of dangerous functions.

Example:there exist well-known efficient container types Set and HashSet from the containers and unordered-containers packages correspondingly. These data structures provide fast modification and query operation. Like these two:

member :: Ord a              => a ->     Set a -> Bool
member :: (Eq a, Hashable a) => a -> HashSet a -> Bool

However, default Prelude exports the following method of the Foldable typeclass:

elem :: (Foldable f, Eq a) => a -> f a -> Bool

The problem here is that elem for both Set and HashSet works in O(n) time while member works in O(log n) time and it’s quite easy to accidentally use slow elem instead of fast member function.

NOTE:It is worth mentioning that it is actually possible to patch the Foldable typeclass itself so it can have an efficient implementation of the member method for both Set and HashSet . The change was proposed earlier but haven’t been accepted.

Fortunately, it’s still possible to have useful Foldable instances for Set and HashSet but produce a compile-time error message when you are using elem or notElem from Foldable . This trick is implemented in the relude alternative prelude. See the full code in the repository:

NOTE:Code in relude also contains tests for custom error messages using doctest . With such an approach, you can automatically check your compile-time messages at runtime.

The idea behind the implementation is to reexport our own version of the elem function which just delegates the implementation to elem from Data.Foldable but it has an additional constraint over the argument that pattern-matches on the type and produces either error message or empty constraint. As you can see, this technique can be extended even further. If you want to forbid instance completely, you just need to forbid every method of the typeclass for the particular data type.

Below you can see a general template for solving described problems:

module BetterFoo (fooBar) where

import qualified Foo

fooBar :: (DisallowFooBar a, Foo.Foo a) => a -> Bar
fooBar = Foo.fooBar

type family DisallowFooBar (a :: Type) :: Constraint where
    DisallowFooBar Baz = TypeError ... error message …
    DisallowFooBar a   = ()

NOTE:In the first example with the list we provided fat text for the whole instance because we don’t know which function was used. But now you can see how we can be aware of the fact which function is used and provide more specific error messages.

Deprecation: better migration guide

According to PvP it’s okay to remove a function from a library while you are increasing major version. However, from the users of the library point of view function removal may hurt when they try to upgrade to a newer version of your library. If the function you are using is removed from the library, the only error you get is that there is no such function. And now you need to find CHANGELOG for the library to see what to use instead and if the library author doesn’t provide migration guide, you need to read comments and reasoning under the corresponding issues (again, only if there was an issue at first place). This is a poor UX.

GHC gives developers an ability to mark their functions with the DEPRECATED pragma and provide a custom message. When a deprecated function is used you will see a warning with the specified text. Unfortunately, it’s very easy to miss a warning message. And, again, once the function is removed, you will see only the message that there is no such function. However, with custom type errors we can make deprecation cycles smoother. Instead of removing the function, you can add custom type error to the function saying that this function is deprecated. So every time the function is used, users will see compiler error telling what to do instead. It is still a compiler error like it would be if you removed the function, but now at least users know how to fix it with less hassle.

It’s extremely easy to introduce such deprecation message. This can be done via the following code:

class CompilerError (msg :: ErrorMessage)
instance TypeError msg => CompilerError msg

type ParseDeprecated = ... message goes here ...

parse :: CompilerError ParseDeprecated => FilePath -> IO ()
parse = error "unreachable"

Here is what we do:

  1. We create the CompilerError typeclass. Usually, you create typeclasses for types of kind Type or Type -> Type . In this case, we create typeclass for things of the ErrorMessage kind.
  2. Then we create a single instance of CompilerError for every ErrorMessage with the TypeError constraint for that message. One instance to rule all error messages.
  3. You can see how to use this typeclass from the parse function type signature: just add it to the constraint with the specified error message. Now every time the parse function is used, you will get an error message like this one:
ghci> parse "path/to/config"

    • Function 'parse' was deprecated in my-parser-1.2.6.0.
      It will be deleted in my-parser-1.3.0.0.
      Use 'parseConfig' instead.

      See the following issue for motivation:

          * https://github.com/user/my-parser/issue/42

    • In the expression: parse "path/to/config"
      In an equation for ‘it’: it = parse "path/to/config"

Why write migration guide in some separate document if you can force the compiler to show this guide to every user of your library? I’m just joking. Please, write migration guides anyways, a separate document is still super useful!

Analyse Generic representation of a type: names, fields, constructors

Final usage of custom type errors is more advanced than the previous ones. The approach uses Generic capabilities to analyse the structure of the data types during compilation. Generics allow haskellers to derive automatically instances of arbitrary data types. However, it’s not always possible to derive instances. Sometimes you want to check whether a structure of a data type satisfies specific requirements. For example, if you don’t support automatic deriving for sum types, it is better to tell about this fact during compilation, not from docs or runtime. But with Generic you can do much more advanced checks! See elm-street for examples of different compile-time verification.

A possible list of compile-time analytics includes:

  1. A data type contains only a single constructor.
  2. A data type has a low number of fields.
  3. Every constructor has the same fields with the same name.
  4. A data type has a field of a particular type.
  5. Every field of a data type is a newtype.
  6. A data type is an enumeration.

And much more. The choices are limited only by your imagination!

This approach is heavily used by the generic-lens library to check the structure of data type before allowing to use lenses.

Conclusion

You can see that custom type errors is a really powerful mechanism. It can increase the quality of type errors by a lot. But they require some effort from the developers. And sometimes they require some advanced Haskell knowledge and understanding of more challenging topics. But in the end, it pays off a lot.

Acknowledgements

I want to thank Veronika Romashkina for her support and help with the blog post. I spent a lot of time and effort compiling and explaining all the examples and I couldn’t do this without her help.


以上所述就是小编给大家介绍的《A story told by Type Errors》,希望对大家有所帮助,如果大家有任何疑问请给我留言,小编会及时回复大家的。在此也非常感谢大家对 码农网 的支持!

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

阿里巴巴Java开发手册

阿里巴巴Java开发手册

杨冠宝 / 电子工业出版社 / 2018-1 / 35

《阿里巴巴Java开发手册》的愿景是码出高效,码出质量。它结合作者的开发经验和架构历程,提炼阿里巴巴集团技术团队的集体编程经验和软件设计智慧,浓缩成为立体的编程规范和最佳实践。众所周知,现代软件行业的高速发展对开发者的综合素质要求越来越高,因为不仅是编程相关的知识点,其他维度的知识点也会影响软件的最终交付质量,比如,数据库的表结构和索引设计缺陷可能带来软件的架构缺陷或性能风险;单元测试的失位导致集......一起来看看 《阿里巴巴Java开发手册》 这本书的介绍吧!

JS 压缩/解压工具
JS 压缩/解压工具

在线压缩/解压 JS 代码

JSON 在线解析
JSON 在线解析

在线 JSON 格式化工具

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具