#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
import Data.Foldable (foldl')

data Foo = Foo Int
  deriving Show
data Bar = Bar !Int
  deriving Show
newtype Baz = Baz Int
  deriving Show

main :: IO ()
main = do
  print $ foldl'
    (\(Foo total) x -> Foo (total + x))
    (Foo 0)
    [1..1000000]
  print $ foldl'
    (\(Bar total) x -> Bar (total + x))
    (Bar 0)
    [1..1000000]
  print $ foldl'
    (\(Baz total) x -> Baz (total + x))
    (Baz 0)
    [1..1000000]
Foo vs Bar vs Baz? A few differences.
Advantages of strictness annotations:
Recommendation: if you don’t need laziness in a field, make it strict.
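As a minimal sketch of that recommendation, here is a strict accumulator used with foldl'. The Acc type and the mean function are hypothetical names made up for this illustration. Because both fields are strict, forcing the Acc constructor at each step also forces the running total and count, so no chain of thunks builds up.

```haskell
import Data.Foldable (foldl')

-- Hypothetical accumulator type: both fields are strict, so evaluating
-- the constructor to WHNF also evaluates the total and the count.
data Acc = Acc !Int !Int

-- Mean of a list of Ints. Returning 0 for the empty list is an
-- arbitrary choice made for this sketch.
mean :: [Int] -> Double
mean xs =
  case foldl' step (Acc 0 0) xs of
    Acc _ 0 -> 0
    Acc total count -> fromIntegral total / fromIntegral count
  where
    step (Acc total count) x = Acc (total + x) (count + 1)

main :: IO ()
main = print (mean [1 .. 1000000])
```

With lazy fields instead, foldl' would only force the outer constructor and leave the additions unevaluated until the end.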
#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
import Data.Foldable (foldl')
import UnliftIO.Exception (pureTry)

data Foo = Foo Int
  deriving Show
data Bar = Bar !Int
  deriving Show
newtype Baz = Baz Int
  deriving Show

main :: IO ()
main = do
  print $ pureTry $
    case Foo undefined of
      Foo _ -> "Hello World"
  print $ pureTry $
    case Bar undefined of
      Bar _ -> "Hello World"
  print $ pureTry $
    case Baz undefined of
      Baz _ -> "Hello World"
Foo contains:
A pointer to an Int (one word)
Int has a data constructor (one word)
Int has a payload Int# (we’ll get to later, one word)
Bar in theory has the exact same thing, but wait till next section.
Baz is a newtype, guaranteed to have no runtime representation.
Int itself is still two words.
That extra Int data constructor is annoying, get rid of it!
data Bar = Bar {-# UNPACK #-} !Int
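This unpacks the Int into the Bar representation.
With optimizations on, GHC already unpacks small strict fields (such as Int) automatically, so {-# UNPACK #-} on Int is usually redundant.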
Why not always unpack fields? It can be a pessimization with large data types due to copying lots of data instead of copying a single pointer. If the value is a machine word, it’s always better to unpack, thus the primitive type optimization.
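A small sketch of that trade-off follows. Big, Wrapper, and retag are hypothetical names invented for this illustration. The word-sized field is unpacked, while the large field stays behind a single pointer, so rebuilding a Wrapper copies two fields rather than nine.

```haskell
-- Hypothetical large record: eight strict Int fields.
data Big = Big !Int !Int !Int !Int !Int !Int !Int !Int

-- Hypothetical wrapper mixing an unpacked word with a pointer to Big.
data Wrapper = Wrapper
  { wrapperId  :: {-# UNPACK #-} !Int -- one machine word, stored inline
  , wrapperBig :: !Big                -- strict, but kept as one pointer
  }

-- Record update allocates a new Wrapper. Only the Int and the pointer
-- are copied; the Big value itself is shared, not duplicated.
retag :: Int -> Wrapper -> Wrapper
retag n w = w { wrapperId = n }

main :: IO ()
main = print (wrapperId (retag 7 (Wrapper 1 (Big 0 0 0 0 0 0 0 0))))
```

Had wrapperBig been unpacked as well, every retag would copy all eight Int fields into the new Wrapper.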
Int is defined in normal Haskell code; it’s not a GHC built-in. Don’t believe me?
https://www.stackage.org/haddock/lts-12.21/ghc-prim-0.5.2.0/GHC-Types.html#t:Int
data Int = I# Int#
data Word = W# Word#
Magic hash!
$ stack exec -- ghci -XMagicHash
GHCi, version 8.0.1: https://www.haskell.org/ghc/ :? for help
Prelude> import GHC.Prim
Prelude GHC.Prim> :k Int#
Int# :: TYPE 'GHC.Types.IntRep
Int# is the magic, built-in type provided by GHC, in the ghc-prim package.
#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
{-# LANGUAGE MagicHash #-}
import GHC.Prim
import GHC.Types
main :: IO ()
main = print $ I# (5# +# 6#)
High level, good code:
#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
main :: IO ()
main = print $ sum [1..100 :: Int]
Hopefully GHC optimizes this into a tight loop. But let’s write that tight loop manually:
#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
{-# LANGUAGE BangPatterns #-}
main :: IO ()
main = print $ loop 0 1
  where
    loop !total i
      | i > 100 = total
      | otherwise = loop (total + i) (i + 1)
Why is there a bang pattern on total but not i?
OK, let’s get primitive!
#!/usr/bin/env stack
-- stack --resolver lts-12.21 script
{-# LANGUAGE MagicHash #-}
import GHC.Prim
import GHC.Types
main :: IO ()
main = print $ I# (loop 0# 1#)
  where
    loop total i
      | isTrue# (i ># 100#) = total
      | otherwise = loop (total +# i) (i +# 1#)
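The same primitive loop can be hidden behind an ordinary Int interface by unboxing at the boundary. This is a sketch only: sumTo is a hypothetical helper name, and the imports come from GHC.Exts (part of base) rather than GHC.Prim and GHC.Types.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>#))

-- Hypothetical helper: sum of 1..limit. Pattern matching on I# exposes
-- the raw Int# payload; the result is boxed again with I#.
sumTo :: Int -> Int
sumTo (I# limit) = I# (loop 0# 1#)
  where
    -- Both arguments are unboxed, so there are no thunks to worry about
    -- and no bang patterns are needed.
    loop :: Int# -> Int# -> Int#
    loop total i
      | isTrue# (i ># limit) = total
      | otherwise = loop (total +# i) (i +# 1#)

main :: IO ()
main = print (sumTo 100)
```

Callers only ever see Int; the MagicHash details stay inside the function.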
Example:
data Foo = Bar !Int !Int | Baz !Int | Qux
How much memory is needed for:
Bar 5 6
Baz 5
Qux
Constructors with no fields (like Qux or Nothing): one copy in memory shared by all usages.
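The arithmetic can be sketched with a simplified model. The assumptions are mine: a 64-bit machine, one header word per heap object, and strict Int fields unpacked into one word each (as happens with optimizations on). wordSize and heapBytes are hypothetical names for this illustration.

```haskell
-- Assumption: 64-bit machine, so one word is 8 bytes.
wordSize :: Int
wordSize = 8

-- Simplified model: bytes of fresh heap for one constructor application
-- with the given number of unpacked, word-sized fields. A constructor
-- with no fields allocates nothing new, since one shared copy is reused.
heapBytes :: Int -> Int
heapBytes 0 = 0
heapBytes fields = (1 + fields) * wordSize

main :: IO ()
main = mapM_ print
  [ ("Bar 5 6", heapBytes 2) -- header + two Int# payloads
  , ("Baz 5", heapBytes 1)   -- header + one Int# payload
  , ("Qux", heapBytes 0)     -- shared, no per-use allocation
  ]
```

Under these assumptions the model gives 24 bytes for Bar 5 6, 16 bytes for Baz 5, and no per-use allocation for Qux. Without unpacking, each field would instead be a pointer to a separate Int.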
Compare the following:
data Result = Success !Int | Failure
data MaybeResult = SomeResult !Result | NoResult
Versus:
data MaybeResult = Success !Int
                 | Failure
                 | NoResult
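To make the comparison concrete, here is a sketch that defines both shapes side by side and converts from the layered one to the flat one. The flat type and its constructors are renamed (FlatResult, FlatSuccess, and so on) only to avoid clashing with the layered definitions; flatten is a hypothetical helper.

```haskell
-- Layered version: reaching the Int takes two pattern matches.
data Result = Success !Int | Failure
  deriving (Show, Eq)
data MaybeResult = SomeResult !Result | NoResult
  deriving (Show, Eq)

-- Flattened version, renamed for this sketch to avoid name clashes.
data FlatResult = FlatSuccess !Int | FlatFailure | FlatNoResult
  deriving (Show, Eq)

-- Hypothetical conversion from the layered to the flat representation.
flatten :: MaybeResult -> FlatResult
flatten (SomeResult (Success n)) = FlatSuccess n
flatten (SomeResult Failure) = FlatFailure
flatten NoResult = FlatNoResult

main :: IO ()
main = mapM_ (print . flatten)
  [SomeResult (Success 5), SomeResult Failure, NoResult]
```

In the flat version a single case expression distinguishes all three outcomes, and a successful result is one heap object instead of two.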
Takeaway: if performance is crucial, consider “inlining” layered sum types. Downside:
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
case-ing