Dissecting Guix, Part 3: G-Expressions

Welcome back to Dissecting Guix! Last time, we discussed monads, the functional programming idiom used by Guix to thread a store connection through a series of store-related operations.

Today, we'll be talking about a concept rather more specific to Guix: g-expressions. Being an implementation of the Scheme language, Guile is built around s-expressions, which can represent, as the saying goes, code as data, thanks to the simple structure of Scheme forms.

As Guix's package recipes are written in Scheme, it naturally needs some way to represent code that is to be run only when the package is built. Additionally, there needs to be some way to reference dependencies and retrieve output paths; otherwise, you wouldn't be able to, for instance, create a phase to install a file in the output directory.

So, how do we implement this "deferred" code? Well, initially Guix used plain old s-expressions for this purpose.

Once Upon a Time

Let's say we want to create a store item that's just a symlink to the bin/irssi file of the irssi package. How would we do that with an s-expression? Well, the s-expression itself, which we call the builder, is fairly simple:

(define sexp-builder
  `(let* ((out (assoc-ref %outputs "out"))
          (irssi (assoc-ref %build-inputs "irssi"))
          (bin/irssi (string-append irssi "/bin/irssi")))
     (symlink bin/irssi out)))

If you aren't familiar with the "quoting" syntax used to create s-expressions, I strongly recommend that you read the excellent Scheme Primer; specifically, section 7, Lists and "cons", and section 11, On the extensibility of Scheme (and Lisps in general).
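As a quick refresher, here is a minimal sketch of quote, quasiquote and unquote in plain Guile. The program variable and its value are made up for illustration, and nothing here is specific to Guix.

```scheme
;; Hypothetical value, used only for this illustration.
(define program "irssi")

;; Plain quote: nothing is evaluated, so PROGRAM stays a symbol.
(define quoted
  '(string-append "/bin/" program))

;; Quasiquote with unquote: PROGRAM is replaced by its value, "irssi".
(define quasiquoted
  `(string-append "/bin/" ,program))

;; The resulting list is ordinary data until we hand it to EVAL.
(define result
  (eval quasiquoted (interaction-environment)))
```

The quasiquoted list stays inert until it is evaluated, which is the same "code as data" idea that builders rely on.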

The %outputs and %build-inputs variables are bound within builder scripts to association lists, which are lists of pairs that act like key/value stores, for instance:

'(("foo" . "bar")
  ("floob" . "blarb")
  ("fvoolag" . "bvarlag"))

To retrieve values from association lists, which are often referred to as alists, we use the assoc-ref procedure:

(assoc-ref '(("boing" . "bouncy")
             ("floing" . "flouncy"))
           "boing")
⇒ "bouncy"

%outputs, as the name might suggest, maps derivation output names to the paths of their respective store items, the default output being out, and %build-inputs maps input labels to their store items.
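To make the lookups concrete, here is a small sketch in plain Guile that performs the same steps as sexp-builder on hand-written alists. The names resolve-paths, fake-outputs and fake-build-inputs are hypothetical, and the store paths are placeholders rather than real store items; inside a real builder, Guix binds %outputs and %build-inputs for you.

```scheme
;; Stand-ins for the alists that Guix binds inside a builder script.
(define fake-outputs
  '(("out" . "/gnu/store/…-irssi-symlink")))
(define fake-build-inputs
  '(("irssi" . "/gnu/store/…-irssi-1.4.3")))

;; The same lookups as in SEXP-BUILDER, minus the actual SYMLINK call: return
;; the link target and the link name as a two-element list.
(define (resolve-paths outputs build-inputs)
  (let* ((out (assoc-ref outputs "out"))
         (irssi (assoc-ref build-inputs "irssi"))
         (bin/irssi (string-append irssi "/bin/irssi")))
    (list bin/irssi out)))

(define resolved
  (resolve-paths fake-outputs fake-build-inputs))
```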

The builder is the easy part; we now need to turn it into a derivation and tell it what "irssi" actually refers to. For this, we use the build-expression->derivation procedure from (guix derivations):

(use-modules (guix derivations)
             (guix packages)
             (guix store)
             (gnu packages guile)
             (gnu packages irc))

(with-store store
  (let ((guile-3.0-drv (package-derivation store guile-3.0))
        (irssi-drv (package-derivation store irssi)))
    (build-expression->derivation store "irssi-symlink" sexp-builder
      #:guile-for-build guile-3.0-drv
      #:inputs `(("irssi" ,irssi-drv)))))
⇒ #<derivation /gnu/store/…-irssi-symlink.drv => /gnu/store/…-irssi-symlink …>

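There are several things to note here: the inputs must all be derivations, so we first convert the packages with package-derivation; we pass #:guile-for-build explicitly; and, because build-expression->derivation is not monadic, we have to hand it the store connection ourselves.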

The shortcomings of using s-expressions in this way are numerous: we have to convert everything to a derivation before using it, and inputs are not an inherent aspect of the builder. G-expressions were designed to overcome these issues.

Premortem Examination

A g-expression is fundamentally a record of type <gexp>, which is, naturally, defined in (guix gexp). The two most important fields of this record type, out of a total of five, are proc and references; the former is a procedure that returns the equivalent s-expression, the latter a list containing everything from the "outside world" that's used by the g-expression.

When we want to turn the g-expression into something that we can actually run as code, we combine these two fields by first building any g-expression inputs that can become derivations (leaving alone those that cannot), and then passing the built references as the arguments of proc.
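The interplay between proc and references can be sketched with a toy model in plain Guile. This is not the real <gexp> record from (guix gexp): every name beginning with toy- is hypothetical, and "lowering" a reference is faked by turning a symbol into a made-up path.

```scheme
(use-modules (srfi srfi-9))

;; A toy stand-in for <gexp> with only the two fields discussed above.
(define-record-type <toy-gexp>
  (make-toy-gexp references proc)
  toy-gexp?
  (references toy-gexp-references)   ;things from the "outside world"
  (proc toy-gexp-proc))              ;builds the s-expression from them

;; Pretend to build a reference: symbols become made-up paths, anything
;; else is left alone.
(define (toy-lower reference)
  (if (symbol? reference)
      (string-append "/toy-store/" (symbol->string reference))
      reference))

;; Combine the two fields: lower every reference, then pass the results as
;; the arguments of PROC.
(define (toy-gexp->sexp gexp)
  (apply (toy-gexp-proc gexp)
         (map toy-lower (toy-gexp-references gexp))))

(define toy-builder
  (make-toy-gexp '(irssi)
                 (lambda (irssi)
                   `(symlink ,(string-append irssi "/bin/irssi")
                             (getenv "out")))))

(define toy-sexp
  (toy-gexp->sexp toy-builder))
```

The real implementation does far more (outputs, modules, native references), but the core shape is the same: a list of references and a procedure that receives them once they have been resolved.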

Here's an example g-expression that is essentially equivalent to our sexp-builder:

(use-modules (guix gexp))

(define gexp-builder
  #~(symlink #$(file-append irssi "/bin/irssi")
             #$output))

gexp-builder is far more concise than sexp-builder; let's examine the syntax and the <gexp> object we've created. To make a g-expression, we use the #~ syntax, equivalent to the gexp macro, rather than the quasiquote backtick used to create s-expressions.

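When we want to embed values from outside as references, we use #$, or ungexp, which is, in appearance if not function, equivalent to unquote (,). ungexp can accept any of four reference types: s-expressions (strings, lists and so on); other g-expressions; objects that can be built into a store item, written either as an expression or as a symbol that may carry an output name, such as glib:bin; and the symbol output, optionally followed by a colon and an output name.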

All these reference types will be represented by <gexp-input> records in the references field, except for the last kind, which will become <gexp-output> records. To give an example of each type of reference (with the return value output formatted for easier reading):

(use-modules (gnu packages glib))

#~(list #$"foobar"                         ;s-expression
        #$#~(string-append "foo" "bar")    ;g-expression
        #$(file-append irssi "/bin/irssi") ;buildable object (expression)
        #$glib:bin                         ;buildable object (symbol)
        #$output:out)                      ;output
⇒ #<gexp (list #<gexp-input "foobar":out>
               #<gexp-input #<gexp (string-append "foo" "bar") …>:out>
               #<gexp-input #<file-append #<package irssi@1.4.3 …> "/bin/irssi">:out>
               #<gexp-input #<package glib@2.70.2 …>:bin>
               #<gexp-output out>) …>

Note the use of file-append in both the previous example and gexp-builder; this procedure produces a <file-append> object that builds its first argument and is embedded as the concatenation of the first argument's output path and the second argument, which should be a string. For instance, (file-append irssi "/bin/irssi") builds irssi and expands to /gnu/store/…-irssi/bin/irssi, rather than the /gnu/store/…-irssi that the package alone would be embedded as.
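For comparison, here is a sketch of the same builder written with string-append inside the g-expression instead of file-append. Both forms should produce the same symlink; as far as I know the difference is only in when the concatenation happens. The names with-file-append and with-string-append are made up for this example.

```scheme
(use-modules (guix gexp)
             (gnu packages irc))

;; With FILE-APPEND, the concatenation happens when the g-expression is
;; lowered: the builder script contains the full file name as a string.
(define with-file-append
  #~(symlink #$(file-append irssi "/bin/irssi")
             #$output))

;; With STRING-APPEND inside the g-expression, only the store path of irssi
;; is embedded; the builder concatenates the two strings when it runs.
(define with-string-append
  #~(symlink (string-append #$irssi "/bin/irssi")
             #$output))
```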

So, now that we have a g-expression, how do we turn it into a derivation? This process is known as lowering; it entails the use of the aptly-named lower-gexp monadic procedure to combine proc and references and produce a <lowered-gexp> record, which acts as a sort of intermediate representation between g-expressions and derivations. We can piece apart this lowered form to get a sense of what the final derivation's builder script would look like:

(define lowered-gexp-builder
  (with-store store
    (run-with-store store
      (lower-gexp gexp-builder))))

(lowered-gexp-sexp lowered-gexp-builder)
⇒ (symlink
   "/gnu/store/…-irssi-1.4.3/bin/irssi"
   ((@ (guile) getenv) "out"))

And there you have it: an s-expression compiled from a g-expression, ready to be written into a builder script file in the store. So, how exactly do you turn this into said derivation?

Well, it turns out that there isn't an interface for turning lowered g-expressions into derivations. There is only one for turning regular g-expressions into derivations; it first calls lower-gexp and then performs the aforementioned conversion internally, rather than outsourcing it to some other procedure. So that's what we'll use.

Unsurprisingly, that procedure is called gexp->derivation, and unlike its s-expression equivalent, it's monadic. (build-expression->derivation and other deprecated procedures have been in Guix since before the monads system existed.)

(with-store store
  (run-with-store store
    (gexp->derivation "irssi-symlink" gexp-builder)))
⇒ #<derivation /gnu/store/…-irssi-symlink.drv => /gnu/store/…-irssi-symlink …>

Finally, we have a g-expression-based equivalent to the derivation we earlier created with build-expression->derivation! Here's the code we used for the s-expression version in full:

(define sexp-builder
  `(let* ((out (assoc-ref %outputs "out"))
          (irssi (assoc-ref %build-inputs "irssi"))
          (bin/irssi (string-append irssi "/bin/irssi")))
     (symlink bin/irssi out)))

(with-store store
  (let ((guile-3.0-drv (package-derivation store guile-3.0))
        (irssi-drv (package-derivation store irssi)))
    (build-expression->derivation store "irssi-symlink" sexp-builder
      #:guile-for-build guile-3.0-drv
      #:inputs `(("irssi" ,irssi-drv)))))

And here's the g-expression equivalent:

(define gexp-builder
  #~(symlink #$(file-append irssi "/bin/irssi")
             #$output))

(with-store store
  (run-with-store store
    (gexp->derivation "irssi-symlink" gexp-builder)))

That's a lot of complexity abstracted away! For more complex packages and services, especially, g-expressions are a lifesaver; you can refer to the output paths of inputs just as easily as you would a string constant. You do, however, have to watch out for situations where ungexp-native, written as #+, would be preferable over regular ungexp, and that's something we'll discuss later.

A brief digression before we continue: if you'd like to look inside a <gexp> record, but you'd rather not build anything, you can use the gexp->approximate-sexp procedure, which replaces all references with dummy values:

(gexp->approximate-sexp gexp-builder)
⇒ (symlink (*approximate*) (*approximate*))

The Lowerable-Object Hardware Shop

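We've seen two examples already of records we can turn into derivations, which are generally referred to as lowerable objects or file-like objects: <package>, a Guix package, and <file-append>, which wraps another lowerable object and appends a string to the embedded output path.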

There are many more available to us. Recall from the previous post, The Store Monad, that Guix provides the two monadic procedures text-file and interned-file, which can be used, respectively, to put arbitrary text or files from the filesystem in the store, returning the path to the created item.

This doesn't work so well with g-expressions, though; you'd have to wrap each ungexped use of either of them with (with-store store (run-with-store store …)), which would be quite tedious. Thankfully, (guix gexp) provides the plain-file and local-file procedures, which return equivalent lowerable objects. This code example builds a directory containing symlinks to files greeting the world:

(use-modules (guix monads)
             (ice-9 ftw)
             (ice-9 textual-ports))

(define (build-derivation monadic-drv)
  (with-store store
    (run-with-store store
      (mlet* %store-monad ((drv monadic-drv))
        (mbegin %store-monad
          ;; BUILT-DERIVATIONS is the monadic version of BUILD-DERIVATIONS.
          (built-derivations (list drv))
          (return (derivation-output-path
                   (assoc-ref (derivation-outputs drv) "out"))))))))
                   
(define world-greeting-output
  (build-derivation
   (gexp->derivation "world-greeting"
     #~(begin
         (mkdir #$output)
         (symlink #$(plain-file "hi-world"
                      "Hi, world!")
                  (string-append #$output "/hi"))
         (symlink #$(plain-file "hello-world"
                      "Hello, world!")
                  (string-append #$output "/hello"))
         (symlink #$(plain-file "greetings-world"
                      "Greetings, world!")
                  (string-append #$output "/greetings"))))))

;; We turn the list into multiple values using (APPLY VALUES …).
(apply values
       (map (lambda (file-path)
              (let* ((path (string-append world-greeting-output "/" file-path))
                     (contents (call-with-input-file path get-string-all)))
                (list path contents)))
            ;; SCANDIR from (ICE-9 FTW) returns the list of all files in a
            ;; directory (including ``.'' and ``..'', so we remove them with the
            ;; second argument, SELECT?, which specifies a predicate).
            (scandir world-greeting-output
                     (lambda (path)
                       (not (or (string=? path ".")
                                (string=? path "..")))))))
⇒ ("/gnu/store/…-world-greeting/greetings" "Greetings, world!")
⇒ ("/gnu/store/…-world-greeting/hello" "Hello, world!")
⇒ ("/gnu/store/…-world-greeting/hi" "Hi, world!")

Note that we define a procedure for building the output; we will need to build more derivations in a very similar fashion later, so it helps to have this to reuse instead of copying the code in world-greeting-output.

There are many other useful lowerable objects available as part of the gexp library. These include computed-file, which accepts a gexp that builds the output file, program-file, which creates an executable Scheme script in the store using a g-expression, and mixed-text-file, which allows you to, well, mix text and lowerable objects; it creates a file from the concatenation of a sequence of strings and file-likes. The G-Expressions manual page has more details.
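Here is a brief sketch of those three constructors in use. The variable names and file contents are invented for illustration; coreutils and grep are assumed to come from (gnu packages base).

```scheme
(use-modules (guix gexp)
             (gnu packages base))

;; COMPUTED-FILE: the g-expression is run as a builder and must create
;; the output itself.
(define greeting
  (computed-file "greeting"
    #~(call-with-output-file #$output
        (lambda (port)
          (display "Hello, world!" port)))))

;; PROGRAM-FILE: an executable Guile script whose body is the g-expression.
(define list-files
  (program-file "list-files"
    #~(execl #$(file-append coreutils "/bin/ls") "ls")))

;; MIXED-TEXT-FILE: strings and file-like objects are concatenated; each
;; file-like object is replaced by its store path.
(define profile-snippet
  (mixed-text-file "profile"
    "export PATH=" coreutils "/bin:" grep "/bin"))
```

All three return lowerable objects, so they can be ungexped into another g-expression in the same way as plain-file.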

So, you may be wondering, at this point: there are so many lowerable objects included with the g-expression library, surely there must be a way to define more? Naturally, there is; this is Scheme, after all! We simply need to acquaint ourselves with the define-gexp-compiler macro.

The most basic usage of define-gexp-compiler essentially creates a procedure that takes as arguments a record to lower, the host system, and the target system, and returns a derivation or store item as a monadic value in %store-monad.

Let's try implementing a lowerable object representing a file that greets the world. First, we'll define the record type:

(use-modules (srfi srfi-9))

(define-record-type <greeting-file>
  (greeting-file greeting)
  greeting?
  (greeting greeting-file-greeting))

Now we use define-gexp-compiler like so. Note how we can use lower-object to compile down any sort of lowerable object into the equivalent store item or derivation; essentially, lower-object is just the procedure for applying the right gexp-compiler to an object:

(use-modules (ice-9 i18n))

(define-gexp-compiler (greeting-file-compiler
                       (greeting-file <greeting-file>)
                       system target)
  (lower-object
   (let ((greeting (greeting-file-greeting greeting-file)))
     (plain-file (string-append greeting "-greeting")
       (string-append (string-locale-titlecase greeting) ", world!")))))

Let's try it out now. Here's how we could rewrite our greetings directory example from before using <greeting-file>:

(define world-greeting-2-output
  (build-derivation
   (gexp->derivation "world-greeting-2"
     #~(begin
         (mkdir #$output)
         (symlink #$(greeting-file "hi")
                  (string-append #$output "/hi"))
         (symlink #$(greeting-file "hello")
                  (string-append #$output "/hello"))
         (symlink #$(greeting-file "greetings")
                  (string-append #$output "/greetings"))))))

(apply values
       (map (lambda (file-path)
              (let* ((path (string-append world-greeting-2-output
                                          "/" file-path))
                     (contents (call-with-input-file path get-string-all)))
                (list path contents)))
            (scandir world-greeting-2-output
                     (lambda (path)
                       (not (or (string=? path ".")
                                (string=? path "..")))))))
⇒ ("/gnu/store/…-world-greeting-2/greetings" "Greetings, world!")
⇒ ("/gnu/store/…-world-greeting-2/hello" "Hello, world!")
⇒ ("/gnu/store/…-world-greeting-2/hi" "Hi, world!")

Now, this is probably not worth a whole new gexp-compiler. How about something a bit more complex? Sharp-eyed readers who are trying all this in the REPL may have noticed the following output when they used define-gexp-compiler (formatted for ease of reading):

⇒ #<<gexp-compiler>
    type: #<record-type <greeting-file>>
    lower: #<procedure  (greeting-file system target)>
    expand: #<procedure default-expander (thing obj output)>>

Now, the purpose of type and lower is self-explanatory, but what's this expand procedure here? Well, if you recall file-append, you may realise that the text produced by a gexp-compiler for embedding into a g-expression doesn't necessarily have to be the exact output path of the produced derivation.

There turns out to be another way to write a define-gexp-compiler form that allows you to specify both the lowering procedure, which produces the derivation or store item, and the expanding procedure, which produces the text.

Let's try making another new lowerable object; this one will let us build a Guile package and expand to the path to its module directory. Here's our record:

(define-record-type <module-directory>
  (module-directory package)
  module-directory?
  (package module-directory-package))

Here's how we define both a compiler and expander for our new record:

(use-modules (gnu packages guile)
             (guix utils))

(define lookup-expander (@@ (guix gexp) lookup-expander))

(define-gexp-compiler module-directory-compiler <module-directory>
  compiler => (lambda (obj system target)
                (let ((package (module-directory-package obj)))
                  (lower-object package system #:target target)))
  expander => (lambda (obj drv output)
                (let* ((package (module-directory-package obj))
                       (expander (or (lookup-expander package)
                                     (lookup-expander drv)))
                       (out (expander package drv output))
                       (guile (or (lookup-package-input package "guile")
                                  guile-3.0))
                       (version (version-major+minor
                                 (package-version guile))))
                  (string-append out "/share/guile/site/" version))))

Let's try this out now:

(use-modules (gnu packages guile-xyz))

(define module-directory-output/guile-webutils
  (build-derivation
   (gexp->derivation "module-directory-output"
     #~(symlink #$(module-directory guile-webutils) #$output))))

(readlink module-directory-output/guile-webutils)
⇒ "/gnu/store/…-guile-webutils-0.1-1.d309d65/share/guile/site/3.0"

(scandir module-directory-output/guile-webutils)
⇒ ("." ".." "webutils")

(define module-directory-output/guile2.2-webutils
  (build-derivation
   (gexp->derivation "module-directory-output"
     #~(symlink #$(module-directory guile2.2-webutils) #$output))))

(readlink module-directory-output/guile2.2-webutils)
⇒ "/gnu/store/…-guile-webutils-0.1-1.d309d65/share/guile/site/2.2"

(scandir module-directory-output/guile2.2-webutils)
⇒ ("." ".." "webutils")

Who knows why you'd want to do this, but it certainly works! We've looked at why we need g-expressions, how they work, and how to extend them, and we've now only got two more advanced features to cover: cross-build support, and modules.

Importing External Modules

Let's try using one of the helpful procedures from the (guix build utils) module in a g-expression.

(define simple-directory-output
  (build-derivation
   (gexp->derivation "simple-directory"
     #~(begin
         (use-modules (guix build utils))
         (mkdir-p (string-append #$output "/a/rather/simple/directory"))))))

Looks fine, right? We've even got a use-modules in th--

ERROR:
  1. &store-protocol-error:
      message: "build of `/gnu/store/…-simple-directory.drv' failed"
      status: 100

OUTRAGEOUS. Fortunately, there's an explanation to be found in the Guix build log directory, /var/log/guix/drvs; locate the file using the first two characters of the store hash as the subdirectory, and the rest as the file name, and remember to use zcat or zless, as the logs are gzipped:

Backtrace:
           9 (primitive-load "/gnu/store/…")
In ice-9/eval.scm:
   721:20  8 (primitive-eval (begin (use-modules (guix build #)) (?)))
In ice-9/psyntax.scm:
  1230:36  7 (expand-top-sequence ((begin (use-modules (guix ?)) #)) ?)
  1090:25  6 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)
  1222:19  5 (parse _ (("placeholder" placeholder)) ((top) #(# # ?)) ?)
   259:10  4 (parse _ (("placeholder" placeholder)) (()) _ c&e (eval) ?)
In ice-9/boot-9.scm:
  3927:20  3 (process-use-modules _)
   222:17  2 (map1 (((guix build utils))))
  3928:31  1 (_ ((guix build utils)))
   3329:6  0 (resolve-interface (guix build utils) #:select _ #:hide ?)

ice-9/boot-9.scm:3329:6: In procedure resolve-interface:
no code for module (guix build utils)

It turns out use-modules can't actually find (guix build utils) at all. There's no typo; it's just that to ensure the build is isolated, Guix builds module-import and module-import-compiled directories, and sets the Guile module path within the build environment to contain said directories, along with those containing the Guile standard library modules.

So, what to do? Turns out one of the fields in <gexp> is modules, which, funnily enough, contains the names of the modules which will be used to build the aforementioned directories. To add to this field, we use the with-imported-modules macro. (gexp->derivation does provide a modules parameter, but with-imported-modules lets you add the required modules directly to the g-expression value, rather than later on.)

(define simple-directory-output
  (build-derivation
   (gexp->derivation "simple-directory"
     (with-imported-modules '((guix build utils))
       #~(begin
           (use-modules (guix build utils))
           (mkdir-p (string-append #$output "/a/rather/simple/directory")))))))
           
simple-directory-output
⇒ "/gnu/store/…-simple-directory"

It works, yay. It's worth noting that while passing just the list of modules to with-imported-modules works in this case, this is only because (guix build utils) has no dependencies on other Guix modules. Were we to try adding, say, (guix build emacs-build-system), we'd need to use the source-module-closure procedure to add its dependencies to the list:

(use-modules (guix modules))

(source-module-closure '((guix build emacs-build-system)))
⇒ ((guix build emacs-build-system)
   (guix build gnu-build-system)
   (guix build utils)
   (guix build gremlin)
   (guix elf)
   (guix build emacs-utils))
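The closure can be passed straight to with-imported-modules. Below is a minimal sketch; the name emacs-aware-builder is hypothetical and the builder body is a placeholder that only creates the output directory.

```scheme
(use-modules (guix gexp)
             (guix modules))

(define emacs-aware-builder
  ;; Import (guix build emacs-build-system) together with every Guix module
  ;; it depends on, as computed by SOURCE-MODULE-CLOSURE.
  (with-imported-modules (source-module-closure
                          '((guix build emacs-build-system)))
    #~(begin
        (use-modules (guix build emacs-build-system))
        ;; Placeholder body: a real builder would do something useful here.
        (mkdir #$output))))
```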

Here's another scenario: what if we want to use a module not from Guix or Guile but a third-party library? In this example, we'll use guile-json, a library for converting between S-expressions and JavaScript Object Notation.

We can't just with-imported-modules its modules, since it's not part of Guix, so <gexp> provides another field for this purpose: extensions. Each of these extensions is a lowerable object that produces a Guile package directory; so usually a package. Let's try it out using the guile-json-4 package to produce a JSON file from a Scheme value within a g-expression.

(define helpful-guide-output
  (build-derivation
   (gexp->derivation "json-file"
     (with-extensions (list guile-json-4)
       #~(begin
           (use-modules (json))
           (mkdir #$output)
           (call-with-output-file (string-append #$output "/helpful-guide.json")
             (lambda (port)
               (scm->json '((truth . "Guix is the best!")
                            (lies . "Guix isn't the best!"))
                          port))))))))

(call-with-input-file
    (string-append helpful-guide-output "/helpful-guide.json")
  get-string-all)
⇒ "{\"truth\":\"Guix is the best!\",\"lies\":\"Guix isn't the best!\"}"

Amen to that, helpful-guide.json. Before we continue on to cross-compilation, there's one last feature of with-imported-modules you should note. We can add modules to a g-expression by name, but we can also create entirely new ones using lowerable objects, such as in this pattern, which is used in several places in the Guix source code to make an appropriately-configured (guix config) module available:

(with-imported-modules `(((guix config) => ,(make-config.scm))
                         ;; … any other modules to import …
                         )
  ;; … the g-expression that uses (guix config) …
  )

In case you're wondering, make-config.scm is found in (guix self) and returns a lowerable object that compiles to a version of the (guix config) module, which contains constants usually substituted into the source code at compile time.

Native ungexp

There is another piece of syntax we can use with g-expressions, and it's called ungexp-native. This helps us distinguish between native inputs and regular host-built inputs in cross-compilation situations. We'll cover cross-compilation in detail at a later date, but the gist of it is that it allows you to compile a derivation for one architecture X, the target, using a machine of architecture Y, the host, and Guix has excellent support for it.

If we cross-compile a g-expression G that non-natively ungexps L1, a lowerable object, from architecture Y to architecture X, both G and L1 will be compiled for architecture X. However, if G natively ungexps L1, G will be compiled for X and L1 for Y.

Essentially, we use ungexp-native in situations where there would be no difference between compiling on different architectures (for instance, if L1 were a plain-file), or where using L1 built for X would actually break G (for instance, if L1 corresponds to a compiled executable that needs to be run during the build; the executable would fail to run on Y if it were built for X).

The ungexp-native macro naturally has a corresponding reader syntax, #+, and there's also ungexp-native-splicing, which is written as #+@. These two pieces of syntax are used in the same way as their regular counterparts.
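As a sketch of how the two kinds of reference combine, the following builder, modelled on the example in the G-Expressions section of the Guix manual, runs ln from a natively built coreutils while linking to a cross-built emacs. The names vi-builder and vi-derivation are made up here, and the target triplet is only an example.

```scheme
(use-modules (guix gexp)
             (gnu packages base)
             (gnu packages emacs))

(define vi-builder
  #~(begin
      (mkdir #$output)
      (mkdir (string-append #$output "/bin"))
      ;; #+coreutils: ln is executed during the build, so it must be able to
      ;; run on the building machine.
      (system* (string-append #+coreutils "/bin/ln")
               "-s"
               ;; #$emacs: only referred to, never run, so it is built for
               ;; the target.
               (string-append #$emacs "/bin/emacs")
               (string-append #$output "/bin/vi"))))

;; A monadic value; nothing is computed until it is run in the store monad.
(define vi-derivation
  (gexp->derivation "vi" vi-builder
                    #:target "aarch64-linux-gnu"))
```

When no target is given, #+ and #$ behave identically, which is why the distinction is easy to overlook until a cross-build fails.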

Conclusion

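What have we learned in this post? To summarise: g-expressions are an abstraction over s-expressions that Guix uses to stage code for the build environment; ungexp (#$) embeds s-expressions, other g-expressions, lowerable objects and outputs, much as unquote does within quasiquote; lowerable objects such as packages, plain-file and file-append are compiled to store items, and new ones can be defined with define-gexp-compiler; with-imported-modules and with-extensions make Guix modules and third-party Guile libraries available to the builder; and ungexp-native (#+) embeds an object built for the host rather than the target when cross-compiling.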

Mastering g-expressions is essential to understanding Guix's inner workings, so the aim of this blog post is to be as thorough as possible. However, if you still find yourself with questions, please don't hesitate to stop by the IRC channel #guix:libera.chat or the mailing list help-guix@gnu.org; we'll be glad to assist you!

Also note that due to the centrality of g-expressions to Guix, there exists a plethora of alternative resources on this topic; the G-Expressions page of the Guix manual, mentioned above, is a good place to start.

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64 and POWER9 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

Unless otherwise stated, blog posts on this site are copyrighted by their respective authors and published under the terms of the CC-BY-SA 4.0 license and those of the GNU Free Documentation License (version 1.3 or later, with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts).