Compiling queries without EVAL

Gary King recently blogged a question about whether a use of EVAL he had recently made in some Common Lisp code was legit. This caught my eye because I suspect this question came up while Gary was working on a project that I once worked on. Anyway, since he asked, the short answer is, “No. The old rule that if you’re using EVAL you’re doing something wrong is still true.” A slightly longer answer follows.

The problem Gary set out to solve is that he’s got a a bunch of chunks of text coming out of some sort of database and he wants to query them with logical expressions like this one:

 (and "space" "mission")
 (and (or "moon" "lunar") "landing"))

These expressions are obviously not Lisp but rather a query language where the logical operators AND and OR (and presumably other ones like NOT) have their usual logical meaning and literal strings should evaluate to true if the text we’re matching against testing contains the string and false if not. Thus the expression above would match text containing the words “space” and “mission” or either of “moon” or “lunar” along with “landing”. Gary considered, and rejected, writing “a simple recursive evaluator to deal with ands and ors”. Instead he wrote some code to munge around expressions like the one above into a form that he could EVAL. In other words, instead of writing his own interpreter he just munged his code into a form that Lisp could interpret for him. Which is okay. Except for the rule that if you’re using EVAL you’re doing something wrong.

What he should have done is written a compiler. Luckily Common Lisp comes with a Lisp compiler built in so all we have to do to write our own compiler is translate our source language into Lisp. Here’s how I’d do it.

First define a function that takes a query expression and returns two values, an equivalent expression with all string literals replaced with GENSYM’d variable names, and an alist of the strings and the corresponding GENSYM’d names. In this phase we also detect if the same literal string appears more than once so we only generate a single binding for each unique string.

(defun translate (tree &optional bindings)
  (typecase tree
     (multiple-value-bind (newcar newbindings)
         (translate (car tree) bindings)
       (multiple-value-bind (newcdr newbindings2)
           (translate (cdr tree) newbindings)
         (values (cons newcar newcdr) newbindings2))))
    (symbol (values tree bindings))
     (let ((binding (assoc tree bindings :test #'equal)))
         (binding (values (cdr binding) bindings))
          (let ((sym (gensym)))
            (values sym (acons tree sym bindings)))))))))

With this function we can translate an expression like:

(or (and "a" "b") (and (or "c" "d") (or "a" "b")))

Into this expression:

(OR (AND #:G1 #:G2) (AND (OR #:G3 #:G4) (OR #:G1 #:G2)))

and this list of bindings:

(("d" . #:G4) ("c" . #:G3) ("b" . #:G2) ("a" . #:G1))

Now we just need to use those two values to build up a bit of Lisp. Gary’s solution was to build up an expression that he could EVAL. But it’s better to generate a LAMBDA expression because then we can compile it. Here’s the function:1

(defun make-lambda-expression (expr)
  (multiple-value-bind (tree bindings) (translate expr)
    (let ((string (gensym)))
      `(lambda (,string) 
         (let (,@(loop for (word . sym) in bindings collect
                      `(,sym (find-word-in-string ,word ,string))))

If we pass the same expression to this function we get the following lambda expression back.

(LAMBDA (#:G9)
  (LET ((#:G8 (FIND-WORD-IN-STRING "d" #:G9))
        (#:G7 (FIND-WORD-IN-STRING "c" #:G9))
        (#:G6 (FIND-WORD-IN-STRING "b" #:G9))
        (#:G5 (FIND-WORD-IN-STRING "a" #:G9)))
    (OR (AND #:G5 #:G6) (AND (OR #:G7 #:G8) (OR #:G5 #:G6)))))

We could use FUNCALL to evaluate this expression which would at least get us out of using EVAL.2 But the real advantage of this approach is that we can compile this expression. Since Gary said he wanted to find all the strings in his database that match a given expression, he’s probably going to be evaluating his query once per string in his database. In that case it’s probably worth it to take a small up front hit in order to speed up the execution of the query since we’re going to be executing it many times. Luckily compiling a lambda expression is about as trivial as EVALing any other expression:

(defun compile-expression (expr)
  (compile nil (make-lambda-expression expr)))

This function, fed a query expression, returns a compiled function that takes a single string argument and returns true if the query expression matches and false if not. On Lisps with native compilers the returned function will be compiled down to machine code just the same as if we had written it by hand in our source code. We can FUNCALL this function such as in this code that collects all the strings returned by a cursor function that match the query expression:

(defun query-strings (query database)
  (loop with predicate = (compile-expression query)
     with cursor = (string-cursor database)
     for string = (next-string cursor)
     while string
     when (funcall predicate string) collect string))

We can also use the query function with all of Lisp’s higher-order-functions, such as REMOVE-IF-NOT:

(remove-if-not (compile-expression query) *all-strings*)

Now, one could argue that there’s a whole heck of a lot of difference between using EVAL and wrapping something in a lambda expression and compiling it — in both cases you can generate, and then evaluate, arbitrary Lisp code. But there is an important difference, namely that EVAL just evaluates — it takes some data and interprets it as Lisp and gives you an answer straight away whereas compiling a lambda expression gives you a function, something that can interact with the rest of your code, as an argument to higher-order functions, and so on. Or, if you don’t buy that, at least you avoid breaking the no EVAL rule.

Update: It hit me as I was brushing my teeth that while nicely avoiding EVAL, my first solution had a big problem — because all the FIND-WORD-IN-STRING calls are done in the LET before the boolean expression is evaluated we lose all the advantage of AND and OR’s short-circuiting behavior. That is, the generated code searches for all the strings and then combines the results of all those searches. Much better (and simpler) would be to implement TRANSLATE and MAKE-LAMBDA-EXPRESSION this way:

(defun translate (tree &optional (stringvar (gensym)))
  (typecase tree
     (cons (translate (car tree) stringvar) (translate (cdr tree) stringvar)))
    (symbol tree)
    (string `(find-word-in-string ,tree ,stringvar))))

(defun make-lambda-expression (expr)
  (let ((string (gensym)))
    `(lambda (,string) ,(translate expr string))))

This has the slight disadvantage that if the same literal string appears in the pattern more than once, we will potentially search for it more than once. But that’s probably much less of an issue than the problem this code fixes. Obviously if we cared to, we could generate code that caches searches once they are done to get the best of both worlds. But I’ll leave that as an exercise for the reader.

1. This function generates calls to a helper function FIND-WORD-IN-STRING. Gary’s version was somewhat more complex but for the purposes of illustration this definition should do:

(defun find-word-in-string (word string)
  (search word string :test #'char-equal))

2. Gary’s solution was haired up a bit because he didn’t simply generate a LET form to EVAL but instead generated just the boolean expression and then wrapped the call to EVAL in a PROGV to establish dynamic bindings at run-time. Which, while sort of clever, was another sign from the gods that he had gone down a wrong path somewhere.


One Response to “Compiling queries without EVAL”

  1. Klaus Schilling Says:

    I use as many progv and eval as I want to, and none of the fexpr haters will ever be able to dissuade me from doing so.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: