3.4 Duplication of Variables

If a free variable enters the right side of a sentence more times than it enters the left side, its value must be duplicated by the interpreter when it applies this sentence. This should be taken into account in programming, because unnecessary duplication of variables can lead to substantial losses of efficiency. In more sophisticated implementations of Refal than a simple interpreter, unnecessary duplication of object expressions may be avoided by using pointers to some parts of the original expressions. But when we program for an interpreter and think about Refal programs in terms of straightforward step-by-step execution, we should take the responsibility for avoiding unnecessary duplication.

These considerations have a direct bearing on the way we define branchings in Refal. It would not be difficult to define the semantics of the conditional if-then-else expression in Refal, and then use it as it is used, say, in Pascal. We could decide to use T and F as truth values, and define the function If in the format:

  <If e.Condition
   Then (e.If-True)
   Else (e.If-False)>

When If is used, an active expression computing the truth value of the condition must be substituted for e.Condition . Then If could be simply defined as follows:

  If {
     T Then (e.1) Else (e.2) = e.1;
     F Then (e.1) Else (e.2) = e.2;  }

Logical connectives could be used in the usual manner. Conjunction, e.g., would be defined by the function And :

  And {
     T s.X = s.X;
     F s.X = F; }

However, this approach leads to unnecessary duplications. We shall demonstrate this using the example of a procedure which orders and concatenates strings.

Suppose we have a relation of precedence (order) between strings of symbols defined by the predicate

  <Pre (e.1)(e.2)>

which becomes T if the string e.1 precedes the string e.2 , and F otherwise. We can define the function Order which concatenates two strings in the correct order as follows:

  Order { 
     (e.1)e.2  = <If (<Pre (e.1)(e.2)>)
                Then (e.1 e.2)
                Else (e.2 e.1)>; }

Yet with an interpretive implementation of Refal, like ours, this program is not the best solution. Each of the variables e.1 and e.2 appears once on the left side and three times on the right side. This means that two copies of each string will be made in execution; if the original expression is reserved for the output, then one of the copies will be used in the evaluation of the predicate Pre , and the other will be always wasted because the machine uses only one of the two expressions for the two truth-values.

Let us first eliminate the glaring inefficiency of making two copies (in the then- and else-clauses) when we know that only one of them will actually be used. To do this, we divorce the clauses by putting them in different sentences. Order calls the predicate Pre and passes the result, together with the original strings, to the auxiliary function Order1 , which has two sentences for the two clauses and makes the choice between them (a more elegant way of defining Order without an auxiliary function will be shown in the next chapter, when we introduce an extension of the basic Refal):

  Order { (e.1)e.2 = <Order1 <Pre (e.1)(e.2)>(e.1)e.2> }
   
  Order1 {
     T (e.1)e.2 = e.1 e.2;
     F (e.1)e.2 = e.2 e.1; };

Instead of defining the branching in the algorithm through the function If , or any other special function, we use the type of branching which is built into our language: the choice of a sentence depending on the argument. The function Pre adds to the argument of Order1 an indicator, T or F , sending to Order1 a signal it needs to make the choice. This kind of branching in Refal is both more natural and more efficient. It also does not limit the number of alternatives to two, but works like the device known in other programming languages as a switch or a case statement.

Our definition of Order still requires the making of one copy of the variables e.1 and e.2 . Should we try to eliminate this copying?

This depends on the expected size of these expressions (and, of course, on how often this function is going to be evaluated). Suppose e.1 and e.2 are English words, and Pre is based on the lexicographic order. To define Pre , we must first define the relation of precedence in the alphabet Pre-alph for letters. In basic Refal this can be done as follows:

  Pre-alph {
  *1.This relation is reflexive
     s.1 s.1 = T;
  *2.If the letters are different, see whether the
  *  first is before the second in the alphabet
     s.1 s.2 = <Before s.1 s.2 In <Alphabet>>; }
   
  Before {
     s.1 s.2 In e.A s.1 e.B s.2 e.C = T;
     e.Z = F; }
   
  Alphabet { = 'abcdefghijklmnopqrstuvwxyz'; }

Now it is easy to define the lexicographic Pre :

  Pre {
     ()(e.2) = T;
     (e.1)() = F;
     (s.1 e.X)(s.1 e.Y) = <Pre (e.X)(e.Y)>;
     (s.1 e.X)(s.2 e.Y) = <Pre-alph s.1 s.2>; 
      }

< follow this link if you want to work with example >

Since the length of English words is small, their copying will not cause significant loss of efficiency. It is justified by the simple and natural definition of the predicate Pre . However, in some cases we may wish to avoid copying. For instance, if our strings are not strings of English letters, but strings of some expressions, which may be very big, it is desirable to redefine Pre so that no unnecessary copying is done.

The reason why we have to copy e.1 and e.2 in the definition of Order is that Pre destroys its arguments and replaces them by a single symbol: the truth-value. Thus the following general rule can be put forward:

To avoid unnecessary duplication of arguments, avoid unnecessary destruction of arguments.

We shall refer to such functions which do not destroy their arguments but simply add to them T or F as conservative predicates. If we use a conservative predicate Prec instead of Pre , the function Order must be redefined as

  Order { (e.1)(e.2) = <Order1 <Prec (e.1)(e.2)>; }

We assume here that the format of Prec is defined by this comment:

  * <Prec (e.1)(e.2)>  ==  T(e.1)(e.2)
  *                  ==  F(e.1)(e.2)

To define Prec we modify the definition of Pre on two counts. First, we generalize to dealing with terms instead of just symbols, and replace Pre-alph by Pre-term . Second, the new predicate must become conservative. We can achieve this without altering the format of the argument. The changes are straightforward for the sentences 1, 2, and 4. But in the case covered by sentence 3, we must keep the first term, which is common to both arguments, in order to add it to both of them after the precedence is established. The solution is:

  Prec {
     ()(e.2) = T()(e.2);
     (e.1)() = F(e.1)();
     (t.1 e.X)(t.1 e.Y) = <Add-c t.1 <Prec (e.X)(e.Y)>>;
     (t.1 e.X)(t.2 e.Y) = 
              <Pre-term t.1 t.2>(t.1 e.X)(t.2 e.Y); }
   
  Add-c { t.1 s.T(e.X)(e.Y) = s.T(t.1 e.X)(t.1 e.Y); }

Exercise 3.6 The solution above is not the best. Run

  <Prec ('but')('butter')>

with Pre-alph as Pre-term to see how it works. Assuming that the length of the common initial part of the two strings is c, and the average number of steps for the evaluation of Pre-term is t, compute the number of steps necessary for the evaluation of Prec . Redefine it so that it works faster. Suggestion: add one more box to the format of Prec to accumulate the common part. Compute the number of steps required with this definition.

As mentioned above, it is not always necessary to avoid duplication. Sometimes it is a part of the algorithm. If not, duplication often does not lead to a significant loss of efficiency. For instance, if the algorithm involves passing an expression with a Refal step for every symbol, an additional duplication of this expression will add little to the run time. Duplication of symbols or small expressions is never a problem. Duplication of big expressions on the top level of the algorithm also will not, typically, be a problem, because time will mostly go into inner loops. Finally, more sophisticated implementations of Refal, which are under construction now, use pointers to avoid unnecessary copying.

Yet when programming for a Refal interpreter one must think about free variables not as pointers to values stored somewhere, but rather as the values themselves. When you reshuffle and reproduce the elements of the left side on the right side of a sentence this process will be repeated at the execution time. Unwarranted copying of variables must definitely be avoided.

Exercise 3.7 Define the following conservative predicates:
(a) <String e.X> , true iff e.X is a string (no parentheses).
(b) <Pal e.X> , true iff e.X is a palindrome.