0. mini k shock-and-awe.

our problem is this:  given a string s, define the function 'trim'.  trim takes s and deletes leading, trailing, and redundant blanks.

in the course of solving this problem we will encounter several of the distinguishing characteristics of k.

we prefer this "mini shock-and-awe" introduction to the language over the standard method of enumerating features.  you can find that approach masterfully exemplified in don orth's kx user's guide.  we think our approach gives the student a quick insight into how k differs from other programming languages.


1. delete leading blanks.

let s be:

  s:"   this      has  leading,  trailing, and redundant    blanks    "

s is a list of 65 characters:

  # s
65

we can access those characters by indexing:

  s[5 6 9 18]
"is l"

we can also "take" and "drop" sub-lists from the "front" and "back" of s:

  10 # s
"   this   "
  10 _ s
"   has  leading,  trailing, and redundant    blanks    "
  -10 # s
"blanks    "
  -10 _ s
"   this      has  leading,  trailing, and redundant    "

we can find the first occurrence of a particular character in s:

  s ? "h"
4

and use take and drop to calculate the prefix and suffix around that character:

  4 # s
"   t"

  4 _ s
"his      has  leading,  trailing, and redundant    blanks    "

we can mark where in s a particular character occurs:

  s = "h"
0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

we are now in a position to solve the first part of our problem:

first, if we knew the position p of the first non-blank in s we could use "drop" to eliminate everything up to that point:

  p _ s
"this      has  leading,  trailing, and redundant    blanks    "

so how do we calculate p?  first, mark the positions of all the blanks in s:

  s = " "
1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1

now we see that a 1 in the result corresponds positionally to a blank in s, and a 0 in the result corresponds to a non-blank in s.

so, in order to find p we must find the position of the first non-blank in s:

  (s = " ") ? 0
3

and use that result to drop from s:

  ((s = " ") ? 0) _ s
"this      has  leading,  trailing, and redundant    blanks    "

now, to finalize our solution, we can wrap this code in curly braces to form a lambda -- k's term for "function":

  dlb:{[s]((s = " ") ? 0) _ s}
  dlb[s]
"this      has  leading,  trailing, and redundant    blanks    "

review:

- a string is a list of characters
- a boolean list is a list of 1's and 0's
- we can find its length (# s)
- we can index sub-strings (s[i])
- we can take and drop from the front and back of a list (n # s, n _ s)
- we can find the position of an element in a list (s ? c)
- we can mark the positions of an element in a list (s = c)

using these properties we can define an expression which deletes leading blanks from a string.

we can then define a lambda which "freezes" the expression and allows it to be applied to any string.


1a. an interlude on second-order functions.

there's always more than one way to solve a programming problem.

our solution uses two k concepts:  data, in the form of a list of characters (a string), and first-order functions, like 'take' and 'count', which take either one or two data-arguments and return a data-result.

k also has second-order functions.  a second-order function takes a first-order function as an argument and returns a first-order function as a result.  our second solution to the "delete leading blanks" problem will use two second-order functions.

but first a few examples to make the concept clear.

the 'plus' function, written x + y, takes two numeric arguments and returns a numeric result.  e.g. 5 = 3 + 2.  suppose we have a list of positive integers n of unspecified length and we want to sum them up.  for example,

  n:2 5 3 3 4 7 2

in most programming languages you would compose an expression which loops over the elements of n, summing them as it proceeds.  in pseudo-code:

  r is 0
  for i = 1 to count n let r be r + n[i]
  return r

now suppose that you also want to find the value of the maximum element of n.  then you would write a similar program:

  r is 0
  for i = 1 to count n let r be r max n[i]
  return r

and similarly for the functions 'minimum', 'times', 'minus', and so forth.

so you can see that there is a single pattern exemplified in all these cases:

  r is <inital-value>
  for i = 1 to count n let r be r <function> n[i]
  return r

in k, we have a single second-order function, called 'over' and written with '/' which abstracts the logic of this pattern and puts it at the disposal of the programmer:

  0 +/ n
26

that is,

  <initial-value> <function>/ argument

the code-fragment '+/' should be understood as a single functional element, flanked by a data-argument on the left and a first-order function on the right.  the result of +/ is a first-order function which takes n and applies to it the logic described above.

in order to spare the programmer the necessity of supplying an individual value, k allows you to write

  +/ n

in which case +/ will default the initial argument appropriately.

now one last wrinkle in this discussion and we can move on to solving our problem using second-order functions.

in the course of computing its result, the 'over' function is producing a sequence of intermediate results.  in our example, we start the computation with the initial value 0; then we add 2, the first element of n, to get the intermediate value 2; then we add 5, to get the next intermediate value 7; and so forth.  schematically:

    0
    2  =  2 + 0
    7  =  5 + 2
    10 =  7 + 3
    13 = 10 + 3
    17 = 13 + 4
    24 = 17 + 7
    26 = 24 + 2
    26

k contains a second-order function, called 'scan' and written '\', which returns the intermediate results:

  0 +\ n
0 2 7 10 13 17 24 26

as you can see, the result consists of 1+#n elements, since it retains the initial value as the first element.  as with / you can let that value default, in which case the result contains #n elements:

  +\ n
2 7 10 13 17 24 26

review:

- a second-order function takes a first order function and returns a first-order function.
- 'over' and 'scan' (/ and \) are second-order functions.


1b. delete leading blanks redux.

we will now proceed to solve the leading blanks problem using the second-order functions 'over' and 'scan'.

our new function 'dlb2' has the form:

  dlb2:{[s] ( ...?... ) _ s}

we want to compute the index of the first non-blank in s, and then use that index to drop from the front of the string.  so let's begin as we did before by using '=' to mark all the blanks in s:

  s = " "
1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1

we next observe that k provides a primitive function for "logical and":

  0 & 0
0
  0 & 1
0
  1 & 0
0
  1 & 1
1

that is, for boolean x and y, x & y is 1 if and only if x = y = 1.

now we can use 'scan' to progressively "and" the elements of s = " ":

  &\ s = " "
1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

observe that this has the effect of "turning off" all the 1's after the first 0 is encountered.  'scan' applies & repeatedly, leaving behind the intermediate results.  here is the pseudo-code for this scan:

  r is 1
  for i = 2 to count n append r[i-1] & n[i] to r
  return r

that is, we initialize the result to the first element of n, which is 1.  then we loop over n, starting with i = 2, and append to r the result of r[i-1] & n[i].

now we can use 'over' to sum the result, adding up the initial 1's in the list:

  +/ &\ s = " "
3

that is, we first compute the boolean list s = " ", then we scan it with &\, then we reduce it using +/.

we would say that this expression is "idiomatic", and should be read and pronounced this way:

  the plus-over and-scan of s = " "

this solves our problem:

  dlb2:{[s] (+/ &\ s = " ") _ s}

but wait!  is there yet another solution to this problem?  yes, there is*:

this time, let's mark the non-blanks:

  s:"   this      has  leading,  trailing, and redundant    blanks    "

  ~s = " "
0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 0

now scan the boolean result with 'max':

  |\ ~ s = " "
0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

this will produce 0's until scan encounters the first non-blank.

now find the indices of the 1's:

  & |\ ~ s = " "
3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64

and index out the substring:

  s[& |\ ~ s = " "]
"this      has  leading,  trailing, and redundant    blanks    "

and wrap the expression in a lambda:

  dlb3:{[s] s[& |\ ~ s = " "]}

review:

- 'x & y' is logical-and.
- '|/' is maximum-over


2. delete trailing blanks.

we could solve the second part part of our problem if only we could reverse our string s, then we apply 'dlb' (or equivalently 'dlb2') to it, then reverse the result:

  | s
"    sknalb    tnadnuder dna ,gniliart  ,gnidael  sah      siht
  dlb[| s]
"sknalb    tnadnuder dna ,gniliart  ,gnidael  sah      siht   "

  | dlb[| s]
"   this      has  leading,  trailing, and redundant    blanks"

now we can encapsulate the logic in the function 'dtb':

  dtb:{[s] | dlb[| s]}

we can feed it the result of applying 'dlb', first deleting leading, and then trailing blanks:

  dtb dlb s
"this      has  leading,  trailing, and redundant    blanks"

aside:

notice that k is fairly tolerant about the syntax of function application.  if f is a function taking an argument a, then f can be applied to a using square brackets:

  f[a]

or simply by juxtaposing function and argument:

  f a

this is true as well in the case of indexing a list v with one or more indices i:

  v[i]

  v i

:aside

review:

- we can reverse a list (| s)
- function application and list indexing can be written with square brackets or by juxtaposition.


3. delete redundant blanks.

in order to solve this part of our problem we'll look at a third second-order function, 'prior', or as it somtimes called 'pairwise'.

like 'over' and 'scan', 'prior' takes a first-order function and an optional initial value and returns a first-order function.  unlike those two functions, 'prior' is written with two symbols, for example:

  a:2 5 3 3 4 8
  >': a
1 0 0 1 1

here, 'prior' is taking the first-order function > and applying it to adjacent elements of a.

first, observe that the count of the result is one less than that of the argument.  and second, observe that the pairs seem to be "reversed" as they are fed to the function:

1 = 5 > 2
0 = 3 > 5
0 = 3 > 3
1 = 4 > 3
1 = 8 > 4

or:

  (5>2)(3>5)(3>3)(4>3)(8>4)
    1    0    0    1    1

we can assure that the count of the result is the same as that of the argument by supplying a value to be used as the first element of the result:

  0 >': a
0 1 0 0 1 1

suppose in this example we want to know, not just the positions of values in a which are greater than their left-adjacent siblings, but what those values are.  in k we have the function 'where', written '& x'.  applied to a boolean list, 'where' returns a list of the indices which correspond to the 1's in the boolean list:

  &0 1 0 0 1 1
1 4 5

aside:

do not confuse 'where' with 'logical and'.  although both functions are written using the symbol '&' the meaning of the symbol is determined by syntactic context.  'where' takes one argument:

  & x

and 'logical and' takes two:

  x & y

we say that k syntax is "ambivalent", or "context-dependent".  in fact, k uses 20 symbols to express 40 different functions.  each symbol has a "monadic" meaning (one argument) and a "dyadic meaning (two arguments.)  the 20 symbols are:

    ~ ! @ # $ % ^ & * - _ = + | : , < . > ?

:end of aside.

and now we can use indexing to pull out the values which are greater than their left-adjacent siblings:

  a[& 0 >': a]
5 4 8

now further suppose that we want to find the positions of the values which are *not* greater than their left-adjacent siblings.  k gives us the primitive function 'not', written '~':

  ~ 0 >': a
1 0 1 1 0 0

so that

  a[& ~ 0 >': a]
2 3 3

we will use 'prior', 'where', 'not', and indexing to solve the third part of our problem.  here is the sequence:

/ define s
  s:"   this      has  leading,  trailing, and redundant    blanks    "

/ mark the blanks in s
  s = " "
1 1 1 0 0 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1

/ mark positions where each 1 is preceded by a 1
  0 &': s = " "
0 1 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1

/ mark positions where each 1 is *not* preceded by a 1 (i.e. where a blank is *not* preceded by a blank.)
  ~ 0 &': s = " "
1 0 0 1 1 1 1 1 0 0 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0

/ which positions are those?
  & ~ 0 &': s = " "
0 3 4 5 6 7 13 14 15 16 18 19 20 21 22 23 24 25 26 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 55 56 57 58 59 60 61

/ index out the substring at those positions
  s[& ~ 0 &': s = " "]
" this has leading, trailing, and redundant blanks "

/ wrap the expression in a lamba
  deb:{[s] s[& ~ 0 &': s = " "]}

review:

- the second-order 'prior' function (':) lets us apply a first-order function between adjacent elements in a list.
- the 'not' function (~ x) takes a boolean value and returns its negation.
- the 'where' function (& x) takes a boolean list and returns the positions of the 1's.


4. trim.

now let's finish up by defining trim:

  trim:{[s] deb dtb dlb s}

  trim s
"this has leading, trailing, and redundant blanks"


5. summary.

functions:

  dlb:{[s]((s = " ") ? 0) _ s}
  dlb2:{[s] (+/ &\ s = " ") _ s}
  dlb3:{[s] s[& |\ ~ s = " "]}
  dtb:{[s] | dlb[| s]}
  deb:{[s] s[& ~ 0 &': s = " "]}
  trim:{[s] deb dtb dlb s}

reviews:

- a string is a list of characters
- a boolean list is a list of 1's and 0's
- we can find its length (# s)
- we can index sub-strings (s[i])
- we can take and drop from the front and back of a list (n # s, n _ s)
- we can find the position of an element in a list (s ? c)
- we can mark the positions of an element in a list (s = c)
- a second-order function takes a first order function and returns a first-order function.
- 'over' and 'scan' (/ and \) are second-order functions.
- we can reverse a list (| s)
- function application and list indexing can be written with square brackets or by juxtaposition.
- 'x & y' is logical-and.
- '|/' is maximum-over
- the second-order 'prior' function (':) lets us apply a first-order function between adjacent elements in a list.
- the 'not' function (~ x) takes a boolean value and returns its negation.
- the 'where' function (& x) takes a boolean list and returns the positions of the 1's.