Function Arrays

(first draft, incomplete)

Introduction

K has arrays with shape and functions with valence. Functions are atoms, having shape !0. Lists can be indexed -- a[i;j;k] -- and functions can be applied -- f[i;j;k], but lists can't be transformed into functions or vice-versa, except via the intermediary representation of strings and 'execute'.

This paper describes how K might be extended to include a new datatype, that of function arrays. A function array is a function with non-null shape. Lists can be transformed into function-arrays and vice-versa.

Atoms and lists

K has atoms and lists. K atoms are integer, float, character, symbol, dictionary, nil, or lambda. lambdas are defined functions:

  {[a;b;c]a+b-c}

The empty list is written

()

Vectors are homogenous lists of atoms of type integer, float, character, and symbol. Each of these types supports its own vector notation:

  1 2 3

  1.2 3.4 5.6

  "abc"

  `one`two`three

Corresponding to each vector-type, there is an empty vector:

!0

  0#0.

""

0#`

Heterogenous lists are formed with parentheses and ';':

  (1;2.2;("abc";`d`e`f;;{x+y}))

A dictionary is an atom constructed from a list of symbol-value pairs or symbol-value-dictionary (or null) triples:

  .((`first;10);(`second;20.345);(`third;"abc"))

Dictionaries can be deconstructed into lists of symbol-value-dictionary (or null) triples:

  .((`first;10);(`second;20.345);(`third;"abc"))
((`first;10;)
 (`second;20.345;)
 (`third;"abc";))

There is one empty list:

()

Shape and rank

K data has shape. The shape of an object d is an integer vector s which describes the regular structure of d. Intuitively, each element s describes as much of the structure of the corresponding parts of d as can be summarized in a single integer. The count of s is called the rank of d. The rank of an object is a measure of its regularity.

The shape of an atom is !0. Atoms have no parts, hence there is no structure to describe:

  ^17
!0

  ^"a"
!0

The rank of an atom is 0:

  #^17
0

Since a vector is a homogenous list of atoms, vector shape is simply the enlisted count. We do not have to check whether the parts of a vector are structurally similar - we know that they are:

  ^10 20 30
,3

  ^"abcdef"
,6

The rank of a vector is 1.

The shape of an empty list -- general or vector -- is ,0:

  ^!0
,0

  ^()
,0

The shape of a non-empty general list is determined recursively. We start by setting s to ,#d. The we get the shapes of each part of d, and append to s the longest common subsequence among the shapes. For example,

  ^(1 2 3;4 5 6)
2 3

We set s to ,2. Then we get the shapes of 1 2 3 and 4 5 6. In both cases, the shape is ,3, and the longest common subsequence is ,3. so we append that to s and get 2 3.

  ^(1 2 3;4 5)

We set s to ,2. Then we get the shapes of 1 2 3 and 4 5, which are ,3 and ,2. The longest common subsequence is !0, so we append that to s and ,2.

Here are two implementations of the 'shape' algorithm, one depth-first, the other breadth-first:

dshape:{:[~t:4:x;(#x),{(+/&\=/((#x)&#y)#'(x;y))#x}/_f'x;0>t;#x;!0]}

bshape:{s:!0;i:0;while[(i=#s)&~|/,//{x'}/[i;@:]x;if[1=#c:?(),i,//{x'}/[i;#:]x;s,:c];i+:1];s}

Indexing

Shape may also be thought of as describing the domain of the indexing function.

For example, an object d of rank r:

  d:2 3 4#10*!24
  d
((0 10 20 30
  40 50 60 70
  80 90 100 110)
 (120 130 140 150
  160 170 180 190
  200 210 220 230))

d can be indexed

  d[0;1 2;3]
70 110

If index j is in the i-th position of the indexing expression, then every element of ,//j must be a member of !s i. (Nil is used to mean "every index along this axis".) If j does not satisfy this requirement, the result is an "index error".

  d[0;17;1]
index error
d[0;17;1]
^
>

The number of indexing positions is always equal to the rank of the object: if d has rank r, then 'd[j1;..;jr]' is a valid indexing expression. If too many indices are applied, the result is a "rank error". If too few, the remaining positions are treated as null, i.e. all elements along that axis:

  d[0;1 2;3;4]
rank error
d[0;1 2;3;4]
^
>

  d[0;1]
40 50 60 70

Indexing is cross-sectional. the result of d[1 2;3 4 5] contains data at positions d[1;3], d[1;4], d[1;5], d[2;3], d[2;4], d[2;5]. The shape of the result of d[1 2;3 4 5] is (^1 2),^3 4 5, i.e. 2 3. If k is a list of the indices of d[k 0;..;k r-1], then the shape of the result is ,/^:'k.

Valence and application

The valence of a function f is the rank of its (possibly infinite) domain D = a X .. X z. Intuitively, it is the maximum number of arguments to which f may be applied.

Since functions are atoms, and it is always a rank error to index an atom, the brackets of indexing can be reused to denote function application. For example, the function f:{[a;b;c]a+b*c} may be applied to three arguments:

  f[10;20;30]
610

Application of a function of valence k to more than k arguments results in a "valence error"

  f[10;20;30;40]
valence error
f[10;20;30;40]
^
>

The result of applying a function f of valence k to n<k arguments is a function-projection of valence k-n, where the corresponding argument positions of the function are set to constant values, viz. those supplied as arguments to the projection.

  g:f[10;;30]
  g
{[a;b;c]a+b*c}[10;;30]
  g[20]
610

  h:f[10]
  h
{[a;b;c]a+b*c}[10]
  h[20;30]
610

Function order

The valence of a function is analogous to the rank of an array, but whereas rank is computable as the count of the shape, which in turn is computed by a primitive, valence is neither derived from some more primitive property of functions, nor directly computable by a primitive.

For example, we cannot compute the valence of an arbitrary primitive p, save by looking up p in a table of valences-by-primitive (or analyzing it's name: 5:p). If f is a defined function, then we may be able to compute the valence by analyzing f's argument-list if it has one, or if it doesn't then by counting unassigned occurrences of the variables 'x', 'y', and 'z'.

Where v is the valence of primitive p I'll define the order of p to be the vector 1_-!1+v. Hence the order of + is -1 -2, and that of +: ,-1.

We cannot in general say that order is derived from valence (even if we had a primitive for computing that), since we will eventually want to permute the argument-order of an arbitrary function, and have the order of the resulting function carry information about the order of its arguments. But the valence of e.g. commute(+) is the same as that of +, whereas the order of commute(+) is -2 -1, while that of + is -1 -2.

Instead, we will think of the 'shape' primitive as having been extended to deliver the order property of constituent functions:

  ^(+)
-1 -2

  ^(+;-)
2 -1 -2

Note that in the case of a single primitive, parentheses are required to first "nominalize" the verb '+'. The result of '(+)' is a function-atom, a piece of data.

K shape records the cardinalities of structurally similar subarrays: each element of a shape vector is a count: the number of subarrays of every array at that level. But order is different: the elements of an order vector indicate how incoming arguments are to be mapped to the function's argument-list. So extended shape is a mix of cardinal and ordinal values.

What about defined functions?

Here too, sufficient information must be carried along with function to distinguish e.g. {x+y} from the commute of {x+y}, since the order of the former is -1 -2 and that of the latter -2 -1. At this point, we don't have a functional method for permuting the arguments of either primitives or defined functions, i.e. for modifying the order of a function-atom. So we need merely guarantee that order for defined functions is defined correctly for the types of objects we are currently able to construct in K. And here too it suffices to define the order of f as -1_-!1+v. So the order of {x+y} is -1 -2, that of {-x} ,-1, &c.

What about niladic functions?

Intuitively, it might seem as though the order of a nilad is !0, since a nilad has no arguments, and therefore has valence 0. But as we will see in the next section, this leads to inconsistent results. Fortunately, K nilads are in fact monadic functions. Typically, we apply a nilad to a null argument, written

  nilad[]

but we can apply any nilad to any value with identical effect:

  nilad 10 20 30

Consequently, we will say that nilads have valence 1 and order ,-1.

What is the shape of a:({x+y};{x+y-z})? The shape of the first function is -1 -2, and that of the second -1 -2 -3. We will say that the shape of the pair is 2 -1 -2, for the same reason that we say the shape of (!2;!3) is ,2: the unextended shape vector records the longest list of identical counts. The extended shape vector records the longest list of identical counts and argument positions. Note that a[1;10;20;30] is 0, and that a[0;10;20] is 30, since the shape of a[1] is -1 -2 -3 and that of a[0] -1 -2.

Function arrays

Consider the list

  a:(+;-).

The shape of a is:

  ^a
2 -1 -2

I.e., a pair of function-atoms, each of which has order -1 -2. Since the rank of an object determines the maximum number of indexing positions available in an indexing expression, and the valence of a function determines the maximum number of argument positions available in an application expression, we should expect that

   a[i;j;k]

is a valid expression involving a, in which i occupies an indexing position, and j and k occupy argument positions. We should continue to expect that a[i] is a function if i is an atom and not nil, and an array of functions if i is not. But what should we expect the result of a[0;3] to be? Intuitively, we're picking out the first function +, and then applying it to the single argument 3. So we should expect the result to be the monadic function +[3].

What is the order of this result? Since order is analogous to shape, we might expect the answer to be ,-2, but this is impossible. +[3] has valence 1, and its arguments haven't been permuted, so its order should be ,-1. In general, we will say that the order of a function is always a permutation of 1_-!1+v.

Monadic + (+:, "transpose" or "flip") has the following definition: applied to an object of rank > 1, it swaps the first two axes. So, if ^x = i,j,.. then ^+x = j,i,.. . What then should be result of transposing a function, say *? Since (*) has shape -1 -2, the result of +(*) should have shape -2 -1. That is, +: applied to a function is commute.

Now consider +a. What should we expect. Since +: swaps the axes of its argument:

  ^b:+a
-1 2 -2

and

  ^c:+:'+a
-1 -2 2

and

  ^d:++:'a
-2 -1 2

Intuitively, c is a function of valence 2. Whatever is supplied as its first argument is mapped to the first arguments of both of its constituents + and -. Whatever is supplied as its second argument is mapped to the second arguments of its constituents. So we should expect e.g.

  c[3;7]
10 -4

b is also a function of valence 2, so

  b[3;;7]
10 -4

The arguments of d are reversed, so

  d[3;7]
10 4

Structural operations

flip

As we've seen, 'flip' swaps the first two "axes":

  ^f:+(+;-)	/ list -> function
-1 2 -2

  ^+f		/ function -> list
2 -1 -2

  ^+(-)	        / commute -
-2 -1

count

We have several alternatives.

count is first of the shape: *^f

count is first of the positive shape: *s _di&0>s:^f

count is valence of the function array: +/0>^f

The first doesn't seem very useful, since f is not a list.

first

The shape of the first of a non-null, non-function-array a is 1 _^a. For example

  a:2 3 4#!24
  ^*a
3 4

Since a is a pair of 3 x 4's, each a is a 3 x 4. So the first of a is a 3 x 4.

But f is a function, not a list of n x m's. Can we assign a meaning to the following result?

  ^t:*f
2 -1			/ renumbering the argument position

t is a list of monadic functions. But which functions? What are they?

A better approach might be to say that the shape of the first of a non-atomic function array f is s _di(0<s:^f)?1:

  ^t:*f
-1 -2

  t~(+)
1

take

Simlarly, we say that 'take' operates on the first positive element of the shape:

  ^1#f
-1 1 -2

  ^1_ f
-1 1 -2

drop

And the same for 'drop':

  ^2_ f
!0

enlist

'Enlist' turns a function array into a unit list:

  ^,f
1 -1 2 -2

join

'Join' penetrates to the first positive elements of the shapes of conformable function arrays.

  g:+(*;%)

  ^f,g
-1 4 -2