# Staged iteration strategy

Created: 2008-05-07 12:10:27      Last updated: 2008-05-07 12:32:36

Consider two lists A and B, both of size 3. `A[1]` corresponds to `B[1]`, `A[2]` to `B[2]`, and so on. For instance, A could hold image scans from 2007 and B scans from 2008, with the index indicating the patient number.

```
A = [a0, a1, a2]
B = [b0, b1, b2]
```

There are then two lists of possible parameters, P and Q, of different lengths: P has 2 items and Q has 4.

```
P = [p0, p1]
Q = [q0, q1, q2, q3]
```

Each of the A items should be processed in the processor ap using each of the P parameters, and each of the B items should be processed in bq using each of the Q parameters.

The problem then is how to compare every AnPp against every BnQq, where An and Bn have to match. The normal crossproduct would compare all AaPp against all BbQq, including mismatched patients, so we want to restrict the iteration strategy. We can't use the dot product directly either, because for a given patient n we want to compare all Ps against all Qs, which needs the cross product.
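To make the difference concrete, here is a minimal Python sketch (not Taverna code) that counts the comparisons under each strategy. The string items and the variable names `all_pairs` and `matched_pairs` are illustrative assumptions only.

```python
from itertools import product

A = ["a0", "a1", "a2"]
B = ["b0", "b1", "b2"]
P = ["p0", "p1"]
Q = ["q0", "q1", "q2", "q3"]

# Unrestricted cross product: every a/p combination against every b/q
# combination, so patients get mixed (a0 is compared with b1, b2, ...).
all_pairs = [(a + p, b + q) for a, p, b, q in product(A, P, B, Q)]

# Restricted strategy: pair A and B by index first (dot product),
# then cross all Ps against all Qs within each matched patient.
matched_pairs = [
    (a + p, b + q)
    for a, b in zip(A, B)
    for p, q in product(P, Q)
]

print(len(all_pairs))      # 72
print(len(matched_pairs))  # 24
```

Only the 24 matched comparisons are wanted; the remaining 48 in the full cross product compare scans from different patients.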

This is solved in t2 using staged iteration, but here is a hack showing how it can be achieved in t1 using a nested workflow. The nested workflow basically "stops" the iteration at list level, because the echo_lists inside it make the nested workflow expect lists.

An explicit `crossproduct(p,a)` strategy is set on ap, and `crossproduct(q,b)` on bq, to make sure the a/b iteration ends up at the highest list level of their outputs, i.e. ap outputs:

```
[ [ a0p0, a0p1 ],
  [ a1p0, a1p1 ],
  [ a2p0, a2p1 ]
]
```

So each top-level list from ap corresponds to one item of A. The same trick applies to bq. If we don't specify this, the implicit iteration might output the opposite nesting, with the highest-level lists corresponding to the Ps and Qs. (Hint: Simply drag p or a within the iteration strategy editor to change the order.)
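The two possible nestings can be illustrated with a small Python sketch. This is only an analogy for the list structure, not a model of how Taverna evaluates its iteration strategies; `ap_by_a` and `ap_by_p` are hypothetical names.

```python
A = ["a0", "a1", "a2"]
P = ["p0", "p1"]

# Wanted nesting: one top-level list per item of A.
ap_by_a = [[a + p for p in P] for a in A]

# Opposite nesting: one top-level list per item of P.
ap_by_p = [[a + p for a in A] for p in P]

print(ap_by_a)  # [['a0p0', 'a0p1'], ['a1p0', 'a1p1'], ['a2p0', 'a2p1']]
print(ap_by_p)  # [['a0p0', 'a1p0', 'a2p0'], ['a0p1', 'a1p1', 'a2p1']]
```

Both contain the same six items, but only the first form has three top-level lists that can later be matched index by index against the three top-level lists from bq.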

The second part is to use a nested workflow that takes the outputs from ap and bq, but passes them through the Echo list local worker. This worker doesn't do anything except force the nested workflow to take a list of items at each input instead of single items.

Hence we can set a `dotproduct(bq, ap)` on the processor for the nested workflow apbq_iter. Since this nested workflow consumes lists at both ports, it will be iterated over with these inputs:

```
ap = [a0p0, a0p1]    bq = [b0q0, b0q1, b0q2, b0q3]
ap = [a1p0, a1p1]    bq = [b1q0, b1q1, b1q2, b1q3]
ap = [a2p0, a2p1]    bq = [b2q0, b2q1, b2q2, b2q3]
```
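In Python terms, the outer dot product behaves roughly like `zip` over the two lists of lists. The following is a sketch under that assumption; the `invocations` name is made up for illustration.

```python
A = ["a0", "a1", "a2"]
B = ["b0", "b1", "b2"]
P = ["p0", "p1"]
Q = ["q0", "q1", "q2", "q3"]

# Outputs of ap and bq with the a/b iteration at the top level.
ap = [[a + p for p in P] for a in A]
bq = [[b + q for q in Q] for b in B]

# Dot product over the outer lists: the n-th list of ap is paired
# with the n-th list of bq, one nested-workflow invocation per pair.
invocations = list(zip(ap, bq))

for ap_n, bq_n in invocations:
    print("ap =", ap_n, "  bq =", bq_n)
```

Because each invocation receives whole lists, the patient index is fixed before any P/Q combination is formed.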

Inside the nested workflow there's the normal `crossproduct(bq, ap)`, so that it can do an all-to-all comparison.

(The beanshell inside apbq here actually only returns the string `ap+bq`, ap returns `a+p` and bq returns `b+q`, but assume that instead there were real services invoked for each of these processors, doing some kind of filtering/comparison operation using the given parameters, and that `a0p0` etc. are the outputs of those operations.)

With this hack we can run a dotproduct for the outer list, and a crossproduct for the inner list.

The output of running this workflow should be:

```
{
  [ (a0p0b0q0, a0p0b0q1, a0p0b0q2, a0p0b0q3),
    (a0p1b0q0, a0p1b0q1, a0p1b0q2, a0p1b0q3)
  ],
  [ (a1p0b1q0, a1p0b1q1, a1p0b1q2, a1p0b1q3),
    (a1p1b1q0, a1p1b1q1, a1p1b1q2, a1p1b1q3)
  ],
  [ (a2p0b2q0, a2p0b2q1, a2p0b2q2, a2p0b2q3),
    (a2p1b2q0, a2p1b2q1, a2p1b2q2, a2p1b2q3)
  ]
}
```

So in the final output `{}` (depth 3) there are three big `[]` lists of depth 2, corresponding to `a0`/`b0`, `a1`/`b1` and `a2`/`b2`. Within each of these are two `()` lists of depth 1, corresponding to `p0` and `p1`. The contents of these lists (depth 0) are the actual items returned from apbq, one for each of `q0`, `q1`, `q2`, `q3`.
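The whole strategy can be imitated in a few lines of Python to check the shape of the result. This is a rough sketch, not the workflow itself: the function `apbq_iter` merely stands in for the nested workflow, plain string concatenation stands in for the beanshell, and `depth` is a hypothetical helper.

```python
A = ["a0", "a1", "a2"]
B = ["b0", "b1", "b2"]
P = ["p0", "p1"]
Q = ["q0", "q1", "q2", "q3"]

ap = [[a + p for p in P] for a in A]
bq = [[b + q for q in Q] for b in B]


def apbq_iter(ap_n, bq_n):
    """Stand-in for the nested workflow: cross product of two lists."""
    return [[x + y for y in bq_n] for x in ap_n]


def depth(value):
    """Nesting depth: 0 for a single item, 1 for a flat list, and so on."""
    return 1 + depth(value[0]) if isinstance(value, list) else 0


# Outer dot product (zip) with an inner cross product per patient.
result = [apbq_iter(ap_n, bq_n) for ap_n, bq_n in zip(ap, bq)]

print(depth(result))   # 3
print(len(result))     # 3 patients
print(result[0][0])    # ['a0p0b0q0', 'a0p0b0q1', 'a0p0b0q2', 'a0p0b0q3']
```

Python uses square brackets at every level, so the `{}`, `[]` and `()` distinction above corresponds here to depths 3, 2 and 1 of the same list type.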

