Abstract:
This correspondence presents several parallel algorithms
for image template matching on an SIMD array processor with a hypercube
interconnection network. For an N by N image and an M by
M window, the time complexity is reduced from O(N2M2) for the serial
algorithm to O(M2/K2 + M * log2 NIK + log2 N * log2 K) for the
N2K2-PE system (1 s K . M), or to O(N2M2/L2) for the L2-PE system (L < N). With efficient use of the inter-PE communication network,
each PE requires only a small local memory, many unnecessary
data transmissions are eliminated, and the time complexity is greatly
reduced.