Algorithm
Set B' = Blocal
for j = 0 to sqrt(P) - 1
    in each row I, the [(I+j) mod sqrt(P)]th task broadcasts
    A' = Alocal to the other tasks in the row
    accumulate A' * B'
    send B' to upward neighbor
done
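
A minimal serial sketch of the loop above, in Python, simulating the sqrt(P) x sqrt(P) task grid with nested lists instead of real message passing. It assumes the loop runs sqrt(P) times so that every block product is accumulated, that the grid wraps around (the top row sends B' to the bottom row), and that blocks are square. The names fox_multiply, split, join, matmul and add_into are hypothetical helpers introduced only for this illustration.

```python
def matmul(x, y):
    """Plain triple-loop product of two square matrices (lists of lists)."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add_into(c, d):
    """Accumulate block d into block c in place."""
    for i in range(len(c)):
        for j in range(len(c)):
            c[i][j] += d[i][j]

def split(m, q):
    """Cut a square matrix into a q x q grid of equal square blocks."""
    bs = len(m) // q
    return [[[row[J * bs:(J + 1) * bs] for row in m[I * bs:(I + 1) * bs]]
             for J in range(q)] for I in range(q)]

def join(blocks):
    """Reassemble a grid of blocks into one matrix."""
    return [sum((blk[r] for blk in block_row), [])
            for block_row in blocks
            for r in range(len(block_row[0]))]

def fox_multiply(a_blocks, b_blocks):
    """Simulate the broadcast / accumulate / shift loop on a q x q task grid.

    a_blocks[I][J] and b_blocks[I][J] play the role of Alocal and Blocal
    in task (I, J); q corresponds to sqrt(P).
    """
    q = len(a_blocks)
    bs = len(a_blocks[0][0])
    c_blocks = [[[[0] * bs for _ in range(bs)] for _ in range(q)]
                for _ in range(q)]
    b_cur = [row[:] for row in b_blocks]          # Set B' = Blocal
    for j in range(q):                            # j = 0 .. sqrt(P) - 1
        for I in range(q):
            root = (I + j) % q                    # broadcasting task in row I
            a_bcast = a_blocks[I][root]           # A' = Alocal of that task
            for J in range(q):
                # accumulate A' * B' in task (I, J)
                add_into(c_blocks[I][J], matmul(a_bcast, b_cur[I][J]))
        # send B' to upward neighbor: task (I, J) receives from (I+1, J)
        b_cur = [[b_cur[(I + 1) % q][J] for J in range(q)] for I in range(q)]
    return c_blocks

# Example: a 4 x 4 product on a 2 x 2 task grid (P = 4).
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 2, 1], [3, 1, 1, 0]]
C = join(fox_multiply(split(A, 2), split(B, 2)))
```

At step j, task (I, J) holds the B block that started in row (I+j) mod sqrt(P), which matches the column index of the A block being broadcast in row I, so each step contributes one term of the block sum. In a real message-passing version the inner loops over I and J would disappear, since each task would run only its own part.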