Let A be an m × n matrix and x a vector of length n. Using UPC++, compute in parallel the vector y of length m given by y = A × x. The elements of the result vector must be $y_i = \sum_{j=0}^{n-1} A_{i,j} \, x_j$ for i = 0, ..., m-1.
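A serial reference can help when checking the parallel versions for correctness. The following is a minimal sketch in plain C++ (no UPC++ calls), assuming A is stored row-major in a flat array; the function name `matvec_serial` and the storage layout are illustrative choices, not part of the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial reference for y = A * x.
// Assumption: A holds m*n doubles in row-major order, so A(i, j) = A[i * n + j].
std::vector<double> matvec_serial(const std::vector<double>& A,
                                  const std::vector<double>& x,
                                  std::size_t m, std::size_t n) {
    assert(A.size() == m * n && x.size() == n);
    std::vector<double> y(m, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            y[i] += A[i * n + j] * x[j];  // accumulate A(i, j) * x(j)
        }
    }
    return y;
}
```

Comparing each parallel result against this function on small inputs is one simple way to validate the distributed code before measuring performance.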
a. Distribute both the rows of the matrix A and the elements of x among the different parts of the global memory space. Apply a suitable distribution to y in order to minimize remote accesses.
b. Distribute the rows of the matrix A among the different parts of the global memory and replicate the vector x. Apply a suitable distribution to y in order to minimize remote accesses.
c. Distribute the columns of the matrix A among the different parts of the global memory. Apply a suitable distribution to both vectors x and y in order to minimize remote accesses. You may need a reduction in order to obtain good performance.
Measure and discuss the performance of the three versions.
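The data decompositions in the items above can be sketched without any UPC++ calls by letting a loop iteration stand in for each process. The code below is only such a model, written in plain C++: `block_range`, `matvec_by_rows`, and `matvec_by_cols` are hypothetical names, a contiguous block distribution is assumed, and A is assumed to be stored row-major. In a real UPC++ program each iteration of the rank loop would run in a separate process (identified by `upcxx::rank_me()`), and the final sum in the column version would typically be a collective reduction such as `upcxx::reduce_all` or `upcxx::reduce_one`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open range [begin, end) of the indices owned by `rank` when `total`
// items are split into `nranks` contiguous blocks (hypothetical helper).
std::pair<std::size_t, std::size_t> block_range(std::size_t rank,
                                                std::size_t nranks,
                                                std::size_t total) {
    assert(nranks > 0 && rank < nranks);
    std::size_t base = total / nranks;
    std::size_t extra = total % nranks;  // the first `extra` ranks get one more item
    std::size_t begin = rank * base + (rank < extra ? rank : extra);
    std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}

// Row distribution (items a and b): rank r owns a block of rows of A and the
// matching block of y, so every write to y is local to its owner.
std::vector<double> matvec_by_rows(const std::vector<double>& A,
                                   const std::vector<double>& x,
                                   std::size_t m, std::size_t n,
                                   std::size_t nranks) {
    assert(A.size() == m * n && x.size() == n);
    std::vector<double> y(m, 0.0);
    for (std::size_t r = 0; r < nranks; ++r) {  // one iteration models one process
        std::pair<std::size_t, std::size_t> rows = block_range(r, nranks, m);
        for (std::size_t i = rows.first; i < rows.second; ++i)
            for (std::size_t j = 0; j < n; ++j)
                y[i] += A[i * n + j] * x[j];
    }
    return y;
}

// Column distribution (item c): rank r owns a block of columns of A and the
// matching block of x. Each rank produces a full-length partial y, and the
// partial vectors are then summed element-wise (the reduction step).
std::vector<double> matvec_by_cols(const std::vector<double>& A,
                                   const std::vector<double>& x,
                                   std::size_t m, std::size_t n,
                                   std::size_t nranks) {
    assert(A.size() == m * n && x.size() == n);
    std::vector<std::vector<double>> partial(nranks, std::vector<double>(m, 0.0));
    for (std::size_t r = 0; r < nranks; ++r) {  // one iteration models one process
        std::pair<std::size_t, std::size_t> cols = block_range(r, nranks, n);
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t j = cols.first; j < cols.second; ++j)
                partial[r][i] += A[i * n + j] * x[j];
    }
    std::vector<double> y(m, 0.0);
    for (std::size_t r = 0; r < nranks; ++r)  // serial stand-in for the reduction
        for (std::size_t i = 0; i < m; ++i)
            y[i] += partial[r][i];
    return y;
}
```

In the row versions only the reads of x can be remote, and replicating x (item b) removes those as well. In the column version each process touches only its own elements of x, but every process contributes to every element of y, which is why a reduction is the natural way to combine the partial results.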
"Looking for a Similar Assignment? Get Expert Help at an Amazing Discount!"
