I am currently trying to vectorize some code I had written using a large for loop in Python. The vectorized code is as follows:
rho[pi,pj] += (rho_coeff*dt)*i_frac*j_frac
rho[pi+1,pj] += (rho_coeff*dt)*ip1_frac*j_frac
rho[pi,pj+1] += (rho_coeff*dt)*i_frac*jp1_frac
rho[pi+1,pj+1] += (rho_coeff*dt)*ip1_frac*jp1_frac
Each of pi, pj, dt, i_frac, j_frac, ip1_frac, jp1_frac is a numpy array of one dimension and all of the same length. rho is a two dimensional numpy array. pi and pj make up a list of coordinates (pi,pj) which indicate which element of the matrix rho is modified. The modification involves the addition of the (rho_coeff*dt)*i_frac*j_frac term to the (pi,pj) element as well as addition of similar terms to neighbouring elements: (pi+1,pj), (pi,pj+1) and (pi+1,pj+1). Each coordinate in the list (pi, pj) has a unique dt, i_frac, j_frac, ip1_frac and jp1_frac associated with it.
The problem is that the list can have (and always will have) repeating coordinates. So instead of successively adding to rho each time the same coordinate is encountered in the list, it only adds the term corresponding to the last repeating coordinate. This problem is described briefly with an example in the Tentative Numpy Tutorial under fancy indexing with arrays of indices (see the last three examples before boolean indexing). Unfortunately they did not provide a solution to this.
Is there a way of doing this operation without resorting to a for loop? I am trying to optimize for performance and want to do away with a loop if possible.
FYI: this code forms part of a 2D particle tracking algorithm where the charge from each particle is added to the four adjacent nodes of a mesh surrounding the particle's position based on volume fractions.