To get what you want from nn2, you need to:
- Specify the searchtype to be "radius". If you do not specify searchtype, it defaults to "standard" and the radius argument is ignored.
- Specify k to be nrow(mydata), because k is the maximum number of nearest neighbors to compute.
The manner in which you are calling nn2 will return the 100 nearest neighbors in mydata for each point in mydata. Consequently, you will get the same result for any radius. For example:
library(RANN)
set.seed(123)
## simulate some data
lon = runif(100, -99.1, -99)
lat = runif(100, 33.9, 34)
## data is a 100 x 2 matrix (can also be data.frame)
mydata <- cbind(lon, lat)
radius <- 0.02 ## your radius
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
## prints total number of nearest neighbors (for all points) found using "radius"
print(length(which(res$nn.idx>0)))
##[1] 1224
res1 <- nn2(mydata, k=100, radius = radius)
## prints total number of nearest neighbors (for all points) found using your call
print(length(which(res1$nn.idx>0)))
##[1] 10000
radius <- 0.03 ## increase radius
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
## prints total number of nearest neighbors (for all points) found using "radius"
print(length(which(res$nn.idx>0)))
##[1] 2366
res1 <- nn2(mydata, k=100, radius = radius)
## prints total number of nearest neighbors (for all points) found using your call
print(length(which(res1$nn.idx>0)))
##[1] 10000
Note that according to its documentation ?nn2, nn2 returns a list with two elements:
nn.idx: a nrow(query) x k matrix in which row i contains the row indices (into mydata) of the k nearest neighbors of the i-th query point. In both of our calls, query=mydata. When called with searchtype="radius", if only m < k neighbors lie within the given radius, the remaining k - m index values in that row are set to 0. Since the set of query points is the same as mydata, each point's own index is included among its neighbors.
nn.dists: a nrow(query) x k matrix where each element contains the Euclidean distance to the corresponding nearest neighbor in nn.idx. Here, if the corresponding element in nn.idx is 0, then the value in nn.dists is set to 1.340781e+154. (res$nn.dist also works because $ partially matches list names.)
With your call, you get 100 nearest neighbors for each point in mydata, hence length(which(res1$nn.idx>0))==10000 no matter what the radius is in the example.
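To make the zero padding concrete, here is a small sketch that pulls out the actual neighbors of a single point. It reuses the simulated data from above; the names i, keep, neighbors_i and dists_i are only illustrative.

```r
library(RANN)

set.seed(123)
## same simulated data as in the example above
mydata <- cbind(lon = runif(100, -99.1, -99), lat = runif(100, 33.9, 34))
radius <- 0.02

res <- nn2(mydata, k = nrow(mydata), searchtype = "radius", radius = radius)

i <- 1                                 ## the query point to inspect
keep <- res$nn.idx[i, ] > 0            ## TRUE for real neighbors, FALSE for zero padding
neighbors_i <- res$nn.idx[i, keep]     ## row indices in mydata (includes i itself)
dists_i <- res$nn.dists[i, keep]       ## matching Euclidean distances

print(neighbors_i)
print(dists_i)
```

Dropping the zero entries this way also drops the 1.340781e+154 placeholder distances, so dists_i only holds real distances.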
Finally, you should note that because nn2 returns two nrow(mydata) x nrow(mydata) matrices in your case, it can very easily overwhelm your memory if you have a lot of points.
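If memory is a concern, one possible workaround (a sketch, not something from the RANN documentation) is to query the points in chunks, so that each call only returns chunk_size x nrow(mydata) matrices. The helper count_neighbors_chunked and its chunk_size argument are hypothetical names introduced here.

```r
library(RANN)

## Hypothetical helper: counts neighbors within `radius` for every row of `data`,
## querying `chunk_size` rows at a time to limit the size of the returned matrices.
count_neighbors_chunked <- function(data, radius, chunk_size = 1000) {
  n <- nrow(data)
  counts <- integer(n)
  for (s in seq(1, n, by = chunk_size)) {
    rows <- s:min(s + chunk_size - 1, n)
    res <- nn2(data, query = data[rows, , drop = FALSE], k = n,
               searchtype = "radius", radius = radius)
    ## non-zero indices are real neighbors; subtract 1 to exclude the point itself
    counts[rows] <- as.integer(rowSums(res$nn.idx > 0) - 1)
  }
  counts
}

set.seed(123)
mydata <- cbind(lon = runif(100, -99.1, -99), lat = runif(100, 33.9, 34))
chunked <- count_neighbors_chunked(mydata, radius = 0.02, chunk_size = 30)
```

The trade-off is that the search tree is rebuilt on every call, so this exchanges some speed for a smaller memory footprint.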
Updated to specifically produce the result of getting the count of neighbors within a given radius.
To compute the number of neighbors within a radius of each point in the data, call nn2 as follows:
res <- nn2(mydata, k=nrow(mydata), searchtype="radius", radius = radius)
Then, do this:
count <- rowSums(res$nn.idx > 0) - 1
Notes:
- Since query=mydata and k=nrow(mydata), the resulting res$nn.idx will be nrow(mydata) x nrow(mydata), as explained above.
- Each row i of res$nn.idx corresponds to row i in query, which is the i-th point in query=mydata. Call this i-th point p[i].
- Each row i of res$nn.idx contains the row indices of the neighbors of p[i] AND zeroes to fill that row in the matrix (because not all points in mydata will be within the radius of p[i]).
- Therefore, the number of neighbors of p[i] can be found by finding those values in the row that are greater than 0 and then counting them. This is what rowSums(res$nn.idx > 0) does for each row of res$nn.idx.
- Lastly, because query=mydata, the point being queried is in the count itself. That is, a point is the nearest neighbor to itself. Therefore, subtract 1 from these counts to exclude that.
The resulting count will be a vector of the counts. The i-th element is the number of neighbors within the radius to the i-th point in query=mydata.
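As a sanity check on the counting step, the counts can be compared against a brute-force computation from the full pairwise distance matrix. This is only a sketch for small data: dist() is itself quadratic in memory, and it assumes no pair of points lies exactly at the radius.

```r
library(RANN)

set.seed(123)
mydata <- cbind(lon = runif(100, -99.1, -99), lat = runif(100, 33.9, 34))
radius <- 0.02

res <- nn2(mydata, k = nrow(mydata), searchtype = "radius", radius = radius)
count <- rowSums(res$nn.idx > 0) - 1

## brute force: all pairwise Euclidean distances
d <- as.matrix(dist(mydata))
## the diagonal (distance 0 to itself) is counted, so subtract 1 here as well
brute <- unname(rowSums(d <= radius) - 1)

print(all(count == brute))
```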
Hope this is clear.