All of these methods rely on variants of hashing in order to obtain near constant time behavior in distributing the data and it is preferable that they are as close as possible to being distancepreserving. Specifically, spatial objects in proximity should have similar hash values. In particular, it is desirable to be able to estimate how far apart two objects are (within a given error bound) by just considering their hash values. Such hash functions enable performing an approximate range query using simple hash table lookup operations. Other issues involve the parallel processing of spatial queries. Some easy examples are the distance join query which finds pairs (p,q) of objects (from two different sets) where the distance between p and q is less than a given threshold, or computing the shortest paths from each node to every other node in a road network. More difficult are the spatial problems which can not be easily decomposed into multiple tasks running in parallel, e.g., the distance semijoin query, and network Voronoi diagram construction. This requires developing a generic method to traverse a graph or a tree in parallel to solve these query problems. Ideally, the method should require little or no communication between parallel tasks which will be accomplished by allowing the parallel tasks to produce redundant results which can then be pruned.
The tools that we develop will help improve the robustness and scalability for spatial data management. Results from our investigation on parallel query processing can be useful for query problems which requires traversing a tree or a graph which are often spatially embedded. Having a method to traverse a graph or a tree in parallel that requires little or no communication enables us to process many types of spatial queries using distributed computing resources where communication can be very costly. Specifically, we expect our tools to enable spatial applications such as online mapping, computer aided design, online gaming and scientific simulations to handle terabytes of spatial data while it is impossible or inefficient to do with the current technologies. This is of utility to all organizations that process spatial data and attempts will be made to use it in some government agencies.
NSF Grant IIS1320791.
Fast computation of the distance oracle ASDO (all store distance oracle).
Dynamic map labeling which is the case when the location associated with the label is constantly changing.
Represents the evaluation of a distributed solution to analytical queries on a road network database.
Evaluation of algorithms for computing a spatiotextual similarity join.
Sample ASDO Web Demo: A full version web demo is can be found here To use this demo click on any two points within the map of the mainland of the USA and the oracle returns the network distance between the two points. This is achieved with a database lookup operation. The ASDO (i.e., the all store distance oracle) has been precomputed with an epsilon error tolerance value of 0.25. For comparison purposes, we provide the true network distance computed using our data as well as the Google computed network distance although our distances are erroneous since we built the ASDO using a free inaccurate road network while Google uses an accurate commercial road network. This problem will go away once we invest in a better road network input dataset for our oracle.
Notice that the distance computed by ASDO is very close to the true distance (the epsilon error tolerance value of 0.25 is much in excess of the real error and thus is quite reasonable). Please note that the Google computed network distance is only available if you use the Chrome browser; otherwise, it is missing. Additionally, we provide the SQL query that we executed against the oracle to provide the network distance.
Observe the speedup of ASDO over the true distance which can be as high as 1000X. Moreover, ASDO permits simultaneous computation of the distance and can handle as many as 65,000 distance lookups per second assuming one database server. Adding Y servers will increase the number of lookups per second by a factor of Y. On the other hand, the true distance is computed sequentially and cannot be parallelized on one machine although it can be parallelized over Y machines in which case it is sped up by a factor of Y but this is still a much slower throughput than what is achievable with ASDO.
1st Location

Lat: Lon: 
2nd Location

Lat: Lon: 
ASDO Network Distance


Quad Depth


Euclidean Distance


Google Maps Distance


True Network Distance


SQL Query

Date of last update: July 9, 2015