In decentralised target tracking, a set of sensors observes moving targets. When the sensors are static but steerable, each sensor must dynamically choose which target to observe in a decentralised manner. We show that the information exchanged by the sensors to synchronise their beliefs can be exploited to learn a model of the utility function that drives each others' decisions. Instead of communicating utilities to enable negotiation, each sensor regresses on the learnt model to predict the utilities of other team members. This approach bridges the gap between coordinating implicitly, a locally-greedy solution, and negotiating explicitly. We validated our approach in both hardware and simulations, and found that it out-performed implicit coordination by a statistically significant margin with both ideal and limited communications.

Questions and Answers

You need to be logged in to be able to post here.