I want you to look into the future. To a time when drone delivery is the norm. When drone regulations have eased and there’s always a fleet of delivery drones zipping through the air.
What a glorious day that’ll be. But, there are many improvements that need to be made before we get to that point, including the collaboration between drones to avoid bumping into one another and working together efficiently.
Researchers at MIT are working on improving the collaboration between not only drones, but for all robots, by syncing the actions of multiple robots working in the same space. MIT has developed an algorithm to simplify Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs), which are mathematical models that describe the way multiple robots work together.
SEE ALSO: How Amazon Drone Delivery Will Work
The more robots in a system, the more complex the math becomes. MIT discovered in 2014 how to use Dec-POMDPs to allow multiple robots to accomplish tasks collaboratively. Now the research has gone a step further, where the system generates the control systems from scratch and still solves Dec-POMDPs.
Watch MIT test its system on a small group of delivery drones in the video below. The drones need to cross paths to make the deliveries, obviously without crashing. Before they fly, the drones map out the optimal flight path and the Dec-POMDPs take over to aid the collaboration.
The complication in all of this, MIT says, is that Dec-POMDPs factor in uncertainty. For example, a drone will depend on its sensors while flying, but those sensors will probably be slightly error prone, so any given reading should be interpreted as probable not actual.
MIT explains more about how this process works:
SEE ALSO: 10 Reasons Drone Delivery is Doomed
“Even an accurate measurement, however, may be open to interpretation, so the robot would need to build a probability distribution of possible locations on top of the probability distribution of sensor readings. Then it has to choose a course of action, but its possible actions will have their own probabilities of success. And if the robot is participating in a collaborative task, it also has to factor in the probable locations of other robots and their consequent probabilities of taking particular actions.
“Since a probability distribution consists of a range of possible values – in principle, an infinite number of values – solving a problem with probabilities heaped on probabilities is much harder than solving a problem involving discrete values.
“A graph is data representation consisting of nodes, usually depicted as circles, and edges, usually depicted as lines connecting the circles. Network diagrams and family trees are familiar examples.

Diagram shows the destinations and obstacles in MIT’s scenario. (via Shayegan Omidshafiei)
“The researchers’ algorithm first constructs a graph in which each node represents a “belief state,” meaning a probabilistic estimate of an agent’s own state and the state of the world. The algorithm then creates a set of control procedures – the edges of the graph – that can move the agent between belief states.
“The researchers refer to these control procedures as “macro-actions.” Because a single macro-action can accommodate a range of belief states at both its origin and its destination, the planning algorithm has removed some of the problem’s complexity before passing it on to the next stage.
“For each agent, the algorithm then constructs a second graph, in which the nodes represent macro-actions defined in the previous step, and the edges represent transitions between macro-actions, in light of observations. In the experiments reported in the new paper, the researchers then ran a host of simulations of the task the agents were intended to perform, with agents assuming different, random states at the beginning of each run. On the basis of how well the agents executed their tasks each time through, the planning algorithm assigned different weights to the macro-actions at the nodes of the graph and to the transitions between nodes.
“The result was a graph capturing the probability that an agent should perform a particular macro-action given both its past actions and its observations of the world around it. Although those probabilities were based on simulations, in principle, autonomous agents could build the same type of graph through physical exploration of their environments.
“Finally, the algorithm selects the macro-actions and transitions with the highest weights. That yields a deterministic plan that the individual agents can follow: After performing macro-action A, if you make measurement B, execute macro-action C.”