Streamlined rules for robots

With the explosion of the Internet and the commoditization of autonomous robots (such as the Roomba) and small sensors (such as the ones in most cell phones), computer scientists have become increasingly interested in distributed computing: how disparate autonomous devices, whether servers in a network or robots investigating an underwater oil spill, can work together toward some common goal.
Distributed devices have to adjust their behavior to changing circumstances. Frequently, however, their understanding of their circumstances is based only on a few local observations, and even those could be slightly inaccurate. Behavior that is perfectly reasonable in one case could prove catastrophic in another that, to a device, looks identical. Device programmers thus have to find behavioral policies that strike a balance between advancing the common goal and minimizing the risk of something going badly wrong.
Optimizing that balance would mean weighing every possible option for each device against all other options for all other devices under all circumstances. For even simple distributed-computing systems, that calculation quickly becomes so complex that it's basically insoluble. But Frans Oliehoek, a postdoc in MIT's Computer Science and Artificial Intelligence Laboratory, is developing new techniques to calculate policies for distributed-computing systems. Although those techniques aren't guaranteed to find the perfect policy, they will usually come pretty close, and they won't take centuries to yield an answer.
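To see how fast the problem grows, consider a back-of-the-envelope count for the kind of finite-horizon, decentralized decision model studied in this line of research (the numbers below are purely illustrative): each device's plan must specify an action for every possible history of observations it might accumulate, and the number of such histories grows exponentially with how far ahead the plan looks.

```python
# Back-of-the-envelope count of deterministic plans for ONE device in a
# finite-horizon decentralized decision problem; an illustrative
# calculation, not a figure from the research described here.
#
# With A actions, O observations, and horizon h, the number of
# observation histories a device must plan for is
#     1 + O + O**2 + ... + O**(h - 1) = (O**h - 1) / (O - 1),
# and a deterministic plan picks one action at each of them.

A, O = 2, 2  # a tiny problem: two actions, two observations
for h in range(1, 6):
    histories = (O**h - 1) // (O - 1)
    plans_per_device = A**histories
    print(f"horizon {h}: {plans_per_device} plans per device")

# Horizon 5 already yields 2**31 (about 2.1 billion) plans for a single
# device, and a joint plan multiplies such counts across all devices.
```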
To get a clearer idea of the problem, consider a very simple example. Companies such as Google or Facebook maintain server farms with tens of thousands of computers. There's a lot of redundant information on those computers, so that hordes of users can access the same information at the same time. If a given computer is falling behind in handling users' requests, how long should it let its queue of unanswered requests get before it fobs them off on another computer? Ten? Fifteen? A thousand? A million? The optimal answer has to strike a balance among cases where the other servers in the farm are idle, cases where the other servers have even longer queues, and everything in between. A given server may be able to infer something about traffic as a whole by glancing at the queues of the servers next to it. But if it were continually asking all the other servers in the farm about the length of their queues, it would choke the network with queries.
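As a rough illustration of what such a policy might look like in code, here is a minimal sketch in Python; the threshold value, the class and method names, and the glance-at-the-neighbors heuristic are assumptions made up for the example, not details of any real server farm.

```python
# Hypothetical sketch of a threshold-based offloading policy. The
# threshold and the neighbor heuristic are illustrative assumptions.

QUEUE_THRESHOLD = 15  # how long the queue may grow before offloading

class Server:
    def __init__(self, name):
        self.name = name
        self.queue = []        # pending user requests
        self.neighbors = []    # the few servers this one can glance at

    def should_offload(self):
        # Decide using purely local information: our own queue length.
        return len(self.queue) > QUEUE_THRESHOLD

    def pick_target(self):
        # Glance only at nearby servers; polling the whole farm would
        # choke the network with queries.
        if not self.neighbors:
            return None
        return min(self.neighbors, key=lambda s: len(s.queue))

    def handle(self, request):
        if self.should_offload():
            target = self.pick_target()
            # Only hand off if the neighbor really is better off.
            if target is not None and len(target.queue) < len(self.queue):
                target.queue.append(request)
                return
        self.queue.append(request)
```

Whether 15 is the right threshold is, of course, exactly the question: the best value depends on what all the other servers are doing, which no single server can see.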
Historical perspective
Making the problem even more complicated, policy has to vary according to a device's history. It may be, for instance, that a robot helicopter trying to find a way into a burning building is much less likely to get itself incinerated if it makes two reconnaissance loops around the building before picking an entry point than if it makes just one. So its policy isn't as simple as, "If you've just completed a loop, fly through the window farthest from the flames." Sometimes it's, "If you've just completed a loop, make another loop." Moreover, if a squadron of helicopters is performing a collective task, the policy for any one of them has to account for all the possible histories of all the others.
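Formally, a policy of this kind maps a device's whole observation history, not just its latest observation, to an action. The toy sketch below illustrates the point; the observation labels and actions are invented for the example.

```python
# Toy sketch of a history-dependent policy: the same latest observation
# ("completed_loop") leads to different actions depending on the full
# history. The labels are invented for illustration.

def helicopter_policy(history):
    """Map a tuple of past observations to the next action."""
    loops_completed = history.count("completed_loop")
    if loops_completed < 2:
        return "make_another_loop"
    return "fly_through_safest_window"

print(helicopter_policy(("takeoff", "completed_loop")))
# -> make_another_loop
print(helicopter_policy(("takeoff", "completed_loop", "completed_loop")))
# -> fly_through_safest_window
```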
In a series of papers presented at the International Conference on Autonomous Agents and Multiagent Systems, Oliehoek and colleagues at several other universities have described a variety of ways to reduce the scale of the policy-calculation problem. "What you want to do is try and decompose the whole big problem into a set of smaller problems that are connected," Oliehoek says. "We now have some methods that seem to work quite well in practice."
The key is to identify cases in which structural features of the problem mean that certain combinations of policies don't need to be evaluated separately. Suppose, for instance, that the goal is to find policies to prevent autonomous helicopters from colliding with each other while investigating a fire. It could be that after certain sequences of events, there's some possibility of helicopter A hitting helicopter B, and of helicopter B hitting helicopter C, but no chance of helicopter A hitting helicopter C. So preventing A from colliding with C doesn't have to factor into the calculation of the optimal policy. In other cases, it's possible to lump histories together: if different histories always call for the same action and lead to the same result, they don't need to be evaluated separately.
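One way to picture that first kind of structure is as an interaction graph over the devices: only pairs of helicopters joined by an edge ever need a joint evaluation. The sketch below is an illustrative rendering of the idea, not the method from the papers; the names, the graph, and the scoring function are all invented for the example.

```python
from itertools import combinations

# Illustrative sketch of exploiting interaction structure (not the
# algorithm from the papers). Scoring every pair of helicopters scales
# quadratically, but if the problem's structure rules out certain
# collisions (A can never hit C), only the pairs that appear as edges
# in an interaction graph need a joint evaluation.

agents = ["A", "B", "C"]

# Edges mark pairs that could actually collide after some sequence of
# events; note there is no ("A", "C") edge.
interaction_edges = {("A", "B"), ("B", "C")}

def collision_risk(policy_x, policy_y):
    # Stand-in for an expensive joint evaluation of two policies.
    return 0.0

def total_risk(policies):
    risk = 0.0
    for x, y in combinations(agents, 2):
        if (x, y) in interaction_edges:
            # Pairs with no edge, like (A, C), are skipped entirely.
            risk += collision_risk(policies[x], policies[y])
    return risk

print(total_risk({"A": "policy_a", "B": "policy_b", "C": "policy_c"}))
```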
The mathematical model of decision making that Oliehoek has been investigating "is a very general model, so you can model all sorts of decision problems with it," says Francisco Melo, an assistant professor of computer science and engineering at Portugal's Universidade Técnica de Lisboa. "It's a very lively line of work right now." But, Melo adds, "it's a very complex model. There's not much hope of computing an exact solution except for very, very, very small problems." Melo says that while other researchers have performed theoretical analyses of the complexity of the model, and still others have attempted to find practical algorithms that yield approximations of the ideal policy, Oliehoek's work combines the virtues of both lines of research. "I think that Frans' work is all trying to, from a theoretical point of view, understand, if we actually want to do planning, what sorts of structures can we explore?" Melo says. "And that is also useful when you're trying to make approximate algorithms. So I think that his contributions were important."
This story is republished courtesy of MIT News, a popular site that covers news about MIT research, innovation and teaching.
Provided by Massachusetts Institute of Technology