Kevin Carr, an undergraduate in electrical and computer engineering, adjusts one of the transmitter modules that is used to wirelessly link five units being tested in a self-healing computer system. The unit to the left of the transmitter is a Field Programmable Gate Array. Other transmitters and FPGAs can be seen in the background. (Photo by Matt Brailey)
We've all heard about the space missions that are DOA when NASA engineers lose touch with the spacecraft or lander. In other cases, some critical system fails and the mission is compromised.
Both are maddening scenarios because the spacecraft probably could be easily fixed if engineers could just get their hands on the hardware for a few minutes.
Ali Akoglu and his students at The University of Arizona are working on hybrid hardware/software systems that one day might use machine intelligence to allow the spacecraft to heal themselves.
Akoglu, an assistant professor in electrical and computer engineering, is using Field Programmable Gate Arrays, or FPGAs, to build these self-healing systems. FPGAs combine software and hardware to produce flexible systems that can be reconfigured at the chip level.
Because some of the hardware functions are carried out at the chip level, the software can be set up to mimic hardware. In this way, the FPGA “firmware” can be reconfigured to emulate different kinds of hardware.
Speed vs. Flexibility
Akoglu explains it this way: There are general-purpose systems, like your desktop computer, which can run a variety of applications. Unfortunately, even with 3 GHz, dual-core processors, they’re extremely slow compared with hardwired systems.
With hardwired systems, the hardware is specific to the purpose. As an example, engineers could build a very fast system that would run Microsoft Word but nothing else. It couldn’t run Excel or any other application. But it would be super fast at what it’s designed for.
“In that case, you have an extremely fast system, but it’s not adaptable,” Akoglu explained. “When new and better software comes along, you have to go back into the design cycle and start building hardware from scratch.”
“What we need is something in the middle that is the best of both worlds, and that’s what I’m trying to come up with using Field Programmable Arrays,” he said.
Work on the self-healing systems began in 2006 as a project in Akoglu’s graduate-level class. His students presented a paper on the system and sparked interest from NASA, which eventually provided an $85,000 grant to pursue the work.
Akoglu and his students now are in the second phase of the project, which is called SCARS (Scalable Self-Configurable Architecture for Reusable Space Systems). The project is being carried out in collaboration with the Jet Propulsion Laboratory.
Currently, they are testing five hardware units that are linked together wirelessly. The units could represent a combination of five landers and rovers on Mars, for instance.
“When we create a test malfunction, we try to recover in two ways,” he explained. “First, the unit tries to heal itself at the node level by reprogramming the problem circuits.”
If that fails, the second step is for the unit to try to recover by employing redundant circuitry. But if the unit’s onboard resources can’t fix the problem, the network-level intelligence is alerted. In this case, another unit takes over the functions that were carried out by the broken unit.
“The second unit reconfigures itself so it can carry out both its own tasks and the critical tasks from the broken unit,” Akoglu explained.
If two units go down and can’t fix themselves, the three remaining units split up the tasks. All of this is done autonomously without human aid.
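The escalation described above can be sketched in a few lines of Python. This is only an illustration, not the SCARS implementation: the Unit class, the reprogram_ok and spare_ok flags, and the round-robin hand-off of tasks to surviving units are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical stand-in for one wirelessly linked hardware unit."""
    name: str
    tasks: list = field(default_factory=list)
    reprogram_ok: bool = True  # assumed: would node-level reprogramming work?
    spare_ok: bool = True      # assumed: would redundant circuitry work?
    healthy: bool = True


def recover(unit, network):
    """Try each recovery level in order and report which one succeeded."""
    # Step 1: heal at the node level by reprogramming the problem circuits.
    if unit.reprogram_ok:
        return "reprogrammed"
    # Step 2: fall back on the unit's redundant circuitry.
    if unit.spare_ok:
        return "redundant"
    # Step 3: onboard resources failed, so the network takes over.
    unit.healthy = False
    survivors = [u for u in network if u.healthy]
    if not survivors:
        return "lost"
    orphaned, unit.tasks = unit.tasks, []
    # Assumed policy: hand the orphaned tasks out round-robin.
    for i, task in enumerate(orphaned):
        survivors[i % len(survivors)].tasks.append(task)
    return "network"
```

Calling recover on a unit whose two onboard steps both fail moves its tasks onto the healthy units in the list. How the real system chooses which unit takes which task is not described in the article.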
Lightning-Fast Processing
Because FPGAs can be programmed to carry out tasks simultaneously, they also can be configured to do lightning-fast processing.
“So if you’re running a loop, and it is running 10,000 times, you can replicate the loop as a processing element in the FPGA ‘n’ number of times,” Akoglu explained. “That means you have an ‘n’ times speed-up.” It’s like creating a huge multicore processor configured for a specific task.
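The arithmetic behind that speed-up can be checked with a short Python sketch. It rests on assumptions the article does not state: every loop iteration is independent, each takes the same amount of time, and there is no overhead for distributing work or collecting results. The function name ideal_speedup is made up for this example.

```python
import math


def ideal_speedup(iterations, n):
    """Best-case speed-up when a loop is copied onto n processing elements.

    Assumes independent, equal-cost iterations and zero overhead.
    """
    # The slowest element handles the rounded-up share of the iterations.
    per_element = math.ceil(iterations / n)
    return iterations / per_element
```

With 10,000 iterations and 100 processing elements, each element runs 100 iterations, so the ideal speed-up is exactly 100. When n does not divide the iteration count evenly, the result falls slightly below n.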
FPGAs traditionally have been used for prototyping circuits because their firmware can be reprogrammed. Rather than creating costly circuits in hardware, engineers can test their ideas quickly and inexpensively in FPGA firmware.
In the past five years, the amount of circuitry that can be crammed into FPGAs has increased dramatically, promoting them from simple test-beds to end products in themselves, Akoglu explained.
The Ridgetop Group, a Tucson company that specializes in diagnosing circuit faults using statistical methods, now is working with Akoglu on the self-healing systems.
“This is the next phase of our project,” Akoglu said. “Our objective is to go beyond predicting a fault to using a self-healing system to fix the predicted fault before it occurs.” This could lead to extremely stable computer systems that could operate for long periods without failure.
Source: University of Arizona