Engineering of Adaptive Resilient Systems
Develop technologies to build systems that adapt intelligently
to become resilient to change.
Resilience against attack-induced failures
Cyber attacks
on computer-based systems need no introduction. What might be somewhat
counterintuitive to non-practitioners is that it is impossible to build
a perfectly secure system, one where no attack will succeed, that still
does meaningful work (which, these days, means that it is connected to
a network, interoperates with multiple computational elements, and
handles multiple sources of possibly large, structured and unstructured
data). It is impossible even to detect all attacks in time.
Since 1999 we have been espousing an approach where the focus is not
absolute prevention or complete detection, but rather
tolerance and resilience. In the earliest days,
we developed techniques to enable an
application to participate in its own defense and validated the
concept of defense-enabling. We then expanded the range of
attack effects a defense-enabled application can
tolerate to include both
crash and Byzantine failures. The work led to the principles of
survivability architecture: guidance for designing and building systems
that tolerate cyber attacks. We developed and demonstrated
the high-water mark in survivable
distributed systems in the early 2000s.
A major factor contributing to survivability is adaptation, without
which a system is a sitting duck that can be taken down by
the same attack over and over again. While our initial focus for
adaptation in cyber-defense was gaining time and graceful degradation,
we soon started researching
how to be smart about adaptive decision making and how to change
the curve, i.e., instead of degrading, how
to improve the defense. As many of the concepts and techniques we
proposed, including adaptive response, became more mainstream,
attackers have also evolved to focus on zero-day attacks. But we have
an answer to that as well!
Along the way, we also demonstrated how our techniques can address
the pressing problems of the day. We developed an
anti-phishing solution
in 2006-7, a solution to
defend SOA systems against cyber attacks in 2009-10,
and a solution to preserve privacy
in a pub-sub middleware in 2011-12.
Accommodating changes in systems resources and ecosystem
As in life, change is inevitable in the lifetime of a computer system.
That would be true even if there were no cyber attacks! As a
matter of fact, the reason we started looking into adaptive
distributed systems is that the operating environment of a networked
distributed system is unpredictable: you can never be sure how
much network bandwidth you will get or how many requests will hit your
server. More importantly, you can be assured that the deployed situation
will not resemble the development environment, so any
expectation of the quality of service provided by your system based
on tests and evaluations done in the development setting will
quickly become invalid. As a result, mission-critical systems often
fail to perform at the level desired of them.
We first started looking at this problem around 1995, and developed
Quality Objects (QuO),
the first Quality of Service (QoS) aware
distributed objects middleware. QuO was used to integrate
network
resource management and
replication management as means to develop
QoS-aware applications that can adapt to changes in available
resources in the operating environment. The QoS management techniques
developed in these projects were used for
QoS-aware information
dissemination and are being
used and enhanced in our
mission-oriented Information Management work.
We know we can engineer QoS- and resource-aware adaptive applications.
While these systems adapt at runtime, their development is mostly
manual. And the universe of resources that an application program
depends on is expanding well beyond the traditional system
resources such as memory,
CPU and network bandwidth to include hardware, OS, libraries, other
services, etc. Any change in the application's ecosystem may cause a
costly rewrite. Some of our
current work is exploring a potential solution.
Current Projects
Past Projects