There is a certain irony in the fact that the Trustworthy Computing initiative traces back to security and privacy flaws within Microsoft products. Microsoft, of course, has been the main proponent of the concept of “Trustworthy Computing” in the information market, and it outlined the framework for developing a secure computing environment.
How Trustworthy Computing will be ultimately achieved has yet to be seen. While some vendors are developing their own proprietary solutions, many build their strategies based on the specifications provided by the industry standards body Trusted Computing Group (TCG).
TCG’s membership includes nearly all the major proprietary companies in the computer industry, including hardware developers like Intel and AMD, computer manufacturers such as Dell and Hewlett-Packard, and software companies like Microsoft and Symantec. More notable than TCG’s members are its non-members. Apart from Sun Microsystems, there are no major open source-based companies on TCG’s membership list.
This paper critically analyzes current Trustworthy Computing initiatives and considers Trustworthy Computing in the open source space to show that the move towards Trustworthy Computing is a necessary yet lofty, if not unreachable, goal that may ultimately be nothing more than a scheme to improve public relations for proprietary computing companies.
1.1 History of Trustworthy Computing
Society has gone through a number of large technology shifts that have shaped the culture: the agrarian revolution, the invention of metalworking, the industrial revolution, the advent of electricity, telephony and television—and, of course, the microprocessor that made personal computing a reality. Each of these fundamentally transformed the way billions of people live, work, communicate, and are entertained.
Personal computing has so far only really been deployed against white-collar work problems in the developed world. (Larger computer systems have also revolutionized manufacturing processes.) However, the steady improvement in technology and lowering of costs means that personal computing technology will ultimately become a building block of everybody's home and working lives, not just those of white-collar professionals.
Progress in computing in the last quarter century is akin to the first few decades of electric power. Electricity was first adopted in the 1880s by small, labor-intensive businesses that could leverage the technology's fractional nature to increase manufacturing productivity (that is, a single power supply was able to power a variety of electric motors throughout a plant). In its infancy, electricity in the home was a costly luxury, used by high-income households largely for powering electric lights. There was also a good deal of uncertainty about the safety of electricity in general and appliances in particular. Electricity was associated with lightning, a lethal natural force, and there were no guarantees that sub-standard appliances wouldn't kill their owners.
Between 1900 and 1920 all that changed. Residents of cities and the fast-growing suburbs had increasing access to a range of energy technologies, and competition from gas and oil pushed down electricity prices. A growing number of electric-powered, labor-saving devices, such as vacuum cleaners and refrigerators, meant that households were increasingly dependent on electricity. Marketing campaigns by electricity companies and the emergence of standards marks (for example, Underwriters' Laboratories (UL) in the United States) allayed consumer fears. The technology was not wholly safe or reliable, but at some point in the first few years of the 20th century, it became safe and reliable enough.
In the computing space, we're not yet at that stage; we're still in the equivalent of electricity's 19th century industrial era. Computing has yet to touch and improve every facet of our lives—but it will. A key step in getting computing to the point where people would be as happy to have a microprocessor in every device as they are to rely on electricity will be achieving the same degree of relative trustworthiness. "Relative," because 100% trustworthiness will never be achieved by any technology—electric power supplies surge and fail, water and gas pipes rupture, telephone lines drop, aircraft crash, and so on.
2.1 WHY TRUST?
While many technologies that make use of computing have proven themselves extremely reliable and trustworthy—computers helped transport people to the moon and back, they control critical aircraft systems for millions of flights every year, and they move trillions of dollars around the globe daily—they generally haven't reached the point where people are willing to entrust them with their lives, implicitly or explicitly. Many people are reluctant to entrust today's computer systems with their personal information, such as financial and medical records, because they are increasingly concerned about the security and reliability of these systems, which they view as posing significant societal risk. If computing is to become truly ubiquitous—and fulfill the immense promise of technology—we will have to make the computing ecosystem sufficiently trustworthy that people don't worry about its fallibility or unreliability the way they do today.
Trust is a broad concept, and making something trustworthy requires a social infrastructure as well as solid engineering. All systems fail from time to time; the legal and commercial practices within which they're embedded can compensate for the fact that no technology will ever be perfect.
Hence this is not only a struggle to make software trustworthy; because computers have to some extent already lost people's trust, we will have to overcome a legacy of machines that fail, software that fails, and systems that fail. We will have to persuade people that the systems, the software, the services, the people, and the companies have all, collectively, achieved a new level of availability, dependability, and confidentiality. We will have to overcome the distrust that people now feel for computers.
The Trustworthy Computing Initiative is a label for a whole range of advances that have to be made for people to be as comfortable using devices powered by computers and software as they are today using a device that is powered by electricity. It may take us ten to fifteen years to get there, both as an industry and as a society.
This is a "sea change" not only in the way we write and deliver software, but also in the way our society views computing generally. There are immediate problems to be solved, and fundamental open research questions. There are actions that individuals and companies can and should take, but there are also problems that can only be solved collectively by consortia, research communities, nations, and the world as a whole.
2.2 THE NEED FOR TRUSTWORTHY COMPUTING
Current statistics show the obvious need for a true Trustworthy Computing architecture. The United States Computer Emergency Readiness Team (CERT) cites that in June 2004 alone there were 56,034,751 reported incidents of intruders in various information systems. These incidents include the use of malicious code (i.e., viruses and worms), denial-of-service (DoS) attacks, user/root compromise, etc.
The success of these attacks stemmed from the fact that these systems contained vulnerabilities – fundamental flaws in the system’s design, implementation, or configuration that attackers can recognize and exploit. The job of information security professionals includes mitigating the risks of these vulnerabilities by reducing or eliminating the probability of attack while limiting the impact of a successful attack through various security measures. Each security measure implemented within a system detracts from the functionality of that system. Security managers have to balance security and functionality, knowing that the only completely secure machine is one that has been completely removed from production. Trustworthy Computing promises to become information security’s savior by eliminating the rift between functionality and security, allowing a system to maximize both simultaneously.
The need for higher security in current systems is substantial; however, in order for computing to reach its full potential, a new level of dependability, beyond merely compensating for known flaws, must be reached and successfully conveyed to the public. Trustworthy Computing is often associated with the potential of reaching a pervasive (a.k.a. ubiquitous) computing environment. Conceptually, pervasive computing refers to the eventuality that computers will enter into nearly every aspect of daily life, with nearly everything around us (cars, tools, appliances, and other computing devices) equipped with some kind of embedded processing chip and networking capability, so that it can send and receive data and react instantly to its environment.
Pervasive computing depends on the success of Trustworthy Computing. Computing devices have a legacy of failures from hardware, software, and malicious attacks that must be overcome before people will place enough faith in the reliability of computing systems to allow them into every facet of their daily lives; such faith is a fundamental prerequisite for pervasive computing. Pervasive computing is considered the full realization of computer technology’s potential, but before it can exist, consumers must be comfortable allowing computers to penetrate every aspect of daily existence.
Since Trustworthy Computing has both technical and social criteria, it makes for a broad concept that not only requires advances in engineering but also acceptance in society. A system deemed technically trustworthy perpetually functions in the expected, designed manner while maintaining the integrity, confidentiality, and availability of the data within. The achievement of social trust relies upon users reaching a consensus of confidence that the system will perform as desired without losing or divulging any personal or otherwise sensitive data to unauthorized parties.
Microsoft recognized the need for Trustworthy Computing, knowing that it would take years for both technology and society to reach the point where it would become reality. Microsoft has considerable distance to cover in order for their products to be deemed trustworthy. A recent study done by MI2G states that, “recent global malware epidemics have primarily targeted the Windows computing environment and have not caused any significant economic damage to environments running Open Source including Linux, BSD and Mac OS X. When taking the economic damage from malware into account over the last twelve months, including the impact of MyDoom, NetSky, SoBig, Klez and Sasser, Windows has become the most breached computing environment in the world accounting for most of the productivity losses associated with malware - virus, worm and Trojan - proliferation”.
Microsoft’s control, however, encompasses only its own products, and since Trustworthy Computing requires all components of a system to achieve and maintain trustworthy status, Microsoft started the Trusted Computing Group with other industry leaders to use their combined market power to push their vision of Trustworthy Computing.
EXISTING WORK IN THIS FIELD
3.1 TRUSTWORTHY COMPUTING
The four pillars of TWC, namely Security, Privacy, Reliability, and Business Integrity, as illustrated below (Table A), form the framework of TWC. These goals form the basis of trust in any business. All of these goals raise issues related to engineering, business practices, and public perceptions, although not all to the same degree. They are goals from a user's point of view.
Microsoft CTO and Senior Vice President Craig Mundie authored a whitepaper in 2002, defining the framework of the Trustworthy Computing program. Four areas were identified as the initiative’s key “pillars”. Microsoft has subsequently organized its efforts to align with these goals. These key activities are set forth as:
Each pillar describes part of the basis for a customer's decision to trust a system:

Security: The customer can expect that systems are resilient to attack, and that the confidentiality, integrity, and availability of the system and its data are protected.

Privacy: The customer is able to control data about themselves, and those using such data adhere to fair information principles.

Reliability: The customer can depend on the product to fulfill its functions when required to do so.

Business Integrity: The vendor of a product behaves in a responsive and responsible manner.

Table A: The four pillars of Trustworthy Computing
By reliable, I mean the system computes the right thing. In theory, reliable means the system meets its specification—it is “correct.” In practice, because we often have no specifications at all, or the specifications are incomplete, inconsistent, imprecise, or ambiguous, reliable means the system meets the user’s expectations—it is predictable, and when it fails, no harm is done.
We have made tremendous technical progress in making our systems more reliable. For hardware systems, we can use redundant components to ensure a high-degree of predictive fail-safe behavior. For software systems, we can use advanced programming languages and tools and formal specification and verification techniques to improve the quality of our code. What is missing? Here are some starters:
Science of software design. While we have formal languages and techniques for specifying, developing, and analyzing code, we do not yet have a similar science for software design. To build trustworthy software, we need to identify software design principles (e.g., Separation of Concerns) with security in mind and to revisit security design principles (e.g., Principle of Least Privilege) with software in mind.
Compositional reasoning techniques. Today’s systems are made up of many software and hardware components. While today’s attacks exploit bugs in individual software components, tomorrow’s attacks will likely exploit mismatches between independently designed components. I call these “design-level vulnerabilities” because the problem is above the level of code: even if the individual components are implemented correctly, emergent abusive behavior can result when they are put together. We need ways to compose systems that preserve the properties of the individual components or, when put together, achieve a desired global property. Similarly, we need ways to detect and reason about emergent abusive behavior.
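A minimal Python sketch of such a design-level mismatch, using two hypothetical components: each is arguably correct in isolation, but composing them in the wrong order reintroduces the very character the sanitizer removed.

```python
from urllib.parse import unquote

def escape_quotes(s: str) -> str:
    # Component A, correct in isolation: double single quotes so the
    # string can be embedded safely in a SQL literal.
    return s.replace("'", "''")

def normalize(s: str) -> str:
    # Component B, correct in isolation: decode percent-encoded input.
    return unquote(s)

payload = "%27 OR 1=1 --"

# Safe composition: decode first, then escape.
safe = escape_quotes(normalize(payload))

# Mismatched composition: escaping runs before decoding, so the
# decoded quote reappears unescaped -- a design-level vulnerability
# even though neither component contains a code-level bug.
unsafe = normalize(escape_quotes(payload))

assert "'" not in safe.replace("''", "")  # every quote is doubled
assert "'" in unsafe                      # a raw quote survived
```

Neither function would be flagged by a per-component review; the flaw exists only in the ordering of the composition.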
Software metrics. While we have performance models and benchmark suites for computer hardware and networks, we do not for software. We need software metrics that allow us to quantify and predict software reliability. When is a given system “good enough” or “safe enough”?
· Accuracy: The design of a system includes RAID arrays, sufficient redundancy, and other means to reduce loss or corruption of data.
By secure, I mean the system is not vulnerable to attack. In theory, security can be viewed as another kind of correctness property. A system, however, is vulnerable at least under the conditions for which its specification is violated and more subtly, under the conditions that the specification does not explicitly cover. In fact, a specification’s precondition tells an attacker exactly where a system is vulnerable.
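A toy Python illustration of the point about preconditions (the function and its inputs are hypothetical): the unchecked precondition in the docstring is precisely a map of where the function is vulnerable.

```python
def lookup(table, i):
    """Return entry i of table.

    Precondition (stated in the spec but NOT enforced): 0 <= i < len(table).
    Everything outside that range is behavior the specification does not
    cover -- exactly where an attacker will probe.
    """
    return table[i]

secrets = ["alice", "bob"]

# Violating the precondition does not always fail loudly: in Python a
# negative index silently reads a valid-but-unintended entry.
assert lookup(secrets, -1) == "bob"   # attacker reads the last entry

def lookup_checked(table, i):
    # Defensive variant: the precondition becomes an enforced check,
    # closing the gap between specified and actual behavior.
    if not 0 <= i < len(table):
        raise IndexError("index out of range")
    return table[i]
```

The checked variant shrinks the unspecified region rather than trusting callers to respect it.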
One way to view the difference between reliability and security is given by Whittaker and Thompson in How to Break Software Security, Pearson 2004, in this Venn diagram:
The two overlapping circles combined represent system behavior. The left circle represents desired behavior; the right, actual. The intersection of the two circles represents behavior that is correct (i.e., reliable) and secure. The left shaded region represents behavior that is not implemented or implemented incorrectly, i.e., where traditional software bugs are found. The right shaded region represents behavior that is implemented but not intended, i.e., where security vulnerabilities lie.
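The three regions of that diagram can be made concrete with a small, hypothetical Python example: a check intended to match only the exact path "/admin".

```python
def is_admin(path: str) -> bool:
    # Intended behavior: grant admin only for the exact path "/admin",
    # matched case-insensitively. Actual implementation: a prefix test.
    return path.startswith("/admin")

# Intersection: behavior both intended and implemented.
assert is_admin("/admin") is True

# Right shaded region: implemented but not intended --
# a security vulnerability rather than a classic bug report.
assert is_admin("/administrator-notes") is True  # unintended access

# Left shaded region: intended but not implemented --
# the (assumed) spec wants case-insensitive matching; a traditional bug.
assert is_admin("/Admin") is False               # desired True
```

Traditional testing hunts the left region; security review must also hunt the right one, where the program quietly does more than it was asked to.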
As with reliability, for security we think in terms of modeling, prevention, detection, and recovery. We have some (dated) security models, such as Multi-Level Security and Bell-LaPadula, to state desired security properties, and we use threat modeling to think about how attackers can enter our systems. We have encryption protocols that allow us to protect our data and our communications; we have mechanisms such as network firewalls to protect our hosts from some attacks. We have intrusion detection systems to detect (though still with high false-positive rates) when our systems are under attack; we have code analysis tools to detect buffer overruns. Our recovery mechanisms today rely on aborting the system (e.g., in a denial-of-service attack) or installing software patches.
We need more ways to prevent and protect our systems from attack. It’s more cost-effective to prevent an attack than to detect and recover from it. There are two directions in which we would like to see the security community steer their attention:
(a) design-level, not just code-level, vulnerabilities; and
(b) software, not just computers and networks.
· Secure by Design: architecture might be designed to use triple-DES encryption for sensitive data such as passwords before storing them in a database, and the use of the SSL protocol to transport data across the Internet. All code is thoroughly checked for common vulnerabilities using automatic or manual tools. Threat modeling is built into the software design process.
· Secure by Default: Software is shipped with security measures in place and potentially vulnerable components disabled.
· Secure by Deployment: Security updates are easy to find and install—and eventually install themselves automatically—and tools are available to assess and manage security risks across large organizations.
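As a sketch of the “secure by design” principle of never storing sensitive data in plaintext: the text above mentions triple-DES, but for passwords specifically a salted, iterated hash is the usual modern stand-in. The following stdlib-only Python illustration is hypothetical (the function names and iteration count are illustrative, not Microsoft's implementation).

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor

def store_password(password: str) -> tuple[bytes, bytes]:
    # Derive a slow, salted hash so a stolen database does not
    # reveal the plaintext password.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(candidate, digest)

salt, digest = store_password("hunter2")
assert verify_password("hunter2", salt, digest)
assert not verify_password("wrong-guess", salt, digest)
```

The design decision is made before any code is written: the schema simply has no column for a plaintext password, so a whole class of disclosure bugs is excluded by construction.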
By privacy, I mean the system must preserve users’ identity and protect their data. Much past research in privacy addresses non-technical questions with contributions from policymakers and social scientists. I believe that privacy is the next big area related to security for technologists to tackle. It is time for the technical community to address some fundamental questions such as:
· What does privacy mean?
· How do you reason about privacy? How do you resolve conflicts among different privacy policies?
· Are there things that are impossible to achieve with respect to some definition of privacy?
· How do you implement practical mechanisms to enforce different privacy policies, even as they change over time?
· How do you design and architect a system with privacy in mind?
· How do you measure privacy?
My call to arms is to the scientists and engineers of complex software systems to start paying attention to privacy, not just reliability and security. To the theoretical community—to design provably correct protocols that preserve privacy for some formal meaning of privacy, to devise models and logics for reasoning about privacy, to understand what is or is not impossible to achieve given a particular formal model of privacy, to understand more fundamentally what the exact relationship is between privacy and security, and to understand the role of anonymity in privacy (when is it inherently needed and what is the tradeoff between anonymity and forensics).
To the software engineering community - to think about software architectures and design principles for privacy. To the systems community - to think about privacy when designing the next network protocol, distributed database, or operating system. To the artificial intelligence community—to think about privacy when using machine learning for data mining and data fusion across disjoint databases. How do we prevent unauthorized reidentification of people when doing traffic and data analysis? To researchers in biometrics, embedded systems, robotics, sensor nets, ubiquitous computing, and vision—to address privacy concerns along with the design of their next-generation systems.
· Privacy/Fair Information Principles: Users are given appropriate notice of how their personal information may be collected and used; they are given access to view such information and the opportunity to correct it; data is never collected or shared without the individual's consent; appropriate means are taken to ensure the security of personal information; external and internal auditing procedures ensure compliance with stated intentions.
· Manageability: The system is designed to be as self-managing as practicable. Hot fixes and software updates can be installed with minimal user intervention.
4. Business Integrity
By Business Integrity, I mean the system has to be usable by human beings. To ensure reliability, security, and privacy, we often need to trade user convenience against user control. There are also tradeoffs we often need to make when looking at these four properties in different combinations.
Consider usability and security. Security is only as strong as the system's weakest link. More often than not that weakest link involves the system's interaction with a human being. Whether the problem is with choosing good passwords, hard-to-use user interfaces, complicated system installation and patch management procedures, or social engineering attacks, the human link will always be present.
Similarly consider usability and privacy. We want to allow users to control access, disclosure, and further use of their identity and their data; yet we do not want to bombard them with pop-up dialog boxes asking for permission for each user-system transaction. Similar remarks can be made for tradeoffs between usability and reliability (and for that matter, security and privacy, security and reliability, etc.).
Fortunately, the human-computer interaction community is beginning to address issues of usable security and usable privacy. We need to design user interfaces to make security and privacy both less obtrusive to and less intrusive on the user. As computing devices become ubiquitous, we need to hide complicated security and privacy mechanisms from the user but still provide user control where appropriate. How much of security and privacy should we and can we make transparent to the user?
We also should turn to the behavioral scientists to help the computer scientists. Technologists need to design systems to reduce their susceptibility to social engineering attacks. Also, as the number and nature of attackers change in the future, we need to understand the psychology of the attacker: from script kiddies to well-financed, politically motivated adversaries. As biometrics become commonplace, we need to understand whether and how they help or hinder security (perhaps by introducing new social engineering attacks)—and, similarly, whether they help or hinder privacy.
This Business Integrity problem occurs at all levels of the system: at the top, users who are not computer savvy but interact with computers for work or for fun; in the middle, users who are computer savvy but do not and should not have the time or interest to twiddle with settings; at the bottom, system administrators who have the unappreciated and scary task of installing the latest security patch without being able to predict the consequence of doing so. We need to make it possible for normal human beings to use our computing systems easily, but with deserved trust and confidence.
· Usability: The user interface is uncluttered and intuitive. Alerts and dialog boxes are helpful and appropriately worded.
· Responsiveness: Quality-assurance checks occur from early on in a project. Management makes it clear that reliability and security take precedence over feature richness or ship date. Services are constantly monitored and action is taken whenever performance doesn't meet stated objectives.
· Transparency: Contracts between businesses are framed as win-win arrangements, not an opportunity to extract the maximum possible revenue for one party in the short term. The company communicates clearly and honestly with all its stakeholders.
3.2 THE MOVE TOWARDS TRUSTWORTHY COMPUTING
In their white paper on Trustworthy Computing, Microsoft compares their initiative to other major technology shifts in history like the industrial revolution and the advent of electricity. The comparison is furthered when Microsoft discusses the future of computing as a dependable utility like electricity or telephony. These older technologies are held up as examples of trustworthy computing achieved, and the white paper states, “the Trustworthy Computing Initiative is a label for a whole range of advances that have to be made for people to be as comfortable using devices powered by computers and software as they are today using a device that is powered by electricity”. This exemplifies the ultimate goal of Trustworthy Computing: to reach a state where computing technology is as reliable and dependable as today’s most common utilities.
Microsoft outlines four fundamental goals for Trustworthy Computing to ensure the reliability, confidentiality, and integrity of the system and its data. These goals are: Security, Privacy, Reliability, and Business Integrity. Security means that the system is protected from outside attacks like malware and DoS. Privacy is defined as the user’s ability to control who can access their information as well as ensuring that those who can do not misuse that information. Reliability means that a computer will run as desired and is only unavailable when it is expected to be. Business Integrity is the vendor’s duty to respond and behave in a responsible manner.
The goals specified by Microsoft encompass a very broad area that goes beyond Microsoft’s direct control. One of the main reasons systems fail to meet these goals today is their complexity. Many systems are heterogeneous, with varying operating systems, applications, and other devices that all interact. A flaw in a single entity within such a system may result in a compromise of all the data in the system. For a truly trustworthy system to exist, all parts of that system must adhere to the same fundamental framework.
TCG is a consortium of system manufacturers, device manufacturers, software vendors and others that act as a standards body for the development of open standards that will allow computing manufactures to develop secure components that will be able to interoperate in a potentially trustworthy environment.
3.3 THE TRUSTED COMPUTING GROUP STANDARD
Microsoft helped found the TCG (formerly known as the Trusted Computing Platform Alliance) to ensure the creation of open standards that can be developed upon, so that all computing devices that follow TCG’s specification can be identified as trustworthy. The specification revolves around the Trusted Platform Module (TPM); the latest release of TCG’s main specification was published in February 2002.
The TPM is a hardware-specific device intended to protect the data on the computing device and to authorize which applications can access which data. The specification also defines the software stack, allowing software developers to utilize the TPM in operating systems as well as other applications. The TPM also provides a means to determine the trustworthiness of foreign devices that integrate a TPM. Many vendors have already begun to design and develop on top of the TPM specification. Intel includes TPM support in its LaGrande Technology (LT). Additionally, Microsoft plans to utilize the security features provided by the TPM in its Next Generation Secure Computing Base (NGSCB). This paper concentrates on LT and NGSCB to provide a hardware and software perspective on the TCG specification, as well as a high-level overview of the proposed architecture.
(a)Intel’s LaGrande Technology
As shown in Figure 1, LT incorporates TPM in order to protect critical data and processes from the operating system and other applications. The “Standard Partition” in the environment represents the current standard in computing. Intel separates the kernel, certain processes, inputs and graphics and uses TPM to authenticate and authorize applications to access requested resources.
Figure 1 - Intel's LaGrande Architecture
LT provides security within the processor and chipset as a hardware solution based on the TPM specification defined by TCG. For LT to be utilized in a computing platform, the TPM software stack must be included to mediate interaction between the hardware, the operating system, and other applications.
(b)Microsoft’s Next Generation Secure Computing Base
Microsoft’s NGSCB (formerly codenamed Palladium) utilizes the security functionality specified by the TCG and builds further security components into the Windows operating system. A major new component NGSCB plans to add to Windows is called “the nexus.” The nexus is comparable to a secure kernel running alongside the normal Windows kernel, creating a parallel execution environment beside the typical operating system. The nexus effectively decides, at the operating system level, which applications can access which data, as defined by the end user. NGSCB relies on the nexus to carry out its functions, including the major features Microsoft plans to include: strong process isolation, sealed storage, a secure path to and from the user, and attestation.
Strong process isolation provides users with a means to confine memory so that the data cannot be altered except by the nexus-enabled application that last saved the information, or by other applications the user has authorized. Sealed storage refers to the cryptographic method used by nexus-aware applications to ensure that stored data is read only by authorized applications. The secure path to and from the user provides a protected channel that carries data from the mouse and keyboard to the computer, transfers data between nexus-aware applications, and carries output from the computer to the video display. Attestation refers to the user’s ability to dynamically allow applications and processes to access protected data.
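Sealed storage can be sketched in miniature. The following Python toy is not the actual NGSCB or TPM API; every name is invented, and the XOR keystream is for demonstration only. It shows the core idea: the sealing key is derived from both a device secret and a measurement of the running software, so unsealing fails if the measured software changes.

```python
import hashlib
import hmac
import os

DEVICE_SECRET = os.urandom(32)  # stands in for a key kept inside the TPM

def measure(software: bytes) -> bytes:
    # Hash of the loaded software, analogous to a platform measurement.
    return hashlib.sha256(software).digest()

def seal(data: bytes, measurement: bytes) -> bytes:
    # Key depends on the device secret AND the software measurement, so
    # only the same measured configuration can recover the data.
    key = hmac.new(DEVICE_SECRET, measurement, hashlib.sha256).digest()
    stream = hashlib.sha256(key + b"stream").digest()
    body = bytes(a ^ b for a, b in zip(data, stream))  # demo cipher only
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body

def unseal(blob: bytes, measurement: bytes) -> bytes:
    key = hmac.new(DEVICE_SECRET, measurement, hashlib.sha256).digest()
    tag, body = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest()):
        raise PermissionError("software state changed; refusing to unseal")
    stream = hashlib.sha256(key + b"stream").digest()
    return bytes(a ^ b for a, b in zip(body, stream))

good = measure(b"trusted-app-v1")
blob = seal(b"secret key", good)
assert unseal(blob, good) == b"secret key"
try:
    unseal(blob, measure(b"tampered-app"))
except PermissionError:
    pass  # unsealing is refused under a different measurement
```

The real TPM design binds keys to platform configuration registers rather than a single hash, but the refusal behavior under a changed measurement is the same in spirit.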
Figure 2 shows the architecture of NGSCB. Many similarities exist between Intel’s LT and Microsoft’s NGSCB. The architecture depends on the segregation between today’s typical computing architecture and the secure component, in this case the Nexus Module. This diagram displays more details on the software side of a TCG specification implementation. The “hardware” segment of the diagram corresponds perfectly with LT. The Security Support Component (SSC) at the bottom of the diagram is Microsoft’s name for TCG’s TPM. As stated by Microsoft, “The upcoming version of the TPM (version 1.2) is expected to serve as the SSC in the NGSCB architecture”.
(c)Open Source Trust
Very few open source companies have publicly announced a Trustworthy Computing initiative. Sun Microsystems is the sole major open source-based company involved with the TCG, and even Sun is not truly open source.
Open source software developers do not yet have an urgent need to develop a TCG-compliant Linux or UNIX platform. As of right now, open source solutions are the more trusted products when compared to their proprietary counterparts. However, since the TCG standard is open, when hardware components become publicly available, development of an open source trustworthy product will proceed rapidly. The software stack for the OS to interoperate with the TPM and the secure chipset has been made publicly available on the TCG website, thus allowing open source developers to integrate the TCG standard into their operating systems whenever they see fit to do so.
While the initiatives undertaken by the TCG, Microsoft, and other hardware and software vendors drastically increase security compared to the computer systems available today, they rest on flawed assumptions and thus fail to account for many of the underlying issues that plague information security. Furthermore, current Trustworthy Computing initiatives fail to address two of the three areas of network security responsibility. Finally, the architecture of TCG-compliant systems introduces the potential for new and devastating attacks on these trustworthy computing platforms.
Once a technology has become an integral part of how society operates, that society will be more involved in its evolution and management. This has happened in railways, telecommunications, TV, energy, etc. Society is only now coming to grips with the fact that it is critically dependent on computers.
We are entering an era of tension between the entrepreneurial energy that leads to innovation and society's need to regulate a critical resource despite the risk of stifling competition and inventiveness. This is exacerbated by the fact that social norms and their associated legal frameworks change more slowly than technologies. The computer industry must find the appropriate balance between the need for a regulatory regime and the impulses of an industry that has grown up unregulated and relying upon de facto standards.
Many contemporary infrastructure reliability problems are really policy issues. The state of California's recent electricity supply crisis was triggered largely by a bungled privatization. The poor coverage and service of US cellular service providers is due in part to the FCC's policy of not granting nationwide licenses. These policy questions often cross national borders, as illustrated by the struggle to establish global standards for third-generation cellular technologies. Existing users of spectrum (often the military) occupy different bands in different countries, and resist giving them up, making it difficult to find common spectrum worldwide.
We are seeing the advent of mega-scale computing systems built out of loose affiliations of services, machines, and application software. The emergent (and very different) behavior of such systems is a growing long-term risk.
An architecture built on diversity is robust, but it also operates on the edge of chaos. This holds true in all very-large-scale systems, from natural systems like the weather to human-made systems like markets and the power grid. All the previous mega-scale systems that we've built—the power grid, the telephone systems—have experienced unpredicted emergent behavior. That is why the power grid failed in 1965, cascading down the whole East Coast of the United States, and that is why whole cities occasionally drop off the telephone network when somebody implements a bug fix on a single switch. The complexity of the system has outstripped the ability of any one person—or any single entity—to understand all of the interactions.
Incredibly secure and trustworthy computer systems exist today, but they are largely independent, single-purpose systems that are meticulously engineered and then isolated. We really don't know what's going to happen as we dynamically stitch together billions—perhaps even trillions—of intelligent and interdependent devices that span many different types and generations of software and architectures.
As the power of computers increases, in both storage and computational capacity, the absolute scale and complexity of the attendant software go up accordingly. This manifests itself in many ways, ranging from how you administer these machines to how you know when they are broken, how you repair them, and how you add more capability. All these aspects ultimately play into whether people perceive the system as trustworthy.
We don't yet have really good economical, widely used mechanisms for building ultra-reliable hardware. However, we do have an environment where it may become commonplace to have 200+ million transistors on a single chip. At some point it becomes worthwhile to partition that into four parallel systems that are redundant and therefore more resistant to failure. The marginal cost of having this redundancy within a single component may be acceptable. Similarly, a computer manufacturer or end user may choose to install two smaller hard drives to mirror their data, greatly improving its integrity in the event of a disk crash.
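The four-parallel-systems idea amounts to N-modular redundancy: run the same computation on several redundant units and accept the answer a majority agrees on. A minimal sketch, where the unit count and failure model are assumptions for illustration:

```python
from collections import Counter

def majority_vote(results):
    """Return the value a strict majority of redundant units agree on, else None.

    With three units (triple modular redundancy), a single faulty
    unit is outvoted; with no majority, the failure is detected.
    """
    value, votes = Counter(results).most_common(1)[0]
    return value if votes > len(results) // 2 else None
```

With `majority_vote([5, 5, 9])` the faulty third unit is outvoted and 5 is accepted, while `majority_vote([1, 2, 3])` reports an unresolvable disagreement by returning `None`.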
We may have new architectural approaches to survivability in computer systems, but survivability always comes from redundancy, and redundancy must be paid for. So people will, in fact, again have to decide: do they want to save money but potentially deal with more failure? Or are they willing to spend more money, or deal with more complexity and administrative overhead, in order to resolve the appropriate aspects of security, privacy, and technological sufficiency that will solve these problems?
The Web Services model is characterized by computing at the edge of the network. Peer-to-peer applications will be the rule, and there will be distributed processing and storage. An administrative regime for such a system requires sophisticated machine-to-machine processes. Data will be self-describing. Machines will be loosely coupled, self-configuring, and self-organizing. They will manage themselves to conform to policy set at the center.
Web applications will have to be designed to operate in an asynchronous world. In the PC paradigm, a machine knows where its peripherals are; the associations have been established (by the user or by software) at some point in the past. When something disrupts that synchronicity, the software sometimes simply hangs or dies. Improved plug-and-play device support in Windows XP and "hot-pluggable" architectures such as USB and IEEE 1394 point the way toward a truly "asynchronous" PC, but these dependencies do still exist at times.
On the Web, however, devices come and go, and latency is highly variable. Robust Web architectures need dynamic discoverability and automatic configuration. If you accept the idea that everything is loosely coupled and asynchronous, you introduce even more opportunities for failure. For every potential interaction, you have to entertain the idea that it won't actually occur, because the Web is only a "best-effort" mechanism—if you click and get no result, you click again. Every computing system therefore has to be redesigned to recover from failed interactions.
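Because the Web is only a best-effort mechanism, "click again" becomes a design pattern: retry the failed interaction, backing off between attempts so as not to worsen congestion. A hedged sketch, assuming the operation signals failure by raising `ConnectionError`:

```python
import random
import time

def call_with_retry(operation, attempts=4, base_delay=0.1):
    """Invoke a best-effort network operation, retrying with exponential backoff.

    Each failure doubles the wait, with random jitter so many clients
    retrying at once don't all hammer the service in lockstep.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

The caller still has to entertain the idea that the interaction never occurs: after the final attempt the failure propagates, and the surrounding system must be designed to recover.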
Questions of identity are sometimes raised in the context of Trustworthy Computing. Identity is not explicitly called out in the framework, because a user does not expect a computer system to generate their identity. However, user identity is a core concept against which services are provided. Assertions of identity (that is, authentication) need to be robust, so that taking actions that depend on identity (that is, authorization) can be done reliably. Hence, users expect their identities to be safe from unwanted use.
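A robust assertion of identity, in the narrow technical sense, can be sketched with a keyed hash: the issuer binds a user name to a tag only it can compute, and authorization later verifies the tag before acting. This is a minimal illustration, not a full authentication protocol; the token format is an assumption.

```python
import hashlib
import hmac

def issue_token(secret, user):
    """Issuer side: bind a user name to a keyed-hash tag."""
    tag = hmac.new(secret, user.encode(), hashlib.sha256).hexdigest()
    return f"{user}:{tag}"

def verify_token(secret, token):
    """Return the asserted user name only if the tag checks out, else None."""
    user, _, tag = token.rpartition(":")
    expected = hmac.new(secret, user.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking tag bytes through timing.
    return user if hmac.compare_digest(tag, expected) else None
```

Only a party holding the secret can mint a token that verifies, so actions that depend on identity (authorization) can rely on the asserted name.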
Identity is difficult to define in general, but particularly so in the digital realm. We use the working definition that identity is the persistent, collective aspects of a set of distinguishing characteristics by which a person (or thing) is recognizable or known. Identity is diffuse and context-dependent because these aspect "snippets" are stored all over the place in digital, physical, and emotional form. Some of this identity is "owned" by the user, but a lot of it is conferred by others, either legally (for example, by governments or companies) or as informal social recognition.
Many elements of Trustworthy Computing systems impinge on identity. Users worry about the privacy of computer systems in part because they realize that seemingly unrelated aspects of their identity can be reassembled more easily when the snippets are in digital form. This is best evidenced by growing public fear of credit-card fraud and identity theft as a result of the relative transparency and anonymity of the Internet versus offline transactions, even though both crimes are equally possible in the physical world. Users expect that information about themselves, including those aspects that make up identity, is not disclosed in unapproved ways.
It's already challenging to manage extremely large networks of computers, and it's just getting harder. The immensity of this challenge has been masked by the fact that up to this point we have generally hired professionals to manage large systems. The shortcomings of the machines, the networks, the administration, the tools, and the applications themselves are often mitigated by talented systems managers working hard to compensate for the fact that these components don't always work as expected or desired.
Many of the system failures that get a lot of attention happen because of system complexity. People make an administrator error, fail to install a patch, or configure a firewall incorrectly, and a simple failure cascades into a catastrophic one. There is a very strong dependency on human operators doing the right thing, day in and day out.
There are already too few knowledgeable administrators, and we're losing ground. Worse, the needs of administration are evolving beyond professional IT managers. On the one hand we are at the point where even the best operators struggle: systems are changing too rapidly for people to comprehend. On the other, the bulk of computers will eventually end up in non-managed environments that people own, carry around with them, or have in their car or their house.
We therefore need to make it easier for people to get the right thing to happen consistently with minimal human intervention. We must aim towards a point where decision-makers can set policy and have it deployed to thousands of machines without significant ongoing effort in writing programs, pulling levers, and pushing buttons on administrators' consoles.
The industry can address this in any of a number of ways. Should we actually write software in a completely different way? Should we have system administrators at all? Or should we be developing machines that are able to administer other machines without routine human intervention?
Each of these approaches requires new classes of software. As the absolute number and complexity of machines goes up, the administration problem outstrips the availability and capability of trained people.
The result is that people in the programming-tools community are going to have to think about developing better ways to write programs. People who historically think about how to manage computers are going to have to think about how computers can become more self-organizing and self-managing.
We need to continue to improve programming tools, because programming today is too error-prone. But current tools don't adequately support the process because of the number of abstraction layers that require foreground management. In other words, the designer needs to consider not only system architecture and platform/library issues, but also everything from performance, localization, and maintainability to data structures, multithreading, and memory management. There is little support for parallel programming: most control structures are sequential, and the development process itself is painfully serial. And that is just in development; at the deployment level it is incredibly difficult to test for complex interactions of systems, versions, and the huge range of deployment environments. There is also the increasing diffusion of tools that offer advanced development functionality to a wider population but do not help novice or naive users write good code. And there are issues of long-term perspective: for example, tools don't support "sunsetting" code or anticipate changing trends in capability, storage, speed, and so on. Think of the enormous effort devoted to Y2K because programmers of the 1960s and 1970s did not expect their code would still be in use on machines that far outstripped the capabilities of the machines of that era.
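The Y2K example reduces to one line of arithmetic: with years stored as two digits, the rollover from 99 to 00 makes elapsed-time calculations go negative.

```python
def years_elapsed(start_yy, end_yy):
    """1960s-style date arithmetic on two-digit years (e.g. 99 means 1999)."""
    return end_yy - start_yy

# Fine within one century: 1985 to 1999.
print(years_elapsed(85, 99))  # 14
# Across the rollover, 1985 to 2000 (stored as 0) comes out negative.
print(years_elapsed(85, 0))   # -85
```

A perfectly reasonable storage optimization in its day became a latent, system-wide fault once the code outlived the assumption baked into it.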
The growth of the Internet was proof that interoperable technologies—from TCP/IP to HTTP—are critical to building large-scale, multipurpose computing systems that people find useful and compelling. (Similarly, interoperable standards, enforced by technology, policy or both, have driven the success of many other technologies, from railroads to television.) It is obvious and unavoidable that interoperable systems will drive computing for quite some time.
But interoperability presents a unique set of problems for the industry, in terms of technologies, policies and business practices. Current "trustworthy" computing systems, such as the air-traffic-control network, are very complex and richly interdependent, but they are also engineered for a specific purpose, rarely modified, and strictly controlled by a central authority. The question remains whether a distributed, loosely organized, flexible, and dynamic computing system—dependent on interoperable technologies—can ever reach the same level of reliability and trustworthiness.
Interoperability also poses a problem in terms of accountability and trust, in that responsibility for shortcomings is more difficult to assign. If today's Internet—built on the principle of decentralization and collective management—were to suffer some kind of massive failure, who would be held responsible? One major reason people are reluctant to trust the Internet is that they cannot easily identify who is responsible for its shortcomings: whom would you blame for a catastrophic network outage, or the collapse of the Domain Name System? If we are to create and benefit from a massively interoperable (and interdependent) system that people can trust, we must clearly draw the lines as to who is accountable for what.
We face a fundamental problem with Trustworthy Computing: computer science lacks a theoretical framework. Computer security—itself just one component of Trustworthy Computing—has largely been treated as an offshoot of communications security, which is based on cryptography. Cryptography has a solid mathematical basis, but is clearly inadequate for addressing the problems of trusted systems. As Microsoft researcher Jim Kajiya has put it, "It's as if we're building steam engines but we don't understand thermodynamics." The computer-science community has not yet identified an alternative paradigm; we're stuck with crypto. There may be research in computational combinatorics, or a different kind of information theory that seeks to study the basic nature of information transfer, or research in cooperative phenomena in computing, that may eventually form part of an alternative. But, today this is only speculation.
A computing system is only as trustworthy as its weakest link. The weakest link is all too frequently human: a person producing a poor design in the face of complexity, an administrator incorrectly configuring a system, a business person choosing to deliver features over reliability, or a support technician falling victim to impostors via a "social engineering" hack. The interaction between sociology and technology will be a critical research area for Trustworthy Computing. So far there is hardly any cross-fertilization between these fields.
CONCLUSION & FUTURE WORK
- Delivering Trustworthy Computing is essential not only to the health of the computer industry, but also to our economy and society at large.
- Trustworthy Computing is a multi-dimensional set of issues. All of them fall under four goals: Security, Privacy, Reliability, and Business Integrity. Each demands attention.
- While important short-term work needs to be done, hard problems that require fundamental research and advances in engineering will remain.
- Both hardware and software companies, as well as academic and government research institutions, need to step up to the challenge of tackling these problems.