Chapter 6. Living Systems
A "Living System" is one that grows into its environment, by self-organizing around opportunities. Living systems can last for a long time, adapt well to change, and thus be highly successful. By contrast, "Planned Systems" tend to be fragile, poor at coping with change, and thus short-lived. In this article I'll explain Living Systems, of software and people, and how to grow them.
Why "Living Systems"?
Since the beginning, life has relied upon the transmission of messages. -- RFC 3164 (syslog)
Wikipedia writes, "Living systems are open self-organizing living things that interact with their environment. These systems are maintained by flows of information, energy and matter." The term was originated by psychologist James Grier Miller to formalize the concept of life.
I want to use the term to define a new metaphor for software systems and organizations, the two types of system I'm most interested in. The two are more than just similar. Software is the product of a group of people, and as Conway observed, the structure of a software system mimics the structure of the organization that produces it. I've written that "the physics of software is the physics of people," and by that I meant psychology.
Most software products today are highly planned, and they fail as living systems. They are essentially dead on delivery, sold by force and bluff. For software to be a "living system", it must be used by the organization that builds it, and it then lives or dies along with that organization. An "organization" can be much larger than one company or one team. It can consist of thousands of teams, businesses, customers and suppliers, connected in ineffable yet vitally real networks.
Nowhere is this more clear than the Internet, a Living System of software, of people, of businesses and other organizations. The organization that built the Internet is nothing less than human society itself. There are many Living Systems, all around us. It is a strangely simple truth that the better we get at writing large-scale software systems, the more they resemble the real world around us.
In contrast to Living Systems, we have Planned Systems. It is far easier to plan a system than to grow it. However, plans are inevitably built on wrong assumptions and poor judgments. Planned Systems look attractive and efficient from some perspectives, yet they inevitably fail catastrophically. Real life provides many examples, such as collective farming, planned cities, Microsoft Windows 8, and so on.
In the software business, this Living vs. Planned dichotomy is best expressed by free software vs. closed source. Free software (and its corporate cousin open source) usually grows out of real use, whereas closed source is usually planned. This is mainly why I don't work on closed source: it dies rapidly and predictably. I prefer my work to survive as long as possible.
I'll make a few broad claims, starting with: the most successful large-scale software systems are Living Systems. That is, in a competitive market, a Living System will wipe out any competing Planned Systems. It will recognize and solve real problems faster, cheaper, and more accurately. If your business depends on a Planned System, you are vulnerable to attack by a Living System.
The second claim I'll make is that this also applies to organizations. If your company is a Planned System, it is already dead. Whereas if your company operates as a Living System, it can dominate its market. Interestingly, when two Living Systems meet, they don't usually fight. Rather, they specialize into different areas, and they then merge to form a single Living System. Competition and conflict usually work for the benefit of the Living System, even if individual components fail.
Let me take this question of conflict and competition further. Of course individuals do compete, and quite harshly sometimes. This is our biological imperative. However we also have a biological imperative to collaborate, a far more profitable strategy, most of the time. A Living System embraces competition between individuals, and survives the failure of individual components. It actually depends on that process of competition and failure. A Planned System is essentially trying to act as a single individual, and cannot tolerate internal competition, nor failure of individual components.
What Do Living Systems Look Like?
A Living System consists of a large number of loosely-coupled components. It is essentially spread out in space (thus, "distributed"), and in time (thus, "asynchronous"). That means things happen in unpredictable places, and at unexpected times. To a central planner, this looks like dangerous chaos.
In a Planned System, by contrast, the times and places of events are meticulously scripted. The focus is on "command and control," where decisions are made centrally and communicated to the structure. Planned Systems are essentially hierarchical, for this is the optimum way to communicate decisions rapidly down from the top.
We build a Planned System, whereas a Living System grows itself. My goal is to learn, and to teach, how to grow Living Systems artificially. I'm studying the genes and patterns of care and nurture for a self-growing system. In fact I'm talking about creating artificial life, and artificial intelligence, though in a shape that traditional AI researchers might not recognize. I don't believe individual components -- including you and me -- can be "intelligent" at all, except in a narrow and superficial sense: intelligence is a property of systems.
Living Systems are typified by their lack of central planning or decision making. Look at a software project and ask, "who is the designer?" If there is a clear designer, individual or organization (and there almost always is), that is a Planned System. A Living System has no designer, no road-maps, no clear future plans except "survive and grow."
A Living System looks more like Adam Smith's free market than Stalin's Five Year Plan. Economics, politics, and psychology are as important -- perhaps more important -- in growing successful Living Systems as technology. A free market depends on several key things: clear laws, standards, contracts, and fair regulation. A Living System likewise depends on these.
So whereas a Living System has no central planner, it may have central regulators. Let me explain the difference. In a planned city, a committee decides where to build schools, homes, factories, offices, railway stations, shops, sports facilities, and so on. In a Living City, all these are decided by independent agents (school boards, businesses, home owners, etc.) and the city regulates the provision of electricity and clean water, the disposal of garbage, and so on. Further, the city runs police and courts, as regulators, to dissuade criminals and cheats.
A regulator makes laws that define a fair market, and then enforces those laws. Units of measurement, currencies, contracts, and such. In software systems, these laws are, for instance, the source code license, and contribution policy. A fair market lets anyone create a new venture, and compete with other ventures. To allow true competition (meaning, free choice by customers), clients can demand clear contracts, which in software terms are documented APIs and protocols.
The DNA of a Living System is essentially a set of regulated contracts. Thus the Internet is grown out of a set of RFCs (protocols called Requests for Comments), regulated by the Internet Engineering Task Force (IETF). Living cities are grown out of criminal and civil laws, standards for water and power and waste, transport, and so on.
If all strategies were honest, there would be no need for regulators. However any Living System will be vulnerable to cheating strategies. A certain segment of people, for instance, are systematic or opportunistic cheats. Given a market, they will always seek a way to convert value to their own benefit, even at a higher cost to others. They will lie, steal, deceive, coerce, and so on.
Without resistance against such cheats, the market will suffer, and the system will eventually die. Top-down authority is one defense against cheats. However it has a significant vulnerability: cheats can, and often do, capture the authority itself. In Living Systems, cheats can try to capture the regulators, and this happens in real life all the time.
When cheats capture the regulators of real-life Living Systems, the usual response is to move away, if we can. In open source software systems, we can fork and continue under a better regulator. This is why forking is an essential freedom, rather than a failure. Since forking can also be a strategy for capture, a fork-safe license (GPL or similar) is best for Living Systems in software.
Living Systems grow, constantly and organically. This is their most visible trait: the lack of the usual massive construction efforts. Rather, you will see a smooth flow of small changes. It may seem boring or unambitious. However, it is a better algorithm for survival. A Living System must do two things. First, it must solve some profitable set of problems. Second, it must adapt and change over time, as its environment changes.
Shifting a Planned System to cope with a changing environment is very hard, often impossible. In a Planned System, resources define power, so shifting resources means shifting power. Thus, Planned Systems actively and aggressively resist change, deny it, and when it becomes inevitable, they die catastrophically. However, a Living System feeds off change. It makes no distinction between exploring the landscape of problems of "now" and of "tomorrow." It grows through continuous learning. To actually destroy a Living System you have to do widespread damage to it, which is hard when a successful Living System has spread wide.
To a Living System, small amounts of damage are indistinguishable from normal activity. In fact Living Systems thrive on challenge, so long as it is not overwhelming. Challenge is what allows components to compete, and develop better answers. What does not kill a Living System makes it stronger.
So, as Living Systems learn and move quickly and opportunistically into new areas, they will tend to thrive and grow dominant, wiping out any competing Planned Systems. They can react rapidly, shifting resources around to areas where they are more valuable. And since they do not need any upfront coordination to act, they can scale to any size. Zero upfront coordination means infinite scale.
Components of a Living System
Let's now look at the individual components of a Living System. Remember that a Living System resembles a free market, where components compete to provide some identifiable and measurable service. The components of a Living System have some traits that set them apart from the components of a planned system.
Every component of a Living System has a clear set of owners and investors, and ownership is usually highly localized (in contrast to a Planned System, where all components have the same owners). Components organize into chains of suppliers and clients, and they have identities, names, and addresses, so that clients can find them. One classic way to cheat is for one group to provide a poor quality component that claims to be a high-quality one. Thus the regulator may have to enforce identity, and protect investment in an identity.
Components are, as far as possible, location independent. That creates a larger, and more efficient free market. It means that we strive for location independence as a feature of our Living System. This is contrary to a Planned System, where location is highly significant, and where there is little or no competition between components.
Similarly, components may come and go in time, quite arbitrarily. There are no guarantees that a component we depend on today will still exist, or be available tomorrow. This may sound fragile, yet it is highly robust. Rather than depending on specific components, we depend on contracts. If our need is real, there will be many alternatives. If one disappears, another will take its place. If you miss one taxi, you will catch another.
Components are highly independent, decoupled from one another. That is, they exist and change at their own rate, in their own direction. A change in one component is essentially invisible to another component except through its public interfaces. This freedom is essential to a free market, driven by specialization and trade. Thus one component may focus on speed, while another on security.
Since there is no centralized decision about what components exist, nor who makes them, they will be highly heterogeneous, and this diversity is essential to the intelligence of the overall system. A set of diverse components in a Living System, connected in a free market, will solve large problems faster, and more accurately, than a monolithic Planned System.
Components are abstracted, meaning they may represent entire systems in themselves. For instance a web address can represent a single, small piece of software (one web server), or it may represent a massive infrastructure (an Internet business). It is up to each group of owners to decide whether they build Living Systems or Planned Systems, in turn. A Living System will happily embrace Planned Systems as components. The opposite isn't true.
Components avoid upfront consensus, also known as "shared mutable state". Every component has knowledge, and they may share knowledge, yet they do so asynchronously. So while the Living System represents a large, coherent pool of knowledge, there is no guaranteed consistency between components. This may seem paradoxical. Surely every person in a meeting, for instance, agrees on the agenda for the meeting?
In fact meetings, with their agendas and minutes, are the epitome of the shared mutable state that a Planned System depends on. Planned Systems cannot function without systematic upfront agreement. In concurrent software design, we use "locks" to achieve the same result. It is provably true that a software system that uses locks to share state between components will not scale. You can try to make distributed software as a Planned System: it starts easily yet scales poorly, if at all. Whereas a Living System takes a little more thought at the start, and then scales without limit.
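As a rough illustration of components that avoid shared mutable state, here is a minimal Python sketch. The names counter_component and run, and the use of None as a shutdown message, are hypothetical choices made for this example only. Note that Python's queue.Queue uses a lock internally, so the sketch shows the ownership pattern (one component owns the state, others only send it messages) and is not a lock-free implementation.

```python
import queue
import threading


def counter_component(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Owns its own total; other components can only send it messages."""
    total = 0  # private state, never shared with other components
    while True:
        message = inbox.get()
        if message is None:  # hypothetical shutdown message
            outbox.put(total)
            return
        total += message


def run(senders: int, per_sender: int) -> int:
    """Start one owner and several senders; return the owner's final total."""
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    owner = threading.Thread(target=counter_component, args=(inbox, outbox))
    owner.start()

    def send() -> None:
        for _ in range(per_sender):
            inbox.put(1)  # fire and forget: no agreement needed upfront

    workers = [threading.Thread(target=send) for _ in range(senders)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    inbox.put(None)  # all senders are done, so ask the owner to report
    owner.join()
    return outbox.get()
```

The senders never read or write the total directly, so adding more of them requires no new agreement between components, only more messages.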
Finally, components are lazy and opportunistic. They only work when there are tasks waiting, and they only change and grow when there are new, profitable opportunities. This means components can remain lightweight and minimalistic. Further, they can solve the "problem landscape" much more accurately, without excess baggage. In a Planned System by contrast, components are built upfront, on the assumption of future problems, or at best, knowledge of past problems.
An example: in a planned conference, the organizers choose speakers on certain topics, based on their experience of the previous year. Now, one month before the conference, a significant event drives public demand for speakers on a totally different topic. How long will it take the conference to react? A participant-driven conference can react in real-time, whereas a planned conference will take a full year to respond.
Protocols of a Living System
The components of a Living System are connected in relationships. Each relationship consists of a flow of information, knowledge, or requests, in both directions. The best way to model such relationships seems to be as discrete events, or "messages," that carry a formalized set of interactions we call "protocols."
In natural Living Systems, we also see messages and protocols. Cells, for instance, communicate with chemical messages. We humans appear to communicate with a set of protocols that underlie our human languages. For instance, male-dominated hierarchies are a consistent feature of human society, suggesting that the command-and-control protocols they depend on are built into our minds, not learned. I'd hypothesize that the male mind, driven by the ancestral need to plan hunting parties, is responsible for Planned Systems.
Protocols have a number of common patterns. We see broadcast protocols where one component signals to many listeners. A broadcast protocol is typically one-way. The signaler may, on rare occasions, get feedback from a few listeners.
We see one-to-one protocols where two components exchange knowledge, tasks, requests, and so on. One-to-one protocols can be more or less chatty, and ideally are fully asynchronous. Chattier protocols take longer to conclude, thus raising overall "latency." For example if I'm cooking a pizza and I have to confirm every ingredient, it will take longer. "Do you like mushrooms?" "How about garlic?" "Ok, what kind of cheese do you prefer?"
The ideal relationships aim for lowest realistic latency, since the latency of the overall system is the sum of the latency of its entire supply chain. That is, if I'm making a meal, and I have to spend one minute solving the "pizza" issue, that adds one minute to the overall preparation time. In an asynchronous low-latency dialog, I'd ask all the questions at once, and deal with the answers as I got them back, one by one.
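The difference between a chatty dialog and an asynchronous one can be sketched in Python with asyncio. This is only an illustration: the reply delays are made-up numbers, and ask, chatty, batched, and compare are hypothetical names introduced for this example.

```python
import asyncio
import time

# Hypothetical questions and reply delays (in seconds), for illustration only.
QUESTIONS = {
    "Do you like mushrooms?": 0.03,
    "How about garlic?": 0.01,
    "What kind of cheese do you prefer?": 0.02,
}


async def ask(question: str, delay: float) -> str:
    """Simulate a peer that takes `delay` seconds to answer a question."""
    await asyncio.sleep(delay)
    return question


async def chatty() -> float:
    """Ask one question, wait for its answer, then ask the next."""
    start = time.perf_counter()
    for question, delay in QUESTIONS.items():
        await ask(question, delay)
    return time.perf_counter() - start


async def batched() -> tuple[float, list[str]]:
    """Ask all questions at once and handle answers as they come back."""
    start = time.perf_counter()
    tasks = [asyncio.create_task(ask(q, d)) for q, d in QUESTIONS.items()]
    answered = []
    for finished in asyncio.as_completed(tasks):
        answered.append(await finished)
    return time.perf_counter() - start, answered


def compare() -> tuple[float, float, list[str]]:
    """Return (chatty latency, batched latency, order answers arrived in)."""
    sequential = asyncio.run(chatty())
    concurrent, order = asyncio.run(batched())
    return sequential, concurrent, order
```

In the chatty version the latencies add up, so the dialog takes roughly the sum of all reply delays. In the batched version it takes roughly as long as the slowest reply, and the quickest answer (garlic, in this made-up data) is handled first.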
To make effective asynchronous systems we need queues, and smart queuing strategies. Ideally, we have queues at any point where messages may arrive, and we push messages as close to their consumers as possible, to reduce latency. We need strategies to deal with full queues (space is not infinite): it might be to throw away older messages, or to pause the sender (this works for one-to-one dialogs, not for one-to-many). We may need multiple incoming queues, one per flow, and the ability to wait for a message on any of these queues.
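The two strategies for full queues can be sketched with Python's standard library. This is a minimal illustration, not how any particular messaging library does it: DropOldestQueue and try_send are hypothetical names, and the capacities used are arbitrary.

```python
import queue
from collections import deque


class DropOldestQueue:
    """Bounded queue that throws away the oldest message when it is full."""

    def __init__(self, capacity: int) -> None:
        self._items = deque(maxlen=capacity)
        self.dropped = 0  # how many old messages were discarded

    def put(self, message) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the deque evicts its oldest item on append
        self._items.append(message)

    def get(self):
        return self._items.popleft()

    def __len__(self) -> int:
        return len(self._items)


def try_send(q: queue.Queue, message) -> bool:
    """Pause-the-sender strategy: report a full queue so the sender can wait."""
    try:
        q.put_nowait(message)
        return True
    except queue.Full:
        return False
```

Dropping old messages suits one-to-many flows, where a slow listener should not hold back the sender. Reporting a full queue back to the sender suits one-to-one dialogs, where the sender can simply wait and retry.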
The protocols of a Living System are highly ritualized. They implement formal contracts. If I ask, "Do you like garlic?" then I expect a yes/no answer. A discussion about the weather is a breach of contract. When we're growing our own Living Systems, we have to write the protocols down, so they can be learned and verified. The simpler and clearer, the better. Complex, arcane protocols are expensive to learn and implement, which distorts the free market.
Some Living Systems use earned trust, together with identity, in place of verifiable contracts. This can be a valuable shorthand, especially when exchanging knowledge, though it is also vulnerable to cheats (frauds). An alternative is to ensure that every contract is verifiable, backed by meta-contracts on performance. This is often better for trading work. Any taxi driver is fine, so long as they drive to the right address and don't overcharge. However we want our news from trusted sources.
Once we have testable contracts we can deal with violations. One strategy is to fail, and let someone else deal with it. Another is to discard that peer and try another. However, after a contract violation, you generally don't want to continue blindly, as that can cause wider damage.
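A testable contract and the two violation strategies can be sketched in a few lines of Python. The yes/no contract comes from the garlic example; ContractViolation, check_yes_no, and ask_first_valid are hypothetical names, and peers are modeled as plain functions purely for illustration.

```python
from typing import Callable, Iterable


class ContractViolation(Exception):
    """Raised when a peer's reply breaks the written protocol."""


def check_yes_no(reply: str) -> bool:
    """Contract: the reply to a yes/no question must be 'yes' or 'no'."""
    normalized = reply.strip().lower()
    if normalized not in ("yes", "no"):
        raise ContractViolation(f"expected yes/no, got {reply!r}")
    return normalized == "yes"


def ask_first_valid(question: str, peers: Iterable[Callable[[str], str]]) -> bool:
    """Try peers in turn, discarding any that violate the contract."""
    for peer in peers:
        try:
            return check_yes_no(peer(question))
        except ContractViolation:
            continue  # discard this peer and try another
    # No peer honored the contract: fail, and let the caller deal with it.
    raise ContractViolation(f"no peer gave a valid answer to {question!r}")
```

A peer that answers "Lovely weather today" is skipped, and the next peer's "Yes" is accepted. If every peer breaks the contract, the function fails loudly instead of continuing with a reply it cannot trust.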
Case Study: the ZeroMQ Community
The ZeroMQ community is a Living System of people that builds a Living System of software (the software collection under the same name). Though I originally designed the ZeroMQ community with most of the properties of a Living System, it only really came true in early 2012, when the community rejected its central planners.
The community consists of a large number of loosely-coupled projects that share a common goal, which is to provide the queuing and messaging needs for other software systems. I've argued, and believe, that only a Living System can use ZeroMQ optimally.
The ZeroMQ projects are connected into a supply chain by formalized relationships, over APIs and wire protocols. We spend a large amount of time documenting these APIs and protocols, and ensuring they are testable. Indeed, we do not usually document the internals of components, just their external APIs.
There is no central planning nor coordination. Instead, each project evolves organically as its users invest in patches and improvements. By making this process simple, the ZeroMQ collaboration contract ensures that the ZeroMQ organization expands to include all its competent users.
Anyone can start a new ZeroMQ project, or fork an existing one, for competition or experimentation. As a community we encourage this, and so we have multiple competitors at most levels. This works well in practice. The basic license is LGPL v3 or MPL v2, ensuring that forks are always safe (patches can flow in both directions).
The regulator in the ZeroMQ community is a self-elected group, headed by iMatix, the firm that developed the original software. There is not much regulation needed, except to stop abuse of the name "ZeroMQ". Clear documentation of protocols is sufficient to allow clients to verify their suppliers.
ZeroMQ is highly scalable. The cost of adding a new project is close to zero, apart from the discovery cost. Projects communicate asynchronously, using GitHub issues and pull requests. There is little or no upfront coordination. We review code after the fact, and fix poor code through further patches rather than discussion.
ZeroMQ's full transition to a Living System was hard because we had no prior successes to imitate. The bulk of free software projects still depend on significant planning. To go against standard practice was seen as highly eccentric, if not actually insane. The loss of key contributors -- who had provided the authority that central planning depends on -- was seen as potentially catastrophic.
However the ZeroMQ community rapidly expanded into the space left by the central planners, and flourished. We disproved the theory that central planning was essential to quality. In fact we found that without central planning, the software improved significantly in quality and in accuracy. Whereas the ZeroMQ development branch had been highly unstable, experimental, and discordant with users' needs, it became mostly stable, trusted, and a close fit for users' needs.
Today we can hold up the ZeroMQ community as a worked example of how to do Living Systems "right." It is all the more valuable as data since there have been numerous attempts to replace it, both by the fleeing central planners, and by other teams. Notably, every Planned System that claimed to be "better than ZeroMQ" has failed, whereas every Living System that began by competing with ZeroMQ ended up becoming a valuable part of it.
Transforming into a Living System
Can we turn a Planned System into a Living System? If we assume we have the technical right (consensus from enough participants, or legal right through software licensing), what are the practical requirements?
The most difficult part will be to get the size of components right. This will often mean breaking up existing components, and creating new ones. That can be catastrophic if done in too many places at once. Thus, in a larger migration, you would start in one area, refactor that, and then grow the resulting culture out.
Components are usually sized around the people, so a good size is "the work that a few people can do." The scale of a Living System comes from adding more components, and allowing them to use and replace each other in whatever fashion makes sense locally, not increasing individual component size. A component is too small when it cannot provide a full service by itself, and it is too large when it does not focus on one thing.
Finally, you need the contracts. For a software system, we have had good results simply by taking the ZeroMQ C4.1 process contract, and using that together with a code style guide and the software license. For several reasons, I strongly recommend a share-alike license, such as LGPL (my thesis is that if you use a leaky license like Apache or BSD, you in fact won't get a successful Living System at all).
Launching such a Living System in the past was difficult, as self-organizing software ecologies were poorly documented and little understood. We lacked empirical evidence that processes like C4.1 could work, let alone work so well. As far as I know, this was the first documented contract for a Living System in software.
Economics of Living Systems
How do we make money from free software? It is a question I'm often asked. The answer comes in various forms depending on whether I'm talking to individuals, to small firms, or to large firms.
A key understanding of Living Systems is that they are essentially about economics. No component exists for random reasons. However, to offer a choice between selfishness and altruism is a false dichotomy. Living Systems are driven by selfishness and altruism at the same time. It is a basic theory of economics: by selfish specialization and trade, we create common wealth. It is the human species' superpower: specialization and trade at a massive scale, between individuals, families, generations, villages, cities, and entire regions.
A Living System is owned by all participants, so it can be harder to measure its value, whereas a Planned System, owned by a few at the top, can have very visible value, to its owners and outside observers. However the overall value and economic power of a Living System will always overwhelm any competing Planned System. A Living System can be incredibly profitable; its profits are just widely distributed among all its participants.
This is the first answer: a Living System can kill competing Planned Systems, and thus liberate large amounts of captive value, which can be absorbed by the Living System. We see this in real life, where free market economies out-perform planned economies, leading to movement of skilled labor (value) from the latter to the former.
The second answer is that we can build new markets on top of successful Living Systems, that are impossible on Planned Systems. The Internet is a clear example of this: it has enabled a massive new economy that was impossible on older networks. Those new markets can be very profitable.
A Planned System can only survive by taking value away from its components. In many ways, it resembles a cult, and depends on cult techniques like brainwashing, where a few prosper at the expense of many. Planned Systems are inherently unethical, as well as unsustainable. There is an inherent morality in a fair and free market, despite the large number of Planned Systems that claim to represent "the market."
Conclusions
In this essay I've looked at artificial Living Systems, which imitate and can be modeled on real living systems. Living Systems are spread out in space and time. They consist of large numbers of independently owned components that work together, competing and collaborating, in a free market for services, labor, resources, and knowledge. These components evolve independently, under pressure from their market. They live and die according to their success in finding accurate answers to real problems faced by their clients.
The components in a Living System communicate asynchronously by passing messages around, in various patterns. These flows are highly ritualized, in the form of protocols. The more accurate the protocol, the easier it is for clients to choose suppliers freely, and the more efficient the market.
A Living System has no central controlling owner, though it may elect authorities to regulate (define, and enforce) contracts. It has no single points of failure. Rather than treating failure as exceptional and to be avoided, it uses failure as a basic learning technique. Inaccurate components are allowed to fail and are discarded rapidly, and replaced by more accurate components.
Living Systems grow by learning, into supply chains that connect components to the external environment. We can measure the efficiency of a Living System by looking at overall latency as a problem enters the system, and a response emerges. Such latencies can vary from years in Planned Systems to hours in highly adaptive Living Systems.
Living Systems thus organize opportunistically, accurately judging the relative cost of a given problem, and the value of solving it. Unlike Planned Systems, they are driven by live data rather than assumptions, beliefs, or old data. This lets them operate more accurately, faster, and cheaper than Planned Systems.
To build a large-scale Living System in software, build a Living System of people. The two will co-evolve and, done correctly, will dominate any given market. Whereas competing Planned Systems will fail as whole units, competing Living Systems will tend to specialize into different areas, and then merge into a single unified Living System.