ALBERT-LÁSZLÓ BARABÁSI: We live in a very special moment because almost everything we do is tagged with data. That is not only true for us, it is true for our very biological and universal existence.
The more we know about the world, the more we understand that it is a very complex system. Our biological existence is governed by a highly complex genetic and molecular network: how the genes and molecules in our cells interact with each other. Likewise, society is not just a sum of individuals. Society is not a phone book. What makes society is really the interactions we have.
But the question is: How do we understand this complexity? If we want to understand a complex system, the first thing we need to do is map out its architecture and the network behind it.
We have data about almost everything, and this vast amount of data creates a wonderful and unique laboratory for the scientist. It offers an opportunity to really understand how our world works.
Graph theory has become a very popular subject of study for mathematicians. I’m Hungarian, and it turns out that the Hungarian School of Mathematics, thanks to Paul Erdős and Alfréd Rényi, has contributed a lot to the problem. Between 1959 and ’68, they published eight papers that put forward the ‘theory of random graphs.’
They looked at some of the complex networks around us and they said, you know, “We have no idea how these networks are put together, but for all practical purposes, they look random.” So their model is very simple: Pick a pair of nodes and roll a die. If you get a six, you connect them. If you don’t, you move on to another pair of nodes. And with that idea, they developed what we now call the ‘random network model.’
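The die-rolling recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `random_network` and `degrees`, the network size `n = 600`, and the fixed seed are all hypothetical choices, and the connection probability `p = 1/6` simply stands in for "rolling a six."

```python
import random
from itertools import combinations

def random_network(n, p=1/6, seed=None):
    """Visit every pair of n nodes once and link it with probability p."""
    rng = random.Random(seed)
    edges = []
    for u, v in combinations(range(n), 2):
        # "Roll the die": p = 1/6 plays the role of rolling a six.
        if rng.random() < p:
            edges.append((u, v))
    return edges

def degrees(n, edges):
    """Count how many links each node ended up with."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

n = 600  # illustrative size, not a figure from the talk
edges = random_network(n, seed=42)
deg = degrees(n, edges)
mean_degree = sum(deg) / n
print(f"mean degree: {mean_degree:.1f}, min: {min(deg)}, max: {max(deg)}")
```

With these settings every node is expected to have about (n - 1) / 6 links, and the least and most connected nodes should both stay close to that average.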
What’s interesting from a physicist’s point of view is that for us, being random doesn’t mean being unpredictable. Actually, randomness is a form of predictability. And that’s exactly what Erdős and Rényi proved, that in a random network, the average dominates.
Let me take an example: The typical person, according to sociologists, knows about a thousand people on a first-name basis. If society were random, then the most popular individual, the person with the most friends, would have about 1,150 friends, and the least popular one about 850. That is, the number of friends would follow a Poisson distribution, with a sharp peak around the average and very fast decay on either side, which obviously makes no sense, right? This is an indication that something is wrong with the random network model. Not in the sense that the model itself is wrong, but in the sense that it doesn’t capture reality; it doesn’t capture how networks are formed.
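A rough numerical check shows why a Poisson distribution squeezes everyone so close to the average. The sketch below assumes a mean of 1,000 friends, as in the example; the helper name `poisson_pmf` and the summation cutoffs are illustrative choices.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of exactly k, computed in log space so the
    factorial does not overflow for large k."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

mean_friends = 1000
# For a Poisson distribution the variance equals the mean.
std_dev = math.sqrt(mean_friends)

# Chance that a given person has at least 1,150 friends
# (terms beyond k = 3000 are negligibly small).
upper_tail = sum(poisson_pmf(k, mean_friends) for k in range(1150, 3000))
# Chance that a given person has at most 850 friends.
lower_tail = sum(poisson_pmf(k, mean_friends) for k in range(0, 851))

print(f"standard deviation: {std_dev:.1f}")
print(f"P(at least 1150 friends): {upper_tail:.2e}")
print(f"P(at most 850 friends):   {lower_tail:.2e}")
```

The spread is only about 32 friends around the mean of 1,000, so in a random society both extremes would be vanishingly rare.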
After years of being interested in networks, I realized I needed to find real data describing real networks. The first opportunity for us to study real networks came with a map of the world wide web. We know that the world wide web is a network. The name says it: it’s a web. Its nodes are the web pages and the links are the URLs, the things we can click to go from one page to another. We’re talking about 1998, about six or seven years after the world wide web was invented. The web was still very small, with only a few hundred million pages.
So we set out to map it, and that really marked the beginning of what we now call ‘network science.’ Once we had this map of the world wide web, we realized that it was very, very different from the random networks that had been generated over the years. When we dug deeper, we realized that the degree distribution, that is, the distribution of the number of links per node, did not follow the Poisson we had for the random network, but followed instead what we call a power-law distribution. We named these networks ‘scale-free networks.’
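For reference, the two degree distributions being contrasted can be written side by side. The notation is an assumption, since the talk names no symbols: p_k is the probability that a node has k links, ⟨k⟩ is the average number of links per node, and γ is the exponent of the power law.

```latex
% Random network (Poisson) versus scale-free network (power law)
p_k^{\text{random}} = e^{-\langle k \rangle} \, \frac{\langle k \rangle^{k}}{k!},
\qquad
p_k^{\text{scale-free}} \sim k^{-\gamma}
```

The factorial in the Poisson form makes nodes with many more links than the average extremely unlikely, while the power law decays slowly enough that very highly connected nodes remain possible.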
In a scale-free network, averages are not meaningful. These networks have no intrinsic scale, which is what ‘scale-free’ means: almost any number of links is possible. And most real networks are not formed by connecting a fixed set of existing nodes. They grow, starting with one node and adding more nodes one after another.
Think of the world wide web: In 1991, there was a single web page. How did we get to over a trillion today? Well, another web page was created that linked to the first page, and then another that linked to one of the previous pages. Ultimately, every time we put up a web page and connect it to other web pages, we add a new node to the world wide web. The network builds one node at a time. Networks are not static objects with a fixed number of nodes that need to be connected; networks are growing objects. They evolve by growing.
Sometimes that growth took 20 years, as with the world wide web, and sometimes four billion years, as with the subcellular networks, to reach the complexity we see today. We also realized that on the world wide web, we don’t connect randomly. We connect to what we know. We connect to Google, to Facebook, to other major web pages that we are familiar with, and we tend to link to more connected pages. So our connectivity pattern is biased towards more connected nodes.
We ended up formalizing this with the concept of ‘preferential attachment.’ And when we combined growth and preferential attachment, suddenly power laws emerged from the model. Suddenly we had hubs, and we had the same statistics and the same architecture that we had seen earlier on the world wide web. We then started looking at the metabolic network inside cells, the protein interactions inside cells, the way actors connect to each other in Hollywood. In all those systems, we found scale-free networks. We saw non-randomness, we saw hubs emerging. And therefore, we realized that the way complex systems build themselves follows the same universal architecture.
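The combination of growth and preferential attachment can be sketched as a small simulation. This is a minimal illustration, not the original research code: the function name `grow_network`, the number of links per new node `m = 2`, the fully connected starting core, the network size, and the seed are all assumptions made for the example.

```python
import random
from collections import Counter

def grow_network(n, m=2, seed=None):
    """Grow a network to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current number of links."""
    rng = random.Random(seed)
    # Assumption: start from a small fully connected core of m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each node appears in `stubs` once per link it has, so a uniform draw
    # from `stubs` picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until we have m distinct targets
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

n = 5000  # illustrative size
edges = grow_network(n, m=2, seed=1)
degree = Counter(node for edge in edges for node in edge)
mean_degree = 2 * len(edges) / n
max_degree = max(degree.values())
print(f"mean degree: {mean_degree:.2f}, largest hub: {max_degree} links")
```

Unlike in the random model, the most connected node here ends up with many times the average number of links, which is the hub behavior described above.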
Let’s be clear: network science is not the answer to all the problems we face in science, but it is a necessary path if we want to understand the complex systems that emerge from the interaction of many parts. Today, we don’t have a separate theory of social networks, a theory of biological networks, and a theory of the world wide web. Instead, we have network science, which, within a single scientific framework, describes all of these.