Monthly Archives: February 2015

How do scientists share on academic social networks like ResearchGate?

A data analysis of user habits shows open sharing is mostly limited to publications; very few scientists are liberal with knowhow

A key aim of Sciencebite is to create an open platform for scientific expertise online. At the moment, Sciencebite is much smaller than the academic social networks, so I wondered how scientists behave on these networks: how they are promoting their identities, publications and expertise, and how they are connecting with each other.

Without wishing to comment on the competitive positions of the three largest academic networks, in this part of the world – Berlin – where I am based, ResearchGate is the one that we hear about the most, and with a claimed user count of > 6m academics from whitelisted institutions, it has certainly achieved an amazing penetration of the world’s academics. Speaking personally as someone with a background in academic science, I can confirm that every scientist that I know now seems to have a ResearchGate profile, although many do not actively engage with it.

So how are scientists behaving online, taking ResearchGate as the source of data? I wrote a script to browse ResearchGate profiles at random, and ran it twice, to look at the change over a year: firstly in November 2013 (sampling 3028 profiles), and secondly in February 2015 (sampling 3407 profiles). The sample represents some 0.1% of ResearchGate profiles, and leads to some pretty interesting conclusions about how scientists behave on the new internet platforms.

Scientists are increasingly sharing their professional identities online

ResearchGate has 6m users as of January 2015, and is currently growing at around 10k users per day. A growing minority of users share their professional identity on the site (Figure 1): 36% shared a profile picture in 2013, 43% in 2015. Fewer have updated their profiles with their current positions, but this too is growing: 7% in 2013 and 24% in 2015. It’s interesting that fewer have filled in a current position than uploaded a profile picture, which perhaps reflects that many users register with Facebook, or have left academia since joining ResearchGate. As of February 2015, 18% have both uploaded a profile picture and filled in their current position. If we consider this as a definition of an active user, we can estimate ResearchGate’s active user base as 1.1m.
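The active-user estimate above is a simple scaling of a sample proportion to the whole user base. As a rough sketch, assuming the sample is a simple random sample and using a normal approximation for the sampling error (the function name and the 95% level are my own choices for illustration):

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Approximate 95% interval for a sample proportion (normal approximation)."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

TOTAL_USERS = 6_000_000   # claimed user count, January 2015
SAMPLE_SIZE = 3407        # profiles sampled in February 2015
P_ACTIVE = 0.18           # share with both a picture and a current position

# Point estimate of "active" users, scaled up from the sample
active_estimate = P_ACTIVE * TOTAL_USERS

# Sampling uncertainty on that share, scaled up the same way
low, high = proportion_interval(P_ACTIVE, SAMPLE_SIZE)
active_low, active_high = low * TOTAL_USERS, high * TOTAL_USERS

print(f"estimate: {active_estimate:,.0f}")
print(f"approximate range: {active_low:,.0f} to {active_high:,.0f}")
```

The interval only reflects sampling noise. It says nothing about whether the random browsing really reached all kinds of profiles equally.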

Figure 1: identity sharing on ResearchGate



A growing number of scientists are sharing full texts of their publications

We see an impressive engagement with sharing full texts of publications online (Figure 2). Some 41% of ResearchGate users had uploaded at least one full text by Nov 2013, growing to 80% by Feb 2015. Considering the rapid growth of ResearchGate signups in this time, it reflects an impressive rate of sharing among new users.


Figure 2: publication uploading on ResearchGate has achieved an amazing penetration of the academic community. Now 80% of RG users have uploaded at least one full text.

It’s remarkable that so many more users have shared their publications (~3m) than filled in their profile information (~1m). However, given the importance of publications in the academic career path, it is perhaps less surprising: academics find it more important to share publications than to display their current position. This would surely be the opposite of user behavior on mainstream professional networks like LinkedIn.

In addition to the sampling of users, I sampled 1000 publications randomly on ResearchGate, looking at the relationship between publication date and sharing. In the past 3 years, a great number of new publications have been listed on ResearchGate, reflecting its deep penetration of the scientific community. We can also see that new publications are increasingly shared on ResearchGate – almost half of new publications in the past three years are uploaded (Figure 3).


Figure 3: the growth in shared full texts on ResearchGate reflects the amazing penetration of the academic community in recent years. Now almost 50% of new publications are shared openly as full text.


Almost nobody is using ResearchGate’s Open Reviews

ResearchGate launched a new feature in March 2014: Open Reviews, which lets scientists openly peer-review each other’s papers after publication for quality, originality, reproducibility, etc. Unfortunately almost nobody is using it. Only 4 users in my sample of 3407 have given an Open Review, and none came back to give a second.

The reason for this low adoption is probably that scientists get little career advancement from an Open Review. If they give a negative review, they may make enemies, and they gain none of the influence over a journal that they do in the traditional peer review system. Moreover, if they give a positive review, it does not help their publication and citation record, which are their main criteria to advance their careers.


Sharing of knowhow and advice is still lacking

Far fewer users engage with the Q&A forum on ResearchGate, in comparison with their sharing of publications (Figure 4). In November 2013, only 6.3% had ever asked, commented on, or answered a question. Even of those who had engaged with Q&A, very few were repeat users: 2.3% had ever asked a question, only 0.56% had come back to ask a second question, and only one in the entire sample could be really classified as an active user by asking more than 10 questions.

The situation had not improved much by February 2015: 6.6% had ever asked, commented on, or answered a question, 2.8% had ever asked a question, and 1.1% had come back to ask a second. No users in the sample had asked more than 10 questions.
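One way to check whether such small shifts between the two samples are more than sampling noise is a two-proportion z-test. The sketch below is only an illustration: it assumes both samples are independent simple random samples, and it works from the rounded percentages quoted above rather than the raw counts.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p2 - p1) / se

N_2013, N_2015 = 3028, 3407  # sample sizes from the two runs

# Any Q&A activity: 6.3% (2013) vs 6.6% (2015)
z_any = two_proportion_z(0.063, N_2013, 0.066, N_2015)
# Ever asked a question: 2.3% vs 2.8%
z_asked = two_proportion_z(0.023, N_2013, 0.028, N_2015)
# Came back to ask a second question: 0.56% vs 1.1%
z_repeat = two_proportion_z(0.0056, N_2013, 0.011, N_2015)

for label, z in [("any activity", z_any), ("asked", z_asked), ("asked twice", z_repeat)]:
    verdict = "beyond" if abs(z) > 1.96 else "within"
    print(f"{label}: z = {z:.2f} ({verdict} the usual 5% threshold)")
```

A value of |z| below about 1.96 means the change cannot be distinguished from sampling noise at the conventional 5% level.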

Why do so few scientists request and share practical knowhow and advice online? Perhaps it is the secretive culture of research, and the importance of publications versus other types of dissemination. Perhaps it is also because of the highly specialized nature of knowhow in science – valuable knowhow is often much more specialized than in other professions, for example in software development, where programmers enjoy a culture of sharing all kinds of practical information, through open source software, blogs and Q&A sites.


Figure 4: knowhow sharing on ResearchGate is still minimal. Only a very small proportion of users use the Q&A, and this proportion has changed little in 2013 - 2015


Conclusion: scientists are changing some of their habits, but only as far as the academic career path allows them

Great numbers of scientists are now sharing online, and in the past couple of years it has become common to share publications openly on academic social networks like ResearchGate. However, not many scientists have changed their behavior beyond this, and other types of sharing are still lacking. In particular, sharing of expertise and knowhow is still minimal.

I believe this represents a great challenge and opportunity to scientists who are interested in working differently. The trend of online sharing among scientists still overwhelmingly reflects the traditional academic career path, because it is so strongly anchored to journal publications and citations. For all the frustrations of collaborating and making progress in science, there are perhaps still a great number of undiscovered solutions that would allow us to work together better.

The individuality of popcorn

or, why life can be both random and destined

There’s a widespread delusion that life is like a lottery, and that events in the real world are determined by chance. I notice it when people talk about the future using the language of statistics: what are the chances of X, e.g. getting rich, becoming famous, having an accident? The unique case of the individual is replaced with a vague idea about the population.

This delusion is very common: we see it in popular culture, and politicians, journalists and advertisers all use it. I hardly see anyone in the public eye oppose it – a notable exception being Peter Thiel in his recent book, Zero to One.

Perhaps scientists share this attitude most of all. I encountered it at a recent popular science event in Berlin. A professional scientist at one of the local research institutes constructed an experiment called “popcorn decay”. The aim of this experiment was to measure the popping of a population of popcorn kernels, and thereby show that popcorn popping is like radioactive decay, which is an intrinsically unpredictable (i.e. random) process, and presumably therefore uncaused and meaningless. But it wasn’t really about popcorn – it was about everything in life. He wanted to convey an approach to thinking about all events in life – not just the popping of a popcorn kernel.

Every one counts

What is randomness? What is probability?

What are we talking about when we say something is random? Basically it means that we cannot predict it. For example, when one flips a coin, it is very hard to predict how many times the coin will turn before it lands, and therefore one is completely ignorant about whether it turns out heads or tails.

The important thing to realise is that randomness is subjective. Just because you cannot predict an event does not mean that nobody else can predict it, or that you could not predict it if you were given extra information or allowed to look at it in retrospect.

What is random for one person may not be random for another. Imagine your dog barks some mornings for no apparent reason. On a given morning, you don’t know whether it will bark. This is random. Then you notice that the dog only barks when a certain postman delivers the mail. It is not random any more: you ask the postman for his schedule, and now you know when the dog is going to bark.

So, randomness means unpredictability. We call a variable random if we cannot predict it. It does not mean that it is not determined by some hidden cause or that it is meaningless. Only that we cannot predict it. Randomness is subjective. Randomness is not a physical property, it is a statement of ignorance.

The first person to make all this clear was Bayes. Bayesian statistics is a buzzword now (especially in this age of big data), so it is unfortunate that even most statisticians use his methods without really appreciating what he was talking about.

Probability is just a quantification of that ignorance. A probability of 50% means that one is completely ignorant about whether something will happen or not. So when the weather forecasters say 50% chance of rain, they are actually saying it might rain, and perhaps it often rains in circumstances that seem similar, but they are completely ignorant about whether it will rain or not.
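One common way to put a number on this ignorance, which is my own choice of illustration and not something the argument depends on, is the binary entropy of a yes/no forecast. It measures in bits how uncertain the outcome is. The sketch below shows that it is largest at 50% and falls to zero as the forecast approaches certainty.

```python
import math

def binary_entropy(p):
    """Uncertainty, in bits, of a yes/no event with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# A 50% forecast is the most ignorant one; more extreme forecasts are less so.
for p in (0.5, 0.7, 0.9, 0.99, 1.0):
    print(f"forecast {p:.0%}: {binary_entropy(p):.3f} bits of uncertainty")
```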

Even famous scientists use the language of probability for deceptive purposes. The Astronomer Royal, Prof Martin Rees, caused a media sensation when he claimed in his book that humanity has a 50% chance of extinction in this century. For most people, that conjures up a mental image of flipping a coin – heads: humanity lives, tails: humanity dies out. But it is not really like that. Most events in real life happen only once – and the 21st century will only happen once. There is no meaningful analogy to flipping a coin. And Martin Rees is not privy to any secret knowledge about the dangers of human extinction; millions of other educated people have studied them too. If he wanted to speak honestly, he would have said “I think there are real dangers facing us in the near future. I don’t know the future better than anyone else, but I feel completely unsure of whether humanity is going to survive this century.”

But what about quantum physics? Is there some level at which events are random in a physical sense, not just a subjective sense?

I studied physics at Oxford, which was a real traditional academic discipline, like almost no other subject. Of course we learned the Copenhagen interpretation of quantum physics, that a system is in a superposition of states until an observer “rolls the dice”, and the system randomly “collapses” into one or other state. This is of course famously illustrated by Schrödinger’s cat, which is killed if a radioactive nucleus decays. The fate of the cat is unknowable in advance, and its death is therefore somehow causeless and meaningless.

We learned that lots of people were once unhappy with this picture of things, and they came up with “hidden variable theories”, to reassure themselves that there was a hidden cause behind the cat’s fate. Einstein was of course one of them: “God does not play dice,” he asserted. And we got a mental image of Einstein in the desert, like some biblical prophet, alone and mocked by his people. We learned that nowadays any self-respecting physicist ignores all that stuff. We got the vague impression that hidden variable theories were disproven by von Neumann. And even if they weren’t disproven, we should ignore them on the principle of Occam’s Razor.

As scientists we were also taught to think of all things objectively – that we would never learn the truth about an object if we failed to remove ourselves from the picture. Although we learned probability theory and the physics of stochastic (random) processes, the understanding that we gained was flawed. Random processes cannot be seen completely objectively, because randomness itself is a subjective experience.

The Copenhagen interpretation inspired the “many worlds” interpretation. In some ways, many worlds is a logical extension of the Copenhagen interpretation. It basically holds that everything in this world is without cause – that for every effect, the opposite effect might just as well have occurred, because both exist in parallel worlds that diverged from each other when the event took place.

All this is vastly removed from our everyday experience. Of course in real life too, many events have an uncertain outcome. I recently founded a start-up company with a couple of others in Berlin. Now, whether a start-up company succeeds or not is uncertain, even to its founders. However, the probability of it succeeding is perhaps conceptually different to the probability of a nucleus decaying. In a nuclear decay, the probability of decay in a given period is the same for all observers with access to a table of half-lives. However, in judging whether a company will succeed, not all observers are equal. My co-founders, investors and I have privileged information. Other industry insiders and competitors may have other privileged information. Each observer’s understanding and judgement of the company, the service, the competitive situation and the industry varies immensely. Every observer may assign his own subjective probability of the company succeeding. Several years from now, when the fate of the company is clear, we may retrospectively analyse the history of the company and understand why it succeeded or failed. So at no point is there a roll of the dice. Eventually when we look back, we will understand the fate of the company, and conclude that given the environment and our actions, there could not have been the opposite outcome. The universe never divides into two parallel universes. There is only the gradual transition from unknown to known.
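For the nuclear case, the observer-independent probability follows directly from the half-life: the chance that a given nucleus has decayed after a time t is 1 - 2^(-t / half-life). A minimal sketch, using carbon-14 (half-life of roughly 5730 years) purely as an example:

```python
def decay_probability(elapsed, half_life):
    """Probability that a single nucleus has decayed after `elapsed` time.

    `elapsed` and `half_life` must be in the same units.
    """
    return 1.0 - 2.0 ** (-elapsed / half_life)

C14_HALF_LIFE_YEARS = 5730.0  # approximate textbook value

# Anyone with the same table of half-lives computes the same probabilities.
for years in (100.0, 1000.0, 5730.0, 20000.0):
    p = decay_probability(years, C14_HALF_LIFE_YEARS)
    print(f"after {years:,.0f} years: {p:.3f}")
```

Nothing comparable exists for the start-up: there is no shared table from which every observer would read off the same number.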

Randomness does not mean uncaused (an event can be unpredictable, but still have causes) – the de Broglie-Bohm theory and Grete Hermann

I believe that the understanding of probability and randomness which is normally taught to scientists is actually rather shallow. I think a much deeper view is that offered by Bayesian statistics: randomness is a property of the observer (ignorance), not a property of the system. When I first heard about Bayes, I thought that quantum mechanics must be the exception. So, shortly after I left academic science, I was fascinated to discover that there was an accepted interpretation of quantum mechanics – the de Broglie-Bohm interpretation – in which probability is once again a property of the observer and not of the system, and therefore compatible with Bayesian probability theory.

My understanding of de Broglie-Bohm is that a particle has both a definite position and a definite momentum. However its trajectory is determined not by Newton’s equation of motion, but by an equation of motion that depends on the wavefunction. The de Broglie-Bohm theory of the double slit experiment can be appreciated in a single figure:

The double-slit diffraction pattern according to the de Broglie-Bohm interpretation. The particle always has a definite position and momentum, and moves deterministically according to a function of the wave function. However its initial conditions are unknowable.

Thus the “randomness” comes from the initial conditions, not from collapse of the wavefunction. It can therefore be called a “non-local” hidden variable theory and, unlike the “local” hidden variable theories, it is not disproven by Bell’s theorem. In other words, there’s no getting away from the wavefunction, but the wavefunction is deterministic: there’s no roll of dice.

But the trouble with the de Broglie-Bohm theory is that, even if it is formally just a flavour of canonical quantum mechanics, it’s mathematically rather complicated, and therefore fails the test of Occam’s razor. When I cast my mind back to the electron-in-trap or the hydrogen atom, I wouldn’t dream of applying the de Broglie-Bohm theory. It’s just a more complicated version of the same quantum mechanics.

Recently I heard of Grete Hermann’s work. In 1935, Hermann pointed out a flaw in von Neumann’s proof against hidden variables, as Bell did some 30 years later, although her work remained obscure. Hermann also argued that although hidden variable theories are not forbidden, they are actually unnecessary for causality in quantum mechanics. The key is to distinguish between two concepts when we talk about randomness: causation and prediction. So it was impossible to predict the fate of Schrödinger’s cat, because that depended on the decay of a nucleus, which was physically unpredictable. But that doesn’t mean it was not caused: on the contrary, in quantum mechanics, the measurement itself gives us the deterministic cause. It is always possible to explain an event with classical physics after the fact, even if it is impossible before the fact.


1. Life is both random (hard for us to predict) and destined (there is no effect without cause)

I must admit that I have an ideological agenda in this. I want to believe that life is not “random”. All events in a person’s life happen only once: the fact that we can’t predict the future is because of our ignorance, not because the future is meaningless. And although we can learn a lot from statistics of a population, it does not show us the secrets of our own lives.

2. Proving the individuality of popcorn

All this gave me inspiration for an experiment to follow “popcorn decay”. Instead of showing that popcorn popping (like nuclear decay) is random, I would like to investigate whether popcorn popping can be predicted. Using as reproducible a heating method as I can find, and a single variety of popcorn from a single batch, I’ll measure everything that’s easy to measure about each kernel before popping it (weight, dimensions, density, color, defects, etc.) and see how much of the variation in popping time can be explained by these parameters, using simple machine learning, i.e. multivariate linear regression.
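To make the planned analysis concrete, here is a minimal sketch of multivariate linear regression in plain Python. Everything in the data is invented for illustration: the two features (weight and length), the "true" relationship and the noise level are assumptions, not measurements. The quantity of interest is R², the fraction of the variation in popping time explained by the kernel measurements.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_regression(features, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def r_squared(features, targets, coefs):
    """Fraction of the variance in targets explained by the fitted model."""
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Synthetic kernels: weight in grams, length in millimetres (made-up ranges).
rng = random.Random(0)
kernels = [(rng.uniform(0.10, 0.20), rng.uniform(5.0, 8.0)) for _ in range(200)]
# Made-up relationship: popping time in seconds, plus unexplained noise.
pop_times = [60.0 + 100.0 * w - 2.0 * l + rng.gauss(0.0, 2.0) for w, l in kernels]

coefs = fit_linear_regression(kernels, pop_times)
r2 = r_squared(kernels, pop_times, coefs)
print("intercept, weight coef, length coef:", [round(c, 2) for c in coefs])
print("share of variation explained (R^2):", round(r2, 2))
```

In the real experiment the unexplained remainder, 1 - R², would be the interesting number: it is the part of each kernel's fate that the easy measurements fail to predict.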

Afterword: the meaning of a popcorn’s life

A criticism of all this discussion, normally given to me by scientists, is that all this is of no practical value. Popcorn individuality is therefore to be scorned in the same way that scientists are supposed to scorn all discussions of philosophy or faith. Belief in “truth” is useless – all we need is a practical theory, not “truth”.

Regarding the practical point of view, I have an additional perspective. I have been out of academic science now for 6 years, and I’m a start-up entrepreneur. I can say from experience that belief is very important practically for an entrepreneur: I have to have complete faith in the purpose of what I’m doing, because if I don’t, nobody else will, not investors, not colleagues, not customers. But isn’t doing science also an enterprise? Looking back, I feel that doing science requires conviction, even if the character of natural science itself is only to offer observable explanations, and not a deeper truth.