The image above (click on the image for full size or go here to explore the network) is a network visualisation of the energy sector in France. Or at least so claims Greenpeace France. It looks nice and it looks plausible, but methodologically it is poorly done and therefore creates the impression of a network where there may be none.
If you look into the underlying data (Excel file), you realise that this is what we call an affiliation network, two-mode network or bipartite network in academia (I’m analysing this type of network for my PhD). This affiliation network was transformed into a person-to-person network by assuming that joint affiliation to the same organisation creates a tie.
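To make that transformation concrete, here is a minimal sketch of such a projection in plain Python. The names and organisations are invented placeholders, not rows from the Greenpeace file, and the function is my own illustration rather than whatever tool Greenpeace actually used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical affiliation records: (person, organisation).
affiliations = [
    ("Person A", "Board X"),
    ("Person B", "Board X"),
    ("Person A", "School Y"),
    ("Person B", "School Y"),
    ("Person C", "School Y"),
]

def project_to_people(records):
    """Map each pair of people to the set of organisations they share."""
    members = defaultdict(set)
    for person, org in records:
        members[org].add(person)
    shared = defaultdict(set)
    for org, people in members.items():
        # Every pair of members of the same organisation becomes a tie.
        for p, q in combinations(sorted(people), 2):
            shared[frozenset((p, q))].add(org)
    return dict(shared)

ties = project_to_people(affiliations)
for pair, orgs in ties.items():
    print(sorted(pair), "share", sorted(orgs))
```

Note that the projection keeps no information about what kind of organisation produced a tie or when the memberships took place, which is exactly where the trouble starts.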
The problem is that (1) the underlying data is poor, (2) the affiliations are inconsistent and temporally undefined or only vaguely defined, and (3) the interpretation of a single joint affiliation as a social tie is more than problematic. This triple flaw in the construction of the network you see above makes the result unusable for any socially relevant interpretation, yet the main reason to analyse or visualise network data is precisely to draw conclusions about social reality, for instance to identify positions of power or cooperating cliques.
I have to go into more detail to show why it is impossible to do so with this data:
1) Data quality
Let’s say that this is the least important flaw. The affiliation information seems to be based mainly on Wikipedia and company websites. However, the quality of these sources is far from perfect, and sources are not given for every affiliation of every person in the list.
2) Affiliation consistency & temporal definition
When you look into the data, you will see that for Greenpeace an affiliation can be anything from having attended the same school, being in the leadership of a big international company or being employed by a certain research facility to political party membership or membership (?) in such obscure groups as Bilderberg. It is unclear from the data whether joint affiliation to these organisations actually means that the people have ever met (in that context) or whether they just happen to be in the same organisation.
Being in the same organisation also does not necessarily imply a positive relationship: two members of the same political party may be fierce opponents, and a relation generated from co-affiliation would not mean much. What is also problematic is that the affiliations have no time stamps. Having been at ENA may be relevant in French politics, but two people who passed through ENA ten years apart may not draw any social tie from having done so (unless there is clear evidence for that). Two people may have participated in the famous Bilderberg meetings, but if they did not attend in the same year it is very likely that this would not connect them at all. And even if two people were on the board of the same organisation within the last five years, this still does not mean they are actually related.
I was on the Board of Trustees of a university but I hardly knew my fellow board members because the board only met 3-4 times during the one year of my membership. Without knowledge on the frequency and depth of interaction, board co-affiliation is only a weak indicator (if at all) for a social tie, especially if the board membership has not been at the same time.
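If the data had time stamps, a first sanity check would be to keep only co-affiliations that overlap in time. The following is a rough sketch under that assumption. The year ranges are made up for illustration (the Greenpeace file contains no such columns), and `overlap_years` and `concurrent_ties` are hypothetical helper names.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (person, organisation, first_year, last_year).
memberships = [
    ("Person A", "ENA", 1978, 1980),
    ("Person B", "ENA", 1988, 1990),
    ("Person C", "ENA", 1979, 1981),
]

def overlap_years(a_start, a_end, b_start, b_end):
    """Number of calendar years two inclusive year ranges share (0 if none)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)

def concurrent_ties(records):
    """Keep a co-affiliation only if the two memberships overlap in time."""
    by_org = defaultdict(list)
    for person, org, start, end in records:
        by_org[org].append((person, start, end))
    ties = {}
    for org, rows in by_org.items():
        for (p, ps, pe), (q, qs, qe) in combinations(rows, 2):
            years = overlap_years(ps, pe, qs, qe)
            if p != q and years > 0:
                key = (frozenset((p, q)), org)
                ties[key] = ties.get(key, 0) + years
    return ties

for (pair, org), years in concurrent_ties(memberships).items():
    print(sorted(pair), "overlapped at", org, "for", years, "year(s)")
```

Even a surviving tie here only says that two people were in the same place at the same time. As my own board experience shows, it says nothing about how often or how closely they interacted.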
3) One co-affiliation equals a tie
If I’m reading the data correctly, Greenpeace has constructed the network in such a way that any single co-affiliation constitutes a tie in the network image they portray. Given the weakness of affiliation ties, the minimum requirement should have been at least two or three joint affiliations before assuming an actual tie (unless there is strong evidence supporting the strength of a single affiliation tie).
Because of this loose criterion, the network image is probably far too dense, with far too many realised relations to reflect real-world network structures. If even half of the relations turned out to indicate so much as a weak connection (i.e. people being barely acquainted with each other), this would probably be a lot already.
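The effect of the tie criterion on density is easy to demonstrate. The sketch below uses four invented people and two invented organisations and compares the density of the projected network when one shared affiliation is enough with the density when at least two are required. The `min_shared` parameter is my own illustrative name, and the numbers say nothing about the actual Greenpeace data.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical affiliation records: (person, organisation).
affiliations = [
    ("Person A", "Party P"), ("Person B", "Party P"),
    ("Person C", "Party P"), ("Person D", "Party P"),
    ("Person A", "Board X"), ("Person B", "Board X"),
]

def shared_counts(records):
    """Count how many organisations each pair of people has in common."""
    members = defaultdict(set)
    for person, org in records:
        members[org].add(person)
    counts = Counter()
    for people in members.values():
        for pair in combinations(sorted(people), 2):
            counts[pair] += 1
    return counts

def density(records, min_shared=1):
    """Share of all possible pairs that count as tied under the threshold."""
    people = {person for person, _ in records}
    possible = len(people) * (len(people) - 1) // 2
    kept = sum(1 for c in shared_counts(records).values() if c >= min_shared)
    return kept / possible

print("one shared affiliation is enough:", density(affiliations, min_shared=1))
print("at least two required:", density(affiliations, min_shared=2))
```

In this toy example a single large organisation is enough to connect everybody with everybody, while requiring two shared affiliations leaves only one tie out of six possible ones.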
Given that the network visualisation is based on imperfect data sources, inconsistent and temporally unconfirmed affiliations, and assumptions about the existence of social ties that over-interpret the co-affiliations found, the resulting network is inconclusive at best and completely misleading at worst.
And even worse: as far as I can see, there is no interpretation of what this network is actually supposed to mean. Does it mean influence on energy policy in France? Does it show who is well-connected with whom? Does it tell who is better informed about the latest developments in French energy policy making? Can you predict who will become the next president, the next leader of an energy company or the next host of an amazing dinner night?
Is a person who connects a political party and a research institution in any way more powerful than somebody who connects McKinsey and Dexia? Is it helpful to be connected to Sarkozy through a board where he may actually never show up?
Without any pre-formulated expectation on how the network actually should function – what social results are expected from the structure – this is just playing around with data without any use. Network analysis and network visualisation should not be used just because it is possible. It should be used to answer questions that cannot be answered differently.
If network analysis and visualisation are used, the data sources, transformation and interpretation should be handled in such a way that the resulting network structure actually represents a social reality and not just some indication of similarity without any significance for social reality. This has clearly not been done by Greenpeace France.
Update: Eric has pointed me to this less critical yet also not fully convinced French blog post.
Image: Greenpeace | BY-NC-SA