The ggnetwork
package provides a way to build network plots with ggplot2.
Install the stable version from CRAN:
install.packages("ggnetwork")
Or use remotes to install the latest version of the
package from
GitHub:
remotes::install_github("briatte/ggnetwork")
The package is meant to be used with ggplot2 version
2.0.0 or above, so make sure that you update your version of
ggplot2 from CRAN before using ggnetwork:
install.packages("ggplot2")
library(ggplot2)
ggnetwork further requires the network and
sna packages for network manipulation, and will also
install the ggrepel
package for repulsive label drawing.
The ggnetwork package is closely related to the
development of geom_net by
Samantha C. Tyner and Heike Hofmann. It also shares some
similarities with the ggnet and
ggnet2 functions, which are part of the GGally package
by Barret Schloerke and others. Each of these projects is an extension of
Hadley Wickham’s implementation of Leland Wilkinson’s “grammar of
graphics” in ggplot2.
Minimal example
Let’s define a small random graph to illustrate each component of
ggnetwork:
library(network)
library(sna)
n <- network(rgraph(10, tprob = 0.2), directed = FALSE)
Let’s now add categorical and continuous attributes for both edges
and vertices. We’ll start with nodes, adding a categorical vertex
attribute called "family", which is set to either
"a", "b" or "c", and a continuous
vertex attribute called "importance", which is set to
either 1, 2 or 3.
n %v% "family" <- sample(letters[1:3], 10, replace = TRUE)
n %v% "importance" <- sample(1:3, 10, replace = TRUE)
We now add a categorical edge attribute called "type",
which is set to either "x", "y" or
"z", and a continuous edge attribute called
"day", which is set to either 1, 2 or 3.
e <- network.edgecount(n)
set.edge.attribute(n, "type", sample(letters[24:26], e, replace = TRUE))
set.edge.attribute(n, "day", sample(1:3, e, replace = TRUE))
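As a quick sanity check, the attribute names stored in the network can be listed with list.vertex.attributes and list.edge.attributes from the network package. The sketch below rebuilds the example network from scratch; the call to set.seed is an assumption added only to make it reproducible, and the variable names vertex_attrs and edge_attrs are illustrative.

```r
library(network)
library(sna)

set.seed(1)  # assumption: added only for reproducibility

# rebuild the example network and its attributes
n <- network(rgraph(10, tprob = 0.2), directed = FALSE)
n %v% "family" <- sample(letters[1:3], 10, replace = TRUE)
n %v% "importance" <- sample(1:3, 10, replace = TRUE)
e <- network.edgecount(n)
set.edge.attribute(n, "type", sample(letters[24:26], e, replace = TRUE))
set.edge.attribute(n, "day", sample(1:3, e, replace = TRUE))

# names of the attributes attached to the nodes and to the edges
vertex_attrs <- list.vertex.attributes(n)
edge_attrs <- list.edge.attributes(n)
print(vertex_attrs)
print(edge_attrs)
```

The listings may also include attributes that the network package manages itself, in addition to the four attributes defined here.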
Last, note that ggnetwork contains a “blank” plot theme
that will avoid plotting axes on the sides of the network. We will use
that theme in most of the plots:
theme_blank
## function (base_size = 12, base_family = "", ...) 
## {
##     ggplot2::theme_bw(base_size = base_size, base_family = base_family) + 
##         ggplot2::theme(axis.text = ggplot2::element_blank(), 
##             axis.ticks = ggplot2::element_blank(), axis.title = ggplot2::element_blank(), 
##             legend.key = ggplot2::element_blank(), panel.background = ggplot2::element_rect(fill = "white", 
##                 colour = NA), panel.border = ggplot2::element_blank(), 
##             panel.grid = ggplot2::element_blank(), ...)
## }
## <bytecode: 0x7fd604d4bb18>
## <environment: namespace:ggnetwork>
 
Main building blocks
ggnetwork
The ggnetwork package is organised around a ‘workhorse’
function of the same name, which will ‘flatten’ the network object to a
data frame that contains the edge list of the network, along with the
edge attributes and the vertex attributes of the sender nodes.
The network object referred to above might be an object of class
network, or any data structure that can be coerced to it,
such as an edge list, an adjacency matrix or an incidence matrix. If the
intergraph package is installed, then objects of class
igraph can also be used with the ggnetwork
package.
The data frame returned by ggnetwork also contains the
coordinates needed for node placement as columns "x",
"y", "xend" and "yend", which as
a consequence are “reserved” names in the context of
ggnetwork. If these names show up in the edge or the vertex
attributes, the function will simply fail to work.
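One way to guard against this, sketched here under the assumption that renaming the attribute is acceptable, is to compare the attribute names against the reserved names before calling ggnetwork. The network net, the attribute name "x_value" and the helper variables are hypothetical and exist only for illustration.

```r
library(network)

# a small illustrative network with one problematic vertex attribute
m <- matrix(0, 3, 3)
m[1, 2] <- m[2, 3] <- 1
net <- network(m, directed = TRUE)
net %v% "x" <- c(10, 20, 30)  # clashes with a reserved name

# names that ggnetwork reserves for plot coordinates
reserved <- c("x", "y", "xend", "yend")

# look for reserved names among the vertex and edge attributes
attrs <- c(list.vertex.attributes(net), list.edge.attributes(net))
clashes <- intersect(attrs, reserved)
print(clashes)

# copy the offending attribute under a new name, then drop the old one
net %v% "x_value" <- net %v% "x"
delete.vertex.attribute(net, "x")

clashes_after <- intersect(list.vertex.attributes(net), reserved)
print(clashes_after)
```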
The default node placement algorithm used by ggnetwork
to produce these coordinates is the Fruchterman-Reingold force-directed
layout algorithm. All of the placement
algorithms implemented in the sna package are available
through ggnetwork, which also accepts additional layout
parameters:
ggnetwork(n, layout = "fruchtermanreingold", cell.jitter = 0.75)
ggnetwork(n, layout = "target", niter = 100)
The layout argument will also accept user-submitted
coordinates as a two-column matrix with as many rows as the number of
nodes in the network.
The top of the data frame produced by ggnetwork contains
the actual edges of the network. This explains why the rows shown below
have different values in "x" and
"xend" (and in "y" and "yend"),
and non-missing values in the columns corresponding to the edge
attributes:
head(ggnetwork(n))
##            x         y family importance vertex.names      xend      yend day
## 2 0.06153741 0.7570453      a          2            3 0.0000000 1.0000000   1
## 3 0.08938645 0.5007880      a          2            9 0.0000000 1.0000000   3
## 4 0.08938645 0.5007880      a          2            9 0.5949937 0.6871140   3
## 5 0.08938645 0.5007880      a          2            9 0.4826225 0.4549953   2
## 6 0.08938645 0.5007880      a          2            9 0.4988651 0.1265779   2
## 8 0.48262251 0.4549953      c          1            6 0.1570683 0.0000000   1
##   type
## 2    z
## 3    y
## 4    y
## 5    z
## 6    x
## 8    y
The last rows of the data frame contain self-loops that force every
node to be included in the plot, which is why they have the same values
in "x" and "xend" (and in "y" and "yend"), and only missing values in
the columns corresponding to the edge attributes:
tail(ggnetwork(n))
##              x         y family importance vertex.names       xend      yend
## 51  0.00000000 0.7370770      a          3            5 0.00000000 0.7370770
## 61  0.38729137 0.5340420      c          1            6 0.38729137 0.5340420
## 7   0.56327197 0.3433634      b          3            7 0.56327197 0.3433634
## 81  0.62924005 0.7626922      a          1            8 0.62924005 0.7626922
## 91  0.01230182 0.3354880      a          2            9 0.01230182 0.3354880
## 101 0.86944917 0.1852995      c          1           10 0.86944917 0.1852995
##     day type
## 51   NA <NA>
## 61   NA <NA>
## 7    NA <NA>
## 81   NA <NA>
## 91   NA <NA>
## 101  NA <NA>
The data frame returned by ggnetwork has (N +
E) rows, where N is the number of nodes of the
network, and E its number of edges. This data format is very
likely to include duplicate information about the nodes, which is
unavoidable.
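The row count can be compared against N and E directly. The sketch below uses a small hypothetical network (a directed path over four nodes) rather than the random graph above, and identifies the per-node rows as those whose start and end coordinates coincide; it is meant as a check to run, not as a statement of what ggnetwork guarantees in every case.

```r
library(network)
library(sna)
library(ggnetwork)

# a directed path over four nodes: 1 -> 2 -> 3 -> 4
m <- matrix(0, 4, 4)
m[1, 2] <- m[2, 3] <- m[3, 4] <- 1
net <- network(m, directed = TRUE)

df <- ggnetwork(net)

N <- network.size(net)       # number of nodes
E <- network.edgecount(net)  # number of edges
print(c(rows = nrow(df), nodes = N, edges = E))

# rows that start and end at the same point are the per-node rows
node_rows <- df[df$x == df$xend & df$y == df$yend, ]
print(nrow(node_rows))
```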
Note that ggnetwork does not include any safety
mechanism against duplicate column names. As a consequence, if there is
both a vertex attribute called "na" and an edge attribute
called "na", as in the example above, then the vertex
attribute will be renamed "na.x" and the edge attribute
will be renamed "na.y".
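Such clashes can be detected before plotting by intersecting the vertex and edge attribute names. The following sketch assumes a hypothetical network in which both nodes and edges carry an attribute called "group", and renames the edge attribute to the illustrative name "edge_group".

```r
library(network)

# a small illustrative network
m <- matrix(0, 4, 4)
m[1, 2] <- m[2, 3] <- m[3, 4] <- 1
net <- network(m, directed = TRUE)

# the same attribute name used for both nodes and edges
net %v% "group" <- c("a", "a", "b", "b")
net %e% "group" <- c("x", "y", "z")

# attribute names shared by nodes and edges
shared <- intersect(list.vertex.attributes(net), list.edge.attributes(net))
print(shared)

# give the edge attribute a distinct name, then drop the old one
net %e% "edge_group" <- net %e% "group"
delete.edge.attribute(net, "group")

shared_after <- intersect(list.vertex.attributes(net),
                          list.edge.attributes(net))
print(shared_after)
```

The listing may also show names that the network package attaches to both nodes and edges on its own.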
 
fortify.network and fortify.igraph
The ‘flattening’ process described above is implemented by
ggnetwork as fortify methods that are
recognised by ggplot2. As a result, ggplot2
will understand the following syntax as long as n is an
object of class network or of class
igraph:
ggplot(n)
However, if the object n is a matrix or an edge list to
be coerced to a network object, you are required to use the
ggnetwork function to pass the object to
ggplot2:
ggplot(ggnetwork(n))
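As a minimal sketch of that second case, the code below starts from a plain adjacency matrix and lets ggnetwork handle the coercion. The matrix m and the call to set.seed are assumptions added for illustration and reproducibility.

```r
library(sna)
library(ggplot2)
library(ggnetwork)

set.seed(2)  # assumption: added only for reproducibility

# a plain adjacency matrix, not yet a network object
m <- rgraph(10, tprob = 0.2)
print(class(m))

# ggnetwork coerces the matrix and returns a data frame for ggplot2
p <- ggplot(ggnetwork(m), aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "grey50") +
  geom_nodes(size = 3) +
  theme_blank()
```

Printing p draws the plot.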
 
geom_edges
Let’s now draw the network edges using geom_edges, which
is just a lightly hacked version of geom_segment. In the
example below, we map the type edge attribute to the
linetype of the network edges:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey50") +
  theme_blank()

The other aesthetics that we mapped are the basic coordinates of the
network plot. These might also be set as part of the call to
geom_edges, but setting them at the root of the plot
avoids having to repeat them in additional geoms.
Note that geom_edges can also produce curved edges by
setting its curvature argument to any value above
0 (the default):
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey50", curvature = 0.1) +
  theme_blank()

 
geom_nodes
Let’s now draw the nodes using geom_nodes, which is just
a lightly hacked version of geom_point. In the example
below, we map the family vertex attribute to the color of
the nodes, and make the size of these nodes proportional to the
importance vertex attribute:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey50") +
  geom_nodes(aes(color = family, size = importance)) +
  theme_blank()

Because ggplot2 follows Wilkinson’s grammar of graphics,
it accepts only one color scale. In the example above, that scale is
mapped to a vertex attribute, but it could have also been mapped to an
edge attribute. Mapping a color to both a vertex attribute and
an edge attribute will create a single color scale that incorrectly
merges both attributes into one:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(color = type)) +
  geom_nodes(aes(color = family)) +
  theme_blank()
This is a limitation of ggnetwork that would require
violating some fundamental aspects of the grammar of graphics to be
circumvented.
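A common ggplot2 workaround, which is not part of the text above and is offered only as a sketch, is to keep the color scale for the edges and to map the vertex attribute to the fill aesthetic instead, using a point shape that has a fill (shape 21). This assumes that geom_nodes accepts the same shape and fill settings as geom_point; the network is rebuilt here so that the example is self-contained.

```r
library(network)
library(sna)
library(ggplot2)
library(ggnetwork)

set.seed(3)  # assumption: added only for reproducibility

# rebuild a small network with one vertex and one edge attribute
n <- network(rgraph(10, tprob = 0.2), directed = FALSE)
n %v% "family" <- sample(letters[1:3], 10, replace = TRUE)
e <- network.edgecount(n)
set.edge.attribute(n, "type", sample(letters[24:26], e, replace = TRUE))

# edges use the color scale, nodes use the fill scale
p <- ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(color = type)) +
  geom_nodes(aes(fill = family), shape = 21, size = 6, color = "black") +
  theme_blank()
```

Printing p should produce two separate legends, one for type and one for family.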
 
 
More building blocks
geom_nodetext
Let’s now add node labels. These are simply plotted over the nodes by
the nodetext geom, which works exactly like
geom_text. In the example below, we map the
vertex.names attribute (which contains numbers 1 to 10) to
uppercase letters:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "black") +
  geom_nodes(color = "black", size = 8) +
  geom_nodetext(aes(color = family, label = LETTERS[ vertex.names ]),
                fontface = "bold") +
  theme_blank()

 
geom_nodelabel
If you prefer to use the geom_label geom recently
introduced in ggplot2, ggnetwork also supports
it through the nodelabel geom:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "black") +
  geom_nodelabel(aes(color = family, label = LETTERS[ vertex.names ]),
                 fontface = "bold") +
  theme_blank()

 
geom_nodetext_repel and
geom_nodelabel_repel
ggnetwork supports the repulsive label functions
introduced by the ggrepel package, which allow nodes to be
labelled with non-overlapping annotations. Simply add _repel
to either geom_nodetext or geom_nodelabel to
use that functionality:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "black") +
  geom_nodelabel_repel(aes(color = family, label = LETTERS[ vertex.names ]),
                       fontface = "bold", box.padding = unit(1, "lines")) +
  geom_nodes(color = "black", size = 8) +
  theme_blank()

 
geom_edgetext and geom_edgelabel
Let’s now add edge labels. These are plotted midway between the
nodes that each edge connects by the edgetext geom, which
works exactly like geom_label, except that its default
arguments do not draw a border around the labels. Here’s an example
where we map the day edge attribute to edge labels:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey75") +
  geom_nodes(color = "gold", size = 8) +
  geom_nodetext(aes(label = LETTERS[ vertex.names ])) +
  geom_edgetext(aes(label = day), color = "white", fill = "grey25") +
  theme_minimal() +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        panel.background = element_rect(fill = "grey25"),
        panel.grid = element_blank())

The edgelabel geom is just an alias of the
edgetext geom. Note that these geoms are unlikely to
produce adequate results if the edges produced by
geom_edges are curved.
 
geom_edgetext_repel and
geom_edgelabel_repel
As you would do with nodes, simply add _repel to either
geom_edgetext or geom_edgelabel to draw
repulsive edge labels:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey75") +
  geom_nodes(color = "gold", size = 8) +
  geom_nodetext(aes(label = LETTERS[ vertex.names ])) +
  geom_edgetext_repel(aes(label = day), color = "white", fill = "grey25",
                      box.padding = unit(1, "lines")) +
  theme_minimal() +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        panel.background = element_rect(fill = "grey25"),
        panel.grid = element_blank())

 
 
More plotting parameters
This section presents some rather experimental features of
ggnetwork.
Edge arrows
ggnetwork uses code by Heike Hofmann to better show arrows
in directed graphs. To illustrate this, we will need a directed graph
example, so let’s use the first of the seven emon networks
bundled in the network package:
data(emon)
emon[[1]]
##  Network attributes:
##   vertices = 14 
##   directed = TRUE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   total edges= 83 
##     missing edges= 0 
##     non-missing edges= 83 
## 
##  Vertex attribute names: 
##     Command.Rank.Score Decision.Rank.Score Formalization Location Paid.Staff Sponsorship vertex.names Volunteer.Staff 
## 
##  Edge attribute names: 
##     Frequency
If this network is passed to ggnetwork without any
further plotting parameter, the result will feature “shortened” edges
that do not reach their receiver nodes:

This is because directed networks are expected to be plotted with
edge arrows indicating the directedness of each edge. Adding edge arrows
with geom_edges works through the same call to the
arrow function that is supported by
geom_segment:
ggplot(emon[[1]], aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(arrow = arrow(length = unit(6, "pt"), type = "closed")) +
  geom_nodes(color = "tomato", size = 4) +
  theme_blank()

The slightly shortened edges avoid overplotting the edge arrows and
the nodes. The amount of “edge shortening” can be set through the
arrow.gap parameter of ggnetwork, which
defaults to 0 when the network is undirected and
0.025 when it is directed. This parameter might need adjustment
depending on the size of the nodes, and it will probably not manage to
avoid any overplotting when the size of the nodes is not constant.
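As an illustration of that adjustment, the sketch below draws larger nodes and widens the gap accordingly. The values 0.05 and 8 are arbitrary choices for this example, not recommended settings.

```r
library(network)
library(sna)
library(ggplot2)
library(ggnetwork)

data(emon)

# larger nodes call for a larger gap than the default for directed networks
p <- ggplot(ggnetwork(emon[[1]], arrow.gap = 0.05),
            aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(arrow = arrow(length = unit(6, "pt"), type = "closed")) +
  geom_nodes(color = "tomato", size = 8) +
  theme_blank()
```

Printing p draws the plot; if the arrow heads still disappear under the nodes, a larger arrow.gap value is worth trying.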
 
Edge weights
ggnetwork can use an edge attribute as edge weights when
computing the network layout. The name of that edge attribute should be
passed to the weights argument for that to happen, as in
this example, which will produce different layouts than if
weights had been left set to NULL (the
default):
ggnetwork(emon[[1]], weights = "Frequency")
The Kamada-Kawai layout is one example of a network layout that supports
edge weights. The user should refer to the documentation of each network
layout to understand which of these can make use of edge weights.
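A minimal sketch of that combination, assuming that the layout is requested by the name "kamadakawai" (the name under which the sna package provides it), could look as follows. Whether and how much the weights change the result depends on the layout, so the two data frames are simply computed side by side for comparison.

```r
library(network)
library(sna)
library(ggnetwork)

data(emon)

set.seed(4)  # assumption: added only for reproducibility

# same layout, without and with the "Frequency" edge attribute as weights
unweighted <- ggnetwork(emon[[1]], layout = "kamadakawai")
weighted <- ggnetwork(emon[[1]], layout = "kamadakawai",
                      weights = "Frequency")

# inspect the coordinates produced by each call
print(head(unweighted[, c("x", "y", "xend", "yend")]))
print(head(weighted[, c("x", "y", "xend", "yend")]))
```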
If ggnetwork finds duplicated edges in a network, it
will return a warning, as these edges should probably have been
converted to single weighted edges for adequate plotting.
 
Node faceting
In order for ggnetwork to operate correctly with faceted
plots, the by argument, which is NULL by
default, can be set to the name of an edge attribute. The result will be
a longer data frame that can be plotted with either
facet_wrap or facet_grid, as in the example
below, where the faceting variable, the Frequency edge
attribute, has to be specified twice (once to ggnetwork,
once to facet_wrap):
ggplot(ggnetwork(emon[[1]], arrow.gap = 0.04, by = "Frequency"),
       aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(arrow = arrow(length = unit(6, "pt"), type = "closed"),
             aes(color = Sponsorship)) +
  geom_nodes(aes(color = Sponsorship), size = 4) +
  facet_wrap(~ Frequency) +
  theme_facet()

The by argument is basically an attempt to bring minimal
support for temporal networks in ggnetwork. It will
systematically show all nodes in all plot facets, using the same
coordinates in every facet. For more advanced plots of dynamic networks,
the user should turn to the ndtv and
tsna
packages.
The example above also shows how to use a vertex attribute as part of
the aesthetics of the edges. Given how ggnetwork operates,
these vertex attributes will always be those of the sender node.
Last, the example also shows that ggnetwork comes with a
theme called theme_facet. This theme is a variation of the
previously mentioned theme_blank that preserves its facet
boxes:
theme_facet
## function (base_size = 12, base_family = "", ...) 
## {
##     theme_blank(base_size = base_size, base_family = base_family) + 
##         ggplot2::theme(panel.border = ggplot2::element_rect(fill = NA, 
##             color = "grey50"), ...)
## }
## <bytecode: 0x7fd60b495920>
## <environment: namespace:ggnetwork>
 
 
Additional methods
Since ggnetwork works entirely through
ggplot2, all ggplot2 methods apply:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(aes(linetype = type), color = "grey50") +
  geom_nodes(aes(x, y, color = family, size = 1.5 * importance)) +
  geom_nodetext(aes(label = LETTERS[ vertex.names ], size = 0.5 * importance)) +
  geom_edgetext(aes(label = day), color = "grey25") +
  scale_color_brewer(palette = "Set2") +
  scale_size_area("importance", breaks = 1:3, max_size = 9) +
  theme_blank()

Similarly, it is possible to use any of the geometries more than once
per plot:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "grey50", alpha = 0.5) +
  geom_nodes(aes(x, y, color = family, size = 5.5 * importance), alpha = 0.5) +
  geom_nodes(aes(x, y, color = family, size = 1.5 * importance)) +
  scale_color_brewer(palette = "Set1") +
  guides(size = "none") +
  theme_blank()

Last, all geoms provided by ggnetwork can be subsetted
through the data argument, just like any ggplot2
geom. The example below draws only a subset of all node
labels:
ggplot(n, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges(color = "grey50", alpha = 0.5) +
  geom_nodes(aes(x, y, color = family), size = 3) +
  geom_nodelabel_repel(aes(label = vertex.names),
                       box.padding = unit(1, "lines"),
                       data = function(x) { x[ x$family == "a", ]}) +
  scale_color_brewer(palette = "Set1") +
  theme_blank()

Last printed on Feb 14, 2024, using ggnetwork version 0.5.13.