Apart from directly browsing and searching the website,
STRING data can also be accessed via a REST-based
API (application programming interface) and via wholesale
data download. With version 10.0, we have introduced
a further option: direct access from the R programming
environment, following the Bioconductor standard (39).
The corresponding package is named STRINGdb (Figure
3), and can be downloaded from the Bioconductor
repository (http://www.bioconductor.org/packages/release/
bioc/html/STRINGdb.html). The package interacts with
the STRING server via the REST API and via additional,
dedicated web services. To optimize the speed of subsequent
accesses, the entire interaction network and associated data
for a given organism are downloaded from the server and
cached locally in the R environment, whenever possible. The
package is built around the igraph framework (40), which
handles the complexity of the network data structures and
provides fast query/analysis functions. Once a network is
loaded/cached into an igraph object, high-level functions
facilitate the most common user tasks, such as mapping protein
names onto their corresponding STRING identifiers,
retrieving the neighbors of a protein of interest, retrieving
PubMed IDs for publications that support a given interaction,
finding clusters of proteins in the network and generating
stable links back to the STRING website.
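
The cache-then-query pattern described above can be illustrated with a minimal, language-neutral sketch. It is written in Python using only the standard library and is not the STRINGdb package's actual R implementation. All names (`load_network`, `build_adjacency`, `neighbors`, `fake_fetch`), the tab-separated cache format and the toy scores are assumptions made for illustration; a plain adjacency map stands in for the igraph object.

```python
import csv
import os
import tempfile
from collections import defaultdict


def load_network(cache_path, fetch):
    """Return edge rows, calling fetch() only when no cached copy exists."""
    if not os.path.exists(cache_path):
        rows = fetch()  # stands in for the (slow) download from the server
        with open(cache_path, "w", newline="") as handle:
            csv.writer(handle, delimiter="\t").writerows(rows)
    with open(cache_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        return [(a, b, int(score)) for a, b, score in reader]


def build_adjacency(edges, min_score=0):
    """Build an undirected adjacency map, keeping edges at or above min_score."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def neighbors(adjacency, protein):
    """Return the sorted interaction partners of one protein."""
    return sorted(adjacency.get(protein, set()))


# Hypothetical toy data: three scored interactions among four proteins.
calls = []


def fake_fetch():
    calls.append(1)  # record each simulated download
    return [("P1", "P2", 900), ("P1", "P3", 450), ("P2", "P4", 700)]


cache_file = os.path.join(tempfile.mkdtemp(), "toy_network.tsv")
edges = load_network(cache_file, fake_fetch)        # first access: fetch and cache
edges_again = load_network(cache_file, fake_fetch)  # later access: cache only
adjacency = build_adjacency(edges_again, min_score=400)
p1_neighbors = neighbors(adjacency, "P1")
```

In this sketch the simulated download runs once; every later call reads the local file, which is the behavior the text attributes to the package's per-organism caching.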
The plot_network function can be used to display a native
STRING network of proteins in R (Figure 3). Functions
are also available to augment a given network with user-provided
node colorings (‘payload information’, see also
Figure 1), such that subsets of proteins can be tagged and
visually highlighted. Statistical enrichment tests can be executed
on gene lists within the STRING namespace, covering
Gene Ontology and pathway annotations, as well as
tissue and disease annotations. Results can be visualized
as lists of enriched terms and/or heatmaps. The R package
proves particularly valuable for users arriving with a very
large set of genes, for which the web-based interface of
STRING has previously been a major bottleneck.
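
To make the idea of a statistical enrichment test on a gene list concrete, the following is a minimal sketch in Python (standard library only). It assumes a one-sided hypergeometric test with Benjamini-Hochberg correction; the text above does not state which test or correction the package applies, so this choice is an assumption. The function names, gene identifiers and term labels are hypothetical toy values, not STRING data.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): n genes drawn from N, of which K are annotated with the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(gene_list, annotations, background):
    """Test every term with at least one hit; return rows sorted by adjusted p."""
    background = set(background)
    genes = set(gene_list) & background
    rows = []
    for term, members in annotations.items():
        members = set(members) & background
        k = len(genes & members)
        if k > 0:
            p = hypergeom_pvalue(k, len(genes), len(members), len(background))
            rows.append((term, k, p))
    adjusted = benjamini_hochberg([p for _, _, p in rows])
    results = [(term, k, p, q) for (term, k, p), q in zip(rows, adjusted)]
    return sorted(results, key=lambda row: row[3])


# Hypothetical toy example: 20 background genes and two annotation terms.
background = [f"g{i}" for i in range(20)]
annotations = {"T1": background[:5], "T2": background[10:]}
results = enrich(["g0", "g1", "g2", "g3", "g10"], annotations, background)
```

Each result row holds the term, the number of hits, the raw p-value and the adjusted p-value, which corresponds to the "list of enriched terms" view mentioned above.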