AdhereR can generate various real-time interactive plots
that allow easy exploration of individual patients, useful both for
research and in clinical practice. These interactive visualizations
use Shiny, which allows intuitive interaction with dynamic plots from a normal web browser (such as Google Chrome or Firefox) on pretty much any platform
(Windows, Linux or macOS on a laptop or desktop, but also Android or
iOS on a smartphone or tablet), without the need to have R installed on
the client (e.g., the smartphone used for the visualization). In fact,
the actual data and the R engine on which AdhereR runs may be hosted on
dedicated hardware and software infrastructure half a world away, with
everything happening transparently (and securely) over the internet. More details about importing data, plotting it and saving the results can be found here.
While interactive visualizations are essential, sometimes we may also want to produce publication-quality plots
of a (group of) patient(s); this is easily done with AdhereR, as shown
by a few examples below (please note that the images themselves are
low-size JPEGs, but clicking on them allows the download of
Use with big databases
AdhereR can process data from a variety of sources, including “flat” files in the CSV (“comma-separated values”, using commas (,) or other separators) format and other file formats (e.g., Excel, Stata, SAS or SPSS) that can be imported into R using various methods (see here or here),
but this is appropriate only when the data is relatively small (say, a
few tens to thousands of records). However, most real-world clinical
data far exceeds these sizes, not to mention that it is usually
structured across more than one “table”, is stored remotely (sometimes,
centrally) and access is strictly controlled to respond to privacy and
AdhereR can use data stored in dedicated RDBMSs (such
as MySQL) using (explicitly or implicitly) SQL, or it can process data
from Apache Hadoop (through HDFS and Map/Reduce).
This means that AdhereR can access vast amounts of data stored
remotely or locally, and it can process it locally (on the client
machine) as well as remotely (on dedicated hardware and software
platforms, such as a heterogeneous computer cluster). For more info,
please see the dedicated vignette.
AdhereR runs efficiently on almost anything
AdhereR is written in “pure” R, and despite various complaints that R is slow, AdhereR’s kernel is heavily optimised (mostly using data.table) and capable of parallel processing. This ensures that AdhereR is actually quite fast
(for example, earlier benchmarks — around 2017 — of version 0.1 on a
Core i7-3770 16 GB RAM desktop computer running Linux with a database
containing 500,000 patients with 4,058,110 events computed CMA1 in about
10 minutes when run in parallel on all 4 physical cores; see here for details).
Another frequently cited limitation of R (and, implicitly, of
AdhereR) is that it can’t process datasets that don’t fit in the
computer’s RAM. However, AdhereR is not affected, because it can
process subsets of the whole dataset individually, sequentially or in parallel. As detailed in this vignette,
the data may be stored in an SQL database, from where the data for
groups of patients are selected, sent to AdhereR for processing, and the
results written back to the database (this can be done in parallel if
multiple cores/CPUs/nodes are available — see the vignette for details).
In this manner, huge amounts of data can be processed by leveraging parallelism on machines with multiple cores/CPUs or even across heterogeneous clusters. Thus, AdhereR runs on anything from an Atom-powered tablet or a consumer-grade laptop to a computer cluster, under Windows, macOS or Linux.
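The select, process and write-back cycle described above can be illustrated with a small, self-contained sketch. It is not AdhereR code and does not use the AdhereR API: it is written in Python with the standard-library sqlite3 module purely to show the pattern. The table names (events, cma_results), the column names, the function names and the simplified CMA1-style ratio (days supplied by all but the last event, divided by the days between the first and last event) are all assumptions made for this illustration.

```python
import sqlite3
from datetime import date


def cma1_like(events):
    """Simplified CMA1-style ratio for one patient (illustrative stand-in only).

    events: list of (iso_date, duration_days) tuples sorted by date.
    Returns None when fewer than two events exist or the span is zero.
    """
    if len(events) < 2:
        return None
    first = date.fromisoformat(events[0][0])
    last = date.fromisoformat(events[-1][0])
    span = (last - first).days
    if span <= 0:
        return None
    supplied = sum(duration for _, duration in events[:-1])
    return supplied / span


def process_in_chunks(conn, chunk_size=2):
    """Process patients a few at a time so only one chunk is held in memory."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cma_results "
        "(patient_id INTEGER PRIMARY KEY, cma REAL)"
    )
    # Only the list of patient IDs is loaded up front, not the events themselves.
    ids = [row[0] for row in conn.execute(
        "SELECT DISTINCT patient_id FROM events ORDER BY patient_id")]
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            "SELECT patient_id, event_date, duration FROM events "
            f"WHERE patient_id IN ({placeholders}) "
            "ORDER BY patient_id, event_date",
            chunk,
        ).fetchall()
        # Group the selected events by patient.
        per_patient = {}
        for patient_id, event_date, duration in rows:
            per_patient.setdefault(patient_id, []).append((event_date, duration))
        # Compute the estimate for each patient and write the chunk's results back.
        results = [(pid, cma1_like(ev)) for pid, ev in per_patient.items()]
        conn.executemany(
            "INSERT OR REPLACE INTO cma_results VALUES (?, ?)", results)
        conn.commit()


def demo():
    """Build a tiny in-memory database and run the chunked computation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events "
        "(patient_id INTEGER, event_date TEXT, duration INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
        (1, "2020-01-01", 30), (1, "2020-01-31", 30), (1, "2020-03-01", 30),
        (2, "2020-01-01", 30), (2, "2020-03-01", 30),
        (3, "2020-01-01", 30),
    ])
    process_in_chunks(conn, chunk_size=2)
    return dict(conn.execute(
        "SELECT patient_id, cma FROM cma_results ORDER BY patient_id"))


if __name__ == "__main__":
    print(demo())  # {1: 1.0, 2: 0.5, 3: None}
```

Because each chunk is selected, processed and written back independently of the others, the same loop could in principle hand chunks to separate worker processes or nodes; the vignette describes how this is actually done with AdhereR.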
AdhereR is not just for R
While AdhereR is targeted at R and currently only implemented in R
(for several reasons, including its widespread use in research and
business, support for data processing and visualisation, flexibility,
openness and available libraries), we are aware that there exist other programming languages (such as Python or Julia) and statistical platforms (such as SAS or Stata) for which the methods implemented by AdhereR would be useful.
One alternative would be to develop in sync multiple versions of
AdhereR (say, to have one AdherePy, one AdhereJul, one AdhereSAS and one
AdhereSta), but this is a bad idea on several levels (not least to do
with our limited development and testing resources). Therefore, we have
opted for the next best thing, which is to implement a bridging interface that allows other languages and platforms to transparently use AdhereR (including its interactive plotting).
We provide a full implementation for Python 3 (described in this vignette), which consists of a Python module (called “adherer”) exposing a hierarchy of Python classes that mirror the original R classes. The module is smart enough to find (in most cases) by itself where R and AdhereR are installed, to call them with the appropriate parameters, and to interpret and convert the results back to Python, providing a “full Python” experience to the user (with all the gory details hidden in its code). Also, despite some overhead costs related to data conversion and calling R, the bridge is fast enough to allow real data processing and visualisation in a production environment.