Speaking about data visualization is a challenge for me, as I am no specialist in this field. Data visualization is something I do intuitively, as a side aspect of my work. Nonetheless, it’s not insignificant. On the contrary, in some ways it’s a central activity of my routine as a researcher.
For me, data visualization has three aims:
- It helps me to understand the results of my research;
- It helps me to explain my research to the specialized public through papers and conferences;
- It gives publicity to my work.
I work with dynamics simulations of photoexcited molecules. At the end of my calculations for a single research project, I may easily have 5 GB of raw data. That’s roughly equivalent to over 16,000 e-books. How can my coworkers and I analyze such a massive amount of data and make sense of it?
This is the main challenge for data visualization.
In 1976, Arieh Warshel (Nobel Prize in Chemistry, 2013) wrote an amazing paper on dynamics simulations of retinal, which remained the state of the art for almost three decades. There, he showed the following picture:
In this figure, we see the energy and the population of each state of the molecule as a function of time for a single trajectory. This kind of 1-D representation is still very popular, as it illustrates the main aspects of the dynamics quickly and in a straightforward way.
With advances in editing and publishing technologies, especially digital drawing and the introduction of color, style has improved a lot over time, but the basic information is the same as in 1976. We can see this in the next figures, which I took from more recent papers.
Things are evolving in the direction of surface plots thanks to the popularization of color and digital formats (PDF, essentially). Surface graphs like those in the next figure contain more information and are more beautiful, but they aren’t necessarily clearer. It’s a dangerous path, but one worth following anyway.
In dynamics simulations, a natural representation of data is in the format of a movie. Movies can illustrate the time evolution of molecular geometries and other properties. But making a single movie, like the one below (and the one illustrating this post), may cost me a whole afternoon of work, and it is backed by heavy scripting.
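To give an idea of what that scripting can involve, here is a minimal Python sketch. It is not the script behind the movies in this post. It assumes the trajectory is already available in memory as a list of frames, each frame being a list of atoms with Cartesian coordinates in ångström, and it writes them in the multi-frame XYZ format that molecular viewers such as VMD can play back as an animation. The function name `frames_to_xyz`, the `dt_fs` parameter, and the two-frame H2 data are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# One atom: element symbol plus x, y, z coordinates (assumed to be in angstrom).
Atom = Tuple[str, float, float, float]


def frames_to_xyz(frames: Sequence[Sequence[Atom]], dt_fs: float) -> str:
    """Serialize trajectory frames as a single multi-frame XYZ string.

    Each frame becomes one XYZ block: atom count, a comment line holding
    the simulation time, then one line per atom.
    """
    blocks: List[str] = []
    for step, atoms in enumerate(frames):
        lines = [str(len(atoms)), f"t = {step * dt_fs:.1f} fs"]
        for symbol, x, y, z in atoms:
            lines.append(f"{symbol:<2s} {x:12.6f} {y:12.6f} {z:12.6f}")
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"


# Hypothetical two-frame trajectory of a stretching H2 molecule.
demo_frames = [
    [("H", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 0.74)],
    [("H", 0.0, 0.0, 0.0), ("H", 0.0, 0.0, 0.80)],
]
xyz_text = frames_to_xyz(demo_frames, dt_fs=0.5)
```

Writing `xyz_text` to a file gives something a viewer can animate, but this is only the first step. Choosing camera angles, coloring, rendering, and encoding the frames into a video file is where most of the afternoon goes.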
How much effort and time is worth investing in data visualization? Should we reserve a budget for it? Should we buy specialized software or hire professional assistance? Should we, scientists, get some training in the field?
I’m not sure about the answers to these questions, but I know that good data visualization leads to both a better interpretation of the results and a greater research impact.
Professionalizing data visualization may also help us develop new ways to represent our results. Some more elaborate diagram types are mostly unknown in my field. I don’t remember any papers using, for instance, network diagrams, streamgraphs, or heat maps.
We have, however, a media problem.
Scientific journals are kind of stuck in time. After moving to digital formats, they incorporated almost no new technology. Beyond better layout, color pictures, and maybe some hypertext, a scientific paper today is no different from a scientific paper from 100 years ago.
A major academic publisher such as Wiley, for instance, still publishes important journals such as Angewandte Chemie and ChemPhysChem in black and white, unless authors pay a small fortune to get color in the PDF files of their papers.
Today, if you submit a movie file for publication along with your scientific paper, the movie will be relegated to the Supporting Information section of the journal, where it will probably never be watched.
How useful would it be if movies were directly integrated into the papers?
Imagine if, when reading the paper, you could rescale a graph to look closely at a particular feature, or rotate a molecular structure to get a better angle.
How useful would it be if you could choose the units in a table or graph, or simply have access to the data underlying a graph, so you could use them yourself at your convenience?
The technology to do all these “tricks” is easily available. But incorporating them into scientific publications requires some rethinking of our concept of what a scientific publication should be.
The first point is to admit that the printed-paper era is over and that we should definitely move to digital platforms. Second, we should be brave enough to accept the reader as an active player in the analysis of the scientific results, rather than a passive recipient looking at our results exclusively in the way we choose to show them.
There is a revolution in data visualization at our doors, just waiting to be invited to the science party. We need scientists, editors, and publishers to let it in.
- Data visualization is only one of the aspects in which scientific journals could improve. Many others deserve attention too, authorship attribution and peer-review practices among them.
- Interactivity may also be key to making scientific papers less boring.