For Data to Live Long and Prosper

On February 25, the AMS released its new policy on citations for data sources in journal articles. We were all set to tell authors about it when, sadly, far bigger news stole the attention of scientists everywhere. The great creator of Spock, actor Leonard Nimoy, had died. Within two days, the story of data policy had become the story of Star Trek.
“That’s not logical,” you say.
OK, we’re not Vulcan, but even a human can see this. Data. Spock. Now is the time to bring them together.
Masquerading as a half-Vulcan, half-human creature named Spock hurtling through space on both the small and big screens, Nimoy made an improbable (some would say illogically great) impact on society. The tributes following Nimoy’s death last week have spoken of his ability to transcend the seeming limitations of such a curious role. Nimoy embodied racial ambiguity in a time of prejudice, ennobled diplomacy and rationality in an age of war, and gave voice to those who feel alien in their own neighborhoods and schools.
Of all the dualities in Spock’s character (so brilliantly portrayed by an immigrant’s son who skipped college), arguably the most explicit showed in his role as the science officer on the bridge of the “Enterprise.” His struggle to remain true to the Vulcan creed of logic without emotion was a perfect expression of science in its time. For nerds of the 1960s and ’70s, Spock’s reliance on logic echoed the haughty aloofness with which popular culture characterized scientists of the Cold War. But through his formidable devotion to knowledge, truth, and teamwork, working through all the pointy-eared social awkwardness he faced among his crewmates, Spock somehow made science a new kind of “cool” long before geeks made billions of bucks with computers.
The thing is, scientists are a duality, much as Spock and Captain Kirk were two sides of a coin. Scientists get emotional about two things. One is logic. Like mathematicians, they get dewy-eyed about beautiful theories, elegant proofs, and ingenious solutions. The other is data. Unlike Spock, they work themselves into a frenzy over data. The best way to make scientists swoon is to produce data that reveal secrets.
For science to live long and prosper, those data need to be treasured like a home planet. For a long time, most scientific publishers thought it was good enough that journal authors would casually mention data archives in their Acknowledgments. In this age of computer models and constantly updating technology, that’s not good enough. Now authors must use carefully sourced and dated formal citations and references that in turn lead to safeguarded, easily accessible repositories. The author’s guide online gives some helpful examples.
The new citation policy is just one step of many advancing data archive practices that were recommended in the AMS Statement on Full and Open Exchange of Data adopted in December 2013. That statement also calls on funding agencies to recognize the costs of managing data. It recognizes that data preservation and stewardship should be emphasized and discussed at meetings. It says AMS should promote conventions and standards for metadata to increase interoperability and usage, and that the Society should foster ways of deciding what data should be kept to improve preservation practices in the future.
AMS is not alone in this shift. There are others in the chain of research, publication, and archiving trying to do for data what Spock did for logic. Our Society is one of the original members of a year-old team of publishers, data facilities, and consortia called the Coalition on Publishing Data in the Earth and Space Sciences. COPDESS is working to ensure that data are preserved through proper, secure funding, and that careful decisions are made about what should be saved.
Most importantly, this international movement toward protecting and providing data is meant to preserve the scientific process. Science needs published studies to lead to more studies that can confirm or reject findings. According to the AMS Statement,

AMS should strongly encourage an environment in which scholarly papers published in scientific journals contain sufficient detail and references to data and methodology to permit others to test each paper’s scientific conclusions.

All that depends on data being available in the review process as well as in perpetuity, with published results closely aligned with open archives.
Logic and Data: the duality of the scientific spirit. It is easy to celebrate one without the other, but it would not be proper. Spock would understand.