Helping Protists to Find Their Place in a Big Data World

Publication date: 07.02.2014

Acta Protozoologica, 2014, Volume 53, Issue 1, pp. 115 - 128

https://doi.org/10.4467/16890027AP.14.011.1448

Authors

David J. Patterson
School of Life Sciences, Arizona State University, Tempe, Arizona, USA

Abstract

The ‘big new biology’ is a vision of a discipline transformed by a commitment to sharing data and by investigative practices that draw on very large, open pools of freely accessible data. As this data-centric world matures, biologists will be better able to manage the deluge of data arising from digitization programs, governmental mandates for data sharing, and the increasing instrumentation of science. The big new biology will create new opportunities for research and will enable scientists to answer questions that require access to data on a scale not previously possible. Informatics will become the new genomics, and those not participating will become marginalized. If a traditional discipline like protistology is to benefit from this big data world, it must define, build, and populate an appropriate infrastructure. The infrastructure is likely to be modular, with modules (nodes) focusing on needs within defined subject areas and making data available in standard formats through an array of pathways. It is the responsibility of protistologists to build such nodes for their own discipline.


URL Links

US Fair Access to Science and Technology Research Act of 2013: http://beta.congress.gov/bill/113th-congress/housebill/708?q=hr708

Biodiversity Heritage Library: http://www.biodiversitylibrary.org/

Tree of Life: http://tolweb.org

micro*scope: microscope.mbl.edu

The Plankton Ciliate project: http://www.liv.ac.uk/ciliate/intro.htm

The Protist Information Server: http://protist.i.hosei.ac.jp

Checklist of Phytoplankton in the Skagerrak-Kattegat: http://www.smhi.se/oceanografi/oce_info_data/plankton_checklist/ssshome.htm

Encyclopedia of Life: http://eol.org

DiscoverLife: www.discoverlife.org/

Atlas of Living Australia: www.ala.org.au/

Marine Species Identification portal: http://species-identification.org/

Nucleotide Sequence Database Collaboration: http://www.insdc.org/

Biodiversity Information Standards (TDWG): http://www.tdwg.org/

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH): http://www.oaforum.org/

Dryad data repository: http://datadryad.org/

Marine Metadata initiative: https://marinemetadata.org/

IRMNG homonyms: http://www.cmar.csiro.au/datacentre/irmng/homonyms.htm

Global Names Index: gni.globalnames.org

TaxaMatch fuzzy matching algorithm: http://www.cmar.csiro.au/datacentre/taxamatch.htm

World Register of Marine Species: http://www.marinespecies.org

The Interim Register of Marine and Nonmarine Genera: http://www.cmar.csiro.au/datacentre/irmng/

Information

Article type: Original article

Article status: Open

Licence: None

Percentage share of authors:

David J. Patterson (Author) - 100%

Publication languages:

English
