Vast archives of digital text, speech, and video, along with new analysis technology and inexpensive computation, are the modern equivalent of the 17th-century invention of the telescope and microscope. We can now observe social and linguistic patterns in space, time, and cultural context, on a scale many orders of magnitude greater than in the recent past, and in much greater detail than before. This transforms not just the study of speech, language, and communication but fields ranging from sociology and empirical economics to education, history, and medicine — with major implications for both scholarship and technology development.
Size Matters: Big Data, New Vistas in the Humanities and Social Sciences
Mark Liberman is professor of linguistics and computer and information science at the University of Pennsylvania and director of the Linguistic Data Consortium. His research interests are in phonetics, phonology, speech technology, and computational linguistics. He is on the editorial boards of Speech Communication, Computer Speech and Language, and The International Journal of Corpus Linguistics. Liberman came to Penn after serving as a member of the technical staff and as head of the Linguistics Research Department at AT&T Bell Laboratories. He is also the founder of (and a frequent contributor to) Language Log, a blog with a broad cast of dozens of professional linguists. The concept of the eggcorn was first proposed in one of his posts there.
Geoffrey Nunberg is a linguist and an adjunct full professor at the School of Information. His linguistics research includes work in semantics and pragmatics, text classification, and written-language structure. He also works and writes on the social and cultural implications of digital technologies.
Nunberg is well known for his regular feature on language on the NPR show “Fresh Air.” He has contributed “letters from America” to the BBC4 and has written numerous commentaries on language for the Sunday New York Times Week in Review, as well as articles and commentaries on language, politics, and culture for The Atlantic, The American Prospect, Forbes, Fortune, the Washington Post, and other periodicals. He is the emeritus chair of the usage panel of the American Heritage Dictionary. Nunberg’s most recent books are Talking Right (2006) and The Years of Talking Dangerously (2009). His new book, about civility in American public life, will be published in July 2012 by PublicAffairs.
Matthew Salganik is a sociologist specializing in social networks, quantitative methods, and web-based social research. One main area of his research has focused on developing network-based statistical methods for studying the populations most at risk for HIV/AIDS. A second main area has been using the World Wide Web to collect and analyze social data in innovative ways.
Salganik’s research has been published in journals such as Science, PNAS, Sociological Methodology, and the Journal of the American Statistical Association. His papers have won the Outstanding Article Award from the Mathematical Sociology Section of the American Sociological Association and the Outstanding Statistical Application Award from the American Statistical Association. Popular accounts of his work have appeared in the New York Times, Wall Street Journal, Economist, and New Yorker. Salganik’s research is funded by the National Science Foundation, the National Institutes of Health, the Joint United Nations Programme on HIV/AIDS (UNAIDS), and Google.