by Martha E. Pollack and S. Jack Hu
by Eric Michielssen
by Alfred O. Hero and Brian D. Athey
The Michigan Institute for Data Science (MIDAS) is the focal point for the new multidisciplinary area of data science at the University of Michigan. MIDAS was created in July 2015 as part of the University of Michigan Data Science Initiative. MIDAS will comprise an interdisciplinary core faculty of 40 data scientists drawn from statistics, biostatistics and mathematics, computer science and engineering, information science, and a range of data-science-intensive application areas. MIDAS will also include a Data Science Challenge Initiatives Program (Learning Analytics, Transportation, Social Sciences, Personalized Medicine & Health), a Data Science Education and Training Program, and an Industry Engagement Program.
by Alfred O. Hero
by Robert Nowak
Machine learning is an area of Computer Science focused on designing computer programs that enable machines to learn by example, much in the way young children are taught to understand the world around them. Machine learning takes advantage of the availability of massive datasets and powerful computing resources to automatically discover patterns and structure in data. In this talk, we will look at several cutting-edge applications of machine learning that highlight key innovations and progress made in recent years, such as deep learning and high-dimensional statistics. The talk will also survey exciting challenges that lie ahead.
by Susan Murphy
We describe a sequence of steps that facilitate effective learning of treatment policies in mobile health. These include a clinical trial with an associated sample size calculator and data analytic methods. An off-policy Actor-Critic algorithm is developed for learning a treatment policy from this clinical trial data. Open problems abound in this area, including the development of a variety of online predictors of risk of health problems, missing data and disengagement.
by Kathleen McKeown
Data science holds the promise to solve many of society’s most pressing challenges. But much of the necessary data is locked within the volumes of unstructured data on the web, including language, speech and video. In this talk, I will describe how data science approaches are being used in research projects that draw from language data along a continuum from fact to fiction. I will present a system that predicts the future impact of a scientific concept (represented as a technical term) based on the information available in recently published research articles; research on learning from knowledge of past disasters, as seen through the lens of the media; and work on the use of data science in understanding subjective, personal narratives.
by Robert Nowak, Susan Murphy, Kathleen McKeown and Alfred O. Hero
by Brian D. Athey
by Ivo Dinov
by Erin Shellman
Data science is an emergent field that incorporates concepts from statistics, computer science and machine learning to create and apply knowledge from data. In this talk I’ll share what I think are the essential skills and characteristics of effective data scientists. I’ll also provide guidance on how students can develop those skills in school and how educators can prepare them for jobs in industry.
by Patrick Harrington
Big Data and the field of Data Science present a novel paradigm facing the global economy: granular control, measurement, and rigorous optimization of everything. In this talk, I will discuss the range of economic verticals that have been touched by big data and data science and those ripe for efficiency gains under adoption of these schools of thought. Healthcare, finance, advertising, e-commerce, and ultimately human beings are on a mature efficiency trajectory of operating performance enabled by data science, with the latter, human beings, ready to be made more efficient. I will speak to my current venture, compgenome.com, and how the evolution of an on-demand, real-time talent market provides transparency with employer compensation and ultimately optimal career navigation. Finally, I will discuss how data science, coupled with software engineering, is a viable and lucrative career path likely to exist for decades to come.
by Nandit Soparkar
The inputs via data-entry are crucial to the success of the data sciences. This surprisingly under-served area suggests challenges and opportunities for applied data sciences. Drawing analogies with the related areas of OLAP and data mining, I will describe specific examples where the inputs need to be pre-processed appropriately. In particular, I will discuss text & natural language inputs in the automotive and healthcare sectors. I will also briefly touch upon the human factors relevant for data inputs, given that all data science efforts are ultimately by and for the broader human endeavor.
by Brian D. Athey, Erin Shellman, Patrick Harrington, Nandit Soparkar and Ivo Dinov
For a list of nearby restaurants, see http://myumi.ch/6j1jz
by Alfred O. Hero
by Daniel L. Goroff
Exploratory data analysis is fun but dangerous. Observations alone, no matter how many, can rarely justify causal inferences. Simple calculations show that, even playing strictly by the current rules of empirical science, a shocking percentage of the conclusions reached will be wrong. Those same calculations show that reproducing hypothesis tests can make them much more reliable. The Sloan Foundation actively supports efforts to make empirical research more reproducible, including the development of mathematical approaches to privacy-preserving research. Recent and surprising theorems show how, even if privacy is not an issue, some of the techniques developed to protect confidential information can also protect against false discovery due to multiple hypothesis testing and exploratory data analysis.
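The "simple calculations" mentioned above are not spelled out in the abstract. One common way to frame them is the positive predictive value of a significant result, sketched below. The prior, power, and significance level are illustrative assumptions, not figures from the talk, and the function names are hypothetical.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of statistically significant findings that reflect a true effect.

    prior: assumed fraction of tested hypotheses that are actually true
    power: probability that a test detects a true effect
    alpha: probability that a test flags a non-existent effect
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


def ppv_after_replication(prior, power, alpha, replications=1):
    """PPV when a finding counts only if the original test and every
    independent replication are all significant."""
    k = replications + 1
    return positive_predictive_value(prior, power ** k, alpha ** k)


# Illustrative assumptions (not taken from the talk).
PRIOR = 0.10
POWER = 0.80
ALPHA = 0.05

single = positive_predictive_value(PRIOR, POWER, ALPHA)
replicated = ppv_after_replication(PRIOR, POWER, ALPHA, replications=1)

print(f"Wrong conclusions, single test:     {1 - single:.1%}")
print(f"Wrong conclusions, one replication: {1 - replicated:.1%}")
```

Under these assumed inputs, roughly a third of significant single-test findings are false, while requiring one independent replication brings that share down to a few percent. The exact numbers depend entirely on the assumed prior and power.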
by George Poste
The convergence of molecular biology, informatics, sensor and mobile device technologies and social media is forging a new era of precision medicine in which large scale data on disease processes in individuals and populations, their environments and their behavior will enable disease detection, treatment and prevention to be based increasingly on individual-specific (personalized) parameters to achieve better health outcomes at lower cost. The long term trajectory for precision medicine will progressively shift the focus of care from the current episodic, reactive responses to illness to proactive, continuous real-time monitoring of health status for earlier detection of disease, improved treatment compliance and other risk reduction strategies to prioritize maintaining health versus managing illness.
Realization of these aspirations will generate data on an unprecedented scale. The rise of precision medicine and the rise of data-intensive medicine are inextricably linked. The current health care ecosystem is ill prepared for this union and its implications for the future medical curriculum, new skill needs for physicians, infrastructure and personnel for advanced data analytics, the evolution of new models of healthcare delivery and the entry of influential new participants from the computing, logistics and consumer realms hitherto uninvolved in healthcare.
by Bror Saxberg
There's much research about how learning can be enhanced by the right kinds of learning experiences, including how technology can help. However, little of that is getting to students at scale, compared with random walks with technology. ("Video is great, right? Must have more!")
We'll talk about what it means to be "learning engineers": applying evidence and learning science at scale in practical circumstances. As in any other good design or engineering effort, we want to see what works, and doesn't, with careful data collection. Scale (plus technology, where appropriate) enables the creation of test-beds for systematic improvement as well as a chance for reliable impact.
by Kathleen M. Carley
Our ability to understand and predict socio-cultural activity is being transformed by the exponential growth in big data available on the web – both social media and open government and organizational records.
Analysis of such data has the potential to create the timely and detailed information needed to improve crisis response and so save lives and goods, improve community resilience, support early identification of security threats and decrease social-cyber attacks. However, whether considering issues such as disaster response, cyber-security, or state stability, the same core methodological challenges keep rising to the fore.
Three of these key methodological challenges are driven by the nature of the data: “wide” data, sampled data, and geo-temporal data. In this presentation, the promise of the new big data science for social behavior is described, as well as the challenges that need to be considered. These points will be illustrated using a variety of examples related to early tsunami warning in Indonesia, crisis response in Libya, state stability in the Middle East, and cyber-security globally.
by Jonathan H. Owen
The auto industry is increasing vehicle electrification, introducing connected vehicle capabilities, and adding more intelligence to in-vehicle electronics, controls and active safety systems that will ultimately lead to automated driving technologies. The convergence of these technologies promises to transform and even disrupt personal transportation as we know it today. This presentation will provide a high-level overview on the future of personal mobility and discuss major opportunities and challenges for data science as the automotive transformation occurs.
by Bror Saxberg, George Poste, Jonathan H. Owen, Kathleen M. Carley and Alfred O. Hero
by Brian D. Athey
by Edward Seidel
The Midwest Big Data Hub is a network of partners with unique resources based in the Midwest that will address challenges in collecting, managing, serving, mining, and analyzing rapidly growing and increasingly complex data and information collections to create actionable knowledge and guide decision-making. I will describe expected activities of the Hub as we build collaborations and pilot projects with academic, industry, government and non-profit partners.
by Ratna Babu Chinnam
The McKinsey Global Institute predicted back in 2011 that analyzing large data sets would become a key basis of competition for firms, underpinning new waves of productivity growth, innovation, and consumer surplus across most business sectors. Recent studies report that over 85% of leading Fortune 1000 companies have a Big Data initiative in progress or in the planning stages. The primary reason cited by businesses for investing in Big Data is to enable better, fact-based decision making. However, most businesses are floundering in their ability to extract value from any data. Big Data management and analytics require a multitude of advanced concepts, tools and technologies, and the required skills are hard to come by. In addition, given the absence of effective data science and operations research, there is undue reliance on traditional techniques of the past to drive big data analytics. For most companies, data-analytics success has been limited to a few tests or to narrow slices of the business, and few have achieved any semblance of what we would call “big impact through big data”. What we need is more tools and technologies that are effective on the back end and far more emphasis on data-driven business processes on the front end!
by Kathleen McKeown
by Keith Elliston
6th October 2015