Robot snakes, sensor networks, and cosmology: How machine learning is changing the world
Abstract
The arrival of Big Data in virtually every aspect of life is driving many new trends. Now everyone has an enormous amount of data, everyone wants to learn from it, and often there is simply too much data to do the learning manually. One of the most important trends is the explosive growth in the number and variety of machine learning methods and applications. Two machine learning topics will be covered in this talk. The first concerns machine learning algorithms for sets of data points. Traditional algorithms operate on individual data points: each point may be given a label, flagged as anomalous, or assigned to a cluster. Instead, consider cases where the phenomenon of interest affects many data points but is not apparent in any one of them alone. For example, any one sick patient arriving at a hospital may not be unusual, but dozens of them who all live along the same river might be recognized as a sign of the release of a waterborne toxin. Algorithms for machine learning on sets will be described, and their applications to astronomy, water quality monitoring with autonomous boats, and image classification will be shown. The second topic concerns learning from data with the goal of using the acquired knowledge to optimize some process and to decide what additional data to collect. Methods based on the concepts of information gain and expected improvement will be presented. They analyze the value of potential experiments and choose the best ones to perform. The methods may be used to let snake robots learn both from practicing and from crowd-sourced data collected at a museum. Other applications will be described, including how to find the best scientific models of the universe and how sensor networks can decide what, when, and where they will sense.
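To make the idea of expected improvement concrete, here is a minimal sketch in Python. It is not taken from the talk: it assumes that a model already predicts a Gaussian mean and standard deviation for the outcome of each candidate experiment, that larger outcomes are better, and it uses the standard closed-form expected improvement for that case. The names `expected_improvement`, `choose_experiment`, and the candidate values are hypothetical and serve only as illustration.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected amount by which an outcome ~ N(mean, std^2) exceeds best_so_far.

    Assumes maximization and a Gaussian prediction (an illustrative assumption).
    """
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best_so_far) * cdf + std * pdf


def choose_experiment(candidates, best_so_far):
    """Return the name of the candidate with the largest expected improvement.

    candidates maps a (hypothetical) experiment name to a (mean, std) prediction.
    """
    return max(
        candidates,
        key=lambda name: expected_improvement(*candidates[name], best_so_far),
    )


# Hypothetical predictions for three candidate experiments.
candidates = {
    "A": (1.0, 0.1),  # matches the current best, low uncertainty
    "B": (0.8, 1.0),  # slightly worse on average, high uncertainty
    "C": (0.5, 0.0),  # known to be worse
}
print(choose_experiment(candidates, best_so_far=1.0))
```

In this toy setting the uncertain candidate "B" is selected even though its predicted mean is below the current best, which illustrates how such a criterion trades off exploiting good predictions against exploring uncertain ones.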
About the speaker
Dr. Jeff Schneider is an associate research professor in the Carnegie Mellon University School of Computer Science. He received his PhD in Computer Science from the University of Rochester in 1995. He has over 15 years’ experience developing, publishing, and applying machine learning algorithms in government, science, and industry. He has dozens of publications and has given numerous invited talks and tutorials on the subject. Among his many current interests are learning algorithms for big data, astrophysics, active learning, medical care and public health, social networks, and smart grids.
Dr. Schneider was the co-founder and CEO of Schenley Park Research, Inc. (SPR), a company dedicated to bringing new machine learning algorithms to industry. Later, he developed a new machine-learning-based CNS drug discovery system and spent a two-year sabbatical as the Chief Informatics Officer of Psychogenics, Inc. to set up and commercialize the system. Through his work at CMU and his commercial and consulting efforts, he has worked with several dozen companies and government agencies, including ten Fortune 500 companies, and with groups from around the world.