Short Courses
     
Title Fundamentals of Machine Learning for Predictive Data Analytics
     
Instructor Yuh-Jye Lee, Professor, Department of Applied Mathematics, National Chiao Tung University and Research Fellow, Research Center for Information Technology Innovation, Academia Sinica
     
Course Description

“Google's always used machine learning. In all the areas we applied it to, speech recognition, then image understanding, and eventually language understanding, we saw tremendous improvements,” said John Giannandrea, VP of Engineering at Google. Over the last decade, machine learning has been applied successfully to many real-world problems, and it is considered the most essential and fundamental knowledge for a data scientist. This course introduces the core concepts of machine learning and several useful and insightful learning algorithms, including k-nearest neighbors, Naïve Bayes, the Online Perceptron, and Support Vector Machines. It also discusses how to evaluate the performance of a machine learning model. Finally, I will share my Data Science and Machine Intelligence Lab’s Data Analytics Recipe to close the course.

Outline

  • Explaining the buzzwords: Big Data, Data Science, and Artificial Intelligence
  • Brief Introduction to Machine Learning
  • Three fundamental learning algorithms:
    • Naïve Bayes
    • The k-Nearest Neighbors
    • Online Perceptron
  • Support Vector Machines
  • How to Evaluate Your Learning Models
  • My Data Science and Machine Intelligence Lab’s Data Analytics Recipe
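
As a rough illustration of one algorithm named in the outline, the following is a minimal sketch of the classic mistake-driven perceptron update. It is not taken from the course slides: the function names, the toy data, and the `epochs` parameter are assumptions made only for this example.

```python
def train_online_perceptron(samples, epochs=10):
    """Train a linear classifier with the mistake-driven perceptron rule.

    samples: list of (features, label) pairs, with label in {-1, +1}.
    Returns the weight vector w and the bias b.
    """
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Update only when the current example is misclassified
            # (or lies exactly on the decision boundary).
            if y * score <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: stop early
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable toy data (made up for illustration).
toy_data = [
    ((2.0, 1.0), 1),
    ((1.0, 3.0), 1),
    ((-1.0, -2.0), -1),
    ((-2.0, -1.0), -1),
]
w, b = train_online_perceptron(toy_data)
predictions = [predict(w, b, x) for x, _ in toy_data]
```

Because each example is processed once and then discarded, the same update can in principle be applied to a stream of data, which is what makes the method "online".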
     
Slide [Slide I] [Slide II]
     
About the Instructor

Dr. Yuh-Jye Lee received his PhD degree in Computer Science from the University of Wisconsin-Madison in 2001. He is now a research fellow at the Research Center for Information Technology Innovation, Academia Sinica, and serves as the CEO of the Taiwan Information Security Center. He is also a professor in the Department of Applied Mathematics at National Chiao Tung University. His research is primarily rooted in optimization theory and spans a range of areas including network and information security, machine learning, data mining, big data, numerical optimization, and operations research. During the last decade, Dr. Lee has developed many learning algorithms for supervised, semi-supervised, and unsupervised learning, as well as for linear and nonlinear dimension reduction. His recent major research applies machine learning to information security problems such as network intrusion detection, anomaly detection, malicious URL detection, and legitimate user identification. Currently, he focuses on online learning algorithms for large-scale datasets and time series data, and on behavior-based anomaly detection for big data and IoT security problems.

 

     
Title Modern Statistical Process Control Charts and Their Applications for Analyzing Big Data
     
Instructor Peihua Qiu, Professor and Founding Chair, Department of Biostatistics, University of Florida
     
Course Description

Big data often take the form of data streams, with observations of certain processes collected sequentially over time. Among many different purposes, one common reason to collect and analyze big data is to monitor the longitudinal performance or status of the related processes. To this end, statistical process control (SPC) charts can be a useful tool, although conventional SPC charts need to be modified properly, and in some cases new SPC charts need to be developed. This short course discusses traditional SPC charts, including the Shewhart, CUSUM, and EWMA charts, as well as some recent control charts based on change-point detection and some fundamental multivariate SPC charts under the normality assumption. It also introduces novel univariate and multivariate control charts for cases where the normality assumption is invalid, and discusses control charts for profile monitoring. Examples will illustrate the use of control charts for monitoring different types of processes with big data. Among many potential applications, dynamic disease screening and profile/image monitoring will be discussed in some detail.
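
To give a flavor of one traditional chart named above, here is a minimal sketch of a textbook EWMA chart with time-varying control limits. It is not taken from the course material: it assumes the in-control mean `mu0` and standard deviation `sigma` are known, and the smoothing weight `lam`, the limit multiplier `L`, and the data stream are illustrative choices.

```python
import math


def ewma_chart(observations, mu0, sigma, lam=0.2, L=3.0):
    """Compute EWMA statistics and flag out-of-control time points.

    The charting statistic is z_t = lam * x_t + (1 - lam) * z_{t-1},
    started at z_0 = mu0. The control limits at time t are
    mu0 +/- L * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam)**(2 * t))).
    Returns a list of (t, z_t, is_signal) tuples.
    """
    z = mu0
    results = []
    for t, x in enumerate(observations, start=1):
        z = lam * x + (1.0 - lam) * z
        half_width = L * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        is_signal = z > mu0 + half_width or z < mu0 - half_width
        results.append((t, z, is_signal))
    return results


def first_signal(results):
    """Return the first time index at which the chart signals, or None."""
    for t, _, is_signal in results:
        if is_signal:
            return t
    return None


# Made-up stream: five in-control values near 0, then a sustained
# upward mean shift beginning at the sixth observation.
stream = [0.1, -0.2, 0.05, 0.0, -0.1] + [2.0] * 10
results = ewma_chart(stream, mu0=0.0, sigma=1.0)
signal_time = first_signal(results)
```

A smaller `lam` gives more weight to past observations, which generally makes the chart more sensitive to small persistent shifts and slower to react to large sudden ones.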
     
Slide [Part I] [Part II] [Part III]
     
Textbook (optional) Qiu, P. (2014), Introduction to Statistical Process Control, Boca Raton, FL: Chapman & Hall/CRC.
     
About the Instructor

Professor Peihua Qiu is a past editor of Technometrics, a flagship journal in industrial statistics co-sponsored by the ASA and the American Society for Quality (ASQ). He has been working on various statistical process control (SPC) problems since 1998 and has made substantial contributions in several SPC areas, including nonparametric SPC, SPC by change-point detection, profile monitoring, and dynamic screening systems. His recent book, Qiu (2014, Chapman & Hall), gives a systematic description of both traditional and newer SPC methods. Professor Qiu is an elected fellow of the ASA and the IMS and an elected member of the ISI. After obtaining his Ph.D. in statistics from the University of Wisconsin at Madison, he helped create the Biostatistics Center at the Ohio State University during 1996-1998. He then worked as an assistant (1998-2002), associate (2002-2007), and full professor (2007-2013) in the School of Statistics at the University of Minnesota. He moved to the University of Florida as the founding chair of the Department of Biostatistics in 2013, and in 2018 the department received its first ranking, #20 among all biostatistics departments and programs in the US, from U.S. News & World Report. Throughout his career, Professor Qiu has been continually involved in statistical consulting and collaborative research.