Data Mining

Next Offering

Start Date: October 16, 2017
End Date: December 4, 2017

Data Mining is one of four non-credit courses in the Certification in Practice of Data Analytics (CPDA) program. The course is delivered in a 100% distance-learning format and includes instructional material equivalent to approximately one semester credit hour. This equates to around 40 hours of course work, including 12-13 hours of instruction.

This course can be taken individually or as one of the four courses required to receive the CPDA certificate of completion. It is strongly recommended that participants take Foundations of Statistics first, followed by Data Mining. Machine Learning and Visualization Analytics and Sensemaking can follow in either order.


Course Description

This course is an introduction to data mining fundamentals and algorithms. Students will develop an appreciation for data preparation and transformation, gain an understanding of the data requirements of the various algorithms, and learn when it is appropriate to use each algorithm. Specific topics include Distance/Similarity Measurement; Anomaly Detection; and Association, Classification, Clustering, and Pattern Algorithms.

Please note: while this course is equal to approximately 40 hours of course work over seven weeks, students should allow a two-to-one study-to-class-time ratio. This means you should plan to study two hours for each hour of class time. The course consists of approximately 5.7 hours of class time per week, so an average student should expect to spend about 11.4 hours each week completing the course work. Depending on your own strengths and experience, you may need to increase or decrease this ratio.
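
The weekly figures above can be reproduced with a short calculation. This is only an illustrative sketch: it assumes the 40 hours are spread evenly over the seven weeks, and the function and parameter names are hypothetical rather than part of any course material.

```python
# Illustrative workload estimate using the figures quoted in the course description.
# Assumption: the 40 hours of class time are spread evenly across the 7 weeks.
def weekly_workload(total_class_hours=40, weeks=7, study_ratio=2.0):
    """Return (class hours per week, study hours per week), rounded to 0.1 hr."""
    class_hours = total_class_hours / weeks
    study_hours = class_hours * study_ratio  # two-to-one study-to-class-time ratio
    return round(class_hours, 1), round(study_hours, 1)


class_hours, study_hours = weekly_workload()
print(f"Class time per week: {class_hours} hrs")
print(f"Study time per week: {study_hours} hrs")
```

If you expect to need more or less study time, pass a different study_ratio (for example 1.5 or 3.0) to see how the weekly estimate changes.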

4 CEUs are granted upon successful completion of the course.

Recommended Prerequisites 

College-level coursework in statistics is required. If you are pursuing the CPDA certification and do not already have that background, you should complete Foundations of Statistics before taking this course. Please contact the program with questions or for clarification.

Students will be required to learn the R software package prior to starting the course.

Free training in R software that will prepare you for this course can be found online at:

You will learn to:
  • Describe and critically assess the importance of a methodology and data preparation
  • Evaluate, identify, and process different measures of similarity and basics of visualization (do's and don'ts)
  • Identify and evaluate various classification approaches and clustering techniques, articulate how they work, and determine when to use them
  • Utilize association analysis, anomaly detection, and linear regression, articulate how they work, and determine when to use them


Click Here to learn more about how this course is delivered 100% online!

Cancellations and Refunds

A full refund, minus a $50 administrative fee, will be made if the cancellation is received at least three weeks prior to the start of the course. No refunds will be made for cancellations received within three weeks of the course start date.

Course Offering Dates

Each course offering in this program operates with a specific start date and end date. Students must complete each course during the specific time frame. Access to the online course and materials is removed when the course ends.