Data Mining

Data Mining Course

Next Offering

Start Date: February 28, 2018 
End Date:  April 18, 2018

Data Mining is one of four non-credit courses in the Certification in Practice of Data Analytics (CPDA) program. The course is delivered in a 100% distance-learning format and includes instructional material equivalent to a one-semester-credit-hour class.

This course can be taken individually or as one of the four courses required to receive the CPDA certificate of completion. It is strongly recommended that participants take Foundations of Statistics first, followed by Data Mining. Machine Learning and Visualization Analytics and Sensemaking can then follow in either order.

 

Course Description

This course is an introduction to data mining fundamentals and algorithms. Students will develop an appreciation for data preparation and transformation, gain an understanding of the data requirements of the various algorithms, and learn when it is appropriate to use each algorithm. Specific topics include distance/similarity measurement; anomaly detection; and association, classification, clustering, and pattern algorithms.

4 CEUs are granted upon successful completion of the course.


You will learn to:
  1. Describe and critically assess the importance of a methodology and data preparation.
  2. Evaluate, identify, and process different measures of similarity and basics of visualization (do's and don'ts).
  3. Identify and evaluate various classification approaches and clustering techniques, articulate how they work, and determine when to use them.
  4. Utilize association analysis, anomaly detection, and linear regression, articulate how they work, and determine when to use them.

Prerequisites 

College-level coursework in statistics is required. If you are pursuing the CPDA certification and do not already have that background, you should complete Foundations of Statistics before taking this course. Please contact the program with questions or for clarification.
 

Students will be required to learn the R software package prior to starting the course (www.r-project.org).

Free training in R software that will prepare you for this course can be found online at:

https://www.datacamp.com/courses/free-introduction-to-r


Students are also required to learn Structured Query Language (SQL). Free online training that will prepare you for this course can be found at: 

https://www.datacamp.com/courses/intro-to-sql-for-data-science


 

 

Expected Time Commitment to Complete this Course

Each course is equivalent to a one-semester-credit-hour class, so each course consists of approximately 40 hours of class time, including 12-13 hours of recorded faculty lectures and 23-24 hours of additional course work. Each course is seven weeks long, which works out to about 5.7 hours of class time per week (40 hrs / 7 weeks). The average student should allow a 2:1 study-to-class-time ratio, that is, two hours of study for each hour of class time. This equates to 11-12 hours each week to complete all course work (5.7 hrs x 2 = 11-12 hrs). Increase or decrease the ratio based on your own strengths and experience.
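For planning purposes, the weekly estimate can be reproduced with a few lines of Python. This is only an illustrative sketch: the 40-hour total, the seven-week length, and the 2:1 default ratio come from the description above, while the function name weekly_hours and its study_ratio parameter are hypothetical names introduced here.

```python
# Figures from the course description; names below are illustrative only.
TOTAL_CLASS_HOURS = 40   # approximate class time per course
COURSE_WEEKS = 7         # length of each course in weeks


def weekly_hours(study_ratio=2.0):
    """Return (class hours per week, study hours per week).

    study_ratio is the study-to-class-time ratio; 2.0 is the
    default suggested for the average student.
    """
    class_per_week = TOTAL_CLASS_HOURS / COURSE_WEEKS
    study_per_week = class_per_week * study_ratio
    return class_per_week, study_per_week


class_hrs, study_hrs = weekly_hours()
print(f"Class time per week: {class_hrs:.1f} h")   # 5.7 h
print(f"Study time per week: {study_hrs:.1f} h")   # 11.4 h
```

Passing a different ratio, for example weekly_hours(1.5) or weekly_hours(2.5), shows how the weekly estimate shifts for students who expect to need less or more study time.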
 

Cancellations and Refunds

A full refund, minus a $50 administrative fee, will be made if the cancellation is received at least three weeks prior to the start of the course. No refunds will be made for cancellations received within three weeks of the course start date.


Course Offering Dates

Each course offering in this program is faculty-led, so it operates with a specific start date and end date. Students must complete each course within that time frame. Access to the online course and materials is removed when the course ends.