The Objectives of the Course
A) The Purpose of the Course
Most courses on this subject are aimed at Machine Learning and Data Science
experts. Often, they are presented for use with specialized development
platforms or even as part of advanced off-the-shelf applications. At the same
time, Bayes' Theorem and its applications rest on statistical principles and
concepts that are rarely explained clearly.
The purpose of this course is educational. The techniques, algorithms and
procedures presented here aim at making machine learning methods based on the
Theorem easier to understand, rather than simply at putting them to use.
Bayes' Theorem is one of those theorems to which the proverb applies:
“Still waters run deep”.
The Theorem appeared in an essay by Thomas Bayes, published posthumously in
1763. In due course, it found its way into a wide variety of statistical
applications, the Theorem itself being a tool of statistical inference.
From there on, and especially with the advent of Machine Learning algorithms,
the Theorem became the core of a wide variety of applications such as
Classification, Bayesian Networks and Optimization.
The Theorem and its applications are normally developed in specialized
programming environments, simply because these applications must handle large
volumes of data in performance-intensive settings.
B) So, why do we Present a Course based on Excel?
Analysts who need to use or develop such applications have the following
environments available to them:
· Off-the-shelf applications, ready-made and commercially available.
· Open-source or free integrated development environments (IDEs) that host a
large number of scientific and statistical libraries for building such
applications.
In both cases, the Analyst faces a steep, often insurmountable learning curve.
Whether the objective is to use off-the-shelf products or to develop their own
applications, neither environment is suited to learning how the machine
learning methods themselves work.
This course therefore uses Excel strictly for educational purposes, not as a
machine learning tool. Excel is known to everyone, and if not, it is easy to
learn. It is also highly flexible in exposing how things work. The course
exploits these facilities to walk the Analyst, in a common-sense, step-by-step
manner, through the foundations and procedures of these algorithms.
C) What Does the Course Cover?
The course is made up of 5 major sections, the first of which is a short
introduction.
Section 1: Introducing the Course
This section consists of one lecture that presents the objectives of the
course, its structure and resources as well as what to expect and what not to
expect.
Section 2: An In-Depth Presentation of Probability Rules and Practices
The section starts with lectures that provide a detailed exposure to the
fundamentals and practices of probability rules. Bayes' Theorem is so closely
tied to these rules that analysts embarking on its use (and on the
understanding of its extensions) cannot learn and apply these algorithms
without a solid grounding in probability.
The section uses common sense to clarify often obscure concepts in probability.
Many examples are presented and explained in detail.
Section 3: The Use of the Confusion Matrix for Evaluating Bayesian Results
Some might wonder why this course introduces the Confusion Matrix and its
useful KPI’s. The answer is that in Sections 4 and 5 we will need to evaluate
our results in terms of precision, accuracy, error rates, etc. The Confusion
Matrix is a contingency table of four counts obtained by comparing the
algorithm’s outcome with the historically known outcome of the classes in a
Test Table. The four counts are True Positives, True Negatives, False
Positives and False Negatives, and they can be combined in a variety of ways
to measure KPI’s such as accuracy, precision and error rates. (The Confusion
Matrix is also used with a variety of other classification machine learning
methods: logistic regression, decision trees, etc.) A small illustration
follows.
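As a minimal sketch (not taken from the course materials, which use Excel),
the following Python snippet builds the four counts from a small, invented
test table and derives accuracy, precision and error rate from them:

```python
# Minimal confusion-matrix sketch; the labels below are invented for illustration.
actual    = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # historically known classes
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]  # classifier's outcomes

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # True Positives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # True Negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # False Positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # False Negatives

accuracy   = (tp + tn) / (tp + tn + fp + fn)   # share of correct predictions
precision  = tp / (tp + fp)                    # correctness of positive calls
error_rate = 1 - accuracy                      # share of wrong predictions

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} error_rate={error_rate:.2f}")
```

The same four counts, laid out as a 2x2 table in Excel, drive all the KPI’s
discussed in this section.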
Section 4: The Fundamental Application of Bayes’ Theorem
This section presents Bayes’ Theorem, first running through a common-sense
example. This is followed by the derivation of the Theorem and a clear
explanation of the terms in the Bayes' Theorem formula. A set of 8 major
workouts presents the use of the Theorem in different formats (vertical and
horizontal tables, decision trees and graphic solutions). The last 3 workouts
output their results to a Confusion Matrix and show how it can be used to
evaluate the results of the Theorem.
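For orientation, the formula at the heart of this section is
P(A|B) = P(B|A) · P(A) / P(B). A minimal Python sketch with invented numbers
(a hypothetical diagnostic-test scenario, not one of the course workouts):

```python
# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical diagnostic-test numbers, invented for illustration.
p_disease = 0.01            # prior: P(A)
p_pos_given_disease = 0.95  # likelihood: P(B|A), the test's sensitivity
p_pos_given_healthy = 0.05  # false-positive rate: P(B|not A)

# Total probability of a positive test: P(B)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: probability of disease given a positive test, P(A|B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")  # about 0.161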
Section 5: How to Use the Naïve Bayes Classifiers
This is the heart of the course. It presents a wide variety of algorithms
whose purpose is the supervised classification of data. The Naïve Bayes
Classifiers are a family of algorithms based on Bayes’ Theorem that differ
from each other in various ways; they are listed below.
Interleaved with the lectures detailing these algorithms through clear
examples are “support” lectures that present topics these algorithms rely on.
After two opening lectures that present the fundamentals of Naïve Bayes
Classifiers and the required theory, the course proceeds with a set of
lectures covering 8 Naïve Bayes Classifier variants (a minimal sketch of the
shared underlying idea follows the list):
1) Categorical Naïve Bayes Classifiers
2) Gaussian and Continuous Naïve Bayes Classifiers
3) Non-Gaussian Continuous Naïve Bayes Classifiers
4) Bernoulli Naïve Bayes Classifier
5) Multinomial Naïve Bayes Classifier
6) Weighted Naive Bayes Classification
7) Complement Naïve Bayes Classification
8) Kernel Distance Estimation and Naive Bayes Classification
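To give a flavour of what these variants have in common, here is a minimal
Gaussian Naïve Bayes sketch in Python (pure standard library; the tiny
training set is invented for illustration, and details such as smoothing are
deferred to the support lectures):

```python
import math
from collections import defaultdict

# Tiny invented training set: one continuous feature per sample, two classes.
train = [(1.0, "A"), (1.2, "A"), (0.9, "A"), (3.0, "B"), (3.2, "B"), (2.8, "B")]

# Estimate per-class priors, means and variances from the training data.
by_class = defaultdict(list)
for x, c in train:
    by_class[c].append(x)

stats = {}
for c, xs in by_class.items():
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    stats[c] = (len(xs) / len(train), mean, var)  # (prior, mean, variance)

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x):
    # Pick the class maximizing prior * likelihood (Bayes' rule; the constant
    # evidence term P(x) is the same for all classes and can be dropped).
    return max(stats, key=lambda c: stats[c][0] * gaussian_pdf(x, stats[c][1], stats[c][2]))

print(classify(1.1))  # -> "A"
print(classify(2.9))  # -> "B"
```

The other variants replace the Gaussian likelihood with one suited to the
feature type (categorical, Bernoulli, multinomial, kernel-based, etc.).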
To support the presentations above, the course interleaves detailed lectures
on the following methods, topics and procedures:
1) Laplace Smoothing Correction (a minimal sketch follows this list)
2) Extensions to Continuous Features: checking for normality, checking for
independence of features, smoothing corrections for Gaussian features
3) Two Discrete Distributions - Bernoulli and Categorical
4) Two Discrete Distributions - Binomial and Multinomial
5) Entropy and Information and how they are used in Naïve Bayes Classification
6) Kononenko Information Gain and Evaluation of Classifiers
7) Log Odds Ratio and Nomograms used in Bayes Classification
8) Kernel Distance Estimation - Estimating the Bandwidth h.
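As a taste of the first support topic, here is a minimal Laplace (add-one)
smoothing sketch in Python; the category counts are invented for illustration:

```python
# Laplace (add-one) smoothing: avoids zero probabilities for unseen categories.
# Invented counts of a categorical feature within one class.
counts = {"red": 3, "green": 2, "blue": 0}  # "blue" never observed in this class
alpha = 1                                   # smoothing constant (add-one)
k = len(counts)                             # number of possible categories
total = sum(counts.values())

smoothed = {cat: (n + alpha) / (total + alpha * k) for cat, n in counts.items()}
print(smoothed)  # "blue" gets (0 + 1) / (5 + 3) = 0.125 instead of 0
```

Without this correction, a single unseen category would zero out the whole
product of likelihoods in a Naïve Bayes Classifier.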
Resources
All lectures will be supported by a variety of resources:
· Solved and documented workouts in Excel
· Dedicated workbooks that animate and describe various probability
distributions