Objective
Select a suitable algorithm for performing multi-class intent classification within A.D.A.M
Simplicity over Complexity
While strategically designing A.D.A.M (Adaptive Datacenter and Migration), a voice-controlled AI virtual assistant, a significant amount of thought went into whether a simple model would suffice, meaning: provide strong results with minimal data, computation, etc. Could a simple model work? If so, how well? To answer this, two algorithms were chosen as a starting point to analyze and compare against each other: OvR (One-vs-Rest) Logistic Regression and Multinomial Naive Bayes.
Intuition behind the classification algorithms
The two main techniques for approaching multi-class classification scenarios are "one-versus-rest" and "one-versus-one".
In a "one-versus-rest" scenario, C separate binary classification models are trained. Each individual classifier is trained to determine whether an example belongs to class c or not.
To predict a class for a new sample x, all C classifiers are run on x and the class with the highest score is selected:
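Written out, with f_c(x) denoting the score produced by the binary classifier for class c (this notation is generic, not taken from any of the cited papers), the selection rule is:

```latex
\hat{y} = \arg\max_{c \in \{1, \dots, C\}} f_c(x)
```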
In comparison, in a "one-vs-one" scenario, a separate binary classification model is trained for each possible pair of classes. Hence, to make an inference on a new sample x, we run every classifier on x and select the class with the highest number of votes.
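To make the classifier counts concrete (C = 6 here matches A.D.A.M.'s six intents; the arithmetic itself is generic):

```python
# Number of binary models each multi-class scheme trains for C classes.
C = 6  # A.D.A.M.'s intent count as of this writing
one_vs_rest = C                # one binary classifier per class
one_vs_one = C * (C - 1) // 2  # one binary classifier per unordered class pair
print(one_vs_rest, one_vs_one)  # -> 6 15
```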
(One-vs-Rest) Logistic Regression
As suggested earlier, there are several methods for classifying an instance into k >= 2 classes. One-vs-Rest Logistic Regression instantiates a separate binary logistic regression for each class, under the assumption that each classification model is independent. In A.D.A.M.'s intent classification case study, there are 6 independent intents, or classes (as of this writing). The One-vs-Rest Logistic Regression model defines the log odds ratio for each outcome through a linear model (Whitaker et al., n.d.):
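A generic way to write this (the exact formulation from the cited paper is not reproduced here) is, for each class c with intercept β_c0 and coefficient vector β_c:

```latex
\log \frac{P(y = c \mid x)}{P(y \neq c \mid x)} = \beta_{c,0} + \beta_c^{\top} x
```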
(MNB) Multinomial Naive-Bayes
MNB is an instance of Naive Bayes that is suitable for classification scenarios with discrete features. It is based on Bayes' theorem, which calculates the probability P(c|x), where c is a class drawn from the set of all possible outcomes and x is an instance to be classified, represented by its features (Puurula, 2012):
P(c|x) = P(x|c) * P(c) / P(x)
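As a minimal sketch of how this looks in practice: the utterances and intent labels below are hypothetical stand-ins, not A.D.A.M.'s actual training data, and scikit-learn's MultinomialNB over bag-of-words counts is assumed as the implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical intent-labelled utterances (stand-ins for A.D.A.M.'s data).
utterances = [
    "migrate the virtual machine to host two",
    "move this workload to the backup datacenter",
    "what is the cpu usage on node one",
    "show me current memory utilization",
    "shut down the test server",
    "power off node three",
]
intents = ["migrate", "migrate", "status", "status", "power", "power"]

# Bag-of-words counts are the discrete features MNB expects.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(utterances, intents)

# predict_proba returns the posterior P(c|x) for every class c.
probs = model.predict_proba(["power off the backup server"])
print(dict(zip(model.classes_, probs[0].round(3))))
```

The posterior probabilities across all classes sum to 1, which is exactly the normalization performed by the P(x) denominator in the formula above.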
Proposed Model for Evaluating these Algorithms
So? Which Algorithm wins as it pertains to A.D.A.M?
To give it away, OvR Logistic Regression ends up beating Multinomial Naive Bayes on 3 different metrics. For further information and insights into this case study, please review the paper, "Algorithm comparison for multi-class intent classification: A case study in A.D.A.M."
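As a hedged sketch of what such a head-to-head comparison could look like (the paper's actual dataset, splits, and exact three metrics are not reproduced here; accuracy and macro-averaged F1 on toy data are used purely as stand-ins):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical train/test utterances (stand-ins for A.D.A.M.'s data).
train_x = [
    "migrate the virtual machine to host two",
    "move this workload to the backup datacenter",
    "what is the cpu usage on node one",
    "show me current memory utilization",
    "shut down the test server",
    "power off node three",
]
train_y = ["migrate", "migrate", "status", "status", "power", "power"]
test_x = ["move the vm to host one", "power off the server"]
test_y = ["migrate", "power"]

models = {
    "OvR Logistic Regression": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "Multinomial Naive Bayes": MultinomialNB(),
}
results = {}
for name, clf in models.items():
    pipe = make_pipeline(CountVectorizer(), clf)
    pipe.fit(train_x, train_y)
    pred = pipe.predict(test_x)
    results[name] = {
        "accuracy": accuracy_score(test_y, pred),
        "macro_f1": f1_score(test_y, pred, average="macro"),
    }
print(results)
```

With a real evaluation, the same loop would run over a held-out split of the actual intent corpus, and whichever metrics the study defined would replace the two placeholders here.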
Next Steps
In my opinion, further research is always a given, since data changes and grows, new algorithms are born and current methods can always be improved. Stay connected to see what comes next with A.D.A.M.
References
De Loera, J. A., & Hogan, T. (2020). Stochastic Tverberg Theorems With Applications in Multiclass
Logistic Regression, Separability, and Centerpoints of Data. SIAM Journal on Mathematics of
Data Science, 2(4), 1151–1166. https://doi.org/10.1137/19m1277102
Puurula, A. (2012). Combining modifications to multinomial naive bayes for text classification. In
Asia Information Retrieval Symposium (pp. 114-125). Springer, Berlin, Heidelberg.
Whitaker, T., Beranger, B., & Sisson, S. (n.d.). Logistic regression models for aggregated data.
https://arxiv.org/pdf/1912.03805.pdf