Levels Taught:
Elementary, Middle School, High School, College, University, Ph.D.
| Teaching Since: | Apr 2017 |
| Last Sign-in: | 327 Weeks, 5 Days Ago |
| Questions Answered: | 12843 |
| Tutorials Posted: | 12834 |
MBA, Ph.D. in Management
Harvard University
Feb 1997 - Aug 2003
Professor
Strayer University
Jan 2007 - Present
Exercise Question 3
When Y is Boolean and X = ⟨X1, ..., Xn⟩ is a vector of continuous variables, the assumptions of the Gaussian Naive Bayes classifier imply that P(Y|X) is given by the logistic function with appropriate parameters W. In particular:

$$P(Y = 1 \mid X) = \frac{1}{1 + \exp(w_0 + \sum_{i=1}^{n} w_i X_i)}$$

and

$$P(Y = 0 \mid X) = \frac{\exp(w_0 + \sum_{i=1}^{n} w_i X_i)}{1 + \exp(w_0 + \sum_{i=1}^{n} w_i X_i)}$$

Consider instead the case where Y is Boolean and X = ⟨X1, ..., Xn⟩ is a vector of Boolean variables. Prove for this case also that P(Y|X) follows this same form (and hence that Logistic Regression is also the discriminative counterpart to a Naive Bayes generative classifier over Boolean features). A numerical sanity check of the claimed result appears after the hints below.

Hints:

• Simple notation will help. Since the Xi are Boolean variables, you need only one parameter to define P(Xi|Y = yk). Define $\theta_{i1} \equiv P(X_i = 1 \mid Y = 1)$, in which case $P(X_i = 0 \mid Y = 1) = (1 - \theta_{i1})$. Similarly, use $\theta_{i0}$ to denote $P(X_i = 1 \mid Y = 0)$.

• Notice that with the above notation you can represent P(Xi|Y = 1) as follows:

$$P(X_i \mid Y = 1) = \theta_{i1}^{X_i} (1 - \theta_{i1})^{(1 - X_i)}$$

Note that when Xi = 1 the second term equals 1 because its exponent is zero. Similarly, when Xi = 0 the first term equals 1 because its exponent is zero.
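The following is not the requested proof, only a minimal numerical sanity check of the claim. It assumes NumPy, and the parameter values (pi, theta1, theta0) are made up for illustration rather than taken from the exercise. It computes P(Y = 1|X) two ways for a Bernoulli Naive Bayes model: directly from Bayes rule, and from the logistic form with the weights the hints' notation leads to.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 4                                  # number of Boolean features (arbitrary)
pi = 0.3                               # P(Y = 1), made up for the check
theta1 = rng.uniform(0.05, 0.95, n)    # theta_i1 = P(X_i = 1 | Y = 1)
theta0 = rng.uniform(0.05, 0.95, n)    # theta_i0 = P(X_i = 1 | Y = 0)

# Weights implied by the derivation the hints set up: exp(w0 + sum_i w_i X_i)
# must equal P(X|Y=0) P(Y=0) / (P(X|Y=1) P(Y=1)).
w0 = np.log((1 - pi) / pi) + np.sum(np.log((1 - theta0) / (1 - theta1)))
w = np.log(theta0 * (1 - theta1) / (theta1 * (1 - theta0)))

def nb_posterior(x):
    """P(Y = 1 | X = x) computed directly from Bayes rule."""
    joint1 = pi * np.prod(theta1 ** x * (1 - theta1) ** (1 - x))
    joint0 = (1 - pi) * np.prod(theta0 ** x * (1 - theta0) ** (1 - x))
    return joint1 / (joint1 + joint0)

def logistic_posterior(x):
    """P(Y = 1 | X = x) from the logistic form in the exercise statement."""
    return 1.0 / (1.0 + np.exp(w0 + w @ x))

# The two must agree on every one of the 2^n Boolean input vectors.
for k in range(2 ** n):
    x = np.array([(k >> i) & 1 for i in range(n)], dtype=float)
    assert np.isclose(nb_posterior(x), logistic_posterior(x))
print("Bayes-rule posterior matches the logistic form on all", 2 ** n, "inputs.")
```

Passing this check for arbitrary parameter settings is consistent with the result the exercise asks you to prove; the proof itself should derive w0 and the wi algebraically from the hinted representation of P(Xi|Y).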
CHAPTER 3
GENERATIVE AND DISCRIMINATIVE CLASSIFIERS: NAIVE BAYES AND LOGISTIC REGRESSION
Machine Learning
Copyright © 2015. Tom M. Mitchell. All rights reserved.
*DRAFT OF February 15, 2016*
*PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR'S PERMISSION*

This is a rough draft chapter intended for inclusion in the upcoming second edition of the textbook Machine Learning, T.M. Mitchell, McGraw Hill. You are welcome to use this for educational purposes, but do not duplicate or repost it on the internet. For online copies of this and other materials related to this book, visit the web site www.cs.cmu.edu/~tom/mlbook.html. Please send suggestions for improvements, or suggested exercises, to Tom.Mitchell@cmu.edu.

1 Learning Classifiers based on Bayes Rule

Here we consider the relationship between supervised learning, or function approximation problems, and Bayesian reasoning. We begin by considering how to design learning algorithms based on Bayes rule.

Consider a supervised learning problem in which we wish to approximate an unknown target function f : X → Y, or equivalently P(Y|X). To begin, we will assume Y is a boolean-valued random variable, and X is a vector containing n boolean attributes. In other words, X = ⟨X1, X2, ..., Xn⟩, where Xi is the boolean random variable denoting the ith attribute of X.

Applying Bayes rule, we see that P(Y = yi|X) can be represented as

$$P(Y = y_i \mid X = x_k) = \frac{P(X = x_k \mid Y = y_i)\, P(Y = y_i)}{\sum_j P(X = x_k \mid Y = y_j)\, P(Y = y_j)}$$
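To make the formula concrete, here is a tiny numeric sketch of the Bayes rule expression above for a single boolean attribute X1; the probabilities are invented for illustration and do not come from the chapter.

```python
# Made-up distribution over one boolean attribute X1 (illustrative only).
p_y = {1: 0.4, 0: 0.6}           # P(Y = y)
p_x1_given_y = {1: 0.9, 0: 0.2}  # P(X1 = 1 | Y = y)

# Bayes rule: numerator for Y = 1, normalized by the sum over both classes.
numerator = p_x1_given_y[1] * p_y[1]                         # 0.36
denominator = sum(p_x1_given_y[y] * p_y[y] for y in (0, 1))  # 0.36 + 0.12
print(numerator / denominator)                               # P(Y=1 | X1=1) = 0.75
```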