Buy Machine Learning Assignment

Instructions: Please either typeset your answers (LaTeX recommended) or write them clearly and legibly, scan them, and upload the PDF on edX. Legibility and clarity are critical for fair grading.

1. Let $D$ be an arbitrary distribution on the domain $\{-1, 1\}^n$, and let $f, g : \{-1, 1\}^n \to \{-1, 1\}$ be two Boolean functions. Prove that

$$\Pr_{x \sim D}[f(x) \neq g(x)] = \frac{1 - \mathbb{E}_{x \sim D}[f(x) g(x)]}{2}.$$

Would this still be true if the domain were different (such as $\mathbb{R}^n$, where $\mathbb{R}$ denotes the real numbers, with, say, the Gaussian distribution) instead of $\{-1, 1\}^n$? If yes, justify your answer. If not, give a counterexample.
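
As a sanity check before attempting the proof, the identity in Problem 1 (with right-hand side $(1 - \mathbb{E}[f(x)g(x)])/2$) can be tested numerically. The sketch below is not a proof and is not part of the assignment: it draws an arbitrary distribution and two arbitrary Boolean functions on $\{-1, 1\}^n$ and compares both sides. The function name `check_identity` and its parameters are hypothetical.

```python
import itertools
import random


def check_identity(n=3, seed=0):
    """Return (Pr[f != g], (1 - E[f*g]) / 2) for a random D, f, g on {-1, 1}^n."""
    rng = random.Random(seed)
    points = list(itertools.product([-1, 1], repeat=n))

    # Arbitrary distribution D: random nonnegative weights, normalized to sum to 1.
    weights = [rng.random() for _ in points]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Arbitrary Boolean functions f and g, stored as lookup tables.
    f = {x: rng.choice([-1, 1]) for x in points}
    g = {x: rng.choice([-1, 1]) for x in points}

    disagreement = sum(p for x, p in zip(points, probs) if f[x] != g[x])
    correlation = sum(p * f[x] * g[x] for x, p in zip(points, probs))
    return disagreement, (1 - correlation) / 2


if __name__ == "__main__":
    lhs, rhs = check_identity()
    print(lhs, rhs)
```

Running it with different seeds and values of `n` should give two matching numbers each time, which is evidence for the identity but says nothing about why it holds.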

2. Let $f$ be a decision tree with $t$ leaves over the variables $x = (x_1, \ldots, x_n) \in \{-1, 1\}^n$. Explain how to write $f$ as a multivariate polynomial $p(x_1, \ldots, x_n)$ such that for every input $x \in \{-1, 1\}^n$, $f(x) = p(x)$. (You may interpret $-1$ as FALSE and $1$ as TRUE or the other way round, at your preference.) (Hint: try to come up with an “indicator polynomial” for every leaf, i.e. one that evaluates to the leaf's value if $x$ is such that the path to that leaf is taken, and $0$ otherwise.)
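
The hint in Problem 2 can be tried out on a small example. The sketch below assumes one possible construction, where the test "$x_i = b$" is encoded by the factor $(1 + b x_i)/2$, and checks on every input that the sum of the leaf indicator polynomials agrees with the tree. The tuple encoding of the tree, the example `TREE`, and all function names are hypothetical choices made for illustration.

```python
import itertools

# Hypothetical encoding: a leaf is an int in {-1, 1}; an internal node is a
# tuple (i, left, right) that follows `left` if x[i] == -1 and `right` if x[i] == 1.
TREE = (0, (1, -1, 1), (2, 1, -1))


def eval_tree(tree, x):
    """Evaluate the decision tree on x by walking from the root to a leaf."""
    while not isinstance(tree, int):
        i, left, right = tree
        tree = left if x[i] == -1 else right
    return tree


def leaf_paths(tree, path=()):
    """Yield (path, leaf_value), where path is a tuple of (variable, branch) pairs."""
    if isinstance(tree, int):
        yield path, tree
    else:
        i, left, right = tree
        yield from leaf_paths(left, path + ((i, -1),))
        yield from leaf_paths(right, path + ((i, 1),))


def eval_poly(tree, x):
    """Evaluate the sum over leaves of value * product of (1 + b * x[i]) / 2."""
    total = 0.0
    for path, value in leaf_paths(tree):
        term = float(value)
        for i, b in path:
            term *= (1 + b * x[i]) / 2  # equals 1 if x[i] == b, else 0
        total += term
    return total


if __name__ == "__main__":
    for x in itertools.product([-1, 1], repeat=3):
        print(x, eval_tree(TREE, x), eval_poly(TREE, x))
```

The written answer still needs to argue that exactly one leaf's indicator is nonzero for each input; the code only checks this on one tree.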

3. Compute a depth-two decision tree for the training data in Table 1 using the Gini function $C(a) = 2a(1 - a)$, as described in class. What is the tree's overall accuracy on the training data?

X   Y   Z   Number of positive examples   Number of negative examples
0   0   0   10                            20
0   0   1   25                            5
0   1   0   35                            15
0   1   1   35                            5
1   0   0   5                             15
1   0   1   30                            10
1   1   0   10                            10
1   1   1   15                            5

Table 1: decision tree training data
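
The arithmetic for Problem 3 is easy to get wrong by hand, so a small helper for checking it may be useful. The sketch below transcribes the rows of Table 1 and computes the weighted Gini value $C(a) = 2a(1 - a)$ of a split on one feature. It assumes the usual convention of weighting each side of the split by its share of the examples, which may differ in detail from the procedure described in class. The names `DATA`, `gini`, `split_counts`, and `weighted_gini` are hypothetical.

```python
# Rows of Table 1: (X, Y, Z, positives, negatives).
DATA = [
    (0, 0, 0, 10, 20),
    (0, 0, 1, 25, 5),
    (0, 1, 0, 35, 15),
    (0, 1, 1, 35, 5),
    (1, 0, 0, 5, 15),
    (1, 0, 1, 30, 10),
    (1, 1, 0, 10, 10),
    (1, 1, 1, 15, 5),
]


def gini(pos, neg):
    """Gini function C(a) = 2a(1 - a), where a is the fraction of positives."""
    total = pos + neg
    if total == 0:
        return 0.0
    a = pos / total
    return 2 * a * (1 - a)


def split_counts(rows, feature):
    """Return {value: [positives, negatives]} for feature index 0 (X), 1 (Y), or 2 (Z)."""
    counts = {0: [0, 0], 1: [0, 0]}
    for row in rows:
        counts[row[feature]][0] += row[3]
        counts[row[feature]][1] += row[4]
    return counts


def weighted_gini(rows, feature):
    """Gini value after splitting on `feature`, weighted by the size of each side."""
    counts = split_counts(rows, feature)
    n = sum(p + q for p, q in counts.values())
    return sum((p + q) / n * gini(p, q) for p, q in counts.values())


if __name__ == "__main__":
    for index, name in enumerate("XYZ"):
        print(name, weighted_gini(DATA, index))
```

For the second level of the tree, the same function can be called on the subset of rows that falls on each side of the chosen root split, for example `[r for r in DATA if r[0] == 0]`.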

4. Suppose the domain $X$ is the real line $\mathbb{R}$ and the labels lie in $Y = \{-1, 1\}$. Let $C$ be the concept class consisting of simple threshold functions of the form $h_\theta$ for some $\theta \in \mathbb{R}$, where $h_\theta(x) = -1$ for all $x \le \theta$ and $h_\theta(x) = 1$ otherwise. Give a simple and efficient PAC learning algorithm for $C$ that uses only $m = O\left(\frac{1}{\epsilon} \log \frac{1}{\delta}\right)$ training examples to output a classifier with error at most $\epsilon$ with probability at least $1 - \delta$.
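
One natural candidate for Problem 4 is a learner that outputs a threshold consistent with the training sample. The sketch below implements one such rule (take the largest point labeled $-1$) and runs it on data drawn uniformly from $[0, 1]$. The choice of rule, the uniform distribution, and the names `learn_threshold` and `h` are assumptions made for illustration. Whether this rule achieves the required sample complexity is what the written answer must establish.

```python
import math
import random


def h(theta, x):
    """Threshold classifier: -1 for x <= theta, and 1 otherwise."""
    return -1 if x <= theta else 1


def learn_threshold(samples):
    """Return the largest x labeled -1, or -inf if no example is labeled -1."""
    negatives = [x for x, y in samples if y == -1]
    return max(negatives, default=-math.inf)


if __name__ == "__main__":
    rng = random.Random(0)
    true_theta = 0.3  # hypothetical target, unknown to the learner
    train = [(x, h(true_theta, x)) for x in (rng.random() for _ in range(200))]
    theta_hat = learn_threshold(train)

    # Estimate the error of the learned threshold on fresh samples.
    test = [rng.random() for _ in range(10000)]
    error = sum(h(theta_hat, x) != h(true_theta, x) for x in test) / len(test)
    print(theta_hat, error)
```

The learned threshold never exceeds the true one, so every mistake falls in the interval between the two, which is a useful starting point for the error analysis.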

5. In this problem we will show that mistake-bounded learning is stronger than PAC learning, which should help crystallize both definitions. Let $C$ be a function class with domain $X = \{-1, 1\}^n$ and labels $Y = \{-1, 1\}$. Assume that $C$ can be learned with mistake bound $t$ using algorithm $A$. (You may also assume that at each iteration $A$ runs in time polynomial in $n$, and that $A$ only updates its state when it gets an example wrong.) The concrete goal of this problem is to show how a learner, given $A$, can PAC-learn the concept class $C$ with respect to any distribution $D$ on $\{-1, 1\}^n$. The learner can use $A$ as part of its output hypothesis and should run in time polynomial in $n$, $1/\epsilon$, and $1/\delta$.

To achieve this concrete goal in steps, we will break down this problem into a few parts. Fix some distribution $D$ on $X$, and say the examples are labeled by an unknown $c \in C$. For a hypothesis (i.e. function) $h : X \to Y$, let $\mathrm{err}(h) = \Pr_{x \sim D}[h(x) \neq c(x)]$.

(a) Fix a hypothesis $h : X \to Y$. If $\mathrm{err}(h) > \epsilon$, what is the probability that $h$ gets $k$ random examples all correct? How large does $k$ need to be for this probability to be at most $\delta'$? (The contrapositive view would be: unless the data is highly misleading, which happens with probability at most $\delta'$, it must be the case that $\mathrm{err}(h) \le \epsilon$. Make sure this makes sense.)

(b) As we feed examples to A, how many examples do we need to see before we can be sure of getting a block of k examples all correct? (This doesn’t mean the hypothesis needs to be perfect; it just needs to get a block of k all correct. Think about dividing the stream of examples into blocks of size k, and exploit the mistake bound. How many different hypotheses could A go through?)

(c) Put everything together and fully describe (with proof) a PAC learner that is able, with probability of failure at most $\delta$, to output a hypothesis with error at most $\epsilon$. How many examples does the learner need to use (as a function of $\epsilon$, $\delta$, and $t$)?
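
The block-of-$k$ idea from part (b) can be explored in code. The sketch below is a hedged illustration, not the intended solution: it wraps a mistake-bounded learner and stops as soon as the current hypothesis gets $k$ examples in a row correct. The block length `k` is left as a parameter, since choosing it is the subject of part (a). The toy learner `HalvingDictator` (a majority vote over candidate coordinates for targets of the form $c(x) = x_i$), the `predict`/`update` interface, and the helper names are all assumptions that do not come from the assignment.

```python
import random


class HalvingDictator:
    """Toy mistake-bounded learner for targets c(x) = x[i] (illustrative assumption)."""

    def __init__(self, n):
        self.alive = set(range(n))  # coordinates still consistent with all mistakes seen

    def predict(self, x):
        vote = sum(x[i] for i in self.alive)
        return 1 if vote >= 0 else -1

    def update(self, x, y):
        # Called only after a mistake: keep the coordinates that agreed with the label.
        self.alive = {i for i in self.alive if x[i] == y}


def uniform_examples(n, target, rng):
    """Endless stream of uniform examples labeled by the coordinate x[target]."""
    while True:
        x = tuple(rng.choice([-1, 1]) for _ in range(n))
        yield x, x[target]


def run_until_streak(learner, example_stream, k):
    """Feed examples until the learner is correct k times in a row.

    Returns the learner (in its final state) and the number of examples used.
    """
    streak = 0
    used = 0
    for x, y in example_stream:
        used += 1
        if learner.predict(x) == y:
            streak += 1
            if streak == k:
                return learner, used
        else:
            learner.update(x, y)  # the learner changes state only on mistakes
            streak = 0
    raise RuntimeError("stream ended before a streak of k correct predictions")


if __name__ == "__main__":
    rng = random.Random(0)
    learner, used = run_until_streak(HalvingDictator(8), uniform_examples(8, 5, rng), k=20)
    print(sorted(learner.alive), used)
```

Each mistake of this toy learner discards at least half of the surviving coordinates, so with 8 coordinates it makes at most 3 mistakes. The number of examples consumed can be compared against the bound derived in part (b).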
