LFD Book Forum


ArikB 10-07-2012 12:09 PM

Confused on question 6.
 
Edit: Solved it, the story below is no longer relevant. :)

I'm confused about how exactly one is supposed to calculate the score. My biggest confusion seems to stem from not understanding what a 'point' is. Is a point one of the input vectors? So 101, 110 and 111 are 3 points?

So then g[a], which returns 1 for all three points, would mean that:

Code:

101 | 1
110 | 1
111 | 1

And g[b], returns all 0's:

Code:

101 | 0
110 | 0
111 | 0

And g[c], the xor function, would return:

Code:

101 | 0
110 | 0
111 | 1

and g[d], the inverse of g[c], would return:

Code:

101 | 1
110 | 1
111 | 0

Or could it be that g[a] means that it will only return a 1 if all bits of the point are 1? So:

Code:

101 | 0
110 | 0
111 | 1

and g[b] would have a score of 0, because there are no 000 points.

I'm utterly confused by the question. :/

yaser 10-07-2012 02:02 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by ArikB (Post 6143)
I'm confused about how exactly one is supposed to calculate the score. My biggest confusion seems to stem from not understanding what a 'point' is. Is a point one of the input vectors? So 101, 110 and 111 are 3 points?

A point is a data point, so these are 3 points. For each possible target function, there is a number of agreements (0, 1, 2, or 3) with your hypothesis on these 3 points. We are keeping a tally of the number of agreements as we go through all possible target functions.
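
A minimal sketch of this tally in Python. The names `POINTS`, `agreements`, and `tally` are illustrative and do not come from the problem statement, and the hypothesis used at the end (1 on all three points) is only one example. The sketch assumes that each possible target is identified by its values on the three points, since its values elsewhere are fixed by the data.

```python
from collections import Counter
from itertools import product

# The three points discussed in this thread.
POINTS = ["101", "110", "111"]

def agreements(f_vals, g_vals):
    """Count the points on which a target's values and a hypothesis's values match."""
    return sum(f == g for f, g in zip(f_vals, g_vals))

def tally(g_vals):
    """Map 'number of agreements' to 'how many possible targets give that number'."""
    counts = Counter()
    # Assumption: a possible target is determined by its values on POINTS,
    # so there are 2**3 = 8 of them to go through.
    for f_vals in product([0, 1], repeat=len(POINTS)):
        counts[agreements(f_vals, g_vals)] += 1
    return dict(counts)

# Example hypothesis (illustrative): returns 1 on all three points.
g_all_ones = (1, 1, 1)
print(tally(g_all_ones))
```

The score asked for in the problem is then a weighted combination of these counts, with the weights given in the problem statement itself.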

ArikB 10-07-2012 02:03 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by yaser (Post 6145)
A point is a data point, so these are 3 points. For each possible target function, there is a number of agreements (0, 1, 2, or 3) with your hypothesis on these 3 points. We are keeping a tally of the number of agreements as we go through all possible target functions.


Thank you for the response, I was approaching the question completely wrong but solved it in the meantime. :)

yaser 10-07-2012 02:09 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by ArikB (Post 6146)
Thank you for the response, I was approaching the question completely wrong but solved it in the meantime. :)

You are welcome. Everyone is encouraged to ask questions, big or small.

noahdavis 10-08-2012 08:46 AM

Re: Confused on question 6.
 
Quote:

Originally Posted by yaser (Post 6145)
A point is a data point, so these are 3 points. For each possible target function, there is a number of agreements (0, 1, 2, or 3) with your hypothesis on these 3 points. We are keeping a tally of the number of agreements as we go through all possible target functions.

Sorry I'm struggling a bit understanding the framework here. Maybe it's just terminology. What is the difference between a "possible target function" and a "hypothesis" ? I thought that they were equivalent, but it does not seem to be the case - a hypothesis must agree with a target function.

yaser 10-08-2012 10:31 AM

Re: Confused on question 6.
 
Quote:

Originally Posted by noahdavis (Post 6183)
Sorry I'm struggling a bit understanding the framework here. Maybe it's just terminology. What is the difference between a "possible target function" and a "hypothesis" ? I thought that they were equivalent, but it does not seem to be the case - a hypothesis must agree with a target function.

Possible target function is a notion introduced in this problem in order to make a point about learning. In general, there is one target function, albeit unknown. Here we spell out "unknown" by considering all the possibilities the target function can assume. We can afford to do that here because there is only a finite number of possibilities.

Hypotheses are the products of learning that try to approximate the target function. In this problem, we prescribe different learning scenarios that result in different hypotheses, then attempt to grade these hypotheses. We grade them according to how well each of them approximates the target function. The twist is that we consider all possible target functions and grade the hypothesis according to how well it approximates each of these possible targets.

noahdavis 10-08-2012 02:52 PM

Re: Confused on question 6.
 
Thank you - I understand now. For some reason it took me a leap to figure out how to build the "target function" such that it could be measured as stated in the problem. Originally, I had a list of 8 "functions" - but each function was just simply one of the 8 permutations where a permutation was an input point and a possible output.

apank 10-08-2012 06:29 PM

Re: Confused on question 6.
 
Hi,
What's a possible target function? Is that a combination of Boolean operators? How do you come up with the formula 2^2^3 for the total number of possible target functions for 3 Boolean inputs? Thank you.

yaser 10-08-2012 09:49 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by apank (Post 6191)
Hi,
What's a possible target function? Is that a combination of Boolean operators? How do you come up with the formula 2^2^3 for the total number of possible target functions for 3 Boolean inputs? Thank you.

A possible target function is any function that could have generated the 5 data points in this problem, i.e., any function whose values on these five points all agree with the data.

There are $2^3=8$ points in the input space here, which are all binary combinations of the 3 input variables from 000 to 111. For each of these points, a Boolean function may return 0 or 1; hence two possibilities. Therefore, over all 8 points, the outputs of a Boolean function can be chosen in $2 \times 2 \times \cdots \times 2$ (8 factors) ways, which gives the number of different Boolean functions: $2^{2^3}=2^8=256$.
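
The count can be checked by brute force. The sketch below enumerates every Boolean function on 3 inputs as a lookup table and then keeps those that agree with a data set of 5 labeled points. The labels in `D` are placeholder values chosen for illustration, not the data from the problem; the number of consistent functions does not depend on which labels are used.

```python
from itertools import product

# All 8 input points, 000 through 111, as tuples of bits.
inputs = list(product([0, 1], repeat=3))

# A Boolean function is one output bit per input point: 2**8 = 256 tables.
all_functions = [dict(zip(inputs, outputs))
                 for outputs in product([0, 1], repeat=len(inputs))]
print(len(all_functions))  # 256

# HYPOTHETICAL data set: the 5 inputs other than 101, 110, 111,
# with placeholder labels (not taken from the problem).
D = {(0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 0, (0, 1, 1): 1, (1, 0, 0): 0}

# A possible target function agrees with the data on every point of D.
consistent = [f for f in all_functions
              if all(f[x] == y for x, y in D.items())]
print(len(consistent))  # 8
```

Fixing 5 of the 8 outputs leaves 3 free bits, which is where the 8 possible target functions mentioned in this thread come from.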

dobrokot 01-08-2013 01:26 PM

Re: Confused on question 6.
 
Quote:

score = (# of target functions agreeing with hypothesis on all 3 points)*...+...
Are "all 3 points" the points 101, 110, 111 outside D? Are "2 points" any 2 of these three?
So the y-values on the points in D are not used in the answer?

It seems that the number of matches is not affected by which hypothesis I choose: any hypothesis produces the same counts, Binomial(3, #matches), on these 3 points. That seems too easy, like a dangerous trap or a puzzle with a catchy answer. If the number of matches is always the same, why define some complicated functions of the matches and give Y-values on the other five points? Or did I get something wrong? :| Maybe matches outside these 3 points (matches inside D) should be counted too?
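
The counting observation can be written out as a short derivation. It assumes, as discussed in this thread, that the possible targets are exactly the $2^3 = 8$ assignments of values to the three points outside D, and that $g$ is any fixed hypothesis on those points:

```latex
\#\{\, f : f \text{ agrees with } g \text{ on exactly } k \text{ of the 3 points} \,\}
  = \binom{3}{k}, \qquad k = 0, 1, 2, 3,
\qquad\text{and}\qquad
\sum_{k=0}^{3} \binom{3}{k} = 1 + 3 + 3 + 1 = 8 = 2^3 .
```

The reason is that choosing which $k$ points agree determines $f$ on all three points (it equals $g$ there and differs from $g$ on the rest), so the count is the same for every $g$.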

butterscotch 01-08-2013 02:31 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by dobrokot (Post 8460)
Are "all 3 points" the points 101, 110, 111 outside D? Are "2 points" any 2 of these three?

You are right. "2 points" means 2 of the remaining points in X.

Quote:

Originally Posted by dobrokot (Post 8460)
Maybe matches outside these 3 points (matches inside D) should be counted too?

The 3 points used in the final score are the 3 points in X outside of D.

tom.mancino 01-09-2013 11:28 AM

Re: Confused on question 6.
 
Quote:

Originally Posted by yaser (Post 6193)
A possible target function is any function that could have generated the 5 data points in this problem, i.e., any function whose values on these five points all agree with the data.

There are $2^3=8$ points in the input space here, which are all binary combinations of the 3 input variables from 000 to 111. For each of these points, a Boolean function may return 0 or 1; hence two possibilities. Therefore, over all 8 points, the outputs of a Boolean function can be chosen in $2 \times 2 \times \cdots \times 2$ (8 factors) ways, which gives the number of different Boolean functions: $2^{2^3}=2^8=256$.

Dr. Yaser or anyone, I am still a little lost on this problem set, I think due to a fundamental lack of mathematical knowledge (i.e., my fault; I am completely self-taught). I am able to visualize the entire Boolean set of 8 possible points (000 - 111), i.e. the total Xn set, after a little Google assistance on Boolean number theory, but then I get lost in attempting to understand how to compare the other Boolean functions not in X to derive the T/F values. :clueless: I am assuming I am missing something fundamental in Boolean mathematics, but I am not sure.

butterscotch 01-09-2013 04:26 PM

Re: Confused on question 6.
 
no worries! :)

x_n is a vector of 3 values: x_n = [x_n1, x_n2, x_n3]. Each value can be 0 or 1, so there are 8 (2 * 2 * 2) distinct x_n vectors.

In digital logic, the boolean values true/false are represented as 1 and 0. 1 is true and 0 is false.

In 6(c), the problem defines g as XOR: "if the number of 1's in x is odd, g returns 1; if it is even g returns 0".
Consider x_n = [1,0,0]. The number of 1's in this example is 1, which is odd, so g returns 1: g(x_n) = 1.
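
The parity rule quoted above is short enough to write as code. This is a sketch: the function names `g_xor` and `g_not_xor` are illustrative, and `g_not_xor` follows the description of 6(d) given elsewhere in this thread (the opposite of the XOR function).

```python
def g_xor(x):
    """Return 1 if the number of 1's in x is odd, 0 if it is even."""
    return sum(x) % 2

def g_not_xor(x):
    """Return the opposite of g_xor on every input."""
    return 1 - g_xor(x)

# The worked example from the post: one 1, which is odd, so g returns 1.
print(g_xor([1, 0, 0]))  # 1

# Values of both hypotheses on the three points discussed in this thread.
for point in ([1, 0, 1], [1, 1, 0], [1, 1, 1]):
    print(point, g_xor(point), g_not_xor(point))
```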

lhamilton 01-09-2013 05:09 PM

Re: Confused on question 6.
 
Sorry, I am also a bit confused on question #6.

Specifically, I want to understand 6(d). It says g returns the opposite of the XOR function: if the number of 1s is odd, it returns 0, otherwise it returns 1.

Here is what I am unclear on. Is the meaning of 6(d) that the hypothesis on set D is simply D, and then outside of D it is this opposite of XOR function? Or is 6(d) trying to define a hypothesis for the entire dataset?

Clearly, 6(d) does the exact wrong thing on D, so by definition there are no target functions that satisfy 6(d) if that's the function defined on the whole dataset. But if he's only describing what g does outside of D, then it's a totally valid target function.

Let me ask my question a different way; perhaps that will be clearer.

For 6(d), is g[0,0,0] = 0 or is g[0,0,0]=1?

butterscotch 01-09-2013 06:34 PM

Re: Confused on question 6.
 
"We want to determine the hypothesis that agrees the most with the possible target functions," and we measure this by counting, for each of the 8 target functions, how many of the 3 points not in D agree with the hypothesis. In 6(a) and 6(b), the hypothesis is only defined on the last three points. Although the hypotheses in 6(c) and 6(d) are defined on all points, we already know what f(x) is for the points in D. So I think you are more interested in g[1,1,1], g[1,1,0], g[1,0,1].

kumarpiyush 01-10-2013 09:23 AM

Re: Confused on question 6.
 
Quote:

Originally Posted by noahdavis (Post 6190)
Thank you - I understand now. For some reason it took me a leap to figure out how to build the "target function" such that it could be measured as stated in the problem. Originally, I had a list of 8 "functions" - but each function was just simply one of the 8 permutations where a permutation was an input point and a possible output.

Does this mean the combinations (000), (001), up to (111) are the 8 target functions?

kumarpiyush 01-10-2013 09:25 AM

Re: Confused on question 6.
 
Quote:

Originally Posted by yaser (Post 6186)
Possible target function is a notion introduced in this problem in order to make a point about learning. In general, there is one target function, albeit unknown. Here we spell out "unknown" by considering all the possibilities the target function can assume. We can afford to do that here because there is only a finite number of possibilities.

Hypotheses are the products of learning that try to approximate the target function. In this problem, we prescribe different learning scenarios that result in different hypotheses, then attempt to grade these hypotheses. We grade them according to how well each of them approximates the target function. The twist is that we consider all possible target functions and grade the hypothesis according to how well it approximates each of these possible targets.

I understood it now :-)

lhamilton 01-10-2013 02:31 PM

Re: Confused on question 6.
 
I still don't think I'm interpreting this question correctly.

For 6(d), the function described does not match the data set D.

So, given that, am I correct in thinking that for hypothesis 6(d) there are zero target functions that match the hypothesis?

Because, by definition, a target function must agree with the given data set D. Right?

butterscotch 01-10-2013 03:19 PM

Re: Confused on question 6.
 
Yes. The target function agrees with D, and there are 8 such functions.

Now we want to determine the hypothesis that agrees the most with the possible target functions.

Problem 6 defines this measurement as counting how many of the target functions match each hypothesis on the three points. :)

Manny 04-10-2013 06:12 AM

Re: Confused on question 6.
 
I'm confused, were we supposed to work this out by hand or were we supposed to code it out?

zhou_jinyuan 06-16-2013 07:48 AM

Re: Confused on question 6.
 
I just started. Not sure if the forum is closed or not. I am confused too. As I understand it, a hypothesis set is associated with a learning algorithm. Do the g's in choices (a) to (d) come from the same learning algorithm, or does each description represent a different algorithm? Since we have 256 possible hypotheses, I can conceptually call my learning algorithm "try all", which has all 256 possible functions as its hypothesis set. Does this exercise assume we are working with the "try all" algorithm?
Thanks,

yaser 06-16-2013 12:51 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by zhou_jinyuan (Post 11155)
I am confused too. As I understand it, a hypothesis set is associated with a learning algorithm. Do the g's in choices (a) to (d) come from the same learning algorithm, or does each description represent a different algorithm? Since we have 256 possible hypotheses, I can conceptually call my learning algorithm "try all", which has all 256 possible functions as its hypothesis set. Does this exercise assume we are working with the "try all" algorithm?

A hypothesis set is just that; a set of hypotheses. The algorithm is a separate entity that chooses the final hypothesis from this set. It can in principle make that choice any way it wants (some algorithms may be better than others for the same hypothesis set).

To answer your question, the algorithm can try all hypotheses (in the hypothesis set), but it will have to choose one and only one as the final hypothesis that it reports. When we grade the algorithm, what matters is the performance of the final hypothesis it arrived at, regardless of how it arrived at it.
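
To make the separation concrete, here is a toy sketch in which the hypothesis set is a plain list of functions and the algorithm is a separate function that picks one of them. Everything in it is hypothetical: the three hypotheses, the placeholder data, and the selection rule (fewest disagreements with the data) are illustrative choices, not the setup of the homework problem.

```python
# HYPOTHETICAL hypothesis set: just a collection of candidate functions.
def h_zero(x):
    return 0

def h_one(x):
    return 1

def h_parity(x):
    return sum(x) % 2

hypothesis_set = [h_zero, h_one, h_parity]

# Placeholder training data (input, label); not the data from the problem.
data = [((0, 0, 1), 1), ((0, 1, 1), 0), ((1, 0, 0), 1)]

def in_sample_errors(h, data):
    """Number of data points on which h disagrees with the label."""
    return sum(h(x) != y for x, y in data)

def pick_min_error(hypothesis_set, data):
    """One possible algorithm: try every hypothesis, report exactly one."""
    return min(hypothesis_set, key=lambda h: in_sample_errors(h, data))

g = pick_min_error(hypothesis_set, data)  # the final hypothesis
print(g.__name__)
```

A different selection rule applied to the same `hypothesis_set` would be a different algorithm; only the final `g` it reports is graded.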

royal 06-21-2013 11:07 AM

Re: Confused on question 6.
 
It's taking me a while to get my head around what's going on in this question and how I am supposed to calculate the scores.

For (a), the hypothesis g returns 1 for all three points. So does this mean that for each of the points 101, 110 and 111 as x_n, y_n is 1?

If so then I am not sure what I am then supposed to compare this to?

Thanks for any help.

yaser 06-21-2013 12:00 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by royal (Post 11163)
For (a), the hypothesis g returns 1 for all three points. So does this mean that for each of the points 101, 110 and 111 as x_n, y_n is 1?

If so then I am not sure what I am then supposed to compare this to?

Correct. In this problem, we are considering different target functions (an unusual consideration that is meant to underline the difficulty of learning). You compare the values of g to the values of each f on these 3 points, and compute the score based on the different f's that can be the target function.

royal 06-21-2013 01:32 PM

Re: Confused on question 6.
 
Thanks for the quick reply.

I've only just realised that having done very little Boolean logic, other than knowing what the basic gates are, is making this confusing for me. I did some further reading, but it's hard to find a quick summary. Sorry to be asking dumb questions, but using AND, OR and NOT I get 9 possible functions instead of 8:

a+b+c
a+b*c
a*b*c
a*b+c

a'b'c
a'b+c
a+b'c

a*b'c
a'b*c

Is there something straightforward that's wrong with this, or do I need to go and spend an evening learning these before moving on?!

yaser 06-21-2013 02:22 PM

Re: Confused on question 6.
 
Quote:

Originally Posted by royal (Post 11170)
Thanks for the quick reply.

I've only just realised that having done very little Boolean logic, other than knowing what the basic gates are, is making this confusing for me. I did some further reading, but it's hard to find a quick summary. Sorry to be asking dumb questions, but using AND, OR and NOT I get 9 possible functions instead of 8:

a+b+c
a+b*c
a*b*c
a*b+c

a'b'c
a'b+c
a+b'c

a*b'c
a'b*c

Is there something straightforward that's wrong with this, or do I need to go and spend an evening learning these before moving on?!

No need to go through an AND/OR implementation in this case. All you need is to list all the possible target functions (by values) on these three points exhaustively. That would be the 2^3=8 possible binary combinations of 3 bits.
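
As a sketch of listing the targets "by values", each possible target can be stored as a small lookup table from point to output, with no AND/OR formula involved. The variable names are illustrative, and the printed layout mimics the `point | value` tables used earlier in this thread.

```python
from itertools import product

# The three points discussed in this thread.
points = ["101", "110", "111"]

# One lookup table per combination of 3 output bits: 2**3 = 8 targets.
targets = [dict(zip(points, bits)) for bits in product([0, 1], repeat=3)]

for i, f in enumerate(targets, start=1):
    row = "   ".join(f"{x} | {f[x]}" for x in points)
    print(f"f{i}:  {row}")
```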

yaser 09-02-2015 09:19 PM

Re: Confused on question 6.
 
Could you post the question in an *ANSWER* thread (see above "BEFORE posting answers - please read").

henry2015 09-02-2015 09:46 PM

Re: Confused on question 6.
 
Sorry, deleted.

Will post under the correct thread.

robbinsleep 03-12-2016 05:43 AM

Re: Confused on question 6.
 
This is exactly what I am going to ask also :D


The contents of this forum are to be used ONLY by readers of the Learning From Data book by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, and participants in the Learning From Data MOOC by Yaser S. Abu-Mostafa. No part of these contents is to be communicated or made accessible to ANY other person or entity.