Monday, August 27, 2007

[Discussion Topic--must comment] Optimality; prior knowledge; discounted rewards; Environment vs. Agent complexity

[[Folks:
 By now you have all had enough time to get yourselves signed up to the class blog. As I said, participation is "required" in this class. Participation involves
doing assigned readings, asking questions (as needed) in class and, most importantly, taking part in the class blog discussions. Here is the first discussion topic
for your edification.

As for the quantity vs. quality of your comments, I suggest you go by the Woody Allen quote below for guidance.. ;-)]]


Here are some of the things that I would like to see discussion/comments on from the class:

1. Optimality--given that most "human agents" are anything but provably optimal, does it make sense for us to focus on optimality of our agent algorithms? Also, if you have more than one optimality objective ( e.g., cost of travel and time of travel), what should be the goal of an algorithm that aims to get "optimal" solutions?

2. Prior Knowledge--does it make sense to consider agent architectures where prior knowledge and representing and reasoning with it play such central roles? Also, is it easy to compare the "amount" of knowledge that different agents start with?

3. Environment vs. Agent complexity--One big issue in agent design is that an agent may have very strong limitations on its memory and computational resources. A desirable property of an agent architecture should be that we can instantiate it for any <agent, environment> pair, no matter how complex the environment and how simplistic the agent. Comment on whether or not this property holds for the architectures we saw. Also, check out "Simon's Ant" on the web and see why it is related to this question.

4. Learning--In the class, we said that an agent can learn (and improve) its knowledge about how the world evolves and how its actions affect the world etc. One thing that was not clarified is whether "utilities" are learned or given/hard-wired. Any comments (using your knowledge of humans)?

5. Anything else from the first three classes that you want to hold-forth on.

Rao


----------------

"
The question is have I learned anything about life. Only that human beings are divided into mind and body. The mind embraces all the nobler aspirations, like poetry and philosophy, but the body has all the fun. The important thing, I think, is not to be bitter... if it turns out that there IS a God, I don't think that He's evil. I think that the worst you can say about Him is that basically He's an underachiever. After all, there are worse things in life than death. If you've ever spent an evening with an insurance salesman, you know what I'm talking about. The key is, to not think of death as an end, but as more of a very effective way to cut down on your expenses. Regarding love, heh, what can you say? It's not the quantity of your sexual relations that counts. It's the quality. On the other hand if the quantity drops below once every eight months, I would definitely look into it. Well, that's about it for me folks. Goodbye. "
                ---Boris in Love & Death (1975 http://us.imdb.com/title/tt0073312/ )

14 comments:

Tuan A. Nguyen said...

Hi,
I had some thoughts about the learning ability and the perfectness of our agents this morning. The question is: If the agent can correctly learn the following factors (and others?), will we have a perfect agent? Or, can we design a perfect agent?:

(1) states of the world (i.e., from learning it knows all features of the world, even in the future! -- so its sensor limits do not matter).

(2) The performance measure/goals/utility it has to maximize/achieve (i.e the agent can learn what it needs to do!).

One can say: "How about the learning criteria? They must be subjective!" In this sense, we still can't have a perfect agent. So I think we have to go beyond our current general architecture, e.g., we need a meta-program that can generate the agent program so that the agent can act and do everything itself successfully!? Then, what would this meta-program look like? (your ideas?)

If some perfect agent is created by humans some day, it will be so surprising that an imperfect agent (like the smart guy crossing the street in RN's textbook) can design a perfect one!

Looking forward to hearing from you!
A.Tuan
--
PS: Here I'm talking about agents designed by/for humans, not some (excellent) systems in nature like "HIV", which we still cannot defeat.

nishant said...

1. Well, in AI everything depends on the needs or requirements of the particular problem the agent is designed for. Yes, it's true that what we want is the optimal solution to the problem, but there can be another measure too: as discussed in class, the "utility" of the actions the agent takes at each state can make up for some non-optimal solutions, and this utility can be measured or improved as the agent gathers knowledge from the environment. Therefore the goal for any such problem should be to maximise performance by conforming to the utilities defined on each state.

2. Prior knowledge will help the agent know its environment better, but I have a doubt here: when a child is born, he or she has all the info fed into his or her brain, which gradually gets narrowed down to some specific knowledge only....
Well, consider a situation here:
there is a child who from the very first day is kept in, let's say, a closed room. After, like, 20 years, will that human be a superhuman being, since he will have all the pristine info that he was born with? (I am not a devil here, as you might be thinking from my thought of keeping him locked up for 20 years....)

4. I skipped the 3rd one.
Well, initially the utilities can be hard-wired, but ultimately they should also be improved as the agent gathers knowledge about the environment.

RANDY said...

1. I would say that there are only rare cases where optimality should be an overriding concern (say, if the agent was controlling a space probe with a limited battery life). Usually, something "nearly optimal" suffices (since resources are usually overabundant, especially when we can pre-arrange the environment).

2. Prior knowledge could greatly reduce learning time, which will be important until our computers are much, much faster than now. Also, once one AI is made & learns, many copies of its mind could be made, in which case "prior knowledge" would constitute most of what they need to know to operate. However, quantifying the knowledge of an agent in any more than a crude way (order-of-magnitude precision or worse) would seem to require a metric over the space of possible concepts, & if we understood knowledge that well, we would likely already have human-level AI.

3. Aside from pathological environments (which can even be designed for humans), all of the systems we considered so far seem to be sufficiently scalable, though the reflex-without-memory agent would not fare well with too many goals unless there were enough patterns in their rule tables to compress them significantly (at which point they would begin to resemble the other solutions). I expect that some reasonable probability distribution for "normality" of worlds would result in all of them working in most "normal" worlds.

4. Learning of utilities does seem to occur in humans - consider people who learn to "push through pain" either for athletic or military purposes. While hard-wired utilities might make it slightly harder for AI to take over the world, that could be better addressed in other ways, & the added flexibility from having utility functions being customizable would enable adapting to new environments where goals might be too different to fit into the previous utility system.

Kyle Luce said...

(4) "Utilities are learned or given/hard-wired": I assume hard-wired means some kind of internal reward system.

For learning, I think a reward system outside of the agent would be optimal. I am not sure if that is plausible/possible, but I think any 'internal' rewards would be issued to itself indefinitely. Learned self rewards? ..e.g. a bunch of food in your pocket, which you can eat whenever you want :)
"Napoleon, give me some of your tots. {kicks pocket}"

(2) I think it is relatively easy to compare knowledge in a programmatic agent if it is some kind of database of lookups (e.g. Alice). However, for more complex AI, this may be a more daunting task. I think that the 'knowledge' of a decision-based system would be harder to gauge.

Subbarao Kambhampati said...

In response to Randy's comment on optimality--

(I must admit that I threw that one in there mostly to get Randy to say his piece so I can do this response..
Talk about evil plans ;-) )

In order to understand what is "near optimal" and what is not, we still need to understand what we would need to do to achieve optimality.

This--IMHO--is the most important reason why we care about optimality. (For example, we will see--while doing A* search next week--that you can get optimality by having admissible heuristics. We will also see that getting admissible *and* "efficient" (technically "informed") heuristics is too hard. So, in practice, people get by with inadmissible heuristics. Nonetheless, the understanding of what is needed to get optimality helps us in designing heuristics that might give near-optimal solutions.)


rao
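[A small illustration of the admissible-vs-inadmissible point above. This is only a sketch, not material from the class: the four-node graph, the two heuristic tables, and the function name `a_star` are all made up for the example. With a heuristic that never overestimates the remaining cost, A* returns the cheapest path; with one that overestimates at node A, it settles for a costlier path.]

```python
import heapq

# Hypothetical toy graph: GRAPH[node][neighbor] = step cost.
GRAPH = {
    "S": {"A": 2, "B": 1},
    "A": {"G": 2},
    "B": {"G": 4},
    "G": {},
}

# True remaining costs are S=4, A=2, B=4, G=0.
H_ADMISSIBLE = {"S": 3, "A": 2, "B": 3, "G": 0}    # never overestimates
H_INADMISSIBLE = {"S": 0, "A": 5, "B": 1, "G": 0}  # overestimates at A


def a_star(graph, h, start, goal):
    """Return (cost, path) for the first goal popped from the frontier."""
    frontier = [(h[start], 0, start, [start])]  # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # a cheaper route to this node was already found
        for nxt, step in graph[node].items():
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h[nxt], new_g, nxt, path + [nxt]))
    return None


print(a_star(GRAPH, H_ADMISSIBLE, "S", "G"))    # optimal: cost 4 via A
print(a_star(GRAPH, H_INADMISSIBLE, "S", "G"))  # suboptimal: cost 5 via B
```

[Knowing that the optimum here is 4 is what lets us say the second answer is "near optimal" rather than just "an answer".]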

Louis Casillas said...

I believe that achieving as optimal an agent as possible should be the goal. However to achieve this goal I think the agent's processing power would have to be extremely great. Possibly in the present we should be happy with near optimality. But I feel that we should strive to make an agent act as close to optimal as physically possible.

If there are two goals or multiple goals I would hope they would be assigned values for which is the most important. If they were all equal then some metric would have to be used such as what can be completed the soonest or what uses the least resources.

Just a side note. If AI machines do become widespread I hope they have some overriding goals in order to benefit humanity as a whole and not simply to benefit their owner whether that's a person or a corporation or what. Something like the 3 laws of robotics from Isaac Asimov. And maybe other things such as making a universal goal to use less gas or to make the most people smile. :)

Anupam said...

1. In case of multiple (possibly conflicting) optimality objectives, one approach would be to prioritize them, maybe using weights. But as the priorities change according to situation, weights should be flexible. So it would be desirable that agent can learn the relative importance and conflicts among objectives.

2. Having prior knowledge of the world is important even for a learning agent. Even for trial and error based learning, the agent needs to know if the outcome of a random sequence of actions is desirable or not. Comparing the knowledge of agents seems to be tricky. One could say that it can be done by comparing the optimal solutions given by different agents. But if an agent with a lesser amount of prior knowledge learns faster during the course of problem solving and thus gives a better solution, we would conclude this agent "had" more knowledge, which was not true. It was just that it gained knowledge faster.

3. I feel the agent program is computationally the most demanding part of the agent architecture.
The complexity of the environment doesn't seem to be a big issue, but rather the agent's response to each eventuality. So if we can specify the agent's response in as simple or generic terms as possible, the whole architecture would be scalable. This improvement can be seen as we move from a simple reflex system to a state-based system, since in that case the agent has a way of generalizing over possible situations.
Simon's ant explains that an ant, instead of trying to cope with the enormous complexity of its world, "chooses" to follow some simple actions which let it achieve its goals.

4. As agents are designed by us for a particular "purpose", utilities have to be hardcoded accordingly. But this doesn't mean having a rigid set of utilities. As the agent learns, it can develop some higher level features or patterns from the given utilities. It can learn the relative dependencies and conflicts among the utilities. As an analogy, humans have some utilities hard-coded right from birth, like pain/hunger is a negative utility. But gradually we also learn what kind of situations should be avoided. http://en.wikipedia.org/wiki/Ego gives some good info about hard-coded and learnt utilities among humans.

-anupam

imina said...

Optimality for human agents is usually not as strongly defined. Thus getting a solution becomes more important than getting the best solution. Whereas when we are designing intelligent agents - the developer would have some set list of criteria (with weights/priorities) which they use to compare or gauge the action of the agent and improve it to arrive at some "logical" & desired result {i.e. as expected by the developer}. As long as there is any scope for "improvement" I would say optimality has not been achieved. In that respect the goal of any algorithm that aims to get the "optimal" solution is the one that conforms to the criteria pre-decided by the developer. On the other hand if we think of optimality as providing the best possible solution in a particular situation, it has to be dynamic - i.e. considering whether you want to get there fast or whether you want to save money. That being the case, we might as well allow for the user input to specify which is desired!!
As for prior knowledge - it is an interesting question whether the agent itself is aware of the existence of prior knowledge within itself. If the agent is not aware then I would assume it would be very hard to compare the amounts of knowledge. Unless we try out some experiments to find out its reflex actions (other than "do nothing"!!) when it hasn't been trained for them. I guess that's the kind of thing the baby-depth experiment was trying to do.... So is it true what I read on a t-shirt some time back, "I was born intelligent...education ruined me" :)
For the environment and agent complexity I would agree with Anupam's comment in that the Simon ant would just go around the obstacle to reach its destination no matter how circuitous the path is. Hmmm, so do we say that the ants are not optimal either? I guess for the ant it is just getting the food there or not and they may not have to adhere to the 9 to 5 work schedule and thus no hurry with the time?
I would rather say some utilities are hard-wired but some are learnt. For example, if you are hungry and you eat food, you are satiated - that would have to be hard-wired that you are full (and thus at relative peace with the world!) and thus "happy" (hopefully). Whereas the concept that we had as kids that we would be happy watching cartoons and not an educational DVD on Stoichiometry (we did not have too many of those)... is I think learnt rather than hard-wired. Similarly the concept that I will be happy if I get an A+ in this class, I don't particularly believe it is hard-wired into my system as yet, but I guess I am still learning..... :)

Subbarao Kambhampati said...

(in response to "Imina")
People who learned to get high utility euphoria from educational DVDs on Stoichiometry are clearly in need of medical supervision.

That said, the point about in-built vs. learned utilities/rewards makes sense.

About the only hard-wired reward system we have is the dopamine rush. When we say we learned to love Stoichiometry videos, then I guess it means that we were able to make enough connections to make the experience of watching these videos lead to a miniature dopamine rush for ourselves.

(which, of course, is what makes hallucinogenic drugs so diabolic. They short-circuit the whole "achieve something, get a dopamine high" to "just get a dopamine high".)

I wonder if the sign of full evolution of intelligence on the part of robots would be that at some point of time, they will all say "to heck with A* search and value iteration" and go hang out in Haight-Ashbury high on robotic reward drugs, and a beatific look on their faces.

Enough half-baked neuroscience..

rao

ps: A request for folks with blog handles that are not directly connected to their names--please sign your name so I know who is talking..

imina said...

ummm sorry for the confusion about my name. I am Ina, hence imina :)
-Ina

Yin said...

1. Actually, I was confused when we were talking about the utility-based agent. How could we define an optimal function to maximize happiness? Of course, we can define different weights for different objectives. Even when we are not sure whether the importance of the cost is twice the importance of the time or 1.5 times, some dynamic methods could help us adjust the weights at runtime. However, everyone has his own preferences. So, if we are trying to develop an agent to simulate general human intelligence, the task is impossible. If we just need to simulate one real guy, I think we can define the optimal function arbitrarily, like (happiness = money). After all, how could we say there is no person who does not care about time at all?

2. Prior knowledge is definitely very important. It is learning that makes us humans more and more intelligent, and the most important aspect of learning is expanding prior knowledge. However, it is difficult to compare the amount of knowledge exactly. Roughly speaking, we can measure the amount of knowledge by the size of the table. When quality is also considered, a better choice is the entropy function from information theory, even though it is still hard to define that function for general knowledge.

3. I feel the complexity of the environment is influenced by the agent. If the agent pays great attention to every detail of the environment, the environment is unacceptably complex. The same environment would be simple for an agent that considers only a small range of information. Simon's Ant can survive in a complex environment because it does not care about the rock until it cannot keep to its original route. The ant is simple but its overall behavior is complex when the environment is complex. It seems that the utility-based agent has that property because it only pursues locally maximum happiness rather than the globally largest benefit.

4. Utilities should be given at first but be allowed to change. As I mentioned in the first question, sometimes we are not sure which one we really want or which one is really the most important for us. At that point, we can select a safe but not optimal utility function and see what happens. Then, we can adjust that function according to our new observations.
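
[The Simon's Ant point in (3) can be sketched in a few lines. This is a toy under stated assumptions, not Simon's own formulation: the grid, the rock positions, and the function name `ant_walk` are invented for illustration. The ant has one fixed rule (go east; if a rock is in the way, sidestep north), yet the path it traces gets more complicated as the environment does.]

```python
def ant_walk(rocks, goal_x, max_steps=1000):
    """Walk from (0, 0) until column goal_x is reached; return the moves made.

    rocks is a set of blocked (x, y) cells. The ant never plans ahead:
    it only looks at the single cell to its east.
    """
    x, y = 0, 0
    moves = []
    while x < goal_x and len(moves) < max_steps:
        if (x + 1, y) in rocks:
            y += 1           # blocked: sidestep north
            moves.append("N")
        else:
            x += 1           # free: keep heading east
            moves.append("E")
    return "".join(moves)


print(ant_walk(set(), 5))                       # empty beach: a straight line
print(ant_walk({(2, 0), (2, 1), (4, 2)}, 5))    # rocky beach: a winding path
```

[The agent program is identical in both runs; only the environment changed.]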

Gavin Lewis said...

If we're trying to make agents that act as humans do, we should limit our focus on optimality to the degree that humans are suboptimal. Otherwise, focusing on acting rationally, computers can achieve near-optimality in at least trivial areas, and better optimality than humans in some surprisingly complex areas. With more than one optimality objective, perhaps cost of travel and time of travel could be integrated into an overall utility and the utility itself be optimized. This could be done based on a user selection (optimize for time alone, or cost alone), or it could be done based on knowing how much cost a unit of time might have.

It does make sense to consider architectures relying on some amount of prior knowledge, but not exclusively. The alternative is to assume no prior knowledge, which is a severe handicap. The discussion question, however, was whether it made sense to consider architectures that rely heavily on prior knowledge. It depends on who is asking the question. I don't know. I don't have enough experience to say for sure.
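
[One way to picture the "overall utility" idea from the first paragraph, as a rough sketch only: the route names, the numbers, and the functions `best_route` and `pareto_front` are hypothetical. Pricing a unit of time turns two objectives into one number to minimize; without such a price, the best we can do is discard routes that are worse on both counts.]

```python
# Hypothetical routes: name -> (money cost in dollars, travel time in hours).
ROUTES = {
    "toll_highway": (12.0, 0.5),
    "back_roads": (4.0, 1.0),
    "scenic_detour": (6.0, 1.5),
}


def best_route(routes, value_of_time):
    """Pick the route minimizing money + value_of_time * hours."""
    return min(routes, key=lambda name: routes[name][0] + value_of_time * routes[name][1])


def pareto_front(routes):
    """Return the routes that no other route beats on both cost and time."""
    def dominates(a, b):
        return a[0] <= b[0] and a[1] <= b[1] and a != b

    return sorted(
        name for name, v in routes.items()
        if not any(dominates(other, v) for other in routes.values())
    )


print(best_route(ROUTES, value_of_time=10))  # time is cheap: back_roads
print(best_route(ROUTES, value_of_time=30))  # time is expensive: toll_highway
print(pareto_front(ROUTES))                  # scenic_detour is dominated
```

[Changing `value_of_time` is the "user selection" knob: 0 optimizes for cost alone, and a very large value optimizes for time alone.]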

Nick said...

1) I don't know if it is necessary to focus on optimality for any reason other than to improve the time or space requirement of an agent... with regard to more than one optimality objective, perhaps the agent could try and balance each objective.

2) I think prior knowledge would be a good thing to have in an agent as that would help reduce the amount of time the agent would have to spend learning.
It depends on the purpose of the agent to be able to compare the amount of knowledge it would start with... example: what would a taxi agent do with the knowledge of all the works of Shakespeare over an agent designed to tutor literature students. In other words, more knowledge is not necessarily better. It would be better to give the taxi agent knowledge about traffic laws and things of that nature...

3) ... hmmm...

4) not clear on "utilities"

zhou said...

1.
If an agent has a single goal that can be modeled well mathematically, it makes sense for our algorithm
design to focus on the optimum that theoretically exists. Unfortunately, such cases are really rare. Usually, the goals of an agent are complicated, utility based (satisfaction rate), or even competing. In
these cases, the problem of optimality turns into an engineering problem rather than a scientific problem. We can hardly define the optimal case even given the environment, goals, and performance measures. However,
we can try to get better solutions, which may be one inch nearer to the optimum, to certain problems, if we prioritize the competing goals.


2.
Prior knowledge is important. I think humans usually make decisions based on prior knowledge, except for
creative activity. Given that we have not found a way to make the agent create new knowledge,
which means the agents designed now are knowledge based, it really makes sense to consider agent
architectures where prior knowledge and representing and reasoning with it play such central roles. I
think it is not hard to compare the amount of knowledge that different agents start with. Let's think about
children in elementary school. Students in second grade should know more mathematics than those in first grade. However, they learn new knowledge based on the knowledge from first grade.
Surely, some knowledge expansion takes place. So, the amount of knowledge becomes larger.



3.
The models in the textbook are very scalable. Those standard architectures can form a universal machine and, given specific code, a specific agent can be built. As for Simon's ant, it is true that sometimes we can achieve the goal without needing to make our code complicated. A complex environment may make our program
run for a long time, but we can eventually finish the task.



4.
Utility can be learned. As noted in the comment on optimality, the optimal utility is very hard to
achieve. On the other hand, the agent can improve the utility through learning. Take the taxi driver agent
as an example. This agent can, by chance, learn that there is a beautiful place on one alternative route
to the airport. If time is not too tight, driving along that route will bring more fun to the passenger. In this case, the taxi driver improves the utility through learning.