1 Introduction

In the field of industrial robotics, the interaction between man and machine typically consists of the human operator programming and maintaining the machine. For safety reasons, direct contact between the working robot and the human has to be prevented. As long as the robots act out preprogrammed behaviors only, a direct interaction between man and machine is not necessary anyway. However, if the robot is to assist a human, e.g. in a complex assembly task, it is necessary to have means of exchanging information about the current scenario between man and machine in real time.

For this purpose, the classical computer devices like keyboard, mouse and monitor are not the best choice, as they require an encoding and decoding of information: if, for instance, the human operator wants the robot to grasp an object, he would have to type in the object's coordinates (if these are known at all) or move the mouse pointer to an image of the object on a computer screen to specify it. This way of transmitting information to the machine is not only unnatural but also error-prone. If the robot is equipped with a camera system, it would be much more intuitive to just point to the object to grasp and let the robot detect its position visually.

Observing two humans in the same situation reveals another interesting effect: by detecting the partner's gaze direction, the person who points to an object can immediately verify whether his intention has been interpreted correctly. If the partner looks at the wrong object, this becomes obvious immediately. Therefore, the movement of the head fulfills two functions: first, it is an efficient exploitation of the sensor equipment, shifting the interesting objects into the center of the field of view. Second, it can be used as a communication channel that provides information about the current behavioral state.
In a robot system, this function can be implemented by providing the robot with a dynamic camera head that actively tracks the human hand position. To guarantee a smooth interaction between man and machine, a task like this requires that the visual processing, the transmission of the position information to the camera mechanics, and the movement of the camera head itself are very fast. In the following, we will describe a system which fulfills these requirements (compare with [1]). Before we go into details about the vision processing methods, we briefly describe our anthropomorphic assistance robot CORA, on which we have implemented our method.
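The tracking principle can be sketched as a simple closed-loop controller: the vision system detects the hand's pixel position, and the camera head is commanded to re-center it in the image. The following sketch is illustrative only — the function name, gain, and sign conventions are assumptions, not the actual interface of the system described here.

```python
# Hypothetical sketch of visual hand tracking with a pan-tilt camera head:
# a proportional controller turns the hand's pixel offset from the image
# center into pan/tilt velocity commands. Gain and signs are assumed values.

def tracking_step(hand_px, image_size=(640, 480), gain=0.002):
    """Return (pan_vel, tilt_vel) in rad/s that re-center the hand.

    hand_px: detected hand position (x, y) in image pixels.
    """
    cx, cy = image_size[0] / 2, image_size[1] / 2
    err_x = hand_px[0] - cx   # positive: hand is right of the image center
    err_y = hand_px[1] - cy   # positive: hand is below the image center
    # Proportional control: the command grows with the pixel error,
    # so the head moves faster the further the hand is off-center.
    pan_vel = gain * err_x
    tilt_vel = -gain * err_y  # sign depends on the head's tilt convention
    return pan_vel, tilt_vel
```

For a smooth interaction, this loop must run at camera frame rate, which is why the paper stresses fast visual processing, fast transmission to the camera mechanics, and a fast head drive.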