
Augmented Reality

Mikko Sairio Helsinki University of Technology


msairio@cc.hut.fi

Abstract

This paper is an overview of technologies that fall under the term augmented reality. Augmented reality refers to a system in which the physical surroundings of a person are mixed in real time with computer-generated information, creating an enhanced perception of the surrounding environment. Being partly virtual and partly real, augmented reality applications have quite extreme requirements to be practical to use. The field also has great potential in numerous application areas. These issues make augmented reality both an interesting and a challenging subject from scientific and business perspectives.

1 INTRODUCTION TO AUGMENTED REALITY

Augmented reality (AR) can be defined as referring to cases in which an otherwise real environment is "augmented" by means of virtual objects (Milgram and Kishino, 1994). Augmentation can be achieved with various techniques, and it is done in order to enhance the user's surrounding environment in real time with respect to some function or purpose. The computer-generated part of the environment makes AR a very close cousin of the concept of virtual reality. The bulk of augmented reality concerns combining real and virtual visual information, although the concept of AR also covers imposing audio and other enhancements over the environment in real time.

1.1 A brief history

The idea of enhancing a person's perception of reality dates back to the 13th century, when Roger Bacon made the first recorded comment on the use of lenses, i.e. eye-glasses, for optical purposes. In 1665, the experimental scientist Robert Hooke introduced the idea of augmented senses in his book Micrographia. Ever since, fiction writers, the military industry and, lately, academic and commercial researchers have paved the road for augmented reality with increasing effort. The term 'virtual reality' was first introduced by Jaron Lanier, the founder of VPL Research, one of the original companies selling virtual reality systems. The term was

defined as a computer-generated, interactive, three-dimensional environment in which a person is immersed (Aukstakalnis and Blatner, 1992). Other related terms include 'Artificial Reality' by Myron Krueger in the 1970s, 'Cyberspace' by sci-fi writer William Gibson in 1984, and, more recently, 'Virtual Worlds' and 'Virtual Environments'. Modern augmented reality was realized for the first time in 1993. Starner (1996) wrote the first version of the Remembrance Agent, a piece of augmented-memory software. Later that year, Feiner, MacIntyre and Seligman (1993a) developed the KARMA augmented reality system. KARMA was developed to assist maintenance personnel with wireframe schematics and maintenance instructions displayed on top of whatever was being repaired. It is not surprising that augmented reality is gaining more and more attention and research effort: until recently, the requirements of feasible augmented reality applications have been out of reach of reasonably priced technology. Today, however, even consumer technology is catching up with the requirements and opening up a plethora of new application opportunities.

1.2 The relation to virtual reality

Augmented reality is similar to virtual reality in the sense that both make use of computer-generated virtual data. Virtual reality tries to generate a complete environment, simulated or completely synthetic, that surrounds or immerses the subject. Augmented reality differs from virtual reality in that it does not try to block the surrounding real environment from the user. Instead, its purpose is to enhance the environment for some specific purpose. Virtual reality environments usually isolate the user completely from the real world and replace natural sensory information with computer-generated signals. The depth of the user's experience is called immersion. With no distracting real-world signals, total immersion is fairly easy to achieve.
Augmented reality is a mixture of a real environment, which the user senses either directly or through the system's pipeline, and a virtual environment. The virtual environment can in turn represent either real-world objects or virtual objects. The ultimate goal of AR is to blend all parts together so seamlessly that the user is made to believe that the whole environment is real. In other words, there should not be any conflicts or discrepancies between the augmented environment and the rules by which the user normally senses the real world. However, because of the half-real, half-unreal nature of AR, there are always some distracting features, such as the time lag between user actions and system reactions, misplaced or disoriented virtual objects due to tracking errors, and abnormalities in object interactions, which tend to diminish the immersiveness of the system. The real-time requirements of augmented reality are thus even stricter than those of typical virtual reality applications.

Milgram (1994) introduced a taxonomy that relates augmented reality to virtual reality as different degrees of a reality-virtuality continuum, shown in Figure 1.

Real Environment <-> Augmented Reality (AR) <-> Augmented Virtuality (AV) <-> Virtual Environment
(Mixed Reality (MR) spans everything between the two extremes)

Figure 1: Simplified representation of Milgram's Reality-Virtuality continuum (after Milgram and Kishino, 1994)

At the left end of the reality-virtuality continuum is the real environment; a completely immersive virtual environment is at the other end. Augmented reality is near the real-environment end, as it consists of some synthetic elements that overlap the actual real environment. The inverse case, where real-world content contributes to synthetic surroundings, would be called augmented virtuality (Milgram and Kishino, 1994). Milgram also considers augmented reality and augmented virtuality as different levels of the broader concept of mixed reality, even though the term augmented reality has become quite popular in the literature.

1.3 The real and the virtual

Milgram (1994) proposes that the distinction between the terms real and virtual can be measured by three aspects, depending on whether one is dealing with real or virtual objects, real or virtual images, and direct or non-direct viewing of these. These aspects are Reproduction Fidelity, Extent of Presence Metaphor and Extent of World Knowledge. Reproduction Fidelity evaluates how realistically the mixed environment is displayed or otherwise produced and delivered. This aspect reflects the ability of the technology used to record, transmit, manipulate and display the environment. Computing power, the display system's resolution and field of view, and the capabilities of the audio equipment all affect reproduction fidelity. Extent of Presence Metaphor deals with the immersiveness experienced by the user, i.e. to what extent the observer is present within that world. It is not just a question of how real the environment looks, because the feeling of presence is a highly subjective matter. The extent of presence may be significantly high when the user is given some objective to achieve, despite otherwise lower quality of the environment. Extent of World Knowledge is a measure of how much the system knows about the surrounding world.
The more knowledge there is, the easier the task of generating a realistic result becomes. Virtual reality environments are blessed in the sense of world knowledge because the environment is completely computer generated. AR applications that are meant to work in some pre-known precinct can also have a good share of information that can be used to construct the augmented environment. When moving outdoors and

to other areas where the only available information is what the system can gather in real time, the layering of environments becomes a more and more complex task.

2 AUGMENTED REALITY IN PRACTICE

Augmented reality is taking its first steps into the everyday lives of consumers. It has been around in military applications for much longer: head-up displays have spread from fighters and helicopters to business jets and even to some experimental cars. Sports broadcasts are being augmented with various aids to make games easier to follow, and some televised live events carry augmented advertisements.

2.1 Application domains

As the technology develops, more application opportunities appear. Currently, augmented reality can be applied, for instance, to assembly, maintenance and construction; design and modeling; medical applications and surgery; military training and warfare; precinct-specific instant information; and various forms of entertainment. Assembly, maintenance and construction are obvious uses for augmented reality (Feiner et al., 1993a). Any engineer would praise the ability to receive instructions as he proceeds. Model data and blueprints could reveal the inner workings of a structure, and tracking keys painted on the insides of machines could be used to provide the user with step-by-step visual instructions. The same goes for design and modeling. A designer can do the modeling on top of an existing real-world object or space. An interior architect can design, remodel, and visualize a room using furniture models from a database that are superimposed on video images of the room. Augmented reality could even be extended to collaboration, where a group of designers interact with a shared model (Ahlers, Kramer et al., 1995). Medical applications are evident. Surgery is a prime example: operations can be performed according to pre-gathered information which is imposed over the view of the subject.
The information could include volume-rendered images of physical implants, tumors, or other details, as well as information about the procedure and the state of the patient. This way a surgeon can take full advantage of x-rays and other knowledge about the patient on the spot. The military has made use of augmented reality in everything from single-soldier equipment, such as night vision goggles and a single soldier's full-scale AR system, to head-up displays (HUDs) and HMDs applicable to practically any vehicle, be it aerial, naval or on wheels. The Defense Advanced Research Projects Agency has funded an HMD project to develop a display that can be coupled with a portable information system (Ethrigde, 2001). Precinct-specific information is provided in some museums and other exhibitions, where visitors are given devices that stream location-sensitive narration and information inside the premises. Entertainment is the ultimate application domain when measured in potential popularity. There are already some early hacks, such as the outdoor augmented

reality game ARQuake, an extension of the hit desktop game Quake (Thomas et al., 2000).

2.2 Augmented reality as a multidisciplinary science

The numerous application domains make augmented reality a highly multidisciplinary field of research involving signal processing, computer vision, graphics, user interfaces, human factors, wearable computing, mobile computing, computer networks, distributed computing, information access, information visualization, and hardware design for new displays. Augmented reality is a growing area in virtual reality research. The world around us provides a wealth of information that is difficult to duplicate in a computer. It is interesting how a relatively new art such as AR brings together computer scientists, electrical engineers, optical scientists, psychologists and HCI experts, mechanical engineers and even chemists and physicists.

3 AUGMENTED REALITY SYSTEMS

Augmented reality systems need to be able to present the real world to the user as well as the virtual, computer-generated environment. There are a number of ways to implement this kind of system. Different implementations can be classified, for instance, according to their extent of presence metaphor (Milgram and Kishino, 1994).

3.1 Window on the World

The simplest augmented reality system is the so-called Window on the World (WoW) system (Feiner, MacIntyre et al., 1993b). The user observes the augmented environment through a window such as a computer monitor. The real-world environment is first recorded, then augmented with computer-generated objects, and finally displayed on the window. The user is not in the center of the augmented universe but rather an outside spectator. Interaction is achieved through any normal HCI input devices. Even though the feeling of presence is faint at best, WoW systems are suitable for various telepresence applications.

3.2 Spatially immersive displays

Immersiveness can be enhanced with larger screens, up to spatially immersive displays.
A spatially immersive display (SID) surrounds the user with multiple projection screens, creating a very effective and immersive experience (Jalkanen, 2000). The strength of this kind of configuration is the fact that the user is actually inside the environment. On the other hand, the larger the configuration, the more geographically fixed it is, and thus less suitable for applications where the user needs to experience AR on location. SIDs are an extreme case: they stay where they are built. Of course, while the user is spatially stuck within the SID, the user's body movement is not hampered by the weight of equipment, as is the case with wearables.

3.3 Head mounted displays

When it comes to mobility and geographical freedom, wearable systems are the only flexible solution. Head mounted displays (HMDs) are as common in augmented reality systems as they are in virtual reality applications. A user-worn HMD both records the environment from the user's perspective and presents it to the user, so in a sense the HMD is merely a layer between the user's senses and the environment the user is in. There are two basic types of HMD, the optical see-through and the video see-through, which differ in the way the resulting image is composed. The video see-through is similar to WoW and SID systems in that it records the real environment, augments it, and then presents the resulting environment to the user. Optical see-through solutions combine the real view with the computer-generated one using mirror optics. Optical see-through is the ultimate solution for presenting the real environment, as the real view does not go through any artificial processing that would alter or delay it. However, there are some obvious drawbacks: the computer-generated material inevitably lags behind the real view, contrast varies with lighting, and virtual images are not completely solid, which may be why optical see-throughs are growing less popular (Bonsor, 2001). Optical see-through set-ups also give no cues about the world that could be used to aid rendering of the virtual environment. Head mounted displays provide a good extent of presence, as the user is inside the actual environment. The ultimate goal of "unmediated reality" should be indistinguishable from direct-viewing conditions at the actual site, either real or virtual (Milgram and Kishino, 1994). In the case of wearable accessories such as HMDs, this would also mean that they are physically unnoticeable.
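The video see-through pipeline described above can be sketched as a simple per-pixel blend of the captured camera frame and the rendered overlay. The function name, array shapes and use of NumPy below are illustrative assumptions, not part of any particular AR system:

```python
import numpy as np

def composite_video_see_through(camera_frame, overlay_rgb, overlay_alpha):
    """Blend a rendered overlay onto a captured camera frame.

    camera_frame:  (H, W, 3) uint8 image from the head-mounted camera
    overlay_rgb:   (H, W, 3) uint8 computer-generated imagery
    overlay_alpha: (H, W) float in [0, 1]; 0 means fully transparent
    """
    alpha = overlay_alpha[..., None]  # broadcast over the color channels
    blended = (1.0 - alpha) * camera_frame + alpha * overlay_rgb
    return blended.astype(np.uint8)

# A 2x2 toy frame: the virtual overlay covers only the top-left pixel.
frame = np.full((2, 2, 3), 100, dtype=np.uint8)
virt = np.full((2, 2, 3), 200, dtype=np.uint8)
alpha = np.zeros((2, 2))
alpha[0, 0] = 1.0
out = composite_video_see_through(frame, virt, alpha)
```

An optical see-through display skips this step entirely, which is exactly why it cannot delay or alter the real view, but also why it cannot re-process it.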
3.4 Other AR configurations

Many visual augmented reality solutions are based on head mounted displays, because they are extremely flexible when it comes to mobility, and they are developing rapidly and available at reasonable prices thanks to the breakthrough of virtual reality in general. But augmented reality can very well utilize other senses too; for some applications, vision is not even necessary. Visual AR simply gets the most attention because of the significance of sight to human perception. The visual and aural environments make or break the immersiveness of a system; the other senses, touch, smell and taste, can be used to supplement the effect. For this reason most non-visual innovations have been made in audio systems. Synthetic three-dimensional sound environments are one of the recent additions to virtual reality applications (Begault, 1994). Their advantage is that headphones and other instruments that can be used to transmit audio signals to the user have been in development for decades. Audio signal processing has also been studied for a long time, and now that computing power is cheaper, applications have become practical. Probably the most common AR configuration consists of an audio-enabled HMD device, a computer, input devices such as data-gloves (perhaps with touch feedback) and lots

of pre-defined information. This pre-defined information might involve model data, rules of physics, or tracking markings painted on objects or the surrounding room.

4 TECHNOLOGICAL PERSPECTIVE

Current challenges in augmented reality can be divided into those derived from the essence of AR itself and more detailed issues concerning a specific line of research.

4.1 Basic Requirements of Augmented reality

Basically, augmented reality is built on a display system that is natural to use, a precise tracking system that provides accurate information to keep the virtual scene in sync with reality, and exhaustive computing power to cope with the real-time requirements of AR. All these requirements have to be satisfied well enough that the distraction to the user is bearable. Ideally, the user would not experience any distraction or contradiction between their own behavior and the environment. This also means that whether the AR system is worn by the user or surrounds the user, as in SIDs, the system has to be as transparent as possible. A current topic among many augmented reality developers is the incorporation of the three components into one unit, perhaps housed in a belt-worn device that wirelessly relays information to a display resembling an ordinary pair of eyeglasses (Bonsor, 2001). Computing power is a key requirement in producing a feasible augmented reality system. Raw computing power is needed to minimize latency and to maximize frame rate, both important aspects of the system's reproduction fidelity. It is also the easiest requirement to satisfy, as computing power has become practically a commodity, pushed forward by numerous other industries. Recent advances in mobile computing are crucial for wearable AR systems. Most technological advances in virtual reality research also apply to AR. Head mounted displays are still too clumsy to be completely ignored by the user.
They also have a limited field of vision, and they usually lack the contrast and resolution for a reasonably immersive environment (Jalkanen, 2000). HMDs, as well as other wearable equipment such as data-gloves and data-suits, may also require wiring, which naturally limits the user. All wearable equipment is bound to grow lighter, smaller and easier to work with, i.e. to become more transparent to the user. There are three conditions that a tracking system has to satisfy (Azuma, 1993). A tracker must be accurate to a small fraction of a degree in orientation and a few millimeters in position, depending on the application's requirements. Also, the combined latency of the tracker and the graphics engine must be very low. Finally, the tracker must work at long ranges, especially in outdoor environments.
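The latency condition can be made concrete with a back-of-the-envelope calculation: during a head turn, the angular registration error is roughly the head's angular velocity multiplied by the end-to-end latency. The numbers below are illustrative assumptions, not measurements from any cited system:

```python
def angular_error_deg(head_velocity_deg_s, latency_s):
    """Approximate misregistration caused by end-to-end latency:
    the real scene moves this many degrees before the display
    catches up with the head motion."""
    return head_velocity_deg_s * latency_s

# A moderate head turn of 100 degrees/s combined with 50 ms of
# end-to-end latency already yields a 5-degree error, far above
# the fraction-of-a-degree accuracy demanded of the tracker itself.
error = angular_error_deg(100.0, 0.050)
```

This is why latency, not tracker precision alone, often dominates the perceived registration quality.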

Systems commonly used to track airplanes, ships and cars have sufficient range but insufficient accuracy. Many different tracking technologies exist, but almost all are short-range systems that cannot be easily extended.

4.2 Current Fields of Study

There is still work to do on the basic requirements of AR, and most projects focus on tracking, latency and display techniques. Every augmented reality application has to be able to match the user's point of view in the virtual and real worlds up to some precision. Tracking the user's position and orientation is one of the classic problems in AR research. Position and orientation can be tracked using visual registration marks, magnetic tracking hardware, satellite positioning systems or mechanical sensors, each with its own pros and cons. Visual marks have to be installed before use, effectively preparing additional world knowledge for the system. The same goes for magnetic tracking systems, which only work in environments with a pre-defined magnetic field or magnetic guides. Satellite positioning systems like GPS take AR outdoors, but their accuracy is poor. To sum up, they all lack in some area: GPS and magnetic trackers are inaccurate, mechanical trackers are cumbersome, and vision-based trackers are computationally demanding. Recent trackers make use of more than one method. Good results have been achieved indoors by integrating landmark tracking and magnetic tracking (State et al., 1996). Another indoor solution, the HiBall tracking system, goes even further: with a latency of less than one millisecond, it reaches better than 0.5 millimeters and 0.03 degrees of absolute error and noise (Welch et al., 2001). Outdoor tracking is still problematic, as is tracking in otherwise dynamic environments. Studies have explored ways of extracting tracking data from real-environment geometry with a method called vision sensing (Okuma et al., 1998).
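Hybrid trackers such as the landmark-plus-magnetic system cited above combine a continuous but drifting estimate with occasional absolute corrections. A minimal sketch of that idea is a complementary filter; the gain constant and function name below are my own illustrative assumptions, not the method of State et al.:

```python
def fuse(dead_reckoned, absolute_fix, gain=0.2):
    """Blend a continuous (but drifting) position estimate with a
    noisy absolute measurement. gain=0 trusts dead reckoning only,
    gain=1 trusts the absolute fix only."""
    return [(1 - gain) * d + gain * a
            for d, a in zip(dead_reckoned, absolute_fix)]

# Dead reckoning has drifted to (1.0, 2.0) meters; a landmark
# sighting reports the true position as (0.0, 0.0). The fused
# estimate is pulled 20% of the way toward the correction.
est = fuse([1.0, 2.0], [0.0, 0.0])
```

Repeated corrections keep the drift bounded while the continuous sensor preserves low-latency responsiveness between fixes.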
Camera parameters are also important in order to match real- and virtual-world coordinates. Accurate calibration is especially important with three-dimensional display systems, because virtual objects rendered over the real environment have to match the perspective and depth of the real scene. The user would certainly become disoriented if virtual objects floated in front of real objects yet appeared to be further away. Faulty camera parameters also undermine the chances of successful occlusion detection. Occlusion detection is an active area of study in virtual reality in general. It is a technique used to decide whether objects overlap each other and to optimize rendering accordingly by not drawing what is not visible. With little or no prior knowledge about the surrounding real world, occlusion detection becomes a very tricky art in augmented reality, and the inability to rely on the painter's algorithm and the like in rendering makes occlusion detection a practical necessity. Occlusion errors easily ruin the feeling of presence the user might otherwise experience.
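Once a depth map of the real scene is available, the occlusion decision reduces to a per-pixel depth comparison: a virtual pixel is drawn only where it lies closer to the camera than the real surface. The sketch below, with NumPy arrays and the use of infinity for uncovered pixels, is an illustrative assumption rather than any cited system's implementation:

```python
import numpy as np

def occlusion_mask(real_depth, virtual_depth):
    """True where the virtual object should be drawn, i.e. where it
    is in front of the real surface at that pixel. np.inf marks
    pixels the virtual object does not cover at all."""
    return virtual_depth < real_depth

# A real wall 3 m away everywhere; a virtual object at 2 m covers
# only the left column, so only those pixels pass the depth test.
real = np.full((2, 2), 3.0)
virt = np.full((2, 2), np.inf)
virt[:, 0] = 2.0
mask = occlusion_mask(real, virt)
```

The hard part in AR is not this comparison but obtaining `real_depth` at all, which is where the depth-sensing approaches discussed in this section come in.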

Occlusion detection can commonly utilize depth information about the scene. Depth sensing has been studied with various approaches. One way is to calculate a depth field from the input of stereo cameras (Kanbara et al., 2000). The same stereo camera setting also makes it possible to extract camera parameters in the same run. Depth sensing can also be accomplished with monocular vision using one camera; the trick is to use multi-focused images from the same viewpoint. With just three differently focused images it is possible to extract depth information from the blurriness of individual pixels (Takemura and Matsuyama, 1998). Latency is inherent in AR systems, because some processing has to be done in order to dynamically supplement reality with virtuality. If decreasing detail is not a plausible way to reduce latency, it has to be tackled in some other way. Prediction, pipelining and other tricks used in microprocessors, for instance, can be used to reduce latency. Jacobs (1997) has studied methods for measuring relative latency and a variety of techniques for managing latency to reduce the misregistration it causes. Display technologies used in HMDs are a constant area of research. Current HMDs are too clumsy to be ignored by the user, and they still lack resolution, contrast and field of vision. The current trend in HMD research is the study of retinal scanning displays. Microvision has developed its own solution, which has some obvious advantages over traditional flat panel displays (Microvision, 2001): retinal scanning is a mechanically simple solution, it has practically unlimited resolution, it does not weigh much, and it is very configurable.

5 FUTURE GUIDELINES

Outdoor augmented reality has lots of potential, but it simply is not practical yet. It is a question of mere years before mobile and wireless computing is fast enough to produce satisfactory synthetic images.
A more difficult aspect of outdoor AR is tracking the user's location and orientation. The Global Positioning System (GPS) has a granularity on the order of a meter. This will do for some applications, such as aural hints, because the human ear is not that sensitive to direction, and perhaps for visual meta-information about the surroundings. But GPS will not suffice for visual applications where computer-generated virtual objects should blend into the view seamlessly, in their correct places with respect to the environment. Algorithms and software will go through several iterations to evolve into more sophisticated solutions. Recovering the environment, lighting and reflectance from real images is a common image-processing challenge, and could be used to make environments more immersive and natural. Coping with the real-time requirement through predictive tracking algorithms, and coming up with ever more imaginative ways to register the user's location and orientation in the real world, will also probably advance in the near future. Human-computer interaction devices being developed for more traditional virtual reality will also be adapted to augmented reality applications. The immersiveness of environments will grow ever deeper with sensations of touch, smell and maybe even taste.
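The predictive tracking mentioned above can be sketched, under a constant-velocity assumption, as extrapolating the latest measured pose forward by the expected rendering latency. The function name and the numbers are illustrative assumptions:

```python
def predict_yaw(yaw_deg, yaw_rate_deg_s, latency_s):
    """Extrapolate head yaw to where it will be when the rendered
    frame actually reaches the display, assuming the head keeps
    turning at a constant rate over the latency interval."""
    return yaw_deg + yaw_rate_deg_s * latency_s

# Measured yaw of 10 degrees, head turning at 80 degrees/s, and a
# 50 ms render latency: the system renders for the predicted pose
# rather than the stale measurement, reducing misregistration.
predicted = predict_yaw(10.0, 80.0, 0.050)
```

Real predictors are more elaborate (e.g. filtering noisy rate estimates), but the principle of rendering for a future pose is the same.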

User sensing and modeling come into question with future applications. This could be achieved through accurate sensory data and some predefined behavioral model of the user. With enough information, the wearable computer could track the state of the user and adjust its behavior accordingly (Starner et al., 1997). This could be used to decide whether or not to bother the user with some piece of information, to predict the user's next action or state, and to conduct some actions pre-emptively. I believe, however, that the greatest chance for augmented reality technology lies not necessarily in some specific high-end innovation, but rather in penetrating the mass markets in various forms of pervasive computing.

6 CONCLUSIONS

Augmented reality is an old idea that is right now on the verge of success. Until lately, the available technologies have not been advanced enough to make feasible AR applications; they have been lacking either in computing power, user-tracking accuracy, or ease of use and comfort, all of which are necessary to produce a satisfying AR experience. Measured against its vast application areas and potential, augmented reality is still at a reasonably early stage of research and development at various universities and high-tech companies. Blair MacIntyre prophesied in 1996 that the wearable see-through display has the potential to be the Sony Walkman of the early 21st century (MacIntyre and Feiner, 1996). I believe that the success of augmented reality in general might be somewhat similar.

7 REFERENCES

Ahlers, K., Kramer, A., et al., (1995), Distributed Augmented Reality for Collaborative Design Applications, ECRC-95-03

Azuma, R., (1993), Tracking Requirements for Augmented Reality, Communications of the ACM, 36(7), 50-51

Begault, D. R., (1994), 3-D Sound for Virtual Reality and Multimedia, Academic Press Professional, Cambridge, MA, USA
Bonsor, K., (2001), How Will Augmented Reality Work?, How Stuff Works, http://www.howstuffworks.com/augmented-reality.htm, referenced 11/25/2001

Ethrigde, E., (2001), Head Mounted Displays (HMD), The Defense Advanced Research Projects Agency, http://www.darpa.mil/mto/displays/hmd/, referenced 10/29/2001

Feiner, S., MacIntyre, B., and Seligmann, D., (1993a), Knowledge-based Augmented Reality, Communications of the ACM, 36(7)


Feiner, S., MacIntyre, B., et al., (1993b), Windows on the World: 2D Windows for 3D Augmented Reality, Proceedings of the ACM Symposium on User Interface Software and Technology, Atlanta, GA, Association for Computing Machinery

Jacobs, M., Livingston, M., State, A., (1997), Managing Latency in Complex Augmented Reality Systems, Proceedings of the 1997 Symposium on Interactive 3D Graphics, Annual Conference Series, ACM SIGGRAPH

Jalkanen, J., (2000), Building a Spatially Immersive Display: HUTCAVE, Licentiate thesis, Helsinki University of Technology

Kanbara, M., Okuma, T., Takemura, H., Yokoya, N., (2000), A Stereoscopic Video See-through Augmented Reality System Based on Real-time Vision-based Registration, IEEE Virtual Reality 2000 International Conference, New Brunswick

MacIntyre, B., Feiner, S., (1996), Future Multimedia User Interfaces, Multimedia Systems, Vol. 4

Milgram, P., Kishino, F., (1994), A Taxonomy of Mixed Reality Visual Displays, IEICE Transactions on Information Systems, Vol. E77-D, No. 12

Okuma, T., et al., (1998), An Augmented Reality System Using Real-time Vision-based Registration, Proc. ICPR'98, Vol. 2

Starner, T., Rhodes, J., (1996), Remembrance Agent, Proc. The Practical Application of Intelligent Agents and Multi Agent Technology

Starner, T., et al., (1997), Augmented Reality Through Wearable Computing, Presence, Special Issue on Augmented Reality, Vol. 6(4)

State, A., et al., (1996), Superior Augmented-Reality Registration by Integrating Landmark Tracking and Magnetic Tracking, Proceedings of SIGGRAPH 96, New Orleans, LA

Thomas, B., et al., (2000), ARQuake: An Outdoor/Indoor Augmented Reality First Person Application, Proc. ISWC'00

Welch, G., et al., (2001), High-Performance Wide-Area Optical Tracking: The HiBall Tracking System, Presence: Teleoperators and Virtual Environments, 10(1)

