New theory about stereo vision, 3D, stereopsis, binocular vision and depth perception

It might be time to expand the way we think about human visual perception. What we “see” is a construct of our brain and of how it processes the stream of data coming in from our senses. Setting aside the other senses for now, the vast amount of raw data that our brains receive from our eyes is not something we typically think about. We open our eyes and see stuff. We’ve spent a lot of time learning about the parts of the eye and its mechanics, but I’m not sure that teaches us very much about “seeing”.

Understanding computers gives us a new way to think about this, specifically about the conversion of data (the signals our eyes send to the brain) into conscious perception. We aren’t born with all of the “software” needed to perceive the signals coming from our eyes. That “software” is created over time as the brain interprets and learns cause and effect through experience. I believe the brain never stops tweaking that processing, making all sorts of modifications in the same way that computer software gets upgrades that provide new features, ease-of-use functions, performance enhancements and so on.

What we see, and how we perceive it, is a function of the current version of our vision “software” at that moment in time. Maybe that’s a radical idea, but there is anecdotal evidence that it might be true. I became aware of it when I noticed that each time I looked at a 3D image of an African tribal mask, it looked different from what I remembered. It was the same picture and it had not changed, but how I perceived the image did change.

The weird thing about the image of the mask was that I did not have the same reaction to a 2D image of it. The 2D image always looked the same. The 3D image always looked slightly different. In my experience, my brain seems to be much more aggressive at tweaking how I perceive images with depth than it is when I look at flat images.

Having said that, it isn’t noticeable for all 3D images. Images that are life-size or larger, and ones that I have some level of interest in, seem to change in a more noticeable way. I’m curious whether other 3D enthusiasts have experienced this.

I think it might be more pronounced with a 3D image because it is an illusion with perception conflicts that the brain must reconcile in some way.

Seeing With The Brain.

Common sense tells us that we see with our eyes. After all, when we close our eyes we stop seeing. Right?

Well, when you think about it for a minute you realize that’s not true. There is this thing called the mind’s eye, and there is dreaming and envisioning, etc. Truth is, the eyes are little more than data acquisition devices that feed the brain with information. Actually, to be more precise, the eyes stream flawed data to the brain, with tons of errors and giant missing pieces.

The amount of processing the brain performs to make vision possible is staggering. Scientists have written that up to 1/4 of the entire brain is involved in vision processing and interpretation. How we see and what we see is influenced by everything we have seen before. It is also influenced by what we hear, what we smell, what we taste and what we touch. Don’t believe it? Well, science backs it up. One example, off the top of my head, was demonstrated at an Audio Engineering Society convention in New York City many years ago. There were rooms with video monitors of different resolutions and with different speaker systems. As it turned out, the room deemed to have the highest video quality was not the one with the best video monitor, but the one with the best sound system.

Much of the time what we think we see really doesn’t match reality. Much of what we see doesn’t even make it out of our subconscious. So, when 3D cinematographers obsess over camera spacing (inter-axial distance), convergence and depth of field as they relate to eye geometry, they are misguided, in my humble opinion. The brain is not limited to the geometry of the eye, or its limitations. If it were, we would see two big black circles where the eye has no receptors (where the optic nerve is connected).

Indeed, how we see and what we see varies greatly from person to person. Then there are people with eye problems and vision impairment, such as people who can’t fuse and have double vision. Who’s to say that, in a room filled with 99 people who have strabismus and one person who can see with stereopsis, the people with strabismus wouldn’t be “normal”, given that they represent the majority?

How the majority of people see is the result of evolution and natural selection. Human vision is not the best of what nature can create. There are examples of eyes that are superior to human eyes in terms of clarity, detail, color, focus, etc. In the near future, there will be machine to biological connections that might enhance or even replace our eyes with superior devices.

My point in all this rambling is that it is a mistake to limit the way multi-perspective imagery is created to analytics based solely on eye geometry and how the eyes work. As I begin my research into how the brain responds to multi-perspective imagery, I hope there are discoveries that enlighten and enrich our perception of the space between things and of the importance of textures and reflective properties to the interpretation of the world around us.

There is more to it than this:


I am presenting a paper at SPIE on January 25, 2011 at 5:30 PM (Paper 7863-49).

SPIE (the International Society for Optical Engineering) is holding a conference on 3D imaging from Jan. 23 – 27 in San Francisco, CA. My paper and presentation, “Human perception considerations for 3D content creation”, is about the problem of perception conflicts as they relate to 3D imagery and what to do about them.

I first started thinking about this when I saw an old lenticular photograph of Queen Elizabeth. The photograph could be viewed with stereopsis but the Queen looked like she was dead. Watching the movie Beowulf, while not in 3D, also gave me the creeps as the characters had a dead aspect to them. I noticed some 3D lenticular photographs of people presented with a doll-like character. I then started to notice things in 3D movies that didn’t seem right. When details disappeared into blackness or got blown out to white I noticed an uneasy feeling while looking at that part of the 3D presentation.

Indeed, every time something was presented in 3D that was atypical or not possible to see in the real world, I could detect a feeling of conflict at some level in my subconscious, and I became more sensitive to recognizing when it was happening.

All of these observations got me thinking about the various mechanisms that we use to see and interpret depth, space and texture. Certainly vergence is the primary mechanism, but as I became more aware of supporting cues like accommodation, motion, luminance dynamic range, binocular rivalry, field of view and so on, I came to a realization. I realized that when the non-vergence depth cues weren’t complementary, the elements or perceptions in conflict had to be suppressed in order to continue viewing without some sort of physical effect occurring (typically an unpleasant one such as headache or nausea).

My paper is a start on investigating the importance of supporting perception cues as they relate to stereovision.

*Vergence is the simultaneous movement of both eyes in opposite directions to obtain fixation and the ability to see depth.

*Accommodation is the automatic adjustment in the focal length of the lens of the eye to permit retinal focus of images of objects at varying distances. It is achieved through the action of the ciliary muscles that change the shape of the lens of the eye.
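
For readers who like numbers, here is a rough sketch of how those two definitions can pull apart on a 3D display. It is only an illustration under simplifying assumptions, not a model from my paper: the point being looked at is straight ahead, the eye spacing defaults to 63 mm (a commonly quoted adult average, assumed here), and the function names are ones I made up for this example. With a real object, focus and vergence both go to the same distance. With a stereoscopic image, focus stays on the screen while vergence follows the apparent depth.

```python
import math

def vergence_angle_deg(distance_m, eye_spacing_m=0.063):
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at distance_m. eye_spacing_m is an assumed average."""
    return math.degrees(2 * math.atan(eye_spacing_m / (2 * distance_m)))

def accommodation_demand_diopters(distance_m):
    """Focusing demand on the lens of the eye, in diopters (1 / meters)."""
    return 1.0 / distance_m

def focus_vergence_mismatch_diopters(screen_m, apparent_m):
    """On a stereoscopic display the eyes focus at the screen distance but
    converge at the apparent depth; this is the gap between the two."""
    return abs(1.0 / screen_m - 1.0 / apparent_m)

if __name__ == "__main__":
    screen = 2.0  # hypothetical viewing distance in meters
    for apparent in (1.0, 2.0, 10.0):
        print(
            f"apparent depth {apparent:5.1f} m: "
            f"vergence {vergence_angle_deg(apparent):.2f} deg, "
            f"mismatch {focus_vergence_mismatch_diopters(screen, apparent):.2f} D"
        )
```

The mismatch is zero when the object appears at the screen plane and grows as the apparent depth moves in front of or behind it. How much mismatch a given viewer can comfortably suppress is exactly the kind of thing that varies from person to person.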

Human Perception Conflicts ARE Natural – How About Facts Over Drama?

Every day there seems to be a new article in a magazine or newspaper forecasting health problems and looming dangers with regard to all things 3D. Most cite the example of focus decoupled from convergence. They say things like “this 3D stuff is unnatural” and do their best to sensationalize and dramatize for the purpose of selling more magazines. They do it because that strategy works. You can’t blame them for wanting to sell more magazines. But when trade magazines like Broadcast Engineering jump on the bandwagon, this “National Enquirer” style of journalism becomes troubling. See their article:

This article was brought to my attention by a fellow member of a professional group that authors Blu-ray and DVD content. Much of the basis for articles like the above stems from the premise that the perception conflicts that occur with 3D illusions are unnatural and are therefore harmful, with potential health hazards looming large. Then they point to anecdotal evidence that people experience headaches and nausea, and one article claimed that someone died as a direct result of watching Avatar.

I’d like to bring some common sense observations into this dialog. First, human perception conflicts happen all of the time in nature. There is nothing unnatural about them. Perception conflicts also happen in outer space and inner space (the ocean). If the brain weren’t able to suppress perception conflicts we would not be able to travel on a boat or airplane or submarine. We certainly would not be able to travel into space, where the number of perception conflicts goes off the scale. Think about it: even with rigorous training and preparation, many astronauts experience nausea and headaches when they go into space. It is a natural human response to conflicting perceptions as processed in the brain. Some people have more difficulty with conflict suppression and therefore present more significant side effects. While some might argue that Buzz Aldrin must have experienced permanent brain damage, as evidenced by his willingness to participate in the Dancing With The Stars TV program, the evidence is that after a time back on Earth astronauts reintegrate perceptions that no longer conflict. (Buzz perhaps requires additional study ;^)

In a previous blog post I talked about decoupling focus from convergence. That is one of the leading issues that people use to justify an opinion that 3D is inherently unnatural and dangerous. They ignore that those who produced the movie Avatar took great pains to mitigate that disparity by using toe-in cameras and keystone correction. Indeed, many found Avatar one of the most pleasing and easiest to view 3D movies they had ever seen. But Avatar did have elements of focus/convergence conflict as well as camera shake (I wasn’t shaking in my seat but the visual was shaking) and super fast scene jump cutting (last I checked there weren’t instant Star Trek like transporters that could beam me around in real life) and many other “unnatural” perception conflicts.

My thought is that it is common sense for filmmakers to spend time understanding the ramifications of perception conflicts and to perhaps reduce or minimize them especially in the beginning parts of the movie. This should provide a greater comfort level for the audience and especially for those with lower tolerance to perception disparities. Just as there are some people who simply can’t see 3D for various reasons, there are people with greater sensitivity to perception conflicts. This should be studied in a scientific way and proper consideration given to educate everyone involved in a factual and scientific manner.  3D filmmaking is both art and science and it certainly can improve dramatically from where the bar currently sits.

Finally, those who have problems watching 3D movies should be encouraged to seek more information about their condition. There are many ways that their experience can be improved, whether through vision therapy or other measures. The notion that 3D tech is somehow bad is very misguided. We should open our minds to the possibilities, not close our minds with unjustified prejudice and misinformation. If we weren’t willing to work through the ability to suppress perception conflicts we never would have got on a boat and humans might very well have become extinct due to their inability or unwillingness to adapt to a dynamic environment.

I hope that common sense will prevail and that all of us will seek to expand our horizons and enjoy all that viewing with stereopsis can provide in both our real world experience and within the realm of entertainment and presentation.

