What is the shortest perceivable application response delay?

A delay will always occur between a user action and an application response.

It is well known that the lower the response delay, the greater the feeling of the application responding instantaneously. It is also commonly known that a delay of up to 100ms is generally not perceivable. But what about a delay of 110ms?

What is the shortest application response delay that can be perceived?

I'm interested in any solid evidence, general thoughts and opinions.

28905 次浏览

No solid evidence but for our own application, we allow a maximum of one second between a user action and feedback. If it does take longer, a "waiting box" should be shown.

A user should see "something" happening within a second of causing an action.

For web applications 200ms is considered as unnoticable delay, while 500ms is acceptable.

Persistence of vision is around 100ms so it should be a reasonable visual feedback delay. 110ms should make no difference, as it is an approximate value. In practice you won't notice a delay below 200ms.

Out of my memory, studies have shown that users lose patience and retry an operation after around 2s of inactivity (in the absence of feedback), e.g. clicking on a confirm or action button. So plan on using some kind of animation if the action takes longer than 1s.

I worked on an application that had a explicit business goal of being blindingly fast, and we had a max allowed server time of 150ms for processing a full web page.

What I remember learning was that any latency of more than 1/10th of a second (100ms) for the appearance of letters after typing them begins to negatively impact productivity (you instinctively slow down, less sure you have typed correctly, for example), but that below that level of latency productivity is essentially flat.

Given that description, it's possible that a latency of less than 100ms might be perceivable as not being instantaneous (for example, trained baseball umpires can probably resolve the order of two events even closer together than 100ms), but it is fast enough to be considered an immediate response for feedback, as far as effects on productivity. A latency of 100ms and greater is definitely perceivable, even if it's still reasonably fast.

That's for visual feedback that a specific input has been received. Then there'd be a standard of responsiveness in a requested operation. If you click on a form button, getting visual feedback of that click (eg. the button displays a "depressed" look) within 100ms is still ideal, but after that you expect something else to happen. If nothing happens within a second or two, as others have said, you really wonder if it took the click or ignored it, thus the standard of displaying some sort of "working..." indicator when an operation might take more than a second before showing a clear effect (eg. waiting for a new window to pop up).

The 100 ms threshold was established over 30 yrs ago. See:

Card, S. K., Robertson, G. G., and Mackinlay, J. D. (1991). The information visualizer: An information workspace. Proc. ACM CHI'91 Conf. (New Orleans, LA, 28 April-2 May), 181-188.

Miller, R. B. (1968). Response time in man-computer conversational transactions. Proc. AFIPS Fall Joint Computer Conference Vol. 33, 267-277.

Myers, B. A. (1985). The importance of percent-done progress indicators for computer-human interfaces. Proc. ACM CHI'85 Conf. (San Francisco, CA, 14-18 April), 11-17.

I don't think anecdotes or opinions are really valid for answers here. This question touches on the psychology of user experience and the sub-conscious mind. The human brain is powerful and fast and mere milliseconds do count and are registered. I am no expert but I know there is much science behind e.g. what Matt Jacobsen mentioned. Check out Google's study here http://services.google.com/fh/files/blogs/google_delayexp.pdf for an idea of how much it can affect site traffic.

Here's another study by Akami - 2 second response time http://www.akamai.com/html/about/press/releases/2009/press_091409.html (From https://ux.stackexchange.com/questions/5529/once-apon-a-time-there-was-a-10-seconds-to-load-a-page-rule-what-is-it-nowa )

Does anyone have any other studies to share?

New research as of January, 2014:

http://newsoffice.mit.edu/2014/in-the-blink-of-an-eye-0116

...a team of neuroscientists from MIT has found that the human brain can process entire images that the eye sees for as little as 13 milliseconds...That speed is far faster than the 100 milliseconds suggested by previous studies...

At the San Francisco Opera house, we routinely setup precise delay setting for each of our speakers. We can detect 5 millisecond changes in delay times to our speakers. When you make such subtle changes, you change where the sound sources from. Often times we want sound to sound as if it's coming from someplace other than were the speakers are. Precise delay adjustments make this possible. Sound delays of 15 milliseconds are very obvious even to untrained ears because it radically shifts where the sound sources from. A simple test is to prove this is to play sound through multiple speakers, and have the subject close their eyes and point to where the sound is coming from. Now make a slight change in the delay time to one of the speakers of just a few milliseconds, and have the person point again to where the sound is coming from. Making changes in delay times is acoustically very similar to moving the actual speakers.

I am a cognitive neuroscientist who studies visual perception and cognition.

The paper by Mary Potter mentioned above regards the minimum time required to categorize a visual stimulus. However, understand that this is under laboratory conditions in the absence of any other visual stimuli, which certainly would not be the case in the real world user experience.

The typical benchmark for a stimulus-response / input-stimulus interaction, that is, the average amount of time for an individuals minimum reaction speed or input-response detection is ~200ms. to be certain there is no detectable difference, this threshold could be lowered to around 100ms. Below this threshold, the temporal dynamics of your cognitive processes take longer to compute the event than the event itself, so there is nearly no chance of any ability to detect or differentiate it. You could go lower to say 50 ms, but it really wouldn't be necessary. 10 ms and you've gone into the territory of overkill.

100ms is totally wrong. You can prove this yourself using your fingers, a desk, and a watch with visible seconds. Synchronising to the watch's seconds, drum out beats on the desk continuously such that 16 beats are drummed out every second. I chose 16 because it is natural to drum out multiples of two, so it's like four strong beats with three weak beats in between. Adjacent beats are clearly discernible by their sound. The beats are separated by about 60ms, so even 60 ms is actually still too high. Therefore the threshold is way below 100ms, especially if sound is involved.

For instance, a drum app or a keyboard app needs a delay of more like 30ms, or else it gets really annoying, because you hear the sound coming from the physical button / pad / key well before the sound comes out of the speakers. Software like ASIO and jack were made specifically to deal with this issue, so no excuses. If your drum app has a 100ms delay, I will hate you.

The situation for VoIP and high powered gaming is actually worse, because you need to react to events in real time, and in music, at least you get to plan ahead at least a little. For an average human reaction time of 200ms, a further 100ms delay is an enormous penalty. It noticeably changes the conversational flow of VoIP. In gaming, 200ms reaction time is generous, especially if the players have a lot of practice.

For a reasonably current scholarly article, try out How Much Faster is Fast Enough? User Perception of Latency & Latency Improvements in Direct and Indirect Touch (PDF). While the main focus was on JND (Just Noticeable Difference) of delay, there is some good background on on absolute delay perception and they also acknowledge and account for 60Hz monitors (16.7 ms repaint times) in their second experiment.

Use the dual of test for visual spatial resolution ( two parallel black bars, with an equal width, and an equal gap between them. Reduce angular subtense until they appear to be one line, ie scale down or simply move away. The point at which it seems to merge into one line shows the threshold).

Use function gen to blink an LED on for an interval, then off, then on, then off --- same time delay each interval, but repeat the pattern while gradually decreasing that delay, thus same as above, but time in place of space. Imagine an oscilloscope image like so:

_________/^d^\_d_/^d^\_________

I note that at 41 ms interval, I perceive one longer blink only, but at 42 ms, I just perceive it as extremely rapid double blink. Thus, threshold is ~42ms. Probably varies depending on person, age, condition etc.

This is close to 24 fps, which is probably why cinema works at that presentation rate.

Reaction time to see something, then decide to react, say by clicking mouse etc, is longer much longer again. Thus, it's not surprising that experiments requiring a reaction response to measure yield a longer time, but that longer delay wasn't what you were asking for, and the above experiment is easy and illuminating!

But note also -- smoothly moving animations require the visual cortex to work harder, delaying visual comprehension. This delay is 'hidden' from perception, so longer delays (several hundred ms) can be 'hidden' by just providing something thats difficult to see because moving.

The effect that hides it is called Chronostasis. Basically, glancing somewhere 'new' requires the visual cortex to work harder to 'de-render' / 'recognise' the scene. This takes a remarkably long time, during which your consciousness is essentially 'paused'.

Once looking at a mostly-constant scene, only changes need this processing, so smaller/faster changes are possible and your perceptual experience resumes, and faster/smaller movements are detectable.

The detection of changes visually is processed basically on your retina. Your eyes also have a natural 'bandpass' response -- stare unblinkingly at anything for sufficient time, and at sufficient distance for saccades to be unable to change the image much, and you will find your visual feed fading out to 'grey'. This is what gives us our 'white balance', and is somewhat similar to the automatic gain control on analogue radio/tv.

The point is, that your eyes themselves have a time constant to respond, but this is actually dependant on the strength of the stimulus. (brightness of the LED, for our case).

Too bright, and the ability of your retinal cells to 'relax' back from the brightness, ie, respond to the 'sudden dark', is compromised.

The effect which keeps you seeing bright things after the light has stopped is called 'persistence of vision', and old cathode-ray picture tubes more or less depend heavily on it for them to work at all.

This is the one that's usually 100 ms or so, but it's not a 'sharp' interval -- more like a exponential roll-off, and again -- changes duration depending on how bright the stimulus is relative to how dark-adjusted (ie, sensitive) the eye is at that moment.

For duller, faster changes, especially changes outside your fovea, you will perceive even higher rates easily. Eg, flickering lights. Those outer parts of your retina (most of the area, actually) are adapted to detecting movement, and bringing it to your attention. So it makes sense that although lacking spatial resolution, they have greater time resolution / shorter response rate.

But this also means animating things usually requires even finer time steps, otherwise 'jumpiness' is perceptible, mostly due to that faster response.

Note all the scaling/sliding full screen animations iOS uses -- these essentially exploit chronostasis to hide technically unavoidable loading delays, giving the perception that those products respond instantly and smoothly at all times.

So, show something different within 42 ms -> instant response. Keep animating otherwise useless hard-to-see-properly visuals continuously at high frame rates, then stop suddenly when done -> hides the delay so long as enough is visually busy, and the delay isn't too long. (probably 250ms is pushing the friendship).

This also seems to tee up with other's perceptions of input lag, for example : http://danluu.com/input-lag/