Skip to main content

Signing in with a picture password

Picture password is a new way to sign in to Windows 8 that is currently in the Developer Preview. Let’s go behind the scenes and see how secure this is and how it was built. One of the neat things about the availability of a touch screen is that it provides an opportunity to look at a new way to sign in to your PC. While many of us might prefer to remove the friction of getting to a PC by running without a password, for most of us, and in most situations this is not the case or is at least unwise. Providing a fast and fluid mechanism to sign in with touch is super important, and we all know that using alpha passwords on touch-screen phones is cumbersome. This post is authored by Zach Pace, a program manager on our You Centered Experience team, and looks at the implementation and security of picture password in Windows 8. Just as a note, you can also use a mouse with picture password too, just by using some click and/or drag actions.

The experience of signing in to your PC with touch has traditionally been a cumbersome one. In a world with increasingly strict password requirements—with numbers, symbols, and capitalization—it can take upwards of 30 seconds to enter a long, complex password on a touch keyboard. We have a strong belief that your experience with Windows 8 should be both fast and fluid, and that starts when you sign in.
Other touch experiences in the marketplace have tried to tackle this problem, with the canonical example being a numeric PIN. A PIN is a great solution: Almost everyone has seen or used one before, and a keypad is simple to use with touch. We knew though, that there was room to improve.
A numeric combination often presents a problem for people because the sequences easiest to remember are typically the least secure. Common number sequences—like 1111, or 1234—are troublesome, but PINs that are composed of common well-known personal dates can also be deduced if an attacker has personal knowledge of the person (much of which is not hard to obtain). In such a case, the number being personal to a person can work against its security. We set out to change the paradigm here: we designed a fast and fluid touch sign-in experience that is also personal to you.

A personal sign-in experience

At its core, your picture password is comprised of two complimentary parts. There is a picture from your picture collection and a set of gestures that you draw upon it. Instead of having you pick from a canned set of Microsoft images, you provide the picture, because it increases both the security and the memorability of the password. You get to decide the content of the picture and the portions that are important to you. Plus, you get to see a picture that is important to you just like many people do on their phone lock screen.
Image of four people / Switch to password button
At its core, the picture password feature is designed to highlight the parts of an image that are important to you, and it requires a set of gestures that allow you to accomplish this quickly and confidently. In order to determine the best set of gestures to use, we distributed a set of pictures to a set of study participants and asked them to highlight the parts of the image that were important to them. That’s it, no additional instructions. What we found were people doing three basic things: indicating location, connecting areas or highlighting paths, and enclosing areas. We mapped these ideas to tap, line, and circle, respectively. It’s the minimal set of gestures we found that allowed people to signify the parts of the image most important to them.
There’s also an attribute inherent to circle and line gestures that adds an additional layer of personalization and security: directionality. When you draw either a circle or a line on your selected picture, Windows remembers how you drew it. So, someone trying to reproduce your picture password needs to not only know the parts of the image you highlighted and the order you did it in, but also the direction and start and end points of the circles and lines that you drew.
Set up your getures / A circle is shown around the man's head, a dot on the nose of figure on left, a line between the nose of the two figures on right.
We also researched using freeform gestures. When we explored the concept, both with design iterations and research, we found the major pitfall of such a system: the time it takes to sign in. As I mentioned above, we wanted a solution that was faster than a touch keyboard. Throughout the evolutionary process of this feature we used the time taken to sign in using a touch keyboard as a benchmark to judge the success of our methods. We found that when people were allowed to use freeform gestures, it took them consistently longer to sign in. They were slowed down by the concept, feeling that they needed to be unnecessarily precise and trace fine details in an image.
Because people were highlighting areas instead of fine detail, we found that using a limited set of gestures was on average more than three times as fast as the freeform method. We also found that with repeated use, people using the gesture set were consistently able to complete the task in under four seconds, compared to an average of 17 seconds for the freeform model. After continued use of the freeform method, we found many participants asked to change their freeform gestures, picking simple lines and locations instead.

How it works

Once you have selected an image, we divide the image into a grid. The longest dimension of the image is divided into 100 segments. The shorter dimension is then divided on that scale to create the grid upon which you draw gestures.
To set up your picture password, you then place your gestures on the field we create. Individual points are defined by their coordinate (x,y) position on the grid. For the line, we record the starting and ending coordinates, as well as the order in which they occur. We use the ordering information to determine the direction the line was drawn in. For the circle, we record a center point coordinate, the radius of the circle, and its directionality. For the tap, we record the coordinate of the touch point.
Line between the noses of two people in picture shown with grid superimposed. Endpoints of line identified as (X1, Y1) and (X2, Y2)
When you attempt to sign in with Picture Password we evaluate the gestures you provide, and compare the set to the gestures you used when you set up your picture password. We take a look at the difference between each gesture and decide whether to authenticate you based on the amount of error in the set. If a gesture type is wrong—it should be a circle, but instead it’s a line—authentication will always fail. When the types, ordering, and directionality are all correct, we take a look at how far off each gesture was from the ones we’ve seen before, and decide if it’s close enough to authenticate you.
As an example, let’s take a look at the tap gesture. The tap is the least complex of the three gestures both in number of unique permutations and in the subsequent analysis. When considering whether the spot that you’ve tapped matches a reference spot, our scoring function compares the distance between the gesture you recorded as part of your picture password and the one that you just performed. The score decreases from 100% for a perfect match to 0% when sufficiently far away. Points match when the score is >= 90%. Here is a visual representation of the scoring function for a point in the immediate vicinity of a 100% match:
Scoring the tap gesture
The area that is scored a match is a circle of radius 3. For any specific tap, a total of 37 (X,Y) locations will return a match. We perform similar calculations for the variables associated with lines and circles.

Security and gesture count

When we took a look at the number of gestures that would be required to use picture password we considered security, memorability, and speed. We sought to balance these often competing attributes to achieve an optimal user experience that would also be secure to use. In order to determine the appropriate gesture count that would meet our security goals, we compared picture password with different authentication methods, namely PIN and plain text password.
The analysis of the number of unique PINs is trivial. A 4-digit PIN (4 digits with 10 independent possibilities each) means there are 104 = 10,000 unique combinations.
When looking at alphanumeric passwords, the analysis can be simplified by assuming passwords are a sequence of characters comprised of lower case letters (26), upper case letters (26), digits (10), and symbols (10). In the most basic case, when a password is comprised strictly of n lower case letters, there are 26n permutations. When the password can be any length from 1 to n letters, then there arethis many permutations:
For instance, an 8-character password has 208 billion possible combinations, which to most people would seem amazingly secure.
Unfortunately, the way most users pick passwords is far from random. Left to their own devices, people use common words and phrases, names of family members, and so on.
In this scenario, let's assume the user composes their password from all but two lower case letters, one upper case letter, and one digit or symbol; however, the upper case letter and digit/symbol can appear in any position of the password. The number of unique passwords is then:
The following table illustrates how the size of the solution space varies with password length and various character set assumptions.
Password lengthUnique passwords
When considering picture password, we can conduct a similar analysis for each of the gesture types. The information in the tables below accounts for both unique gesture positions and the leniency of our recognition algorithm.
For the simplest gesture, the tap, the number of unique gesture sets as a function of number of taps is as follows:
# of tapsUnique gestures
The circle gesture has more complexity than a tap, but less than a line. In an attempt to quantify the relative security of a circle, we can assume that an attacker knows the radii is guaranteed to be between 6 and 25 (reducing the work to guess a circle gesture), we will further assume that both X and Y coordinates are known to be between 5 and 95. This makes the potential solution space for a hacker to explore to be as follows:
As a function of number of circles, the number of unique gesture sets is as follows:
# of circlesUnique gestures
The most complex gesture of the three is the line. A line is comprised of two points on a normalized 100 x 100 grid, and an ordering of those points. This nominally results in 100 million possible lines; however, lines must be at least 5 units long, so the number of unique lines is actually 99,336,960. Unlike attempts to guess circles where hackers can make simplifying assumptions that significantly reduce the solution space, there are not any similarly obvious reductions for lines. Lines could just as easily go from edge to edge of the screen as they could be very short segments. The number of matches in the case of the line is as follows:
# of linesUnique gestures
Now that we understand the security of individual gestures, this data can be combined to assess sets containing multiple gestures. This can be done by summing up the unique gestures for all three gesture types for the specific gesture length n and raise it to the nth power. This results in the table below, which compares picture password to both PIN and alphanumeric password methods.
Length10-digit PINSimple a-z character set passwordMore complex character set passwordMulti-gesture picture password
As you can see, the use of three gestures provides a significant number of unique gesture combinations and a similar security promise to a password of 5 or 6 randomly chosen characters. Additionally, using three gestures ensures a Picture Password that is easy to remember and quick to use.
In addition to the number of unique combinations, we’ve increased security of the feature by introducing two safeguards against repeated trial attacks. Similar to the lock out feature on phones using PIN, when you enter your picture password incorrectly 5 times, you are prevented from using the feature again until you sign in with your plain text password. Also, picture password is disabled in remote and network scenarios, preventing network attacks against the feature.
To be clear, picture password is provided as a login mechanism in addition to your text password, not as a replacement for it. You should be sure to have a good hint and use safeguarding mechanisms for your text password, which you can still always use to sign in (the sign-in screen provides a one-click mechanism to switch between all available password entry methods).

Securing against smudges

We’ve also taken some practical considerations to protect you if you use Picture Password. People are often concerned with the smudges left behind on a touch screen and how easy or hard it would be to divine your password based on those markings. Because the order of gestures, their direction and location all matter, it makes the prospect of guessing the correct gesture set based on smudging very difficult even in the completely clean screen case, let alone on a screen that sees regular touch use.
The potential threat here is that smudges left from signing in may yield clues as to the authentication sequence. We can compare three way of logging in—touch keyboard, a four-digit PIN, and picture password—to compare the ease of guessing the sign-in sequence. Let's assume the worst case scenario:
  1. The user cleans their screen to an absolutely mirror shine.
  2. The user touches exactly the minimum places necessary to authenticate.
  3. The user then walks away from their machine without touching it further.
  4. The attacker steals the tablet and can with 100% accuracy see every gesture used for authentication.
Obviously, this is rather unlikely, but this scenario allows us to compare and contrast the three forms of authentication and their relative vulnerability to this sort of attack.
A PIN will leave a smudge in a known location for each digit used in the code. If there are n digits in the PIN, and all digits are unique (the hardest to deduce case), there will be n! possible ways of ordering the PIN. For a typical 4-digit PIN, this is 24 different combinations.
For an on-screen keyboard, there are also n! ways of ordering an n-character password. For compliant passwords, a person will typically use the Shift key (or another button) to select alternate character sets. This key press will, of course also be visible to the attacker, but it does not indicate when in the sequence the Shift key was utilized. If we make the simplifying assumption that there is only one shifted key in the password, then there are n!⋅possible passwords to consider.
Gestures also have n! orderings. For every circle and line used in the gesture set, the number of permutations increases by a factor of two. If all gestures are circles or lines, then the possible set of permutations is the same as a password that uses the Shift key,  n!⋅2^n.
The following table summarizes the number of permutations for each of these methods for various sequence lengths:
LengthPINPasswordPassword with ShiftTap-only gesturesLine and circle gestures
Again, this is assuming a completely clean screen with only the gestures visible via smudging. If we consider a scenario where the attacker cannot gain any useful information from smudging—either because the machine is very heavily used (and smudged) or because it is mouse and keyboard only—the chances of guessing the correct sequence becomes even more remote. With our three gestures types, directionality, and the requirement that the sequence be at least three gestures long, the possible number of gesture combinations sits at 1,155,509,083, as discussed above.
The final attack that we considered involves points of interest on an image, or areas that people may commonly choose when presented with an image. Even though the research we did showed this kind of attack to be extremely unreliable—the areas people chose and the kind of gestures they drew upon them correlated very poorly in the lab—we can analyze such an attack by assuming a given picture has clip_image016 points of interest. If the user is free to use any combination of taps, circles, and lines, then the total number of permutations is  (m⋅(1+2⋅5+(m-1)))^n, where n  is the length of the picture password. This yields the following number of possible combinations:
Points of interest
Assuming the average image has 10 points of interest, and a gesture sequence length of 3, there are 8 million possible combinations, making the prospect of guessing the correct sequence within 5 tries fairly remote.
Although we’re very happy with the robustness of a picture password, we know that there are a variety of businesses for which security is paramount, and anything less than a full password is unacceptable. As such, we’ve implemented group policy that gives a domain administrator the freedom to choose whether picture password can be used. And of course, on your home PC, picture password is optional as well.
When we started the process of designing picture password, we knew that we wanted a sign-in method that was fast, fluid, and personal to each and every user of Windows 8, but still had a robust security promise. Through our research and refinement of both the experience and the concept, we believe we’ve hit on a method of signing in that’s secure but also a lot of fun to use. We love picture password and the additional personal flavor it brings to Windows 8, and we hope you do too!