Thursday, October 24, 2013

Useful Complexity



...And even then it will be cracked, because GPU based cracking and cracking method optimization, have reduced the time required to crack the entire passwordspace for most passwords down to a matter of minutes, or at worst hours.

According to several recent articles in various industry publications and websites, approximately 85% of all Windows passwords can be recovered in less than 60 minutes, and more than 90% within 24 hours, using only a single multi-core cpu, multi-gpu computer (basically a high end gaming rig).

Using small clusters of multi-cpu many-gpu systems (basically, spend $20,000 on off the shelf hardware) the entirety of the 8 character Windows passwordspace (all possible 0-8 character Windows passwords) can be cracked in a few days, or less.

With the computing power available today, the only useful thing high password complexity does, is make your password harder for a human to guess.

...Unfortunately, the bad part is, that also makes it harder to remember, and harder to enter.

Here's the level of minimum password complexity that is actually useful: 

8 or more characters, not forming any dictionary word or combination of words (including letter substitutions), and including at least one special character.

Anything else is just making your users life more difficult, without actually making them any more secure in the real world.

Ok, so... why is this the "useful level of complexity" ?

Because in the real world, an 8 character password, without any dictionary words or variants on dictionary words, and including at least one special character, requires a cracker to use the entire characterspace to crack your password.

Wait... what? No, that's wrong isn't it? There's 128 ASCII characters, or 255 in the extended character set right? Upper and lower case alphabetic characters, numerals 0-9 and a whole bunch of "special characters"... All of those can be used in passwords right?

Well, yes, theoretically the possible characterspace is 255 characters (or 256 for ISO-8859/UTF-8 encodings).

...Theoretically...

In reality, it's not. First thing is that most password systems don't allow the entire 8 bit characterspace.

While it is theoretically possible to use the entire 8 bit U.S. character set (extended ASCII or UTF-8) in a password (or even to use a multibyte character set), it requires special keyboard codes, and these characters are difficult to enter. Further, most mobile devices do not allow you to enter characters other than those on the standard keyboard (or make it very difficult to do so).

There are 94 or 95 characters available on a standard US keyboard (depending on whether you count the nonbreaking space i.e. the space bar): 10 numerals, 32/33 special characters, and 52 letters (upper and lower case).

By the by, these are generally referred to as the "printable characters", with the remaining characters referred to as "non-printable".

Even if you wanted to use them, accepting that they are difficult to enter and mobile devices may not be able to enter them at all... most password systems exclude unprintable characters, leaving a maximum of 95 possible characters.

For those password systems which allow the non-printable character set, they generally limit passwords to the 7 bit basic ASCII character set (or sometimes ANSI-1 or UTF-7, which are technically different, but include the same characters), which is 128 characters.

Oh and yes, there are non-us character sets, even multibyte character sets that include many thousands of characters, and it's certainly possible to code a system that accepts all of these characters.

... but no-one does.

Even computing systems that accept large character sets for text input (those supporting the Chinese GB18030 standard for example, or a full implementation of UTF-32, with over 1.1 million possible characters), generally only accept a limited subset of characters (usually UTF-8) for passwords (because you can't guarantee compatibility with large character sets across all hardware and software combinations).

So yes, the theoretically possible characterspace is actually many more than 255 characters, but the 95 keyboard characters comprises the entirety of the passwordspace most people might actually use.

Oh and many password systems exclude some or all of the characters !@&*$?/|\ and almost all password systems exclude the nonbreaking space (the space bar), because they can cause problems with parsing. Some actually exclude all special characters, but this is rare now.

What it comes down to, is that the "normal" characterspace is 94 characters.

That would seem to make it even MORE important to use case shift, and numerals; as they comprise 38% of the available characters.

In theory just using lowercase and special characters takes 36 of those 94 characters out, meaning that crackers only need to use 72% of the characterspace to crack your password.

...In theory, it would be better to make them need to use 100%...

...but in reality it doesn't work that way.

Okay... why doesn't more complexity increase security?

At this point, the computational power of multi-gpu cracking system, is enough so that in any serious cracking run, crackers can include the entire alphanumeric space without undue penalty; so including numbers and case changes can help a bit, but not much.

The first cracking run on a password will be optimized for high speed, and will include an optimized dictionary, and tables of common dictionary variations and substitutions (substituting 3 for E, @ or 0 for O etc...). Combined with a full lowercase alpha only run, that only takes a few minutes, to at most a few hours, for the entire 0-8 character passwordspace.

From there, crackers go to brute force, with or without optimizations. The first thing they're going to do is add in the full alphanumeric space, before they add in special characters; and any run that includes special characters will therefore almost certainly include mixed case and numerals.

That means that in a bruteforce attack, whether you included mixed case and numerals in your password or not, the cracker will still try all of them as if you did, and therefore it will take just as long to crack your password as if you did include them.

So, any password with a special character is likely to be slightly more secure than those including numbers and case changes, and unlikely to be less secure (presuming equal length). To put it another way, using a special character (or preferably more than one), has a higher expected security value, than using mixed case and numeric characters.

Yeah, it MIGHT take longer to bruteforce your password if you've got all 3... but your password is going to be one in a hundred, or a thousand, or a million, the cracker is trying to crack all at once; and they're going to run the entire mixed case alphanumeric space, before they even start adding special characters.... and these days "longer" is a few hours, or at most a couple days, not "more than 30 days".

So, unless your password policy is that users change their passwords every week (and that would be a huge support nightmare, causing more lost productivity than any value doing so might provide)... adding any more complexity doesn't significantly increase the security of a password; but does significantly increase the trouble to your users.

Include more complexity if you want to... but don't make it a requirement.

My personal recommendation for how to create good passwords?

Using the first or last letter of each word (or better, both the first AND last letter) in a phrase, poem stanza, song lyric, or other memorable passage, combined with special characters; is generally a good way of producing a pseudorandom non-dictionary string that is of sufficient length to provide reasonable security, but which you can still actually remember.

Include more than one special character, and don't make the specials ONLY the first, last, or middle/joining characters in the password. Also, don't make the only special characters you use, common letter substitutions like $ for S, ! for I etc...

All of these are common optimizations which crackers use to reduce the time it takes to bruteforce a password by the way. Not doing them forces the cracker to bruteforce the entire passwordspace, not just the MUCH reduced optimized space.

Going to more than 8 characters is actually useful, if the password system doesn't drop or ignore the extra characters (many do).

More than 16 characters generally isn't useful for a pseudorandom password, because 16 characters using the 94 character passwordspace, is essentially uncrackable at this time (it's computationally infeasible within a reasonable time horizon). Really any complex pseudorandom password with 12 characters or more is likely to remain computationally infeasible for at least 10 years.

Telephone company studies to determine the ideal length of phone numbers, figured out that human beings are pretty good at remembering strings of 1, 2, 3, and 4 characters, and combinations of those strings (2+3=5, 3+4=7, 3+3+4=10 etc...); with 3 and 4 character strings being the easiest to remember due to something they called "memory chunking" (the human memory seems to run 4/4 time).

Those same studies showed that humans are generally bad at remembering strings of other lengths, more than 4 strings total, and more than 13 characters total (with optimal recall at 3 or fewer strings, and 10 or fewer total characters).

Given that, I say make your passwords 9-12 characters long, with at least two special characters. You can improve your password strength dramatically with every additional character up to 16, but you trade off on memorability.

The standard recommendation is to use a different password for every account; but given the huge number of accounts people often have these days, it seems unrealistic to expect them to remember that many different passwords.

One solution is to use a password manager, which will create a unique strong password for every account, and store them, requiring you only to enter the strong password you created for the password manager itself.

Another solution is to create unique strong passwords for your high security impact accounts (those with banking, credit, legal, and healthcare impact for example), and then to have several other passwords that you use for other security levels, having just one for each level, but changing them at least every 90 days.

Whatever you do, it's always a tradeoff between length and complexity (increased entropy), and memorability and easy of entry.

Speaking of length and memorability... passphrases?

If the password system in question doesn't drop or ignore characters beyond 8, 12, 16 etc... you can also use longer passphrases instead of pseudorandom passwords.

At first glance, this would seem to be an easy way to have a memorable password that is still very long; which is true, but there are some major issues with passphrases that make each character in added length of much less value than in a pseudorandom password.

Multi word phrases using common dictionary words are less secure for an equivalent length, than pseudorandom passwords with special characters, simply because the possible solutionspace for each is very different.

With an 8 character password, in a 94 character passwordspace, there are 6,095,689,385,410,816 possible combinations of characters. There are only about 30,000 8 letter words.

There are between 250,000 and 400,000 words in the english language (depending on what words you count and whose estimates you believe). The average English speaker however only knows 20-40,000 words, and only uses about 2000-4000 words regularly.

Further, English words exhibit very strong letter frequency patterns, which are well understood in statistical analysis (in fact that understanding is critical for cryptanalysis). For example, the average english word is 5 letters long, and more than 80% of english words contain at least two of 6 most common consonants, and at least one of the vowels e, i, or a.

Reducing the dictionary set to common words of 8 letters or fewer, brings your wordspace down from 400,000 to something like 100,000.

These characteristics dramatically reduce the total entropy of passphrases; and dictionary optimized bruteforcing, based on common words, and english letter frequency, can be many orders of magnitude faster than a straight bruteforce.

Essentially, each word in a passphrase provides less than the entropy of a pair of pseudorandom characters.  

In fact, given the reduction in entropy inherent in using dictionary words; if you are going to use a passphrase without increasing the complexity, I would recommend at least 8 words and at least 32 characters (not including the non-breaking space. Longer words are better).

... which really means you should be increasing the complexity. 

The first and most basic thing, is to use at least one word 8 characters or longer, preferably an uncommon one (say... antidisestablishmentarianism for example). This makes the wordspace required to crack your passphrase DRAMATICALLY larger (the average English word is 4.5 characters long. Going from 4-5 character words to an 8 character word increases the cracking space from around 40,000 to over 150,000 words).

Passphrases should include as much of the full 94 character passwordspace as possible; using mixed case, multiple special characters (punctuation is good for that, but because spacing between words is common, it has a lower expected value than other special characters), and if it is easy to remember, and makes sense, numerals. Also, using a special character substitution in more than one word here provides a dramatic increase in entropy that is very worthwhile, particularly if it's not a common substitution.

I would also recommend replacing (os letter substitution with) one or more dictionary words in the phrase with a pseudorandom string. For example, use the first two and/or last two words of the phrase to create a pseudorandom string with the first and last letters.

Increasing word complexity and adding pseudorandom strings to a passphrase of any length more than 5 words or more, and at least 20 characters should make it functionally impossible to bruteforce.

Common words of 3 characters or fewer are actually easier to bruteforce than single additional pseudorandom characters. So you want to average at least 4 characters per word... preferably 5 or more (more than the average word length).

Oh and as spacing is predictable in standard English phrases, make it unpredictable. This results in combination words that together are harder to brute force than the multiple individual words with spaces would be.

Basically... if your pass phrase includes "the" and "end" you should make sure that you've got two 6 letter words in there and make it something like "Always-beTTer intheenD!" (which would be functionally impossible to bruteforce any time in the forseeable future).

At that point you have the same entropy as a pseudorandom string of the same length... it's just easier to remember.