Editing Talk:936: Password Strength
Latest revision | Your text | ||
Line 83: | Line 83: | ||
* (Secondly: The "punctuation" should have 5, not 4 bits of entropy. There are 32 (2^5) ASCII punctuation characters (POSIX class [:punct:]). But I assume this is a lapse.) | * (Secondly: The "punctuation" should have 5, not 4 bits of entropy. There are 32 (2^5) ASCII punctuation characters (POSIX class [:punct:]). But I assume this is a lapse.) | ||
Can someone enlighten me? --[[Special:Contributions/162.158.91.236|162.158.91.236]] 17:31, 19 September 2015 (UTC) | Can someone enlighten me? --[[Special:Contributions/162.158.91.236|162.158.91.236]] 17:31, 19 September 2015 (UTC) | ||
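The 5-bit figure in the comment above can be checked with a short shell sketch. It counts the printable ASCII characters that the C locale places in [:punct:] and then derives the number of bits for an alphabet of that size. The function names punct_count and bits_for are made up for this illustration, and the sketch assumes a shell (such as bash) whose case patterns support POSIX character classes.

```shell
#!/bin/sh
# Use the C locale so that [:punct:] means the ASCII punctuation set.
LC_ALL=C; export LC_ALL

# punct_count (hypothetical helper): count the characters with codes 33..126
# that match the POSIX class [:punct:].
punct_count() {
  count=0
  code=33
  while [ "$code" -le 126 ]; do
    # Turn the numeric code into the character itself via an octal escape.
    ch=$(printf '%b' "\\0$(printf '%03o' "$code")")
    case $ch in
      [[:punct:]]) count=$((count + 1)) ;;
    esac
    code=$((code + 1))
  done
  echo "$count"
}

# bits_for (hypothetical helper): floor(log2(n)), which is exact when n is a
# power of two, as it is for 16 and 32.
bits_for() {
  n=$1
  bits=0
  while [ "$n" -gt 1 ]; do
    n=$((n / 2))
    bits=$((bits + 1))
  done
  echo "$bits"
}

punct_count   # prints 32
bits_for 32   # prints 5
bits_for 16   # prints 4
```

With all 32 punctuation characters equally likely, one such character carries 5 bits; restricting the choice to 16 characters gives the 4 bits used in the comic.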
− | :I have missed the sentence "Randall assumes only the 16 most common characters are used in practice (4 bits)". Hm. There is a huge list with real world passwords out there, leaking from RockYou in 2009. After some processing to remove passwords | + | :I have missed the sentence "Randall assumes only the 16 most common characters are used in practice (4 bits)". Hm. There is a huge list with real world passwords out there, leaking from RockYou in 2009. After some processing to remove UTF-8 passwords, the list contained about 14329849 unique passwords from about 32585010 accounts. The following are the number of accounts using a password containing some (ASCII) punctuation or space characters: |
<nowiki> | <nowiki> | ||
226673 . | 226673 . | ||
Line 91: | Line 91: | ||
104224 @ | 104224 @ | ||
95237 * | 95237 * | ||
− | 92802 | + | 92802 (space) |
60002 # | 60002 # | ||
36522 / | 36522 / | ||
Line 118: | Line 118: | ||
939 } | 939 } | ||
502 | | 502 | | ||
− | |||
− | |||
</nowiki> | </nowiki> | ||
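A per-character tally like the one above could be produced roughly as follows. This is only a sketch, not necessarily the command that generated the numbers: it assumes a tab-delimited table of the form "count<TAB>password" containing only printable ASCII, and the function name tally and the three sample rows are invented for the illustration.

```shell
#!/bin/sh
# tally (hypothetical helper): for every non-alphanumeric character, sum the
# account counts of all passwords that contain it at least once.
tally() {
  LC_ALL=C awk -F '\t' '
    {
      split("", seen)                  # reset the per-password set
      n = length($2)
      for (i = 1; i <= n; i++) {
        c = substr($2, i, 1)
        # Count each character only once per password.
        if (c ~ /[^A-Za-z0-9]/ && !(c in seen)) {
          seen[c] = 1
          total[c] += $1
        }
      }
    }
    END { for (c in total) print total[c], c }
  ' "$1" | sort -k1,1nr
}

# Made-up sample data, not taken from the RockYou list.
sample=$(mktemp)
printf '5\tab.c\n2\tx@y.z\n1\tplain\n' > "$sample"
tally "$sample"
rm -f "$sample"
```

For the sample rows, the dot is credited with 7 accounts (5 + 2) and the at sign with 2; the password without punctuation contributes nothing.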
:Sorry, I have no "citation". But you can play with the leaked RockYou password list yourself. Here is a way to reach that playground: | :Sorry, I have no "citation". But you can play with the leaked RockYou password list yourself. Here is a way to reach that playground: | ||
Line 125: | Line 123: | ||
$ # Download the compressed list (57 MiB; I have no idea what "skullsecurity" | $ # Download the compressed list (57 MiB; I have no idea what "skullsecurity" | ||
$ # is, it was simply the first find and I assume it's the said list): | $ # is, it was simply the first find and I assume it's the said list): | ||
− | $ wget http://downloads.skullsecurity.org/passwords/rockyou-withcount.txt.bz2 | + | $ wget 'http://downloads.skullsecurity.org/passwords/rockyou-withcount.txt.bz2' |
− | $ # Decompress the list (243 MiB), or, | + | $ # Decompress the list (243 MiB), or, more precisely, it's a table: | |
$ bzip2 -dk rockyou-withcount.txt.bz2 | $ bzip2 -dk rockyou-withcount.txt.bz2 | ||
Line 139: | Line 137: | ||
49952 iloveyou | 49952 iloveyou | ||
− | $ # The following command processes the table to remove lines | + | $ # The following command processes the table to remove lines having non-ASCII |
− | $ # | + | $ # characters or non-printable ASCII characters in the password, and lines |
− | $ # | + | $ # insisting that there were some accounts with no password. Moreover, the |
− | + | $ # command removes every space character not belonging to a password, makes | |
− | $ # removes every space character not belonging to a password, makes | + | $ # the rows tab-delimited and writes the result in a file called "ry" |
− | $ # tab-delimited and writes the result in a file called "ry" (161 MiB | + | $ # (161 MiB). |
− | + | $ LC_ALL=C sed -nr 's/^ *([1-9][0-9]*) ([[:print:]]+)$/\1\t\2/p' rockyou-withcount.txt > ry | |
− | $ LC_ALL=C sed - | ||
$ # The following are shell functions to build commands. They will be explained | $ # The following are shell functions to build commands. They will be explained | ||
− | + | $ # below using examples (I cannot express myself well in this language). | |
$ counta() { LC_ALL=C awk 'BEGIN { FS = "\t"; p = 0; a = 0 } { if ($2 ~ /'"$(printf %s "$1" | sed 'sI/I\\/Ig')"'/) { p++; a += $1 } } END { print a " (" p ")" }' "$2" ;} | $ counta() { LC_ALL=C awk 'BEGIN { FS = "\t"; p = 0; a = 0 } { if ($2 ~ /'"$(printf %s "$1" | sed 'sI/I\\/Ig')"'/) { p++; a += $1 } } END { print a " (" p ")" }' "$2" ;} | ||
$ countap() { LC_ALL=C awk 'BEGIN { FS = "\t"; p = 0; a = 0 } { if ($2 ~ /'"$(printf %s "$1" | sed 'sI/I\\/Ig')"'/) { p++; a += $1; print $0 } } END { print a " (" p ")" }' "$2" ;} | $ countap() { LC_ALL=C awk 'BEGIN { FS = "\t"; p = 0; a = 0 } { if ($2 ~ /'"$(printf %s "$1" | sed 'sI/I\\/Ig')"'/) { p++; a += $1; print $0 } } END { print a " (" p ")" }' "$2" ;} | ||
Line 160: | Line 157: | ||
671599 (188855) | 671599 (188855) | ||
− | $ # The first operand of | + | $ # The first operand of this command is an extended regular expression (ERE), | |
− | $ # | + | $ # namely "love". The second operand of this command is a file, namely the |
− | $ # called "ry", that is the (processed) table. The first number of the output | + | $ # above generated file called "ry", that is the (processed) table. The first | |
− | + | $ # number of the output means: "That many accounts were using a password | |
− | + | $ # matching the ERE." The second number in parentheses means: "That many unique | |
− | $ # the ERE." If the first number is greater than the second number, some | + | $ # passwords matching the ERE." If the first number is greater than the second |
− | + | $ # number, some accounts share the same password. We will see this clearly in | |
− | $ # examples below | + | $ # some examples below. |
$ # Count how many accounts were using a password containing at least one | $ # Count how many accounts were using a password containing at least one | ||
Line 179: | Line 176: | ||
144 (45) | 144 (45) | ||
− | $ # Count how many accounts were using a password containing exactly one | + | $ # Count how many accounts were using a password containing exactly one |
− | $ # character: | + | $ # numeric character: |
$ counta '^[0-9]$' ry | $ counta '^[0-9]$' ry | ||
55 (10) | 55 (10) | ||
Line 198: | Line 195: | ||
55 (10) | 55 (10) | ||
− | + | $ # Here we see the second command in action. You see what it does and what it | |
− | + | $ # does differently. And here we see clearly the meaning of the first and the | |
− | + | $ # second number in parentheses. | |
$ # Count how many accounts were using a password containing at least one | $ # Count how many accounts were using a password containing at least one | ||
Line 216: | Line 213: | ||
$ counta '^[0-9]' ry | $ counta '^[0-9]' ry | ||
6409397 (3283946) | 6409397 (3283946) | ||
− | |||
− | |||
− | |||
− | |||
− | |||
$ # And, last but not least, count how many accounts were using a password | $ # And, last but not least, count how many accounts were using a password | ||
Line 229: | Line 221: | ||
3 (3) | 3 (3) | ||
− | $ # Yes, there are some. 14 million | + | $ # Yes, there are some. 14 million passwords are a lot. Let's see what exactly |
− | $ # | + | $ # was used: |
$ countap '[tT]r[o0]ub[a4]d[o0]r' ry | $ countap '[tT]r[o0]ub[a4]d[o0]r' ry | ||
1 troubador1 | 1 troubador1 |