Thursday, June 16, 2005

A close encounter with "dc"

"dc" DOES NOT stand for:
David Coulthard (F1 fans can stop reading)
Direct Current (electrical engineers, look no further)
District Council (sorry officer!!!)
da capo (dictionary please????)
Before Christ (You're blind dude!!! Get your eyes checked)

Computer engineers, step forward.
dc is a desktop calculator utility on Linux that uses the reverse-polish notation for its calculations. Honestly, I had not come across it until like a coupla days back, and I would've brushed it aside, were it not for the fact that what I saw critiqued my thinking abilities. Well, I bet it would ruffle up your thinking feathers too if you came across the following line:
echo "16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D68736142snlbxq" | dc.
The above line actually prints the string "Bash".Well,
not to be undeterred by this seemingly haphazard arrangement of characters, I started out on the noble mission of deciphering it. So, let's start from the left and work our way through it. Since we've piped the entire command to dc, it's as if dc read this from the stdin. First the basics of dc(assuming you know the reverse-polish notation):
1. Whatever number dc encounters, it pushes it onto its stack. The radix of the input is set by the command "i". When dc encounters this command, it pops the variable on top of the stack and sets it as the input radix.
2. Whenever it encounters an operator(+,-,*,/,%....) it pops the top 2 elements,performs the operation and pushes the result.
3. A macro is nothing but a collection of commands/numbers/operators within square braces"[]".
4. The command "q" quits from the current macro. If we are at the topmost level, it simply quits dc.
5. dc also has around 256 internal registers. The command "sx" pops the variable on top of the stack and stores it in the register x. Macros can also be stored onto registers or pushed onto the stack.
6. The command "lx" pushes the variable in register x onto the top of the stack. x is unchanged.
7. Command "x" pops the macro at the top of the stack and executes it. Hence, the command "lbx" executes the macro in register b. How? Well, it pushes the contents of b onto the stack and the "x" command pops it off executes it. Cool,nah?
8. "P" pops the variable from the top of stack and prints its ASCII equivalent.
9. The command "=z" pops the top 2 elements from the stack, compares them and if found equal, executes the macro in register "z".
Fine, now comes the interesting part of tearing apart this ubiquitous line:
echo "16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D68736142snlbxq" | dc.
First the input radix is set to hexa by the command 16i. We then store the command q in register a by "[q]sa". The next macro is a HUGE one. We'll tackle it later. Whatever was in that macro, we store it onto register b. After that, we push the number "A0D68736142" onto the stack, and then store it into register n through the following sequence : "A0D68736142sn". Finally, we execute the contents of register b(lbx) and then quit from the program.
Now, the magic is in the macro that we skipped earlier, which is actually recursive. Lemme print that here just for convenience:
First, the register contents of n are pushed onto the stack. If you remember, this is that huge hex number. It is then compared with 0 and if found equal, executes the macro in register a, which contains "q". Hence, this is the terminating condition of the recursion. After this, we again push the contents of reg n(since it was popped during the = operation) and the value 0x100 onto the stack. Now,the stack looks like this:
0x100 <- top
We then find the remainder of "0xA0D68736142/0x100" and print its ASCII value. Finally, we find out the value of 0xA0D68736142/0x100 and store this into the register n. At last, we make the recursive call (tail-recursion) to the same macro through "lbx". But now, the register n contains 0xA0D68736142/0x100 instead of 0xA0D68736142. Thus, we repeat the same procedure until n contains 0, at which point we execute contents of reg a,which quits this macro.
So, how do we get "Bash"? Simple:
1. REG N=0xA0D68736142

2. REG N=0xA0D68736142/0x100=0xA0D687361

3. REG N=0xA0D687361/0x100=0xA0D6873

4. REG N=0xA0D6873/0x100=0xA0D68

5. REG N=0xA0D68/0x100=0xA0D
0xA0D%0x100=0xD=13=ASCII(Carriage Return)

6. REG N=0xA0D/0x100=0x10
0x10%0x100=0x10=16=ASCII(DLE) ( I don't know what this is. A flaw in the logic perhaps???)

7. REG N=0x10/0x100=0x0

No comments: