PersonalCorpus board (Digest area)

From: Kernel (调戏了别人的妹妹!), Board: Hacker
Subject: How do you write your own buffer overflow exploit?
Posted at: HIT Lilac BBS (哈工大紫丁香) (Thu Sep 11 19:51:46 2003)


                              by 黑猫 (virtualcat@hotmail.com)


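Content:  This article explains how buffer overflows work and walks through
          hands-on examples of exploiting them on Linux and Solaris.
          It does not cover how to write shellcode.

Required: Some background in C and assembly language.

Goal:     To be as easy to follow as possible, so that readers with only basic
          computing knowledge can write their own exploit after reading it.
          If you already understand all of this, there is no need to read on.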


                          Part 1    Overview

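1. How does a buffer overflow happen?
    A buffer overflow is exactly what the name says: the buffer is too small
    to hold everything written into it, so the excess spills out past its end,
    just as a jar overflows when you pour in more water than it can hold ;)
    Why do programs use buffers at all? Put simply, a buffer is a staging area
    for data that is being processed.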

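2. The C function call mechanism on UNIX, and how overflows are exploited.
   1) The image of a process in memory.
      Suppose we have a program whose functions call one another as follows:
      main(...) -> func_1(...) -> func_2(...) -> func_3(...)
      That is, main calls func_1, func_1 calls func_2, and func_2 calls func_3.

      Once the operating system has loaded the program into memory and started
      it, the image of the resulting process looks like the figure below.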

        (high addresses)
        +--------------------------------------+
        |             ......                   |  ... regions we do not care about
        +--------------------------------------+
        |  env strings                         | \
        +--------------------------------------+  \
        |  argv strings                        |   \
        +--------------------------------------+    \
        |  env pointers                        |    environment variables and
        +--------------------------------------+    / command-line arguments
        |  argv pointers                       |   /  passed in by the shell
        +--------------------------------------+  /
        |  argc (number of arguments)          | /
        +--------------------------------------+
        |        stack frame of main           | \
        +--------------------------------------+  \
        |        stack frame of func_1         |   \
        +--------------------------------------+    \
        |        stack frame of func_2         |     Stack
        +--------------------------------------+    /
        |        stack frame of func_3         |   /
        +......................................+  /
        |                                      |
                      ......
        |                                      |
        +......................................+
        |            Heap                      |
        +--------------------------------------+
        |        Uninitialised (BSS) data      |
        +--------------------------------------+
        |        Initialised data              |
        +--------------------------------------+
        |        Text                          |  program code
        +--------------------------------------+
        (low addresses)

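        A few notes:
        i)   As calls nest deeper, stack frames are added one after another
             toward lower addresses. As the functions return, their frames are
             discarded one by one and the stack shrinks back toward higher
             addresses. Frame size differs from function to function; it is
             determined mainly by the function's local variables.
        ii)  Dynamic allocation takes place in the heap. As the process is
             given more memory, the heap could in principle extend toward
             higher or lower addresses depending on the implementation, but
             it normally grows toward higher addresses.
        iii) In the classic UNIX description, when growth of the BSS data or
             of the stack exhausts the free memory given to the process, the
             process is blocked and rescheduled with a larger memory area.
             (On Linux the kernel instead extends the stack on demand up to
             the resource limit, and a process that goes beyond it receives
             SIGSEGV.) This has nothing to do with exploits, but it is good
             to know.
        iv)  A function's stack frame holds its arguments (whether the
             arguments live in the caller's frame or the callee's depends on
             the system), its local variables, and the data needed to restore
             the frame of the function that called it (the previous frame),
             including the address of the next instruction to execute in the
             caller.
        v)   The uninitialised data (BSS) region holds the program's global
             and static variables that have no explicit initial value; this
             memory is always zero-filled. The initialised data region holds
             the initialised data stored in the executable file. Together the
             two are called the data region.
        vi)  Text is a read-only region holding the program's code; any
             attempt to write to it causes a segmentation violation. The text
             region is shared by all processes running the same executable.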


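    2) A function's stack frame.
       The stack frame built when a function is called contains:
       i)   The function's return address. Whether it is stored in the
            caller's frame or the callee's depends on the system.
       ii)  The caller's frame information, i.e. its stack top and stack base.
       iii) Space allocated for the function's local variables.
       iv)  Space allocated for the arguments of the functions it calls
            (system dependent).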

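    3) Exploiting a buffer overflow.
       The frame layout tells us the following.
       Local variables are allocated in the stack frame, so a buffer declared
       inside a function occupies memory in the frame built when that
       function is called.

       Operations on a buffer (copying a string, for example) proceed from
       low addresses to high addresses, and the saved return address usually
       sits above the buffer, at a higher address, because of the way the
       stack is laid out. That is what makes it possible to overwrite the
       return address. If we get the chance to fill a buffer with more data
       than it can hold, we may be able to rewrite the return address saved
       in the frame and so redirect execution wherever we like. In other
       words the process comes under our control: we can make it abandon
       its original flow and run code that we have prepared.

       This is possible because, on a von Neumann machine, code and data
       share the same memory, and C does not check array bounds.

       The figures below illustrate the idea:
       i) String functions normally work through a buffer from low addresses
          to high addresses, e.g. strcpy(s, "AAA.....");

                    s  s+1 s+2 s+3 ...
                    +---+---+---+--------+---+...+
    (low addresses) | A | A | A | ...... | A |...|     (high addresses)
                    +---+---+---+--------+---+...+

          ii) Overwriting the return address

                                / |      ......        |  (high addresses)
                               /  +--------------------+
                  caller's frame  |    0x41414141      |
                               \  +--------------------+
                                \ |    0x41414141      |  return address into the caller
                                 \+--------------------+
                                 /|      ......        |
                                / +--------------------+  s+8
                               /  |    0x41414141      |
                              /   +--------------------+  s+4
                  callee's frame  |    0x41414141      |
                              \   +--------------------+  s
                               \  |    0x41414141      |
                                \ +--------------------+
                                 \|      ......        |
                                  +....................+
                                  |      ......        | (low addresses)

               Note: the ASCII code of the character 'A' is 0x41.

          iii) As the figure shows, if instead of 0x41414141 we overwrite the
               return address with an address the process is allowed to
               access, and that address is the entry point of code we have
               prepared, the process will run our code. If the address lies
               in memory the process cannot access, the process crashes with
               "Segmentation fault (core dumped)"; if it points at bytes that
               are not valid machine instructions, the result is an
               "Illegal instruction" error, and so on.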

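   4) When the buffer lives in the heap or in the BSS region
      i)  If the buffer is obtained dynamically inside the function (with
          malloc(), for example), the stack frame only holds a pointer to
          the block allocated in the heap. The overflow then happens in the
          heap, and overwriting the function's return address looks almost
          impossible. Whether such an overflow can be exploited depends on
          the details, but it cannot be ruled out.
      ii) If the buffer is declared static inside the function, its storage
          is in the uninitialised data (BSS) region. The situation resembles
          the heap case, and exploitation is possible. One special case is
          worth noting: the overflow can be used to overwrite a function
          pointer, so that a later call through that pointer runs code of
          our choosing.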


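3. What does exploiting a buffer overflow get us?
   As shown above, exploiting a buffer overflow lets us rewrite memory,
   including a function's return address, and so change the flow of
   execution and make the process run code we have prepared.

   However, the process runs with the identity of the user we are logged in
   as. So what if it runs our code? We still cannot get past the permissions
   the system gives that user, and cannot do anything the user is not
   already allowed to do.

   In other words, to gain higher privileges through a buffer overflow we
   also have to take advantage of certain features of the system.

   On UNIX there are two such features.
   i)  SUID and SGID programs
       UNIX lets other users run an executable with the user ID or group ID
       of the file's owner. This is done by setting the SUID or SGID bit on
       the file. When another user runs a file that has SUID or SGID set,
       the program runs with the identity of the file's owning user or group.
       If an executable is owned by root, has SUID set, and contains an
       exploitable buffer overflow, we can use it to run our prepared code as
       root. Nothing is more attractive than having it spawn a shell with
       superuser privileges, is it?

   ii) Network daemons (services)
       Many UNIX daemons run as root. If such a program has an exploitable
       buffer overflow, we can make it run our prepared code with the
       identity it is already running under, namely root.

       Because the daemon already runs as root, its executable does not need
       the SUID or SGID bit. And because this kind of attack is usually
       carried out by sending malicious data from a remote machine to a port
       on the target, it is called a "remote overflow" exploit.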

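4. A flawed program
    The following example is purely fictional; any resemblance to real code
    is coincidental.

/*
*    File    : p.c
*    Compile : gcc -o p p.c
*/

#include <stdio.h>
#include <string.h>     /* strcpy() */

void vulFunc(char* s)
{
        char buf[10];
        strcpy(buf, s);         /* deliberately unchecked: the flaw under study */
        printf("String=%s\n", buf);
}

/* As in the original, main has no return statement, so the disassembly
   shown in section 5 still matches this source. */
int main(int argc, char* argv[])
{
        if(argc == 2)
        {
                vulFunc(argv[1]);
        }
        else
        {
                printf("Usage: %s <A string>\n", argv[0]);
        }

}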

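    This program takes a string from the command line and prints it to
    standard output (the screen). vulFunc() declares buf, a buffer of ten
    bytes. If the string we type is short enough to fit, everything works
    normally. But what if it is longer than ten characters? The buffer is
    too small to hold it, so it overflows? To answer that we need to look at
    what actually happens.

    For the analysis of this program and simulated attacks on different
    operating systems, see Part 2, Fundamentals.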



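                      Part 2    Fundamentals
5. Linux on x86
   The Linux platform used in this article:
   Red Hat Linux release 6.2 (Zoot)
   Kernel 2.2.14-12 on an i586

   Compiler and version:
   bash$ gcc -v
   Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs
   gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)

   Note: different compiler versions may generate different machine
   instructions for the same source code.

   1) Dissecting p.c on Linux x86.
      i)   First we compile p.c and disassemble the relevant functions with
           gdb. The result is listed below: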

           bash$ gcc -o p p.c
           bash$ gdb p
           GNU gdb 19991004
           Copyright 1998 Free Software Foundation, Inc.
           GDB is free software, covered by the GNU General Public License, and you are
           welcome to change it and/or distribute copies of it under certain conditions.
           Type "show copying" to see the conditions.
           There is absolutely no warranty for GDB.  Type "show warranty" for details.
           This GDB was configured as "i386-redhat-linux"...
           (gdb) disas main
           Dump of assembler code for function main:
           0x804842c <main>:       push   %ebp
           0x804842d <main+1>:     mov    %esp,%ebp
           0x804842f <main+3>:     cmpl   $0x2,0x8(%ebp)
           0x8048433 <main+7>:     jne    0x8048448 <main+28>
           0x8048435 <main+9>:     mov    0xc(%ebp),%eax
           0x8048438 <main+12>:    add    $0x4,%eax
           0x804843b <main+15>:    mov    (%eax),%edx
           0x804843d <main+17>:    push   %edx
           0x804843e <main+18>:    call   0x8048400 <vulFunc>
           0x8048443 <main+23>:    add    $0x4,%esp
           0x8048446 <main+26>:    jmp    0x804845b <main+47>
           0x8048448 <main+28>:    mov    0xc(%ebp),%eax
           0x804844b <main+31>:    mov    (%eax),%edx
           0x804844d <main+33>:    push   %edx
           0x804844e <main+34>:    push   $0x80484bb
           0x8048453 <main+39>:    call   0x8048330 <printf>
           0x8048458 <main+44>:    add    $0x8,%esp
           0x804845b <main+47>:    leave
           0x804845c <main+48>:    ret
           0x804845d <main+49>:    nop
           0x804845e <main+50>:    nop
           0x804845f <main+51>:    nop
           End of assembler dump.
           (gdb) disas vulFunc
           Dump of assembler code for function vulFunc:
           0x8048400 <vulFunc>:    push   %ebp
           0x8048401 <vulFunc+1>:  mov    %esp,%ebp
           0x8048403 <vulFunc+3>:  sub    $0xc,%esp
           0x8048406 <vulFunc+6>:  mov    0x8(%ebp),%eax
           0x8048409 <vulFunc+9>:  push   %eax
           0x804840a <vulFunc+10>: lea    0xfffffff4(%ebp),%eax
           0x804840d <vulFunc+13>: push   %eax
           0x804840e <vulFunc+14>: call   0x8048340 <strcpy>
           0x8048413 <vulFunc+19>: add    $0x8,%esp
           0x8048416 <vulFunc+22>: lea    0xfffffff4(%ebp),%eax
           0x8048419 <vulFunc+25>: push   %eax
           0x804841a <vulFunc+26>: push   $0x80484b0
           0x804841f <vulFunc+31>: call   0x8048330 <printf>
           0x8048424 <vulFunc+36>: add    $0x8,%esp
           0x8048427 <vulFunc+39>: leave
           0x8048428 <vulFunc+40>: ret
           0x8048429 <vulFunc+41>: lea    0x0(%esi),%esi
           End of assembler dump.


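           Only main and vulFunc are of interest here, so those are the only
           two functions we disassemble.

      ii)  Running the process and examining it in memory
           We use gdb to trace how the process runs in memory.

           First, load the program.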
           bash$ gdb p
           GNU gdb 19991004
           Copyright 1998 Free Software Foundation, Inc.
           GDB is free software, covered by the GNU General Public License, and you are
           welcome to change it and/or distribute copies of it under certain conditions.
           Type "show copying" to see the conditions.
           There is absolutely no warranty for GDB.  Type "show warranty" for details.
           This GDB was configured as "i386-redhat-linux"...
           (gdb)

           Set a breakpoint on the first executable instruction of main
           (gdb) b *0x804842c
           Breakpoint 1 at 0x804842c

           Run the program
           (gdb) r AAAAAAAA
           Starting program: /home/vcat/p AAAAAAAA

           Breakpoint 1, 0x804842c in main ()

           Execution stops at the breakpoint.
           Look at the register values at this point
           (gdb) i reg
           eax            0x4010b3f8       1074836472
           ecx            0x804842c        134513708
           edx            0x4010d098       1074843800
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6bc       -1073744196
           ebp            0xbffff6d8       -1073744168
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x804842c        134513708
           eflags         0x246    582
           cs             0x23     35
           ss             0x2b     43
           ds             0x2b     43
           es             0x2b     43
           fs             0x0      0
           gs             0x0      0
           cwd            0xffff037f       -64641
           swd            0xffff0000       -65536
           twd            0xffffffff       -1
           fip            0x40034d70       1073958256
           fcs            0x35d0023        56426531
           fopo           0xbfffe400       -1073748992
           fos            0xffff002b       -65493

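           What we care about here are the stack base (ebp), the stack top
           (esp) and the instruction pointer (eip).
           At this point ebp is 0xbffff6d8 and esp is 0xbffff6bc, 28 bytes
           apart. eip is 0x804842c, exactly where we set the breakpoint.
           (Note: these values may differ from one system environment to
           another.)

           What does the current stack frame hold?
           (gdb) x/8x $esp
           0xbffff6bc:     0x400349cb      0x00000002      0xbffff704      0xbffff710
           0xbffff6cc:     0x40013868      0x00000002      0x08048350      0x00000000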

           So, just after main is called, the relevant part of the process
           image looks like this:

                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp (still the value from before the call to main)
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704|
                              +--------+
                              |00000002|
                              +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp (on entry to main)
                              | ...... |
                   (low addresses)


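           What do the next instructions do?
           0x804842c <main>:       push   %ebp         ; esp = esp - 4 (ebp is 32 bits wide);
                                                       ; store ebp in the 32-bit word esp points to
                                                       ; (this saves the caller's stack base).

           0x804842d <main+1>:     mov    %esp,%ebp    ; ebp = esp (the old stack top becomes
                                                       ; the new stack base).


           Execute these two instructions, then look at the registers and the stack.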
           (gdb) si
           0x804842d in main ()
           (gdb) si
           0x804842f in main ()
           (gdb) i reg
           eax            0x4010b3f8       1074836472
           ecx            0x804842c        134513708
           edx            0x4010d098       1074843800
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6b8       -1073744200
           ebp            0xbffff6b8       -1073744200
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x804842f        134513711
           eflags         0x346    838
           cs             0x23     35
           ss             0x2b     43
           ds             0x2b     43
           es             0x2b     43
           fs             0x0      0
           gs             0x0      0
           cwd            0xffff037f       -64641
           swd            0xffff0000       -65536
           twd            0xffffffff       -1
           fip            0x40034d70       1073958256
           fcs            0x35d0023        56426531
           fopo           0xbfffe400       -1073748992
           fos            0xffff002b       -65493
           (gdb) x/9x $esp
           0xbffff6b8:     0xbffff6d8      0x400349cb      0x00000002      0xbffff704
           0xbffff6c8:     0xbffff710      0x40013868      0x00000002      0x08048350
           0xbffff6d8:     0x00000000

           The relevant part of the process image is now:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before the call to main
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704|
                              +--------+
                              |00000002|
                              +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp on entry to main
                              |bffff6d8|
                   0xbffff6b8 +--------+ <-- ebp, esp
                              | ...... |
                   (low addresses)

           The next two instructions:
           0x804842f <main+3>:     cmpl   $0x2,0x8(%ebp)       ; compare 2 with the 32-bit (4-byte)
                                                               ; value stored at ebp+8.
           0x8048433 <main+7>:     jne    0x8048448 <main+28>  ; if they differ, jump to 0x08048448;
                                                               ; otherwise fall through.
           These are the assembly equivalent of the C statement
           if(argc == 2)
           {
              ...
           }
           else
           {
              ...
           }
           The value of argc is stored at address ebp+8.
           (gdb) x/x $ebp+8
           0xbffff6c0:     0x00000002

           Now the instructions leading up to the call to vulFunc:
           0x8048435 <main+9>:     mov    0xc(%ebp),%eax  ; load the 4 bytes at ebp+12 into eax.
           0x8048438 <main+12>:    add    $0x4,%eax       ; eax = eax + 4.
           0x804843b <main+15>:    mov    (%eax),%edx     ; load the 4 bytes eax points to into edx.
           0x804843d <main+17>:    push   %edx            ; esp = esp - 4; store edx in the
                                                          ; 4 bytes esp points to.

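           What is stored at ebp+12?
           (gdb) x/x $ebp+12
           0xbffff6c4:     0xbffff704
           Presumably this is argv, i.e. the address of the pointer to the
           argv[0] string. Let's check:
           (gdb) x/x 0xbffff704
           0xbffff704:     0xbffff83e
           (gdb) x/1s 0xbffff83e
           0xbffff83e:      "/home/vcat/p"
           It is. So the value stored at $ebp+12 (the address of the argv[0]
           pointer) plus four should be the address of the argv[1] pointer.
           (gdb) x/x 0xbffff704+4
           0xbffff708:     0xbffff856
           (gdb) x/1s 0xbffff856
           0xbffff856:      "AAAAAAAA"

           So these four instructions fetch argv[1] (the start address of the
           string "AAAAAAAA" we typed) and push it on the stack as the
           argument for vulFunc, which is about to be called.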

           Set a breakpoint at 0x804843e and let the program run up to the call to vulFunc.
           (gdb) b *0x804843e
           Breakpoint 2 at 0x804843e
           (gdb) c
           Continuing.

           (gdb) i reg
           eax            0xbffff708       -1073744120
           ecx            0x804842c        134513708
           edx            0xbffff856       -1073743786
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6b4       -1073744204
           ebp            0xbffff6b8       -1073744200
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x804843e        134513726
           eflags         0x282    642
           (remaining registers omitted)
           ...

           (gdb) x/10x $esp
           0xbffff6b4:     0xbffff856      0xbffff6d8      0x400349cb      0x00000002
           0xbffff6c4:     0xbffff704      0xbffff710      0x40013868      0x00000002
           0xbffff6d4:     0x08048350      0x00000000

           The relevant part of the process image is now:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before the call to main
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| argv (address of the argv[0] pointer)
                   0xbffff6c4 +--------+
                              |00000002| argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp on entry to main
                              |bffff6d8| saved ebp from before the call to main
                   0xbffff6b8 +--------+ <-- main's ebp
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+ <-- main's esp
                              | ...... |
                   (low addresses)

           Single-step
           (gdb) si
           0x8048400 in vulFunc ()

           Good, we are now inside vulFunc.
           (gdb) i reg
           eax            0xbffff708       -1073744120
           ecx            0x804842c        134513708
           edx            0xbffff856       -1073743786
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6b0       -1073744208
           ebp            0xbffff6b8       -1073744200
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x8048400        134513664
           eflags         0x382    898
           (remaining registers omitted)
           ...

           esp is now 0xbffff6b0, four bytes below its previous value 0xbffff6b4.
           Let's see what was pushed.
           (gdb) x/11x $esp
           0xbffff6b0:     0x08048443      0xbffff856      0xbffff6d8      0x400349cb
           0xbffff6c0:     0x00000002      0xbffff704      0xbffff710      0x40013868
           0xbffff6d0:     0x00000002      0x08048350      0x00000000

           It is the address of the instruction that follows the call to
           vulFunc in main, i.e. vulFunc's return address.
           This is our first point of focus.

           ...
           0x804843e <main+18>:    call   0x8048400 <vulFunc>
           0x8048443 <main+23>:    add    $0x4,%esp
           ...

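           Let's go on to vulFunc.
           0x8048400 <vulFunc>:    push   %ebp
           0x8048401 <vulFunc+1>:  mov    %esp,%ebp
           0x8048403 <vulFunc+3>:  sub    $0xc,%esp    ; esp = esp - 12; the frame grows by 12 bytes.

           The first two instructions do the same job as in main: they save
           the caller's stack base (ebp) and set up the callee's stack base.
           That is, the caller's stack base is saved, and the caller's stack
           top becomes the callee's stack base. Note that while the current
           (callee's) frame is still empty, ebp and esp are equal.


           The third instruction allocates 0xc (twelve) bytes in the frame.
           Note that their contents are garbage.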
           (gdb) si
           0x8048401 in vulFunc ()
           (gdb) si
           0x8048403 in vulFunc ()
           (gdb) si
           0x8048406 in vulFunc ()

           (gdb) x/15x $esp
           0xbffff6a0:     0x4000ae60      0xbffff704      0xbffff6b8      0xbffff6b8
           0xbffff6b0:     0x08048443      0xbffff856      0xbffff6d8      0x400349cb
           0xbffff6c0:     0x00000002      0xbffff704      0xbffff710      0x40013868
           0xbffff6d0:     0x00000002      0x08048350      0x00000000

           The relevant part of the process image is now:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before the call to main
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| argv (address of the argv[0] pointer)
                   0xbffff6c4 +--------+
                              |00000002| argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp on entry to main
                              |bffff6d8| saved ebp from before the call to main
                   0xbffff6b8 +--------+ <-- main's ebp
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| vulFunc's return address
                   0xbffff6b0 +--------+ <-- esp on entry to vulFunc
                              |bffff6b8| saved ebp of main
                   0xbffff6ac +--------+ <-- vulFunc's ebp
                              |bffff6b8| (garbage)
                   0xbffff6a8 +--------+
                              |bffff704| (garbage)
                   0xbffff6a4 +--------+
                              |4000ae60| (garbage)
                   0xbffff6a0 +--------+ <-- vulFunc's current esp
                              | ...... |
                   (low addresses)

           Now the next four instructions.
           0x8048406 <vulFunc+6>:  mov    0x8(%ebp),%eax    ; load the 4 bytes at ebp+8 into eax.

           From the figure above, the 4 bytes at ebp+8 in vulFunc's frame
           hold the start address of the string "AAAAAAAA".

           0x8048409 <vulFunc+9>:  push   %eax              ; push eax.
           This pushes the start address of the string "AAAAAAAA".

           0x804840a <vulFunc+10>: lea    0xfffffff4(%ebp),%eax
           Wow, that looks scary! What does this instruction do? Let's take it slowly.
           It stores the address ebp+0xfffffff4 in eax.
           But where does ebp plus 0xfffffff4 point? That is what we need to work out.
           Treating 0xfffffff4 as a positive number would not make sense.
           The hexadecimal value 0xfffffff4 actually represents a negative
           number in two's complement. To find its magnitude, let's work it out.

             F    F    F    F    F    F    F    4
           +----+----+----+----+----+----+----+----+
           |1111|1111|1111|1111|1111|1111|1111|0100|
           +----+----+----+----+----+----+----+----+

           Invert the bits

             0    0    0    0    0    0    0    B
           +----+----+----+----+----+----+----+----+
           |0000|0000|0000|0000|0000|0000|0000|1011|
           +----+----+----+----+----+----+----+----+

           Add one

             0    0    0    0    0    0    0    C
           +----+----+----+----+----+----+----+----+
           |0000|0000|0000|0000|0000|0000|0000|1100|
           +----+----+----+----+----+----+----+----+

           That is, minus 0xc: ebp+0xfffffff4 is the same as ebp-0xc.

           So ebp+0xfffffff4 is the start address of the twelve bytes that
           the stack top now points to.

           0x804840d <vulFunc+13>: push   %eax
           The resulting address is then pushed.

           Let the program run up to the call to strcpy and take a look
           (gdb) b *0x804840e
           Breakpoint 3 at 0x804840e
           (gdb) c
           Continuing.

           Breakpoint 3, 0x804840e in vulFunc ()
           (gdb) x/17x $esp
           0xbffff698:     0xbffff6a0      0xbffff856      0x4000ae60      0xbffff704
           0xbffff6a8:     0xbffff6b8      0xbffff6b8      0x08048443      0xbffff856
           0xbffff6b8:     0xbffff6d8      0x400349cb      0x00000002      0xbffff704
           0xbffff6c8:     0xbffff710      0x40013868      0x00000002      0x08048350
           0xbffff6d8:     0x00000000

           The relevant part of the process image is now:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before the call to main
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| argv (address of the argv[0] pointer)
                   0xbffff6c4 +--------+
                              |00000002| argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp on entry to main
                              |bffff6d8| saved ebp from before the call to main
                   0xbffff6b8 +--------+ <-- main's ebp
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| vulFunc's return address
                   0xbffff6b0 +--------+ <-- esp on entry to vulFunc
                              |bffff6b8| saved ebp of main
                   0xbffff6ac +--------+ <-- vulFunc's ebp
                              |bffff6b8| (garbage)
                   0xbffff6a8 +--------+
                              |bffff704| (garbage)
                   0xbffff6a4 +--------+
                              |4000ae60| (garbage)
                   0xbffff6a0 +--------+
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| start address of the 12 bytes allocated in vulFunc's frame
                   0xbffff698 +--------+ <-- vulFunc's current esp
                              | ...... |
                   (low addresses)

           We are not interested in how strcpy itself runs, so set a breakpoint on the instruction after the call.
           (gdb) b *0x8048413
           Breakpoint 4 at 0x8048413
           (gdb) c
           Continuing.

           Breakpoint 4, 0x8048413 in vulFunc ()
           (gdb) x/17x $esp
           0xbffff698:     0xbffff6a0      0xbffff856      0x41414141      0x41414141
           0xbffff6a8:     0xbffff600      0xbffff6b8      0x08048443      0xbffff856
           0xbffff6b8:     0xbffff6d8      0x400349cb      0x00000002      0xbffff704
           0xbffff6c8:     0xbffff710      0x40013868      0x00000002      0x08048350
           0xbffff6d8:     0x00000000

           The relevant part of the process image is now:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before the call to main
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| argv (address of the argv[0] pointer)
                   0xbffff6c4 +--------+
                              |00000002| argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp on entry to main
                              |bffff6d8| saved ebp from before the call to main
                   0xbffff6b8 +--------+ <-- main's ebp
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| vulFunc's return address
                   0xbffff6b0 +--------+ <-- esp on entry to vulFunc
                              |bffff6b8| saved ebp of main
                   0xbffff6ac +--------+ <-- vulFunc's ebp
                              |bffff600|
                   0xbffff6a8 +--------+
                              |41414141|
                   0xbffff6a4 +--------+
                              |41414141|
                   0xbffff6a0 +--------+
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| start address of the 12 bytes allocated in vulFunc's frame
                   0xbffff698 +--------+ <-- vulFunc's current esp
                              | ...... |
                   (low addresses)

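           Notice that the twelve bytes allocated in vulFunc's frame have been
           filled, starting at the address passed to strcpy, with the eight
           'A' characters (hex 0x41) we typed.

           This is our second point of focus.

           Notice also that the four bytes at address 0xbffff6a8 have changed
           from the garbage value 0xbffff6b8 to 0xbffff600.

           The 00 in the low byte is the terminating zero byte of the string
           "AAAAAAAA" (x86 is little-endian, so the low byte is stored first).

           The conclusion: the twelve bytes allocated in vulFunc's frame
           belong to the local variable buf (the buffer).
           You may wonder why twelve bytes were allocated when the program
           declares buf with only ten. The reason is that this compiler
           allocates stack space in units of four bytes, so ten bytes (4+4+2)
           take three units: 3*4=12.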

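           If the string given on the command line is ten characters long (two
           more than before, exactly the size of the buffer declared in the
           program), the four bytes at 0xbffff6a8 become bf004141. With eleven
           characters they become 00414141, which exactly fills the space
           allocated to buf in the stack frame. So as long as the string is
           shorter than 12 characters, the program does not misbehave.

           Now consider a string of exactly twelve characters. The four bytes at
           0xbffff6a8 are filled with 41414141, and the low byte of the four bytes
           at 0xbffff6ac is overwritten with 00, so that word becomes bffff600.
           From the diagram above we know that this word holds the caller's saved
           ebp. In other words, once the string is twelve characters or longer,
           the caller's saved ebp gets overwritten.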
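           These cases can be checked without touching a real stack. The sketch
           below models the relevant 24 bytes as a byte array and replays the
           copy. The function name word_after_copy and the helpers are ours; the
           initial values of the first eight bytes of buf are not known from the
           session, so they are assumed to be zero (they are overwritten anyway
           for every length discussed here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_ADDR   0xbffff6a0u  /* address of buf in the walkthrough */
#define FRAME_SIZE 24u          /* buf (12), saved ebp, return address, argument s */

/* Store a 32-bit value in little-endian byte order, as x86 does. */
static void store32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load a 32-bit little-endian value. */
static uint32_t load32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Simulate strcpy(buf, s) for a string of n 'A' characters and return the
 * 32-bit word found at stack address addr afterwards.
 */
uint32_t word_after_copy(size_t n, uint32_t addr)
{
    uint8_t frame[FRAME_SIZE];

    assert(n < FRAME_SIZE);                    /* n characters plus the NUL must fit */
    assert(addr >= BUF_ADDR && addr % 4 == 0);
    assert(addr - BUF_ADDR <= FRAME_SIZE - 4);

    /* Frame contents before the copy, lowest address first. */
    store32(frame + 0,  0x00000000u);  /* buf[0..3]: assumed, real value unknown */
    store32(frame + 4,  0x00000000u);  /* buf[4..7]: assumed, real value unknown */
    store32(frame + 8,  0xbffff6b8u);  /* buf[8..11]: garbage seen in the session */
    store32(frame + 12, 0xbffff6b8u);  /* saved ebp of main */
    store32(frame + 16, 0x08048443u);  /* return address of vulFunc */
    store32(frame + 20, 0xbffff856u);  /* argument s */

    memset(frame, 'A', n);             /* the n characters ... */
    frame[n] = 0;                      /* ... and the terminating NUL */

    return load32(frame + (addr - BUF_ADDR));
}
```

           For example, word_after_copy(10, 0xbffff6a8) yields 0xbf004141 and
           word_after_copy(12, 0xbffff6ac) yields 0xbffff600, matching the values
           derived above.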

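           The process image also shows that overwriting vulFunc's return address
           completely takes a string of at least twenty bytes: 12 for buf, then 4
           for the saved ebp and 4 for the return address itself (12+8).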

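           Let us continue with the instructions that follow:
           0x8048413 <vulFunc+19>: add    $0x8,%esp  ; shrink the stack by 8 bytes, discarding two slots

           Before the call to strcpy, the addresses of s and buf were pushed in
           that order; this instruction discards both of them.

           So on Linux x86 (strictly speaking, in the machine code the compiler
           generates), the caller pushes the arguments for the callee from right
           to left. For strcpy(buf, s), s is pushed first and buf second. Removing
           the arguments from the stack is also the caller's job.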
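           The push order and the caller-side cleanup can be modelled in a few
           lines. This is only an illustrative sketch: the Stack type and the
           functions push, peek, call_strcpy_cdecl and discard_args are
           hypothetical names of ours, and the addresses are the ones from the
           gdb session above.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0xbffff6a0u  /* esp just before the arguments are pushed */
#define STACK_WORDS 4u           /* room for a few pushes in this sketch */

typedef struct {
    uint32_t esp;                /* simulated stack pointer */
    uint32_t slot[STACK_WORDS];  /* slot[i] models address STACK_TOP - 4*(i+1) */
} Stack;

/* push: decrement esp by 4, then store the value at the new top. */
void push(Stack *st, uint32_t value)
{
    assert(st->esp > STACK_TOP - 4u * STACK_WORDS);
    st->esp -= 4;
    st->slot[(STACK_TOP - st->esp) / 4 - 1] = value;
}

/* Read the word stored at a live stack address below STACK_TOP. */
uint32_t peek(const Stack *st, uint32_t addr)
{
    assert(addr < STACK_TOP && addr >= st->esp && addr % 4 == 0);
    return st->slot[(STACK_TOP - addr) / 4 - 1];
}

/* Model the caller's side of strcpy(buf, s): arguments go right to left. */
void call_strcpy_cdecl(Stack *st, uint32_t buf, uint32_t s)
{
    push(st, s);    /* rightmost argument first */
    push(st, buf);  /* leftmost argument last, so it ends up on top */
    /* the call instruction itself is not modelled */
}

/* add $0x8,%esp: the caller discards the two argument slots. */
void discard_args(Stack *st)
{
    st->esp += 8;
}
```

           Starting from esp = 0xbffff6a0, the two pushes leave buf's address at
           0xbffff698 and s at 0xbffff69c, as in the diagram, and discard_args
           brings esp back to 0xbffff6a0.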

           0x8048416 <vulFunc+22>: lea    0xfffffff4(%ebp),%eax
           0x8048419 <vulFunc+25>: push   %eax
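           As before, these two instructions compute ebp-12, the start address of
           buf (0xbffff6a0), and push it. buf now holds the copy of argv[1], the
           string "AAAAAAAA".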


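           0x804841a <vulFunc+26>: push   $0x80484b0
           Let us check what is stored at 0x80484b0, although it is obviously the
           address of the first argument of the printf call that follows.
           (gdb) x/1s 0x80484b0
           0x80484b0 <_IO_stdin_used+4>:    "String=%s\n"
           Indeed it is.

           The next two instructions call printf and then discard its two
           arguments from the stack.
           0x804841f <vulFunc+31>: call   0x8048330 <printf>
           0x8048424 <vulFunc+36>: add    $0x8,%esp

           We set a breakpoint on the leave instruction at 0x08048427 and continue.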
           (gdb) b *0x8048427
           Breakpoint 5 at 0x8048427
           (gdb) c
           Continuing.
           String=AAAAAAAA

           Breakpoint 5, 0x8048427 in vulFunc ()
           The screen shows "String=AAAAAAAA".

           The stack frame now contains:
           (gdb) x/15x $esp
           0xbffff6a0:     0x41414141      0x41414141      0xbffff600      0xbffff6b8
           0xbffff6b0:     0x08048443      0xbffff856      0xbffff6d8      0x400349cb
           0xbffff6c0:     0x00000002      0xbffff704      0xbffff710      0x40013868
           0xbffff6d0:     0x00000002      0x08048350      0x00000000

           The relevant part of the process image in memory is:
                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before main was called
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| address of argv (i.e. address of argv[0])
                   0xbffff6c4 +--------+
                              |00000002| value of argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp before main was called
                              |bffff6d8| ebp before main was called
                   0xbffff6b8 +--------+ <-- ebp of main
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| return address of vulFunc
                   0xbffff6b0 +--------+ <-- esp before vulFunc was called
                              |bffff6b8| ebp of main
                   0xbffff6ac +--------+ <-- ebp of vulFunc
                              |bffff600|
                   0xbffff6a8 +--------+
                              |41414141|
                   0xbffff6a4 +--------+
                              |41414141|
                   0xbffff6a0 +--------+ <-- current esp in vulFunc
                              |bffff856| (garbage) start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| (garbage) start of the twelve bytes allocated in vulFunc's frame
                   0xbffff698 +--------+
                              | ...... |
                   (low addresses)

           Register state:
           (gdb) i reg
           eax            0x10     16
           ecx            0x400    1024
           edx            0x4010a980       1074833792
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6a0       -1073744224
           ebp            0xbffff6ac       -1073744212
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x8048427        134513703
           eflags         0x296    662
           (rest omitted)
           ...

           Note that esp is now 0xbffff6a0 and ebp is 0xbffff6ac.
           Single-step the leave instruction and look at the registers again.
           (gdb) si
           0x8048428 in vulFunc ()
           (gdb) i reg
           eax            0x10     16
           ecx            0x400    1024
           edx            0x4010a980       1074833792
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6b0       -1073744208
           ebp            0xbffff6b8       -1073744200
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x8048428        134513704
           eflags         0x396    918
           (rest omitted)
           ...

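           esp is now 0xbffff6b0, i.e. the value ebp had before leave executed
           (0xbffff6ac) plus 4.
           ebp is now 0xbffff6b8. Where did this value come from? Looking at the
           process image, it is exactly the content of the memory that vulFunc's
           ebp pointed to, and popping it off the stack left esp at 0xbffff6b0.

           So the leave instruction is equivalent to the two instructions
           mov        %ebp,%esp
           pop        %ebp
           which do exactly the opposite of the two instructions executed on
           entry to the callee:
           push        %ebp
           mov        %esp,%ebp
           In other words, leave discards the callee's stack frame and restores
           the caller's.

           The relevant stack contents are now:
           (gdb) x/11x $esp
           0xbffff6b0:     0x08048443      0xbffff856      0xbffff6d8      0x400349cb
           0xbffff6c0:     0x00000002      0xbffff704      0xbffff710      0x40013868
           0xbffff6d0:     0x00000002      0x08048350      0x00000000

           The relevant part of the process image: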

                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before main was called
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| address of argv (i.e. address of argv[0])
                   0xbffff6c4 +--------+
                              |00000002| value of argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp before main was called
                              |bffff6d8| ebp before main was called
                   0xbffff6b8 +--------+ <-- ebp of main
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| return address of vulFunc
                   0xbffff6b0 +--------+ <-- current esp
                              |bffff6b8| (garbage) ebp of main
                   0xbffff6ac +--------+
                              |bffff600| (garbage)
                   0xbffff6a8 +--------+
                              |41414141| (garbage)
                   0xbffff6a4 +--------+
                              |41414141| (garbage)
                   0xbffff6a0 +--------+
                              |bffff856| (garbage) start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| (garbage) start of the twelve bytes allocated in vulFunc's frame
                   0xbffff698 +--------+
                              | ...... |
                   (low addresses)

           Now execute the next instruction, ret:
           (gdb) si
           0x8048443 in main ()
           (gdb) i reg
           eax            0x10     16
           ecx            0x400    1024
           edx            0x4010a980       1074833792
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6b4       -1073744204
           ebp            0xbffff6b8       -1073744200
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x8048443        134513731
           eflags         0x396    918
           (rest omitted)
           ...

           As we can see, 0x8048443 (the return address of the vulFunc call) was
           popped off the stack into eip.
           The call to vulFunc is now complete and execution continues in main.
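           The effect of leave and ret on esp, ebp and eip can be replayed on
           the values captured at breakpoint 5. The following is a small model,
           not real machine code: the Cpu type and the function names are ours,
           the memory array covers only 0xbffff6a0 to 0xbffff6bf, and the way eip
           advances past the leave instruction itself is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BASE  0xbffff6a0u   /* lowest simulated stack address */
#define MEM_WORDS 8u

typedef struct {
    uint32_t esp, ebp, eip;
    uint32_t mem[MEM_WORDS];    /* mem[i] models the word at MEM_BASE + 4*i */
} Cpu;

/* pop: read the word at esp, then increment esp by 4. */
static uint32_t pop(Cpu *c)
{
    assert(c->esp >= MEM_BASE && c->esp % 4 == 0);
    assert((c->esp - MEM_BASE) / 4 < MEM_WORDS);
    uint32_t v = c->mem[(c->esp - MEM_BASE) / 4];
    c->esp += 4;
    return v;
}

/* leave == mov %ebp,%esp ; pop %ebp */
void leave(Cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}

/* ret == pop the return address into eip */
void ret(Cpu *c)
{
    c->eip = pop(c);
}

/* State at breakpoint 5 (just before leave), taken from the gdb output. */
Cpu state_before_leave(void)
{
    Cpu c = {
        .esp = 0xbffff6a0u, .ebp = 0xbffff6acu, .eip = 0x08048427u,
        .mem = { 0x41414141u, 0x41414141u, 0xbffff600u, 0xbffff6b8u,
                 0x08048443u, 0xbffff856u, 0xbffff6d8u, 0x400349cbu }
    };
    return c;
}
```

           After leave the model has esp = 0xbffff6b0 and ebp = 0xbffff6b8; after
           ret it has eip = 0x08048443 and esp = 0xbffff6b4, the same values gdb
           reported.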

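           Note that if, as discussed above, we supply a string of twenty 'A'
           characters, which exactly overwrites the word at 0xbffff6b0, then the
           value popped into eip is 0x41414141 instead of 0x8048443. The program
           jumps to 0x41414141 to execute whatever is there; since 0x41414141 is
           not accessible to the current process, the result is a segmentation
           fault and the process stops.

           This is our third focal point.

           If we work out the offset correctly and overwrite the word at
           0xbffff6b0 with the entry address of code we have prepared, then the
           value popped into eip is that entry address, and the program goes on
           to execute our code.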
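           As a sketch of the offset arithmetic only, the function below lays
           out such a twenty-byte string: sixteen filler bytes to cover buf and
           the saved ebp, then the four bytes of an address in little-endian
           order. The name build_input and its parameters are hypothetical, the
           target address is whatever the caller supplies, and the offsets are
           the ones measured for this particular binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_BYTES  12u                      /* space reserved for buf */
#define SAVED_EBP   4u                      /* saved ebp sits right above buf */
#define RET_OFFSET (BUF_BYTES + SAVED_EBP)  /* 16: where the return address starts */
#define STRING_LEN (RET_OFFSET + 4u)        /* 20: the length derived in the text */

/*
 * Fill out[] with STRING_LEN bytes plus a terminating NUL: 16 filler bytes
 * followed by target in little-endian order. Returns the string length, or 0
 * if out is too small or target contains a zero byte (strcpy stops at the
 * first zero byte, so such an address cannot be copied in full).
 */
size_t build_input(char *out, size_t cap, uint32_t target)
{
    assert(out != NULL);
    if (cap < STRING_LEN + 1)
        return 0;
    for (int i = 0; i < 4; i++)
        if (((target >> (8 * i)) & 0xff) == 0)
            return 0;

    memset(out, 'A', RET_OFFSET);
    for (int i = 0; i < 4; i++)
        out[RET_OFFSET + i] = (char)((target >> (8 * i)) & 0xff);
    out[STRING_LEN] = '\0';
    return STRING_LEN;
}
```

           Note the zero-byte check: an address such as 0x08048400 cannot be
           delivered through strcpy this way, because the copy would end at its
           low byte.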

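           At this point the C function call mechanism is clear. The remaining
           instructions of main no longer matter for our analysis, but for the
           sake of completeness let us keep going.

           The stack now looks like this:
           (gdb) x/10x $esp
           0xbffff6b4:     0xbffff856      0xbffff6d8      0x400349cb      0x00000002
           0xbffff6c4:     0xbffff704      0xbffff710      0x40013868      0x00000002
           0xbffff6d4:     0x08048350      0x00000000

           The relevant part of the process image: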

                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before main was called
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| address of argv (i.e. address of argv[0])
                   0xbffff6c4 +--------+
                              |00000002| value of argc
                   0xbffff6c0 +--------+
                              |400349cb|
                   0xbffff6bc +--------+ <-- esp before main was called
                              |bffff6d8| ebp before main was called
                   0xbffff6b8 +--------+ <-- ebp of main
                              |bffff856| start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+ <-- current esp
                              |08048443| (garbage) return address of vulFunc
                   0xbffff6b0 +--------+
                              |bffff6b8| (garbage) ebp of main
                   0xbffff6ac +--------+
                              |bffff600| (garbage)
                   0xbffff6a8 +--------+
                              |41414141| (garbage)
                   0xbffff6a4 +--------+
                              |41414141| (garbage)
                   0xbffff6a0 +--------+
                              |bffff856| (garbage) start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| (garbage) start of the twelve bytes allocated in vulFunc's frame
                   0xbffff698 +--------+
                              | ...... |
                   (low addresses)


           Now let us see what the remaining instructions do.
           0x8048443 <main+23>:    add    $0x4,%esp            ; discard the argument pushed for the callee
           0x8048446 <main+26>:    jmp    0x804845b <main+47>  ; jump to 0x804845b
           0x8048448 <main+28>:    mov    0xc(%ebp),%eax       ; target of the jne at 0x8048433
                                                               ; (the argc != 2 case); load the
                                                               ; word at ebp+0xc into eax, which
                                                               ; we know holds the address of argv
           0x804844b <main+31>:    mov    (%eax),%edx          ; load the word eax points to into
                                                               ; edx; argv is an array, so this
                                                               ; word is argv[0]
           0x804844d <main+33>:    push   %edx                 ; push argv[0]; note that argv[0]
                                                               ; is itself an address
           0x804844e <main+34>:    push   $0x80484bb           ; push the constant 0x80484bb; these
                                                               ; two pushes set up the arguments
                                                               ; for printf
           0x8048453 <main+39>:    call   0x8048330 <printf>   ; call printf
           0x8048458 <main+44>:    add    $0x8,%esp            ; discard the printf arguments
           0x804845b <main+47>:    leave                       ; restore the stack frame of main's caller
           0x804845c <main+48>:    ret                         ; return to main's caller

           0x80484bb presumably points to the printf format string. Let us check:
           (gdb) x/1s 0x80484bb
           0x80484bb <_IO_stdin_used+15>:   "Usage: %s <A string>\n"
           Indeed. So the instructions from 0x8048448 to 0x8048458 are the
           assembly equivalent of the C statement
           printf("Usage: %s <A string>\n", argv[0]);

           We set a breakpoint at 0x804845b and continue.
           (gdb) b *0x804845b
           Breakpoint 6 at 0x804845b
           (gdb) c
           Continuing.

           Breakpoint 6, 0x804845b in main ()

           The next instruction is leave, which should restore the stack frame of
           main's caller.
           Single-step it and look at the registers and the stack.
           (gdb) si
           0x804845c in main ()
           (gdb) i reg
           eax            0x10     16
           ecx            0x400    1024
           edx            0x4010a980       1074833792
           ebx            0x4010c1ec       1074840044
           esp            0xbffff6bc       -1073744196
           ebp            0xbffff6d8       -1073744168
           esi            0x4000ae60       1073786464
           edi            0xbffff704       -1073744124
           eip            0x804845c        134513756
           eflags         0x386    902
           (rest omitted)
           ...

           (gdb) x/8x $esp
           0xbffff6bc:     0x400349cb      0x00000002      0xbffff704      0xbffff710
           0xbffff6cc:     0x40013868      0x00000002      0x08048350      0x00000000

           The next instruction is ret, and we know that the top of the stack
           holds main's return address (0x400349cb).


           The relevant part of the process image is now:

                   (high addresses)
                              | ...... |
                              +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before main was called
                              |08048350|
                              +--------+
                              |00000002|
                              +--------+
                              |40013868|
                              +--------+
                              |bffff710|
                              +--------+
                              |bffff704| address of argv (i.e. address of argv[0])
                   0xbffff6c4 +--------+
                              |00000002| value of argc
                   0xbffff6c0 +--------+
                              |400349cb| return address of main
                   0xbffff6bc +--------+ <-- current esp
                              |bffff6d8| (garbage) ebp before main was called
                   0xbffff6b8 +--------+
                              |bffff856| (garbage) start address of the string "AAAAAAAA"
                   0xbffff6b4 +--------+
                              |08048443| (garbage) return address of vulFunc
                   0xbffff6b0 +--------+
                              |bffff6b8| (garbage) ebp of main
                   0xbffff6ac +--------+
                              |bffff600| (garbage)
                   0xbffff6a8 +--------+
                              |41414141| (garbage)
                   0xbffff6a4 +--------+
                              |41414141| (garbage)
                   0xbffff6a0 +--------+
                              |bffff856| (garbage) start address of the string "AAAAAAAA"
                   0xbffff69c +--------+
                              |bffff6a0| (garbage) start of the twelve bytes allocated in vulFunc's frame
                   0xbffff698 +--------+
                              | ...... |
                   (low addresses)

           Single-step once more to return to the function that called main:
           (gdb) si
           0x400349cb in __libc_start_main (main=0x804842c <main>, argc=2, argv=0xbffff704, init=0x80482c0 <_init>,
               fini=0x804848c <_fini>, rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffff6fc)
               at ../sysdeps/generic/libc-start.c:92
           92      ../sysdeps/generic/libc-start.c: No such file or directory.


           So it was __libc_start_main that called our main. This differs a
           little from what the overview said, but it does not matter much for
           our purposes. If you want to read the source, look for the file
           ../sysdeps/generic/libc-start.c.
           (gdb) x/16x $esp
           0xbffff6c0:     0x00000002      0xbffff704      0xbffff710      0x40013868
           0xbffff6d0:     0x00000002      0x08048350      0x00000000      0x08048371
           0xbffff6e0:     0x0804842c      0x00000002      0xbffff704      0x080482c0
           0xbffff6f0:     0x0804848c      0x4000ae60      0xbffff6fc      0x40013e90

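           As shown above, stack_end=0xbffff6fc, i.e. the bottom of our process's
           stack is at 0xbffff6fc. Before __libc_start_main was called, the
           following seven arguments were pushed in this order:
           0xbffff6fc  -> bottom of the process stack
           0x4000ae60  -> entry address of _dl_fini
           0x0804848c  -> entry address of _fini
           0x080482c0  -> entry address of _init
           0xbffff704  -> argv, the address of the command-line argument pointers
           0x00000002  -> argc, the number of command-line arguments
           0x0804842c  -> entry address of our main

           It follows that the value 0x08048371 stored at address 0xbffff6dc is
           the return address of __libc_start_main.
           Let us see which function called __libc_start_main.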
           (gdb) disas 0x08048371
           Dump of assembler code for function _start:
           0x8048350 <_start>:     xor    %ebp,%ebp
           0x8048352 <_start+2>:   pop    %esi
           0x8048353 <_start+3>:   mov    %esp,%ecx
           0x8048355 <_start+5>:   and    $0xfffffff8,%esp
           0x8048358 <_start+8>:   push   %eax
           0x8048359 <_start+9>:   push   %esp
           0x804835a <_start+10>:  push   %edx
           0x804835b <_start+11>:  push   $0x804848c
           0x8048360 <_start+16>:  push   $0x80482c0
           0x8048365 <_start+21>:  push   %ecx
           0x8048366 <_start+22>:  push   %esi
           0x8048367 <_start+23>:  push   $0x804842c
           0x804836c <_start+28>:  call   0x8048320 <__libc_start_main>
           0x8048371 <_start+33>:  hlt
           0x8048372 <_start+34>:  nop
           0x8048373 <_start+35>:  nop
           (remaining nops omitted)
           End of assembler dump.

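           So it was _start that called __libc_start_main.
           How, after _start calls __libc_start_main, the _init and
           _dl_runtime_resolve functions are then used to call shared-library
           functions and our main before the program exits is far outside the
           subject of this article, so we will not pursue it here.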

           (gdb) x/1024x 0xbffff6f0
           0xbffff6f0:     0x0804848c      0x4000ae60      0xbffff6fc      0x40013e90
           0xbffff700:     0x00000002      0xbffff83e      0xbffff856      0x00000000
           0xbffff710:     0xbffff85f      0xbffff881      0xbffff88f      0xbffff89e
           0xbffff720:     0xbffff8c4      0xbffff8d2      0xbffff900      0xbffff91a
           0xbffff730:     0xbffff932      0xbffff94d      0xbffff9a8      0xbffff9df
           0xbffff740:     0xbffffaf3      0xbffffb06      0xbffffb11      0xbffffb31
           0xbffff750:     0xbffffb5a      0xbffffb68      0xbffffc72      0xbffffc7e
           0xbffff760:     0xbffffc8f      0xbffffca4      0xbffffcb4      0xbffffcbf
           0xbffff770:     0xbffffcd7      0xbffffcf5      0xbffffd0e      0xbffffd19
           0xbffff780:     0xbffffd23      0xbffffd6c      0xbffffd79      0xbffffda0
           0xbffff790:     0xbffffdb2      0xbffffdc1      0xbffffde6      0xbffffe08
           0xbffff7a0:     0xbffffe10      0xbfffffd3      0x00000000      0x00000003
           0xbffff7b0:     0x08048034      0x00000004      0x00000020      0x00000005
           0xbffff7c0:     0x00000006      0x00000006      0x00001000      0x00000007
           0xbffff7d0:     0x40000000      0x00000008      0x00000000      0x00000009
           0xbffff7e0:     0x08048350      0x0000000b      0x000001f5      0x0000000c
           0xbffff7f0:     0x000001f5      0x0000000d      0x00000004      0x0000000e
           0xbffff800:     0x00000004      0x00000010      0x008001bf      0x0000000f
           0xbffff810:     0xbffff839      0x00000000      0x00000000      0x00000000
           0xbffff820:     0x00000000      0x00000000      0x00000000      0x00000000
           0xbffff830:     0x00000000      0x00000000      0x38356900      0x682f0036
           0xbffff840:     0x2f656d6f      0x65776f74      0x74742f72      0x2f737775
           0xbffff850:     0x2f6c6469      0x41410070      0x41414141      0x4c004141
           0xbffff860:     0x4f535345      0x3d4e4550      0x73752f7c      0x69622f72
           ...
           (omitted)
           ...
           0xbfffffd0:     0x54003a35      0x75413d5a      0x61727473      0x2f61696c
           0xbfffffe0:     0x0057534e      0x6d6f682f      0x6f742f65      0x2f726577
           0xbffffff0:     0x77757474      0x64692f73      0x00702f6c      0x00000000
           0xc0000000:     Cannot access memory at address 0xc0000000

           We know that memory cell 0xbffff704 holds the address of argv[0], so
           0xbffff708 holds the address of argv[1], and 0xbffff700 holds the
           value of argc.

           What about 0xbffff710? It looks like a pointer to a string. Let us
           take a look.
           (gdb) x/1s 0xbffff85f
           0xbffff85f:      "LESSOPEN=|/usr/bin/lesspipe.sh %s"
           (gdb)
           0xbffff881:      "HISTSIZE=1000"
           ...

           Now the last one.
           (gdb) x/1s 0xbfffffd3
           0xbfffffd3:      "TZ=Australia/NSW"
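           Addresses from 0xc0000000 upward can no longer be legally accessed by
           the process.

           So these are all environment variable strings from the shell.

           This whole area starts at address 0xbffff839. Let us look again.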
           (gdb) x/1s 0xbffff839
           0xbffff839:      "i586"
           (gdb)
           0xbffff83e:      "/home/vcat/p"    ===> careful readers will notice that I
                                                   changed this one; allow me a little
                                                   privacy ;)
           (gdb)
           0xbffff856:      "AAAAAAAA"
           (gdb)
           0xbffff85f:      "LESSOPEN=|/usr/bin/lesspipe.sh %s"
           ...

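           We conclude: 0xbffff700 holds the value of argc; 0xbffff704 holds the
           address of argv[0]; 0xbffff708 holds the address of argv[1];
           0xbffff710--0xbffff7a4 hold the pointers to the start of each
           environment variable string; and starting at address 0xbffff839 come,
           in order, the platform string, the command-line strings, and the
           environment variable strings.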
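           The layout just described (argc, the argv pointers, a zero word, the
           environment pointers, another zero word) can be walked mechanically.
           The sketch below does so over a copy of the words rather than over
           live process memory. The StartupLayout type and the function
           parse_startup_block are hypothetical names of ours, and 32-bit words
           are assumed, as in this gdb session.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t argc;        /* first word of the block */
    size_t   argv_index;  /* index of argv[0]'s pointer in words[] */
    size_t   envp_index;  /* index of the first environment pointer */
    size_t   env_count;   /* number of environment pointers before the zero word */
} StartupLayout;

/*
 * Walk a copy of the startup block: argc, argc argv pointers, a zero word,
 * environment pointers, a zero word. Returns 0 on success, -1 if the n words
 * supplied do not follow that layout.
 */
int parse_startup_block(const uint32_t *words, size_t n, StartupLayout *out)
{
    assert(words != NULL && out != NULL);
    if (n < 1)
        return -1;

    out->argc = words[0];
    out->argv_index = 1;
    if (out->argc >= n - 1)          /* argv and its terminator must fit */
        return -1;

    size_t argv_end = 1 + (size_t)out->argc;  /* index of argv's zero word */
    if (words[argv_end] != 0)
        return -1;

    out->envp_index = argv_end + 1;
    size_t i = out->envp_index;
    while (i < n && words[i] != 0)
        i++;
    if (i == n)
        return -1;                   /* no terminating zero word found */

    out->env_count = i - out->envp_index;
    return 0;
}
```

           Fed the words starting at 0xbffff700 (2, 0xbffff83e, 0xbffff856, 0,
           0xbffff85f, ...), it reports envp_index 4, which corresponds to
           address 0xbffff700 + 4*4 = 0xbffff710.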

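           As for 0xbffff7a8--0xbffff838: the word at 0xbffff7a8 is the zero that
           ends the environment pointer array, and the (type, value) pairs after
           it match the layout of the ELF auxiliary vector that the kernel hands
           to a new program. For example, type 0x9 carries 0x08048350, the entry
           address of _start, and type 0xf carries 0xbffff839, the address of the
           "i586" string. Since this is not essential to the article, we leave it
           at that.

           With the analysis so far, let us put together the process image in
           memory: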

                   (high addresses)

                              | ...... | ... some regions we do not care about are omitted
                              +--------+
                              |00000000|\
                   0xbffffffc +--------+ \
                              | ...... |  \
                                           \
                              | ...... |    \
                   0xbffff844 +--------+   platform string (e.g. "i586"), command-line and environment strings
                              |2f656d6f|    /
                   0xbffff840 +--------+   /
                              |682f0036|  /
                   0xbffff83c +--------+ /
                              |38356900|/ --> starts at address 0xbffff839; bytes 69 35 38 36 = "i586"
                   0xbffff838 +--------+
                              | ...... |\
                                         contents discussed in the text above
                              | ...... |/
                   0xbffff7a8 +--------+
                              |bfffffd3|\
                   0xbffff7a4 +--------+ \
                              | ...... |  \
                                ......     environment variable pointers
                              | ...... |  /
                   0xbffff714 +--------+ /
                              |bffff85f|/
                   0xbffff710 +--------+
                              |00000000|
                   0xbffff70c +--------+
                              |bffff856| address of argv[1]
                   0xbffff708 +--------+
                              |bffff83e| address of argv[0]
                   0xbffff704 +--------+
                              |00000002| value of argc
                   0xbffff700 +--------+
                              |40013e90| ???? (eax as pushed by _start; related to _dl_starting_up)
                   0xbffff6fc +--------+ <-- bottom of the process stack
                              |bffff6fc| stack_end (bottom of the process stack)
                   0xbffff6f8 +--------+
                              |4000ae60| entry address of _dl_fini
                   0xbffff6f4 +--------+
                              |0804848c| entry address of _fini
                   0xbffff6f0 +--------+
                              |080482c0| entry address of _init
                   0xbffff6ec +--------+
                              |bffff704| address of the argv pointers
                   0xbffff6e8 +--------+
                              |00000002| value of argc
                   0xbffff6e4 +--------+
                              |0804842c| entry address of main
                   0xbffff6e0 +--------+
                              |08048371| return address of __libc_start_main (the hlt instruction); normally never reached
                   0xbffff6dc +--------+
                              |00000000|
                   0xbffff6d8 +--------+ <-- ebp before main was called
                              |08048350| entry address of _start
                              +--------+
                              |00000002| value of argc
                              +--------+
                              |40013868| ????
                              +--------+
                              |bffff710| address of the environment variable pointers
                              +--------+
                              |bffff704| address of the argv pointers (i.e. address of argv[0])
                   0xbffff6c4 +--------+
                              |00000002| value of argc
                   0xbffff6c0 +--------+
                              |400349cb| return address of main
                   0xbffff6bc +--------+ <-- current esp
                  
--

※ 来源:.哈工大紫丁香 bbs.hit.edu.cn [FROM: 218.108.200.218]