
Memory errors and _int_malloc errors - itarticl.cc

Published: 8 years ago. Category: Code. 139 views.

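1. After changing many files, running make and then the program kept reporting a segmentation fault, and debugging did not reveal the cause (under gdb the fault often occurred on entry to some ordinary function).

Fix: a full rebuild may solve the problem: make clean; make. The cause was not determined; one plausible explanation is stale object files left behind when the Makefile does not track header dependencies.

Environment: g++/gcc (GCC) 4.4.4 20100726 (Red Hat 4.4.4-13), CentOS 6.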

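2. The program suddenly exits, but does not exit when run under the debugger.

Fix: this may be caused by SIGPIPE. On Linux the default action for this signal is to terminate the process, whereas under gdb the signal appears to be ignored.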
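A minimal sketch of the usual workaround, assuming the program can cope with a failed write: ignore SIGPIPE once at startup, so that writing to a closed pipe or socket returns -1 with errno set to EPIPE instead of killing the process. The helper names ignore_sigpipe and write_to_closed_pipe are hypothetical and exist only for this illustration.

```cpp
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process.
// Returns true on success.
bool ignore_sigpipe() {
    return signal(SIGPIPE, SIG_IGN) != SIG_ERR;
}

// Hypothetical helper: write one byte to a pipe whose read end is closed.
// Returns the errno reported by write(), or 0 if the write succeeded.
int write_to_closed_pipe() {
    int fds[2];
    if (pipe(fds) != 0) {
        return errno;
    }
    close(fds[0]);  // no reader left: the next write raises SIGPIPE
    ssize_t n = write(fds[1], "x", 1);
    int err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

Calling ignore_sigpipe() at the top of main() and then write_to_closed_pipe() should report EPIPE; without the first call, the default action would terminate the process at the write.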

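3. The call stack suddenly disappears:

Cause: this is usually an out-of-bounds write to a local (stack) variable. For example, declaring a char[256] array and then writing more than 256 characters into it can overwrite the function's call stack.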
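To illustrate the failure mode above, here is a minimal sketch of a bounded write into a 256-byte stack buffer. snprintf never writes more than the size it is given, so an over-long input is truncated instead of spilling over the stack frame. The helper name bounded_copy is hypothetical.

```cpp
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: copy src into a fixed-size array without overrunning it.
// Returns true if src fit completely, false if it had to be truncated.
template <size_t N>
bool bounded_copy(char (&dst)[N], const char* src) {
    // snprintf writes at most N bytes including the terminating '\0'
    // and returns the length the full string would have needed.
    int needed = snprintf(dst, N, "%s", src);
    return needed >= 0 && static_cast<size_t>(needed) < N;
}
```

With a char[256] buffer, bounded_copy(buf, input) returns false for inputs of 256 characters or more, where strcpy would have written past the end of the array.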

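4. Build problems on CentOS 4:

4.1 The gcc shipped with CentOS 4 (3.4.6) does not support the -std=c++0x option.

4.2 In multithreaded code on CentOS 4, a mutex can only be initialized with "pthread_mutex_init(&mutex, NULL);". The macro "PTHREAD_MUTEX_INITIALIZER" cannot be used.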
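A minimal sketch of the runtime initialization mentioned in 4.2: the mutex is a class member, initialized with pthread_mutex_init in the constructor and released with pthread_mutex_destroy in the destructor. The class name LockedCounter is hypothetical and only serves the illustration.

```cpp
#include <pthread.h>
#include <stddef.h>

// Hypothetical example class: a counter protected by a pthread mutex.
class LockedCounter {
private:
    pthread_mutex_t mutex;
    int value;
    // Not copyable: using a copy of a pthread_mutex_t is undefined.
    LockedCounter(const LockedCounter&);
    LockedCounter& operator=(const LockedCounter&);
public:
    LockedCounter() : value(0) {
        // Runtime initialization with default attributes.
        pthread_mutex_init(&mutex, NULL);
    }
    ~LockedCounter() {
        pthread_mutex_destroy(&mutex);
    }
    // Increments the counter under the lock and returns the new value.
    int Increment() {
        pthread_mutex_lock(&mutex);
        int result = ++value;
        pthread_mutex_unlock(&mutex);
        return result;
    }
};
```

Tying the init and destroy calls to the object's lifetime keeps the two paired even when the owning object is created and destroyed many times.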

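5. Segmentation fault inside new:

This is the hardest problem to solve, because the memory was corrupted by an out-of-bounds write somewhere else and the crash only shows up at this new.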

Program received signal SIGSEGV, Segmentation fault.
0x00000037234787ee in _int_malloc () from /lib64/libc.so.6
(gdb) bt
#0 0x00000037234787ee in _int_malloc () from /lib64/libc.so.6
#1 0x0000003723479aed in malloc () from /lib64/libc.so.6
#2 0x00000037300bd0ed in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#3 0x000000000040db32 in ConnectionFarm::AddClient (this=0xae34e0, client_fd=6) at connection.cpp:312

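Line 312 of connection.cpp is a new expression:

Connection* client = new Connection(this);

Running into this is quite frightening. The overrun could come from memcpy, memset, strcpy, strcat, and so on.

There is also a glibc bug that can cause a deadlock here, so that the program hangs instead of exiting: https://sites.google.com/site/embeddedmonologue/home/posix-programming/self-deadlock-on-pthread_once

The stack looks like this: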

#0 0x00007f6711224d2b in pthread_once () from /lib64/libpthread.so.0
#1 0x00007f6711536b84 in backtrace () from /lib64/libc.so.6
#2 0x00007f67114a884b in __libc_message () from /lib64/libc.so.6
#3 0x00007f67114ae1c3 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f67114b214f in _int_malloc () from /lib64/libc.so.6
#5 0x00007f67114b2ce1 in malloc () from /lib64/libc.so.6
#6 0x00007f6713274c92 in local_strdup () from /lib64/ld-linux-x86-64.so.2
#7 0x00007f6713278654 in _dl_map_object () from /lib64/ld-linux-x86-64.so.2
#8 0x00007f6713282a44 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9 0x00007f671327e1b6 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#10 0x00007f67132824fa in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007f671155ed30 in do_dlopen () from /lib64/libc.so.6
#12 0x00007f671327e1b6 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#13 0x00007f671155ee87 in __libc_dlopen_mode () from /lib64/libc.so.6
#14 0x00007f6711536a55 in init () from /lib64/libc.so.6
#15 0x00007f6711224d33 in pthread_once () from /lib64/libpthread.so.0
#16 0x00007f6711536b84 in backtrace () from /lib64/libc.so.6
#17 0x00007f67114a884b in __libc_message () from /lib64/libc.so.6
#18 0x00007f67114ae1c3 in malloc_printerr () from /lib64/libc.so.6
#19 0x00007f67114b214f in _int_malloc () from /lib64/libc.so.6
#20 0x00007f67114b2ce1 in malloc () from /lib64/libc.so.6


The reason why a lock is involved is that pthread_once always needs a lock:

nptl/pthread_once.c

__pthread_once (once_control, init_routine)
     pthread_once_t *once_control;
     void (*init_routine) (void);
{
  /* XXX Depending on whether the LOCK_IN_ONCE_T is defined use a
     global lock variable or one which is part of the pthread_once_t
     object. */
  if (*once_control == PTHREAD_ONCE_INIT)
    {
      /* ====> pthread_once() needs this lock. Interestingly, it is a single
         lock shared by all pthread_once callers, which is arguable. */
      lll_lock (once_lock, LLL_PRIVATE);

      /* XXX This implementation is not complete. It doesn't take
         cancelation and fork into account. */
      if (*once_control == PTHREAD_ONCE_INIT)
        {
          init_routine ();

          *once_control = !PTHREAD_ONCE_INIT;
        }

      lll_unlock (once_lock, LLL_PRIVATE);
    }

  return 0;
}

It is interesting to note that the recursive lockup above is arguably a bug in glibc itself. It was reported as early as 2005 and was fixed inside malloc() (part of glibc) in 2015.
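In the backtrace above, pthread_once is re-entered from inside its own init routine: init() calls dlopen, which calls malloc, which detects the corruption and calls backtrace, which calls pthread_once again. A minimal model of why that hangs, assuming a plain non-recursive lock like once_lock: the sketch uses pthread_mutex_trylock, so the second acquisition reports EBUSY instead of blocking forever. The names once_lock_model, nested_result, run_once_model, harmless_init and reentrant_init are hypothetical, and this is not the glibc code.

```cpp
#include <errno.h>
#include <pthread.h>

// Hypothetical stand-in for the single global lock in the quoted pthread_once.
static pthread_mutex_t once_lock_model = PTHREAD_MUTEX_INITIALIZER;
// Result of the nested acquisition attempt, recorded for inspection.
static int nested_result = 0;

// Models pthread_once: take the global lock, run init_routine, release it.
// Returns 0 on success, or EBUSY where a blocking lock would never return.
int run_once_model(void (*init_routine)(void)) {
    int rc = pthread_mutex_trylock(&once_lock_model);
    if (rc != 0) {
        return rc;  // lock already held: the real code would block here
    }
    init_routine();
    pthread_mutex_unlock(&once_lock_model);
    return 0;
}

// An init routine that does not call back into the model.
static void harmless_init(void) {
}

// Models init() -> dlopen -> malloc -> malloc_printerr -> backtrace -> pthread_once.
static void reentrant_init(void) {
    nested_result = run_once_model(harmless_init);
}
```

Calling run_once_model(reentrant_init) leaves EBUSY in nested_result: the thread already owns the lock it is trying to take, which is the self-deadlock shown in frames #0 and #15.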


An expert from the game industry (游晶) said that tcmalloc can help with this problem: it reports an error at the point where a memcpy goes out of bounds.

游晶 12:47:11

http://www.google.com/#hl=zh-CN&q=tcmalloc+TCMALLOC_PAGE_FENCE&oq=tcmalloc+TCMALLOC_PAGE_FENCE&gs_l=serp.3...139786.139786.4.140233.1.1.0.0.0.0.134.134.0j1.1.0...0.0...1c.9XWRVXVVt3c&bav=on.2,or.r_gc.r_pw.&fp=3b4abcf478f33cbd&biw=1440&bih=763

http://www.cppblog.com/feixuwu/archive/2011/05/14/146395.html

It does not prevent out-of-bounds access; it catches the access when it happens, for example by using a guard page.

I tried it, and it did stop at the out-of-bounds access.

Download:

http://code.google.com/p/gperftools/downloads/detail?name=gperftools-2.0.tar.gz

http://download.csdn.net/download/winlinvip/4475430


wget http://gperftools.googlecode.com/files/gperftools-2.0.tar.gz


#!/bin/bash

# Unpack the source tree if it has not been extracted yet.
test -d "gperftools-2.0" || tar xf gperftools-2.0.tar.gz

# Print the manual patch that turns the page fence on by default.
echo "please modify the src/debugallocation.cc"
echo " DEFINE_bool(malloc_page_fence,"
echo " EnvToBool(\"TCMALLOC_PAGE_FENCE\", false),"
echo " \"Enables putting of memory allocations at page boundaries \""
echo " \"with a guard page following the allocation (to catch buffer \""
echo " \"overruns right when they happen).\");"
echo "to EnvToBool(\"TCMALLOC_PAGE_FENCE\", true) and link with -ltcmalloc_debug"

# Print the build and install steps.
echo ""
echo "build and install:"
echo "cd gperftools-2.0"
echo "./configure --enable-frame-pointers"
echo "make"
echo "sudo make install"

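Make the shared library visible to the dynamic loader with a symlink:

sudo ln -sf /usr/local/lib/libtcmalloc_debug.so.4 /lib64/libtcmalloc_debug.so.4

Add to the compile options:

-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free

Add to the link options:

-ltcmalloc_debug

Run the program under gdb and it stops right at the out-of-bounds access. Because the flag's default is read from the TCMALLOC_PAGE_FENCE environment variable, setting TCMALLOC_PAGE_FENCE=1 at run time should enable the same check without editing the source.

An example (a prototype) taken from a real project: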

/**
# to build:
g++ -g -O0 -c memcorrupt.cpp -o memcorrupt.o -fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free; g++ memcorrupt.o -o memcorrupt -ltcmalloc_debug ; ./memcorrupt
*/
#include <stdio.h>

class Connection;
class State
{
private:
    Connection* conn;
public:
    State(Connection* c) : conn(c){
    }
    virtual ~State(){
    }
    void action();
};

class Manager;
class Connection
{
private:
    State* state;
    Manager* manager;
public:
    Connection(){
        state = NULL;
    }
    virtual ~Connection(){
        if(state != NULL){
            delete state; // the Connection owns its State
            state = NULL;
        }
    }
public:
    void SetManager(Manager* m){
        manager = m;
    }
    Manager* GetManager(){
        return manager;
    }
    void SetState(State* s){
        state = s;
    }
};

class Manager
{
private:
    Connection* conn;
public:
    Manager(){
        conn = NULL;
    }
    virtual ~Manager(){
    }
public:
    void Destroy(){
        if(conn != NULL){
            delete conn; // also deletes the State owned by the Connection
            conn = NULL;
        }
    }
    Connection* GetConnection(){
        return conn;
    }
    void SetConnection(Connection* c){
        conn = c;
        conn->SetManager(this);
    }
};

void State::action(){
    if(conn == NULL){
        return;
    }

    conn->GetManager()->Destroy(); // deletes the Connection and, with it, this State
    this->conn = NULL; // BUG: writes to *this after it has been freed
}

int main(int /*argc*/, char** /*argv*/){
    Manager manager;
    Connection* connection = new Connection();
    State* state = new State(connection);

    connection->SetState(state);
    manager.SetConnection(connection);

    state->action();

    return 0;
}

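At first glance this code looks fine: State is a terminal state that asks the Manager to destroy the Connection object.

There is one invalid access, though. Manager::Destroy() destroys the Connection, and the Connection destroys its State. After Destroy() returns, the State's this pointer is no longer valid, so the following this->conn = NULL writes to freed memory. Sometimes this causes a problem and sometimes it does not, which is what makes it so dangerous.

With tcmalloc, gdb stops at the invalid access: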

0x000000000040079a in State::action (this=0x2aaaaaab0ff0) at memcorrupt.cpp:80
80 this->conn = NULL;
(gdb) bt
#0 0x000000000040079a in State::action (this=0x2aaaaaab0ff0) at memcorrupt.cpp:80
#1 0x0000000000400816 in main () at memcorrupt.cpp:91
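One possible fix for the example above, as a sketch rather than the project's actual change: State::action() finishes every access to its own members before calling Destroy() and touches nothing afterwards. The classes are trimmed copies of the ones in the example; HasConnection() is a hypothetical helper added only so the result can be checked.

```cpp
#include <stddef.h>

class Connection;

class Manager {
private:
    Connection* conn;
public:
    Manager() : conn(NULL) {}
    void SetConnection(Connection* c);
    void Destroy();
    // Hypothetical helper, not in the original example.
    bool HasConnection() const { return conn != NULL; }
};

class State {
private:
    Connection* conn;
public:
    explicit State(Connection* c) : conn(c) {}
    void action();
};

class Connection {
private:
    State* state;
    Manager* manager;
public:
    Connection() : state(NULL), manager(NULL) {}
    ~Connection() { delete state; }  // the Connection owns its State
    void SetManager(Manager* m) { manager = m; }
    Manager* GetManager() { return manager; }
    void SetState(State* s) { state = s; }
};

void Manager::SetConnection(Connection* c) {
    conn = c;
    conn->SetManager(this);
}

void Manager::Destroy() {
    delete conn;  // also deletes the State owned by the Connection
    conn = NULL;
}

void State::action() {
    if (conn == NULL) {
        return;
    }
    // Copy what is needed into a local while *this is still alive.
    Manager* manager = conn->GetManager();
    conn = NULL;         // last write to a member
    manager->Destroy();  // deletes the Connection and this State
    // No member of *this may be read or written past this point.
}
```

Driving it the same way as main() in the example (create the Manager, Connection and State, wire them together, call state->action()) leaves the Manager without a connection and performs no write after the free.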
