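This post summarizes the essentials of signals and how to handle them in real-world scenarios, drawing mainly on Advanced Programming in the UNIX Environment (3rd edition).
The summary starts from fork and child processes, because signals and child processes are closely tied together.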
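The wait system call
See sections 8.3 (fork) and 8.6 (wait) of the book.
After fork creates a child process, the parent must call wait or waitpid to collect the child's exit status once the child exits. Otherwise the child stays a zombie until the parent itself exits, at which point process 1 adopts the child and waits for it, which finally releases it.
The parent can block in wait directly, or carry on with other work and call wait from its SIGCHLD signal handler.
waitpid offers more than wait; the most important extra is that passing WNOHANG in the third argument (options) makes the call non-blocking.
According to section 10.5 of the book, wait is restarted automatically by default, but on Linux, man 2 wait describes EINTR as follows:
EINTR: WNOHANG was not set and an unblocked signal or a SIGCHLD was caught; see signal(7).
The section "Restarting system calls" further down covers when Linux actually restarts wait. Code that has to tolerate EINTR can retry in a loop: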
    while (wait(&status) < 0) {
        if (errno != EINTR) {
            status = -1;
            break;
        }
    }
WEXITSTATUS
The status filled in by wait and waitpid can be used to determine how the child terminated:
    #include <sstream>
    #include <string>
    #include <stdlib.h>
    #include <sys/wait.h>  // WIFEXITED, WEXITSTATUS, WIFSIGNALED, ...

    // Turn a wait status into a human-readable description.
    std::string waitResult2Str(int status) {
        std::stringstream ss;
        if (WIFEXITED(status)) {
            ss << "Child exited normally with status " << WEXITSTATUS(status) << std::endl;
        } else if (WIFSIGNALED(status)) {
            ss << "Child terminated by signal " << WTERMSIG(status);
            if (WCOREDUMP(status)) {
                ss << " and Core dump was produced";
            }
            ss << std::endl;
        } else {
            ss << "Child ended in an unexpected manner" << std::endl;
        }
        return ss.str();
    }
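In glibc, WEXITSTATUS is implemented as
    #define WEXITSTATUS(status) (((status) & 0xff00) >> 8)
For a child that exited normally the low byte of status is 0, so this amounts to dividing status by 256.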
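To make the divide-by-256 relationship concrete, here is a small sketch that forks a child exiting with a known code and returns the raw status. raw_status_of_exit is a hypothetical helper name, and the exact bit layout is an assumption about Linux/glibc; portable code should rely on the macros only.

```c
#include <assert.h>
#include <errno.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with exit_code and return the raw wait status,
 * or -1 if fork or waitpid fails. */
int raw_status_of_exit(int exit_code) {
    assert(exit_code >= 0 && exit_code <= 255);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(exit_code); /* child: exit immediately */

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1; /* retry only when interrupted */
    }
    return status;
}
```

With exit code 3 the raw status on Linux is 3 * 256, and WEXITSTATUS recovers the 3.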
signal vs sigaction
Quoting https://stackoverflow.com/questions/231912/what-is-the-difference-between-sigaction-and-signal:
Use sigaction() unless you've got very compelling reasons not to do so.
The signal() interface has antiquity (and hence availability) in its favour, and it is defined in the C standard. Nevertheless, it has a number of undesirable characteristics that sigaction() avoids - unless you use the flags explicitly added to sigaction() to allow it to faithfully simulate the old signal() behaviour.
The signal() function does not (necessarily) block other signals from arriving while the current handler is executing; sigaction() can block other signals until the current handler returns.
The signal() function (usually) resets the signal action back to SIG_DFL (default) for almost all signals. This means that the signal() handler must reinstall itself as its first action. It also opens up a window of vulnerability between the time when the signal is detected and the handler is reinstalled during which if a second instance of the signal arrives, the default behaviour (usually terminate, sometimes with prejudice - aka core dump) occurs.
The exact behaviour of signal()
varies between systems — and the standards permit those variations.
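Blocking signals while a handler runs
The biggest advantage of sigaction is that it can block other signals while the current signal handler is running. Take this example: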
    #include <stdio.h>
    #include <unistd.h>
    #include <signal.h>
    #include <string.h>

    void sigaction_handler(int signo) {
        static int count = 0;
        // printf is not async-signal-safe; it is used here only for the demo.
        printf("Handled signal %d with sigaction (%d times)\n", signo, ++count);
        while (1) sleep(1);  // never return, so the handler's signal mask stays in effect
    }

    int main() {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sigaction_handler;
        sigemptyset(&sa.sa_mask);
        // Block both signals while either handler is running.
        sigaddset(&sa.sa_mask, SIGUSR1);
        sigaddset(&sa.sa_mask, SIGUSR2);
        sigaction(SIGUSR1, &sa, NULL);
        sigaction(SIGUSR2, &sa, NULL);

        // signal(SIGUSR1, sigaction_handler);
        // signal(SIGUSR2, sigaction_handler);

        while (1) sleep(1);
        return 0;
    }
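sigaction lets sa_mask specify which signals to block while the handler runs. Sending signals in this order
    kill -s SIGUSR1 xxx
    kill -s SIGUSR1 xxx
    kill -s SIGUSR2 xxx
prints
    Handled signal 10 with sigaction (1 times)
If the handlers are installed with the commented-out signal calls instead, the output is
    Handled signal 10 with sigaction (1 times)
    Handled signal 12 with sigaction (2 times)
In other words, signal only blocks the signal being handled; it does not keep a different signal from entering the handler.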
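Restarting system calls
In addition, on Linux signal restarts interrupted system calls by default, whereas sigaction does not unless SA_RESTART is set. Take wait as an example: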
    #include <sys/wait.h>
    #include <signal.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>

    void sigaction_handler(int) {}

    int main() {
        pid_t pid;
        if ((pid = fork()) == 0) {
            sleep(100);
        } else {
            struct sigaction sa;
            memset(&sa, 0, sizeof(sa));
            sa.sa_handler = sigaction_handler;  // sa_flags stays 0: no SA_RESTART
            sigaction(SIGUSR1, &sa, NULL);

            // signal(SIGUSR1, sigaction_handler);

            printf("waiting\n");
            int status = wait(0);
            printf("wait end %d\n", status);
            perror("");
        }
    }
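After starting the program, run kill -s SIGUSR1 xxx; it prints
    waiting
    wait end -1
    Interrupted system call
With the commented-out signal call instead, only waiting is printed, which shows that the wait system call was restarted automatically.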
Standardization problems
The biggest problem with signal is that its behaviour is not standardized. The DESCRIPTION section of man signal describes what happens when a handler function is passed in:
If the disposition is set to a function, then first either the disposition is reset to SIG_DFL, or the signal is blocked (see Portability below), and then handler is called with argument signum. If invocation of the handler caused the signal to be blocked, then the signal is unblocked upon return from the handler.
The man page then goes into portability in detail, opening with this paragraph:
The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose.
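It goes on to explain that on the original UNIX systems, invoking a handler reset the signal's disposition to SIG_DFL and did not block the signal, which is equivalent to this sigaction setting:
    sa.sa_flags = SA_RESETHAND | SA_NODEFER;
To have the handler run every time the signal arrives, the handler has to call signal again itself. Because the signal is not blocked, another instance can arrive before that call, and for most signals the default action at that moment terminates the program.
BSD improved on this: invoking a handler does not reset the disposition to SIG_DFL, and the signal is blocked while the handler runs, but certain system calls are restarted automatically. This is equivalent to this sigaction setting:
    sa.sa_flags = SA_RESTART;
On Linux either behaviour is possible, depending on the glibc version and on feature test macros such as _BSD_SOURCE.
The tests in "Blocking signals while a handler runs" and "Restarting system calls" above suggest that by default signal on Linux follows the BSD semantics,
that is, it is equivalent to sigaction with sa.sa_flags = SA_RESTART;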
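Practice 1: replacing the program's core dump with a gstack-style stack trace
One precondition: some of the functions called affect the process's privileges, so the dump has to be produced from a forked child process.
My first version, written before consulting any references, was particularly ugly: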
    #include <errno.h>
    #include <unistd.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <string.h>
    #include <stdio.h>
    #include <syscall.h>
    #include <sstream>
    #include <iostream>

    void signalBatch(__sighandler_t handle) {
        signal(SIGSEGV, handle);
        signal(SIGFPE, handle);
        signal(SIGABRT, handle);
        signal(SIGILL, handle);
        signal(SIGQUIT, handle);
    }

    // Attach gdb to the calling process and print its backtrace.
    void executeStackTostr() {
        char command[256];
        snprintf(command, sizeof(command), "gdb --batch --quiet -ex 'bt full' -p %ld", syscall(SYS_gettid));

        FILE *pipe = popen(command, "r");
        if (!pipe) {
            fprintf(stderr, "popen() failed!\n");
            return;
        }

        char buffer[128];
        while (fgets(buffer, sizeof(buffer), pipe) != NULL) {
            printf("%s", buffer);
        }

        pclose(pipe);
    }

    bool g_disableCoreDump = false;

    void test(int signalval) {
        pid_t pid = fork();
        if (pid == 0) {
            executeStackTostr();

            // The child kills the parent: this is the synchronization problem.
            if (g_disableCoreDump) {
                kill(getppid(), SIGKILL);
            } else {
                kill(getppid(), signalval);
            }
            quick_exit(0);
        } else if (pid > 0) {
            signal(signalval, SIG_DFL);
            int status;
            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        } else {
            std::cout << "Fork failed" << std::endl;
        }
    }

    int main() {
        g_disableCoreDump = true;
        signalBatch(test);
        sleep(1);
        int *p = nullptr;
        *p = 0;
    }
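There are several problems here:
To avoid inter-process synchronization issues, the kill that the child performs has to move into the parent.
To avoid signal reentrancy, the first draft used an atomic variable as a lock (a mutex cannot be used here because it would deadlock); the final version uses sigaction.
The revised signalBatch, test and main: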
    #include <vector>

    void signalBatch(__sighandler_t handle) {
        std::vector<int> sigs{SIGSEGV, SIGFPE, SIGABRT, SIGILL, SIGQUIT};
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handle;
        sigemptyset(&sa.sa_mask);
        // Build the full mask first so that every handler blocks all five signals.
        for (auto sig : sigs) {
            sigaddset(&sa.sa_mask, sig);
        }
        for (auto sig : sigs) {
            sigaction(sig, &sa, NULL);
        }
    }

    // executeStackTostr and waitResult2Str are the functions shown earlier.
    void test(int signalval) {
        pid_t pid = fork();
        if (pid == 0) {
            // sleep(100);
            executeStackTostr();

            quick_exit(0);
        } else if (pid > 0) {
            int status;
            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
            std::cout << waitResult2Str(status) << std::endl;

            // The parent now terminates itself after the child has finished.
            if (g_disableCoreDump) {
                kill(getpid(), SIGKILL);
            } else {
                signal(signalval, SIG_DFL);
                kill(getpid(), signalval);
            }
        } else {
            std::cout << "Fork failed" << std::endl;
        }
    }

    int main() {
        g_disableCoreDump = true;
        signalBatch(test);
        // while (1) sleep(1);
        sleep(1);
        int *p = nullptr;
        *p = 0;
    }
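Uncomment the while (1) sleep(1) on the third line of main and the sleep(100) in test, and several signals can be sent by hand with kill to check whether they are blocked while the handler is running.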
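For comparison, the first-draft idea of guarding the handler with an atomic variable might have looked roughly like the sketch below. It assumes C11 <stdatomic.h>; guarded_handler, simulate_two_deliveries and the counter are hypothetical names, and the real fork-and-dump work is reduced to a counter increment. A mutex would not work in this role because a second signal arriving on the same thread would wait forever for a lock held by the code it interrupted.

```c
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>

/* Set by the first signal that enters the handler. It is never cleared,
 * because the process is about to terminate anyway. */
static atomic_flag in_handler = ATOMIC_FLAG_INIT;

/* Stand-in for the real work (fork + stack dump): counts how often it ran. */
static volatile sig_atomic_t dumps_started = 0;

/* Hypothetical crash handler: only the first entry does any work. */
void guarded_handler(int signo) {
    (void)signo;
    if (atomic_flag_test_and_set(&in_handler)) {
        return; /* a dump is already in progress, skip the re-entry */
    }
    dumps_started++;
}

/* Simulate two back-to-back deliveries by calling the handler directly.
 * Returns how many times the guarded body actually ran. */
int simulate_two_deliveries(void) {
    assert(dumps_started == 0); /* the sketch assumes a fresh process */
    guarded_handler(SIGSEGV);
    guarded_handler(SIGABRT);
    return (int)dumps_started;
}
```

The sigaction mask used in the final version achieves the same effect without any shared state, which is one reason to prefer it.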
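Practice 2: running a command under a file lock and redirecting its output
The problem with using popen to run flock -xn /tmp/lock -c 'xxx' is that no pid is written into /tmp/lock to help with debugging; the only way to see who holds the lock is lslocks or fuser.
So I rolled my own popen_with_cb, which takes a callback that sets up the file lock inside the child process. The justification for reinventing the wheel is a little thin, but it is a good opportunity to understand how things work.
Implementing system correctly
popen is essentially system plus output redirection, so once system is implemented, popen follows.
The book presents implementations of system in sections 8.13 and 10.18.
The simple version in 8.13 just forks and has the child call execl("/bin/sh", "sh", "-c", cmdstring, (char *)0).
Because the program being executed is /bin/sh, core dumps need not be considered, so when waitpid fails with anything other than EINTR, status can simply be set to -1.
execl replaces the current process image with the new program and does not return on success, so _exit(127) is reached only if the exec fails.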
    #include <sys/wait.h>
    #include <errno.h>
    #include <unistd.h>

    int system(const char *cmdstring) {
        pid_t pid;
        int status;

        if (cmdstring == NULL)
            return (1);

        if ((pid = fork()) < 0) {
            status = -1;
        } else if (pid == 0) {
            execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
            _exit(127);  // reached only if execl fails
        } else {
            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        }

        return status;
    }
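The version in 10.18 makes some updates: mainly, it blocks SIGCHLD and ignores SIGINT and SIGQUIT.
When a compiled a.out binary calls system("/bin/ed"), the processes are related as follows; /bin/sh itself ignores most signals.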
flowchart LR
    subgraph bg["Background process group"]
        A[Login shell]
    end
    subgraph fg["Foreground process group"]
        B[a.out] --fork/exec--> C["/bin/sh"]
        C --fork/exec--> D["/bin/ed"]
    end
    A -- fork/exec --> B
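SIGINT and SIGQUIT
These signals are sent to every process in the foreground process group, parent and child alike. So that /bin/ed, and not a.out, is the one that responds to SIGINT and SIGQUIT, system must ignore both signals in the caller.
SIGCHLD
SIGCHLD is sent only to the direct parent, so what is the point of blocking it?
See https://stackoverflow.com/questions/59212770/treating-signals-correctly-inside-system
The main purpose is to keep the caller's own SIGCHLD handler from firing and having its wait consume the child that the waitpid inside system was supposed to collect. For example: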
    #include <unistd.h>
    #include <signal.h>
    #include <sys/wait.h>
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    // A naive system() that does not block SIGCHLD.
    int system(const char *command) {
        int status;
        pid_t childPid = fork();
        if (childPid == 0) {
            execl("/bin/sh", "sh", "-c", command, (char *) NULL);
            _exit(127);
        } else if (childPid > 0) {
            sleep(1);  // give the child time to exit so that SIGCHLD arrives first
            if (waitpid(childPid, &status, 0) == -1)
                return -1;
            return status;
        }
        return -1;
    }

    void sigchld(int Sig) { wait(0); }

    int main() {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sigchld;
        sigaction(SIGCHLD, &sa, NULL);
        int status = system("true");
        printf("%d %s\n", status, strerror(errno));
    }
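The test prints -1 No child processes.
In other words, when SIGCHLD arrives it interrupts the parent's sleep, and the wait call in sigchld collects the child.
After the handler returns, waitpid finds no child left and fails with No child processes,
that is, ECHILD.
PS: unlike SIGINT and SIGQUIT, SIGCHLD may only be blocked, not ignored; otherwise the caller of system would miss the signal handling for its own children.
The book's implementation is shown below.
After ignoring and blocking the signals, it restores the original dispositions in the forked child right away, and restores the parent's dispositions once the child has finished and waitpid has returned.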
    #include <sys/wait.h>
    #include <errno.h>
    #include <signal.h>
    #include <unistd.h>

    int system(const char *cmdstring) {
        pid_t pid;
        int status;
        struct sigaction ignore, saveintr, savequit;
        sigset_t chldmask, savemask;

        if (cmdstring == NULL)
            return (1);

        // Ignore SIGINT and SIGQUIT in the caller, saving the old dispositions.
        ignore.sa_handler = SIG_IGN;
        sigemptyset(&ignore.sa_mask);
        ignore.sa_flags = 0;
        if (sigaction(SIGINT, &ignore, &saveintr) < 0)
            return (-1);
        if (sigaction(SIGQUIT, &ignore, &savequit) < 0)
            return (-1);

        // Block SIGCHLD, saving the old signal mask.
        sigemptyset(&chldmask);
        sigaddset(&chldmask, SIGCHLD);
        if (sigprocmask(SIG_BLOCK, &chldmask, &savemask) < 0)
            return (-1);

        if ((pid = fork()) < 0) {
            status = -1;
        } else if (pid == 0) {
            // Child: restore the previous signal actions and mask before exec.
            sigaction(SIGINT, &saveintr, NULL);
            sigaction(SIGQUIT, &savequit, NULL);
            sigprocmask(SIG_SETMASK, &savemask, NULL);

            execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
            _exit(127);
        } else {
            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        }

        // Parent: restore the previous signal actions and mask.
        if (sigaction(SIGINT, &saveintr, NULL) < 0)
            return (-1);
        if (sigaction(SIGQUIT, &savequit, NULL) < 0)
            return (-1);
        if (sigprocmask(SIG_SETMASK, &savemask, NULL) < 0)
            return (-1);

        return status;
    }
This uses the sigaction and sigprocmask APIs, which, unlike signal, can save the current signal settings so that they can be restored afterwards.
Implementing popen_with_cb
First, add a pipe for redirecting the output:
    int pipefd[2];
    if (pipe(pipefd) == -1) {
        perror("pipe");
        return -1;
    }
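The main changes are in the child and parent branches after fork: the child uses dup2 to redirect STDOUT_FILENO and STDERR_FILENO into the pipe, and the parent reads from it: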
    if (pid == 0) {
        sigaction(SIGINT, &saveintr, NULL);
        sigaction(SIGQUIT, &savequit, NULL);
        sigprocmask(SIG_SETMASK, &savemask, NULL);

        // Child: send stdout and stderr into the write end of the pipe.
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        dup2(pipefd[1], STDERR_FILENO);
        close(pipefd[1]);  // the duplicated descriptors keep the pipe open

        if (!prestartSubProcessFunc()) {
            // _exit avoids flushing stdio buffers inherited from the parent.
            _exit(EXIT_FAILURE);
        }

        execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
        _exit(EXIT_FAILURE);
    } else {
        // Parent: read until the child closes its end of the pipe.
        close(pipefd[1]);

        char buffer[128];
        ssize_t count;
        while ((count = read(pipefd[0], buffer, sizeof(buffer) - 1)) > 0) {
            buffer[count] = '\0';
            output += buffer;
        }

        close(pipefd[0]);

        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) {
                status = -1;
                break;
            }
        }
    }
The complete code:
    #include <sys/wait.h>
    #include <errno.h>
    #include <signal.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <string>
    #include <functional>
    #include <iostream>
    #include <sstream>
    #include <stdlib.h>

    std::string waitResult2Str(int status) {
        std::stringstream ss;
        if (WIFEXITED(status)) {
            ss << "Child exited normally with status " << WEXITSTATUS(status) << std::endl;
        } else if (WIFSIGNALED(status)) {
            ss << "Child terminated by signal " << WTERMSIG(status);
            if (WCOREDUMP(status)) {
                ss << " and Core dump was produced";
            }
            ss << std::endl;
        } else {
            ss << "Child ended in an unexpected manner" << std::endl;
        }
        return ss.str();
    }

    int popen_with_cb(const char *cmdstring, std::string &output, const std::function<bool()> &prestartSubProcessFunc) {
        // Check the argument before creating the pipe so that no descriptor leaks.
        if (cmdstring == NULL)
            return (1);

        int pipefd[2];
        if (pipe(pipefd) == -1) {
            perror("pipe");
            return -1;
        }

        pid_t pid;
        int status;
        struct sigaction ignore, saveintr, savequit;
        sigset_t chldmask, savemask;

        ignore.sa_handler = SIG_IGN;
        sigemptyset(&ignore.sa_mask);
        ignore.sa_flags = 0;
        if (sigaction(SIGINT, &ignore, &saveintr) < 0)
            return (-1);
        if (sigaction(SIGQUIT, &ignore, &savequit) < 0)
            return (-1);

        sigemptyset(&chldmask);
        sigaddset(&chldmask, SIGCHLD);
        if (sigprocmask(SIG_BLOCK, &chldmask, &savemask) < 0)
            return (-1);

        if ((pid = fork()) < 0) {
            close(pipefd[0]);
            close(pipefd[1]);
            status = -1;
        } else if (pid == 0) {
            sigaction(SIGINT, &saveintr, NULL);
            sigaction(SIGQUIT, &savequit, NULL);
            sigprocmask(SIG_SETMASK, &savemask, NULL);

            // Child: send stdout and stderr into the write end of the pipe.
            close(pipefd[0]);
            dup2(pipefd[1], STDOUT_FILENO);
            dup2(pipefd[1], STDERR_FILENO);
            close(pipefd[1]);

            if (!prestartSubProcessFunc()) {
                // _exit avoids flushing stdio buffers inherited from the parent.
                _exit(EXIT_FAILURE);
            }

            execl("/bin/sh", "sh", "-c", cmdstring, (char *)0);
            _exit(EXIT_FAILURE);
        } else {
            // Parent: read until the child closes its end of the pipe.
            close(pipefd[1]);

            char buffer[128];
            ssize_t count;
            while ((count = read(pipefd[0], buffer, sizeof(buffer) - 1)) > 0) {
                buffer[count] = '\0';
                output += buffer;
            }

            close(pipefd[0]);

            while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        }

        if (sigaction(SIGINT, &saveintr, NULL) < 0)
            return (-1);
        if (sigaction(SIGQUIT, &savequit, NULL) < 0)
            return (-1);
        if (sigprocmask(SIG_SETMASK, &savemask, NULL) < 0)
            return (-1);

        return status;
    }

    int main() {
        std::string output;
        int status = popen_with_cb("./coredump.out", output, []() {
            std::cout << "prestart" << std::endl;
            return true;
        });

        std::cout << "output: " << output << std::endl;
        std::cout << "output size: " << output.size() << std::endl;
        std::cout << "status: " << status << std::endl;
        std::cout << "status: " << (WEXITSTATUS(status) == EXIT_FAILURE ? "exec failed" : waitResult2Str(WEXITSTATUS(status))) << std::endl;
    }
When the ./coredump.out being run dumps core, the output is
    output: prestart
    Segmentation fault (core dumped)

    output size: 42
    status: 35584
    status: Child terminated by signal 11 and Core dump was produced
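/bin/sh's exit code
Testing shows that sh wraps the status of the child it forks in one more layer: when the command is killed by a signal, sh itself exits normally with code 128 + signal number (here 139 = 128 + 11), and a.out sees the wait status 139 * 256 = 35584. WEXITSTATUS therefore has to be applied twice to get at the actual core dump information. Note that 139 is a shell exit code rather than a real wait status: decoding it a second time yields signal 11 correctly, but the core-dump bit (0x80) is set for every signal-terminated command, whether or not a core file was written. The call chain and the exit path look like this: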
flowchart LR
    subgraph bg["Background process group"]
        A[Login shell]
    end
    subgraph fg["Foreground process group: call chain"]
        B[a.out] --fork/exec--> C["/bin/sh"]
        C --fork/exec--> D["./coredump"]
    end
    A -- fork/exec --> B
flowchart TB
    subgraph fgexit["Foreground process group: exit path"]
        B["./coredump"] --exit 139--> C["/bin/sh"]
        C --exit 35584--> D["./a.out"]
    end
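The double decoding can also be written to follow the shell convention directly instead of reusing the wait macros on the inner value. A small sketch, where signal_from_shell_status is a hypothetical helper and the 128 + N rule is assumed to hold for the shell in use:

```c
#include <assert.h>
#include <sys/wait.h>

/* Given the wait status of `sh -c cmd`, return the signal number that killed
 * cmd according to the 128 + N convention, 0 if cmd was not killed by a
 * signal, or -1 if sh itself did not exit normally. */
int signal_from_shell_status(int status) {
    if (!WIFEXITED(status)) return -1;
    int code = WEXITSTATUS(status); /* first layer: sh's own exit code */
    assert(code >= 0 && code <= 255);
    return code > 128 ? code - 128 : 0; /* second layer: 128 + N */
}
```

For the status 35584 from the run above this gives 11, without implying anything about whether a core file was produced.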
Adding the file lock
File locking is covered in section 14.3 of the book.
    #include <fcntl.h>
    #include <string.h>

    // Take an exclusive lock on ./lock and record our pid in the file.
    bool flockWritePid() {
        int fd = open("./lock", O_RDWR | O_CREAT, 0644);
        if (fd == -1) {
            std::cout << "open file failed" << std::endl;
            return false;
        }

        struct flock lock;
        lock.l_type = F_WRLCK;
        lock.l_whence = SEEK_SET;
        lock.l_start = 0;
        lock.l_len = 0;  // 0 means the whole file
        lock.l_pid = getpid();

        if (fcntl(fd, F_SETLK, &lock) == -1) {
            char buffer[16];
            memset(buffer, 0, sizeof(buffer));
            read(fd, buffer, sizeof(buffer) - 1);  // keep the terminating NUL
            std::cout << "lock failed, cur hold lock pid:" << buffer << std::endl;
            close(fd);
            return false;
        }

        std::cout << "lock success, update hold lock pid:" << getpid() << std::endl;
        std::string pidStr = std::to_string(getpid());
        write(fd, pidStr.c_str(), pidStr.size() + 1);

        sleep(10);

        // fd is deliberately left open: the lock has to survive the exec that follows.
        return true;
    }

    int main() {
        std::string output;
        int status = popen_with_cb("./coredump.out", output, flockWritePid);

        std::cout << "output: " << output << std::endl;
        std::cout << "output size: " << output.size() << std::endl;
        std::cout << "status: " << status << std::endl;
        std::cout << "status: " << (WEXITSTATUS(status) == EXIT_FAILURE ? "exec failed" : waitResult2Str(WEXITSTATUS(status))) << std::endl;
    }
Running it prints:
    output: lock success, update hold lock pid:503595
    Segmentation fault (core dumped)

    output size: 75
    status: Child terminated by signal 11 and Core dump was produced
Meanwhile, another process that fails to acquire the lock prints
    output: lock failed, cur hold lock pid:503595

    output size: 38
    status: exec failed
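As an aside, when the lock is an fcntl record lock like the one above, F_GETLK can report the holder's pid directly, without reading it back from the file. This is a different technique from the pid-in-file approach used in this post, and it applies to fcntl locks only. The sketch below is an illustration under those assumptions; lock_holder and holder_is_child_demo are made-up names, and the demo helper only exists to create a lock holder to query.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Describe a write lock covering the whole file. */
static void whole_file_write_lock(struct flock *fl) {
    memset(fl, 0, sizeof(*fl));
    fl->l_type = F_WRLCK;
    fl->l_whence = SEEK_SET;
    fl->l_start = 0;
    fl->l_len = 0; /* 0 means "to end of file" */
}

/* Return the pid of a process whose lock would conflict with a whole-file
 * write lock on path, 0 if there is none, or -1 on error. Not meant to be
 * called by a process that itself holds a lock on path: closing fd here
 * would drop that lock. */
pid_t lock_holder(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    struct flock fl;
    whole_file_write_lock(&fl);
    int rc = fcntl(fd, F_GETLK, &fl); /* fills in l_pid on conflict */
    close(fd);
    if (rc < 0) return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}

/* Demo helper: fork a child that takes the lock, then ask who holds it.
 * Returns 1 if F_GETLK names the child, 0 if not, -1 on setup failure. */
int holder_is_child_demo(const char *path) {
    int ready[2];
    if (pipe(ready) < 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        close(ready[0]);
        struct flock fl;
        whole_file_write_lock(&fl);
        int fd = open(path, O_RDWR | O_CREAT, 0644);
        char ok = (fd >= 0 && fcntl(fd, F_SETLK, &fl) == 0) ? 'y' : 'n';
        if (write(ready[1], &ok, 1) != 1) _exit(1);
        pause(); /* keep holding the lock until the parent kills us */
        _exit(0);
    }
    close(ready[1]);
    char ok = 'n';
    if (read(ready[0], &ok, 1) != 1) ok = 'n'; /* wait for the child's report */
    close(ready[0]);
    pid_t holder = (ok == 'y') ? lock_holder(path) : -1;
    kill(pid, SIGTERM);
    waitpid(pid, NULL, 0);
    return holder == pid;
}
```

Writing the pid into the file still has the advantage that it can be read with cat from a shell.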
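Addendum (25/8/12)
SIGCHLD
Everything above handles SIGCHLD actively: the child becomes a zombie and the parent then reaps it with wait (or waitpid).
There is also a lazy approach: if the parent calls signal(SIGCHLD, SIG_IGN)
or sets SA_NOCLDWAIT, the kernel reaps children automatically when they exit and they never become zombies.
Because signal dispositions are inherited from parent to child, a search of the code may not turn up any place that sets SIGCHLD to SIG_IGN.
Inspecting a process's signal dispositions
In that case, the status file under /proc shows directly whether a signal is set to SIG_IGN. For example: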
    #include <signal.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = SIG_IGN;  /* ignore SIGCHLD */
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;

        if (sigaction(SIGCHLD, &sa, NULL) == -1) {
            perror("sigaction");
            return 1;
        }

        sleep(1000);  /* stay alive so that /proc/<pid>/status can be inspected */
        return 0;
    }
Then look up the process id and print its status:
    ~ ps -ef|grep a.out
    root 375088 371970 0 23:24 pts/1 00:00:00 ./a.out
    ~ cat /proc/375088/status
    Name: a.out
    Umask: 0002
    State: S (sleeping)
    Tgid: 375088
    Ngid: 0
    Pid: 375088
    PPid: 371970
    TracerPid: 0
    Uid: 1002 1002 1002 1002
    Gid: 1003 1003 1003 1003
    FDSize: 256
    Groups: 1002 1003
    NStgid: 375088
    NSpid: 375088
    NSpgid: 375088
    NSsid: 371970
    VmPeak: 4280 kB
    VmSize: 4216 kB
    VmLck: 0 kB
    VmPin: 0 kB
    VmHWM: 640 kB
    VmRSS: 640 kB
    RssAnon: 68 kB
    RssFile: 572 kB
    RssShmem: 0 kB
    VmData: 48 kB
    VmStk: 132 kB
    VmExe: 4 kB
    VmLib: 1948 kB
    VmPTE: 52 kB
    VmSwap: 0 kB
    HugetlbPages: 0 kB
    CoreDumping: 0
    THP_enabled: 1
    Threads: 1
    SigQ: 0/508796
    SigPnd: 0000000000000000
    ShdPnd: 0000000000000000
    SigBlk: 0000000000000000
    SigIgn: 0000000000010000
    SigCgt: 0000000000000000
    CapInh: 00000000a82425fb
    CapPrm: 0000000000000000
    CapEff: 0000000000000000
    CapBnd: 00000000a82425fb
    CapAmb: 0000000000000000
    NoNewPrivs: 0
    Seccomp: 0
    Speculation_Store_Bypass: thread vulnerable
    Cpus_allowed: ffffffff
    Cpus_allowed_list: 0-31
    Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
    Mems_allowed_list: 0
    voluntary_ctxt_switches: 1
    nonvoluntary_ctxt_switches: 0
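SigIgn is 0000000000010000. This is hexadecimal; in binary it is 0001 0000 0000 0000 0000, so the 17th bit (counting from 1 at the least significant end) is set, and the signal table shows that signal 17 is SIGCHLD: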
    ~ kill -l
     1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
     6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
    11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
    16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
    21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
    26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
    31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
    38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
    43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
    48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
    53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
    58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
    63) SIGRTMAX-1  64) SIGRTMAX
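The bit arithmetic above is easy to get wrong by one, so a tiny helper can do the lookup. A sketch, with mask_has_signal as a hypothetical name; it takes the hex string exactly as it appears in /proc/<pid>/status and treats signal N as bit N - 1:

```c
#include <assert.h>
#include <stdlib.h>

/* Return 1 if signo is set in a hex signal mask string such as the SigIgn
 * value from /proc/<pid>/status, else 0. Signal N corresponds to bit N - 1. */
int mask_has_signal(const char *hex_mask, int signo) {
    assert(signo >= 1 && signo <= 64);
    unsigned long long mask = strtoull(hex_mask, NULL, 16);
    return (int)((mask >> (signo - 1)) & 1ULL);
}
```

For the SigIgn value shown above, signal 17 (SIGCHLD) is reported as set and signal 2 (SIGINT) is not.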
References
Reentrant functions
accept          fchmod        lseek               sendto        stat
access          fchown        lstat               setgid        symlink
aio_error       fcntl         mkdir               setpgid       sysconf
aio_return      fdatasync     mkfifo              setsid        tcdrain
aio_suspend     fork          open                setsockopt    tcflow
alarm           fpathconf     pathconf            setuid        tcflush
bind            fstat         pause               shutdown      tcgetattr
cfgetispeed     fsync         pipe                sigaction     tcgetpgrp
cfgetospeed     ftruncate     poll                sigaddset     tcsendbreak
cfsetispeed     getegid       posix_trace_event   sigdelset     tcsetattr
cfsetospeed     geteuid       pselect             sigemptyset   tcsetpgrp
chdir           getgid        raise               sigfillset    time
chmod           getgroups     read                sigismember   timer_getoverrun
chown           getpeername   readlink            signal        timer_gettime
clock_gettime   getpgrp       recv                sigpause      timer_settime
close           getpid        recvfrom            sigpending    times
connect         getppid       recvmsg             sigprocmask   umask
creat           getsockname   rename              sigqueue      uname
dup             getsockopt    rmdir               sigset        unlink
dup2            getuid        select              sigsuspend    utime
execle          kill          sem_post            sleep         wait
execve          link          send                socket        waitpid
_Exit & _exit   listen        sendmsg             socketpair    write