linux - Why do we need to send SIGHUP to a newly orphaned process group containing a stopped process?

The Advanced Programming in the UNIX Environment book ("APUE") says

Consider a process that forks a child and then terminates. Although this is nothing abnormal (it happens all the time), what happens if the child is stopped (using job control) when the parent terminates? How will the child ever be continued, and does the child know that it has been orphaned?

...

If the process group is not orphaned, there is a chance that one of those parents in a different process group but in the same session will restart a stopped process in the process group that is not orphaned.

...

Since the process group is orphaned when the parent terminates, and the process group contains a stopped process, POSIX.1 requires that every process in the newly orphaned process group be sent the hang-up signal (SIGHUP) followed by the continue signal (SIGCONT).

If the only concern is that a stopped process would never get a chance to be woken up after its process group becomes orphaned, why doesn't the kernel send just SIGCONT at that point? Why does it need to send SIGHUP too?

1 Answer

  1. Jack- Reply

    2019-11-14

    In fact, the concern is not only that a stopped process would never get a chance to be woken up; it is also that the process should be notified that it has been orphaned.

    Your extract of APUE says:

    and does the child know that it has been orphaned?

    It wouldn’t know that it has been orphaned if it only received a SIGCONT.

    In practice, also sending SIGHUP to the child is precisely the mechanism chosen to notify it of that event.

    Besides, this is consistent with what some shells (Bash, for instance) do with their children when the shell itself receives a SIGHUP because of a disconnection: they propagate the event to their jobs by resending SIGHUP, and stopped jobs are also sent SIGCONT so that they actually receive it. For reference, see the SIGNALS section of Bash’s man page.

    In effect, the kernel’s behavior extends what those shells do to every newly orphaned process group that contains a stopped process.
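
    The scenario can be reproduced with a short sketch. It is only an illustration under stated assumptions, not how any particular kernel or shell is implemented: the function name `orphan_stopped_group` is made up for this example, and the intermediate process calls `setsid()` purely so that whatever process later adopts the child is guaranteed to be in a different session. The child stops itself, its parent exits, and the child reports which signals it then receives.

```python
import os
import select
import signal
import time


def orphan_stopped_group(timeout=5.0):
    """Return the names of the signals seen by a stopped process
    after its process group becomes orphaned (hypothetical helper)."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate parent: new session, so the process that later
        # adopts the child cannot be a member of this session.
        try:
            os.close(read_fd)
            os.setsid()
            child = os.fork()
            if child == 0:
                try:
                    seen = []
                    for sig in (signal.SIGHUP, signal.SIGCONT):
                        signal.signal(sig, lambda num, frame: seen.append(num))
                    os.setpgid(0, 0)  # own group, same session as the parent
                    os.kill(os.getpid(), signal.SIGSTOP)  # stop ourselves
                    # We only get here if something continues the process.
                    deadline = time.monotonic() + 1.0
                    while len(seen) < 2 and time.monotonic() < deadline:
                        time.sleep(0.01)
                    names = sorted(signal.Signals(n).name for n in seen)
                    os.write(write_fd, ",".join(names).encode())
                finally:
                    os._exit(0)
            # Wait until the child is actually stopped, then terminate,
            # which leaves the child's process group orphaned.
            os.waitpid(child, os.WUNTRACED)
        finally:
            os._exit(0)
    os.close(write_fd)
    os.waitpid(middle, 0)
    ready, _, _ = select.select([read_fd], [], [], timeout)
    data = os.read(read_fd, 64) if ready else b""
    os.close(read_fd)
    return data.decode().split(",") if data else []


if __name__ == "__main__":
    print(orphan_stopped_group())
```

    On a system that follows the POSIX rule quoted above, the child should report both SIGHUP and SIGCONT. If only SIGCONT were sent, the child would resume without any indication that its parent had gone away.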


    To elaborate further: historically, the signal sent used to be just SIGKILL, instead of SIGHUP+SIGCONT.

    At some point this was considered too harsh an approach. Handled that way, processes would not even have the chance to “die gracefully”, i.e. finish what they were doing, release resources, etc.

    Therefore the SIGKILL was replaced by the current, gentler SIGHUP+SIGCONT approach, so that processes not only get the chance to clean up after themselves, but may also choose to keep running if they so wish.

    I’m not an expert on the POSIX specifications, but note, for instance, the following excerpt from the rationale in the specification of the _exit() function:

    [...], if the termination of a process causes a process group to become orphaned, [...]. Stopped processes within the group would languish forever. In order to avoid this problem, newly orphaned process groups that contain stopped processes are sent a SIGHUP signal and a SIGCONT signal to indicate that they have been disconnected from their session. The SIGHUP signal causes the process group members to terminate unless they are catching or ignoring SIGHUP.

    It may not be an explicit statement, but I think it is close enough to one.
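
    The last sentence of that rationale (SIGHUP terminates the process unless it is caught or ignored) can be checked with a small sketch. The helper name `sighup_outcome` and the two pipes used for synchronization are assumptions made for this example only.

```python
import os
import signal


def sighup_outcome(ignore):
    """Fork a child, send it SIGHUP, and describe how it ended
    (hypothetical helper)."""
    ready_r, ready_w = os.pipe()  # child -> parent: "disposition is set"
    go_r, go_w = os.pipe()        # parent -> child: closed to let it exit
    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            os.close(go_w)
            # Set the disposition explicitly; it may have been inherited.
            signal.signal(signal.SIGHUP,
                          signal.SIG_IGN if ignore else signal.SIG_DFL)
            os.write(ready_w, b"x")
            os.read(go_r, 1)  # blocks until the parent closes its end
        finally:
            os._exit(0)
    os.close(ready_w)
    os.close(go_r)
    os.read(ready_r, 1)          # wait for the child to be ready
    os.kill(pid, signal.SIGHUP)
    os.close(go_w)               # a surviving child now sees EOF and exits
    _, status = os.waitpid(pid, 0)
    os.close(ready_r)
    if os.WIFSIGNALED(status):
        return "killed by " + signal.Signals(os.WTERMSIG(status)).name
    return "exited with status %d" % os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("default disposition:", sighup_outcome(ignore=False))
    print("SIGHUP ignored:     ", sighup_outcome(ignore=True))
```

    A process that installs its own handler instead of ignoring the signal falls in between: it is not terminated, and it can use the handler to clean up and then decide whether to exit or keep running.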

    Note also that the use of process groups is not limited to the job control implemented by shells. A process can have its own session, as the typical “daemon” process does, with no controlling terminal and no controlling shell, and still spawn several children and place them in process groups that can be handled (i.e. signaled) by their process group IDs.

    In fact, although shell job control is the most widely known use of process groups, and is very often used as the example even by the specifications themselves, the actual POSIX definition of “process group” is:
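
    As a rough illustration of process groups being used without any shell or job control involved, the sketch below starts a few workers, places them in one process group, and signals the whole group with a single call. The function name `signal_worker_group` and the choice of SIGTERM are assumptions made for this example.

```python
import os
import signal


def signal_worker_group(count=3):
    """Start `count` workers in one process group, signal the group once,
    and return how each worker ended (hypothetical helper)."""
    pids = []
    pgid = 0
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            try:
                signal.signal(signal.SIGTERM, signal.SIG_DFL)
                while True:
                    signal.pause()  # idle until a signal arrives
            finally:
                os._exit(1)
        # The first worker becomes the group leader; the others join it.
        pgid = pgid or pid
        os.setpgid(pid, pgid)
        pids.append(pid)
    # One call reaches every member of the group, and only that group.
    os.killpg(pgid, signal.SIGTERM)
    results = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFSIGNALED(status):
            results.append(signal.Signals(os.WTERMSIG(status)).name)
        else:
            results.append("exit %d" % os.WEXITSTATUS(status))
    return results


if __name__ == "__main__":
    print(signal_worker_group())
```

    The parent here stays in its own process group, so the `killpg()` call does not affect it; a real daemon would typically also detach into its own session first.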

    3.296 Process Group

    A collection of processes that permits the signaling of related processes. Each process in the system is a member of a process group that is identified by a process group ID. A newly created process joins the process group of its creator.

    That is, simply a generic collection of processes.

    While the definition for “job control” is:

    3.203 Job Control

    A facility that allows users selectively to stop (suspend) the execution of processes and continue (resume) their execution at a later point. The user typically employs this facility via the interactive interface jointly supplied by the terminal I/O driver and a command interpreter.

    Job control is a concept related to interactive shells.

    Process grouping is a wider concept.

    HTH
