On 13 May 2016 at 20:30, Terry Burton <
[hidden email]> wrote:
> On 13 May 2016 at 20:06, Terry Burton <
[hidden email]> wrote:
>> On 13 May 2016 at 19:25, dave c <
[hidden email]> wrote:
>>> Are folks forgetting that the default action of the kill command is to send
>>> the TERM signal? That signal should tell the daemon to do an orderly
>>> shutdown, close the leases file cleanly, send whatever signals to the
>>> partner that are required and then exit when everything is ready.
>>>
>>> All the concern I am seeing below would be true if folks were issuing a kill
>>> -9 to stop the service. At which point the leases file would get potentially
>>> corrupted.
>> <...snip...>
>>> So it sounds like a lot of angst over nothing... a TERM signal is defined as
>>> closing all processes and threads cleanly, writing out the last bits of data
>>> and stopping things in an orderly fashion. So seems that issuing kill {dhcpd
>>> pid} would be perfectly acceptable to close things down even in a partner
>>> scenario.
>>
>> Where do you get the definition of a SIGTERM causing a graceful
>> shutdown (other than by convention) and if this were the case for ISC
>> DHCP then why the warning about truncated leases given in AA-01043?
>>
>> The effect of receiving a handleable signal is to immediately jump
>> into the trap handler if one is configured for that signal, otherwise
>> to die.
>>
>> Unless a handler takes care to ensure that everything is consistent
>> and then exit then SIGTERM, SIGINT, etc. are potentially dangerous.
>>
>> The release notes indicate that a "gentle shutdown" feature was added
>> in the past and then subsequently removed because the semantics chosen
>> caused operational issues - but what these were isn't known because
>> the associated bug report isn't publicly available.
>>
>> I need to find time to understand the current codebase, but what I'd
>> like to know the intended semantics and what issues are encountered
>> with implementing these in the way that Simon Hobson suggests.
>
> So currently there are no trap handlers for SIGTERM or SIGINT and
> therefore no cleanup whatsoever at exit.
>
> There is a compiled-out option ENABLE_GENTLE_SHUTDOWN which installs
> handlers for these signals but when this was activated it implemented
> the harmful semantics of putting the server through a
> recovery+partner-down transition which isn't useful for a quick
> configuration reload:
>
> /* Enable the gentle shutdown signal handling. Currently this
> means that on SIGINT or SIGTERM a client will release its
> address and a server in a failover pair will go through
> partner down. Both of which can be undesireable in some
> situations. We plan to revisit this feature and may
> make non-backwards compatible changes including the
> removal of this define. Use at your own risk. */
> /* #define ENABLE_GENTLE_SHUTDOWN */
>
> #if defined(ENABLE_GENTLE_SHUTDOWN)
> /* no signal handlers until we deal with the side effects */
> /* install signal handlers */
> signal(SIGINT, dhcp_signal_handler); /* control-c */
> signal(SIGTERM, dhcp_signal_handler); /* kill */
> #endif
>
> Having a more basic signal handler that defers the exit in order to
> continue to write out an outstanding lease seems better. Perhaps once
> could even differentiate these exit semantics based on SIGINT vs
> SIGTERM.
>
> If someone who can speak for ISC is able to indicate whether this
> would be a sensible approach then I am happy to work up a patch.
As a rough proof of concept this simple amendment appears to provide
rapid shutdown without partner down interference for a basic reload.
The part of dhcpd.c that drives the current -> shutdown -> recover
state transition can be guarded by a variable that determines whether
or not to place the server into long-term shutdown (possibly based on
the received signal or a runtime configurable.)
...
#define RECOVER_ON_STARTUP 1!=1 /* To be a variable derived at runtime */
...
#if defined (FAILOVER_PROTOCOL)
if (RECOVER_ON_STARTUP) {
/* Set all failover peers into the shutdown state. */
if (shutdown_state == shutdown_dhcp) {
for (state = failover_states; state; state = state -> next) {
if (state -> me.state == normal) {
dhcp_failover_set_state (state, shut_down);
failover_connection_count++;
}
if (state -> me.state == shut_down &&
state -> partner.state != partner_down)
failover_connection_count++;
}
}
if (shutdown_state == shutdown_done) {
for (state = failover_states; state; state = state -> next) {
if (state -> me.state == shut_down) {
if (state -> link_to_peer)
dhcp_failover_link_dereference (&state ->
link_to_peer,
MDL);
dhcp_failover_set_state (state, recover);
}
}
}
}
#endif
if (shutdown_state == shutdown_done) {
#if defined (DEBUG_MEMORY_LEAKAGE) && \
defined (DEBUG_MEMORY_LEAKAGE_ON_EXIT)
free_everything ();
omapi_print_dmalloc_usage_by_caller ();
#endif
if (no_pid_file == ISC_FALSE)
(void) unlink(path_dhcpd_pid);
exit (0);
}
...
_______________________________________________
dhcp-users mailing list
[hidden email]
https://lists.isc.org/mailman/listinfo/dhcp-users