ISC DHCP Users

eth0: not responding (recovering)

Teja

eth0: not responding (recovering)

Hi, I have an issue with lease handling in the ISC DHCP service. The logs keep printing: eth0: not responding (recovering)

My local environment is set up in active-active mode (split value 50-50%), and for some reason one of the appliances went down for a while. I noticed this and brought it back up; it was down for nearly 15 hours.

Since bringing it up, I see log messages saying "not responding (recovering)". It has been more than two hours and I am still getting the same messages.

Does anyone have any idea about this scenario and how to get the environment stable?
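
A note on the "50-50%" split mentioned above: in dhcpd.conf the split is not written as a percentage. As far as I understand the dhcpd.conf man page, the server hashes each client into one of 256 buckets and the primary answers for the first `split` of them, so an even split is `split 128;`. The sketch below only illustrates that arithmetic; the function names are made up for this example and are not part of ISC DHCP, and the accepted range (0 to 256) is an assumption that may differ between versions.

```python
def primary_share(split: int) -> float:
    """Fraction of client hash buckets served by the primary for a given split."""
    if not 0 <= split <= 256:  # assumed legal range; check your version's man page
        raise ValueError("split must be between 0 and 256")
    return split / 256


def split_for_share(share: float) -> int:
    """Split value that gives the primary roughly the requested share (0.0 to 1.0)."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0.0 and 1.0")
    return round(share * 256)
```

For example, `split_for_share(0.5)` gives 128, the value normally used for a 50-50 setup.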

_______________________________________________
dhcp-users mailing list
[hidden email]
https://lists.isc.org/mailman/listinfo/dhcp-users

Bill Shirley-2

Re: eth0: not responding (recovering)

Assuming you're referring to DHCP failover, is there any traffic flow on the
port and peer port in the failover stanza?

What is your value for mclt?

Which server, primary or secondary, is giving the recovering message?

Bill
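
One quick way to check the reachability half of Bill's first question is to try opening a TCP connection from one server to the other's failover port. This is a minimal sketch, assuming failover runs over TCP on whatever is configured as "peer port"; the function name is made up for this example, and the address and port in the comment are placeholders, not values from this thread. A successful connect only shows that the port is reachable, not that the two dhcpd processes agree on failover state, and dhcpd may log the short-lived connection.

```python
import socket


def peer_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (placeholder values; use "peer address" and "peer port"
# from your own failover stanza):
#   peer_port_reachable("192.0.2.2", 647)
```

Running it in both directions helps, since a firewall may allow only one side to connect.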

Teja

Re: eth0: not responding (recovering)

Hi Bill, thanks for your reply.

Yes, I see traffic on the peer ports that I specified in the failover section of my configuration file.

My mclt value is 1800 (30 min).

I am seeing these issues on the failover server, and sometimes I see logs saying "peer holds all free leases", but that scope is not completely full of active entries in the dhcpd.leases file of that server.

One more strange thing I observed in the lease file: it has statements like my status and peer status, and the peer status says unknown.

When does this happen? From what I found on the internet, the state should normally be "normal", but it is not getting updated in the lease file.

Teja

Re: eth0: not responding (recovering)

I am facing a weird situation with the failover setup in my lab environment. The issue appears when the failover DHCP appliance is added to my existing server.

The first time I added failover to the primary appliance, the primary's lease file showed the partner state as unknown, and the failover appliance kept printing "not responding (recovering)" messages. So I shut down the failover appliance, removed the failover section from the primary's config, and restarted the primary; it then worked fine.

As a second attempt, I increased the mclt value to 3600, added the failover section back to the primary config, and brought up the failover server.

The environment got worse: neither server is granting leases to devices. The primary's lease file says

my state partner-down

peer state recovery

Why do we get these recovery, partner-down, and unknown states when I add failover to my environment?

Is there a best-practice procedure for adding failover to an existing server without causing an outage?

Any help would be appreciated.

Thanks

Bill Shirley-2

Re: eth0: not responding (recovering)

A non-failover DHCP server doesn't have any "failover peer" stanza in the /var/log/dhcpd/dhcpd.leases
file. A working failover lease file has at the top:

failover peer "dhcp-failover" state {
  my state normal at 4 2019/10/10 05:05:04;
  partner state normal at 5 2011/09/02 23:51:25;
}

Try shutting down both the primary and secondary servers, remove the "failover peer" stanza from
both of their lease files, and then bring up the primary with the failover configuration. Ensure it is
handing out leases, then bring up the secondary with the failover configuration. Then check that
all is working correctly.

Bill

Teja

Re: eth0: not responding (recovering)

Hi Bill,

> A non-failover DHCP server doesn't have any "failover peer" stanza in the /var/log/dhcpd/dhcpd.leases file

Sorry for the typo in my previous email, where I wrote:

> The environment became most problematic none of the servers are granting leases to devices On the primary lease file it says
> my state partner-down
> peer state recovery

It is "partner state", not "peer state".

> A working failover lease file has at the top

In my dhcpd.leases file I see these states in multiple places, each with a timestamp, like this:

failover peer "dhcp-peer-workspace1" state {
  my state recover at 3 2019/10/09 14:23:41;
  partner state unknown-state at 3 2019/10/09 14:23:41;
}

failover peer "dhcp-peer-workspace1" state {
  my state recover at 3 2019/10/09 14:23:41;
  partner state unknown-state at 3 2019/10/09 14:23:41;
}
server-duid "\000\001\000\001%0\251\355\000PV\207D\342";
failover peer "dhcp-peer-workspace1" state {
  my state recover at 3 2019/10/09 14:23:41;
  partner state unknown-state at 3 2019/10/09 14:23:41;
}

> Try shutting down both the primary and secondary servers, remove the "failover peer" stanza

Yes, I tried this. I shut down the failover appliance, and on the primary I removed the failover section from the config file entirely.

I stopped dhcpd, deleted the lease file, recreated it with touch, and restarted dhcpd; it worked as expected.

The moment I bring the failover appliance up, add the failover section to the primary config file, and restart dhcpd on the primary, the issue starts.

At first the failover appliance's logs say it is in recovery mode, which I assumed was fine since it has to sync with the primary. After some time, the primary appliance's lease file shows

my state partner-down

partner state recovery

That is when the problems start, and it stays this way indefinitely: the states are not getting updated in either appliance's lease file.

Is it ok if I edit the lease file manually and make it normal ?
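
On the repeated "failover peer" stanzas quoted in this message: the lease file is written as an append-only log, so seeing the same stanza several times is normal, and the last one in the file should be the state currently in effect. Rather than editing the file, it can simply be read. The sketch below is a read-only helper under that assumption; the function name and regular expression are written for this example against the stanza layout quoted in this thread and are not part of ISC DHCP.

```python
import re

# Matches the stanza layout shown in this thread, whether it is written
# on one line or spread over several.
_STATE_RE = re.compile(
    r'failover peer\s+"(?P<name>[^"]*)"\s+state\s*\{\s*'
    r'my state\s+(?P<mine>[\w-]+)\s+at\s+[^;]*;\s*'
    r'partner state\s+(?P<partner>[\w-]+)\s+at\s+[^;]*;'
)


def latest_failover_states(lease_text: str) -> dict:
    """Map each failover peer name to its last recorded (my state, partner state)."""
    states = {}
    for match in _STATE_RE.finditer(lease_text):
        # Later stanzas overwrite earlier ones, mirroring "last entry wins".
        states[match.group("name")] = (match.group("mine"), match.group("partner"))
    return states
```

Calling it on the contents of each server's lease file, for instance `latest_failover_states(open(path).read())` with your own lease file path, shows at a glance whether the two servers have left the recover and partner-down states.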

Bill Shirley-2

Re: eth0: not responding (recovering)

I wouldn't change the state manually.

For me, after changing the primary configuration to failover, I initially start it with "mclt 60;" to speed
recovery:

mclt 3600; # not for secondary
#mclt 60; # use this when deploying a replacement server

If the primary is working, then start the secondary. When they're both "normal", change
the configuration for the primary to the desired mclt time and restart the primary, then the
secondary.

Bill

Simon Hobson

Re: eth0: not responding (recovering)

In reply to this post by Teja

Surya Teja <[hidden email]> wrote:

> Is it ok if I edit the lease file manually and make it normal ?

I would suggest you try leaving the partner down and explicitly set the master to "partner down" state. AIUI, the master should then enter normal operations as if there were no failover configured. Wait until it is running normally, then start the peer with a clean (empty) leases file.

What should then happen is the peer will transfer the lease info from the master. After this they should both go to normal state - not sure if there's any built-in delay for this.
