Re: Carp and /30 network
- To: misc_(_at_)_openbsd_(_dot_)_org
- Subject: Re: Carp and /30 network
- From: Stephen Marley <stephen_(_dot_)_marley_(_at_)_catwoman_(_dot_)_cl-is_(_dot_)_com>
- Date: Sat, 19 Feb 2005 16:32:02 +0000
- Mail-followup-to: misc_(_at_)_openbsd_(_dot_)_org
On Sat, Feb 19, 2005 at 03:02:02PM +1300, Nigel Wohlers wrote:
> Stephen Marley wrote:
>
> >I'm also using the carpdev feature with addressless physical interfaces.
> >It's working pretty well since the recent commits by Ryan McBride and
> >his carp team (problem with the default route is fixed), however, in my
> >limited testing, there are still some issues where arpresolve errors can
> >occur on failover which makes carp stop working. (I assume the devs know
> >of these issues because I can get them to show up merely by repeatedly
> >rebooting or pulling/inserting cables on the master device?).
>
> Hi Stephen,
>
> I have been testing carp with the carpdev support for a few weeks now,
> and I too have noticed the issues you state in your previous posts
> regarding the default route of the BACKUP firewall when using carpdev on
> an addressless interface.
>
> I am running a fresh install of 17th Feb snapshot on two identical test
> machines without preempt (advskew).
>
> You mention "problem with the default route is fixed" in this post; I am
> interested to know how this is fixed, or rather how you came to know
> this. Do you have an /etc/mygate, use bgpd, ifstated, routed?
My latest tests used a basic two-firewall setup with a default route
specified in /etc/mygate, no ifstated or dynamic routing protocols
involved, and no addresses on the outside physical interfaces. The backup
firewall has advskew=100, and preempt=1 is set in sysctl.conf on both.
The last commit to src/sys/net/if.[ch] fixes the issue where the default
route was squashed when failing over from master to backup with
preempt=1. Previously, carp on a /30 was unusable with preempt=1 unless
ifstated was used to reinstate the default route, and even then it
wasn't particularly stable.
With the latest commits, failover on this setup seems to work as
expected (the routing is correct), except that occasionally an
"arpresolve: cannot allocate llinfo" error occurs, which breaks
subsequent failover. (To trigger it, repeatedly unplug and re-insert the
cable on the master's outside interface. It also happens when rebooting
the master, but less frequently.)
There are other kernel warnings, such as "arplookup: unable to enter
address for x.x.x.x" and "arp_rtrequest: bad gateway value", but these
don't seem to affect failover operation. The "allocate llinfo" one,
however, seems to be a showstopper.
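If it helps when repeating the cable-pull test, here is a rough sketch of a
script that tallies these three messages in a saved log. The severity labels
only reflect what I have observed above, not anything from the kernel source,
and the function names (classify, summarize) are made up for illustration.

```python
import re

# Severity labels mirror the observations in this message only:
# "fatal" means failover stopped working afterwards, "benign" means
# the warning appeared without affecting failover.
PATTERNS = [
    (re.compile(r"arpresolve: cannot allocate llinfo"), "fatal"),
    (re.compile(r"arplookup: unable to enter address for \S+"), "benign"),
    (re.compile(r"arp_rtrequest: bad gateway value"), "benign"),
]


def classify(line):
    """Return "fatal", "benign", or None for a single log line."""
    for pattern, severity in PATTERNS:
        if pattern.search(line):
            return severity
    return None


def summarize(log_text):
    """Count fatal and benign ARP messages in a block of log text."""
    counts = {"fatal": 0, "benign": 0}
    for line in log_text.splitlines():
        severity = classify(line)
        if severity is not None:
            counts[severity] += 1
    return counts
```

Feeding it the contents of a saved log (for example a copy of
/var/log/messages, assuming kernel messages are logged there) after each
round of unplugging gives a quick count of how often the showstopper
appears relative to the harmless warnings.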
Try your /30 again with preempt=1 as described in my first paragraph, and
you should find things are vastly improved over pre-Feb 7 snapshots. You
also need to ensure that your firewalls don't attempt to use the default
route when they're in backup mode; otherwise the dreaded arpresolve error
will occur and breakage will ensue. Hopefully this fragility (and that of
"route -n flush; sh /etc/netstart") can be fixed.
Let me know if there's any other information I can provide.
Kind regards,
Stephen Marley
--
Network Systems Engineer, Datacentre Operations
Campbell Lee Internet Solutions <http://cl-is.com>