How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)


How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young
Hi all,

I'm running Openstack Ocata (Deployed with openstack-ansible), with the following configuration:

* Compute nodes running nova and neutron agent
* 2 x Controllers running neutron server/agents in LXC containers (as deployed by openstack-ansible playbooks)
* Underlying hosts have a single NIC (MTU 9000) with multiple VLAN subinterfaces, which are in turn connected to bridges br-vxlan, br-vlan, br-management


I've encountered the following problem:

1. When I create an instance in a vxlan tenant network, without changing any configuration files, the instance (linux default) assumes an MTU of 1500, but in reality only has an effective MTU of 1450 (because of the VXLAN overhead). Instances cannot ping each other or their gateway (a neutron router) with > 1450 MTU.

2. While I _could_ push an MTU of 1450 to my instances via DHCP, this is (a) not always reliable depending on the guest OS, and (b) breaks docker on instances, which defaults to an MTU of 1500 for docker0

3. So, I attempted the configuration changes described at http://serverascode.com/2017/06/06/neutron-vxlan-tenant-mtu-1500.html, increasing my global MTU to 1550 in neutron.conf / ml2_conf.ini, on the compute nodes, and the neutron client & server LXC containers on the controller, so that a default MTU of 1500 in my instances would always work.

4. The effect of step #3 above is that now my instances can communicate with _each other_ at up to 1500 MTU, _but_ they still can't ping their gateway (the neutron router) at anything over 1450 MTU.

5. When I examine my compute nodes (underlying host OS), I note that the bridge "br-vxlan" contains the vlan subinterface (MTU 9000) plus a veth interface for connectivity to the neutron-agents LXC container (e.g. "04063403_eth10"). The veth interface has an MTU of 1500. The corresponding interface within the neutron-agents LXC container (eth10) also has an MTU of 1500.

6. Assuming that #5 is the cause of my MTU fault (i.e., a 1500-byte packet from the instance over the tenant network becomes 1500+50=1550 bytes after VXLAN encapsulation and can't pass through the 1500-MTU veth interface), I manually changed the veth interface (and the corresponding interface within the LXC container) to MTU 1550.

7. Now I can pass packets from my instances to the neutron router as large as 1468 bytes (previous limit was 1448), but still not the 1500 bytes I expected.

8. Increasing the MTU again (per #6 above) to 1600 makes no difference to the result in #7 above.
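For reference, the 50-byte VXLAN overhead in #1 breaks down as follows (a quick sanity check of the arithmetic, assuming an IPv4 underlay with no VLAN tag counted against the outer MTU):

```python
# VXLAN wraps the tenant's entire Ethernet frame in VXLAN/UDP/IP
# before it hits the underlay, so all four headers count as overhead.
inner_ethernet = 14   # inner Ethernet header carried inside the tunnel
vxlan_header = 8
udp_header = 8
outer_ipv4 = 20
overhead = inner_ethernet + vxlan_header + udp_header + outer_ipv4

print(overhead)         # 50
print(1500 - overhead)  # 1450: effective tenant MTU on a 1500-byte underlay
print(1550 - overhead)  # 1500: what a 1550-byte underlay buys the instances
```

This is why raising the path MTU to 1550 (step #3) should, in principle, let instances keep their default 1500.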


So I'm thinking I've missed something, and the most likely culprit is the definition of the LXC container (and veth interfaces) for neutron-agents on the controller. I thought it was a simple fix (manually change the MTU per #6), but I'm baffled as to why increasing the MTU on the veth interfaces by 50 bytes bought me only 20 extra bytes (1468). And even if this _was_ the fix, it's obviously only temporary, so what is the correct way to address the MTU issue under openstack-ansible?

Can anybody shed some light on this?

Thanks!
David



_______________________________________________
OpenStack-operators mailing list
[hidden email]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young

Hello,

Thanks for the reply, responses inline below:

> Hello,
>
> I haven't touched this for a while, but could you give us your user_*
> variable overrides?

OK, here we go. Let me know if there’s a preferred way to send large data blocks - I considered a gist or a pastebin, but figured that having the content archived with the mailing list message would be the best result.

I think the overrides are what you’re asking for? The only MTU-related override I have is “container_mtu” for the vxlan network below. I suspect it doesn’t actually _do_ anything, though, because I can’t find the string “container_mtu” within any of the related ansible roles (see the grep for container_mtu vs container_bridge below for illustration). I found https://bugs.launchpad.net/openstack-ansible/+bug/1678165 which looked related.

root@nbs-dh-09:~# grep container_mtu /etc/ansible/ -ri
root@nbs-dh-09:~# grep container_bridge /etc/ansible/ -ri
/etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-mgmt"
/etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vxlan"
/etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vlan"
/etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-vlan"
/etc/ansible/roles/plugins/library/provider_networks:#     container_bridge: "br-storage"
/etc/ansible/roles/plugins/library/provider_networks:                            bind_device = net['network']['container_bridge']
/etc/ansible/roles/os_neutron/doc/source/configure-network-services.rst:          container_bridge: "br-vlan"
root@nbs-dh-09:~#
global_overrides:
  internal_lb_vip_address: 10.76.76.11
  #
  # The below domain name must resolve to an IP address
  # in the CIDR specified in haproxy_keepalived_external_vip_cidr.
  # If using different protocols (https/http) for the public/internal
  # endpoints the two addresses must be different.
  #
  external_lb_vip_address: openstack.dev.safenz.net
  tunnel_bridge: "br-vxlan"
  management_bridge: "br-mgmt"
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_container_address: true
        is_ssh_address: true
    - network:
        container_bridge: "br-vxlan"
        container_type: "veth"
        container_interface: "eth10"
        container_mtu: "9000"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth12"
        host_bind_override: "eth12"
        type: "flat"
        net_name: "flat"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth11"
        type: "vlan"
        range: "1:4094"
        net_name: "vlan"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-storage"
        container_type: "veth"
        container_interface: "eth2"
        ip_from_q: "storage"
        type: "raw"
        group_binds:
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
          - swift_proxy

> Here are a few things I watch for mtu related discussions:
> 1) ``lxc_net_mtu``: It is used in lxc_hosts to define the lxc bridge.

Aha. I didn’t know about this, it sounds like what I need. I’ll add this and report back.

> 2) Your compute nodes and your controller nodes need to have
> consistent mtus on their bridges.

They are both configured for an MTU of 9000, but the controller nodes' bridges drop their MTU to 1500 when the veth interface paired with the neutron-agents LXC container joins the bridge (a bridge downgrades its MTU to that of its lowest-MTU member interface).
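That bridge behaviour can be modelled simply (a sketch of the kernel's rule, which it re-evaluates whenever a port is added or removed):

```python
def bridge_mtu(port_mtus):
    """A Linux bridge adopts the smallest MTU among its member ports."""
    return min(port_mtus)

# VLAN subinterface at 9000 plus a 1500-MTU veth drags br-vxlan to 1500
print(bridge_mtu([9000, 1500]))  # 1500

# Raising the veth to 1550 only lifts the bridge to 1550, not back to 9000
print(bridge_mtu([9000, 1550]))  # 1550
```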

> 3) Neutron needs a configuration override.

I’ve set this in neutron.conf on all neutron LXC containers, and on the compute nodes too:
global_physnet_mtu = 1550

And likewise in /etc/neutron/plugins/ml2/ml2_conf.ini:

# Set a global MTU of 1550 (to allow VXLAN at 1500)
path_mtu = 1550

# Drop VLAN and FLAT providers back to 1500, to align with outside FWs
physical_network_mtus = vlan:1500,flat:1500
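With those overrides in place, the MTU Neutron advertises for a VXLAN tenant network should work out as follows (an illustrative sketch of the calculation, not Neutron's actual code; the 50-byte overhead assumes an IPv4 underlay):

```python
# Illustrative sketch: how the tenant network MTU falls out of the
# neutron.conf / ml2_conf.ini values above. Variable names mirror the
# config options; the derivation itself is a simplification.
global_physnet_mtu = 1550
path_mtu = 1550
vxlan_overhead = 50

vxlan_net_mtu = min(global_physnet_mtu, path_mtu) - vxlan_overhead
print(vxlan_net_mtu)  # 1500: instances can keep the Linux default MTU

# flat/vlan provider networks are capped explicitly by
# physical_network_mtus = vlan:1500,flat:1500
```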

> 4) the lxc containers need to be properly defined: each network should
> have a mtu defined, or alternatively, you can define a default mtu for
> all the networks defined in openstack_user_config with
> ``lxc_container_default_mtu``. (This one is the one that spawns up the
> veth pair to the lxc container)

I didn’t know about this one either; it doesn’t appear in any of the default ansible-provided sample configs, but now that I’ve grepped the ansible roles for “mtu”, it’s obvious. I’ll try this too.

root@nbs-dh-09:~# grep -ri lxc_container_default_mtu /etc/openstack_deploy/*
root@nbs-dh-09:~# grep -ri lxc_container_default_mtu /etc/ansible/
/etc/ansible/roles/lxc_container_create/defaults/main.yml:lxc_container_default_mtu: "1500"
/etc/ansible/roles/lxc_container_create/templates/container-interface.ini.j2:lxc.network.mtu = {{ item.value.mtu|default(lxc_container_default_mtu) }}
/etc/ansible/roles/lxc_container_create/templates/debian-interface.cfg.j2:    mtu {{ item.value.mtu|default(lxc_container_default_mtu) }}
/etc/ansible/roles/lxc_container_create/templates/rhel-interface.j2:MTU={{ item.value.mtu|default(lxc_container_default_mtu) }}
root@nbs-dh-09:~#

> 5) The container interfaces need to have this proper mtu. This is
> taking the same configuration as 4) above, so it should work out of
> the box.

Agreed; that seems to be the case currently with 1500, and I’d expect it to hold with the updated value.

> 6) If your instance is reaching its router with no mtu issue, you may
> still have issues for the northbound traffic. Check how you configured
> this northbound path and whether the interfaces have the proper mtu. If
> there are veth pairs to create pseudo links, check their mtus too.

> I think it's a good start for the conversation...

Thank you, this is very helpful. I’ll give it a try and respond.

Re #1 and #4, do I need to destroy / recreate my existing LXC containers, or will rerunning the playbooks be enough to update the MTUs?

Many thanks,
David



Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young

An update to my reply below..

I’ve realized that I need a per-network MTU defined in /etc/openstack_deploy/openstack_user_config.yml, so I’ve done the following:

global_overrides:
<snip>
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        <snip>
        container_mtu: "1500"
        <snip>
    - network:
        container_bridge: "br-vxlan"
        container_mtu: "1550"
        type: "vxlan"
        <snip>
    - network:
        container_bridge: "br-vlan"
        type: "flat"
        net_name: "flat"
        container_mtu: "1500"
        <snip>
    - network:
        container_bridge: "br-vlan"
        type: "vlan"
        container_mtu: "1500"
        <snip>
    - network:
        container_bridge: "br-storage"
        type: "raw"
        container_mtu: "9000"
        group_binds:
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
          - swift_proxy

I think that gets me:

  • VXLAN LXC interfaces will have an MTU of 1550 (necessary for “raw” 1500 from the instances)
  • flat/vlan interfaces will have an MTU of 1500 (let’s be consistent)
  • storage interfaces can have an MTU of 9000

Then, I set the following in /etc/openstack_deploy/user_variables.yml:

lxc_net_mtu: 1550
lxc_container_default_mtu: 1550

I don’t know whether this is redundant or not based on the above, but it seemed sensible.

I’m rerunning the setup-everything.yml playbook, but still not sure whether the changes apply if there’s an existing LXC container defined. We’ll find out soon enough…

Cheers,
D

On 06/12/2017 21:51, David Young wrote:

<snip - full quote of previous message>



Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

Jean-Philippe Evrard
On 6 December 2017 at 09:09, David Young <[hidden email]> wrote:

> <snip - full quote of previous message>


Hello,

For the mtu, it would be impactful to do it on a live environment. I
expect that if you change the container configuration, it would
restart.

Could you please tell me if this configuration was good enough for
your use case?
Or if the docs need adapting?

If this still doesn't work, maybe you should file a bug with your new
openstack_user_config and the appropriate user_*.yml file. That would
follow our bug triage process, where more people can have a look at the issue.

As usual, don't hesitate to come on our irc channel #openstack-ansible
if you have further questions!

Thank you!

Best regards,
Jean-Philippe Evrard
@evrardjp


Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young

So..

On 07/12/2017 03:12, Jean-Philippe Evrard wrote:

> For the mtu, it would be impactful to do it on a live environment. I
> expect that if you change the container configuration, it would
> restart.

It’s a busy lab environment, but given that it’s fully HA (2 controllers), I didn’t anticipate a significant problem with changing container configuration one-at-a-time.

However, the change has had an unexpected side effect: one of the controllers (I haven’t rebooted the other one yet) seems to have lost the ability to bring up lxcbr0. It can start all its containers, but none of them have any management connectivity on eth0, which of course breaks all sorts of things.

I.e.

root@nbs-dh-10:~# systemctl status networking.service
● networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/networking.service.d
           └─50-insserv.conf-$network.conf
   Active: failed (Result: exit-code) since Thu 2017-12-07 06:37:00 NZDT; 14min ago
     Docs: man:interfaces(5)
  Process: 2717 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
  Process: 2656 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle (code=e
 Main PID: 2717 (code=exited, status=1/FAILURE)

Dec 07 06:36:58 nbs-dh-10 systemd[1]: Starting Raise network interfaces...
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: RTNETLINK answers: Invalid argument
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.enp4s0
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.br-mgmt
Dec 07 06:37:00 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.br-vlan
Dec 07 06:37:00 nbs-dh-10 ifup[2717]: Failed to bring up lxcbr0.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Dec 07 06:37:00 nbs-dh-10 systemd[1]: Failed to start Raise network interfaces.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Unit entered failed state.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Failed with result 'exit-code'.
root@nbs-dh-10:~#

I’ve manually reversed the “lxc.network.mtu = 1550” entry in /etc/lxc/lxc-openstack.conf, but this doesn’t seem to have made a difference.

What’s also odd is that lxcbr0 appears to be perfectly normal:

root@nbs-dh-10:~# brctl show lxcbr0
bridge name    bridge id        STP enabled    interfaces
lxcbr0        8000.fe0a7fa28303    no        04063403_eth0
                            075266dc_eth0
                            160c9b30_eth0
                            38ac19ae_eth0
                            4f57300f_eth0
                            59b2b5a5_eth0
                            5b7bbeb4_eth0
                            64a1fcdd_eth0
                            6c99f5fe_eth0
                            6f93ebb2_eth0
                            70ce61e5_eth0
                            745ba80d_eth0
                            85df2fa5_eth0
                            99e6adf8_eth0
                            cbdfa2f3_eth0
                            e15dc279_eth0
                            ea67ce7e_eth0
                            ed5c7af9_eth0
root@nbs-dh-10:~#

But no matter the value of lxc.network.mtu, the bridge MTU doesn’t change from 1500 (though I suppose it could have been reduced to match the lower MTUs of the member interfaces):

root@nbs-dh-10:~# ifconfig lxcbr0
lxcbr0    Link encap:Ethernet  HWaddr fe:0c:5d:1c:36:da
          inet addr:10.0.3.1  Bcast:10.0.3.255  Mask:255.255.255.0
          inet6 addr: fe80::f4b0:bff:fec3:63b0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:499 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:128882 (128.8 KB)  TX bytes:828 (828.0 B)

root@nbs-dh-10:~#
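A sketch of the clamping behaviour I suspect here: the kernel pins a bridge's MTU to the smallest of its member ports, so a single 1500-byte veth would hold lxcbr0 at 1500 regardless of lxc.network.mtu (the member values below are illustrative, not read from this host):

```shell
#!/bin/sh
# On a live host you would read the real member MTUs with:
#   ip -o link show master lxcbr0
MEMBER_MTUS="9000 1550 1500"   # illustrative member-port MTUs
MIN=
for m in $MEMBER_MTUS; do
    # track the smallest MTU seen, as the bridge driver does
    if [ -z "$MIN" ] || [ "$m" -lt "$MIN" ]; then
        MIN=$m
    fi
done
echo "effective bridge MTU: $MIN"
```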

Any debugging suggestions?
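In case it helps anyone following along, the check I've been using is a DF-bit ping at the suspect MTU; the payload size to pass works out as follows (sketch; header sizes assume IPv4 ICMP, and `-M do` is the iputils flag that sets DF):

```shell
#!/bin/sh
# To exercise a 1500-byte path end to end, the ping payload must be
# the MTU minus 28 bytes (20-byte IPv4 header + 8-byte ICMP header).
MTU=1500
PAYLOAD=$((MTU - 28))
echo "try: ping -c 3 -M do -s $PAYLOAD <gateway-ip>"
```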

Thanks,
D


_______________________________________________
OpenStack-operators mailing list
[hidden email]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

Jean-Philippe Evrard
Hello David,

Did you solve your issue?
Did you check that it depends on the default container interface's mtu itself?

Best regards,
JP




Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young
Hey Jean-Philippe,

No, after I disastrously split-brained/partitioned my rabbitmq and galera clusters by allowing LXC to start the containers without the dnsmasq process that addresses their eth0 interfaces (due to what _may_ be a template/Xenial bug), I've spent the last few days cleaning up the mess :)

I have two unused hosts set aside as a test environment, and I'll be leveraging these in the next few days to reproduce the issue on a fresh Xenial install.

I'll update you (and the list) once I've positively confirmed the issue.

Cheers!
D




On 12/12/2017 21:52, Jean-Philippe Evrard wrote:
Hello David,

Did you solve your issue?
Did you check that it depends on the default container interface's mtu itself?

Best regards,
JP





Re: How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

Jean-Philippe Evrard
Hello,

Ok thanks! Don't hesitate to ask on our channel.

FYI: in case of a rabbitmq split-brain, recreating the rabbit cluster is most likely the fastest fix. We're only dealing with non-persistent data anyway :p

Best regards,
JP

On 12 December 2017 at 09:20, David Young <[hidden email]> wrote:

> Hey Jean-Philippe,
>
> No, after I disasterously split-brained/partitioned my rabbitmq and galera
> clusters by allowing LXC to start the containers up without the dnsmasq
> process to address their eth0 interfaces (due to what _may_ be a
> template/Xenial bug), I've spent the last few days cleaning up the mess :)
>
> I have two unused hosts set aside as a test environment for pre-testing, and
> I'll be leveraging these in the next few days to test the issue on a fresh
> Xenial install.
>
> I'll update you (and the list) once I've positively confirmed the issue.
>
> Cheers!
> D