About ramonacedo

My name is Ramon Acedo. I’m a Cloud Architect who started in the open source world in the age of 33.6K modems. I have seen the evolution of FOSS in all sorts of organisations: as a sysadmin, as an entrepreneur, as a Red Hat employee and as a Yahoo! employee. For a couple of years I worked mostly with enterprise proprietary software at VMware, where I helped large enterprises that use both proprietary software and FOSS meet their business requirements with VMware-based private clouds. Back in the FOSS industry, I also helped Canonical/Ubuntu develop the OpenStack business around the world. I now work for Red Hat, helping businesses in their journey to an enterprise-class OpenStack experience.

Deploying Ironic in OpenStack Newton with TripleO


Introduction

This post describes how to enable Ironic in the Overcloud, a feature introduced in Red Hat OpenStack Platform 10 (Newton), in a multi-controller deployment with director (TripleO).

The process should work on any OpenStack Newton (or later) platform deployed with TripleO; an environment that is already deployed can also be updated with the configuration templates described here.

The workflow is based on the upstream documentation.

Architecture Setup

With this setup we can have virtual instances and baremetal instances in the same environment. In this architecture I use floating IPs with the VMs and a provisioning network with the baremetal nodes.

To be able to test this setup in a lab with virtual machines, we use Libvirt+KVM, with VMs for all the nodes, in an all-in-one lab. The network topology is described in the diagram below.

Ideally, we would have more networks, for example a dedicated network for cleaning the disks and another one for provisioning the baremetal nodes from the Overcloud, and even an extra one as the tenant network for the baremetal nodes in the Overcloud. For simplicity though, in this lab I reused the Undercloud’s provisioning network for these four network roles:

  • Provisioning from the Undercloud
  • Provisioning from the Overcloud
  • Cleaning the baremetal nodes’ disks
  • Baremetal tenant network for the Overcloud nodes

OVS Libvirt VLANs

Virtual environment configuration

To be able to test root_device hints on the nodes (Libvirt VMs) that we want to use as baremetal nodes, we define their first disk in Libvirt with a SCSI bus and a wwn ID:

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/virtual-machines/overcloud-2-node4-disk1.qcow2'/>
  <target dev='sda' bus='scsi'/>
  <wwn>0x0000000000000001</wwn>
</disk>

To verify the hints we can optionally introspect the node in the Undercloud (there is currently no introspection in the Overcloud). This is what we see after introspecting the node in the Undercloud:

$ openstack baremetal introspection data save 7740e442-96a6-496c-9bb2-7cac89b6a8e7|jq '.inventory.disks'
[
  {
    "size": 64424509440,
    "rotational": true,
    "vendor": "QEMU",
    "name": "/dev/sda",
    "wwn_vendor_extension": null,
    "wwn_with_extension": "0x0000000000000001",
    "model": "QEMU HARDDISK",
    "wwn": "0x0000000000000001",
    "serial": "0000000000000001"
  },
  {
    "size": 64424509440,
    "rotational": true,
    "vendor": "0x1af4",
    "name": "/dev/vda",
    "wwn_vendor_extension": null,
    "wwn_with_extension": null,
    "model": "",
    "wwn": null,
    "serial": null
  },
  {
    "size": 64424509440,
    "rotational": true,
    "vendor": "0x1af4",
    "name": "/dev/vdb",
    "wwn_vendor_extension": null,
    "wwn_with_extension": null,
    "model": "",
    "wwn": null,
    "serial": null
  },
  {
    "size": 64424509440,
    "rotational": true,
    "vendor": "0x1af4",
    "name": "/dev/vdc",
    "wwn_vendor_extension": null,
    "wwn_with_extension": null,
    "model": "",
    "wwn": null,
    "serial": null
  }
]

Undercloud templates

The following templates contain all the changes needed to configure Ironic and to adapt the NIC config to have a dedicated OVS bridge for Ironic as required.

Ironic configuration

~/templates/ironic.yaml

parameter_defaults:
    IronicEnabledDrivers:
        - pxe_ssh
    NovaSchedulerDefaultFilters:
        - RetryFilter
        - AggregateInstanceExtraSpecsFilter
        - AvailabilityZoneFilter
        - RamFilter
        - DiskFilter
        - ComputeFilter
        - ComputeCapabilitiesFilter
        - ImagePropertiesFilter
    IronicCleaningDiskErase: metadata
    IronicIPXEEnabled: true
    ControllerExtraConfig:
        ironic::drivers::ssh::libvirt_uri: 'qemu:///system'

Network configuration

First we map an extra bridge called br-baremetal which will be used by Ironic:

~/templates/network-environment.yaml:

[...]
parameter_defaults:
[...]
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal

This bridge will be configured in the provisioning network (control plane) of the controllers as we will reuse this network as the Ironic network later. If we wanted to add a dedicated network we would do the same config.

It is important to mention that this Ironic network used for provisioning can’t be VLAN tagged, which is yet another reason to justify using the Undercloud’s provisioning network for this lab:

~/templates/nic-configs/controller.yaml:

[...]
          network_config:
            -
              type: ovs_bridge
              name: br-baremetal
              use_dhcp: false
              members:
                 -
                   type: interface
                   name: eth0
              addresses:
                -
                  ip_netmask:
                    list_join:
                      - '/'
                      - - {get_param: ControlPlaneIp}
                        - {get_param: ControlPlaneSubnetCidr}
              routes:
                -
                  ip_netmask: 169.254.169.254/32
                  next_hop: {get_param: EC2MetadataIp}
[...]

Deployment

This is the deployment script I’ve used. Note there’s a roles_data.yaml template to add a composable role (a new feature in OSP 10) that I used for the deployment of an Operational Tools server (Sensu and Fluentd). The deployment also includes 3 Ceph nodes. These are irrelevant for the purpose of this setup but I wanted to test it all together in an advanced and more realistic architecture.

Red Hat’s documentation contains the details for configuring these advanced options and the base configuration with director.

~/deployment-scripts/ironic-ha-net-isol-deployment-dupa.sh:

openstack overcloud deploy \
--templates \
-r ~/templates/roles_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e ~/templates/network-environment.yaml \
-e ~/templates/ceph-storage.yaml \
-e ~/templates/parameters.yaml \
-e ~/templates/firstboot/firstboot.yaml \
-e ~/templates/ips-from-pool-all.yaml \
-e ~/templates/fluentd-client.yaml \
-e ~/templates/sensu-client.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
-e ~/templates/ironic.yaml \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 3 \
--compute-flavor compute \
--control-flavor control \
--ceph-storage-flavor ceph-storage \
--timeout 60 \
--libvirt-type kvm

Post-deployment configuration

Verifications

After the deployment completes successfully we should see how the controllers have the compute service enabled:

$ . overcloudrc
$ openstack compute service list -c Binary -c Host -c State
+------------------+------------------------------------+-------+
| Binary           | Host                               | State |
+------------------+------------------------------------+-------+
| nova-consoleauth | overcloud-controller-1.localdomain | up    |
| nova-scheduler   | overcloud-controller-1.localdomain | up    |
| nova-conductor   | overcloud-controller-1.localdomain | up    |
| nova-compute     | overcloud-controller-1.localdomain | up    |
| nova-consoleauth | overcloud-controller-0.localdomain | up    |
| nova-consoleauth | overcloud-controller-2.localdomain | up    |
| nova-scheduler   | overcloud-controller-0.localdomain | up    |
| nova-scheduler   | overcloud-controller-2.localdomain | up    |
| nova-conductor   | overcloud-controller-0.localdomain | up    |
| nova-conductor   | overcloud-controller-2.localdomain | up    |
| nova-compute     | overcloud-controller-0.localdomain | up    |
| nova-compute     | overcloud-controller-2.localdomain | up    |
| nova-compute     | overcloud-compute-0.localdomain    | up    |
+------------------+------------------------------------+-------+

And the driver we passed with IronicEnabledDrivers is also enabled:

$ openstack baremetal driver list
+---------------------+------------------------------------------------------------------------------------------------------------+
| Supported driver(s) | Active host(s)                                                                                             |
+---------------------+------------------------------------------------------------------------------------------------------------+
| pxe_ssh             | overcloud-controller-0.localdomain, overcloud-controller-1.localdomain, overcloud-controller-2.localdomain |
+---------------------+------------------------------------------------------------------------------------------------------------+

Baremetal network

This network will be:

  • The provisioning network for the Overcloud’s Ironic.
  • The cleaning network for wiping the baremetal node’s disks.
  • The tenant network for the Overcloud’s Ironic instances.

Create the baremetal network in the Overcloud with the same subnet and gateway than the Undercloud’s ctlplane but using a different range:

$ . overcloudrc
$ openstack network create \
--share \
--provider-network-type flat \
--provider-physical-network baremetal \
--external \
baremetal
$ openstack subnet create \
--network baremetal \
--subnet-range 192.168.3.0/24 \
--gateway 192.168.3.1 \
--allocation-pool start=192.168.3.150,end=192.168.3.170 \
baremetal-subnet

Then, we need to configure /etc/ironic/ironic.conf on each controller to use this network to clean the nodes’ disks at registration time and also before tenants use them as baremetal instances:

$ openstack network show baremetal -f value -c id
f7af39df-2576-4042-87c0-14c395ca19b4
$ ssh heat-admin@$CONTROLLER_IP
$ sudo vi /etc/ironic/ironic.conf
cleaning_network_uuid=f7af39df-2576-4042-87c0-14c395ca19b4
$ sudo systemctl restart openstack-ironic-conductor

We should also leave it ready to be included in our next update by adding it to the ControllerExtraConfig section in the ironic.yaml template:

parameter_defaults:
  ControllerExtraConfig:
    ironic::conductor::cleaning_network_uuid: f7af39df-2576-4042-87c0-14c395ca19b4

Baremetal deployment images

We can use the same deployment images we use in the Undercloud:

$ openstack image create --public --container-format aki --disk-format aki --file ~/images/ironic-python-agent.kernel deploy-kernel
$ openstack image create --public --container-format ari --disk-format ari --file ~/images/ironic-python-agent.initramfs deploy-ramdisk

We could also build them from the CoreOS images. For example, to troubleshoot a deployment we could use the CoreOS images to enable debug output in the Ironic Python Agent or to add our ssh key so we can access the node while the image is being deployed.

Baremetal instance images

Again, for simplicity, we can use the overcloud-full image we use in the Undercloud:

$ KERNEL_ID=$(openstack image create --file ~/images/overcloud-full.vmlinuz --public --container-format aki --disk-format aki -f value -c id overcloud-full.vmlinuz)
$ RAMDISK_ID=$(openstack image create --file ~/images/overcloud-full.initrd --public --container-format ari --disk-format ari -f value -c id overcloud-full.initrd)
$ openstack image create --file ~/images/overcloud-full.qcow2 --public --container-format bare --disk-format qcow2 --property kernel_id=$KERNEL_ID --property ramdisk_id=$RAMDISK_ID overcloud-full

Note that it uses kernel and ramdisk images, as the Overcloud default image is a partition image.

Create flavors

We create two flavors to start with, one for the baremetal instances and another one for the virtual instances.

$ openstack flavor create --ram 1024 --disk 20 --vcpus 1 baremetal
$ openstack flavor create --disk 20 m1.small

Baremetal instances flavor

Then, we set a boolean property called baremetal on the newly created flavor; the same property will also be set on the host aggregates (see below) to differentiate nodes for baremetal instances from nodes for virtual instances.

And, as the default boot_option is netboot, we set it to local (later we will do the same when we create the baremetal node):

$ openstack flavor set baremetal --property baremetal=true
$ openstack flavor set baremetal --property capabilities:boot_option="local"

Virtual instances flavor

Lastly, we set the flavor for virtual instances with the boolean property set to false:

$ openstack flavor set m1.small --property baremetal=false

Create host aggregates

To have OpenStack differentiate between baremetal and virtual instances, we create host aggregates so that the nova-compute service running on the controllers is used only for Ironic and the one on the compute nodes only for virtual instances:

$ openstack aggregate create --property baremetal=true baremetal-hosts
$ openstack aggregate create --property baremetal=false virtual-hosts
$ for compute in $(openstack hypervisor list -f value -c "Hypervisor Hostname" | grep compute); do openstack aggregate add host virtual-hosts $compute; done
$ openstack aggregate add host baremetal-hosts overcloud-controller-0.localdomain
$ openstack aggregate add host baremetal-hosts overcloud-controller-1.localdomain
$ openstack aggregate add host baremetal-hosts overcloud-controller-2.localdomain

Register the nodes in Ironic

The nodes can be registered with the command openstack baremetal create and a YAML template where the node is defined. In this example I register only one node named overcloud-2-node4, which I had previously registered in the Undercloud for introspection (and later deleted from it or set to maintenance mode to avoid conflicts between the two Ironic services).

The section root_device contains commented examples of the hints we could use. Remember that when configuring the Libvirt XML file for the node above, we added a wwn ID section, which is the one we’ll use in this example.

This template is like the instackenv.json one in the Undercloud but in YAML.

$ cat overcloud-2-node4.yaml
nodes:
    - name: overcloud-2-node4
      driver: pxe_ssh
      driver_info:
        ssh_username: stack
        ssh_key_contents:  |
          -----BEGIN RSA PRIVATE KEY-----
          MIIEogIBAAKCAQEAxc0a2u18EgTy5y9JvaExDXP2pWuE8Ebyo24AOo1iQoWR7D5n
          fNjkgCeKZRbABhsdoMBmbDMtn0PO3lzI2HnZQBB4BdBZprAiQ1NwKKotUv9puTeY
          [..]
          7DsSKAL4EDqjufY3h+4fRwOcD+EFqlUTDG1sjsSDKjdiHyYMzjcrg8nbaj/M9kAs
          xXnSm9686KxUiCDXO5FWKun204B18mPH1UP20aYw098t6aAQwm4=
          -----END RSA PRIVATE KEY-----
        ssh_virt_type: virsh
        ssh_address: 10.0.0.1
      properties:
        cpus: 4
        memory_mb: 12288
        local_gb: 60
        #boot_option: local (it doesn't set 'capabilities')
        root_device:
          # vendor: "0x1af4"
          # model: "QEMU HARDDISK"
          # size: 64424509440
          wwn: "0x0000000000000001"
          # serial: "0000000000000001"
          # vendor: QEMU
          # name: /dev/sda
      ports:
        - address: 52:54:00:a0:af:da

We create the node using the above template:

$ openstack baremetal create overcloud-2-node4.yaml

Then we specify the deployment kernel and ramdisk for the node:

$ DEPLOY_KERNEL=$(openstack image show deploy-kernel -f value -c id)
$ DEPLOY_RAMDISK=$(openstack image show deploy-ramdisk -f value -c id)
$ openstack baremetal node set $(openstack baremetal node show overcloud-2-node4 -f value -c uuid) \
--driver-info deploy_kernel=$DEPLOY_KERNEL \
--driver-info deploy_ramdisk=$DEPLOY_RAMDISK

And lastly, just like we do in the Undercloud, we set the node to available:

$ openstack baremetal node manage $(openstack baremetal node show overcloud-2-node4 -f value -c uuid)
$ openstack baremetal node provide $(openstack baremetal node show overcloud-2-node4 -f value -c uuid)

You can have all of this in a script and run it together every time you register a node.
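For example, a minimal registration script could simply chain the commands above (a sketch; the node name and template path are the ones used in this lab):

#!/bin/bash
# Register a node from its YAML definition and make it available (sketch)
NODE=overcloud-2-node4

openstack baremetal create ${NODE}.yaml

# Point the node at the deployment kernel and ramdisk uploaded earlier
DEPLOY_KERNEL=$(openstack image show deploy-kernel -f value -c id)
DEPLOY_RAMDISK=$(openstack image show deploy-ramdisk -f value -c id)
NODE_UUID=$(openstack baremetal node show ${NODE} -f value -c uuid)

openstack baremetal node set ${NODE_UUID} \
  --driver-info deploy_kernel=${DEPLOY_KERNEL} \
  --driver-info deploy_ramdisk=${DEPLOY_RAMDISK}

# Move the node through manage -> provide so it ends up available (and gets cleaned)
openstack baremetal node manage ${NODE_UUID}
openstack baremetal node provide ${NODE_UUID}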

If everything has gone well, the node will be registered and Ironic will clean its disk metadata (as per above configuration):

$ openstack baremetal node list -c Name -c "Power State" -c "Provisioning State"
+-------------------+-------------+--------------------+
| Name              | Power State | Provisioning State |
+-------------------+-------------+--------------------+
| overcloud-2-node4 | power off   | cleaning           |
+-------------------+-------------+--------------------+

Wait until the cleaning process has finished and then set the boot_option to local:

$ openstack baremetal node set $(openstack baremetal node show overcloud-2-node4 -f value -c uuid) --property 'capabilities=boot_option:local'
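If you script this step too, a small polling loop can wait for cleaning to finish before setting the capability (a simplistic sketch that only checks for the available state; the node name is the one from this lab):

# Wait for cleaning to finish, then set the boot_option capability (sketch)
NODE_UUID=$(openstack baremetal node show overcloud-2-node4 -f value -c uuid)
while [ "$(openstack baremetal node show ${NODE_UUID} -f value -c provision_state)" != "available" ]; do
    sleep 10
done
openstack baremetal node set ${NODE_UUID} --property 'capabilities=boot_option:local'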

Start a baremetal instance

Just as with virtual instances, we’ll use an ssh key and then start the instance, this time backed by Ironic:

$ openstack keypair create --public-key ~/.ssh/id_rsa.pub stack-key

Then we make sure that the cleaning process has finished (Provisioning State is available):

$ openstack baremetal node list -c Name -c "Power State" -c "Provisioning State"
+-------------------+-------------+--------------------+
| Name              | Power State | Provisioning State |
+-------------------+-------------+--------------------+
| overcloud-2-node4 | power off   | available          |
+-------------------+-------------+--------------------+

and we start the baremetal instance:

$ openstack server create \
--image overcloud-full \
--flavor baremetal \
--key-name stack-key \
--nic net-id=$(openstack network show baremetal -f value -c id) \
bm-instance-0

Now check its IP and access the newly created machine:

$ openstack server list -c Name -c Status -c Networks
+---------------+--------+-------------------------+
| Name          | Status | Networks                |
+---------------+--------+-------------------------+
| bm-instance-0 | ACTIVE | baremetal=192.168.3.157 |
+---------------+--------+-------------------------+
$ ssh cloud-user@192.168.3.157
Warning: Permanently added '192.168.3.157' (ECDSA) to the list of known hosts.
Last login: Sun Jan 15 07:49:37 2017 from gateway
[cloud-user@bm-instance-0 ~]$

Start a virtual instance

Optionally, we start a virtual instance to test that virtual and baremetal instances can reach each other.

As I need to create public and private networks, an image, a router, a security group, a floating IP, etc., I’ll use a Heat template that does it all for me, including creating the virtual instance, so I can skip the details of those steps here:

$ openstack stack create -e overcloud-env.yaml -t overcloud-template.yaml overcloud-stack

Check that the networks and the instance have been created:

$ openstack network list -c Name
+----------------------------------------------------+
| Name                                               |
+----------------------------------------------------+
| public                                             |
| baremetal                                          |
| HA network tenant 1e6a7de837ad488d8beed626c86a6dfe |
| private-net                                        |
+----------------------------------------------------+
$ openstack server list -c Name -c Networks
+----------------------------------------+------------------------------------+
| Name                                   | Networks                           |
+----------------------------------------+------------------------------------+
| overcloud-stack-instance0-2thafsncdgli | private-net=172.16.2.6, 10.0.0.168 |
| bm-instance-0                          | baremetal=192.168.3.157            |
+----------------------------------------+------------------------------------+

We now have both instances and they can communicate over the network:

$ ssh cirros@10.0.0.168
Warning: Permanently added '10.0.0.168' (RSA) to the list of known hosts.
$ ping 192.168.3.157
PING 192.168.3.157 (192.168.3.157): 56 data bytes
64 bytes from 192.168.3.157: seq=0 ttl=62 time=1.573 ms
64 bytes from 192.168.3.157: seq=1 ttl=62 time=0.914 ms
64 bytes from 192.168.3.157: seq=2 ttl=62 time=1.064 ms
^C
--- 192.168.3.157 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.914/1.183/1.573 ms

OpenStack for NFV Applications: SR-IOV and PCI Passthrough


NFV

Network Function Virtualisation (NFV) initiatives in the telecommunication industry require specific OpenStack functionalities enabled.

Without entering into the details of the NFV specifications, the goal in OpenStack is to optimise network, memory and CPU performance on the running instances.

In this article we’ll look at Single Root I/O Virtualisation (SR-IOV) and PCI-Passthrough, which are commonly required by Virtual Network Functions (VNFs) running as instances on top of OpenStack.

In addition to SR-IOV and PCI-Passthrough there are other techniques, such as DPDK, CPU pinning and the use of NUMA nodes, which are also usually required by VNFs. A future post will cover some of them.

SR-IOV

SR-IOV allows a PCIe network interface offering Physical Functions (PFs) to expose multiple virtual network interfaces, known as Virtual Functions (VFs). For example, the network interface p5p1 configured with 5 VFs looks like this from the operating system:

# ip link show p5p1
8: p5p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
 link/ether a0:36:9f:8f:3f:b8 brd ff:ff:ff:ff:ff:ff
 vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
 vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
 vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
 vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
 vf 4 MAC 00:00:00:00:00:00, spoof checking on, link-state auto

The VFs can be used by the OS or exposed to VMs. They look exactly like regular NICs:

# ip link show p5p1_1
18: p5p1_1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
 link/ether 72:1c:ef:b0:a8:d0 brd ff:ff:ff:ff:ff:ff

Only certain NICs support SR-IOV. In this example I’m using Intel X540-AT2 NICs, which use the ixgbe driver.

Linux configuration for SR-IOV

To use SR-IOV in OpenStack, we first need to make sure the operating system is configured to support it. There are two kernel parameters to set:

intel_iommu=on 
ixgbe.max_vfs=5

Note that ixgbe is specific to the Intel X540-AT2 NIC and you might be using a different one. You can also use a different number of VFs.

On RHEL-based systems the parameters are enabled as follows (a scripted sketch follows this list):

  1. Add the parameters to /etc/default/grub in GRUB_CMDLINE_LINUX
  2. Regenerate the config file with: grub2-mkconfig -o /boot/grub2/grub.cfg
  3. Rebuild the initramfs file with: dracut -f -v
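A minimal sketch of those three steps, assuming the default GRUB_CMDLINE_LINUX line in /etc/default/grub and a system that boots with GRUB 2:

# sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/GRUB_CMDLINE_LINUX="\1 intel_iommu=on ixgbe.max_vfs=5"/' /etc/default/grub
# grub2-mkconfig -o /boot/grub2/grub.cfg
# dracut -f -v

A reboot is needed for the new kernel command line to take effect.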

We also need to make sure that the admin state of the interface is UP:

# ip link show p5p1
# ip link set p5p1 up

And we make it persistent with the appropriate network interface configuration file in /etc/sysconfig/network-scripts/ifcfg-p5p1:

BOOTPROTO=none
DEVICE=p5p1
ONBOOT=yes

OpenStack configuration for SR-IOV

1. Neutron

SR-IOV works with the VLAN type driver in Neutron. We enable it in /etc/neutron/plugin.ini:

[ml2]
type_drivers=vxlan,vlan
tenant_network_types=vxlan,vlan

The mechanism driver is sriovnicswitch, which is configured in the same [ml2] section as follows:

mechanism_drivers=openvswitch,sriovnicswitch

Every time we create a new SR-IOV network in Neutron, it will be configured on a VLAN from a range that we need to specify. The physical network needs a name too. In this example the range is 1010 to 1020 and the physical network in Neutron will be called physnet_sriov:

[ml2_type_vlan]
network_vlan_ranges=physnet_sriov:1010:1020

Now, we configure SR-IOV settings in /etc/neutron/plugins/ml2/ml2_conf_sriov.ini. In the section [ml2_sriov] we need to tell the driver which NIC we will use:

[ml2_sriov]
supported_pci_vendor_devs=8086:1515

The numbers represent the vendor ID (8086) and the product ID (1515, which corresponds to the X540 Virtual Functions). To find your IDs you can use lspci -nn; note that the output below lists the Physical Functions, whose device ID is 1528:

# lspci -nn|grep X540-AT2
06:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
06:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)

By default the neutron-server service does not load the configuration in ml2_conf_sriov.ini, so we need to add it to its systemd service in /usr/lib/systemd/system/neutron-server.service:

[Service]
Type=notify
User=neutron
ExecStart=/usr/bin/neutron-server --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugin.ini --config-file /etc/neutron/plugins/ml2/ml2_conf_sriov.ini  --log-file /var/log/neutron/server.log 

And after that restart the service:

# systemctl restart neutron-server
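As an alternative to editing the packaged unit file, which a package update can overwrite, the same result can be achieved with a systemd drop-in; this is a sketch of that approach (the drop-in file name sriov.conf is arbitrary):

# mkdir -p /etc/systemd/system/neutron-server.service.d
# cat > /etc/systemd/system/neutron-server.service.d/sriov.conf << 'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/neutron-server --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugin.ini --config-file /etc/neutron/plugins/ml2/ml2_conf_sriov.ini --log-file /var/log/neutron/server.log
EOF
# systemctl daemon-reload
# systemctl restart neutron-server

The empty ExecStart= line clears the original command before the new one is defined.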

2. Nova scheduler

We need to tell the Nova scheduler about the SR-IOV so that it can schedule instances to compute nodes with SR-IOV support.

In the [DEFAULT] section of /etc/nova/nova.conf, add the PciPassthroughFilter to scheduler_default_filters and make sure scheduler_available_filters is set as follows:

[DEFAULT]
scheduler_available_filters=nova.scheduler.filters.all_filters
scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,CoreFilter,PciPassthroughFilter

And restart Nova scheduler:

# systemctl restart openstack-nova-scheduler

3. Nova compute

Nova compute needs to know which PFs can be used for SR-IOV so that VFs are exposed – actually via PCI-passthrough – to the instances. Also, it needs to know that when we create a network with Neutron specifying the physical network physnet_sriov  – configured before in Neutron with network_vlan_ranges – it will use the SR-IOV NIC.

That’s done by the config flag pci_passthrough_whitelist in /etc/nova/nova.conf:

pci_passthrough_whitelist = {"devname": "p5p1", "physical_network": "physnet_sriov"}

And simply restart Nova compute:

# systemctl restart openstack-nova-compute

4. SR-IOV NIC agent

We can optionally configure the SR-IOV NIC agent to manage the admin state of the NICs. When a VF NIC is used by an instance and then released, sometimes the NIC goes into DOWN state and the admin manually has to bring it back to UP state. There’s an article that describes how to do this in the official Red Hat documentation:

Enable the OpenStack Networking SR-IOV agent

Not all drivers work with the agent; that was the case with the Intel X540-AT2 NIC in this setup.

Creating OpenStack instances with a SR-IOV port

1. Create the network

We configured the physnet_sriov network in Neutron to use the SR-IOV interface p5p1. Let’s create the network and its subnet in Neutron now:

$ neutron net-create nfv_sriov --shared --provider:network_type vlan --provider:physical_network physnet_sriov
$ neutron subnet-create --name nfv_subnet_sriov --disable-dhcp --allocation-pool start=10.0.0.2,end=10.0.0.100 nfv_sriov 10.0.0.0/24

Remember we configured a VLAN range, so Neutron will choose a VLAN from it, but if we wanted to specify one we could use --provider:segmentation_id=1010 when creating the network.
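For example, pinning the network to VLAN 1010 at creation time would look like this (a sketch reusing the flags shown above):

$ neutron net-create nfv_sriov --shared --provider:network_type vlan --provider:physical_network physnet_sriov --provider:segmentation_id 1010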

2. Create the port

We’ll pass a port to the instance instead of the nfv_sriov network. To create it we do this:

$ neutron port-create nfv_sriov --name sriov-port --binding:vnic-type direct

Save the ID of the port as we’ll need it for creating the instance.
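If you are scripting this, the ID can be captured into a variable; a sketch, assuming the neutron client’s -f/-c output options are available in your version:

$ SRIOV_PORT_ID=$(neutron port-show sriov-port -f value -c id)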

3. Create the instance

We will now create an instance that uses two NICs: one created the standard way, on a private network which already existed in Neutron, and another one with the port created before. Assuming SRIOV_PORT_ID is the ID of the port and PRIVATE_NETWORK_ID is the ID of the pre-existing private network, this is how we create it:

$ openstack server create --flavor m1.small --nic port-id=$SRIOV_PORT_ID --nic net-id=$PRIVATE_NETWORK_ID --image centos7 sr-iov-instance1

If you use key pairs or other options, pass them to the openstack server create command too.

Log in to the instance as usual and you’ll notice two interfaces, eth0 and probably ens5, the latter being the SR-IOV NIC ready to be used.

Note as well that one of the VFs now has the same MAC address as the Neutron port we created above:

$ ip link show p5p1
8: p5p1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether a0:36:9f:8b:cd:80 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 2 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 3 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 4 MAC fa:16:3e:e0:3f:be, spoof checking on, link-state auto

PCI-Passthrough

If our VNF (or any virtualised application for that matter) requires direct access to a PCI device in the hypervisor, the PCI-Passthrough functionality in Libvirt/KVM and OpenStack allows us to do it. This is also common in High Performance Computing (HPC), not only with NICs but also, for example, to share GPUs with the instances.

In this example we’ll pass another NIC, p5p2 in the hypervisor, to the instance.

Linux configuration for PCI-Passthrough

First, just like before, make sure the admin state of the interface is UP:

# ip link show p5p2
# ip link set p5p2 up

And in /etc/sysconfig/network-scripts/ifcfg-p5p2:

BOOTPROTO=none
DEVICE=p5p2
ONBOOT=yes

The kernel options are the same ones we used above so nothing else is required at this point.

OpenStack configuration for PCI-Passthrough

Nova scheduler is already configured for PCI-Passthrough so only Nova compute needs to be made aware of the device we want to pass through.

1. Nova compute

We need a second entry in /etc/nova/nova.conf with pci_passthrough_whitelist. This will tell Nova compute that the interface p5p2 can be taken from the Linux OS and passed into an instance:

pci_passthrough_whitelist={ "devname": "p5p2" }

Now, we need to tag this interface with a name that will be used by Nova during the creation of the instance. For example we can call it my_PF. This is also done in the /etc/nova/nova.conf file:

pci_alias={ "vendor_id": "8086", "product_id": "1528", "name": "my_PF"}

Note that the vendor and product IDs are the same as before because both NICs are the same model. Again, you can get your PCI device IDs with lspci -nn.

2. Nova flavor

The way OpenStack has been designed to allow passing PCI devices to instances is via flavors. The tag we used before (my_PF) needs to be associated with a new flavor in this way:

$ openstack flavor create --ram 4096 --disk 100 --vcpus 2 m1.medium.pci_passthrough
$ openstack flavor set --property "pci_passthrough:alias"="my_PF:1" m1.medium.pci_passthrough

3. Create the instance

Now all we need to do is launch an instance using this new flavor, and Nova compute – and then Libvirt – will automatically configure it with the PCI device in it.

$ openstack server create --flavor m1.medium.pci_passthrough --nic net-id=$PRIVATE_NETWORK_ID --image centos7 pci-passthrough-instance1

Again, if you need more options, such as key pairs or adding a floating IP later to access the instance, you can do that too.

After that, the instance will again show an interface ens5, which is the p5p2 interface. In addition, p5p2 will disappear from the hypervisor’s operating system while the instance exists.
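To double-check from inside the guest, listing the PCI devices should show the passed-through Ethernet controller (a sketch; it assumes pciutils is installed in the image):

$ lspci -nn | grep -i ethernet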

OpenStack lab on your laptop with TripleO and director


This setup allows us to experiment with OSP director on our own laptop: play with the existing Heat templates, create new ones and understand how TripleO is used to install OpenStack, all from the comfort of a laptop.

VMware Fusion Professional is used here, but this will also work in VMware Workstation with virtually no changes, and in vSphere or VirtualBox with an equivalent setup.

This guide uses the official Red Hat documentation, in particular the Director Installation and Usage.

Architecture


Architecture diagram

Standard RHEL OSP 7 architecture with multiple networks, VLANs, bonding and provisioning from the Undercloud / director node via PXE.


Networks and VLANs

No special setup is needed to enable VLAN support in VMware Fusion; we just configure the VLANs and their networks in RHEL as usual.

DHCP and PXE

DHCP and PXE are provided by the Undercloud VM.

NAT

VMware Fusion’s NAT will be used to provide external access to the Controller and Compute VMs via the provisioning and external networks. The VMware Fusion NAT configuration below sets 10.0.0.2 on your Mac OS X as the default gateway for the VMs, and that IP will be used in the TripleO templates as the default gateway.

VMware Fusion Networks

The networks are configured in the VMware Fusion menu in Preferences, then Network.


The provisioning (PXE) network is set up in vmnet9, the rest of the networks in vmnet10.

The above describes the architecture of our laptop lab in VMware Fusion. Now, let’s implement it.

Step 1. Create 3 VMs in VMware Fusion


VM specifications

VM          vCPUs  Memory   Disk   NICs  Boot device
Undercloud  1      3000 MB  20 GB  2     Disk
Controller  2      3000 MB  20 GB  3     1st NIC
Compute     2      3000 MB  20 GB  3     1st NIC

Disk size

You may want to increase the disk size of the controller to be able to test more or larger images, and of the compute node to be able to run more or larger instances. 3 GB of memory is enough if you include a swap partition on the compute and controller nodes.

VMware network driver in .vmx file

Make sure the network driver in the three VMs is vmxnet3 and not e1000 so that RHEL shows all of the NICs:

$ grep ethernet[0-9].virtualDev Undercloud.vmwarevm/Undercloud.vmx
ethernet0.virtualDev = "vmxnet3"
ethernet1.virtualDev = "vmxnet3"

ethX vs enoX NIC names

By default, the OSP director images have the kernel boot option net.ifnames=0, which names the network interfaces ethX instead of enoX. This is why the Undercloud has interface names eno16777984 and eno33557248 (the default net.ifnames=1) while the Controller and Compute VMs have eth0, eth1 and eth2. This may change in RHEL OSP 7.2.

Undercloud VM Networks

This is the mapping of VMware networks to OS NICs. An OVS bridge, br-ctlplane, will be created automatically by the Undercloud installation.

Networks      VMware Network  RHEL NIC
External      vmnet10         eno33557248
Provisioning  vmnet9          eno16777984 / br-ctlplane

Copy the MAC addresses of the controller and compute VMs

Make a note of the MAC addresses of the first vNIC in the Controller and Compute VMs.


Step 2. Install the Undercloud


Install RHEL 7.1 in your preferred way in the Undercloud VM and then configure it as follows.

Network interfaces

First, set up the network. 192.168.100.10 will be the external IP in eno33557248 and 10.0.0.10 the provisioning IP in eno16777984.

In /etc/sysconfig/network-scripts/ifcfg-eno33557248

TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
NAME=eno33557248
DEVICE=eno33557248
ONBOOT=yes
IPADDR=192.168.100.10
PREFIX=24
GATEWAY=192.168.100.2
DNS1=192.168.100.2

And in /etc/sysconfig/network-scripts/ifcfg-eno16777984

TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
NAME=eno16777984
DEVICE=eno16777984
ONBOOT=yes
IPADDR=10.0.0.10
PREFIX=24

Once the network is set up, ssh from your Mac OS X to 192.168.100.10 rather than 10.0.0.10, because the latter will be reconfigured during the Undercloud installation to become the IP of the br-ctlplane bridge and you would lose access during the reconfiguration.

Undercloud hostname

The Undercloud needs a fully qualified domain name and it also needs to be present in the /etc/hosts file. For example:

# sudo hostnamectl set-hostname undercloud.osp.poc

And in /etc/hosts:

192.168.100.10 undercloud.osp.poc undercloud

Subscribe RHEL and Install the Undercloud Package

Now, subscribe the RHEL OS to Red Hat’s CDN and enable the required repos.
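A sketch of that registration follows; replace the pool ID with yours, and double-check the repository names against the Director Installation and Usage guide for your release, as the ones below are assumptions based on OSP 7:

# subscription-manager register
# subscription-manager attach --pool=<pool-id>
# subscription-manager repos --disable="*"
# subscription-manager repos --enable=rhel-7-server-rpms \
  --enable=rhel-7-server-extras-rpms \
  --enable=rhel-7-server-rh-common-rpms \
  --enable=rhel-7-server-openstack-7.0-rpms \
  --enable=rhel-7-server-openstack-7.0-director-rpms
# yum update -y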

Then, install the OpenStack client plug-in that will allow us to install the Undercloud:

# yum install -y python-rdomanager-oscplugin

Create the user stack

After that, create the stack user, which we will use to do the installation of the Undercloud and later the deployment and management of the Overcloud.
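A minimal sketch of that, assuming you give stack passwordless sudo as the director documentation does:

# useradd stack
# passwd stack
# echo "stack ALL=(root) NOPASSWD:ALL" | tee /etc/sudoers.d/stack
# chmod 0440 /etc/sudoers.d/stack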

Configure the director

The following undercloud.conf file is a working configuration for this guide, which is mostly self-explanatory.

For a reference of the configuration flags, there’s a documented sample in /usr/share/instack-undercloud/undercloud.conf.sample

Become the stack user and create the file in its home directory.

# su - stack
$ vi ~/undercloud.conf
[DEFAULT]
image_path = /home/stack/images
local_ip = 10.0.0.10/24
undercloud_public_vip = 10.0.0.11
undercloud_admin_vip = 10.0.0.12
local_interface = eno16777984
masquerade_network = 10.0.0.0/24
dhcp_start = 10.0.0.50
dhcp_end = 10.0.0.100
network_cidr = 10.0.0.0/24
network_gateway = 10.0.0.10
discovery_iprange = 10.0.0.100,10.0.0.120
undercloud_debug = true
[auth]

The masquerade_network config flag is optional, as VMware Fusion already provides NAT as explained above, but it might be needed if you use VirtualBox.

Finally, get the Undercloud installed

We run the installation as the stack user we created:

$ openstack undercloud install

Step 3. Set up the Overcloud deployment


Verify the undercloud is working

Load the environment first, then run the service list command:

$ . stackrc
$ openstack service list
+----------------------------------+------------+---------------+
| ID                               | Name       | Type          |
+----------------------------------+------------+---------------+
| 0208564b05b148ed9115f8ab0b04f960 | glance     | image         |
| 0df260095fde40c5ab838affcdbce524 | swift      | object-store  |
| 3b499d3319094de5a409d2c19a725ea8 | heat       | orchestration |
| 44d8d0095adf4f27ac814e1d4a1ef9cd | nova       | compute       |
| 84a1fe11ed464894b7efee7543ecd6d6 | neutron    | network       |
| c092025afc8d43388f67cb9773b1fb27 | keystone   | identity      |
| d1a85475321e4c3fa8796a235fd51773 | nova       | computev3     |
| d5e1ad8cca1549759ad1e936755f703b | ironic     | baremetal     |
| d90cb61c7583494fb1a2cffd590af8e8 | ceilometer | metering      |
| e71d47d820c8476291e60847af89f52f | tuskar     | management    |
+----------------------------------+------------+---------------+

Configure the fake_pxe Ironic driver

Ironic doesn’t have a driver to power VMware Fusion VMs on and off, so we will do that manually. We need to configure the fake_pxe driver for this.

Edit /etc/ironic/ironic.conf and add it:

enabled_drivers = pxe_ipmitool,pxe_ssh,pxe_drac,fake_pxe

Then restart ironic-conductor and verify the driver is loaded:

$ sudo systemctl restart openstack-ironic-conductor
$ ironic driver-list
+---------------------+--------------------+
| Supported driver(s) | Active host(s)     |
+---------------------+--------------------+
| fake_pxe            | undercloud.osp.poc |
| pxe_drac            | undercloud.osp.poc |
| pxe_ipmitool        | undercloud.osp.poc |
| pxe_ssh             | undercloud.osp.poc |
+---------------------+--------------------+

Upload the images into the Undercloud’s Glance

Download the images that will be used to deploy the OpenStack nodes to the directory specified in the image_path in the undercloud.conf file, in our example /home/stack/images. Get the images and untar them as described here. Then upload them into Glance in the Undercloud:

$ openstack overcloud image upload --image-path /home/stack/images/

Define the VMs into the Undercloud’s Ironic

TripleO needs to know about the nodes, in our case the VMware Fusion VMs. We describe them in the file instackenv.json which we’ll create in the home directory of the stack user.

Notice that here is where we use the MAC addresses we took from the two VMs.

{
 "nodes": [
 {
   "arch": "x86_64",
   "cpu": "2",
   "disk": "20",
   "mac": [
   "00:0c:29:8f:1e:7b"
   ],
   "memory": "3000",
   "pm_type": "fake_pxe"
 },
 {
   "arch": "x86_64",
   "cpu": "2",
   "disk": "20",
   "mac": [
   "00:0C:29:41:0F:4E"
   ],
   "memory": "3000",
   "pm_type": "fake_pxe"
 }
 ]
}

Import them into the Undercloud:

$ openstack baremetal import --json instackenv.json

The command above adds the nodes to Ironic:

$ ironic node-list
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| UUID                                 | Name | Instance UUID                        | Power State | Provision State | Maintenance |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+
| 111cf49a-eb9e-421d-af05-35ab0d74c5d6 | None | 941bbdf9-43c0-442e-8b65-0bd531322509 | power off   | available       | False       |
| e579df9f-528f-4d14-94bc-07b2af4b252f | None | f1bd425b-a4d9-4eca-8bc4-ee31b300e381 | power off   | available       | False       |
+--------------------------------------+------+--------------------------------------+-------------+-----------------+-------------+

To finish the registration of the nodes we run this command:

$ openstack baremetal configure boot

Discover the nodes

At this point we are ready to discover the nodes, i.e. have Ironic power them on, boot them with the discovery image uploaded before and shut them down once the relevant hardware information has been saved in the node metadata in Ironic. This process is called introspection.

Note that as we use the fake_pxe driver, Ironic won’t power on the VMs, so we do it manually in VMware Fusion. We wait until the output of ironic node-list shows that the power state is on and then we run this command:

$ openstack baremetal introspection bulk start
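While toggling the VMs on by hand, something as simple as this (a sketch) keeps an eye on the power and provisioning states reported by Ironic:

$ watch -n 10 ironic node-list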

Assign the roles to the nodes in Ironic

There are two roles in this example, compute and control. We will assign them manually with Ironic.

$ ironic node-update 111cf49a-eb9e-421d-af05-35ab0d74c5d6 add properties/capabilities='profile:compute,boot_option:local'
$ ironic node-update e579df9f-528f-4d14-94bc-07b2af4b252f add properties/capabilities='profile:control,boot_option:local'

Create the flavors in Nova and associate them with the roles in Ironic

This consists of creating flavors that match the specs of the VMs and then adding the control and compute profiles to the corresponding flavors so they match the capabilities set in Ironic in the previous step. TripleO also requires a flavor called baremetal.

$ openstack flavor create --id auto --ram 3000 --disk 17 --vcpus 2 --swap 2000 compute
$ openstack flavor create --id auto --ram 3000 --disk 19 --vcpus 2 --swap 1500 control

TripleO also needs a flavor called baremetal (which we won’t use):

$ openstack flavor create --id auto --ram 3000 --disk 19 --vcpus 2 baremetal

Notice the disk size is 1 GB smaller than the VM’s disk. This is a precaution to avoid a “No valid host found” error when deploying with Ironic, which is sometimes a bit too sensitive.

Also, notice that I added swap because 3 GB of memory is not enough and the out of memory killer could be triggered otherwise.

Now we make the flavors match the capabilities we set on the Ironic nodes in the previous step:

$ openstack flavor set --property "cpu_arch"="x86_64" --property "capabilities:boot_option"="local" --property "capabilities:profile"="control" control
$ openstack flavor set --property "cpu_arch"="x86_64" --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute" compute

 

Step 4. Create the TripleO templates


Get the TripleO templates

Copy the TripleO heat templates to the home directory of the stack user.

$ mkdir ~/templates
$ cp -r /usr/share/openstack-tripleo-heat-templates/ ~/templates/

Create the network definitions

These are our network definitions:

Network             Subnet            VLAN
Provisioning        10.0.0.0/24       VMware native
Internal API        172.16.0.0/24     201
Tenant              172.17.0.0/24     204
Storage             172.18.0.0/24     202
Storage Management  172.19.0.0/24     203
External            192.168.100.0/24  VMware native

To allow creating dedicated networks for specific services we describe them in a Heat template that we can call network-environment.yaml.

$ vi ~/templates/network-environment.yaml
resource_registry:
 OS::TripleO::Compute::Net::SoftwareConfig: /home/stack/templates/nic-configs/compute.yaml
 OS::TripleO::Controller::Net::SoftwareConfig: /home/stack/templates/nic-configs/controller.yaml

parameter_defaults:

 # The IP address of the EC2 metadata server. Generally the IP of the Undercloud
 EC2MetadataIp: 10.0.0.10
 # Gateway router for the provisioning network (or Undercloud IP)
 ControlPlaneDefaultRoute: 10.0.0.2
 DnsServers: ["10.0.0.2"]

 InternalApiNetCidr: 172.16.0.0/24
 TenantNetCidr: 172.17.0.0/24
 StorageNetCidr: 172.18.0.0/24
 StorageMgmtNetCidr: 172.19.0.0/24
 ExternalNetCidr: 192.168.100.0/24

 # Leave room for floating IPs in the External allocation pool
 ExternalAllocationPools: [{'start': '192.168.100.100', 'end': '192.168.100.200'}]
 InternalApiAllocationPools: [{'start': '172.16.0.10', 'end': '172.16.0.200'}]
 TenantAllocationPools: [{'start': '172.17.0.10', 'end': '172.17.0.200'}]
 StorageAllocationPools: [{'start': '172.18.0.10', 'end': '172.18.0.200'}]
 StorageMgmtAllocationPools: [{'start': '172.19.0.10', 'end': '172.19.0.200'}]

 InternalApiNetworkVlanID: 201
 StorageNetworkVlanID: 202
 StorageMgmtNetworkVlanID: 203
 TenantNetworkVlanID: 204

 # ExternalNetworkVlanID: 100
 # Set to the router gateway on the external network
 ExternalInterfaceDefaultRoute: 192.168.100.2
 # Set to "br-ex" if using floating IPs on native VLAN on bridge br-ex
 NeutronExternalNetworkBridge: "br-ex"

 # Customize bonding options if required
 BondInterfaceOvsOptions: "bond_mode=active-backup"

More information about this template can be found here.

Configure the NICs of the VMs

We have examples of NIC configurations for multiple networks and bonding in /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/

We will use them as a template to define the Controller and Compute NIC setup.

$ mkdir ~/templates/nic-configs/
$ cp /usr/share/openstack-tripleo-heat-templates/network/config/bond-with-vlans/* ~/templates/nic-configs/

Notice that they are called from the previous template network-environment.yaml.

Controller NICs

We want this setup in the controller:

Bonded Interface  Bond Slaves  Bond Mode
bond1             eth1, eth2   active-backup

Networks            VMware Network  RHEL NIC
Provisioning        vmnet9          eth0
External            vmnet10         bond1 / br-ex
Internal            vmnet10         bond1 / vlan201
Tenant              vmnet10         bond1 / vlan204
Storage             vmnet10         bond1 / vlan202
Storage Management  vmnet10         bond1 / vlan203

We only need to modify the resources section of the ~/templates/nic-configs/controller.yaml to match the configuration in the table above:

$ vi ~/templates/nic-configs/controller.yaml
[...]
resources:
  OsNetConfigImpl:
    type: OS::Heat::StructuredConfig
    properties:
      group: os-apply-config
      config:
        os_net_config:
          network_config:
            -
              type: interface
              name: nic1
              use_dhcp: false
              addresses:
                -
                  ip_netmask:
                    list_join:
                      - '/'
                      - - {get_param: ControlPlaneIp}
                        - {get_param: ControlPlaneSubnetCidr}
              routes:
                -
                  ip_netmask: 169.254.169.254/32
                  next_hop: {get_param: EC2MetadataIp}
            -
              type: ovs_bridge
              name: {get_input: bridge_name}
              addresses:
                - ip_netmask: {get_param: ExternalIpSubnet}
              routes:
                - ip_netmask: 0.0.0.0/0
                  next_hop: {get_param: ExternalInterfaceDefaultRoute}
              dns_servers: {get_param: DnsServers}
              members:
                -
                  type: ovs_bond
                  name: bond1
                  ovs_options: {get_param: BondInterfaceOvsOptions}
                  members:
                    -
                      type: interface
                      name: nic2
                      primary: true
                    -
                      type: interface
                      name: nic3
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: InternalApiNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: InternalApiIpSubnet}
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: StorageNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: StorageIpSubnet}
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: StorageMgmtNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: StorageMgmtIpSubnet}
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: TenantNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: TenantIpSubnet}

outputs:
  OS::stack_id:
    description: The OsNetConfigImpl resource.
    value: {get_resource: OsNetConfigImpl}

Compute NICs

In the compute node we want this setup:

Bonded Interface  Bond Slaves  Bond Mode
bond1             eth1, eth2   active-backup

Networks      VMware Network  RHEL NIC
Provisioning  vmnet9          eth0
Internal      vmnet10         bond1 / vlan201
Tenant        vmnet10         bond1 / vlan204
Storage       vmnet10         bond1 / vlan202

$ vi ~/templates/nic-configs/compute.yaml
[...]
resources:
  OsNetConfigImpl:
    type: OS::Heat::StructuredConfig
    properties:
      group: os-apply-config
      config:
        os_net_config:
          network_config:
            -
              type: interface
              name: nic1
              use_dhcp: false
              dns_servers: {get_param: DnsServers}
              addresses:
                -
                  ip_netmask:
                    list_join:
                      - '/'
                      - - {get_param: ControlPlaneIp}
                        - {get_param: ControlPlaneSubnetCidr}
              routes:
                -
                  ip_netmask: 169.254.169.254/32
                  next_hop: {get_param: EC2MetadataIp}
                -
                  default: true
                  next_hop: {get_param: ControlPlaneDefaultRoute}
            -
              type: ovs_bridge
              name: {get_input: bridge_name}
              members:
                -
                  type: ovs_bond
                  name: bond1
                  ovs_options: {get_param: BondInterfaceOvsOptions}
                  members:
                    -
                      type: interface
                      name: nic2
                      primary: true
                    -
                      type: interface
                      name: nic3
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: InternalApiNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: InternalApiIpSubnet}
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: StorageNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: StorageIpSubnet}
                -
                  type: vlan
                  device: bond1
                  vlan_id: {get_param: TenantNetworkVlanID}
                  addresses:
                  -
                    ip_netmask: {get_param: TenantIpSubnet}
outputs:
  OS::stack_id:
    description: The OsNetConfigImpl resource.
    value: {get_resource: OsNetConfigImpl}

Enable Swap

Enabling the swap partition is done from within the OS. Ironic only creates the partition as instructed in the flavor. This can be done with the templates that allow running first boot scripts via cloud-init.

First, the template that registers the cloud-init userdata, /home/stack/templates/firstboot/firstboot.yaml:

resource_registry:
 OS::TripleO::NodeUserData: /home/stack/templates/firstboot/userdata.yaml

Then, the actual script that enables swap, /home/stack/templates/firstboot/userdata.yaml:

heat_template_version: 2014-10-16

resources:
  userdata:
    type: OS::Heat::MultipartMime
    properties:
      parts:
        - config: {get_resource: swapon_config}

  swapon_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config: |
        #!/bin/bash
        swap_device=$(sudo fdisk -l | grep swap | awk '{print $1}')
        if [[ $swap_device && ${swap_device} ]]; then
          rc_local="/etc/rc.d/rc.local"
          echo "swapon $swap_device " >> $rc_local
          chmod 755 $rc_local
          swapon $swap_device
        fi

outputs:
  OS::stack_id:
    value: {get_resource: userdata}

 

Step 5. Deploy the Overcloud


Summary

We have everything we need to deploy now:

  • The Undercloud configured.
  • Flavors for the compute and controller nodes.
  •  Images for the discovery and deployment of the nodes.
  • Templates defining the networks in OpenStack.
  • Templates defining the nodes’ NICs configuration.
  • A first boot script used to enable swap.

We will use all this information when running the deploy command:

$ openstack overcloud deploy \
--templates templates/openstack-tripleo-heat-templates/ \
-e templates/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e templates/network-environment.yaml \
-e templates/firstboot/firstboot.yaml \
--control-flavor control \
--compute-flavor compute \
--neutron-tunnel-types vxlan --neutron-network-type vxlan \
--ntp-server clock.redhat.com

After a successful deployment you’ll see this:

Deploying templates in the directory /home/stack/templates/openstack-tripleo-heat-templates
[...]
Overcloud Endpoint: http://192.168.100.100:5000/v2.0/
Overcloud Deployed

An overcloudrc file with the environment is created for you, so you can start using the new OpenStack environment deployed on your laptop.

Step 6. Start using the Overcloud


Now we are ready to start testing our newly deployed platform.

$ . overcloudrc
$ openstack service list
+----------------------------------+------------+---------------+
| ID                               | Name       | Type          |
+----------------------------------+------------+---------------+
| 043524ae126b4f23bd3fb7826a557566 | glance     | image         |
| 3d5c8d48d30b41e9853659ce840ae4fe | neutron    | network       |
| 418d4f34abe449aa8f07dac77c078e9c | nova       | computev3     |
| 43480fab74fd4fd480fdefc56eecfe83 | cinderv2   | volumev2      |
| 4e01d978a648474db6d5b160cd0a71e1 | nova       | compute       |
| 6357f4122d6d41b986dab40d6fb471e3 | cinder     | volume        |
| a49119e0fd9f43c0895142e3b3f3394a | keystone   | identity      |
| b808ae83589646e6b7033f2b150e7623 | horizon    | dashboard     |
| d4c9383fa9e94daf8c74419b0b18fd6e | heat       | orchestration |
| db556409857d4d24872cdc1b718eee8f | swift      | object-store  |
| ddc3c82097d24f478edfc89b46310522 | ceilometer | metering      |
+----------------------------------+------------+---------------+

Understanding OpenStack Heat Auto Scaling


OpenStack Heat can deploy and configure multiple instances in one command using resources we have in OpenStack. That’s called a Heat Stack.

Heat will create instances from images using existing flavors and networks. It can configure LBaaS and provide VIPs for our load-balanced instances. It can also use the metadata service to inject files, scripts or variables after instance deployment. It can even use Ceilometer to create alarms based on instance CPU usage and associate actions like spinning up or terminating instances based on CPU load.

All of the above is what Heat uses to provide autoscaling capabilities to our applications. In this post I explain how to do this with RHEL 7 instances. If you want to reproduce it on another OS, it’s as simple as changing how the example webapp packages are installed.

Steps to have Heat autoscaling

1. Create a WordPress repo in a RHEL 7 box. Make sure it’s a basic installation so that all the dependencies are downloaded along with WordPress:

# Install EPEL and Remi repos first, then create a repo
yum -y install http://dl.fedoraproject.org/pub/epel/beta/7/x86_64/epel-release-7-0.2.noarch.rpm
yum -y install http://rpms.famillecollet.com/enterprise/remi-release-7.rpm
yum -y --enablerepo=remi install wordpress --downloadonly --downloaddir=/var/www/html/repos/wordpress
createrepo /var/www/html/repos/wordpress

2. Create a repo for rhel-7-server-rpms with something like:

# First register to Red Hat's CDN with subscription-manager register
# Then subscribe to the channels to be synchronised
reposync -p /var/www/html/repos/ rhel-7-server-rpms
createrepo /var/www/html/repos/rhel-7-server-rpms
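
The instances will later pull packages from these repos over HTTP on port 81 (see the baseurl values further down), so the repo box needs a web server listening on that port and serving /var/www/html. This is just a minimal sketch of one way to do it on RHEL 7; adjust the port, paths and file names to your environment:

# Minimal sketch: serve /var/www/html (and the repos under it) over HTTP on
# port 81, the port used in the baseurls later in this post
yum -y install httpd
# The conf.d file name is arbitrary; it only needs to add the extra Listen port
echo "Listen 81" > /etc/httpd/conf.d/repos-port.conf
systemctl enable httpd
systemctl start httpd
# Quick check that the WordPress repo metadata is reachable
curl -sI http://localhost:81/repos/wordpress/repodata/repomd.xml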

3. Download the Heat template, which consists of two files: autoscaling.yaml and lb_server.yaml

Note: The template autoscaling.yaml uses lb_server.yaml as a nested stack, and it can’t be deployed from Horizon right now due to a bug. It works fine from the command line as described below.

[Update] Note: I made it work in Horizon by:

  • Publishing the two templates on a web server.
  • Modifying the autoscaling.yaml template published in the web server to call the nested template like this:
type: http://172.16.0.129:81/repos/heat-templates/lb_server.yaml

4. Modify the Heat template so that the first thing the script passed by Heat via user_data does, when cloud-init executes it, is to configure the WordPress repos.

a. Right before yum -y install httpd wordpress, add the repos, making it look like this:

 
[...]     
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/bash -v
            #Add local repos for wordpress and rhel7
            cat << EOF >> /etc/yum.repos.d/rhel.repo
            [rhel-7-server-rpms]
            name=rhel-7-server-rpms
            baseurl=http://172.16.0.129:81/repos/rhel-7-server-rpms
            gpgcheck=0
            enabled=1

            [wordpress]
            name=wordpress
            baseurl=http://172.16.0.129:81/repos/wordpress
            gpgcheck=0
            enabled=1
            EOF

            yum -y install httpd wordpress
[...]

b. And right before yum -y install mariadb mariadb-server do exactly the same.

Note: I’m assuming that your two repos are accessible via http from the instances.

Note: All of these steps are optional. If your instances pull packages directly from the Internet and/or another repository you can skip or adapt this to your environment.

5. Take note of:

  • The glance image you will use: nova image-list
    • Note: I’m using the RHEL 7 image available in the Red Hat Customer Portal  rhel-guest-image-7.0-20140618.1.x86_64.qcow2
  • The ssh key pair you want to use: nova keypair-list
  • The flavor you want to use with them: nova flavor-list
  • The subnet where the instances of the Heat stack will be launched (see the commands below).
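
For reference, these are the commands I use to gather those values (neutron subnet-list gives the subnet ID):

$ nova image-list
$ nova keypair-list
$ nova flavor-list
$ neutron subnet-list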

6. Create the Heat stack:

heat stack-create AutoscalingWordpress -f autoscaling.yaml \
-P image=rhel7 \
-P key=ramon \
-P flavor=m1.small \
-P database_flavor=m1.small \
-P subnet_id=44908b41-ce16-4f8c-ba6c-9bb4303e6d3f \
-P database_name=wordpress \
-P database_user=wordpress

Note: Here we use all the parameters from the template downloaded before. They are found in the parameters: section of the YAML file. Alternatively, we could set a default: value for each parameter within the template.
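
If you want to double-check which parameters the template expects before launching the stack, you can optionally validate it first; heat template-validate prints the parameters it parsed:

heat template-validate -f autoscaling.yaml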

Now, what I do right after is a tail -f /var/log/heat/*log on the controller node, where Heat is installed, just to make sure everything is fine with the creation of the Heat stack.
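
You can also follow the progress of the stack and of the instances it spawns from the command line:

heat stack-list
heat event-list AutoscalingWordpress
nova list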

7. Verify Heat created a LBaaS pool and VIP:

[root@racedo-rhel7-1 heat(keystone_demo)]# neutron lb-pool-list
+--------------------------------------+----------------------------------------+----------+-------------+----------+----------------+--------+
| id                                   | name                                   | provider | lb_method   | protocol | admin_state_up | status |
+--------------------------------------+----------------------------------------+----------+-------------+----------+----------------+--------+
| 78f02e89-aa07-40fd-917b-1481175b43e8 | AutoscalingWordpress-pool-46zb7elgzamo | haproxy  | ROUND_ROBIN | HTTP     | True           | ACTIVE |
+--------------------------------------+----------------------------------------+----------+-------------+----------+----------------+--------+
[root@racedo-rhel7-1 heat(keystone_demo)]# neutron lb-vip-list
+--------------------------------------+----------+-----------+----------+----------------+--------+
| id                                   | name     | address   | protocol | admin_state_up | status |
+--------------------------------------+----------+-----------+----------+----------------+--------+
| 8da663cb-43d7-49af-9343-360431e02655 | pool.vip | 10.1.1.14 | HTTP     | True           | ACTIVE |
+--------------------------------------+----------+-----------+----------+----------------+--------+

8. Associate a floating IP to the VIP: neutron floatingip-associate FLOATING_IP_ID VIP_NEUTRON_PORT_ID. In my case the instances are on a tenant network, so I need a floating IP:

[root@racedo-rhel7-1 heat(keystone_demo)]# neutron lb-vip-show pool.vip | grep port_id
| port_id             | 13c01599-23f1-4e1e-96d9-72f2775e6183 |
[root@racedo-rhel7-1 heat(keystone_demo)]# neutron floatingip-list
+--------------------------------------+------------------+---------------------+--------------------------------------+
| id                                   | fixed_ip_address | floating_ip_address | port_id                              |
+--------------------------------------+------------------+---------------------+--------------------------------------+
| 0525f959-5213-4291-a1f0-a2ea2b40e11c |                  | 172.16.0.53         |                                      |
| 09f1bdc9-228b-4057-a5d1-3327ccc0bfc8 |                  | 172.16.0.54         |                                      |
| 5538961a-3423-46a3-9744-aba699e722c5 |                  | 172.16.0.52         |                                      |
+--------------------------------------+------------------+---------------------+--------------------------------------+
[root@racedo-rhel7-1 heat(keystone_demo)]# neutron floatingip-associate 0525f959-5213-4291-a1f0-a2ea2b40e11c 13c01599-23f1-4e1e-96d9-72f2775e6183

Note: This is optional if your instances are connected to a provider network where you can access directly instead of to a tenant network like in this example.

9. Verify that Heat created the two Ceilometer alarms: one to scale out on high CPU usage and another one to scale down on low CPU:

[root@racedo-rhel7-1 heat(keystone_demo)]# ceilometer alarm-list
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+
| Alarm ID                             | Name                                             | State | Enabled | Continuous | Alarm condition                 | Time constraints |
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+
| 1610f404-8df7-46ed-b131-6d3797fc9e4e | AutoscalingWordpress-cpu_alarm_low-vinrbn2rdjpx  | alarm | True    | False      | cpu_util < 15.0 during 1 x 600s | None             |
| 53c124bd-db57-4909-af55-009f5a635937 | AutoscalingWordpress-cpu_alarm_high-42dc5funjeds | ok    | True    | False      | cpu_util > 50.0 during 1 x 60s  | None             |
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+

10. Verify you can access the WordPress site using the VIP:

Wordpress

11. Now ssh into the WordPress web instance (not the DB one) and generate some CPU load; a couple of dd commands will suffice. Add a floating IP to the instance first if necessary.

[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ dd if=/dev/zero of=/dev/null  &  
[1] 908  
[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ dd if=/dev/zero of=/dev/null  &  
[2] 909  
[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ dd if=/dev/zero of=/dev/null  &  
[3] 910  
[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ dd if=/dev/zero of=/dev/null  &  
[4] 911  
[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ dd if=/dev/zero of=/dev/null  &  
[5] 912  
[cloud-user@au-g6hl-ye4uglqb5t7r-ylpghgnzyck3-server-nobvg6ftaoe7 ~]$ top  
top - 11:01:05 up 10 min,  1 user,  load average: 6.81, 1.12, 0.71  
Tasks:  90 total,  8 running,  82 sleeping,  0 stopped,  0 zombie  
%Cpu(s): 24.3 us, 75.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st  
KiB Mem:  1018312 total,  235068 used,  783244 free,      688 buffers  
KiB Swap:        0 total,        0 used,        0 free.    95480 cached Mem  
  
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM    TIME+ COMMAND  
  908 cloud-u+  20  0  107920    620    528 R 15.8  0.1  0:09.80 dd  
  909 cloud-u+  20  0  107920    616    528 R 15.5  0.1  0:08.49 dd  
  911 cloud-u+  20  0  107920    620    528 R 15.5  0.1  0:07.88 dd  
  912 cloud-u+  20  0  107920    616    528 R 15.5  0.1  0:07.71 dd  
  910 cloud-u+  20  0  107920    620    528 R 15.2  0.1  0:08.11 dd  

12. Observe how Ceilometer triggers an alarm (State goes to alarm) and how a new instance is launched:

[root@racedo-rhel7-1 heat(keystone_demo)]# ceilometer alarm-list
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+
| Alarm ID                             | Name                                             | State | Enabled | Continuous | Alarm condition                 | Time constraints |
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+
| 1610f404-8df7-46ed-b131-6d3797fc9e4e | AutoscalingWordpress-cpu_alarm_low-vinrbn2rdjpx  | ok    | True    | False      | cpu_util < 15.0 during 1 x 600s | None             |
| 53c124bd-db57-4909-af55-009f5a635937 | AutoscalingWordpress-cpu_alarm_high-42dc5funjeds | alarm | True    | False      | cpu_util > 50.0 during 1 x 60s  | None             |
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+

Wordpress Autoscaling

13. Kill the dd processes (you can do it directly from top: press k and kill each of them).

14. Wait about 10 minutes, which is the evaluation period of the scale-down alarm in our template (1 x 600s). The state of the alarm in Ceilometer will go to alarm just like before, but this time due to the lack of CPU load, and you’ll see how one of the two instances is deleted:

[root@racedo-rhel7-1 heat(keystone_demo)]# ceilometer alarm-list  
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+  
| Alarm ID                             | Name                                             | State | Enabled | Continuous | Alarm condition                 | Time constraints |  
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+  
| 1610f404-8df7-46ed-b131-6d3797fc9e4e | AutoscalingWordpress-cpu_alarm_low-vinrbn2rdjpx  | alarm | True    | False      | cpu_util < 15.0 during 1 x 600s | None             |  
| 53c124bd-db57-4909-af55-009f5a635937 | AutoscalingWordpress-cpu_alarm_high-42dc5funjeds | ok    | True    | False      | cpu_util > 50.0 during 1 x 60s  | None             |  
+--------------------------------------+--------------------------------------------------+-------+---------+------------+---------------------------------+------------------+  

That’s all.

Multiple Private Networks with Open vSwitch GRE Tunnels and Libvirt

 

Libvirt and GRE Tunnels

GRE tunnels are extremely useful for many reasons. One use case is being able to design and test an infrastructure requiring multiple networks on a typical home lab with limited hardware, such as laptops and desktops with only one Ethernet card.

As an example, to design an OpenStack infrastructure for a production environment with RDO or Red Hat Enterprise Linux OpenStack Platform (RHEL OSP), three separate networks are recommended.

These networks will have services such as DHCP (even multiple DHCP servers if eventually needed) as they will be completely isolated from each other. Testing multiple VLANs or trunking is also possible with this setup.

The diagram above should be almost self-explanatory and describes this setup with Open vSwitch, GRE tunnels and Libvirt.

Step by Step on CentOS 6.5

1. Install CentOS 6.5 choosing the Basic Server option

2. Install the EPEL and RDO repos, which provide Open vSwitch and an iproute build with network namespace support:

# yum install http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
# yum install http://repos.fedorapeople.org/repos/openstack/openstack-icehouse/rdo-release-icehouse-3.noarch.rpm

3. Install Libvirt, Open vSwitch and virt-install:

# yum install libvirt openvswitch python-virtinst

4. Create the bridge that will be associated to eth0:

# ovs-vsctl add-br br-eth0

5. Set up your network on the br-eth0 bridge with the configuration you had on eth0 and change the eth0 network settings as follows (with your own network settings):

# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
MTU=1546
# cat /etc/sysconfig/network-scripts/ifcfg-br-eth0
DEVICE=br-eth0
TYPE=OVSBridge
ONBOOT=yes
BOOTPROTO=none
IPADDR0=192.168.2.1
PREFIX0=24
DNS1=192.168.2.254

Notice the MTU setting above. This is very important, as GRE adds encapsulation overhead. There are two options: increasing the MTU on the hosts, as in this example, or decreasing the MTU in the guests if your NIC doesn’t support MTUs larger than 1500 bytes.
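
Once both hosts are configured, a quick way to confirm the raised MTU works between them is to ping the other host with the don’t-fragment flag and a payload that produces a full 1546-byte packet (1518 bytes of ICMP payload plus 28 bytes of ICMP/IP headers):

# ping -M do -s 1518 192.168.2.2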

6. Add eth0 to br-eth0 and restart the network to pick up the changes made in the previous step:

# ovs-vsctl add-port br-eth0 eth0 && service network restart

7. Make sure your network still works as it did before the changes above

8. Assuming this host has the IP 192.168.2.1 and you have two other hosts where you will do this same (or a compatible) setup with the IPs 192.168.2.2 and 192.168.2.3, create the internal OVS bridge br-int0 and set the GRE tunnel endpoints gre0 and gre1 (note that the diagram above has only two hosts, but you can add more hosts with an identical setup):

# ovs-vsctl add-br br-int0
# ovs-vsctl add-port br-int0 gre0 -- set interface gre0 type=gre options:remote_ip=192.168.2.2
# ovs-vsctl add-port br-int0 gre1 -- set interface gre1 type=gre options:remote_ip=192.168.2.3

Notice there is another way to set up GRE tunnels using /etc/sysconfig/network-scripts/ in CentOS/RHEL but the method explained here works in any Linux distro and is equally persistent. Choose whichever you find appropriate.

9. Enable STP (needed for more than 2 hosts):

# ovs-vsctl set bridge br-int0 stp_enable=true
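
At this point you can confirm that both bridges exist and that the GRE ports are attached to br-int0 with their remote_ip options:

# ovs-vsctl show
# ovs-vsctl list-ports br-int0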

10. Create a file called libvirt-vlans.xml with the definition of the Libvirt network that will use the Open vSwitch bridge br-int0 (and the GRE tunnels) we just created. Check the diagram above for reference:

<network>
  <name>ovs-network</name>
  <forward mode='bridge'/>
  <bridge name='br-int0'/>
  <virtualport type='openvswitch'/>
  <portgroup name='no-vlan' default='yes'>
  </portgroup>
  <portgroup name='vlan-100'>
    <vlan>
      <tag id='100'/>
    </vlan>
  </portgroup>
  <portgroup name='vlan-200'>
    <vlan>
      <tag id='200'/>
    </vlan>
  </portgroup>
</network>

11. Optionally remove the default network that Libvirt creates, then (this part is mandatory) define and start the network from the previous step:

# virsh net-destroy default
# virsh net-autostart --disable default
# virsh net-undefine default
# virsh net-define libvirt-vlans.xml
# virsh net-autostart ovs-network
# virsh net-start ovs-network

12. Create a Libvirt storage pool where your VMs will be created (needed to use the qcow2 disk format). I chose /home/VMs/pool, but it can be anywhere you find appropriate:

# virsh pool-define-as --name VMs-pool --type dir --target /home/VMs/pool/
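
Depending on your libvirt version, a pool that is only defined is not active yet; you may also need to build, start and autostart it before virt-install can create disks in it:

# virsh pool-build VMs-pool
# virsh pool-start VMs-pool
# virsh pool-autostart VMs-pool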

13. Assuming you are installing a CentOS VM and that the ISO is located at /home/VMs/ISOs/CentOS-6.5-x86_64-bin-DVD1.iso, create a VM named foreman (or any name you like) with virt-install:

# virt-install \
--name foreman \
--ram 1024 \
--vcpus=1 \
--disk size=20,format=qcow2,pool=VMs-pool \
--nonetworks \
--cdrom /home/VMs/ISOs/CentOS-6.5-x86_64-bin-DVD1.iso \
--graphics vnc,listen=0.0.0.0,keymap=en_gb --noautoconsole --hvm \
--os-variant rhel6

14. Use a VNC client to access the screen of the VM during the installation. Finish the installation and shut down the VM.

15. Edit the VM with virsh edit foreman (following the name used in the example above) to add the three networks created before. At the bottom of the VM definition, just before </devices>, add the following:


<interface type='network'>
  <source network='ovs-network' portgroup='no-vlan'/>
  <model type='virtio'/>
</interface>
<interface type='network'>
  <source network='ovs-network' portgroup='vlan-100'/>
  <model type='virtio'/>
</interface>
<interface type='network'>
  <source network='ovs-network' portgroup='vlan-200'/>
  <model type='virtio'/>
</interface>

Now you can start your VM with virsh start foreman and set up the network on any or all of the three interfaces. Repeat the same process on another host and VM and you are good to go: you can install something like Foreman and OpenStack without needing more than one network interface per host.
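
Inside the guest, the three interfaces will typically show up as eth0, eth1 and eth2, mapped to the no-vlan, vlan-100 and vlan-200 portgroups in the order they were added. As a quick connectivity test (the addresses below are made up for the example), give each interface an address on a different subnet and ping the matching interface of the VM on the other host:

# ip addr add 10.0.0.11/24 dev eth0      # no-vlan portgroup (example address)
# ip addr add 10.0.100.11/24 dev eth1    # vlan-100 portgroup (example address)
# ip addr add 10.0.200.11/24 dev eth2    # vlan-200 portgroup (example address)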

Resizing OpenStack Volumes

Hard Drive

Resizing a Volume with Cinder in Havana

Cinder gained the extend functionality in Havana, which allows volumes to be resized easily. It works as expected at the volume level, but at the OS level I have found it less reliable when running resize2fs on the extended volume. Maybe I haven’t done enough tests yet, but in any case the method below works in both Havana and Grizzly.

Resizing a Volume in Grizzly and Havana

The following method works in both Grizzly and Havana. It can be entirely done by the tenant with the nova command (the cinder client is not needed).

1. Identify the volume to be resized

$ nova volume-list
+--------------------------------------+-----------+--------------+------+-------------+--------------------------------------+
| ID                                   | Status    | Display Name | Size | Volume Type | Attached to                          |
+--------------------------------------+-----------+--------------+------+-------------+--------------------------------------+
| 44bcd404-8a6e-41d8-9d56-2ac4e0c1e97c | in-use    | None         | 10   | None        | 438cdb78-5573-4ab0-9f89-79cad806286c |
| 010ea497-98d5-4ace-a6aa-bdc847628cee | available | None         | 1    | None        |                                      |
+--------------------------------------+-----------+--------------+------+-------------+--------------------------------------+

2. Detach the volume from its instance. It is recommended to ssh into the instance and unmount the volume first.

$ nova volume-detach VM1 44bcd404-8a6e-41d8-9d56-2ac4e0c1e97c

3. Create a snapshot of the volume:

$ nova volume-snapshot-create 44bcd404-8a6e-41d8-9d56-2ac4e0c1e97c
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| status              | creating                             |
| display_name        | None                                 |
| created_at          | 2014-01-16T16:20:17.739982           |
| display_description | None                                 |
| volume_id           | 44bcd404-8a6e-41d8-9d56-2ac4e0c1e97c |
| size                | 10                                   |
| id                  | ea8a1c24-982e-4d63-809f-38f0ad974604 |
| metadata            | {}                                   |
+---------------------+--------------------------------------+

4. Create a new volume from the snapshot of the volume we are resizing, specifying the new desired size:

$ nova volume-create --snapshot-id ea8a1c24-982e-4d63-809f-38f0ad974604 15
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| status              | creating                             |
| display_name        | None                                 |
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| created_at          | 2014-01-16T16:22:10.634404           |
| display_description | None                                 |
| volume_type         | None                                 |
| snapshot_id         | ea8a1c24-982e-4d63-809f-38f0ad974604 |
| source_volid        | None                                 |
| size                | 15                                   |
| id                  | 408a9d90-6498-4e87-a26a-43fe506b1b1d |
| metadata            | {}                                   |
+---------------------+--------------------------------------+

5. Wait until the status of the newly created volume is available and attach it to the instance using another device name (if it originally was /dev/vdc then use /dev/vdd for example):

$ nova volume-attach VM1 408a9d90-6498-4e87-a26a-43fe506b1b1d  /dev/vdd
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdd                             |
| serverId | 438cdb78-5573-4ab0-9f89-79cad806286c |
| id       | 408a9d90-6498-4e87-a26a-43fe506b1b1d |
| volumeId | 408a9d90-6498-4e87-a26a-43fe506b1b1d |
+----------+--------------------------------------+

Note that the snapshot created in step 3 can now be deleted as well as the original volume.

From within the instance OS, assuming it’s a Linux VM, we need to make the OS aware of the new size.

ubuntu@vm1:/$ sudo e2fsck -f /dev/vdc
e2fsck 1.42 (29-Nov-2011)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/vdc: 13/65536 files (0.0% non-contiguous), 12637/262144 blocks
ubuntu@vm1:/$ sudo resize2fs /dev/vdc
resize2fs 1.42 (29-Nov-2011)
Resizing the filesystem on /dev/vdc to 1310720 (4k) blocks.
The filesystem on /dev/vdc is now 1310720 blocks long.

ubuntu@vm1:/$ sudo mount /dev/vdc /mnt2
ubuntu@vm1:/$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       9.9G  828M  8.6G   9% /
udev            494M  8.0K  494M   1% /dev
tmpfs           200M  220K  199M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            498M     0  498M   0% /run/shm
/dev/vdb         20G  173M   19G   1% /mnt
/dev/vdc       15.0G   34M 14.7G   1% /mnt2

Notes

  • If resize2fs does not work, try rebooting the instance first. The kernel will come back up fresh and may pick up the new size fine after the reboot.
  • Make sure you changed the block device name (e.g. from vdc to vdd).
  • Do not use partitions (e.g. /dev/vdc1). Check resize2fs for details. It can be done anyway, but rebuilding the partition table is needed (see the outline below).
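
For completeness, this is roughly what the partitioned case looks like. It is only an outline with a hypothetical /dev/vdc1, so double-check the partition’s start sector before rewriting anything:

# Outline only: recreate /dev/vdc1 with the same start sector and a larger end
# (using fdisk or parted), then re-read the partition table and grow the filesystem
$ sudo partprobe /dev/vdc
$ sudo e2fsck -f /dev/vdc1
$ sudo resize2fs /dev/vdc1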

Set up iSCSI Storage for ESXi Hosts From The Command Line

VMware Command Line Interface

The esxcli command line tool can be extremely useful to set up an ESXi host, including iSCSI storage.

1. Enable iSCSI:

~ # esxcli iscsi software set -e true
Software iSCSI Enabled

2. Check the adapter name, usually vmhba32, vmhba33, vmhba34 and so on.

~ # esxcli iscsi adapter list
Adapter Driver    State  UID           Description
------- --------- ------ ------------- ----------------------
vmhba32 iscsi_vmk online iscsi.vmhba32 iSCSI Software Adapter

3. Connect your ESXi iSCSI adapter to your iSCSI target

~ # esxcli iscsi adapter discovery sendtarget add -A vmhba32 -a 10.230.5.60:3260

~ # esxcli iscsi adapter get -A vmhba32
vmhba32
Name: iqn.1998-01.com.vmware:ch02b03-65834587
Alias:
Vendor: VMware
Model: iSCSI Software Adapter
Description: iSCSI Software Adapter
Serial Number:
Hardware Version:
Asic Version:
Firmware Version:
Option Rom Version:
Driver Name: iscsi_vmk
Driver Version:
TCP Protocol Supported: false
Bidirectional Transfers Supported: false
Maximum Cdb Length: 64
Can Be NIC: false
Is NIC: false
Is Initiator: true
Is Target: false
Using TCP Offload Engine: false
Using ISCSI Offload Engine: false

4. Now, on your iSCSI server, assign a volume of the SAN to the IQN of your ESXi host, for example, for an HP StorageWorks array:

CLIQ>assignVolume volumeName=racedo-vSphereVolume initiator=iqn.1998-01.com.vmware:ch02b01-01e26a74;iqn.1998-01.com.vmware:ch02b02-20d3e33b;iqn.1998-01.com.vmware:ch02b03-65834587

The above command assigned three IQNs to the volume: two that were already assigned and the new one we are setting up. This syntax is specific to the HP StorageWorks CLI; other storage arrays work differently.

5. Back on the ESXi host, discover the targets:

~ # esxcli iscsi adapter discovery rediscover -A vmhba32

Finally, check with the df command that the datastore has been added. If not, try rediscovering again.
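
For example, to rescan the adapter again and list the datastores the host can see:

~ # esxcli storage core adapter rescan -A vmhba32
~ # esxcli storage filesystem list
~ # df -h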

This is the simplest configuration possible from the command line. NIC teaming or other more complex setups can also be done from the command line of the ESXi hosts.
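
As an example of one of those setups, binding an extra VMkernel NIC to the software iSCSI adapter (for multipathing) is just a couple more esxcli commands; vmk1 here is an assumption, use whichever vmknic carries your iSCSI traffic:

# vmk1 is an example vmknic; use the one that carries your iSCSI traffic
~ # esxcli iscsi networkportal add -A vmhba32 -n vmk1
~ # esxcli iscsi networkportal list -A vmhba32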