I will disconnect the host this morning and test, but before doing that I ran
this command while all hosts were up:
select * from cloud.host;
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
| id | name | uuid | status | type
| private_ip_address | private_netmask | private_mac_address |
storage_ip_address | storage_netmask | storage_mac_address |
storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id |
public_ip_address | public_netmask | public_mac_address | proxy_port |
data_center_id | pod_id | cpu_sockets | cpus | speed | url
| fs_type | hypervisor_type | hypervisor_version | ram |
resource | version | parent | total_size | capabilities | guid
| available | setup | dom0_memory |
last_ping | mgmt_server_id | disconnected | created |
removed | update_count | resource_state | owner | lastUpdated | engine_state |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
| 1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up |
Routing | 172.30.3.3 | 255.255.255.192 | 00:22:19:92:4e:34
| 172.30.3.3 | 255.255.255.192 | 00:22:19:92:4e:34 | NULL
| NULL | NULL | 1 | 172.30.4.3
| 255.255.255.128 | 00:22:19:92:4e:35 | NULL | 1 | 1
| 1 | 2 | 2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL |
KVM | NULL | 7510159360 | NULL | 4.11.0.0 | NULL
| NULL | hvm,snapshot |
9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource | 1 |
0 | 0 | 1492390408 | 146457912294 | 2018-06-05 14:09:22 |
2018-06-05 13:44:33 | NULL | 4 | Enabled | NULL | NULL
| Disabled |
| 2 | v-2-VM | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up |
ConsoleProxy | 172.30.3.49 | 255.255.255.192 | 1e:00:80:00:00:14
| 172.30.3.49 | 255.255.255.192 | 1e:00:80:00:00:14 | NULL
| NULL | NULL | NULL | 172.30.4.98
| 255.255.255.128 | 1e:00:c9:00:00:5f | NULL | 1 | 1
| NULL | NULL | NULL | NoIqn | NULL |
NULL | NULL | 0 | NULL | 4.11.0.0 | NULL
| NULL | NULL | Proxy.2-ConsoleProxyResource
| 1 | 0 | 0 | 1492390409 | 146457912294 |
2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL | 7 | Enabled
| NULL | NULL | Disabled |
| 3 | s-1-VM | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up |
SecondaryStorageVM | 172.30.3.34 | 255.255.255.192 | 1e:00:3b:00:00:05
| 172.30.3.34 | 255.255.255.192 | 1e:00:3b:00:00:05 | NULL
| NULL | NULL | NULL | 172.30.4.86
| 255.255.255.128 | 1e:00:d9:00:00:53 | NULL | 1 | 1
| NULL | NULL | NULL | NoIqn | NULL |
NULL | NULL | 0 | NULL | 4.11.0.0 | NULL
| NULL | NULL | s-1-VM-NfsSecondaryStorageResource
| 1 | 0 | 0 | 1492390407 | 146457912294 |
2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL | 7 | Enabled
| NULL | NULL | Disabled |
| 4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up |
Routing | 172.30.3.4 | 255.255.255.192 | 00:26:b9:4a:97:7d
| 172.30.3.4 | 255.255.255.192 | 00:26:b9:4a:97:7d | NULL
| NULL | NULL | 1 | 172.30.4.4
| 255.255.255.128 | 00:26:b9:4a:97:7e | NULL | 1 | 1
| 1 | 2 | 2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL |
KVM | NULL | 7510159360 | NULL | 4.11.0.0 | NULL
| NULL | hvm,snapshot |
40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource | 1 |
0 | 0 | 1492450882 | 146457912294 | 2018-06-05 14:09:22 |
2018-06-05 13:46:33 | NULL | 8 | Enabled | NULL | NULL
| Disabled |
| 5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up |
Routing | 172.30.3.5 | 255.255.255.192 | 00:24:e8:73:6a:b2
| 172.30.3.5 | 255.255.255.192 | 00:24:e8:73:6a:b2 | NULL
| NULL | NULL | 1 | 172.30.4.5
| 255.255.255.128 | 00:24:e8:73:6a:b3 | NULL | 1 | 1
| 1 | 2 | 3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL |
KVM | NULL | 7510159360 | NULL | 4.11.0.0 | NULL
| NULL | hvm,snapshot |
10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource | 1 |
0 | 0 | 1492390408 | 146457912294 | 2018-06-05 14:09:22 |
2018-06-05 13:47:04 | NULL | 6 | Enabled | NULL | NULL
| Disabled |
+----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
5 rows in set (0.00 sec)
As you can see, it reports the storage IP address as the same as the private
IP address (the management network).
I also ran the command you provided, using the cluster ID from the table
above:
mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
Empty set (0.00 sec)
mysql>
So, assuming I am reading this correctly, that seems to be the issue.
I am at a loss as to why, though.
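One caveat on reading that result: a filter of "removed is not null" matches
only rows that have been soft-deleted, so an empty set does not by itself show
that the cluster has no storage pools. In the host table above, every host
that is Up has removed = NULL; assuming storage_pool follows the same
convention, live pools would only be returned by "removed is null". A minimal
sketch of the difference, using made-up in-memory rows (PoolRow, samplePools
and the pool names are hypothetical, not CloudStack code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemovedFilterSketch {
    // Hypothetical stand-in for a row of cloud.storage_pool; only the
    // column that matters for this illustration is modelled.
    static class PoolRow {
        final String name;
        final String removed; // null means the pool has not been removed

        PoolRow(String name, String removed) {
            this.name = name;
            this.removed = removed;
        }
    }

    // Made-up sample data: two live pools, none removed.
    static List<PoolRow> samplePools() {
        return Arrays.asList(
            new PoolRow("example-pool-1", null),
            new PoolRow("example-pool-2", null));
    }

    // Equivalent of "... where removed is not null": soft-deleted rows only.
    static List<PoolRow> removedIsNotNull(List<PoolRow> rows) {
        List<PoolRow> out = new ArrayList<>();
        for (PoolRow row : rows) {
            if (row.removed != null) {
                out.add(row);
            }
        }
        return out;
    }

    // Equivalent of "... where removed is null": rows still in use.
    static List<PoolRow> removedIsNull(List<PoolRow> rows) {
        List<PoolRow> out = new ArrayList<>();
        for (PoolRow row : rows) {
            if (row.removed == null) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<PoolRow> rows = samplePools();
        System.out.println("removed is not null: " + removedIsNotNull(rows).size() + " row(s)");
        System.out.println("removed is null:     " + removedIsNull(rows).size() + " row(s)");
    }
}
```

If that assumption holds, rerunning the query with "removed is null" would
show whether the cluster actually has a pool registered.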
I have a separate NIC for storage as described. When I add the zone and get to
the storage page in the UI, I exclude the IPs already used for the compute node
NICs and the NFS server itself. I do this because initially I didn't, and the
SSVM started using the IP address of the NFS server.
So the full range is 172.30.5.1 -> 172.30.5.15 and the range I fill in is
172.30.5.10 -> 172.30.5.14.
And I used the label "cloudbr2" for storage.
I must be doing this wrong somehow.
Any pointers would be much appreciated.
________________________________
From: Rafael Weingärtner <[email protected]>
Sent: 05 June 2018 16:13
To: users
Subject: Re: advanced networking with public IPs direct to VMs
That is interesting. Let's see the source of all truth...
This is the code that is generating that odd message.
> List<StoragePoolVO> clusterPools =
>     _storagePoolDao.listPoolsByCluster(agent.getClusterId());
> boolean hasNfs = false;
> for (StoragePoolVO pool : clusterPools) {
>     if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
>         hasNfs = true;
>         break;
>     }
> }
> if (!hasNfs) {
>     s_logger.warn(
>         "Agent investigation was requested on host " + agent +
>         ", but host does not support investigation because it has no NFS storage. " +
>         "Skipping investigation.");
>     return Status.Disconnected;
> }
>
There are two possibilities here. Either you do not have any NFS storage (is
that the case?), or, for some reason, the call
"_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
any NFS storage pools. Looking at "listPoolsByCluster", we can see
that the following SQL is used:
> Select * from storage_pool where cluster_id = <host'sClusterId> and removed is not null
>
Can you run that SQL to see what it returns when your hosts are marked as
disconnected?
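To make the two possibilities concrete, here is a self-contained sketch of the
same check. PoolType, Status, WouldInvestigate and investigate are simplified
stand-ins (assumptions for illustration), not the CloudStack classes. It shows
that an empty pool list and a list without NFS both take the early
"Disconnected" return:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NfsCheckSketch {
    // Hypothetical simplifications of the real types.
    enum PoolType { NetworkFilesystem, Other }
    enum Status { Disconnected, WouldInvestigate }

    // Same loop as the quoted snippet: true only if some pool is NFS.
    static boolean hasNfs(List<PoolType> clusterPools) {
        for (PoolType type : clusterPools) {
            if (type == PoolType.NetworkFilesystem) {
                return true;
            }
        }
        return false;
    }

    // Mirrors the early return: without NFS, no investigation is attempted.
    static Status investigate(List<PoolType> clusterPools) {
        if (!hasNfs(clusterPools)) {
            return Status.Disconnected;
        }
        // Placeholder for whatever the real investigator does next.
        return Status.WouldInvestigate;
    }

    public static void main(String[] args) {
        List<PoolType> none = Collections.emptyList();
        List<PoolType> noNfs = Arrays.asList(PoolType.Other);
        List<PoolType> withNfs = Arrays.asList(PoolType.Other, PoolType.NetworkFilesystem);

        System.out.println("no pools returned: " + investigate(none));
        System.out.println("pools, none NFS:   " + investigate(noNfs));
        System.out.println("pools incl. NFS:   " + investigate(withNfs));
    }
}
```

Either way, the warning in the log only says that the lookup found no NFS pool
for the cluster; it does not distinguish between the two causes.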
On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <[email protected]> wrote:
> I reran the tests with the 3 NIC setup. When I configured the zone through
> the UI I used the labels cloudbr0 for management, cloudbr1 for guest
> traffic and cloudbr2 for NFS as per my original response to you.
>
>
> When I pull the power to the node (dcp-cscn2.local), after about 5 mins
> the host status goes to "Alert" but never to "Down".
>
>
> I get this in the logs -
>
>
> 2018-06-05 15:17:14,382 WARN [c.c.h.KVMInvestigator]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
> requested on host Host[-4-Routing], but host does not support investigation
> because it has no NFS storage. Skipping investigation.
> 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
> determine host 4 is in Disconnected
> 2018-06-05 15:17:14,382 INFO [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
> determined is Disconnected
> 2018-06-05 15:17:14,382 WARN [c.c.a.m.AgentManagerImpl]
> (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
> the host is still up: 4-dcp-cscn2.local
>
> I don't understand why it thinks there is no NFS storage as each compute
> node has a dedicated storage NIC.
>
>
> I also don't understand why it thinks the host is still up, i.e. what test
> is it doing to determine that?
>
>
> Am I just trying to get something working that is not supported?
>
>
> ________________________________
> From: Rafael Weingärtner <[email protected]>
> Sent: 04 June 2018 15:31
> To: users
> Subject: Re: advanced networking with public IPs direct to VMs
>
> What type of failover are you talking about?
> What ACS version are you using?
> What hypervisor are you using?
> How are you configuring your NICs in the hypervisor?
> How are you configuring the traffic labels in ACS?
>
> On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <[email protected]>
> wrote:
>
> > Hi all
> >
> >
> > I am close to giving up on basic networking as I just cannot get failover
> > working with multiple NICs (I am not even sure it is supported).
> >
> >
> > What I would like is to use 3 NICs for management, storage and guest
> > traffic. I would like to assign public IPs direct to the VMs, which is
> > why I originally chose basic.
> >
> >
> > If I switch to advanced networking do I just configure a guest VM with
> > public IPs on one NIC and not both with the public traffic -
> >
> >
> > would this work?
> >
>
>
>
> --
> Rafael Weingärtner
>
--
Rafael Weingärtner