Bandwidth monitoring for Citrix Xen VMs

Bandwidth monitoring does not appear to work for Citrix Xen VMs. Is this a bug or a feature that has yet to be implemented?

Status: 
Active

Comments

It's a missing feature, because (as far as I know) there is no way to get bandwidth stats from a Citrix Xen VM using the xe command.

So I did some research, and this is the procedure I came up with to get the data out of XenServer. It is a little convoluted and would require many SSH connections to gather the stats. Do you think this method would be suitable for Cloudmin, or would it cause performance problems due to the number of connections required?

You are definitely correct that accounting information is simply not available via the API, which came as somewhat of a surprise. Also, CPU time (in seconds) is not available; I just now realised that Cloudmin is displaying "percent hours" for CPU used over the period. Am I correct that this would show "CPU hours" if using the open source Xen hypervisor?

Is the open source Xen hypervisor the most used/tested/supported back-end for Cloudmin?

  1. Find the VM UUID in Cloudmin, e.g. a9bbdf59-fd8c-e5c5-e27d-03e5fd4ccad4

  2. Determine which network to collect stats for (we only want public):

xe network-list

… uuid ( RO)              : ee752cf5-201a-f342-94f6-190b0d39c1d6
          name-label ( RW): Public
    name-description ( RW):
              bridge ( RO): xenbr2

  3. Determine the device ID for any public network interface on our VM:

xe vif-list vm-uuid=a9bbdf59-fd8c-e5c5-e27d-03e5fd4ccad4 | grep -B 1 ee752cf5-201a-f342-94f6-190b0d39c1d6

device ( RO): 0
network-uuid ( RO): ee752cf5-201a-f342-94f6-190b0d39c1d6
  4. Determine the domain ID of the VM:

xe vm-list uuid=a9bbdf59-fd8c-e5c5-e27d-03e5fd4ccad4 params=dom-id

dom-id ( RO) : 67

  5. The interface name is vif{dom-id}.{device}, e.g. in this case vif67.0

  6. Find the resident host:

xe vm-param-list uuid=a9bbdf59-fd8c-e5c5-e27d-03e5fd4ccad4 | grep resident-on

resident-on ( RO): 70637a4a-4120-4c75-840d-240e1fd38612
  7. Find the IP address or hostname of the host server:

xe host-param-list uuid=70637a4a-4120-4c75-840d-240e1fd38612 | grep address

address ( RO): 10.1.101.206

xe host-param-list uuid=70637a4a-4120-4c75-840d-240e1fd38612 | grep hostname

hostname ( RO): ams1-xen-6.anu.net
  8. Retrieve the stats:

ssh 10.1.101.206 "ifconfig vif67.0" | grep bytes

RX bytes:88068279 (83.9 MiB)  TX bytes:321689222 (306.7 MiB)
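
For reference, the steps above could be strung together as a single shell sketch. This is only an illustration, not Cloudmin's actual implementation: the `vif_bytes` function name is invented, the use of `--minimal` and the `param-get` variants is my choice for scriptability, and it assumes `xe` is available on the pool master and that dom0 accepts SSH.

```shell
#!/bin/sh
# Sketch: resolve a VM's public VIF name and read its byte counters.
# Usage: vif_bytes <vm-uuid> <network-uuid>
vif_bytes() {
    vm_uuid=$1; net_uuid=$2

    # Step 3: device number of the VM's VIF on the chosen network
    device=$(xe vif-list vm-uuid="$vm_uuid" network-uuid="$net_uuid" \
                 params=device --minimal)

    # Step 4: domain ID of the running VM
    dom_id=$(xe vm-list uuid="$vm_uuid" params=dom-id --minimal)

    # Step 5: the dom0 interface is named vif{dom-id}.{device}
    vif="vif${dom_id}.${device}"

    # Steps 6-7: find the host the VM is resident on, and its address
    host_uuid=$(xe vm-param-get uuid="$vm_uuid" param-name=resident-on)
    host_addr=$(xe host-param-get uuid="$host_uuid" param-name=address)

    # Step 8: read RX/TX byte counters for that VIF from dom0's
    # /proc/net/dev (field 2 = RX bytes, field 10 = TX bytes once the
    # "iface:" colon is split off)
    ssh "$host_addr" cat /proc/net/dev |
        awk -v vif="$vif" '{ sub(/:/, " ")
                             if ($1 == vif) print "rx=" $2, "tx=" $10 }'
}
```

Reading /proc/net/dev instead of parsing ifconfig output keeps the counters numeric and avoids the "(83.9 MiB)" suffixes, but either would work.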

Thanks, that could work ... I will look into implementing support for collecting network stats like this, and update this ticket when done.

This has been implemented for inclusion in Cloudmin 6.2.

That's great news, thanks!!

Automatically closed -- issue fixed for 2 weeks with no activity.

I installed the update you sent me and have been running it for a few hours now. Bandwidth collection is working great on 3 out of 5 physical servers; the VMs on the other 2 still do not have any bandwidth stats.

All of my Citrix hosts are added to Cloudmin as physical servers. They have Webmin installed and show up with status "Webmin" in the Physical Systems list. All of the Physical Systems have bandwidth stats shown (and always have).

Is there a debug log somewhere that I can check to see why the bandwidth collection is failing on these 2 servers? I have checked everything I can think of. /var/log/secure on the 2 Citrix boxes shows successful SSH logins from Cloudmin...

On the failing system, could you send me the output from the following commands :

brctl show

and

cat /proc/net/dev

I checked these commands yesterday and compared against the other servers; the output looked very much the same, and exactly what I'd expect to see.

Today we upgraded our production XenServer pool from 6.0.0 to 6.0.2. Post-upgrade, all of the VMs are now reporting bandwidth stats!

I kind of wish I had figured this out before upgrading, as I am curious what the issue was, but for now let's close this ticket; it appears to be working.

Ok, great!

The only other possible cause of this problem that I can think of would be if the domain ID as reported by :

xe vm-list uuid=a9bbdf59-fd8c-e5c5-e27d-03e5fd4ccad4 params=dom-id

wasn't matching the X part of the vifX.Y interface name.
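
That mismatch could be tested for directly. The sketch below (`check_vif` is a hypothetical helper, not a Cloudmin command) fetches the dom-id that xe reports and counts how many vif{dom-id}.* interfaces dom0 actually has; 0 would indicate exactly the mismatch described:

```shell
#!/bin/sh
# Sketch: cross-check xe's dom-id against dom0's real interface names.
# Usage: check_vif <vm-uuid> <host-address>
check_vif() {
    vm_uuid=$1; host=$2

    # dom-id as reported by the xe CLI
    dom_id=$(xe vm-list uuid="$vm_uuid" params=dom-id --minimal)

    # Count interfaces on the host named vif{dom-id}.<anything>;
    # prints 0 if the reported dom-id matches no vifX.Y interface
    ssh "$host" cat /proc/net/dev | grep -c "^ *vif${dom_id}\."
}
```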

Automatically closed -- issue fixed for 2 weeks with no activity.

This problem has cropped up again.

Could it be related to the counters growing too large or something like that?

Bandwidth collection is currently NOT working on this server:

[root@ams1-xen-0 ~]# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:359270965 1158920    0    0    0     0          0         0 359270965 1158920    0    0    0     0       0          0
  eth0:2108597188 468470374    0    0    0     0          0         0 2776588476 362200311    0    0    0     0       0          0
  eth1:   31092     517    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth2:2753429887 105173962    0 91972    0     0          0    174284 2938555590 77194821    0    0    0     0       0          0
  eth3:4170896719 613784402    0 2257    0     0          0     44065 3406639035 1577111488    0    0    0     0       0          0
xenbr3:3322902298 171892615    0    0    0     0          0         0 2576425695 256482787    0    0    0     0       0          0
xapi1:2105704350 468470149    0    0    0     0          0         0 2046823059 351773643    0    0    0     0       0          0
bond0:2108619508 468470851    0    0    0     0          0         0 2046533132 351773617    0    0    0     0       0          0
xenbr2:2541765542 34657929    0    0    0     0          0         0        0       0    0    0    0     0       0          0
vif1.0:99323155  483751    0    0    0     0          0         0 2598580257 35110415    0  241    0     0       0          0
vif2.0:1332273818 9283663    0    0    0     0          0         0 3507245760 41069670    0 8279    0     0       0          0
vif6.0:1051076272 44137754    0    0    0     0          0         0 3579148450 77554443    0  735    0     0       0          0
vif7.0:2132246898 5998458    0    0    0     0          0         0 3062915715 38548745    0  743    0     0       0          0
vif8.0:460919687  604762    0    0    0     0          0         0 2046729859 30135222    0  605    0     0       0          0
vif9.0:22039906  191451    0    0    0     0          0         0 1307668400 21043548    0    0    0     0       0          0
vif11.0:29153871  118082    0    0    0     0          0         0 915528101 14665269    0  223    0     0       0          0
vif12.0:366783000 3489146    0    0    0     0          0         0 1423955441 14981485    0  307    0     0       0          0
vif12.1:2471769446 105205520    0    0    0     0          0         0 2848192213 256976284    0  126    0     0       0          0
vif17.0:108099661   38501    0    0    0     0          0         0  3705126   56953    0    8    0     0       0          0
vif17.1:    3316      55    0    0    0     0          0         0   106274     479    0   27    0     0       0          0
[root@ams1-xen-0 ~]# brctl show
bridge name bridge id STP enabled interfaces
xapi1 0000.001d92f63cd6 no bond0
eth1
eth0
xenbr2 0000.001b21ce0633 no eth2
vif1.0
vif2.0
vif11.0
vif12.0
vif6.0
vif7.0
vif8.0
vif17.0
vif9.0
xenbr3 0000.001d92f63cd7 no eth3
vif12.1
vif17.1

But it DOES work on this one:

[root@ams1-xen-4 ~]# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:90196168  199364    0    0    0     0          0         0 90196168  199364    0    0    0     0       0          0
  eth0:    4320      72    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth1:362750026 48866647    0    0    0     0          0         0 440898010 37755746    0    0    0     0       0          0
  eth2:1917192116 23672777    0    0    0     0          0     99434 1434960426 3312750    0    0    0     0       0          0
  eth3:2496440422 35254104    0    0    0     0          0     25251 1314130376 85638799    0    0    0     0       0          0
xenbr3:2480647926 35017986    0    0    0     0          0         0 473998662 21601121    0    0    0     0       0          0
xenbr2:1282292982 20773649    0    0    0     0          0         0        0       0    0    0    0     0       0          0
xapi1:362735238 48866699    0    0    0     0          0         0 359853211 36597965    0    0    0     0       0          0
bond0:362754346 48866719    0    0    0     0          0         0 359853140 36597963    0    0    0     0       0          0
vif1.0:1361949483 3237065    0    0    0     0          0         0 1754248157 23490824    0  871    0     0       0          0
vif5.0: 3088568   56562    0    0    0     0          0         0 196411864  253482    0    8    0     0       0          0
vif5.1:      56       2    0    0    0     0          0         0   680449    2750    0   22    0     0       0          0
[root@ams1-xen-4 ~]# brctl show
bridge name bridge id STP enabled interfaces
xapi1 0000.001b21ce05e6 no bond0
eth1
eth0
xenbr2 0000.00e081d1e0aa no eth2
vif1.0
vif5.0
xenbr3 0000.00e081d1e0ab no eth3
vif5.1

I migrated one of the VMs from ams1-xen-0 to ams1-xen-4, and the bandwidth collection then started working again, so I don't think it's anything VM-specific...

That output looks OK ... I'd need to log in to the problem host system (and the Cloudmin master) to see why collection isn't working.