Friday, July 22, 2022

Adventures of a Small Time OpenStack Sysadmin Chapter 024 - OpenStack Ceilometer Telemetry Service

Adventures of a Small Time OpenStack Sysadmin relate the experience of converting a small VMware cluster into two small OpenStack clusters, and the adventures and friends I made along the way.


My reference for installing Ceilometer:

https://docs.openstack.org/ceilometer/yoga/install/

Install notes:

The usual problem: the docs show the Python 2.7 CLI package name, python-gnocchiclient, but the project moved to Python 3 some time ago and the new package name is python3-gnocchiclient.  No big deal.  My plan was to file bugs for everything I found once I got Ceilometer working; I never got Ceilometer working, so I never filed any bugs.  I wonder if other people follow a similar workflow; there are a lot of obvious doc bugs.

The gnocchi-api Ubuntu package does not seem to install a gnocchi-api service that can be restarted.  It seems there is some uWSGI workaround that is OK, or maybe not?

This seems to be handled, possibly successfully, by /etc/apache2/sites-available/gnocchi-api.conf as per:

https://stackoverflow.com/questions/45374863/devstackceilometergnocchi-error-403

or maybe not as per:

https://bugs.launchpad.net/ceilometer/+bug/1949305

/var/log/apache2/gnocchi_error.log shows "unable to initialize coordination driver" when trying to run "gnocchi status"

coordination_url in gnocchi.conf is supposed to move to the [DEFAULT] section of the config file.  Seems obvious.
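A quick way to see which section of a config file actually sets an option is to parse it with Python's standard configparser.  This is only a sketch: the two config fragments are made up for illustration, the redis://controller:6379 URL is a placeholder, and the assumption that the misplaced option sat under [storage] is mine.

```python
import configparser


def sections_with_option(conf_text: str, option: str) -> list[str]:
    """Return the names of the sections that set `option` directly."""
    # Pointing default_section at a name that never occurs makes configparser
    # treat [DEFAULT] as an ordinary section, so its keys are not inherited
    # by every other section and each hit is reported where it really lives.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None, strict=False
    )
    parser.read_string(conf_text)
    return [name for name in parser.sections() if parser.has_option(name, option)]


# Hypothetical gnocchi.conf fragments; the URL is a placeholder.
MISPLACED = """
[DEFAULT]
debug = false

[storage]
coordination_url = redis://controller:6379
"""

FIXED = """
[DEFAULT]
coordination_url = redis://controller:6379

[storage]
driver = file
"""

if __name__ == "__main__":
    print(sections_with_option(MISPLACED, "coordination_url"))
    print(sections_with_option(FIXED, "coordination_url"))
```

To check a real file, read /etc/gnocchi/gnocchi.conf into a string and pass it in place of the sample text.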

Also, a telnet to port 6379 on my controller shows that I don't have a Redis database set up at this time, so the docs should say that a Redis database is necessary.
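The telnet test can be scripted with a few lines of Python.  This is a minimal sketch, assuming the controller answers to the hostname "controller" (a placeholder) and that Redis, if present, listens on its default port 6379.  It only checks that something accepts TCP connections on the port, not that it is really Redis.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "controller" is a placeholder hostname; 6379 is the default Redis port.
    print("redis reachable:", port_open("controller", 6379))
```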

/etc/ceilometer/pipeline.yaml seems to be a new file, not an edited file.  See:

https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/ceilometer/templates/pipeline.yaml.j2

You can tell by this point in the process of setting up Plan 1.0 I was already fed up with doing a hand installation of OpenStack and was starting to use the docs for Plan 2.0 era Kolla-Ansible as my doc source for Plan 1.0, LOL.

On Ubuntu, you can determine the path to "cinder-volume-usage-audit" by running "which cinder-volume-usage-audit"; it should be /usr/bin/cinder-volume-usage-audit, so edit the cron line accordingly.
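As a sketch, the lookup and the resulting cron line can be generated in Python.  The five-minute schedule and the --send_actions flag are my recollection of the install guide's example line, so treat them as assumptions and check them against the guide; audit_cron_line is a made-up helper name.

```python
import shutil


def audit_cron_line(binary_path: str, every_minutes: int = 5) -> str:
    """Build a crontab entry that runs the audit script every N minutes."""
    # Schedule and flag are assumed from the install guide's example line.
    return f"*/{every_minutes} * * * * {binary_path} --send_actions"


if __name__ == "__main__":
    # shutil.which does the same PATH lookup as the "which" command; fall
    # back to the usual Ubuntu location if the script is not on PATH.
    path = shutil.which("cinder-volume-usage-audit") or "/usr/bin/cinder-volume-usage-audit"
    print(audit_cron_line(path))
```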

There is no cinder-api as part of the default install.

The proper way to handle the compute IPMI sudoers entry is to create a new file in sudoers.d, as the other components do, rather than editing the sudoers file directly.

For Swift metrics, python-ceilometermiddleware should be python3-ceilometermiddleware; it's the usual Python 2.7 to Python 3 package rename issue that I ran into so many other times.

Upon install, my /etc/ceilometer/pipeline.yaml file on Ubuntu was missing/empty.

Obviously adding only the gnocchi publisher section will result in nothing being stored because there is no input section.

I found a sample pipeline.yaml online and adapted it to my needs.
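To illustrate why a publisher-only file stores nothing: a Ceilometer pipeline pairs sources (which meters to collect) with sinks (where to publish them), and a sink with no source feeding it never receives a sample.  The sketch below checks an already-parsed pipeline for that kind of gap.  The dictionaries mirror the sources/sinks layout of pipeline.yaml but are illustrative, not my actual file, and pipeline_problems is a made-up helper.

```python
def pipeline_problems(pipeline: dict) -> list[str]:
    """Return human-readable problems found in a parsed pipeline.yaml."""
    problems = []
    sources = pipeline.get("sources") or []
    sinks = pipeline.get("sinks") or []
    if not sources:
        problems.append("no sources defined, so nothing feeds the sinks")
    if not sinks:
        problems.append("no sinks defined, so nothing is published")
    sink_names = {sink.get("name") for sink in sinks}
    for source in sources:
        # Every sink a source names must actually be defined.
        for wanted in source.get("sinks", []):
            if wanted not in sink_names:
                problems.append(
                    f"source {source.get('name')!r} references unknown sink {wanted!r}"
                )
    for sink in sinks:
        if not sink.get("publishers"):
            problems.append(f"sink {sink.get('name')!r} has no publishers")
    return problems


# Illustrative data only: the publisher section alone, with no input side.
PUBLISHER_ONLY = {
    "sinks": [{"name": "meter_sink", "publishers": ["gnocchi://"]}],
}

# The same sink, now fed by a source that collects every meter.
COMPLETE = {
    "sources": [{"name": "meter_source", "meters": ["*"], "sinks": ["meter_sink"]}],
    "sinks": [{"name": "meter_sink", "publishers": ["gnocchi://"]}],
}

if __name__ == "__main__":
    print(pipeline_problems(PUBLISHER_ONLY))
    print(pipeline_problems(COMPLETE))
```

To run it against a real file, load /etc/ceilometer/pipeline.yaml with a YAML parser such as PyYAML's yaml.safe_load and pass the result in.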

To convert to storage of logs on swift:

https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/gnocchi/templates/gnocchi.conf.j2

So, um, yeah, a little more challenging to install than any other service in the entire project.  All that work for mere telemetry.

Everything seems to work although no data is being stored.  It seems actually accessing the data is even more complex than the extremely complex task of installing the telemetry service.

IT professionals used to the VMware or ELK stack or Zabbix or Nagios or Observium or LibreNMS or even smokeping experience may be a little disappointed when they meet OpenStack and find there are two monitoring stacks.  One is the legacy Ceilometer, which is unmaintained, reportedly does not scale, and is very difficult to make work.  The other is its replacement, Monasca, which supposedly scales well but is also unmaintained, and it is inoperable in Kolla-Ansible because the project's required version of Elasticsearch is incompatible with the overall system version, so it boot-loops the container, LOL.  It's just a different world if you have used something like VMware Ops Manager in the past, or even just bare "as-installed" vCenter.

My personal solution to OpenStack not providing a viable monitoring/telemetry service was to leverage my existing systems.  I continue to use Observium talking to the IPMI SNMP controller to report data like CPU temps and fan RPMs, and I continue to use Zabbix to report operating system level data like CPU use percentage on the bare metal OS and memory use and so forth.  And, honestly, it works great!

Additionally, for a while, I was trying to use Portainer to monitor the Docker containers in Kolla-Ansible, but Portainer got agitated by some changes made by Zun/Kuryr so I stopped using that, although it was pretty cool for a while.

It would be pretty awesome if something like Observium or Zabbix were integrated into OpenStack.  But even un-integrated, it works pretty well.

Tomorrow we experiment with Aodh.

Stay tuned for the next chapter!
