
Time to hang the mission accomplished banners!

This site is 100% powered by Podman containers!

It's been a long, hard road, but we made it! Around two months ago, way back at the end of May, I said that I was going to migrate this site, along with the others I host, into containers using the Podman container engine.

As of now (2 weeks ago really) that work is done. Every part of this site has been shoehorned into containers. However, for better or for worse, and much like that fateful mission accomplished speech on the USS Abraham Lincoln, it turns out that I've still got a lot of work to do before this project is finished... Honestly, it will always be a work in progress, but let's not crush my hopes so early on in this post.

For those that haven't been following along (I know... it's also hard for me to believe that there are people who haven't been awaiting this post with bated breath), here are some links to help you get caught up if you are curious about the run-up to this post.

Some catchup reading on what I've been doing:

Containers are not easy

If you listen to the people who live and breathe containers, you could understandably begin to think that anything you might ever want is just one Dockerfile away from coming into existence. Many of the container evangelists out there would have you believe that you could run the next Google or Facebook with just two DevOps people and Docker Hub. I do not share that sentiment – and even though that might not be an altogether fair representation of the attitude of the larger container world, I think it's an impression many of the more devoted Docker fans leave in their wake.

I came into this not knowing a whole lot about containers. I was a little put off by them – I've said many times that I don't understand what problem I'm supposed to be solving with containers. But I wanted to learn. So I figured there was no better place to do some resume building than my own blog.

I've come to have an appreciation for what you can do with containers; the portability, and the flexibility of being able to swap in a new container or roll back to an older one if something doesn't work quite right, make containers awesome. But they add a lot of management overhead, and you need a lot of knowledge, time, and experience to get it all working. Especially if you want to get it working without turning off firewalls and SELinux.

Without further ado here is the full configuration for how this site is running on Podman. I welcome any feedback you might have.

The WordPress Container(s)

General overview

For my purposes I ended up with three container images, all based on the Fedora 32 base image.

  1. MariaDB – for data storage
  2. Apache – webserver
  3. Nginx – reverse proxy

One of my goals was to run the entire operation with non-root containers. I was able to accomplish that with MariaDB and Apache. However, with the Nginx reverse proxy, I needed to make a rootful container in order to get the remote IP address of users hitting the proxy and pass it on to the Apache webserver. If anyone has any suggestions on how to get around that, I'm all ears. – I may end up building a micro tier EC2 instance and building out a dedicated reverse proxy. I'm old, and having all these services on just one server feels icky...

Not much has changed with the database setup from my previous post about setting up MariaDB in Podman, so I'll spare you the details in this post.

NOTE: An issue I ran into after getting everything up and running as a non-root user was hitting a “too many open files” limit – check your open file limits. I ended up adding the following to my /etc/security/limits.conf file:

username     soft    nofile          10240
username     hard    nofile          10240
username     soft    memlock         unlimited
username     hard    memlock         unlimited
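If you want to confirm the new limits actually apply to the account that runs the containers, a quick check from a fresh login session as that user looks like this:

# verify the raised limits after logging back in as the container user
ulimit -Sn   # soft open-files limit, should report 10240
ulimit -Hn   # hard open-files limit, should report 10240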

The reverse proxy – with Let's Encrypt

If you want to read a little about some of the struggles I ran into along the way while building out the configuration for the reverse proxy you can read about it in my last post. Here I will just be sharing the finalized configuration.

As mentioned previously, this is the only container that I'm running as root. The configuration will work as non-root – you can reverse proxy and all that stuff – but for whatever reason I was not able to pass the real IP of clients back to the Apache web servers as non-root, nor could I get the real IP in the Nginx logs. So if you don't care about either of those things, feel free to run this as non-root.

Also, as I mentioned earlier, if you know a way that I can change this over to a non-root container and still get the real IPs, please let me know.

Nginx Containerfile

FROM registry.fedoraproject.org/fedora
MAINTAINER luke@sudoedit.com
RUN dnf -y upgrade && dnf -y install nginx certbot python3-certbot-nginx python3-certbot-dns-route53 && dnf clean all
RUN systemctl enable nginx
RUN systemctl enable certbot-renew.timer
RUN systemctl disable systemd-update-utmp.service
ENTRYPOINT ["/sbin/init"]
CMD ["/sbin/init"]
EXPOSE 80 443

Nginx reverse proxy systemd unit file

[Unit]
Description=Podman container - reverse_proxy
After=sshd.service

[Service]
Type=simple
ExecStart=/usr/bin/podman run -i --rm -v /srv/proxy/config/nginx/:/etc/nginx/conf.d/:Z -v /srv/proxy/config/certbot/letsencrypt/:/etc/letsencrypt/:Z -v /srv/proxy/data/logs:/var/log/nginx:Z --name nginx --ip 10.88.0.10 -p 80:80 -p 443:443 --hostname nginx registry.gitlab.com/lucas.rawlins/sudoedit-db/proxy:latest /sbin/init
ExecStop=/usr/bin/podman stop -t 3 nginx
ExecStopPost=/usr/bin/podman rm -f nginx
Restart=always

[Install]
WantedBy=multi-user.target

NOTE: The repo mentioned is a private repo; you will want to change it to one that you control.

When you run a Podman container with systemd, you want to run it in “interactive” mode, that is, with the -i flag.
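Once the unit file is in place, enabling it is the usual systemd dance. The file name here is just an example; use whatever you saved the unit as:

# assuming the unit above was saved as /etc/systemd/system/reverse_proxy.service
sudo systemctl daemon-reload
sudo systemctl enable --now reverse_proxy.service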

Nginx directory structure

/srv/proxy/
├── config
│   ├── certbot
│   │   └── letsencrypt
│   │       ├── accounts
│   │       ├── csr
│   │       ├── keys
│   │       ├── live
│   │       │   ├── sudoedit.com
│   │       ├── renewal
│   │       └── renewal-hooks
│   │           ├── deploy
│   │           ├── post
│   │           └── pre
│   └── nginx
│       └── sudoedit.conf
└── data
    ├── certs
    │   └── sudoedit
    └── logs

A few things to point out here.

I cheated to get certbot up and running.

Instead of coming up with a fancy way to get a new cert on port 80 and then doing some back bending to get it into a persistent volume, I just did an rsync -av /etc/letsencrypt /srv/proxy/config/certbot and then exec'd into the container once it was running to execute a renewal... I could do this because I was already using Let's Encrypt and I was too lazy to set it up for a second time. – I know that this violates some sacred dogma... but it was a one-time thing, and I don't care.

To do this the way I did it, you have to change a couple of lines in the file /etc/letsencrypt/renewal/<site.com.conf>.

Open /etc/letsencrypt/renewal/<site.com.conf> in your favorite text editor and look for the two lines: authenticator and installer. If you were using Apache, you want to change the values to nginx.
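For reference, here's roughly what that check and the follow-up renewal look like on my setup. Treat it as a sketch; sudoedit.com.conf is just my renewal file, so substitute your own:

# confirm the renewal config now points at the nginx plugins
podman exec nginx grep -E '^(authenticator|installer)' /etc/letsencrypt/renewal/sudoedit.com.conf
# expected output:
#   authenticator = nginx
#   installer = nginx

# then run a renewal from the host once the proxy container is up
podman exec -it nginx certbot renew --dry-run   # test against the staging endpoint first
podman exec -it nginx certbot renew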

Also, notice I'm keeping my logs in a directory under /srv/proxy/data. If you do not specify a persistent volume for your logs, they will be lost every time you restart the container.

Nginx Configuration

###
# place in /etc/nginx/conf.d/sudoedit.conf
###
upstream sudoeditServers {
    server server_ip_address:8080;
    }


# HTTP server
# Proxy with no SSL

    server {
        rewrite ^(/.well-known/acme-challenge/.*) $1 break; # managed by Certbot

        listen       0.0.0.0:80;
        server_name  sudoedit.com;
        return 301 https://$host$request_uri;
        location = /.well-known/acme-challenge/ { } # managed by Certbot
    }

# HTTPS server
# Proxy with SSL

    server {
        listen       0.0.0.0:443;
        server_name  sudoedit.com;

         location / {
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
         proxy_set_header X-Forwarded-Host $host;
         proxy_set_header X-Forwarded-Proto https;
         proxy_pass http://sudoeditServers$request_uri;
         deny <ip_address_of_bad_people>;
        }

        ssl                  on;
        ssl_certificate      /etc/letsencrypt/live/sudoedit.com/fullchain.pem;
        ssl_certificate_key  /etc/letsencrypt/live/sudoedit.com/privkey.pem;
        include /etc/letsencrypt/options-ssl-nginx.conf;

      }

This Nginx configuration redirects all traffic to HTTPS. – HTTPS is free, so I don't see any reason not to use it. If your web host charges extra for it, you should find a new hosting provider.
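One sanity check worth doing before you point DNS at the proxy is to have Nginx parse the mounted configuration from inside the running container (using the container name from the unit file above):

podman exec nginx nginx -t
# nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
# nginx: configuration file /etc/nginx/nginx.conf test is successful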

Apache webserver backend

Apache Containerfile

Again a lot of this was covered in my last post. If you have questions about the choices I made here I would encourage you to read it, as it goes into some depth explaining the problems I ran into, and why I chose to solve them the way I did.

FROM registry.fedoraproject.org/fedora:32
MAINTAINER luke@sudoedit.com
RUN dnf -y upgrade && dnf -y install httpd php-intl php-opcache php-soap php-sodium php-json php-pecl-zip php-xml php-mbstring php php-xmlrpc php-gd php-pecl-imagick php-bcmath php-pecl-mcrypt php-pdo php-cli php-fpm php-mysqlnd php-process php-common mod_fcgid; dnf clean all
RUN systemctl enable httpd php-fpm
RUN systemctl disable systemd-update-utmp.service
ENTRYPOINT ["/sbin/init"]
CMD ["/sbin/init"]
EXPOSE 80

Systemd unit file for the Apache backend

This is a pretty standard systemd unit file. The major difference from the last one is that in the [Service] section I define the username of the user that I want the service to run as. For my purposes I wanted this to be a user that doesn't have sudo privileges on the server, and ideally it would be a user with /sbin/nologin defined in /etc/passwd. I've done some limited testing and that does seem to work, however, I still like to switch to the Podman user and do some tinkering. As I get more comfortable managing the containers and have more of this pulled into Ansible I imagine the need to do that sort of thing will fade out.

[Unit]
Description=Podman container - sudoedit.com
After=sshd.service

[Service]
Type=simple
User=<user_name>
ExecStart=/usr/bin/podman run -i --rm --ip 10.88.0.20 -p 8080:80 --name sudoedit.com --read-only --read-only-tmpfs=true -v /srv/sudoedit/config/httpd.conf:/etc/httpd/conf/httpd.conf -v /srv/sudoedit/config/php.conf:/etc/httpd/conf.d/php.conf:ro -v /srv/sudoedit/config/sudoedit.conf:/etc/httpd/conf.d/sudoedit.conf:ro -v /srv/sudoedit/config/wp-config.php:/var/www/sudoedit/wp-config.php:Z -v /srv/sudoedit/web/:/var/www/sudoedit/public_html/:Z -v /srv/sudoedit/logs/:/var/log/httpd:Z -v /srv/sudoedit/logs/:/var/log/php-fpm/:Z --hostname sudoedit.com registry.gitlab.com/lucas.rawlins/sudoedit-db/web:latest /sbin/init
ExecStop=/usr/bin/podman stop -t 3 sudoedit.com
ExecStopPost=/usr/bin/podman rm -f sudoedit.com
Restart=always

[Install]
WantedBy=multi-user.target
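About that User= line: if you want a dedicated, login-less account for this, something along these lines should do it (<user_name> is the same placeholder used in the unit file; adjust to your own naming):

# create a login-less account to own the web container
sudo useradd --shell /sbin/nologin <user_name>
# confirm it was assigned a subordinate UID/GID range for rootless Podman
grep <user_name> /etc/subuid /etc/subgid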

Website container directory tree

For the persistent volumes, I chose a directory structure as outlined below. Each site is kept at the root of /srv. I then break everything down into config, data, logs, and web directories.

/srv/sudoedit
├── config
├── data
│   └── db
├── logs
│   ├── journal
│   └── private
└── web
    ├── wp-admin
    ├── wp-content
    └── wp-includes
  1. Config contains the httpd files, virtual host files, and any custom modules.
  2. Data, in this case, just holds the database.
  3. Logs contains the web server logs.
  4. Web holds the WordPress application files.

Some of this could probably be combined. For example, logs could be a child directory of data, as could web. But I wanted to have a clear and quick way to know which directories would hold which data, and this made the most sense to me. You might make other decisions about how you like your directory tree. The important thing is to separate each piece in a way that makes sense for you, so that in 6 months when you look at it again, you know where to find everything.
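If you want to replicate this layout, the skeleton is a one-liner; ownership then follows the same user-namespace rules covered in the MariaDB post:

# create the directory skeleton for one site
sudo mkdir -p /srv/sudoedit/{config,data/db,logs,web}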

Changes to the httpd.conf file

In order to get the real IP address of the users visiting your site you will need to add these lines to your httpd configuration.

....
Include conf.modules.d/*.conf
#NGINX PROXY
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 127.0.0.1
#STOP PROXY
....
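These directives rely on mod_remoteip, which the Include line pulls in from conf.modules.d on Fedora. A quick way to confirm it's loaded in the running container (using the container name from the unit file above):

podman exec sudoedit.com httpd -M | grep remoteip
# remoteip_module (shared)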

Virtual host file

This is a fairly standard vhost file. The only thing to note is that it's only listening on port 80. Since the Nginx proxy is handling SSL/TLS for us we don't need to have this site listening on tcp/443 in the container.

<VirtualHost *:80>
	ServerName sudoedit.com

	ServerAdmin webmaster@localhost
	DocumentRoot /var/www/sudoedit/public_html

	ErrorLog /var/log/httpd/sudoedit_error_log
	CustomLog /var/log/httpd/sudoedit_access_log combined

</VirtualHost>

WordPress wp-config.php

Add this bit of code to the top of your wp-config.php file if you are using Jetpack for backups. See this knowledge base article for more details.

<?php
if ( !empty( $_SERVER['HTTP_X_FORWARDED_FOR'] ) ) {
    $forwarded_ips = explode( ',', $_SERVER['HTTP_X_FORWARDED_FOR'] );
    $_SERVER['REMOTE_ADDR'] = $forwarded_ips[0];
    unset( $forwarded_ips );
}

If you end up having issues with CSS or JavaScript not loading, you may need to add this code snippet to your wp-config.php file as well. Make sure you add it above the line that reads: /* That's all, stop editing! Happy publishing. */.

define('FORCE_SSL_ADMIN', true);
define('FORCE_SSL_LOGIN', true);
if ($_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https')
  $_SERVER['HTTPS']='on';
/* That's all, stop editing! Happy publishing. */

Pheww.... That was a lot! But after running through these steps you should have a mostly working WordPress site that is completely running on Podman! If you have a dev environment you should definitely take advantage of it so that you can work out any of the unique issues that might pertain to your site. I'm too cheap to have that extra infrastructure so I just took some extended (planned and unplanned) downtimes while I figured it out. If you run into something you can't quite figure out drop me a line and I'll be glad to offer any suggestions I might have.

Final thoughts, containers are simple – but not easy

I learned a lot in this process, and I'm moving into the next stage of this project knowing just a little bit more than the nothing I started with. Getting to this point was not easy, even though the concept by itself is simple enough.

What do I mean by simple but not easy?

What makes containers simple?

  • A container is just another process running on your server.
  • Containers can be defined with an easy to learn syntax in a Containerfile.
  • Containers separate processing from data and configuration, just like any other process you have:
    • binaries – /bin (container)
    • config – /etc (volume)
    • data – /var or /home (volume)

What makes containers not easy?

Basically, the third point in the list above is why containers are simple but not easy. The idea is simple; the execution is hard. In order to put an application into a container, you have to understand how the application works, at least at a high level. You need to know things like:

  1. What network ports should be listening?
  2. Where does my app keep its configuration files?
    1. Do these files change often?
  3. What other systems does it interact with?
    1. Are those “systems” other apps on the same server? i.e. a WordPress database
    2. If so how do they communicate?
  4. Where is its data stored?

All that really just scratches the surface. Under the covers, you need to understand how the OS deals with container runtimes, how systemd will manage the services, how UIDs and GIDs are mapped, and how SELinux works. Then, if you are working with an application that was not designed to be “cloud-native” or container-ready or whatever, you have to figure out how to separate each of those pieces yourself.

Moving an application into containers also means that you have to plan your changes in advance (you should be doing that anyway), which is not always easy. If, for instance, you determine that you do not need to mount a volume for your configuration file – you want to put it directly into the container image – that is fine until you need to make a change to that file. While you could exec into the container and make your changes, you have to understand that those changes will be lost the next time the container stops and restarts. Instead, you have to either: A) build a new container image with the corrected configuration file, or B) move that configuration file into a volume that you can manipulate outside the container to ensure the changes are persistent.
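As a tiny illustration of option B, with purely hypothetical paths and image name, the bind mount looks like this:

# keep the config on the host and bind mount it read-only into the container
# (the myapp paths and image name are made-up placeholders)
podman run --rm -v /srv/myapp/config/app.conf:/etc/myapp/app.conf:ro,Z myapp-image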

While some people will scoff at those problems, they are real problems that people deal with on a daily basis. Changing your workflows among many different groups with different priorities to align with some of these ideals is much easier said than done.

Next Steps

The last piece of this puzzle is to get the webserver/database server containers joined into a “pod”. I was going to jump into that head first from this point on, but I've got a few other things I'm working on and might have to put that on the back burner for a bit. I've got some Ansible tidbits I want to share, and a few other things.

Acknowledgements

Special thanks to Scott McCarty for his excellent write up covering his work migrating to Podman: http://crunchtools.com/moving-linux-services-to-containers/.

This project was deeply inspired by that post, and the adventures that led to this point would not have been possible without being able to piggyback off of his work. Thank you.

Also, thanks to Dan Walsh for retweeting some of my posts! And for all the great work you and your team are doing on Podman. In particular, these two blog posts helped me work through several issues.

Because apparently I can’t leave well enough alone.

In this post, I’ll dive into how I went about setting up an Nginx reverse proxy for this WordPress site, and some of the challenges I ran into along the way.

This was a task that proved to be more challenging than I anticipated, and there were moments that I questioned my ability to get it working. – It's also got me wondering if my next project should be migrating to a static site generator.

This is part of my Podman project.

I’ve had an ongoing project to convert this site over to rootless containers with Podman. If you are curious about the progression of this project, you can catch up on what I’ve already covered by checking out the links below:

Not the tutorial you are looking for.

While I will provide some configuration files and some explanations, this post is more to outline some of the issues I've run into, how I’ve solved them, and how I plan to move forward. – maybe a complete tutorial will come a little later when I'm all finished and satisfied.

Among the many reasons I don't want to do a conventional tutorial now is that there are already a lot of great tutorials out there that cover Nginx reverse proxies, especially as they relate to containers; Seth Kenlon did a good write-up for Enable Sysadmin on this topic last year. My goal with this post isn't really to cover all the how-to's, but more to share some of the details of my overall design, where I ran into trouble, and how I solved those problems. – This isn't to say that my solution is the best solution, but it's how I solved it, and if it helps you, great. If it doesn't help, and you find a better way, feel free to let me know!

Hopefully, this story still passes along some information that might come in handy if you are doing your own migration to containers and need a reverse proxy.

Remind me, why am I doing this again?

At this point, I'm mostly finishing this container project to save face... really. Getting WordPress to play nice behind a load balancer was a lot more difficult than I thought it would be.

I told all of you that I was migrating this site to containers and that I'd update periodically with progress and tips. Initially, I was flying through the migration. Learning to build simple images turned out to be much easier than I thought it would be.

After learning the basic ways that SELinux and UID mapping work to help isolate and enable rootless containers, getting the database containerized was really no sweat – should've done it a long time ago. But then, I always knew the database would be the easy part.

For one thing, I know MariaDB databases really well – not that I'm some kind of guru. But if there is any one computer thing outside of operating systems that I really enjoy learning about and playing with, it's databases, and MariaDB is very accessible and user friendly. – at least in my opinion.

Secondly, I'm only running one database, for now, so all I had to do was bind tcp 3306 on the host to tcp 3306 in the container and add the host IP address to the wp-config.php file in WordPress to get the webserver talking to the MariaDB instance. That work is all outlined in my MariaDB post, and was pretty simple.
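In practice that just meant pointing WordPress at the host's address in wp-config.php, something along these lines (the address here is a placeholder):

    grep DB_HOST wp-config.php
    define( 'DB_HOST', '<host_ip_address>:3306' );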

So why am I whining about the web server set up?

The web server should've been just as easy, right? Just bind the host ports to the container ports and move on with your life... Well, sure, if all I was hosting was my own personal blog. Unfortunately for me, I actually have a few websites that I host for family and friends, and each of them needs to be available on ports 80 and 443.

The problem is that, when they are moved into containers, I can't bind each website to ports 80 and 443; each web container needs to be bound to a different port, with traffic redirected via a reverse proxy. Unlike virtual hosts in Apache, they can't all answer web requests without some intermediary directing all the traffic. – Enter the reverse proxy.

Reverse proxy testing plan

Or maybe the headline here should've been: “No battle plan survives first contact with the enemy.”

For my reverse proxy, I've decided to use Nginx with SSL termination via Certbot. I had evaluated HAProxy, and I'm sure it's a great product, but I think it has way more features than what I need at the moment. Perhaps my final implementation will have HAProxy instead of Nginx, but for now, I'm still trying to figure out how much complexity I can live with long term, and learning another new-to-me technology isn't in the cards for the container project.

The AWS Elastic Load Balancer is also an option if you have the money to run it. If this were a serious workload that my actual job depended upon, there is no doubt that I would've chosen a dedicated appliance like ELB. But, alas, I don't have that kind of money to throw at this problem! Plus, the goal is to learn something new, so a managed appliance would somewhat defeat the purpose.

The basic plan for testing was pretty simple:

  1. Build Nginx container image.
  2. Build Apache webserver image.
  3. Stop httpd and php-fpm on cloud host
  4. Start Nginx and Apache containers
  5. Profit

I suppose I can happily report that steps 1 through 4 went off without a hitch. Step 5 is where the problems started.

A basic rundown of how I set up the Nginx reverse proxy:

  1. All incoming http and https connections would reach the Nginx reverse proxy that is listening on tcp/80 and tcp/443.
  2. Any plain text http connections to any website would get redirected to https and then passed on to the appropriate container.
  3. SSL/TLS termination handled by the reverse proxy using Certbot.
  4. Each website will be restricted to its own container listening on ports 8080, 8081, 8082, etc...

The basic Nginx reverse proxy configuration looks like this, for now:

    upstream sudoeditServers {
        server <ip_address>:8080;
        }


    # HTTP server
    # Proxy with no SSL

        server {rewrite ^(/.well-known/acme-challenge/.*) $1 break; # managed by Certbot


            listen       80;
            server_name  sudoedit.com;
            return 301 https://$host$request_uri;
            location = /.well-known/acme-challenge # managed by Certbot

    }

    # HTTPS server
    # Proxy with SSL

        server {
            listen       443;
            server_name  sudoedit.com;

             location / {
             proxy_set_header Host $http_host;
             proxy_pass http://sudoeditServers$request_uri;
             proxy_set_header X-Real-IP $remote_addr;
             proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
             proxy_set_header X-Forwarded-Host $server_name;
             proxy_set_header X-Forwarded-Proto https;
            }

            ssl                  on;
            ssl_certificate      /etc/letsencrypt/live/sudoedit.com/fullchain.pem;
            ssl_certificate_key  /etc/letsencrypt/live/sudoedit.com/privkey.pem;
            include /etc/letsencrypt/options-ssl-nginx.conf;

          }

For the final implementation I'll have some clean up work to do here. That will all be outlined when I do the final write up after everything is done.

Other than a couple of deprecated, but still working, parameters, what's the problem?

Too many redirects

You have to be careful how you set up redirects when you host a website. This is doubly true if that site is sitting behind a reverse proxy or a load balancer.

When I was running my first tests, I simply made a copy of my virtual host files and used a bind mount to make the site's configuration file available to the web container. At the time it didn't occur to me that there would be a problem when both the Nginx reverse proxy AND the Apache virtual host file tried to redirect the connecting client from http to https. – After all, Nginx should've already done that so any connection to the virtual host should be secure already... right?

Wrong. I hadn't thought that all the way through. Remember, Nginx was not just redirecting to https; it was also acting as an SSL termination point, and then passing traffic to a container on an alternate port. This means that all the traffic hitting the container was over http, triggering another redirect on every visit.

What I had initially ignored as a redundant artifact from the pre-container days on this site turned out to be the cause of an endless redirect loop that made it impossible for anyone to view the web page.

Take a look at the virtual host configuration file. At the very bottom you can see the rewrite rules that were put in place by Certbot when I initially set up the website.


    <VirtualHost *:80>
    	# The ServerName directive sets the request scheme, hostname and port that
    	# the server uses to identify itself. This is used when creating
    	# redirection URLs. In the context of virtual hosts, the ServerName
    	# specifies what hostname must appear in the request's Host: header to
    	# match this virtual host. For the default virtual host (this file) this
    	# value is not decisive as it is used as a last resort host regardless.
    	# However, you must set it for any further virtual host explicitly.
    	ServerName sudoedit.com
            Protocols h2 h2c

    	ServerAdmin luke@sudoedit.com
    	DocumentRoot /var/www/html/sudoedit/

    	# Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
    	# error, crit, alert, emerg.
    	# It is also possible to configure the loglevel for particular
    	# modules, e.g.
    	#LogLevel info ssl:warn

    	ErrorLog /var/log/httpd/sudoedit_error_log
    	CustomLog /var/log/httpd/sudoedit_access_log combined

    	# For most configuration files from conf-available/, which are
    	# enabled or disabled at a global level, it is possible to
    	# include a line for only one particular virtual host. For example the
    	# following line enables the CGI configuration for this host only
    	# after it has been globally disabled with "a2disconf".
    	#Include conf-available/serve-cgi-bin.conf
    RewriteEngine on
    RewriteCond %{SERVER_NAME} =sudoedit.com
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
    </VirtualHost>

Removing the three Rewrite rules at the bottom of the file allowed Nginx to be the only source of https redirects, thus putting a stop to the endless redirect loop.

But wait! There's more!

After the redirect loop was broken, I was able to get to the home page, but none of the images/CSS/JavaScript would load. Further, the wp-admin page was inaccessible. In order to get around the issues with CSS loading, it looks like you have to tell WordPress that if it is getting https traffic forwarded from a proxy, it should behave as if it is receiving that https traffic directly.

I'm not entirely certain why this is the case, but over the course of piecing together little tidbits from across the internet, it appears this is something that is common on PHP-driven sites like WordPress. The solution I found after looking at a bunch of sites, notably here: https://ahenriksson.com/2020/01/27/how-to-set-up-wordpress-behind-a-reverse-proxy-when-using-nginx/ and here: https://techblog.jeppson.org/2017/08/fix-wordpress-sorry-not-allowed-access-page/, involves making an edit to the wp-config.php file:

    define('FORCE_SSL_ADMIN', true);
    define('FORCE_SSL_LOGIN', true);
    if ($_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https')
      $_SERVER['HTTPS']='on';

I'm still on the hunt for a solution that doesn't involve putting code directly into the wp-config file, but until then this does work and if you look into it there is no shortage of people who have had this same problem.

With the redirect issues resolved, I was ready to move on and see if there were any other issues in the way.

404 issue

With the home page finally loading correctly, I was pleased to see that SSL termination was working as intended. My site showed up with a little green lock and all was right with the world. Until I started checking the links to my posts.

Every post led to a 404 “Page not Found” error.

Why would the home page load but not the posts themselves? Nothing outside the home page was working, no posts, no archives, no categories, not the admin page... nothing.

The most likely scenario was that something was amiss with my “.htaccess” file. WordPress uses the htaccess file in Apache to handle permalinks. Although that is the most common issue, it seemed a little odd that the htaccess file would be the problem. As before with the virtual host configuration file, I had just copied the webroot directory over to a new location and used a bind mount to add it to the Apache container.

I did A LOT of searching on this problem. Verifying my reverse proxy configuration over and over, checking and rechecking file permissions for all the files and directories in my containers. All of it checked out. Even SELinux wasn't complaining about anything – that's gotta be a first.

Then I stumbled upon this similar issue involving permalinks at serverfault. The person asking the question was kind enough to answer it, and I am glad to know that there is at least one other person out there who makes little mistakes like this. My problem wasn't exactly the same, but it got me thinking about where my problem might be.

“I have now resolved this. The problem was I had forgotten to enable Mod-ReWrite for Apache.”

The thing is I had the rewrite module enabled:

    httpd -M | grep rewrite
     rewrite_module (shared)

So with mod_rewrite enabled, was the main Apache configuration file allowing htaccess files to make those modifications? In a word, Nope.

When I built the Apache container, I didn't take into account that WordPress relies on the htaccess file to direct blog traffic to the appropriate posts. Having mod_rewrite enabled was only half the issue; the other half was allowing the htaccess file to make those changes by switching AllowOverride None to AllowOverride All in the httpd.conf file.

    <Directory "/var/www">
        ....
        AllowOverride All
        ....
    </Directory>

This is something I had set up years ago when I first started the blog and never thought about again.

What's next?

Right now I have all the containers built.

  1. Nginx container for reverse proxy and ssl termination.
  2. Apache container with php-fpm for WordPress sites.
  3. MariaDB container.

The MariaDB container is up and running. So, the next step will be to do a direct cut over to the Nginx reverse proxy and bring the Apache containers online. Before I do that I need to finish a few things:

  • Create the systemd unit files that will manage the starting and stopping of each container. – Mandatory
  • Test SSL renewal – Mandatory
  • Decide if I want to stick with bind mounts or create Volumes. – Optional/Later

I figure at my current pace I should be able to make the final cutover sometime next weekend. Hopefully, with a full write up on what went well and what didn't by the following week. All that is assuming that real life doesn't get in my way too much in the meantime.

I'll be sure to take good notes and document the final product and share my results.

If you've been following along with my attempt to migrate this WordPress site into container services with Podman, then you will be happy to know that I've achieved the first milestone. The database for this site now resides pleasantly in a rootless Podman container.

One of the major reasons I wanted to try Podman was that, outside of installing the package itself, everything I wanted to run could be achieved as a non-root, non-privileged account. So far the non-root containers are living up to the marketing material, which is something I wish we could see more of in the tech world.

That isn't to say there were no challenges. If you read my last post, you'll get a pretty good picture of some of the issues I ran into while getting my feet wet with this project: https://sudoedit.com/podman-selinux-and-systemd/.

This post is an extension of my last, so if you are trying to get started I suggest you read that post first. This post is a combination of a tutorial and a deep dive into the weird ways my brain tries to understand this stuff – I hope you stick with me through it.

How to set up the MariaDB service

Here is a quick recap of the build process:

Containerfile

    FROM registry.fedoraproject.org/fedora:32
    MAINTAINER luke@sudoedit.com
    RUN dnf -y install mariadb-server mariadb
    COPY mariadb-service-limits.conf /etc/systemd/system/mariadb.service.d/limits.conf
    RUN systemctl enable mariadb
    RUN systemctl disable systemd-update-utmp.service
    ENTRYPOINT ["/sbin/init"]
    CMD ["/sbin/init"]

This file pulls the standard Fedora 32 container image from the Fedora container registry. My reasoning for doing so is outlined in my earlier post, and I encourage you to read it if you are so inclined. This Containerfile defines my build as a standard Fedora 32 system with MariaDB installed and a custom systemd drop-in to allow more open files for the MariaDB service, and it lets systemd inside the container manage the MariaDB service.
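The mariadb-service-limits.conf file referenced in the COPY line is just a small systemd drop-in that raises the open-files limit for the MariaDB service; something along these lines does the job (pick whatever limit suits your workload):

    cat mariadb-service-limits.conf
    [Service]
    LimitNOFILE=10240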

I realize that having systemd running in my container is controversial for some people, as it goes against the “single process” container model. I don't hold the single process per container philosophy so I'm just going to ignore it for now. I doubt there really is such a thing as a true single process container. I prefer to think of it as a single purpose container. The purpose in this case is to get a database up and running, and this accomplishes that goal.

Data Volume

Containers are (generally) ephemeral, any changes you make to a running container will disappear when the container is stopped and restarted. This can be a problem for a database since its goal is to store persistent data. The solution here is to mount some storage from the host into the container and allow the container process to write to that location.

I chose to create the following directory for my database: /srv/sudoedit/data/db

The directory tree looks like this:

    /srv
    └── sudoedit
        └── data
            └── db

Each directory in the tree needs to be owned by the user account and group that the container will run as, at least up to the point where the volume will be mounted – in this case, that is the directory “db”. At the point in your data tree where your data will be stored, you need to make the directory owned by the “mysql” user that exists in the container. Not a mysql user that is on the host – you don't need a mysql user on the host.

For clarification, I've chosen to mount /srv/sudoedit/data/db on the container host – to /var/lib/mysql inside the container. I'll show you how to do that a little later on in the unit file that manages this service on the host. For now, what you need to know is that the last directory in that chain, db, and everything underneath it, needs to be owned by the mysql user in the container. How do we do that? – Enter the dark world of user namespaces.

User Namespaces

If you want to learn a little something about user namespaces (and you really do if you want to use Podman) then read these articles from Red Hat https://www.redhat.com/sysadmin/rootless-podman-makes-sense and opensource.com https://opensource.com/article/19/2/how-does-rootless-podman-work.

For our purposes, the key takeaway from those articles is that in order to change the owner of the directory at our mount point to the “mysql” user for the container, we need to enter the container namespace and change ownership of the db directory.

We do that with the podman unshare command.

After you create your directory tree, change ownership on the directory tree to the account that you want to use to run the container, then switch to that user.
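In other words, roughly this, with app-svc-account standing in for whatever account will run the container:

    sudo mkdir -p /srv/sudoedit/data/db
    sudo chown -R app-svc-account:app-svc-account /srv/sudoedit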

Now you want to use podman unshare to set the owner of your directory to the mysql user in the container like this:

    $ podman unshare chown 27:27 /srv/sudoedit/data/db

In my container the mysql user has the UID and GID 27 so I set the ownership using those values – note that you do not need to run this as root. You are running that command as the normal unprivileged user account that will run the container – no sudo required. In this instance you don’t need sudo or root on the host, because when you enter the user namespace that the container will run in, your user account is treated as the root account in the container. Therefore, you have permission to change the ownership of arbitrary files to anything you want.

So let's take a look at the permissions on /srv/sudoedit/data/db:

    cd /srv/sudoedit/data
    ls -lZ
    total 4
    drwxr-xr-x. 5 296634 296634 system_u:object_r:container_file_t:s0:c836,c854 4096 Jun 10 01:06 db

Notice a couple of things here.

In your case, the UID and GID listed will likely be different than the ones I posted here. This is because your user has been given control of a number of subuids that will be mapped into any containers created by that user. Read the articles by Dan Walsh that I posted earlier. Basically, the UID and GID we see from the host perspective represent the UID and GID that get mapped into the container and become 27:27 inside the container namespace.
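If you want to see the mapping from the other side, run the same listing from inside the user namespace; there the directory should show up as owned by 27:27 rather than the high host UID/GID:

    podman unshare ls -ldn /srv/sudoedit/data/db
    # the numeric owner and group should now read 27 27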

How does the uid mapping work?

Here is the simplest way I can explain it. Cat out the contents of /etc/subuid:

    cat /etc/subuid
    user1:100000:65536
    luke:165536:65536
    app-svc-account:296608:65536

Notice the above subuid's for user1, luke, and app-svc-account are set to 100000, 165536, and 296608 respectively.

user1 on the host gets to have the UID's 100000 through 165535 mapped into their containers, luke gets the next 65,536 UID's, and so on. Each UID range encompasses 65536 UID's. This range is configurable.

The app-svc-account is the service account I designated to run the MariaDB container. That means the host OS has allowed that account to control UID 296608 and the next 65536 UIDs that follow it. Container root (UID 0) maps to the service account itself, and container UID 1 maps to the first subordinate UID, 296608; since the mysql user has UID 27 inside the container, its host UID works out to 296608 + 26 = 296634. This is ideal because on the host machine there are no users with UIDs in that range, and there never will be – which means if a process did escape the container, it would not have access to any files owned by a real user on the host.

NOTE: If you are using some kind of central authorization like LDAP or Active Directory, then you will need to give some serious thought to how you want to handle the subuid issue on your hosts... I'm not going to even begin to think about it here. Not yet at least, but it could be a real problem if you have subuids that overlap with real UIDs for real users.

Enable the service account to start services without logging in.

Use the loginctl command to enable users who are not logged in to start processes.

    enable-linger [USER...], disable-linger [USER...]
               Enable/disable user lingering for one or more users. If enabled for a specific user, a user manager is spawned
               for the user at boot and kept around after logouts. This allows users who are not logged in to run long-running
               services.


sudo loginctl enable-linger <username>
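You can confirm lingering is turned on for the account with:

    loginctl show-user <username> --property=Linger
    # Linger=yes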

Systemd unit file

Next, I created the following systemd unit file which allows my app-svc-account user to start the MariaDB container at startup.

I named my unit file db-sudoedit.service and placed it at /usr/lib/systemd/system/

    [Unit]
    Description=Podman container - db-sudoedit.com
    After=sshd.service

    [Service]
    Type=simple
    User=app-svc-account
    ExecStart=/usr/bin/podman run -i --read-only --rm -p 3306:3306 --name db-sudoedit.com -v /srv/sudoedit/data/db/:/var/lib/mysql:Z --tmpfs /etc --tmpfs /var/log --tmpfs /var/tmp localhost/sudoedit-db
    ExecStop=/usr/bin/podman stop -t 3 db-sudoedit.com
    ExecStopPost=/usr/bin/podman rm -f db-sudoedit.com
    Restart=always

    [Install]
    WantedBy=multi-user.target

It's a pretty standard unit file. But I want to point out a few things.

  1. Notice I set the service to start “After” sshd. This is because I need the login service to be started, and for now the best way to make sure that the system is ready to log users in is to wait for the sshd service to be up.
  2. Notice the User definition – systemd is starting this container service as a standard non-root user.
  3. The ExecStart definition – the Z (capital Z) in the volume declaration indicates the following: “The Z option tells Podman to label the content with a private unshared label. Only the current container can use a private volume.” – from man podman-run.

All that's left to do is start and enable the service and you should be up and running.

sudo systemctl enable db-sudoedit.service --now

If all goes well you should have your database up and running on port 3306 on your host machine.

You can switch over to your service account again and check for the running container:

    podman ps
    CONTAINER ID  IMAGE                                                 COMMAND     CREATED            STATUS                PORTS  NAMES
    194793d82d0b  localhost/sudoedit-db:latest  /sbin/init  About an hour ago  Up About an hour ago         db-sudoedit.com   3306


    podman container top -l | grep mysql
    mysql   144   1      0.046   1h12m52.971462433s   ?     2s     /usr/libexec/mysqld --basedir=/usr

What next?

The next step for me is to get an Nginx container up and running to act as a reverse proxy for my Apache vhosts. I'm going to break the one-process-per-container rule again and have it handle both the Nginx reverse proxy and certbot for my Let's Encrypt SSL termination.

Once I get that up and running, I'll update the blog with any tips I learned along the way. I will also include a brief discussion of privileged ports, and whether I chose to use a root container for Nginx or ended up allowing non-root users to bind privileged ports.

If you read through this and you think I missed something, or have any questions, let me know.

DNF Update Information – Fedora

The Fedora operating system comes with an updated version of the famous yum package management utility, called “DNF”. DNF stands for “Dandified YUM”, and it retains the general syntax that users of the yum package manager are used to.

If you are reading this post, you should be familiar with at least the basics of installing and updating packages with YUM or DNF. Take a look at the Fedora Docs if you need a quick refresher on how to install packages with DNF.

What I would like to go over is a slightly more advanced, though not difficult, aspect of DNF/YUM: how to get detailed information on what updates are available, why they are needed, and how to be a bit more selective in the updates that you choose to install. The commands we cover here work on the current Fedora release (currently 31), and they should work on any release as far back as 22, which is when the switch to DNF became official, as well as on CentOS 8 and RHEL 8.

What information can you get from DNF?

There is a ton of information available directly from the command line to help you gather information on the latest fixes, enhancements, and security vulnerabilities that affect the systems you manage. We will see how to find Fedora advisories, CVEs, and Bugzilla fixes that are installable on a Fedora system, be it a Workstation or a Server, and how to install just the packages that are required to address those issues.

The command we are looking at specifically is dnf updateinfo.

Keep in mind that the level of detail provided depends on the security metadata supplied by the OS vendor or repositories. Not all repositories include metadata for security fixes and bugfixes.

Who cares? Why not just install all the updates and not worry about it?

Just doing dnf -y upgrade every couple of weeks is probably just fine for a lot of people, maybe most people. If you don't care about the CVEs you're addressing, or the enhancements that are coming down, then there's no need to keep reading. – no judgment, I often don't necessarily care about all that stuff either, but when I do care it's nice to know how to find that information.

This is directed more towards someone who:

  • Needs to report on CVEs/bugfixes available on a system.
  • Wants to minimize change while still keeping a system patched and secure.
  • Likes to stay in the loop about what vulnerabilities are being patched when they update.

How to get information on available updates.

DNF update summary

If you are just looking for a brief summary of the types of updates that are available on your system, you can use dnf updateinfo or dnf updateinfo --summary; both commands do the same thing.

    dnf updateinfo
    ...
    ...
    Updates Information Summary: available
        10 Security notice(s)
             4 Important Security notice(s)
             6 Moderate Security notice(s)
        22 Bugfix notice(s)
         6 Enhancement notice(s)
         4 other notice(s)

If you just need a quick executive summary to hand off to your manager, or to an application owner, this is what you are looking for. It gives you a quick breakdown of the types of updates that are available (Security, Bugfix, Enhancement, etc.) and, in the case of security updates, even breaks them down into more detailed categories (Critical, Important, Moderate, Low).

Notice that you do not have to run these commands with sudo. A regular user should be able to generate these reports if they need to.

DNF advisories

To get a more detailed look at the available patches than the summary provides, you can see which Fedora advisories are ready to be installed using dnf updateinfo --list.

    dnf updateinfo --list
    ....
    ...
    FEDORA-2020-76d608179d Moderate/Sec.  NetworkManager-ssh-1.2.11-1.fc30.x86_64
    FEDORA-2020-76d608179d Moderate/Sec.  NetworkManager-ssh-gnome-1.2.11-1.fc30.x86_64
    FEDORA-2020-e94bce43a0 bugfix         abrt-2.14.0-1.fc30.x86_64
    FEDORA-2020-e94bce43a0 bugfix         abrt-addon-ccpp-2.14.0-1.fc30.x86_64
    FEDORA-2020-e94bce43a0 bugfix         abrt-addon-kerneloops-2.14.0-1.fc30.x86_64
    ...
    FEDORA-2020-262cfead59 bugfix         authselect-compat-1.1-3.fc30.x86_64
    FEDORA-2020-262cfead59 bugfix         authselect-libs-1.1-3.fc30.x86_64
    FEDORA-2020-375927619e unknown        babl-0.1.74-1.fc30.x86_64
    FEDORA-2020-5e06ad5ec5 unknown        cryptsetup-2.3.0-1.fc30.x86_64
    FEDORA-2020-5e06ad5ec5 unknown        cryptsetup-libs-2.3.0-1.fc30.x86_64
    FEDORA-2020-93f59740fe bugfix         cups-filters-1.27.1-1.fc30.x86_64
    FEDORA-2020-93f59740fe bugfix         cups-filters-libs-1.27.1-1.fc30.x86_64
    FEDORA-2020-173ac89547 bugfix         distribution-gpg-keys-1.37-1.fc30.noarch
    FEDORA-2020-42dbcf8d17 bugfix         dkms-2.8.1-4.20200214git5ca628c.fc30.noarch
    FEDORA-2020-66c974fdb6 enhancement    dnf-4.2.18-1.fc30.noarch
    FEDORA-2020-66c974fdb6 enhancement    dnf-data-4.2.18-1.fc30.noarch
    FEDORA-2020-66c974fdb6 enhancement    dnf-plugins-core-4.0.13-1.fc30.noarch
    FEDORA-2020-66c974fdb6 enhancement    dnf-yum-4.2.18-1.fc30.noarch
    FEDORA-2020-46169d6812 enhancement    enchant2-2.2.8-1.fc30.x86_64
    FEDORA-2020-247650d74a Important/Sec. firefox-73.0.1-1.fc30.x86_64
    ...
    FEDORA-2020-1a8b3ac8a4 bugfix         libsane-hpaio-3.19.12-4.fc30.x86_64
    FEDORA-2020-6f1209bb45 Moderate/Sec.  libtiff-4.0.10-8.fc30.x86_64
    FEDORA-2020-765f45cd37 unknown        libtirpc-1.2.5-1.rc2.fc30.x86_64
    FEDORA-2020-da16c02863 bugfix         libxcrypt-4.4.15-1.fc30.x86_64
    FEDORA-2020-da16c02863 bugfix         libxcrypt-compat-4.4.15-1.fc30.x86_64
    FEDORA-2020-da16c02863 bugfix         libxcrypt-devel-4.4.15-1.fc30.x86_64
    FEDORA-2020-b7b2270753 bugfix         mdadm-4.1-1.fc30.x86_64
    FEDORA-2020-881594a179 enhancement    mkpasswd-5.5.6-1.fc30.x86_64
    ...

What are we looking at here? By column, you can see the following information:

  • Advisory name. i.e (FEDORA-2020-76d608179d)
  • Type. i.e. (enhancement, bugfix, security)
  • The name and version of the package that will address the issue. i.e. (NetworkManager-ssh-1.2.11-1.fc30.x86_64)

By default, the --list option creates a list of advisories that your system is affected by. You can break this list down even further using --security, --bugfix, or --enhancement.

Try dnf updateinfo --list --security to see a list of all the security-related advisories that are applicable to your system.

    dnf updateinfo --list --security
    ...
    FEDORA-2020-76d608179d Moderate/Sec.  NetworkManager-ssh-1.2.11-1.fc30.x86_64
    FEDORA-2020-76d608179d Moderate/Sec.  NetworkManager-ssh-gnome-1.2.11-1.fc30.x86_64
    FEDORA-2020-247650d74a Important/Sec. firefox-73.0.1-1.fc30.x86_64
    FEDORA-2020-092ef6572a Moderate/Sec.  glib2-2.60.7-3.fc30.x86_64
    FEDORA-2020-47efc31973 Important/Sec. libnghttp2-1.40.0-1.fc30.x86_64
    FEDORA-2020-6f1209bb45 Moderate/Sec.  libtiff-4.0.10-8.fc30.x86_64
    FEDORA-2020-8193c0aa68 Important/Sec. openjpeg2-2.3.1-6.fc30.x86_64
    FEDORA-2020-571091c70b Moderate/Sec.  ppp-2.4.7-34.fc30.x86_64
    FEDORA-2020-5cdbb19cca Moderate/Sec.  python3-pillow-5.4.1-4.fc30.x86_64
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-241-14.git18dd3fb.fc30.x86_64
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-container-241-14.git18dd3fb.fc30.x86_64
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-libs-241-14.git18dd3fb.fc30.x86_64
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-pam-241-14.git18dd3fb.fc30.x86_64
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-rpm-macros-241-14.git18dd3fb.fc30.noarch
    FEDORA-2020-f8e267d6d0 Important/Sec. systemd-udev-241-14.git18dd3fb.fc30.x86_64
    FEDORA-2020-4d11d35a1f Moderate/Sec.  webkit2gtk3-2.26.4-1.fc30.x86_64
    FEDORA-2020-4d11d35a1f Moderate/Sec.  webkit2gtk3-jsc-2.26.4-1.fc30.x86_64

Use DNF to get detailed information about an advisory

It looks like one of my outstanding security issues is FEDORA-2020-f8e267d6d0. What does that mean? DNF can give you a detailed look at what the advisories mean, what issues they address and which packages will be installed to fix those issues.

We'll be using a new command switch: dnf updateinfo --info.

Let's say our management wants to know what is included in FEDORA-2020-f8e267d6d0. That information can be gathered from DNF, no need to start searching the web for answers.

    dnf updateinfo --info --advisory=FEDORA-2020-f8e267d6d0
    ...
    ===============================================================================
      systemd-241-14.git18dd3fb.fc30
    ===============================================================================
      Update ID: FEDORA-2020-f8e267d6d0
           Type: security
        Updated: 2020-03-09 15:44:28
           Bugs: 1614871 - systemd-journald.service: Service has no hold-off time, scheduling restart
               : 1705522 - resume from hibernation times out on disk unlock screen after 90 seconds (even with systemd.device-timeout=0)
               : 1708213 - Remote/distributed journal broken in systemd 241 (no workaround), backport 242 required
               : 1709547 - Boot fails when password file in crypttab can't be read
               : 1717712 - F30 installer screen inverted
               : 1793980 - CVE-2019-20386 systemd: a memory leak was discovered in button_open in login/logind-button.c when executing the udevadm trigger command [fedora-30]
               : 1798414 - CVE-2020-1712 systemd: use-after-free when asynchronous polkit queries are performed [fedora-all]
    Description: A few bugfixes and hwdb update.
               :
               : No need to log out or reboot.
       Severity: Important

As you can see DNF will provide a whole lot of useful information. Here are some of the highlights that I think are especially important:

  • Right at the top, you will see a list of packages that will be updated. In this case, it's just one.
  • Type. In this case, it is a security issue.
  • The date and time the package update became available.
  • Which bugs this will fix with a BZ number that you can look up (more on that later).
  • A brief description, which includes whether or not a reboot is required.
  • The severity.

Speaking of Bugzilla reports...

Looking at the output of the advisory information, we can see several bugs listed, all of them prefixed by a number. Those numbers correspond to a Bugzilla report.

DNF can also get information on a Bugzilla report. For example one of the bugs fixed by FEDORA-2020-66c974fdb6 is 1256108. Let's see what information we can get about that report.

    dnf updateinfo --info --bz=1256108
    ...
    ===============================================================================
      dnf-4.2.18-1.fc30 dnf-plugins-core-4.0.13-1.fc30 libdnf-0.43.1-2.fc30 microdnf-3.4.0-1.fc30
    ===============================================================================
      Update ID: FEDORA-2020-66c974fdb6
           Type: enhancement
        Updated: 2020-03-09 15:43:35
           Bugs: 1256108 -
               : 1338975 -
               : 1782052 -
               : 1783041 -
    Description: libdnf:
               : 	
               : - Allow excluding packages with "excludepkgs" and globs
               : - Add two new query filters: obsoletes_by_priority, upgrades_by_priority
               : - [context] Use installonly_limit from global config (RhBug:1256108)
               : - [context] Add API to get/set "install_weak_deps"
               : - [context] Add wildcard support for repo_id in dnf_context_repo_enable/disable (RhBug:1781420)
               : - [context] Adds support for includepkgs in repository configuration.
               : - [context] Adds support for excludepkgs, exclude, includepkgs, and disable_excludes in main configuration.
               : - [context] Added function dnf_transaction_set_dont_solve_goal
               : - [context] Added functions dnf_context_get/set_config_file_path
               : - [context] Respect "plugins" global conf value
               : - [context] Add API to disable/enable plugins
               :
               : dnf:
               :
               : - [doc] Remove note about user-agent whitelist
               : - Do a substitution of variables in repo_id (RhBug:1748841)
               : - Respect order of config files in aliases.d (RhBug:1680489)
               : - Unify downgrade exit codes with upgrade (RhBug:1759847)
               : - Improve help for 'dnf module' command (RhBug:1758447)
               : - Add shell restriction for local packages (RhBug:1773483)
               : - Fix detection of the latest module (RhBug:1781769)
               : - Document the retries config option only works for packages (RhBug:1783041)
               : - Sort packages in transaction output by nevra (RhBug:1773436)
               : - Honor repo priority with check-update (RhBug:1769466)
               : - Strip '\' from aliases when processing (RhBug:1680482)
               : - Print the whole alias definition in case of infinite recursion (RhBug:1680488)
               : - Add support of commandline packages by repoquery (RhBug:1784148)
               : - Running with tsflags=test doesnt update log files
               : - Restore functionality of remove --oldinstallonly
               : - Allow disabling individual aliases config files (RhBug:1680566)
               :
               : dnf-plugins-core:
               :
               : - Fix: config_manager respect config file location during save
               : - Redesign reposync --latest for modular system (RhBug:1775434)
               : - [reposync] Fix --delete with multiple repos (RhBug:1774103)
               : - [doc] Skip creating and installing migrate documentation for Python 3+
               : - [config-manager] Allow use of --set-enabled without arguments (RhBug:1679213)
               : - [versionlock] Prevent conflicting/duplicate entries (RhBug:1782052)
               :
               : microdnf:
               :
               : - Add reinstall command
               : - Add "--setopt=tsflags=test" support
               : - Add "--setopt=reposdir=<path>" and "--setopt=varsdir=<path1>,<path2>,..." support
               : - Add "--config=<path_to_config_file>" support
               : - Add "--disableplugin", "--enableplugin" support (RhBug:1781126)
               : - Add "--noplugins" support
               : - Add "--setopt=cachedir=<path_to_cache_directory>" support
               : - Add "--installroot=<path_to_installroot_directory>" support
               : - Add "--refresh" support
               : - Support "install_weak_deps" conf option and "--setopt=install_weak_deps=0/1"
               : - Respect reposdir from conf file
               : - Respect "metadata_expire" conf file opton (RhBug:1771147)
               : - Fix: Dont print lines with (null) in transaction report (RhBug:1691353)
               : - [repolist] Print padding spaces only if output is terminal
       Severity: None

In this case, the advisory was an “Enhancement”. You can see that the change report is fairly extensive and should satisfy the curiosity of most people who might have a need to know what this particular patch will do.

Okay, great... Now, what if I only want to install the packages to fix a particular bug?

Let's say our organization has a need to patch just one particular advisory. Let's pick a security-related one. The advisory FEDORA-2020-4d11d35a1f was related to a WebKit issue.


    dnf updateinfo --info --advisory=FEDORA-2020-4d11d35a1f
    ...
    ===============================================================================
      webkit2gtk3-2.26.4-1.fc30
    ===============================================================================
      Update ID: FEDORA-2020-4d11d35a1f
           Type: security
        Updated: 2020-03-09 15:45:07
    Description:  * Always use a light theme for rendering form controls.
               :  * Fix several crashes and rendering issues.
               :  * Security fixes: CVE-2020-3862, CVE-2020-3864, CVE-2020-3865, CVE-2020-3867, CVE-2020-3868
       Severity: Moderate

If, for whatever reason, this was something that you needed to fix right now, but you were not ready to apply all of your patches, you can tell DNF to only install packages that apply to a particular advisory using sudo dnf update --advisory="advisory_name".

    sudo dnf update --advisory=FEDORA-2020-4d11d35a1f
    Last metadata expiration check: 0:43:23 ago on Tue 10 Mar 2020 12:50:58 PM EDT.
    Dependencies resolved.
    =========================================================================================================
     Package                      Architecture        Version                     Repository            Size
    =========================================================================================================
    Upgrading:
     webkit2gtk3                  x86_64              2.26.4-1.fc30               updates               15 M
     webkit2gtk3-jsc              x86_64              2.26.4-1.fc30               updates              5.8 M

    Transaction Summary
    =========================================================================================================
    Upgrade  2 Packages

    Total download size: 21 M
    Is this ok [y/N]:

DNF will also accept a comma-separated list of advisories to apply. For instance, if we wanted to apply the following two advisories:


    sudo dnf update --advisory=FEDORA-2020-4d11d35a1f,FEDORA-2020-66c974fdb6
    Last metadata expiration check: 0:46:50 ago on Tue 10 Mar 2020 12:50:58 PM EDT.
    Dependencies resolved.
    =========================================================================================================
     Package                             Architecture      Version                  Repository          Size
    =========================================================================================================
    Upgrading:
     dnf                                 noarch            4.2.18-1.fc30            updates            396 k
     dnf-data                            noarch            4.2.18-1.fc30            updates             47 k
     dnf-plugins-core                    noarch            4.0.13-1.fc30            updates             30 k
     dnf-yum                             noarch            4.2.18-1.fc30            updates             45 k
     libdnf                              x86_64            0.43.1-3.fc30            updates            611 k
     python3-dnf                         noarch            4.2.18-1.fc30            updates            423 k
     python3-dnf-plugins-core            noarch            4.0.13-1.fc30            updates            170 k
     python3-hawkey                      x86_64            0.43.1-3.fc30            updates             96 k
     python3-libdnf                      x86_64            0.43.1-3.fc30            updates            711 k
     webkit2gtk3                         x86_64            2.26.4-1.fc30            updates             15 M
     webkit2gtk3-jsc                     x86_64            2.26.4-1.fc30            updates            5.8 M

    Transaction Summary
    =========================================================================================================
    Upgrade  11 Packages

    Total download size: 23 M

Try some of these commands on your own systems.

Take a look at the DNF documentation here: https://dnf.readthedocs.io/en/latest/index.html and try out different combinations of the updateinfo option on your own systems to get a more in-depth look at what you are updating the next time you need to patch.

Try adding a -v to the --info commands that we looked at above. You'll see that you can get even more information.
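
For example, here are a few combinations I find handy; the advisory ID below is the WebKit one from above, so substitute your own:

    # show verbose details for a single advisory
    dnf updateinfo --info -v --advisory=FEDORA-2020-4d11d35a1f

    # list only the available security advisories
    dnf updateinfo --list --security

    # apply only security-related updates
    sudo dnf update --security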

Happy patching! And let me know if this has helped you out at all!

TLDR; git is awesome. It will save you time, and headaches if you are working on automation.

Recently I've put some serious effort into learning and using git for version control.

I know what you're thinking.... probably something like “Just now learning git? Have you been living under a rock?”... I realize that I'm at least 10 years late to this party; I just didn't know what I was missing.

My version control system of choice, prior to a few weeks ago, involved a process that is probably familiar to those of you who, like me, find yourselves doing things the hard way for way too long.

If I was working on modifying a script, I would simply make a copy of it and call it something like newscript-v2 or somescript-test. In all honesty, this process has worked just fine for most of my life. I'm not a coder by trade, I'm an operations guy, so most of my scripts were only a handful of lines, and most of those only performed boring tasks that I could do manually for a while if I absolutely had to. No work really stopped if I had to spend a couple of days fixing a broken script; things would just slow down a bit.

So what changed? Ansible.

Over the last couple of years, Ansible has gained a huge foothold in my daily work life. We use it for pretty much everything; in a lot of ways Ansible has become the way Linux administration is done around my office. As a result, the old way of renaming a file and moving playbooks to a new directory had become completely unsustainable. We would quite literally have dozens of directories with playbooks in various states of disarray, which would quickly be abandoned when a new change was required and we had to cut over to yet another new directory of playbooks. God help you if you messed up the blessed version of any roles used for deployments, configuration, or patching! Gone are the days when that would just mean work slows down for a bit while the scripts are fixed. Quite literally, work would come to a grinding halt (for some things anyway) until the problematic playbooks were fixed. The Ansible playbooks have begun to do so much of the heavy lifting, especially around server deployment and patching, that if those playbooks were to break we simply would not have the staffing resources to do all that work manually; it would be impossible to keep up with it all. This is where git comes in.

Git-huh what is it good for?

Absolutely everything... Well, everything that involves modifying plain text files, which in Linux is pretty much everything. Git comes with a feature called branching. Branching means that you can take all the things in your current working directory, modify them on a “branch” without changing the “master” copy and then merge those changes into the master branch when you are ready. These branches are created nearly instantly, they are easy to switch to, and they will change your life if you used to manage the files the way I did.

The git documentation (git-scm documentation) describes branching like this:

  • Frictionless Context Switching. Create a branch to try out an idea, commit a few times, switch back to where you branched from, apply a patch, switch back to where you are experimenting, and merge it in.
  • Role-Based Codelines. Have a branch that always contains only what goes to production, another that you merge work into for testing, and several smaller ones for day to day work.
  • Feature Based Workflow. Create new branches for each new feature you're working on so you can seamlessly switch back and forth between them, then delete each branch when that feature gets merged into your main line.
  • Disposable Experimentation. Create a branch to experiment in, realize it's not going to work, and just delete it – abandoning the work—with nobody else ever seeing it (even if you've pushed other branches in the meantime).

Branching is the feature that got me hooked on git. It means that my workspace can be cleaner. I don't have to worry about orphaned files lying around that may or may not be useful. I simply create a new branch, start working on a fix or a new feature, and then delete that branch after I've merged the changes. Nothing to clean up, nothing half-broken lying around waiting for some poor unsuspecting soul to run it accidentally. Try git checkout -b dev after you initialize your first test repository. If you haven't used git, I'm sure you will be just as impressed as I am.
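
If you want to see the whole loop, here is a minimal sketch of the branch-and-merge workflow I'm describing; the branch and file names are just placeholders:

    # create and switch to a new branch
    git checkout -b fix-firewall-task

    # ...edit files, then stage and commit the change...
    git add site.yml
    git commit -m "Fix the firewall task"

    # merge it back into master and delete the branch
    git checkout master
    git merge fix-firewall-task
    git branch -d fix-firewall-task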

Git runs locally

You don't have to create a GitHub account, or even build new infrastructure, to start using git. This is one of the big misconceptions I had around git. I assumed you needed a remote repository of some sort in order for it to work; you don't. Git is designed to run just as well locally as it does with a remote.

“With Git, nearly all operations are performed locally.” – https://git-scm.com/about/small-and-fast

Git definitely makes it easy to share code if that is what you want to do, but that is not its primary purpose. First and foremost it is a version control system, and it excels as a strictly local repository for your files. No GitHub required, no new accounts to keep track of.
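
As a quick illustration, here is what a purely local workflow might look like, assuming you've already set your git user.name and user.email; the directory and file names are placeholders:

    # everything below happens on your own machine, no remote involved
    mkdir ~/playbooks && cd ~/playbooks
    git init
    echo "---" > site.yml
    git add site.yml
    git commit -m "Initial commit of my playbooks"
    git log --oneline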

Installing git is trivial

Installing git on any operating system is easy, and after learning a few core commands you can start to use git on your local system to maintain code (or blog posts) or Ansible playbooks, all without generating a bunch of orphaned files that you aren't sure are safe to delete. Installing git is so easy that I'm not even going to bother going over the installation here; in some cases, your OS might already have git installed. If you are ready to try git for yourself, head on over to https://git-scm.com/book/en/v2/Getting-Started-Installing-Git where you will quickly find instructions for installing git on Linux, macOS, and Windows.

Some resources to help you get started.

I'm sure some of you have been using git for years. Please share anything that you've learned along the way. If you are new to git and decide to try it out, let me know how it goes.

The HTTP/2 protocol is designed for increased speed and performance. It was published in 2015 and is supported by Apache 2.4 via the mod_http2 module.

Note: For all practical purposes, you will need a valid SSL certificate to implement HTTP/2. Many web browsers, including Firefox, will not use HTTP/2 on a site without one. You can obtain a free certificate from Let's Encrypt; check out https://certbot.eff.org for more info.
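
If you go the Let's Encrypt route, the rough shape of it on Fedora looks something like this; the package names come from the Fedora repos and the domain is a placeholder, so adjust both for your setup:

    sudo dnf install certbot python3-certbot-apache
    sudo certbot --apache -d www.example.com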

How to enable HTTP/2 on Apache 2.4.

This write up is based on Fedora 30 and Apache 2.4, but it should also work on CentOS 8 or RHEL 8.

Install the required packages.

sudo dnf install php-fpm mod_fcgid fcgi mod_http2

Ensure the correct mpm module is loaded.

In order to enable HTTP/2 in Apache, you need to make sure that you are not using the mpm_prefork module. The prefork module is non-threaded and cannot handle the multiplexed connections that HTTP/2 requires. For the best results you will want to switch to the mpm_event module; we will walk through how to make that change momentarily.

Edit the file /etc/httpd/conf.modules.d/00-mpm.conf.

Uncomment the line:


    LoadModule mpm_event_module modules/mod_mpm_event.so

Make sure to place a # in front of the line:

    #LoadModule mpm_prefork_module modules/mod_mpm_prefork.so

You only want one MPM module loaded, so make sure that mpm_event is the only active module; prefork and worker should be commented out.
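
One way to double-check which MPM is enabled is to ask httpd to list its loaded modules; this is just a sanity check, not a required step:

    # should list mpm_event_module and nothing for prefork or worker
    sudo httpd -M | grep -i mpm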

Edit the file /etc/httpd/conf.d/php.conf.

Next, edit /etc/httpd/conf.d/php.conf and look for the SetHandler parameter in the local php-fpm section shown below.

    #
    # Redirect to local php-fpm (no mod_php in default configuration)
    #
    <IfModule !mod_php5.c>
      <IfModule !mod_php7.c>
        # Enable http authorization headers
        SetEnvIfNoCase ^Authorization$ "(.+)" HTTP_AUTHORIZATION=$1

        <FilesMatch \.(php|phar)$>
            SetHandler "proxy:unix:/run/php-fpm/www.sock|fcgi://localhost"
            #SetHandler "proxy:fcgi://127.0.0.1:9000"
        </FilesMatch>
      </IfModule>
    </IfModule>

Ensure that the following parameter is active. Copy and paste it in under the FilesMatch block shown above if it isn't present.

    SetHandler "proxy:unix:/run/php-fpm/www.sock|fcgi://localhost"

Ensure that SetHandler proxy:fcgi://127.0.0.1:9000 in that section is commented out with a #.

Edit your virtual hosts file:

Near the top of your virtual hosts file, under the “ServerName” parameter add the following:

    Protocols h2 h2c http/1.1

Start and Enable the php-fpm service.

    sudo systemctl enable php-fpm && sudo systemctl start php-fpm

Restart the httpd service:

    sudo systemctl restart httpd

Check your logs to verify HTTP/2.

Visit your website and click a few links, refresh the page a couple times, and then take a look at your web logs.

    sudo grep "HTTP/2.0" /var/log/httpd/site_access.log

If all the above steps were completed successfully, you should get back some log entries that contain “HTTP/2.0”. If so, congratulations, your site now supports HTTP/2!
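
If you'd rather check from the client side, curl can negotiate HTTP/2 for you; the domain below is a placeholder:

    # the first response line should read "HTTP/2 200" if the protocol is active
    curl -sI --http2 https://www.example.com | head -n 1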


I'm in the process of migrating a few Puppet modules over to Ansible, and along the way I've run into an unusual situation while creating users and groups. Here is some background. I have an application that will refuse to complete its installation unless it can see certain users and groups in the local passwd and group files. It just so happens that these same users and groups are also contained in LDAP.

Puppet has an attribute called “forcelocal” in its user and group resources that has always been able to create a local user or group in this situation, despite there being a matching user or group in LDAP. So, I was a bit disappointed to discover that the similar “local” option in both the group and the user Ansible modules did not work in the same way.

From the user module docs, the “local” option has the following behavior:

Forces the use of “local” command alternatives on platforms that implement it. This is useful in environments that use centralized authentication when you want to manipulate the local users. I.E. it uses luseradd instead of useradd. This requires that these commands exist on the targeted host, otherwise it will be a fatal error. (https://docs.ansible.com/ansible/latest/modules/user_module.html#user-module)

After reading that, I expected Ansible to create local users and groups regardless of whether or not that user or group was already found in LDAP. However, that is not the case. For whatever reason, specifying the local option does not create a local user or group if that user or group is already in LDAP and is visible to your target server. Instead, Ansible will simply mark the task as complete and happily move on to the next step. Looking at the code for the module, it's using “grp” from the standard library, so it just checks the user database for the user or group; since it finds the user (albeit in LDAP) it moves on, which for my use case kinda defeats the whole purpose of the local option. I would like to see this module do a further check to see if the specified user name or group is listed in the /etc/passwd or /etc/group files before marking success.
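
You can see the distinction the module is missing right from the shell. Here localuser and localgroup are the same example names used in the playbook below:

    # NSS (which includes sssd/LDAP) knows about the account...
    getent passwd localuser

    # ...but the local files may not, which is the check I wish the module made
    grep '^localuser:' /etc/passwd
    grep '^localgroup:' /etc/group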

After a bit of head scratching and cursing, I read a few blogs and Stack Exchange solutions from others who had attempted to solve this, but none of them struck me as viable for my situation. Because I'm not enough of a programmer to fix this little bug, and since I only need to run this particular playbook once, at the time a server is deployed, I chose a bit of a compromise solution.

At first, I kicked around the idea of inserting the user and group directly into the passwd, shadow, and group files, but that just didn't seem like a clean solution to me. Plus, I assume at some point this problem will be fixed, so it seems easier to continue using the group and user modules than to rewrite the playbook at some point in the near future.

So I decided to do the following: stop the sssd service (thus making the LDAP users and groups invisible to the server), add the users and groups using the Ansible module, and then restart sssd.

Here is a slimmed down version of what I ended up doing in the playbook. Keep in mind that it will only work if you are using sssd, and you should also make sure that your server is in a state where sssd can be stopped while these tasks are processed. In my case it's fine because I only run this particular sequence once, when the server is built.

    ---
    - name: stop sssd
      service:
        name: sssd
        state: stopped
    - name: add group
      group:
        name: localgroup
        gid: 1234
        state: present
    - name: add user
      user:
        name: localuser
        uid: 1234
        group: localgroup
        state: present
    - name: start sssd
      service:
        name: sssd
        state: started
        enabled: yes
     ...


Let me know if you've run into this, or if you have a better solution. I suspect you could make this a bit more universal by adding a test to see if the user and group have an entry in the passwd and group files before stopping sssd.
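
For what it's worth, the test I have in mind is nothing fancier than this sketch; the names match the example playbook above, and you could register a similar check in a task to make the sssd steps conditional:

    #!/bin/bash
    # Only bother with the sssd stop/add/start sequence if the entries are not already local.
    if grep -q '^localuser:' /etc/passwd && grep -q '^localgroup:' /etc/group; then
        echo "local user and group already present, nothing to do"
        exit 0
    fi
    echo "local entries missing, safe to run the user and group tasks"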

When you build a server in AWS, one of the last steps is to either acknowledge that you have access to an existing pem file or to create a new one to use when authenticating to your EC2 server.

If you want to convert that file into an rsa key that you can use in an ssh config file, you can use this handy dandy openssl command string.

openssl rsa -in somefile.pem -out id_rsa


Note: you do not have to call the output file id_rsa; just make sure that you don't overwrite an existing id_rsa file.

Copy the id_rsa file to your .ssh directory and make sure to change permissions on the id_rsa key to read only for just your user.

chmod 400 ~/.ssh/id_rsa
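
Once the key is in place, an entry in ~/.ssh/config might look something like this; the host alias, hostname, and user are placeholders (the login user depends on your AMI):

    Host my-ec2-box
        HostName ec2-203-0-113-10.compute-1.amazonaws.com
        User ec2-user
        IdentityFile ~/.ssh/id_rsa

After that, ssh my-ec2-box will get you in.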


Install Ansible

On most Linux distributions Ansible can be installed directly through your distribution's package manager.

For those using macOS or a distribution that doesn't package Ansible, you can install it via python pip. The Ansible docs have a really good walkthrough for installation that can be found here: http://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html

I won't repeat those instructions, except to say that the computer you install Ansible on should have Python 2.6 or higher (or Python 3.5 or higher), and that while Windows can be managed by Ansible, it cannot be a control machine.

Once Ansible has been installed it is, for the most part, ready to be used. There are no daemons to start, or services to enable. Ansible will run when you call it, either directly from the command shell, or via scheduled tasks.

Configuration files:

There are a few files that you should be aware of when you get started using Ansible.

The Ansible configuration file:

Located at /etc/ansible/ansible.cfg, the Ansible configuration file contains all of the global Ansible settings.

You can specify the user that Ansible will run as, the number of parallel processes (forks) that it will spin up, and many other configuration items to help you fine-tune your Ansible installation.

The settings in this file can be overridden by creating an ansible.cfg file in your user's home directory (~/.ansible.cfg) or in the directory that contains your playbooks (current_dir/ansible.cfg). For common options and a more in-depth explanation of these files, take a look at the Ansible documentation.
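
For illustration, a pared-down ansible.cfg might look something like this; the values are placeholders, not recommendations:

    [defaults]
    inventory   = /etc/ansible/hosts
    forks       = 10
    remote_user = ansible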

The Ansible inventory file:

The Ansible inventory file is located at /etc/ansible/hosts.

You can change the default location of this file by updating the Ansible configuration file to point to a different path, or you can specify an inventory file with -i when calling an Ansible playbook.

The inventory file is a flat text file that lists the hosts you want to manage using Ansible. Any host that you want to run a playbook on must have an entry in the inventory file. When you installed Ansible a template file should've been created for you at /etc/ansible/hosts.

The structure of the inventory file is pretty simple and fairly easy to learn. As you become more comfortable using Ansible you will find that there are a lot of modifications that can be made to the inventory file that will make your playbooks even more powerful.

The basic setup of an Ansible inventory file has header sections, noted with brackets, to indicate groups of servers. For example, a group that contains all servers might look like this:

    [all]
    server1
    server2
    ........

A group that contains only database servers would look like this:


    [dbservers]
    dbserver1
    dbserver2
    .......

The dots, in this case, indicate that the list could go on and on; don't put them in your inventory file.

You can use any names you want for your groups; just make sure that they make sense to you. You can also have groups of groups, and variables that apply to entire groups (there's a short example below). Here is a link to the documentation that you will need to create well-defined inventory files: https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html#intro-inventory

Don't skip that section; read it carefully and take time to define your groups well.
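
As a quick illustration of groups of groups and group variables, an inventory snippet might look something like this; all of the names are placeholders:

    [webservers]
    web1
    web2

    [dbservers]
    dbserver1
    dbserver2

    [prod:children]
    webservers
    dbservers

    [dbservers:vars]
    ansible_user=dbadmin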

Getting Started with Ansible

Once you have Ansible installed and have built at least a simple inventory file, try making a test group with a couple of hosts in it. Append the following to the bottom of your /etc/ansible/hosts file.

    [test]
    sometestserver01
    someothertestserver02

Ad-hoc commands

If you have never used Ansible, try out some of the ad-hoc commands. These commands can run Ansible modules against a host or group of hosts without writing a playbook.

One of the most useful ad-hoc commands is also the simplest one.

Ansible has a module built into it called ping. The ping module will connect with a target system and attempt to locate a compatible python installation.

It will then report back with a success or failure message, indicating that the target host is up and able to be managed with Ansible.

I use this module quite a bit to ensure that I have connectivity between my Ansible control server, and the client servers that I want to run playbooks against. This helps to minimize surprises when a playbook fails to run at the expected time.

To use the ping module run the following command:

ansible test -m ping

In this command string, we are doing a couple of things.

First, we need to specify the group or host that we want Ansible to run against, in this case, it is the “test” group we just created.

Second, we specify the module that we want to use, in this case, that is the ping module.

As long as your Ansible control server (which could be a laptop at this point) can connect to the target hosts on port 22, this command should complete successfully.
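
Once ping works, here are a couple of other ad-hoc examples worth trying; the module arguments here are only illustrations:

    # run a quick command against the test group
    ansible test -m command -a "uptime"

    # ad-hoc module calls work too; --become escalates to root
    ansible test -m yum -a "name=httpd state=present" --become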

Speaking of success, how can you know whether an Ansible playbook or command ran successfully?

Ansible output is color-coded

Green text: Indicates that Ansible ran successfully but made no changes to the target host. This is a success in that the host was already configured as defined in the playbook.

Yellow text: Indicates that Ansible ran successfully and made changes to the target host. This is a success in that the host is now configured as defined.

Red text: Indicates a failure, either a connection failure, a syntax error, or a failure to run the appropriate tasks on the target host. If you see this you will want to take a moment to read the output. Ansible usually gives fairly detailed error messages.

Purple text: Indicates a warning. You will see this when you call a command rather than using an available module, i.e. calling yum in a command rather than using the yum module. Generally, Ansible will still run when you see purple; however, you should consider updating your playbook to use the built-in modules instead. You may also see purple text when a host is specified in a playbook but does not have a corresponding entry in the inventory file.

For more information about ad-hoc commands, read the following documentation page: https://docs.ansible.com/ansible/latest/user_guide/intro_adhoc.html

Convert old shell scripts

Once you have spent some time playing around with ad-hoc commands and feel comfortable using them, the next thing I would do is start evaluating all the bash scripts you have lying around.

Find out which ones would be a good fit for Ansible. If you have scripts that you run for patching, firewall changes, configuration file changes, or user creation/modification/deletion, then you already have a lot of low-hanging fruit to pick from.

Once you start converting bash scripts to Ansible and start running them from a centralized server, you will begin to see the power that this tool affords you.

Common repeatable tasks

If you find yourself doing the same thing over and over again, take a few hours every day and, instead of going through the motions, see if you can break those tasks down into repeatable steps. Check the Ansible module index to see if you can offload any of that work to a playbook.

After you have converted some shell scripts into Ansible it should be a little easier for you to identify which of your day to day tasks can be accomplished by an Ansible playbook.

I know most sysadmins are not straining to fill the hours at the office. We all have more work than we know what to do with, but I promise that spending a couple of hours a week on automation will be well worth it in the long run. Unless your boss is incompetent or some kind of control freak, they should be ecstatic to see you working to improve processes. If not, find a new job...

Things you don't like to do

This is why I love Ansible. Once you've found a few common configuration items, and identified repetitive tasks, you can begin the noble work of getting Ansible to do the parts of your job that you don't like doing.

Updating a firewall on 200 servers? That sounds awful and I don't want to do it... Spend one day on a playbook, and never worry about it again.

Spending your nights and weekends patching? Nah, I'd rather sit on the couch and watch my cats chase a laser while stuffing my face with pizza and beer.

Seriously, work is for suckers. Spend some time learning Ansible and it will pay you back in spades. You might even find yourself with enough time to work on a project you haven't had time for. Or maybe take a vacation.

Ansible Example: Patch and reboot

Ah yes, patching, we have to do it. If you're not regularly applying patches, you need to have a really good reason not to and a good mitigation strategy.

Patching is something that, in large environments, can quickly consume a lot of time if it isn't managed properly and, more importantly, if it isn't automated. In that regard, I have provided an example playbook that you can use to begin automating some of your vulnerability patching.

    ---
    - hosts: <put some server names in here, without the angle brackets>
      become: yes
      tasks:
      - name: Upgrade all installed packages
        yum:
          name: '*'
          state: latest
      - name: Reboot after update
        command: /sbin/shutdown -r 1 "Reboot after patching"
        async: 0
        poll: 0
        ignore_errors: true
      - name: Wait for server to become available
        wait_for_connection:
          delay: 60
          timeout: 500 # This can vary use a timeout that is reasonable for your environment, most VM's will reboot before 500 seconds.

Breaking down the playbook

What does it do? This playbook will patch, reboot, and wait for a server to become available. After you run it you will get a nice little color-coded summary of the play.

Note:

Playbooks are YAML files, and YAML files conventionally start with ---; think of it like the #! at the start of a bash script.

What does “become” do? Using the keyword become means that upon executing this playbook Ansible will attempt to escalate privileges to root.

After that, you can list your tasks in blocks using modules. You can find an index of provided modules here: https://docs.ansible.com/ansible/latest/modules/modulesbycategory.html

Granted, this is a pretty simple playbook: it only takes Red Hat based systems into account, and it doesn't provide notifications upon completion, but I'll leave that part up to you. If you spend a lot of time patching, then this little snippet of code should get you a long way toward reducing the manual effort you put into server updates.
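
If you save the playbook above as, say, patch.yml (the file name is up to you), running it looks something like this:

    ansible-playbook -i /etc/ansible/hosts patch.yml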

If this helped you feel free to share it.

This tutorial assumes that you have followed along with the other two parts in this series. However, if you already have some familiarity with Linux, you should be able to follow along.

Working with logical volumes (part 1)

Working with logical volumes (part 2)

Add a disk to the volume group

One of the great things about LVM is that you can add and remove physical volumes on the fly, without data loss and without interrupting services.

If you haven't already done so, add a new hard disk to your virtual machine. I created an additional 10 GB disk but you can make the disk any size you want. It doesn't have to match the previous disk that we created.

When I run sudo fdisk -l, among my output is the following:

    ......
    Disk /dev/sdc: 10 GiB, 10737418240 bytes, 20971520 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    Disk /dev/sdb: 10 GiB, 10737418240 bytes, 20971520 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    .......


I have two disks that are 10 GB in size; however, if you remember from the last post, we already identified /dev/sdb as the disk being used as a physical volume in our vgtest volume group. Knowing this, I can safely create a new physical volume on /dev/sdc. (Check part 1 if you need a refresher on how to become familiar with your disk setup.)

sudo pvcreate /dev/sdc
Physical volume "/dev/sdc" successfully created.


Now we want to add /dev/sdc into the vgtest volume group, using the vgextend command.

sudo vgextend vgtest /dev/sdc
Volume group "vgtest" successfully extended


That's all it takes to add a new disk to the volume group. You can see that the disk has been added by once again running pvscan and looking at the output.

sudo pvscan
PV /dev/sdb    VG vgtest          lvm2 [10.00 GiB / 6.00 GiB free]
PV /dev/sdc    VG vgtest          lvm2 [10.00 GiB / 10.00 GiB free]


As opposed to the first time we ran this command, we can now see that vgtest has two disks associated with it. At the terminal, run sudo vgdisplay vgtest -v and take a close look at the output. If you were successful in adding the disk, you should see an abundance of information about the entire volume group.
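
With the extra space in the volume group you could, for example, grow an existing logical volume and its filesystem in one shot. I'm assuming a logical volume named lvtest from the earlier parts of this series, so adjust the name to match your own:

    # -r resizes the filesystem along with the logical volume
    sudo lvextend -r -L +5G /dev/vgtest/lvtest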

Removing a physical volume

Step 1 move your data

What if I need to get rid of one of the disks in my volume group? The first step in this process is to move all existing data off of the physical volume that you no longer need.

You can move data from a physical volume with the pvmove command. I'll demonstrate how to do this by moving all of the data from /dev/sdb to /dev/sdc.

sudo pvmove /dev/sdb /dev/sdc
/dev/sdb: Moved: 0.05%
.....
/dev/sdb: Moved: 50.00%
.....
/dev/sdb: Moved: 100.00%


What you see above is the abbreviated output of the pvmove command when performed on a live system. The syntax for pvmove is source to target, just like the cp command.

First list the device that you want to move data from, next is the device that you want to move that data to.

Step 2 – remove the device from the volume group

In my case I want to remove the first device we used which was /dev/sdb. This is the device that contained the data we wanted to move to /dev/sdc. To remove a device from a volume group use the vgreduce command.

sudo vgreduce vgtest /dev/sdb
Removed "/dev/sdb" from volume group "vgtest"


The syntax of this command can be a bit tricky to remember (at least for me). You need to specify the device that you want to remove, not the one you are keeping. You are reducing the volume group by the obsolete device, not reducing it down to the one you are keeping.

Now when you do pvscan you should be able to see that your vgtest volume group only contains one physical volume (/dev/sdc). But you can also still see that there is a physical volume on /dev/sdb.

sudo pvscan
  PV /dev/sdc    VG vgtest          lvm2 [10.00 GiB / 2.00 GiB free]
  PV /dev/sdb                       lvm2 [10.00 GiB]


Notice that /dev/sdb no longer has an associated volume group (noted by “VG” in the above output).

Step 3 – remove the physical volume

Once you are sure that you no longer need the old device go ahead and remove it.

sudo pvremove /dev/sdb
  Labels on physical volume "/dev/sdb" successfully wiped


Conclusion

Working with logical volumes is actually much easier than many people make it out to be. The best part about using LVM is that you do not need to stop any services, or reboot the machine in order to make the changes you want. LVM allows you to make all of these changes without any kind of interruption in the normal operation of a Linux Server (or Desktop).