PostgreSQL – pg_upgrade from 10 to 12

I have some PostgreSQL databases running pretty well, but we need to keep our software updated. This is a mandatory practice for a high-quality service. Those servers are running version 10 and need to be upgraded to version 12. I have used the pg_dump / pg_restore strategy for a long time, but this time I would rather use pg_upgrade.

Let's dive into how to do it.

Table of contents:

  1. Install
  2. InitDB
  3. Check upgrade consistency
  4. Set locale
  5. Upgrade
  6. Configurations

Install

The package postgresql12-server contains everything needed to run the server, but my databases use some extensions [1], so I will also add postgresql12-devel and postgresql12-contrib to be able to compile and install them.

yum install postgresql12-server postgresql12-devel postgresql12-contrib
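
With the devel package in place, extensions that live outside contrib can be rebuilt against the new binaries. A minimal sketch, assuming a hypothetical extension whose Makefile follows the usual PGXS template and honors the PG_CONFIG variable (the source path and extension name are placeholders):

# Build and install a hypothetical PGXS-based extension against PostgreSQL 12
cd /usr/local/src/my_extension
make PG_CONFIG=/usr/pgsql-12/bin/pg_config
make install PG_CONFIG=/usr/pgsql-12/bin/pg_config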

InitDB

After installation we need to set up the new cluster with initdb:

~% /usr/pgsql-12/bin/postgresql-12-setup initdb
Initializing database … OK

Check upgrade consistency

We need to check compatibility. Switch to the postgres user (su - postgres) and run the command:

~% /usr/pgsql-12/bin/pg_upgrade --old-bindir=/usr/pgsql-10/bin --new-bindir=/usr/pgsql-12/bin --old-datadir=/var/lib/pgsql/10/data --new-datadir=/var/lib/pgsql/12/data --check

Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               fatal

Your installation contains tables declared WITH OIDS, which is not supported
anymore. Consider removing the oid column using
    ALTER TABLE ... SET WITHOUT OIDS;
A list of tables with the problem is in the file:
    tables_with_oids.txt

Failure, exiting

As you can see, I got a fatal error, indicating that the upgrade is not possible yet. In my case, tables with OIDs are the culprit; in yours it could be something else. Either way, the issue needs to be fixed before upgrading.
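
For the OID problem, pg_upgrade itself suggests the fix. A minimal sketch, where my_table is a placeholder — the real table names are listed in tables_with_oids.txt:

ALTER TABLE my_table SET WITHOUT OIDS;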

After removing the OIDs from the tables listed in tables_with_oids.txt, I ran the check again:

Performing Consistency Checks on Old Live Server
------------------------------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

*Clusters are compatible*

Yay!

Set locale

There is a tricky configuration that is not detected by the pg_upgrade check but is very important to me. I use the C locale on my databases [2], so I need to perform an extra step. If this is your case too, you can follow the same steps with your own locale.

I need to stop the postgresql-10 service and start postgresql-12:

systemctl stop postgresql-10.service
systemctl start postgresql-12.service

Then I change the locale on template1, so it will already be in place when my database is upgraded.

UPDATE pg_database SET datcollate='C', datctype='C' WHERE datname='template1';
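
To double-check the change before stopping the server, the locale columns can be inspected in pg_database (just a sanity check, run as the postgres user):

SELECT datname, datcollate, datctype FROM pg_database WHERE datname = 'template1';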

Then stop the server again (systemctl stop postgresql-12.service) to be ready for the upgrade.

Upgrade

The upgrade command is the same one we ran before, without the --check flag.

~% /usr/pgsql-12/bin/pg_upgrade --old-bindir=/usr/pgsql-10/bin --new-bindir=/usr/pgsql-12/bin --old-datadir=/var/lib/pgsql/10/data --new-datadir=/var/lib/pgsql/12/data

Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for tables WITH OIDS                               ok
Checking for invalid "sql_identifier" user columns          ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_xact to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
                                                            ok
Copying user relation files
                                                            ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh

As suggested, I ran analyze_new_cluster.sh. It is optional, but nice to have:

vacuumdb: processing database "mydb": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "mydb": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "mydb": Generating default (full) optimizer statistics
vacuumdb: processing database "postgres": Generating default (full) optimizer statistics
vacuumdb: processing database "template1": Generating default (full) optimizer statistics

Done

Configurations

Before deleting your old cluster, remember to carry over your configuration files:

mv /var/lib/pgsql/12/data/pg_hba.conf /var/lib/pgsql/12/data/pg_hba.conf.new
cp /var/lib/pgsql/10/data/pg_hba.conf /var/lib/pgsql/12/data/

I am not a big fan of writing directly to the postgresql.conf file. Instead, I keep configuration files under version control and include the directory where those files are deployed. I treat them as code, so they become easier to maintain and manage.

Another advantage is that I avoid any mess with config file differences when a new version arrives. I automatically pick up the new default configuration, my customized settings are loaded on top of it, and I can quickly address any incompatibility caused by the migration without touching the original conf file.

Let's go for it:

# Add settings for extensions here
include_dir = '/var/lib/pgsql/conf.d'   # include files ending in '.conf' from
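
Inside that directory, each versioned file holds only the settings I actually override. A minimal sketch (the file name and values below are placeholders, not my real tuning):

# /var/lib/pgsql/conf.d/00-custom.conf (hypothetical example)
listen_addresses = '*'
shared_buffers = 2GB
log_min_duration_statement = 500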

Then you may delete your old cluster. 🙂
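
pg_upgrade already generated the script that removes the old data directory, so once everything is confirmed to be working you can simply run it:

./delete_old_cluster.sh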

Links:

  1. Extensions: https://www.postgresql.org/docs/current/contrib.html
  2. Locale: https://www.postgresql.org/docs/current/locale.html
