- A meetup with SEAPUG on the 12th, where I will talk about the upcoming 9.4 features.
- A BOF at Usenix LISA where I talk about the same, only to a different crowd.
Thursday, October 30, 2014
Upcoming Seattle Visit
I will be in Seattle soon on business. This will include two opportunties to do PostgreSQL community stuff:
Tuesday, October 21, 2014
Introducing Flexible Freeze
One of the things I mentioned in my series on VACUUM FREEZE was that we really needed a Postgres utility which would opportunistically freeze tables during low traffic periods. Today I'm announcing the Flexible Freeze project, our first attempt at designing such a utility.
All that's there right now is a simple Python script. However, that script is already a useful tool, installed at multiple production sites. Here's how the script works:
Of course, I have a tanker-truck full of desired improvements/expansions to this. So, pull requests welcome.
If you're more into Ruby, Wanelo has rewritten flexible freeze for Rails and incorporated it into their Postmodern tool.
All that's there right now is a simple Python script. However, that script is already a useful tool, installed at multiple production sites. Here's how the script works:
- identify your active databases and daily/weekly low traffic periods.
- create a cron job which calls flexible_freeze.py with a time limit to keep it inside your low traffic window.
- flexible_freeze.py will loop through your tables with the oldest XIDs, freezing them until it runs out of time or out of tables
Of course, I have a tanker-truck full of desired improvements/expansions to this. So, pull requests welcome.
If you're more into Ruby, Wanelo has rewritten flexible freeze for Rails and incorporated it into their Postmodern tool.
Friday, October 10, 2014
New Table Bloat Query
To accompany the New Index Bloat Query, I've written a New Table Bloat Query. This also involves the launch of the pgx_scripts project on GitHub, which will include most of the "useful scripts" I talk about here, as well as some scripts from my co-workers.
The new table bloat query is different from the check_postgres.pl version in several ways:
Results look like this:
pct_bloat is how much of the table (0 to 100) is estimated to be dead space. MB_bloat is how many megabytes of bloat are estimated to exist. Table_mb is the actual size of the table in megabytes.
The suggested criteria is to list tables which are either more than 50% bloat and bigger than 10MB, or more than 25% bloat and bigger than 1GB. However, you should calibrate this according to your own database.
The new table bloat query is different from the check_postgres.pl version in several ways:
- Rewritten to use WITH statements for better maintainability and clarity
- Conditional logic for old Postgres versions and 32-bit platforms taken out
- Index bloat removed, since we have a separate query for that
- Columns formatted to be more immediately comprehensible
- Some attributes (such as JSON and polygon fields) have no stats, so those tables can't be estimated.
- There's no good way to estimate bloat for compressed (TOAST) attributes and rows.
Results look like this:
databasename | schemaname | tablename | pct_bloat | mb_bloat | table_mb
--------------+------------+-------------------+-----------+----------+----------
members_2014 | public | current_member | 92 | 16.98 | 18.547
members_2014 | public | member_response | 87 | 17.46 | 20.000
members_2014 | public | archive_member | 84 | 35.16 | 41.734
members_2014 | public | survey | 57 | 28.59 | 50.188
pct_bloat is how much of the table (0 to 100) is estimated to be dead space. MB_bloat is how many megabytes of bloat are estimated to exist. Table_mb is the actual size of the table in megabytes.
The suggested criteria is to list tables which are either more than 50% bloat and bigger than 10MB, or more than 25% bloat and bigger than 1GB. However, you should calibrate this according to your own database.
Thursday, October 2, 2014
JSONB and 9.4: Move Slow and Break Things
If you've been paying any attention at all, you're probably wondering why 9.4 isn't out yet. The answer is that we had to change JSONB at the last minute, in a way that breaks compatibility with earlier betas.
In August, a beta-testing user reported that we had an issue with JSONB not compressing well. This was because of the binary structure of key offsets at the beginning of the JSONB value, and the affects were dramatic; in worst cases, JSONB values were 150% larger than comparable JSON values. We spent August through September revising the data structure and Heikki and Tom eventually developed one which gives better compressibility without sacrificing extraction speed.
I did a few benchmarks on the various JSONB types. We're getting a JSONB which is both faster and smaller than competing databases, so it'll be worth the wait.
However, this means that we'll be releasing an 9.4beta3 next week, whose JSONB type will be incompatible with prior betas; you'll have to dump and reload if you were using Beta 1 or Beta 2 and have JSONB data. It also means a delay in final release of 9.4.
In August, a beta-testing user reported that we had an issue with JSONB not compressing well. This was because of the binary structure of key offsets at the beginning of the JSONB value, and the affects were dramatic; in worst cases, JSONB values were 150% larger than comparable JSON values. We spent August through September revising the data structure and Heikki and Tom eventually developed one which gives better compressibility without sacrificing extraction speed.
I did a few benchmarks on the various JSONB types. We're getting a JSONB which is both faster and smaller than competing databases, so it'll be worth the wait.
However, this means that we'll be releasing an 9.4beta3 next week, whose JSONB type will be incompatible with prior betas; you'll have to dump and reload if you were using Beta 1 or Beta 2 and have JSONB data. It also means a delay in final release of 9.4.