pg_stat_statements
Back in version 8.4, Itagaki Takahiro introduced a new extension called pg_stat_statements. This extension creates a dynamic system view which shows you recently executed queries and aggregates both call counts and total execution time. The idea was to give you rapid feedback on which of your queries were "hot" and consuming a lot of system resources, instead of forcing you to wait for logging and log processing to get that information the next day.
However, pg_stat_statements suffered from a major limitation: unlike log processors like pgFouine and pgBadger, pg_stat_statements could only aggregate queries with the same structure but different constants if they were called using the PostgreSQL prepared plans interface. This meant that for the majority of applications (everything except Java, really), queries were not aggregated at all, making pg_stat_statements not that useful in practice.
Peter Geoghegan decided to fix that in this release (funded by the folks at Heroku.com), using the parsed query tree to "normalize" out the constants so that identical queries could be aggregated, regardless of how they were sent to the database. Suddenly, that means I can depend a lot less on log processing in order to tell clients what their worst queries are, and I can even troubleshoot potential bad queries in a new application deployment within minutes of it going live.
We're going to be installing pg_stat_statements on most of our clients who upgrade to 9.2, and apparently Heroku is going to make it available by default.
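If you haven't set it up before, installation is a two-step affair: preload the library at server start, then create the extension in the database you want to watch. A minimal sketch (the two pg_stat_statements settings are optional; the values shown are just starting points):

# postgresql.conf -- changing shared_preload_libraries requires a restart
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1000      # number of distinct statements to keep
pg_stat_statements.track = top     # 'all' also counts nested statements

-- then, connected to the target database:
CREATE EXTENSION pg_stat_statements;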
Peter went a bit further, though, so pg_stat_statements now also collects cache hit/dirtied block counts and IO timing information. This gives us the ability to diagnose issues with bad application queries which we couldn't diagnose before, even with the delay of log processing.
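One caveat: the blk_read_time and blk_write_time columns only get populated if you also enable the new track_io_timing setting, which is off by default because timing every block read can be expensive on some platforms. Something like this, in postgresql.conf or per-database (mydb is just a placeholder name):

# postgresql.conf
track_io_timing = on

-- or, for a single database:
ALTER DATABASE mydb SET track_io_timing = on;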
So the new pg_stat_statements looks something like this:
userid              | 10
dbid                | 18018
query               | SELECT tbalance - sum(delta) FROM pgbench_tellers JOIN pgbench_history USING (tid) WHERE pgbench_tellers.tid = ? group by tid, tbalance;
calls               | 158207
total_time          | 5999651.60500004
rows                | 158084
shared_blks_hit     | 80551793
shared_blks_read    | 2
shared_blks_dirtied | 0
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 169.666
blk_write_time      | 0
Notice all the new columns. blk_read_time is particularly interesting; it shows that we're spending real time on physical IO, reading blocks that weren't already in the buffer cache, to get at the records we need. Lemme see which queries are doing a lot of that reading relative to the amount of data they need:
SELECT query, total_time,
blk_read_time / calls as read_time_query,
blk_read_time / rows as read_time_row
FROM pg_stat_statements
WHERE calls > 0 AND rows > 0
ORDER BY blk_read_time / calls DESC LIMIT 5;

Out of those results, this one is not a surprise:
query | select count(*) from pgbench_accounts;
total_time | 33765.04
read_time_query | 7934.87833333333
read_time_row | 7934.87833333333
But this one is:
query | UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;
total_time | 18529.712
read_time_query | 6.67768562644125
read_time_row | 6.67768562644125
From that, it seems like we're often updating rows which aren't in cache, and paying a heavy price for it.
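To confirm that theory, you can turn the same view around and look at cache hit rates per statement. This isn't output from the test above, just a sketch of the kind of query I'd run next:

SELECT query, calls,
    shared_blks_hit, shared_blks_read,
    round(100.0 * shared_blks_hit /
          (shared_blks_hit + shared_blks_read), 2) as hit_percent
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 0
ORDER BY shared_blks_read DESC LIMIT 5;

And once you've made a change, SELECT pg_stat_statements_reset(); clears the counters so you can measure the effect with fresh numbers.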
Hi Josh, I clicked on the tshirt link, but it took me to the pg_stat_statements man page. - Leon

The link should be http://www.zazzle.com/postgresql

Sorry, problems with Blogger and links. The link is fixed now.

Hello, I think that the I/O information may have been contributed by Ants on top of the excellent pg_stat_statements infrastructure Peter wrote. Do you know if Heroku will support PostgreSQL 9.2 any time soon?

Yes, they will.