Let me start by reminding everyone that most of my database experience is with Oracle, so I’m absolutely 100% biased
However, that doesn’t make me wrong! These are the things which have caused me the most grief as a PostgreSQL DBA.
1. Query progress
I do a lot of SQL tuning, and one thing I really want to see is some kind of progress indicator of a query and the various operations within a query (Oracle exposes some of this data via its v$longops view) When I get an EXPLAIN of a query I can see what operations there are and know from the join types which ones are run repeatedly or once only. However, when running a query that never completes due to a bad plan I need to know where in that plan it has gone wrong. I can never tell if I have to kill the query. I can’t even trace the session to see what is happening. An EXPLAIN ANALYZE shows some great stats like the number of times an operation was run and how many rows it returns. An incrementally updated version of this that I can query from somewhere would be fantastic.
2. Index Only Scans
Because of the way MVCC is implemented (explained previously here), there is no such thing as an index only scan. When a query retrieves a value based on an index lookup it needs to then go to the table to verify that this data can be used even when it matches the query criteria, based on the min/max transaction identifiers. This severely reduces the number of optimisations available for the query planner to use when executing queries. One reason I’ve seen for not including this information in the indexes was to do with trying not to increase the size of the indexes unnecessarily. I haven’t looked at exact numbers, but I feel this is something well worth the tradeoff. The performance improvements it would enable for a large number of queries should be substantial. A simple example: select count(1) from table. Currently this performs a sequential scan on the table, but if there were a suitable index it could do a full scan of that instead which would be much smaller and faster.
3. Session Wait Data
Before Oracle introduced wait events back in v8.0, also known as the dark ages of database tuning, we were forced to rely on statistic ratios and external tools to figure out what the database was doing. PostgreSQL has yet to rise from this dark ages period. I don’t claim that wait events as Oracle has defined them is the only way forward, but we need something more than just table level cumulative statistic totals. When I run a query, I want to know how much time I spent doing disk I/O vs data manipulation vs lock contention etc. When I’m tuning a query, does join order A->B perform fewer page reads than B->A etc.
4. Session Tracing
Being able to investigate exactly what a specific session is doing is essential to a lot of DBA debugging. The parameter log_min_duration_statement can be set to log every SQL statement that is executed by every user in the entire database cluster, but this can then become painful to parse and cause the log file to grow very big. It also provides no insight into what factors are influencing each queries runtime – such as slow disk accesses, waiting for the database writer to finish flushing dirty pages, etc. I get very frustrated at waiting for a slow query to finish and having no quantitative data as to why it’s taking so long.
5. Multiple Block I/O
Disks suck. They are slow and the bottleneck in 99% of database operations. Servers these days have large amounts of RAM to cache data from disk and can utilise large read aheads to pre-cache data which processes may use later. PostgreSQL relies largely on this OS level cache of data on disk. All disk I/O goes through the OS and reads a single block at a time. With multi-block I/O for example, it may take 10ms to read a single block and 20ms to read 64 blocks. When scanning an entire table I need to read every block anyway so multi-block I/O would benefit it greatly. PostgreSQL does not support this and due to the redesign effort it would require to implement, probably never will.
This takes away one very powerful optimisation feature I have used in Oracle many times for large batch processing where it is often much faster to read an entire table via multi-block reads instead of accessing specific rows of interest via index lookups even when there may only be 5% of the rows required.
6. Tablespace Quotas
I want to be able to limit how big certain tables can grow without them filling up a file system and crashing the database. Especially the temporary files as I have discussed before. Being able to specify at the database level how much space can be used by each tablespace would be very useful. Multiple tablespaces can share a single file system without fear of stealing each other’s disk space. The only way to limit disk space for a single tablespace at the moment is to give it a dedicated file system of a specific size. Not something always easy to do in a large organisation with dozens of servers and separate support teams for storage.
7. No Logging
Every single change to data within a PostgreSQL database needs to also be written to the WAL logs for recoverability purposes. If the database crashes while in an inconsistent state, it needs to use the logs to recover the data. However, not every bit of data is born equal – some are critical while some are worthless. Version 9.0 introduced temporary tables which are unlogged which is a great first step. Other common use cases include index creation and large data loads. If a crash causes an index to become unrecoverable we can fairly easily rebuild it in most cases, so the trade off might be worth it. A large data load might be a once off activity followed by a full backup anyway, so crash recovery for the duration of the load is not required.
8. Better Export/Import Tools
Don’t get me wrong, the pg_dump and pg_restore tools are indispensible when working with PostgreSQL. However, I have lost count of the times where I’ve needed to do something and checked out the options to those tools to be disappointed that they can’t do what I need. Some examples:
• Restore table data without creating the indexes (sure, indexes required to enforce constraints might be acceptable)
• Dump all functions in a schema
• Restore all objects into a specific tablespace
• An easy way to monitor the dump/restore progress
• Parallelism! 8.4 introduced parallel restores but only by doing one table per process which doesn’t help me with my large data warehouses that have one or two huge tables.
9. Query Hints
Yes I went there. I love using Oracle hints and SQL profiles (which rely on hint functionality). They make my job so much easier. Reasons for not implementing them in PostgreSQL that I have seen:
• “Write the queries properly in the first place and the planner will figure it out”. Bullshit! There are plenty of things the planner has no idea about, or the queries have too many tables for it to figure out in a reasonable timeframe.
• “If they exist, developers/DBAs will use them wrongly”. Probably, but let me make that choice!
• “They would rather spend time improving the optimiser in future versions“. OK nice idealistic thinking. Meanwhile on planet Earth, I’m trying to convince management that they need to redesign their app that hasn’t been touched in years because of some gradual growth in the data that screwed up PostgreSQL into thinking it should run the queries in a sub-optimal way.
The planner/statistics collection will NEVER be perfect. Basically I view the problem as this “I know more about the data and how it’s used than the database ever could, so allow me to suggest a better way of achieving this query”.
10. More Backup/Restore Options
I think the majority of PostgreSQL databases are pretty small, which is why the backup techniques are not very sophisticated yet. However, over time they will get bigger and we need better backup technologies in place. The largest one I have to deal with is about 500GB which isn’t really considered a large database these days, but still gives me plenty of headaches when trying to perform maintenance or database copies/backups. A true incremental backup solution would be beneficial, rather than having to copy all WAL logs since the last base backup. While this works fine and can minimise disk requirements to store a backup, the time to restore can be potentially huge. Imagine recovering through a week’s worth of WAL logs, several hundred GB in size.
I also find it frustrating that I cannot backup/restore a single database from the cluster using OS file copy techniques (I have to rely on a slow pg_dump/pg_restore). I guess that’s just something I have to get used to and try to design the clusters with this in mind, but that doesn’t help me when I’ve inherited an existing system and these things tend to grow organically over time.
Other notable mentions:
• Better fragmentation management. Maybe this has improved in 9.0, but I have had to deal with a lot of fragmentation issues in my 8.3 databases. I can’t remember the last time I had to worry about it in my Oracle databases.
• More parallelism support – Create those indexes in parallel.
• Better control over the cache, e.g. I need to perform some operation on a large table but don’t want to trash the existing cache and affect everyone else in the process.
It shouldn’t be surprising that Oracle has most of these features. Otherwise, I wouldn’t notice that they were missing in PostgreSQL to complain about it
Nathan,
No argument that most of those things would be very nice to have. I especially lust after query tracing. Are you up for a little hacking?
You’ll be glad to know, at least:
re: 7 PostgreSQL 9.1 will be getting Unlogged Tables
re: 8 PostgreSQL 9.1 will be getting attached tables via SQL/MED.
That’s the nice thing about Postgres … wait a few months, it’ll have more features.
I can’t stand editing my own code, let alone something written by others
Thanks for the heads up!
pg_upgrade allows you to upgrade from 8.3 or later without reloading data. It takes just seconds, even if there is a TB of data.
This is somewhat non-sense. PostgreSQL by no means have future if becoming another Oracle copy.
Work was done on a progress indicator in 2010. There is also enough use case that someone might pick it up in the near future.
I could not disagree more about the import/export tools, PostgreSQL’s pg_dump and pg_restore are actually usable compared to the junk that are Oracle’s tools. Just a few examples off the top of my head:
- imp cannot import a CLOB or BLOB column in the default tablespace of the user doing the import, it insists on having the tablespace of the CLOB/BLOB column in the target database
- exp in 11g is unable to export tables without rows, unless you set a relatively obscure database parameter (DEFERRED_SEGMENT_CREATION = false)
- exp is deprecated in 11g, instead we get expdp and impdp which are more difficult to use and cannot be used between different point versions of the database
There’s so many issues with Oracle’s import/export tools that I’ve started to use the dbms_metadata package to export the DDL and DbUnit to export the data in XML format.
I gladly tossed exp/imp aside in favour of expdp/impdp, I think they’re awesome. I’m also pretty certain you can expdp/impdp between versions. You can even do it between operating systems.
There’s a parameter for expdp/impdp
VERSION Version of objects to export where valid keywords are:
(COMPATIBLE), LATEST, or any valid database version.
If you need to export for a 10gR1, VERSION=10.1.0.2 will work right when exporting from a 11gR2.
I agree they are more complicated than exp/imp, I spent some months using the old tools but at the first non-trivial need I had, i loved the novelty
Sebastiano, normal “exp” is de-supported in 11g. If you are talking about expdp, the parameter CONTENT=METADATA_ONLY will export the metadata, without the rows.
My #1 missing feature is analytic functions. As a developer (not a DBA), I use them all the time.
It has a few. Some of the ones I use: http://www.postgresql.org/docs/current/static/tutorial-window.html
@Taylor: Analytical functions are called “windowing functions” in the SQL standard and PostgreSQL supports them since 8.4 The number of functions is more or less on par with Oracle, but the window definition (“partition by…”) is actually more advanced than in Oracle.
Index organized tables. Retrieving the row by primary key in a single operation should be faster than having to do separate index and heap lookups. Always having the table in physical order for efficient range scans can be very nice for certain workloads as well.
you mean “cluster” ?
http://www.postgresql.org/docs/9.0/static/sql-cluster.html
I’m no PostgreSQL expert, but IOTs are not heap tables sorted by index, they _are_ the index.
Consider “select a, b, c from x where a between 1 and 10″ where “a” is PK and x is IOT. In this scenario, blocks retrieved by index scan are the data and there is no need for additional table access to get it. If x wasn’t IOT, the scan would give us a list of rows to read, most probably lying in different blocks.
Also, reading ‘cluster’ command reference I see that clustering is not automatic for subsequent updates, blocks table access, and requires twice as much space as data size. Updatable tables holding a considerable amount of data would be a problem with this.
You say:
> The parameter log_min_duration_statement can be set to log every SQL
> statement that is executed by every user in the entire database cluster
In fact, this parameter, like many other session attributes, *can* be changed selectively:
* per-user: ALTER USER foo SET log_min_duration_statement=X
* per-database: ALTER DATABASE foo SET log_min_duration_statement=X
* Superusers can also change it per-session SET log_min_duration_statement=X
Thanks for pointing that out, but it doesn’t help me trace what is happening from a connection pooled app server. I can’t set the per-session parameter from another session can I?
Nathan,
if I see tablespace management come into postgres like it exists in ora-gel I will probably vomit.
honestly if there is one part of oracle that even non-DBA folks can clearly see is troublesome legacy junk left in oracle it is tablespace management.
“Multiple Block I/O” doesn’t really work as you’re hoping for given PostgreSQL’s disk access model. We’ve benchmarked it multiple times. The part that does help, helping the OS schedule reads, is implemented for bitmap table scans–the only case where there’s been a clear improvement measured. On sequential ones, OS read-ahead already does such a good job on so much I/O that trying to get the database involved doesn’t show any consistent benefit. This really isn’t a surprise on Linux in particular; the kernel has been playing around with improvements specifically targeting this workload for years, i.e. http://lkml.org/lkml/2006/5/24/71
The suggestion that the database needs to provide better caching control fall into a similar bucket. There is already some logic for keeping sequential scans and vacuums from using all of the cache. But even those fall down because you can still trash the operating system cache. And there’s little PostgreSQL can do about that without taking over all I/O, which is a questionable design direction. The assumption that the kernel developers are working on those low-level problems is a reasonable one, and it allows the database to advance faster than if it tried to address all those issues itself.
Would would work better in the place of these two is promoting the fragmentation management items from a footnote to the main list; it’s at #1 on my own prioritization list. Queries that get lost navigating bloated indexes for example is a large component to the problem you’d like to diagnose with “Query progress”, particularly on larger databases. And that wouldn’t actually help you all the time; bloat is often invisible to the query execution code, due to how query nodes pull data from one another on demand.
Greg, fact is that Oracle does a full table scan much faster than Postres, even old versions like Oracle 9.2.0.8. Postgres performance group is full of questions like that. Here is the result that’s hard to explain, especially having in mind that the machine that runs Oracle is much weaker than the machine running Postgres:
Oracle 9i:
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 – Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 – Production
SQL> set timing on
SQL> select count(*) from moreover_sm_documents
2 where created_at>=’01-JUL-11′ and
3 created_at \timing
Timing is on.
news=> select count(*) from moreover_sm_documents_y2011m07;
count
———-
36986831
(1 row)
Time: 295965.113 ms
news=>
(almost 5 minutes)
This is 64 bit machine with 16GB RAM, running RH 5.5
Records are exactly the same, as Postgres is copied from Oracle. The hardware advantage weighs heavily on the Postgres side, yet Oracle does a full table scan 4 times faster. There is nothing to add. Please note that I can make things even faster with Oracle by using several processes to scan simultaneously:
SQL> select /*+ parallel(sm,4) */ count(*)
2 from moreover_sm_documents sm
3 where created_at>=’01-JUL-11′ and
4 created_at< ‘01-AUG-11′;
COUNT(*)
———-
36986831
Elapsed: 00:00:57.13
That is something that Postgres definitely cannot do.
To put it simply, the performance hit incurred when moving from Oracle to Postgres is simply too great and, along with silly things like hints, prevents me from using Postgres. Postgres is simply not there yet. Judging by the debate about hints, it’s not going to get there for quite some time.
For some reason, Oracle result got lost. Here it is:
SQL> set timing on
SQL> select count(*) from moreover_sm_documents
2 where created_at>=’01-JUL-11′ and
3 created_at< ‘01-AUG-11′;
COUNT(*)
———-
36986831
Elapsed: 00:01:13.75
For me the 10 most important missing features of PG are, in order :
1) Muti processing Query processor (the ability to execute some stepds of a query plan via more thant one thread)
2) Indexed views (like MS SQL Server or materialized one like Oracle)
3) Storage dataspace management (min space, max space, space growth step and locations of file et collection of fiel via storage or tablespace).
4) Semantic optimisation technics
For instance the abality to return no rows at all, without scanning the table if a query ask for some values wich cannot appear in the table due to CHECK constraint (price >=0). In a same way, simplifying joins where FK are set.
5) DDL triggers to cath event like CONNECT, CREATE, ALTER, DROP…
6) a command to switch context from one database to one another (like USE in MS SQL Server)
7) a change data capture system to track the evolution of data in tables (INSERT, UPDATE, DELETE…)
9) auditing indexes : some internal views to know which index are not used and the ones that are missing
10) a synchro mirroring high avialability per database (like MS SQL Server database mirroring)
i think that couchdb style p2p http replication at the document level with sequence indexes would be awesome for #opendata everywhere
Bitmap indexes. I know that it isn’t straightforward due to MVCC and other architectural reasons but my gut tells me that it is still worth it for DW type situations.
I am perplexed by unwillingness to implement any meaningful system of hints which would enable a human to override the optimizer decision. The debate that I had about this convinced me that PostgreSQL is a project which doesn’t have much of a future.
I disagree with the statement about query hints. The one case where I ran into planner optimization issues, I wrote the performance list, got high quality workarounds within hours, and the problem was fixed in the next major release.
No, the planner will never be perfect, and I think there are a few areas (where query plans are to be cached) where hints should be allowed but that case is only because in those cases the planner fundamentally lacks sufficient information to provide the best general plan for successive calls.
In other words, I would suggest that the discussion needs to be whether the planner has sufficient information to make good planner choices, and where it does not, query plans may need to be supported. Such a minimalistic system would, I think, provide the best of both worlds, and there is precedent set in the few hints that can be passed to the create function statement.
I would love to see PostgreSQL have some kind of global query plan cache to reduce parsing/planning overhead.