To start with, comparing PostgreSQL and MySQL directly is a serious mistake. The real question is "what kinds of applications do you want to build?". I have had experience building complex web-based applications with both MySQL (mainly against my will) and PostgreSQL, and here are some of the conclusions I reached:
The basic (over)generalization I often hear is: use MySQL for basic dynamic websites, and very lightweight (read: non-critical) applications, but use PostgreSQL for Serious stuff. Well, this is not a bad generalization, but even for lightweight applications, PostgreSQL can really save you some time, and a lot of server-side code, if you bother to learn about views, constraints, stored procedures, triggers, etc...
Here are just the basics on what MySQL lacks, which PostgreSQL covers nicely: (I know MySQL is working on some of them, but it is rather late in the game, it seems to me...)
1. [color #0000A0]Foreign Key Constraints[/color] (yes, certain table types now have this, but there are questions as to how thoroughly it is being implemented--it is treated as an add-on, rather than the absolutely critical central feature it should be. And, MySQL still doesn't support all the ANSI-standard ON DELETE options, nor does it appear to support MATCH PARTIAL.)
2. [color #0000A0]Views.[/color] In the last MySQL application I worked on, THIS was my biggest headache (I later convinced the customer to switch to PostgreSQL). Views provide excellent ways to build one query on another and have logical independence from the physical data. A DBMS should not even be considered remotely 'relational' without views. Also, in conjunction with RULEs, PostgreSQL allows views to be updateable in any manner desired by the DBA (even multi-table views).
3. [color #0000A0]Subqueries.[/color] they seem like such a small extra feature, but there are times when a subquery can save you many lines of application code, extra looping, etc...
4. [color #0000A0]Data Integrity Constraints[/color] such as:
a. [color #0000A0]Triggers[/color] -- Methods to make sure that if X happens, we also do Y every time. Again, a critical feature if your database contains any serious business logic.
b. [color #0000A0]CHECK[/color] constraints (don't allow data in a certain table or field unless it fits the parameters YOU decide.)
c. [color #0000A0]RULES[/color] --PostgreSQL actually provides for methods to rewrite queries on the fly, based on certain conditions. This is a very useful feature, allowing you to catch certain things in ways that triggers don't. For example, a RULE can use the DO INSTEAD syntax so that the requested query doesn't happen at all, while a different query does happen. One example of it's use is to make views updateable, by INSTEAD updating the related tables: a great way to shield programmers from the internals of a data model, in large-scale applications.
5. [color #0000A0]Stored Procedures[/color] -- Is there a complex problem that you would like to have solve IN your database rather than in your application code? Business logic should reside in the database whenever possible, so that application code cannot accidentally circumvent it. For enterprise-level applications, stored procedures are critical. PostgreSQL supports stored procedures in several languages: SQL, PL/PGSQL, Perl, TCL, C, and I believe there are even plans for stored procedures in Java. UPDATE: in version 7.3, stored procedures can now return full results sets, just as a standard query. Also, the maximum number of parameters has been raised to 32.
6. [color #0000A0]Full constraints for Date column types[/color] -- I had a very difficult time with the last MySQL database I worked on, because the whole database was about time differences in events, and I found that MySQL has several bad habits with time data:
a. Wrong dates can be entered. For example, one can enter a date of 0000-00-00. This is wrong because no year has a 0 month, and no month has a 0 day. Also, it allows up to 31 days for ANY month, even February, which should only have 28 days. (when you think about it, this means every month can actually have 32 days in MySQL, because it starts at 0).
b. No truly good methods for measuring date intervals. PostgreSQL was a lifesaver here, because it actually has an "interval" datatype, that shows time intervals in human-readable format, such as "2 Years 5 Months 23 Days 18:03:23". This, combined with PostgreSQL's date calculation and comparison features, make it a great database for handling statistical, event-based data.
7. [color #0000A0]Cursors[/color] -- these allow you to isolate subsets of results from a query and do further manipulation. Very useful for reporting (see below).
8. (UPDATE) [color #0000A0]Schemas and Domains[/color] -- new in version 7.3. While these alone are worth a whole FAQ, I will just mention that they include very advanced concepts for fine-grained control over who can do what with tables, columns, and even data types. Also, they allow for users to have their own namespaces in a database. These features are not even on MySQL's road map (http://www.mysql.com/doc/en/TODO.html).
9. Everyone is wondering why I left out [color #0000A0]Transactions[/color]. I left transactions to the end, because too many people focus on transactions as the only important feature when discussing these things. No, transactions are just a basic underpinning. Transactions can't prevent you from corrupting your data, they can just prevent it from being caused by crashes or other anomalies. Yes, MySQL has transactions now, for two table types. Big deal. Transactions should be a given, not even a point of discussion.
---------------------------
These things I mention above are just the basics for a true relational database. This is before you even think about such things as scaleability. Scaleability is pointless if your data is bad, or unmanageable. RDBMS stands for "Relational DataBase Management System". Maybe MySQL should only be called a "DBS", leaving out the "Relational" and the "Management" parts.
In terms of scaleability, PostgreSQL again wins out over MySQL. MySQL may seem blazingly fast for single use access, or for mainly read-only access, with few inserts or updates, but there are many ways to optimize. PostgreSQL is optimized for serious work, where there are many concurrent updates, inserts, etc... with many users. PostgreSQL seems slower than MySQL in doing a single query, but when the number of users scales and the complexity of the queries goes up, PostgreSQL just keeps on handling it.
Here is a list of a few companies with PostgreSQL databases larger than 1 terabyte: http://archives.postgresql.org/pgsql-advocacy/2002-08/msg00005.php (there are some big names in this list )
Another scaleability advantage is [color #0000A0]Prepared queries[/color], which can pre-compile queries to simply accept certain parameters and return the results. PostgreSQL shares this feature with such enterprise DBMS's as Oracle.
Also, for those who really want to delve into the theoretical end of things, PostgreSQL has a few cool extras:
1. Specialized datatypes: there are several special datatypes aimed at IT and science/engineering, such as network address datatypes, and spatial datatypes (with special algorithms for sorting 3D data)
2. User-defined datatypes: Yes, you can create your own datatypes that behave by your own rules. How about a native XML datatype, for example?
3. User-defined operators: as a complement to user-defined types above, you can define your own special operators, to perform more specialized evaluations and manipulations than the standard operators (LIKE, >, <, BETWEEN, etc...).
4. Object-relational inheritance: This is not to suggest that PostgreSQL is trying to become an "O-O" (non-relational) database, but that it does allow tables to be created which inherit base attributes from parent tables, sharing the inherited columns back "downward" to the parent. This allows for some interesting database design methods.
Another important area of concern is [color #0000A0]reporting[/color]. I have found out through some more painful experience that if you want to do real reporting, a real DBMS helps. PostgreSQL supports JDBC and ODBC, which are the two main interfaces required by almost all reporting tools, which leads to:
1. Most serious reporting tools need to know foreign key relationships, in order to assist the user in building decent reports.
2. Views, views, views: A few views, a nested subquery or two, and suddenly a unmanageable data becomes crystal clear. The greatest thing is you get to build views on top of other views, rather than create Ye Massive Multiple Join Query of Death, as with MySQL.
3. Stored procedures, or functions can act like customized "widgets", where you can just pass values or recordsets as arguments, and have complex operations performed. UPDATE With Version 7.3 this is even more useful, because procedures can now return full recordsets, behaving as if they are tables.
4. Cursors are great for handling paged result sets, among other things.
---------------- Any good points to MySQL? --------
Yes, let's be fair:
1. MySQL is easier to deploy in certain situations, however even that has been changing lately with PostgreSQL.
2. It is a useful programmer's tool. For certain situations where your applications simply needs a place to store and manage some lists or arrays, this is not a bad choice. In fact, the in-memory HEAP table type by MySQL is essentially an SQL interface to shared memory segments, which is kind of nice.
3. MySQL runs natively on Windows, giving many developers an easy workstation to develop their apps before deploying to a Unix or Windows server. PostgreSQL is still primarily a Unix DBMS, although it can be installed on Windows with the Cygwin kit. (there is a native Windows port under development. UPDATE: See http://archives.postgresql.org/pgsql-hackers/2002-11/msg00375.php and download the beta here: ftp://209.61.187.152/postgres/)
4. MySQL supports an easier user-management system, with more fine-grained approaches to these privileges, so user access can be defined on databases, tables, and even columns. UPDATE: since version 7.3, PostgrSQL now supports SQL schemas and domains, as mentioned above. This is an enterprise DBMS feature not found in MySQL, and definitely provides for a more complete user management system.
5. MySQL has an official replication and hot backup system. PostgreSQL has several projects which are at various levels of completion, but not an official release, such as MySQL's. There are serious data integrity considerations in data replication which I have not explored here. The PostgreSQL team is working on an enterprise-level replication package which I believe addresses these issues. (http://gborg.postgresql.org/project/pgreplication/projdisplay.php) UPDATE: in addition to the great progress on PgReplicaton, there is now an official commercial high-availability system for PostgreSQL: http://www.pgsql.com/press/PR_Lanux_PgSQL.html
6. MySQL has some speed enhancements for read-intensive databases such as dynamic websites, which promise to make certain types of websites perform very well.
7. MySQL's development is moving at a fast pace, and they claim they intend to support all standard ANSI SQL features, eventually. Let's wish them the best. Meanwhile, just read [color #0000A0]comp.databases.postgresql.hackers[/color] and you will see that PostgreSQL is definitely keeping good time.
------------------------
Search www.phpbuilder.com for some very interesting articles about PostgreSQL, from the main developer of SourceForge, who first tried to do it with MySQL, but was very surprised to find that PostgreSQL handled the job much better.
And if you are concerned about support, PostgreSQL has every bit as much support as MySQL. There are newsgroups, mailing lists, forums, and even paid support levels, (see http://www.pgsql.com/support).
Also, for a serious website about understanding exactly what a relational database management system is, and why this is important, visit www.dbdebunk.com. The contributors are a couple of the most respected authors in database design and theory. C.J.Date is the senior contributor, and he also happens to be one of THE original researchers that helped develop the relational database concept in the 70s)
This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
By continuing to use this site, you are consenting to our use of cookies.