The only thing you really need to understand about databases is that they store ...

pbh · on June 13, 2010

I want to be careful in how I phrase this, because I feel like this was a good effort, but I disagree with many parts of this description.

First, databases are not "just" about persistence, and I don't mean that in a pedantic way. The point of a database management system is to manage your data over time and allow you to make use of it. This includes things like stating constraints to ensure that the data in the database is always correct and providing methods for querying your data at varying levels of complexity.
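As a rough illustration of "constraints plus querying", here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the sample rows are made up for the example; the point is only that the database itself refuses bad data and answers questions about the good data.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('alice', 100)")
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('bob', 40)")

# The CHECK constraint is enforced by the database, not by application code.
try:
    conn.execute("INSERT INTO accounts (owner, balance) VALUES ('carol', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Queries at two levels of complexity: a filter and an aggregate.
rich_owners = [
    row[0]
    for row in conn.execute(
        "SELECT owner FROM accounts WHERE balance > 50 ORDER BY owner"
    )
]
total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
```

The negative balance never reaches the table, so the aggregate only ever sees rows that satisfy the stated constraint.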

Second, there are a number of in-memory databases. Oracle sells one. SQLite is often used as one. Being in-memory does not make them any less of a database.
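To make that concrete, here is a small sketch (again with Python's sqlite3 module, and a made-up events table) of SQLite running purely in memory. Nothing is written to disk, yet it still provides a schema, transactions, and queries.

```python
import sqlite3

# ":memory:" asks SQLite for a database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# A committed insert stays visible for the life of the connection.
conn.execute("INSERT INTO events (name) VALUES ('signup')")
conn.commit()

# A rolled-back insert is undone, exactly as with an on-disk database.
conn.execute("INSERT INTO events (name) VALUES ('mistake')")
conn.rollback()

names = [row[0] for row in conn.execute("SELECT name FROM events ORDER BY id")]
conn.close()  # the data is gone once the connection closes
```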

Third, relational versus non-relational is not really an argument about "how the data is stored on disk" or about being "standards compliant." A database traditionally consists primarily of three things: a data model, a query language, and an implementation. A data model, like "everything is a table with column headers and rows," can be written to disk in many different ways and queried in many different ways. (It's an abstraction.) Indexing structures and data formats may be mostly the same even when the data model is different. The (current) "relational vs non-relational" debate, such as it is, is usually about whether it makes sense to change the data model and query language in order to ensure that the implementation is scalable on "cloud-like" platforms with specific data access needs. (That said, there are many different types of non-relational models and query languages designed for different purposes.)
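A toy sketch of the "data model is an abstraction" point, in plain Python: the same logical table, people(name, age), is kept in a row-oriented layout and in a column-oriented layout, and the same logical query gives the same answer on both. All names here are hypothetical, and real storage engines are of course far more involved.

```python
# Logical relation: people(name, age). The sample data is made up.
rows = [("ann", 31), ("bob", 25), ("cy", 47)]

# Physical layout 1: row-oriented, one tuple per record.
row_store = list(rows)

# Physical layout 2: column-oriented, one list per attribute.
column_store = {
    "name": [r[0] for r in rows],
    "age": [r[1] for r in rows],
}

def names_older_than_rows(store, min_age):
    """Answer 'names of people older than min_age' against the row layout."""
    return sorted(name for name, age in store if age > min_age)

def names_older_than_columns(store, min_age):
    """Answer the same logical query against the column layout."""
    return sorted(
        name for name, age in zip(store["name"], store["age"]) if age > min_age
    )

row_answer = names_older_than_rows(row_store, 30)
column_answer = names_older_than_columns(column_store, 30)
```

The layouts differ, the data model and the answer do not, which is why "how it is stored on disk" is a separate question from "relational or not".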

Fourth, there isn't one SQL. There are in fact a number of standards: SQL-86, SQL-89, SQL-92, and so on. It makes sense to learn the standard first (at least up to 92 or so), rather than learning particular implementation specifics. Most databases implement SQL-92 or above, though they vary quite a bit in additional features (such as indexing structures and functions) and in data modeling languages.
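For a sense of what "learning the standard first" looks like in practice, here is a sketch of a query that sticks to constructs I believe are in SQL-92 (explicit LEFT OUTER JOIN ... ON, GROUP BY, COUNT, column aliases with AS). It is run through Python's sqlite3 module only for convenience; the authors and books tables are invented, and widely supported but nonstandard extensions such as LIMIT are deliberately left out.

```python
import sqlite3

# Hypothetical schema and data, written with plain, portable DDL and DML.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name VARCHAR(50) NOT NULL
    );
    CREATE TABLE books (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors (id),
        title     VARCHAR(100) NOT NULL
    );
    INSERT INTO authors (id, name) VALUES (1, 'ann');
    INSERT INTO authors (id, name) VALUES (2, 'bob');
    INSERT INTO authors (id, name) VALUES (3, 'cy');
    INSERT INTO books (id, author_id, title) VALUES (1, 1, 'Book A');
    INSERT INTO books (id, author_id, title) VALUES (2, 1, 'Book B');
    INSERT INTO books (id, author_id, title) VALUES (3, 2, 'Book C');
""")

# Outer join so that authors with no books still appear, with a count of 0.
portable_query = """
    SELECT a.name, COUNT(b.id) AS book_count
    FROM authors a
    LEFT OUTER JOIN books b ON b.author_id = a.id
    GROUP BY a.name
    ORDER BY a.name
"""
counts = conn.execute(portable_query).fetchall()
```

A query written this way should need little or no change when moved to another SQL database, whereas anything relying on vendor extensions usually has to be revisited.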

F_J_H · on June 14, 2010

"The only thing you really need to understand about databases is that they store data for you, ie. they persist data."

...is like saying "the only thing you really need to know about sex is that it is for procreation." :-)

Sorry - couldn't resist. Using databases effectively is often overlooked or treated as an afterthought. In my experience, scaling problems often stem from poor database design.

I put this in a separate comment, but any of Tom Kyte's books are excellent for gaining an understanding of how to use databases effectively. Although they are Oracle focused, there is still a lot to be learned about practical database design and utilization from his books. He also has a great site called “Ask Tom” where he answers questions. I have learned a ton from simply reading questions and his responses. Again, all Oracle focused, but very valuable nonetheless.

Ask Tom Link: http://asktom.oracle.com/pls/apex/f?p=100:1:0

sesqu · on June 13, 2010

The problem with the standards is that getting copies costs actual money. If one really wants to start learning about databases, it can be very difficult to decide what to pay for. Implementation documentation, at least, is generally free.

For books, I've heard good things about http://www.amazon.com/Fundamentals-Database-Systems-Ramez-El..., though I haven't read it myself.

pbh · on June 13, 2010

I completely agree. My intent was to suggest that one should learn to the standard first rather than any particular implementation, not that one should read any of the actual standards documents directly. (Yikes!) By analogy, if you want to learn C, read K&R to learn something approximating C89, rather than picking up a book on how to code to the specific dialect of C understood by GCC 4.5.

For what it is worth, I learned from Database Systems: The Complete Book.

http://www.amazon.com/Database-Systems-Complete-Book-2nd/dp/...

DS:TCB is pretty explicit about which of the SQL it teaches is part of which standard. That said, I suspect that any general database book should do a reasonable job.