
Good databases do self-optimize quite a bit but they select where and how they do that optimization carefully. There are two big constraints on practical dynamic optimization: the cost must be less than the benefit and it must be applied to the system incrementally.

These constraints have effects that are not always obvious. For example, architecting a database kernel so that it can be dynamically optimized slows down computational throughput in the general case! This means the benefits of dynamic optimization have to be large enough to offset the performance tax of supporting it at all. And because the performance tax is general, the optimization needs to be general across the set of runtime operations as well, i.e. you can't optimize one operation at the cost of slowing down others. Nonetheless, there are many areas, such as cache replacement algorithms, where dynamic optimization has big performance payoffs for the added complexity.
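To make the cache replacement point concrete, here is a toy sketch in Python (not how any real engine does it; `AdaptiveCache`, the `recency_bias` knob, and the 0.1 step are all made up for illustration). The cache keeps a "ghost list" of recently evicted keys; a hit on the ghost list means the eviction policy guessed wrong, so it nudges itself toward keeping recently used entries longer. Note that the bookkeeping on every access is exactly the "performance tax" described above:

```python
from collections import OrderedDict, defaultdict

class AdaptiveCache:
    """Toy self-tuning cache (illustrative only, not a real algorithm)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # key -> value, oldest first
        self.freq = defaultdict(int)   # key -> access count
        self.ghost = OrderedDict()     # recently evicted keys
        self.recency_bias = 0.5        # 0 = favor frequency, 1 = favor recency

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # tax paid on every access
            self.freq[key] += 1
            return self.entries[key]
        if key in self.ghost:
            # We evicted this key too early: adapt the policy.
            self.recency_bias = min(1.0, self.recency_bias + 0.1)
            del self.ghost[key]
        return None

    def put(self, key, value):
        if key in self.entries:
            self.entries[key] = value
            self.entries.move_to_end(key)
            return
        if len(self.entries) >= self.capacity:
            victim = self._choose_victim()
            del self.entries[victim]
            self.ghost[victim] = True
            if len(self.ghost) > self.capacity:
                self.ghost.popitem(last=False)
        self.entries[key] = value
        self.freq[key] += 1

    def _choose_victim(self):
        # Blend recency rank and frequency according to the learned bias;
        # the lowest-scoring key is evicted.
        keys = list(self.entries)
        def score(i_key):
            i, key = i_key
            recency_rank = i / max(1, len(keys) - 1)  # 0 = oldest
            return (self.recency_bias * recency_rank
                    + (1 - self.recency_bias) * self.freq[key])
        return min(enumerate(keys), key=score)[1]
```

Real policies (ARC, TinyLFU, etc.) are far more careful, but the shape is the same: a small, constant per-access cost buys feedback that improves every future eviction decision.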

Second, dynamic optimization needs to be applicable incrementally; otherwise it tends to have a "stop the world" effect on performance while it is being applied, which is bad. A good example is dynamically optimizing storage for the actual distribution of queries it sees. Dynamically adding a new secondary index imposes a huge background cost that is visible in performance, so databases don't do it. However, dynamically optimizing individual pages for queries is common, because it can be applied incrementally to pages you are already accessing anyway: a small one-time cost per page, distributed over time as pages are accessed. And it only applies to pages you actually touch; building a new index pulls a lot of cold storage through the cache. Page-level query optimization may be less effective overall than adding a secondary index, but the fact that it can be applied incrementally makes it the preferred form of dynamic optimization.
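The incremental, pay-as-you-go pattern can be sketched in a few lines of Python (a deliberately simplified model; the `Page` class and the choice of "sort for binary search" as the optimization are hypothetical, not any particular engine's design). Each page pays its one-time optimization cost only when first read, so cold pages are never touched and there is no global pause:

```python
import bisect

class Page:
    """Toy page that optimizes itself lazily on first access."""

    def __init__(self, records):
        self.records = list(records)  # (key, value) pairs, load order
        self.optimized = False

    def lookup(self, key):
        if not self.optimized:
            # One-time per-page cost, paid only for pages actually read:
            # sort once so later lookups can binary-search.
            self.records.sort(key=lambda kv: kv[0])
            self._keys = [k for k, _ in self.records]
            self.optimized = True
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self.records[i][1]
        return None
```

Contrast this with building a secondary index, which would have to visit every page (hot or cold) up front before any query benefits.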

In short, dynamic optimization has been studied and tried for many decades. Sophisticated databases do a lot of it but the mechanisms are chosen carefully to minimize adverse side effects in real world conditions.



Great answer, thank you! Going further, would this mean one could build a DB where I could send off "optimize now" queries, thereby telling the DB that I accept stop-the-world type effects for a limited time? Or even specify which effects I'm willing to accept?

My main issue is that most of the DB stuff I do as a non-DBA is so basic that it could be automated, but isn't, for reasons that weren't clear to me until your post.



