How not to optimize a MySQL query
Database, MySQL, Programming, SQL April 9th, 2007 - 23,381 viewsI just read a blog post discussing mysql query optimization and thought I’d put in my two cents.
The post suggests using a number of mysql specific statements (e.g. SQL_SMALL_RESULT, HIGH/LOW_PRIORITY, and INSERT DELAYED. STRAIGHT_JOIN was conspicuously missing). Unless absolutely necessary, this is usually A Bad Idea for at least two reasons. First, they are specific to MySQL which makes your database code less portable. This might or might not be a problem. Second, and perhaps more importantly, giving the SQL interpreter this sort of hint can lead to decreased performance in the future when your database or the interpreter changes. Telling the interpreter to anticipate a small result set (with SQL_SMALL_RESULT) might seem like a good idea, but could lead to problems when your table grows and the result becomes large! Basically, use these keywords with caution, and only when you really need them. And when you do use them, take special care in documenting where and why they’re in use.
The truth is there is no silver bullet that is going to make MySQL (or any dbms) run a poorly written query lightning fast. But here are some tips that the post somehow neglected to mention.
Properly index your tables
If you do a lot of lookups using a particular column of a table, or if you join on a column, that column should be indexed. Moreover, if all of the data that you are retrieving is available in the index (e.g. you’re using a multi-column index) then MySQL can avoid looking at the table altogether and execute your query using just the index.
Avoid superfluous queries
Don’t do this:
$result = query_db('select * from table1');
for each $result as $row
$array[] = query_db('select * from table2 where column = '.$row['id']);
endforeach;
Do this:
$result = query_db('select table2.* from '
.'table1, table2 where table1.id=table2.column');
Look for bottlenecks
Don’t waste time optimizing queries that aren’t bottlenecks in your application. Find the low hanging fruit and correct those problems first.
Learn SQL
This is the most important tip. SQL optimization really has to be done on a case by case basis, and you can’t do it unless you have a good understanding of the language and how you can use it to your advantage. You need to understand things like subqueries, grouping, left joins vs. right joins vs. full joins, etc. There is no free lunch.
If you’re interested in learning more, I highly recommend Stephane Faroult’s book The Art of SQL.
April 9th, 2007 at 10:12 pm
[...] If, like me, you often dabble in writing your own dynamic apps that are driven (at least in part) by MySQL, you might want to check out this little snippet: How not to optimize a MySQL query. [...]
April 10th, 2007 at 12:14 am
[...] More Info http://dev.mysql.com/doc/refman/5.0/en/optimization.html http://dev.mysql.com/doc/refman/5.0/en/explain.html http://dev.mysql.com/tech-resources/presentations/presentation-oscon2000-20000719/index.html http://jpipes.com/presentations/mysql_perf_tuning.pdf [pdf warning] http://immike.net/blog/2007/04/09/how-not-to-optimize-a-mysql-query/ [...]
April 10th, 2007 at 1:23 am
[...] MySQL queries. I couldn’t read this and let it stand because this list is really, really bad. Some guy named Mike noted this, too. So in this entry I’ll do two things: first, I’ll explain why his list is bad; [...]
April 10th, 2007 at 8:06 am
SQL is a language for dealing with relational databases. Relational databases behave like mathematical sets. If you want a fast query, make sure the relation you define is a very small subset of the cartesian products of whatever tables (sets) you are dealing with.
April 10th, 2007 at 8:07 am
Thank you very much for this. You are dead on; that ‘blog discussing mysql query optimization’ was a bunch of nonsense. If someone doesn’t know how to optimize their queries at the level you are describing tries stuff in that other blog post, they are in for some real trouble. Someone reading that other post would be dangerous with a little knowledge like that, gumming up their queries when really they need database fundamentals like proper joins and indexes.
April 10th, 2007 at 8:18 am
[...] It’s fast, too! I’m being dugg right now and the server’s not breaking a sweat. Save This [...]
April 10th, 2007 at 9:07 am
find the torrent http://www.demonoid.com/files/details/309979/1231425/
April 10th, 2007 at 9:50 am
Uhhh.. isn’t this like fundamentals of SQL 101?
April 10th, 2007 at 9:50 am
I like your example query, lol!
One thing we have been using recently at work which helps for inserting insane amounts of data into the database quickly is “LOAD DATA INFILE”. Works great for restoring homemade database backups using something like a pipeline seperated or comma seperated file (CSV). Or inserting data from an Excel file (after exporting it to csv). Much better than reading a file per line and running an INSERT command. Try it out on a test db, you’ll be amazed :)
April 10th, 2007 at 9:51 am
Crap I forgot to add the link to the command:
http://dev.mysql.com/doc/refman/4.1/en/load-data.html
April 10th, 2007 at 9:52 am
Good stuff Mike. Glad to see it got Dugg. Keep ‘em coming and I’ll be sure to get them linked up and attributed to you on our site.
April 10th, 2007 at 9:53 am
[...] are trying to do just that. One blogger named Mike that I found today takes them all to task with a few tips on the proper way to handle this sticky SQL situation. Properly index your [...]
April 10th, 2007 at 10:10 am
Thank you for the recommendation of the Art of SQL book! I picked it up and it’s exactly what I’ve been looking for to bring my SQL capabilities to the next level.
April 10th, 2007 at 10:27 am
[...] giving it a good think through. I’ll let the original post stand… but here’s How not to optimize a MySQL Query; and when his server comes back up (link got dugg and dugg hard…), 10 tips for optimizing [...]
April 10th, 2007 at 11:47 am
You wasted my bandwidth to look at that??
Come on guy.
April 10th, 2007 at 11:58 am
[...] attila the hun Response to the recent "How to optimize a MySQL query" mulan avi rapidshareread more | digg [...]
April 10th, 2007 at 1:50 pm
I don’t agree with your view. Optimization is about specializing your code for your environment and dataset. Yes, these things may change eventually, and at the point you re-optimize for your new environment.
Writing code that will work out of the box with 6 different databases is a completely pointless for most projects.
April 10th, 2007 at 11:10 pm
mike,
your response to my article is lame on so many levels.
“They are specific to MySQL which makes your database code less portable.”
The article was about optimizing MYSQL QUERIES not SQL QUERIES. The more you generalize something, the less it can possibly be optimized. Some of the best methods of optimization come from specialized cases.
and the rest?
Properly index your tables
Avoid superfluous queries
Look for bottlenecks
Learn SQL
These are fluff topics filled with virtually no content.
and judging from the amazon book (with affiliate ID) that you placed at the bottom of your article, you got your article on digg (and on the front page) to make money.
April 10th, 2007 at 11:28 pm
Justin:
Re: MySQL optimization - SQL is a standard. It’s worth sticking to the standard as much as you can. The rest (indexing tables, looking for bottlenecks, et. al.) were very general suggestions (as I said in my post). My point was that you need to actually understand how SQL works in order to optimize any non-trivial query.
Re: the amazon book - look closer. There is no affiliate ID in that link. I linked to Faroult’s book because I’ve read it and it was very good. I’m not making any money off of this site (do you see any ads). Please avoid the ad hominem attacks. Thanks.
Oh, and by the way, I didn’t digg this post until it already had a handful of diggs from other users (who apparently clicked through from your blog post). So if you’re pissed that this made the front page of digg you can blame your readers, not me.
April 11th, 2007 at 1:27 am
[...] I got dugg earlier today, neat. I woke up at around 9am (EST) and saw that my post on how not to optimize a MySQL query had just made the front page of Digg. Cool. I ran tail -f on my apache logs and saw lots of [...]
April 11th, 2007 at 5:44 am
It isn’t cool using where to relate two or more tables.
Instead it’s better to use INNER JOIN / ON statements.
April 11th, 2007 at 8:27 am
If you really want to do advanced query optimization you should not only learn sql.. but also the relational algebra that is behind it.
The DBMS translates the sql to this algebra and does some query optimization steps itself.
Also you should understand physical data organization on the hard disc (tracks , sectrs, cylinders, page fetches, RIAD).
Together with different index techniques like ( ISAM, B+ Tree, and hash indexing).
Only then you can do meaningfull analizing of the query plan and come op with the most optimal query and indices.
Note that this is only neccesary for really heavy duty databases. In most web based databases it is sufficient to choose some good indices.
April 11th, 2007 at 8:36 am
I don’t agree with your view. Optimization is about specializing your code for your environment and dataset. Yes, these things may change eventually, and at the point you re-optimize for your new environment.
Writing code that will work out of the box with 6 different databases is a completely pointless for most projects.
April 12th, 2007 at 7:45 am
[...] Read the comment too, How not to optimize a MySQL query [...]
April 12th, 2007 at 7:59 am
“Re: MySQL optimization - SQL is a standard. It’s worth sticking to the standard as much as you can. The rest (indexing tables, looking for bottlenecks, et. al.) were very general suggestions (as I said in my post). My point was that you need to actually understand how SQL works in order to optimize any non-trivial query”
SQL may be a standard, but if you are using mysql (and most likely if you are, you will not be using microsoft SQL, oracle, or postgres), you will want to use optimizations that will make your project or app faster, even if it is specific to mysql.
I think my favorite of all of your points is: “look for bottlenecks”
Okay Mike, tell me some specific bottlenecks that I should be looking out for in my mysql server setup.
“low hanging fruit” isn’t really a good enough answer.
I would have been fine with your article if it hadn’t directly attacked mine.
April 26th, 2007 at 1:29 pm
I like how everyone always says to properly index your tables but never gives any guidelines.
April 29th, 2007 at 2:58 am
Peace people
We love you
July 25th, 2007 at 10:30 am
I have find this web site what do you think about the products on this website are they good to help me in my Internet Marketing? There are several products like services and tools for Search Engine Submission, Link Building, Affiliate Marketing, article submission etc.
July 30th, 2007 at 1:48 pm
Thank you very much for your helpfull desc.
August 7th, 2007 at 3:23 am
The less-portable argument is useless in 90% of the projects. I mean please! Who runs MySQL and all of the sudden decides to go Postgres, or the other way around?
If people use MySQL, they should optimize for it. And optimization is not a one-time-thing, but starts again if you update your software, etc..
October 22nd, 2007 at 6:41 pm
@justin:
about “low hanging fruit” - use “log slow queries” capability to find the bottlenecks of your mysql server setup (look it up in google if you don’t know how to use it ;).
Otherwise the posts are totally unrelated. Justin’s tips are valid suggestions for a capable MySQL user that ran out of other solutions and wants to get the most out of their system. Of course, such users should know _a lot_ about MySQL anyway.
This blog entry however takes care of the other 99% of the users.
About portability: I am usng MySQL most of the time, but I try to use only standard SQL queries. Why? Because from time to time I get to work with Oracle, MS SQL, PostgreSQL, SQLite,… and I can’t be bothered to learn their specifics. Most of the time it pays off to just identify and optimize the bottlenecks, which means specific MySQL features are not needed. Better spend your time optimizing the algorithm.
March 20th, 2008 at 5:42 pm
[...] (or rather avoiding use of) SQL_SMALL_RESULT, HIGH/LOW_PRIORITY, and INSERT DELAYED. STRAIGHT_JOIN.read more | digg [...]
August 25th, 2008 at 5:39 am
You can also try to cut down the size of the data row, by using the smallest datatype possible. Beside that, use prepared statement & cache query will also boost the performance.