MySQL Performance: Which query will take more time?

mysql sql database-performance query-performance sqlperformance

44 views


21 reputation

I have two tables:

1. user table with around 10 million data; columns: token_type, cust_id (Primary)
2. pm_tmp table with 200k data; columns: id (Primary | AutoIncrement), user_id

user_id is a foreign key referencing cust_id.

1st Approach/Query:

update user set token_type='PRIME'
where cust_id in (select user_id from pm_tmp where id between 1 AND 60000);

2nd Approach/Query: Here we would run the query below individually for each cust_id, for 60000 records:

update user set token_type='PRIME' where cust_id='1111110';
Author: Prashant Mudgal | Posted: 27 Dec 2017

3 Answers


539 reputation

Theoretically, the first query will take less time, as it involves fewer commits and in turn fewer index rebuilds. But I would recommend going with the second option, since it is more controlled, will appear to take less time, and you can even think about executing two separate sets in parallel.

Note: The first query will need sufficient memory provisioned for the MySQL buffers to execute quickly. The second approach is a set of independent single-transaction queries; they will need comparatively less memory and hence will appear faster when executed in limited-memory environments.

Well, you may rewrite the first query this way too:

update user u, pm_tmp p set u.token_type='PRIME'
where u.cust_id = p.user_id and p.id between 1 and 60000;

Author: Nans | Posted: 27.12.2017 05:18


864951 reputation

Some versions of MySQL have trouble optimizing IN (SELECT ...). I would recommend:

update user u join
       pm_tmp pt
       on u.cust_id = pt.user_id and pt.id between 1 and 60000
    set u.token_type = 'PRIME' ;

(Note: This assumes that user_id is not repeated in pm_tmp. If that is possible, you will want a SELECT DISTINCT subquery.)
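A sketch of that deduplicated variant, assuming the question's schema (the derived-table alias pt is illustrative):

```sql
-- Sketch: deduplicate user_id in a derived table first, so each
-- matching user row is joined exactly once.
UPDATE user u
JOIN (SELECT DISTINCT user_id
      FROM pm_tmp
      WHERE id BETWEEN 1 AND 60000) pt
  ON u.cust_id = pt.user_id
SET u.token_type = 'PRIME';
```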

Your second version would normally be considerably slower, because it requires executing thousands of queries instead of one. One consideration might be the update. Perhaps the logging and locking get more complicated as the number of updates increases. I don't actually know enough about MySQL internals to know if this would have a significant impact on performance.

Author: Gordon Linoff | Posted: 27.12.2017 05:35


80233 reputation

IN ( SELECT ... ) is poorly optimized. (I can't provide specifics because both UPDATE and IN have been better optimized in some recent version(s) of MySQL.) Suffice it to say "avoid IN ( SELECT ... )".

Your first sentence should say "rows" instead of "columns".

Back to the rest of the question. 60K is too big of a chunk. I recommend only 1000. Aside from that, Gordon's Answer is probably the best.

But... You did not use OFFSET; Do not be tempted to use it; it will kill performance as you go farther and farther into the table.

Another thing. COMMIT after each chunk. Else you build up a huge undo log; this adds to the cost. (And is a reason why 1K is possibly faster than 60K.)
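The chunk-and-commit advice could look like this in practice (a sketch; the 1000-id step is an assumption, and a real run would generate the ranges from a loop in application code or a shell script):

```sql
-- Sketch: update in 1000-id chunks of pm_tmp, committing after each
-- chunk so the undo log stays small.
UPDATE user u
JOIN pm_tmp pt ON u.cust_id = pt.user_id
SET u.token_type = 'PRIME'
WHERE pt.id BETWEEN 1 AND 1000;
COMMIT;

UPDATE user u
JOIN pm_tmp pt ON u.cust_id = pt.user_id
SET u.token_type = 'PRIME'
WHERE pt.id BETWEEN 1001 AND 2000;
COMMIT;

-- ...and so on in steps of 1000 up to 60000.
```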

But wait! Why are you updating a huge table? That is usually a sign of bad schema design. Please explain the data flow.

Perhaps you have computed which items to flag as 'prime'? Well, you could keep that list around and do JOINs in the SELECTs to discover prime-ness when reading. This completely eliminates the UPDATE in question. Sure, the JOIN costs something, but not much.
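A sketch of deriving prime-ness at read time instead of persisting it, assuming the question's schema (the column list is illustrative):

```sql
-- Sketch: a LEFT JOIN discovers prime-ness when reading, so the
-- bulk UPDATE is never needed.
SELECT u.cust_id,
       CASE WHEN pt.user_id IS NOT NULL
            THEN 'PRIME'
            ELSE u.token_type
       END AS token_type
FROM user u
LEFT JOIN pm_tmp pt ON pt.user_id = u.cust_id;
```

If user_id can repeat in pm_tmp, the join should go through a deduplicated subquery to avoid fanning out rows.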

Author: Rick James | Posted: 29.12.2017 10:55