[Rails][MySQL] 使用 update_all + JOIN + LIMIT 需注意效能問題
- Backend
- 18 Feb, 2021
MySQL 使用 JOIN + update_all 時,若遇到 LIMIT 則會轉為 subquery,需注意效能問題。
Rails 在使用 update_all 時,如果有用到 JOIN,會改用 subquery 的形式改寫,但若 Adapter 為 MySQL 時就會維持,因為 MySQL 語法本身支援 JOIN + UPDATE。但若是包含 LIMIT 時,由於 MySQL 也不支援,所以 Rails 會使用預設的 subquery 改寫。而在 MySQL 使用 update_all + subquery 時則可能遇到效能問題,可參考之前的文章:
- 預設用 subquery 改寫: https://github.com/rails/rails/blob/v6.1.2.1/activerecord/lib/arel/visitors/to_sql.rb#L796-L811
# The default strategy for an UPDATE with joins is to use a subquery. This doesn't work
# on MySQL (even when aliasing the tables), but MySQL allows using JOIN directly in
# an UPDATE statement, so in the MySQL visitor we redefine this to do that.
def prepare_update_statement(o)
if o.key && (has_limit_or_offset_or_orders?(o) || has_join_sources?(o))
stmt = o.clone
stmt.limit = nil
stmt.offset = nil
stmt.orders = []
stmt.wheres = [Nodes::In.new(o.key, [build_subselect(o.key, o)])]
stmt.relation = o.relation.left if has_join_sources?(o)
stmt
else
o
end
end
- Adapter 為 MySQL 時,若遇到 LIMIT 等語法則維持用 subquery 改寫(super class 預設行為),否則可以直接使用 JOIN + UPDATE:
https://github.com/rails/rails/blob/v6.1.2.1/activerecord/lib/arel/visitors/mysql.rb#L61-L70
# In the simple case, MySQL allows us to place JOINs directly into the UPDATE
# query. However, this does not allow for LIMIT, OFFSET and ORDER. To support
# these, we must use a subquery.
def prepare_update_statement(o)
if o.offset || has_join_sources?(o) && has_limit_or_offset_or_orders?(o)
super
else
o
end
end
- MySQL 官方文件,multiple-table 的 UPDATE 不支援 ORDER BY 跟 LIMIT:
https://dev.mysql.com/doc/refman/8.0/en/update.html
For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. Each matching row is updated once, even if it matches the conditions multiple times. For multiple-table syntax, ORDER BY and LIMIT cannot be used.
做做實驗
Rails 6.1.2.1、MySQL 5.7
users 有 100 筆、posts 有 100000 筆
可以看到第二個 update_all,因為加了 JOIN,變成用 subquery,導致掃描了整張 posts 表,速度變很慢