indexing - understanding perf of mysql query using explain extended -
i trying understand performance of sql query using mysql. indexes on pk, query failed complete in on 10mins. have added indexes on columns used in clauses (timestamp, hostname, path, type) , query completes in approx 50seconds -- still seems long time not seem overly complex query.
so, i'd understand query causing this. assumption inner subquery in someway causing explosion in number of comparisons necessary.
there 2 tables involved:
storage (~5,000 rows / 4.6mb ) , machines (12 rows, <4k)
the query follows:
select t.hostname, t.path, t.used_pct, t.used_gb, t.avail_gb, t.timestamp, machines.type type storage t join machines on t.hostname = machines.hostname timestamp = ( select max(timestamp) storage st st.hostname = t.hostname , st.path = t.path) , (machines.type = 'nfs') order used_pct desc an explain extended query returns following:
id select_type table type possible_keys key key_len ref rows filtered 1 primary machines ref hostname,type type 768 const 1 100.00 using where; using temporary; using filesort 1 primary t ref fk_hostname fk_hostname 768 monitoring.machines.hostname 4535 100.00 using 2 dependent subquery st ref fk_hostname,path path 1002 monitoring.t.path 648 100.00 using noticing 'extra' column row 1 includes 'using filesort' , question: mysql explain query understanding states "using filesort sorting algorithm mysql isn't able use index sorting , therefore can't complete sort in memory."
what nature of query causing slow performance?
why necessary mysql use 'filesort' query?
indexes don't populated, there create them. that's why inserts , updates become slower more indexes have on table.
your query runs fast after first time because whole result of query put cache. see how fast query without using cache can
select sql_no_cache t.hostname ... mysql uses filesort order by or in case determine maximum value timestamp. instead of going through possible values , memorizing value greatest, mysql sorts values descending , picks first one.
so, why query slow? 2 things jumped eye.
1) subquery
timestamp = ( select max(timestamp) storage st st.hostname = t.hostname , st.path = t.path) gets evaluated every (hostname, path). have try index on timestamp (btw, discourage naming columns keywords / datatypes). if alone doesn't help, try rewrite query. there 2 excellent examples in mysql manual: the rows holding group-wise maximum of column.
2) minor issue, seems joining on char/varchar fields. numbers / ids faster.
Comments
Post a Comment