Mad IT: Query Optimizer

ก่อนจะพูดถึงเรื่อง optimizer เราจะต้องทำความเข้าใจในเรื่อง index ก่อนดังนี้
Primary Index หรือ clustered index คือ index บน column ที่ระบุ physical sequence ของ table
หมายถึงว่า table นั้นได้ถูกเรียงไปในทางเดียวกัน index

Secondary Index หรือ non-clustered index คือ index บน column ที่ไม่ได้ระบุ physical sequence ของ table

Statistic ใน database จะเก็บ

n_r จำนวน tuple in relation r
b_r จำนวน block containing tuples of relation r
s_r ขนาดของ tuple ของ relation r
f_r จำนวน tuple ใน 1 block
V(A, r) จำนวนของ distinct value ใน attribute A ใน relation r
SC(A, r) คือ selection cardinality ของ attribute A ของ relation r

optimizer จะเลือกแต่ะละ plan นั้นดูจาก cost แต่ละ plan
สำหรับ basic algorithms มีด้วยกัน 2 algorithm
A1 (linear search) หรือเรียกว่า table scan
A2 (binary search) โดยจะใช้ได้ต่อเมื่อ table ถูก physically sort เท่านั้น

สำหรับกรณีที่มี index ร่วมด้วย มีด้วยกันอีก 3 algorithm
A3 (primary index on candidate key, equality) cost จะเท่ากับ HTi + 1
A4 (primary index on non-key) cost จะเท่ากับ HTi + number of blocks containing retrieved records
A5 (secondary index) สำหรับ candidate key cost จะเท่ากับ HTi + 1 แต่ละสำหรับ non-key จะเท่ากับ HTi + number of records retrieved

algorithm ที่กล่าวไปแล้ว สำหรับการ select แบบ equal แต่ในกรณีที่มีการ comparison เข้ามาด้วยจะมีอีก 2 algorithm
A6 (primary index, comparison)
A7 (secondary index, comparison)

ซึ่งแต่ละ algorithm ก็จะมี วิธีการคิด cost ที่แตกต่างกันไป ซึ่ง

Mad IT

Query Optimizer

No comments:

Post a Comment