-
Clustered Index vs Non-clustered
Hi,
Is there any difference between having a clustered index on an identity column
versus a non-clustered index?
AFAIK, the clustered index will physically order the data based on the values
in the index. A non-clustered index does not order the table hence the moniker
"heap table" for tables lacking clustered indexes.
BTW, does anyone know when does a clustered index reorders the data?
Is it on every insert, update or when DBCC DBREINDEX is run?
I would guess the latter but you never know.
Since I am on the subject, I would like to ask a few more questions on the
differences between the 2 indexes.
1. What's the general rule of thumb of using a clustered index over a non-clustered
index?
I should mention that I just read a great article at www.sql-server-performance.com
on "heap tables" not releasing space after rebuilding the indexes whereas
clustered tables will release unused space. I tested and found that to
be the case.
2. Has anyone measured the performance difference between the 2 indexes?
I just changed a clustered index to a non-clustered one and joined against
12 million rows on the one column that the index is based on. The queries
took the exact same time to execute.
Michael Tzoanos
-
Re: Clustered Index vs Non-clustered
Clustered indexes cause the data to be ordered according to the index,
which means reordering happens every time the index changes, for whatever
reason. Could be an UPDATE, or an INSERT.
Since clustering affects the physical order of data, a table can have a
maximum of 1 clustered index.
The general rule of thumb is to:
1: cluster on columns where you fetch ranges
2: cluster on columns you frequently order by
3: cluster on primary key. Of dubious value in my opinion.
One thing to note is that when a table doesn't hava a clustered index,
any non-clustered index contain a pointer to the actual data in the
table. When a clustered index is created, all non-clustered indexes are
rebuilt to contain the index columns of the clustered index, which may
lead to slower access.
Quite often, it's more valuable to consider creating covering indexes,
i.e. an index containing all columns required to run a query, thus
eliminating looking in the actual data pages.
About reclaiming unused space: Yes, rebuilding a non-clustered index
doesn't affect actual table space, but a clustered index *includes* the
data pages, hence the change in space usage when rebuilding the index.
Rune Bivrin
"Michael Tzoanos" <mtzoanos@hotmail.com> wrote in
news:3d52a6e8$1@10.1.10.29:
>
> Hi,
>
> Is there any difference between having a clustered index on an
> identity column versus a non-clustered index?
>
> AFAIK, the clustered index will physically order the data based on the
> values in the index. A non-clustered index does not order the table
> hence the moniker "heap table" for tables lacking clustered indexes.
>
> BTW, does anyone know when does a clustered index reorders the data?
> Is it on every insert, update or when DBCC DBREINDEX is run?
>
> I would guess the latter but you never know.
>
> Since I am on the subject, I would like to ask a few more questions on
> the differences between the 2 indexes.
>
> 1. What's the general rule of thumb of using a clustered index over a
> non-clustered index?
>
> I should mention that I just read a great article at
> www.sql-server-performance.com on "heap tables" not releasing space
> after rebuilding the indexes whereas clustered tables will release
> unused space. I tested and found that to be the case.
>
> 2. Has anyone measured the performance difference between the 2
> indexes?
>
>
> I just changed a clustered index to a non-clustered one and joined
> against 12 million rows on the one column that the index is based on.
> The queries took the exact same time to execute.
>
>
> Michael Tzoanos
>
>
>
-
Re: Clustered Index vs Non-clustered
Keep in mind that a clustered index keeps the rows in order within a page
(8k) of data. The pages are not physically in order, except after a ordered
load, or reindex based on that cluster.
What would it take to insert row1 in a million row table, if there was no
free space. Another reason to manage the freespace % numeber for clustered
indexes carefully.
KlK, MCSE
Rune Bivrin <rune@bivrin.com> wrote:
>Clustered indexes cause the data to be ordered according to the index,
>which means reordering happens every time the index changes, for whatever
>reason. Could be an UPDATE, or an INSERT.
>Since clustering affects the physical order of data, a table can have a
>maximum of 1 clustered index.
>
>The general rule of thumb is to:
>1: cluster on columns where you fetch ranges
>2: cluster on columns you frequently order by
>3: cluster on primary key. Of dubious value in my opinion.
>
>One thing to note is that when a table doesn't hava a clustered index,
>any non-clustered index contain a pointer to the actual data in the
>table. When a clustered index is created, all non-clustered indexes are
>rebuilt to contain the index columns of the clustered index, which may
>lead to slower access.
>
>Quite often, it's more valuable to consider creating covering indexes,
>i.e. an index containing all columns required to run a query, thus
>eliminating looking in the actual data pages.
>
>About reclaiming unused space: Yes, rebuilding a non-clustered index
>doesn't affect actual table space, but a clustered index *includes* the
>data pages, hence the change in space usage when rebuilding the index.
>
>Rune Bivrin
>
>"Michael Tzoanos" <mtzoanos@hotmail.com> wrote in
>news:3d52a6e8$1@10.1.10.29:
>
>>
>> Hi,
>>
>> Is there any difference between having a clustered index on an
>> identity column versus a non-clustered index?
>>
>> AFAIK, the clustered index will physically order the data based on the
>> values in the index. A non-clustered index does not order the table
>> hence the moniker "heap table" for tables lacking clustered indexes.
>>
>> BTW, does anyone know when does a clustered index reorders the data?
>> Is it on every insert, update or when DBCC DBREINDEX is run?
>>
>> I would guess the latter but you never know.
>>
>> Since I am on the subject, I would like to ask a few more questions on
>> the differences between the 2 indexes.
>>
>> 1. What's the general rule of thumb of using a clustered index over a
>> non-clustered index?
>>
>> I should mention that I just read a great article at
>> www.sql-server-performance.com on "heap tables" not releasing space
>> after rebuilding the indexes whereas clustered tables will release
>> unused space. I tested and found that to be the case.
>>
>> 2. Has anyone measured the performance difference between the 2
>> indexes?
>>
>>
>> I just changed a clustered index to a non-clustered one and joined
>> against 12 million rows on the one column that the index is based on.
>> The queries took the exact same time to execute.
>>
>>
>> Michael Tzoanos
>>
>>
>>
>
-
Re: Clustered Index vs Non-clustered
Hi,
Thanks for the replies to both of you.
>>3: cluster on primary key. Of dubious value in my opinion.
>>
And potentially a hot spot if the primary key is the identity of the column.
>Keep in mind that a clustered index keeps the rows in order within a page
>(8k) of data. The pages are not physically in order, except after a ordered
load, or reindex based on that cluster.
Kevin, I am sorry but where did you find that out? I am looking for more
information on indexes and I haven't read this anywhere.
>Another reason to manage the freespace % numeber for clustered
>indexes carefully.
You're talking about the FillFactor right?
So what would happen if the row that we are inserting into the millon row
table belonged on the first data page and there was no space? Would it reorder
all the data in the table?
Thanks,
Michael Tzoanos
-
Re: Clustered Index vs Non-clustered
Can't remember where I read this, I read too many books etc. But I know this
is the way it works. Think of the perf hit in that insert situation if it
didn't.
What happens when you insert row one, into a million row table and there's
no room in the page it belongs in.
The page it belongs in is split in two, I don't know SQL Servers exact algorithm,
let's say 50/50. The new row is inserted into the first of those 2 pages,
assuming that's where it belongs. And the second page (the newly created
page from the split) is placed at the end of the data area with the upper
level indexes pointing accordingly.
A clustered index still has normal indices, just no leaf level, lowest level.
KlK, MCSE
"Michael Tzoanos" <mtzoanos@hotmail.com> wrote:
>
>Hi,
>
>Thanks for the replies to both of you.
>
>>>3: cluster on primary key. Of dubious value in my opinion.
>>>
>And potentially a hot spot if the primary key is the identity of the column.
>
>>Keep in mind that a clustered index keeps the rows in order within a page
>>(8k) of data. The pages are not physically in order, except after a ordered
>load, or reindex based on that cluster.
>
>Kevin, I am sorry but where did you find that out? I am looking for more
>information on indexes and I haven't read this anywhere.
>
>>Another reason to manage the freespace % numeber for clustered
>>indexes carefully.
>
>You're talking about the FillFactor right?
>
>So what would happen if the row that we are inserting into the millon row
>table belonged on the first data page and there was no space? Would it
reorder
>all the data in the table?
>
>
>Thanks,
>
>Michael Tzoanos
>
-
Re: Clustered Index vs Non-clustered
Sorry, yes fillfactor. By managing your fillfactor you eliminate, or reduce
the need for these page splits. Which are expensive, overhead wise.
You should also watch the fillfactor setting when reorging.
One flaw I don't like about fill factor is it's assumption of evenly distributed
inserts. Unless your key structure is absolutely perfect is wastes space.
In today's DASD environment though thats not a big issue.
Last note, this page split activity can also be caused during an update if
you increase the size of a row, and it now will not fit within the original
space.
KlK, MCSE
"kevin knudson" <klk@DontEmailMe.com> wrote:
>
>Can't remember where I read this, I read too many books etc. But I know
this
>is the way it works. Think of the perf hit in that insert situation if
it
>didn't.
>
>What happens when you insert row one, into a million row table and there's
>no room in the page it belongs in.
>
>The page it belongs in is split in two, I don't know SQL Servers exact algorithm,
>let's say 50/50. The new row is inserted into the first of those 2 pages,
>assuming that's where it belongs. And the second page (the newly created
>page from the split) is placed at the end of the data area with the upper
>level indexes pointing accordingly.
>
>A clustered index still has normal indices, just no leaf level, lowest level.
>
>KlK, MCSE
>
>
>"Michael Tzoanos" <mtzoanos@hotmail.com> wrote:
>>
>>Hi,
>>
>>Thanks for the replies to both of you.
>>
>>>>3: cluster on primary key. Of dubious value in my opinion.
>>>>
>>And potentially a hot spot if the primary key is the identity of the column.
>>
>>>Keep in mind that a clustered index keeps the rows in order within a page
>>>(8k) of data. The pages are not physically in order, except after a ordered
>>load, or reindex based on that cluster.
>>
>>Kevin, I am sorry but where did you find that out? I am looking for more
>>information on indexes and I haven't read this anywhere.
>>
>>>Another reason to manage the freespace % numeber for clustered
>>>indexes carefully.
>>
>>You're talking about the FillFactor right?
>>
>>So what would happen if the row that we are inserting into the millon row
>>table belonged on the first data page and there was no space? Would it
>reorder
>>all the data in the table?
>>
>>
>>Thanks,
>>
>>Michael Tzoanos
>>
>
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Development Centers
-- Android Development Center
-- Cloud Development Project Center
-- HTML5 Development Center
-- Windows Mobile Development Center
|