Friday, March 9, 2012

Integer Index -vs- nVarChar(50) index....

We have a legacy database that was ported over from LDAP where the user
tables are indexed on the username (which is an nVarChar(50) field). We have
several (ok, 92) related tables all using the nVarChar(50) field as the
primary key.
Does anyone have hard numbers as to whether SQL2000 would be much more
efficient if we were to convert all the tables to use an integer as the
primary key (and relate the tables on that key) instead of the nVarChar(50)?
TIA,
Owen|||You can calculate the average length of that column and compare it to 8
bytes (for bigint) or 4 bytes (for int). The number of I/O operations over an
index with a key of, let us say, 20 bytes (10 characters for nvarchar) is
certainly higher compared with 4 or 8 bytes. Not to mention that if the
username is the clustered index key, that amount (for example, 35 bytes) is
replicated to the nonclustered indexes as well.
AMB
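To put rough numbers on that comparison, here is an illustrative sketch (Python rather than T-SQL, and the byte counts are approximations: SQL Server stores nvarchar at 2 bytes per character, plus a 2-byte length overhead per value):

```python
# Approximate per-key storage cost for the candidate index key types.
INT_BYTES = 4
BIGINT_BYTES = 8

def nvarchar_key_bytes(avg_chars: int) -> int:
    """Approximate on-disk bytes for an nvarchar key averaging avg_chars characters."""
    return 2 * avg_chars + 2  # 2 bytes per Unicode character + 2-byte length prefix

key = nvarchar_key_bytes(10)   # a typical 10-character username
print(key)                     # 22 bytes per key
print(key / INT_BYTES)         # 5.5x the storage of an int key
print(key / BIGINT_BYTES)      # 2.75x the storage of a bigint key
```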
"Owen Mortensen" wrote:

> We have a legacy database that was ported over from LDAP where the user
> tables are indexed on the username (which is an nVarChar(50) field). We have
> several (ok, 92) related tables all using the nVarChar(50) field as the
> primary key.
> Does anyone have hard numbers as to whether SQL2000 would be much more
> efficient if we were to convert all the tables to use an integer as the
> primary key (and relate the tables on that key) instead of the nVarChar(50)?
> TIA,
> Owen
>
>|||int will be much more efficient.
It's hard to give you "hard data," but I would think the performance
improvements would be fairly significant, especially if the system has a lot
of data.
If your database is only a few gigs and you only have a minor load on any
given day, then it may not be worth the effort of changing.
cheers
Greg Jackson
PDX, Oregon|||Define hard numbers. Ints are better because they are smaller. They are
also better because they are fixed length. So for each row in your table,
your index pages would hold roughly 2000 rows (8000 bytes per page / 4 bytes
per row). If you have an average of 10 characters per entry, plus the 2-byte
overhead for variable length, then you will only get 8000/12 = 666 rows per
page. Ignoring the dubious nature of these numbers, it will take three times
as many pages to hold this index.
Carrying this over to the leaf nodes, if this is the clustered index, you
have to add the 12 bytes for every row rather than 4.
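The rows-per-page arithmetic above can be sketched as follows (an illustrative Python calculation, not T-SQL; 8000 bytes is the approximate usable space on a SQL Server page). One caveat: nvarchar stores 2 bytes per character, so a 10-character key is really 20 bytes plus the 2-byte overhead, which widens the gap beyond the 12-byte figure:

```python
PAGE_BYTES = 8000  # approximate usable bytes on a SQL Server index page

def rows_per_page(key_bytes: int) -> int:
    """How many index entries of the given key size fit on one page."""
    return PAGE_BYTES // key_bytes

print(rows_per_page(4))   # int key: 2000 entries per page
print(rows_per_page(12))  # 10 chars + 2-byte overhead at 1 byte/char: 666
print(rows_per_page(22))  # 10 Unicode chars at 2 bytes/char + overhead: 363
```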
Now, this having been said, pure performance of the index is not the only
issue here. I like integer keys as much as the next guy, and possibly
more, but is this THAT great of a savings? Truly, it depends on what your
needs are, and how often you will be doing a full index scan. The cost
might be very small if you are just doing single row retrievals, since you
may not have more than a single level added to the index, as the b-tree
index structure is pretty good.
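The point about single-row retrievals can be made concrete with a rough b-tree depth estimate (illustrative Python; it assumes an internal-page fanout of roughly 2000 for a 4-byte int key versus roughly 363 for a 22-byte nvarchar key, per the page arithmetic above, and it ignores fill factor and page overhead):

```python
import math

def btree_levels(rows: int, fanout: int) -> int:
    """Rough number of b-tree levels needed to index `rows` entries."""
    return max(1, math.ceil(math.log(rows) / math.log(fanout)))

# For a million usernames, the narrower int key keeps the tree one level
# shallower, so a single-row lookup touches one fewer page.
print(btree_levels(1_000_000, 2000))  # int key fanout: 2 levels
print(btree_levels(1_000_000, 363))   # nvarchar key fanout: 3 levels
```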
So the cost of changing all of these tables, versus saving ten to twenty
milliseconds, might not be worth it. The question is (finally): what is the
impetus to change this? Is something slow? Have you come to the conclusion that for
PK based retrieval it is costing too much? Or other types of retrieval?
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Owen Mortensen" <ojm.NO_SPAM@.acm.org> wrote in message
news:up0n6hgYFHA.2768@.tk2msftngp13.phx.gbl...
> We have a legacy database that was ported over from LDAP where the user
> tables are indexed on the username (which is an nVarChar(50) field). We
> have several (ok, 92) related tables all using the nVarChar(50) field as the
> primary key.
> Does anyone have hard numbers as to whether SQL2000 would be much more
> efficient if we were to convert all the tables to use an integer as the
> primary key (and relate the tables on that key) instead of the
> nVarChar(50)?
> TIA,
> Owen
>
