Sunday, March 25, 2012

basic design question re effective relationship

Hi,

Probably a very simple question but I'm afraid my experience at this is
somewhat limited so...

In a database with a many to one relationship, say orders to customers, is
it preferable to have an UID for a customer and store this in the orders
table rather than the customer name, for example.

CUSTOMERS
++++++++++
UID: Name: Address: Postcode: etc
01 Alpha 12 Acacia Ave HN7YHH
02 Beta 23 Acacia Ave HN6YTH
03 Gamma 28 Acacia Ave HN7UYH

ORDERS
Customer Amount date
Alpha 100 20030823
Alpha 250 20030824
Beta 90 20030825
Alpha 800 20030825
Gamma 1000 20030826

Or is it is better to have:

ORDERS
Customer Amount date
01 100 20030823
01 250 20030824
02 90 20030825
01 800 20030825
03 1000 20030826

Assuming that the second option is the best, how far do I go? I can see how
it would beof benifit to do this for orders and customers where there may be
several hundred thousand orders and hundreds of customers, but for example
would I need to do this for a customer contact database with titles - eg
each title having a UID and using this ID in the customer table?

Thanks for taking the time to read this. Any advice is gratefully received.

Kind thanks

Chris S"Chris Strug" <hotmail@.solace1884.com> wrote...

> Probably a very simple question but I'm afraid my experience at this is
> somewhat limited so...

Hi Chris: Yeah it is a little "basic" but you have to start somewhere...

> In a database with a many to one relationship, say orders to customers, is
> it preferable to have an UID for a customer and store this in the orders
> table rather than the customer name, for example.

Definitely. The CustomerId is the primary key for CUSTOMERS so it would be
the foreign key in the ORDERS table. For a couple of reasons. It is
guaranteed to be unique which the customer name really isn't. While it may
be in the case of a company name (though there can be two "Starbucks"
entries) it most certainly wouldn't be if you were entering people's names.
Also it means you can change/modify a customer name without impacting the
ORDERS table.

> Assuming that the second option is the best, how far do I go? I can see
how
> it would beof benifit to do this for orders and customers where there may
be
> several hundred thousand orders and hundreds of customers, but for example
> would I need to do this for a customer contact database with titles - eg
> each title having a UID and using this ID in the customer table?

Where to draw the line is always open for debate but you wouldn't typically
implement this for "titles" since a person is typically given a single
title. Admittedly it could be a compound title "President & CEO" but would
you want those listed seperately? It introduces problems for using the
database. When printing an envelope somebody has to ask "which of these
titles did you want to use?"

You would use the CustomerId to track multiple addresses perhaps. There is
a common need for a separate shipping address for instance. And the
CustomerId would be used if you were tracking the actual "contacts" like
mailings.

Tom|||Tom,

Thanks for the reply.

The reason I ask is because I have been given a large SQL Server database to
manage. The database handles Containers and their movements in and out of
our facility. The main table has been set up as a many to one link between a
container table and a movement table. The PK in the containers table is an
eleven digit string which acts as a secondary key in the movements table (if
that makes sense!).

Keeping the integrity of the relationship isn't too much of an issue as a
container ID is a code determined by an ISO standard that has all kinds of
checksums in it to ensure it is unique. However, the movements table has
around 250,000 records in it linked to the container table with approx
75,000 records.

My thinking was that by identifying each container with a unique integer ID
and using this as the basis for the PK / SK, this would be quicker and more
efficient - after all I'm sure SQL server can handle an integer value faster
than a string.

Hence my question about titles - in a large customer table, would it make
sense to use a link to a titles table identified with an integer and store
this integer in the customer table as a secondary key? As I said, I think
that storing an integer is more efficient than storing a string?

Thanks for taking the time to read this - your help is greatly appreciated.

Kind thanks

Chris.|||"Chris Strug" <hotmail@.solace1884.com> wrote...

> The reason I ask is because I have been given a large SQL Server database
to
> manage. The database handles Containers and their movements in and out of
> our facility. The main table has been set up as a many to one link between
a
> container table and a movement table. The PK in the containers table is an
> eleven digit string which acts as a secondary key in the movements table
(if
> that makes sense!).

Pretty much... I'm somewhat familiar with the container industry and I'm
working on something related. Is the e-mail address attached to your
message your actual e-mail address, we could chat out of the public eye?

> My thinking was that by identifying each container with a unique integer
ID
> and using this as the basis for the PK / SK, this would be quicker and
more
> efficient - after all I'm sure SQL server can handle an integer value
faster
> than a string.

Personally I wouldn't worry too much about the efficiency of integers vs
strings. It takes whatever time it takes to do whatever it has to do and
you know it is operating faster this year than it was a couple of years ago.
And things will generally operate faster next year again without our direct
intervention.

That said I've been making it a point to establish a unique,
non-informational integer as a primary key in my tables. Again there are a
couple of reasons for it but the primary one is simply that I don't have to
look for a naturally occurring primary key. Additionally I make them unique
to the system. There isn't a pool for the containers and another pool for
the customers, there is simply one pool and everybody gets a key assigned
from it.

> Hence my question about titles - in a large customer table, would it make
> sense to use a link to a titles table identified with an integer and store
> this integer in the customer table as a secondary key? As I said, I think
> that storing an integer is more efficient than storing a string?

If you needed to maintain a seperate list of "titles" for a customer then
yes I would reference a standard integer key. I wouldn't think of it as
"titles" per se but simply some data with a many-to-one relationship. In
your example it is a table of "titles" but you would use the same solution
regardless of what it was. That is why I prefer a non-information, integer
key that the system assigns.

By the way, in most cases no "user" ever sees these keys. They are
meaningless and each one is simply one larger than the previous one. There
is (just about) always a public "code" which is referenced by users. In
your example the user would key in (or select) the (it's called a BIC number
right?) container code.

Oh, the other advantage to using integers in this way is that (in the case
of systems that already exist) the client may have established customer
codes which they (and the customer) have been using for years. Having them
change to a number isn't likely to happen and they can continue to use the
codes they are familiar with. The code is looked up, the integer id is
determined and the system uses it internally for all other queries.

Tom|||Tom,

Thanks for the taking the time to reply.

I'm a bit tied up at the moment (deadlines 'n all) but yes, the hotmail
account is valid - just swap the domain with the username. Please feel free
to drop me a line.

I'll come back to your post in a bit, but in the meantime, thanks again for
your help.

Cheers

Chris.sql

No comments:

Post a Comment