Showing posts with label records. Show all posts
Showing posts with label records. Show all posts

Sunday, March 25, 2012

Basic "RANK" question: how to stop null records appearing.

i have a query that runs just fine, returning the turnover per sector, where the sector has non-null turnover.

I'd like to add ranking into the query, but when I do, I now get ALL sectors, with those sectors having null turnover all ranking equal bottom.

This kinda makes sense, but essentially I'd like to exclude the null turnover ones. I've tried various combinations of nonempty, NON EMPTY and Exists without success. What's the answer to this, and more importantly, how should I be thinking this through to get the right answer myself?

1. First query: i have defined the ranking measure, but not used it. I get only non empty sectors.

WITH

member measures.[turnover rank] as rank([SECTOR3].[SECTOR3 ID].CurrentMember, [SECTOR3].[SECTOR3 ID].[SECTOR3 ID].ALLMEMBERS)

SELECT {[Measures].[TURNOVER]} on columns,

nonempty {[SECTOR3].[SECTOR3 ID].[SECTOR3 ID].ALLMEMBERS} ONROWS

FROM [WmCube4]

where [TIME].[Date].&[2007-06-04T00:00:00]

2. Second query: i include the ranking measure in the output - now null records appear.

WITH

member measures.[turnover rank] as rank([SECTOR3].[SECTOR3 ID].CurrentMember, [SECTOR3].[SECTOR3 ID].[SECTOR3 ID].ALLMEMBERS)

SELECT {[Measures].[TURNOVER],measures.[turnover rank]} on columns,

nonempty {[SECTOR3].[SECTOR3 ID].[SECTOR3 ID].ALLMEMBERS} ONROWS

FROM [WmCube4]

where [TIME].[Date].&[2007-06-04T00:00:00]

Not sure how ranking could occur without using a 3rd parameter for Rank() - but you can return null sectors with empty turnover, like:

member measures.[turnover rank] as

iif(IsEmpty([Measures].[TURNOVER]), Null,

rank([SECTOR3].[SECTOR3 ID].CurrentMember,

[SECTOR3].[SECTOR3 ID].[SECTOR3 ID].ALLMEMBERS,

[Measures].[TURNOVER]))

|||

works perfectly, thanks!

Tuesday, March 20, 2012

Bad Update Performance with very large recordsets

When updating a table with 80 million records we are experiencing very bad
times. We found using maxdop1 to keep the process in one thread helpful,
but something is still causing problems.
Updating one bigint field which is nullable to a bigint value from a table
inner joined with the current table is crawling.
Should we remove the nullability of the field and set it to a default of -1?
Would this ensure space is allocated for the field? What else can we look
for?
What about NO LOCK or Serializable? Could
What would be the *fastest* way to update records in a table assuming only 1
connection to the database? How do fields allowing NULLs affect this?
Thanks.Never try to update 80 million rows at once if you can avoid it. Wrap the
updates in smaller batches of say 10, 50 or 100K in size. If no one else
needs to use the table while this is happening you can use TABLOCK hint to
help.
--
Andrew J. Kelly
SQL Server MVP
"Kory Yingling" <kory@.removeme-mlsc.com> wrote in message
news:u$j1RcBbDHA.1744@.TK2MSFTNGP12.phx.gbl...
> When updating a table with 80 million records we are experiencing very bad
> times. We found using maxdop1 to keep the process in one thread helpful,
> but something is still causing problems.
> Updating one bigint field which is nullable to a bigint value from a table
> inner joined with the current table is crawling.
> Should we remove the nullability of the field and set it to a default
of -1?
> Would this ensure space is allocated for the field? What else can we look
> for?
> What about NO LOCK or Serializable? Could
> What would be the *fastest* way to update records in a table assuming only
1
> connection to the database? How do fields allowing NULLs affect this?
> Thanks.
>

Sunday, March 11, 2012

Bad Duplicate Records

we're running a query that returns duplicate records when it
_absolutely_ should not, that is, the data should have only unique
records. i remember reading some similar scenario that was either
service pack or dop related. this is only happening on one of our
boxes. this ring a bell w/ anyone? thanks,
arthur
Duplicate records ring the bell of "using outer joins where not expected"
Kind of fuzzy for us to help you without any DDL / DML or sample data.
Jens Smeyer.
http://www.sqlserver2005.de
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113512308.829100.162530@.g14g2000cwa.googlegr oups.com...
> we're running a query that returns duplicate records when it
> _absolutely_ should not, that is, the data should have only unique
> records. i remember reading some similar scenario that was either
> service pack or dop related. this is only happening on one of our
> boxes. this ring a bell w/ anyone? thanks,
> arthur
>
|||let me assure you, there are no outer joins. basically the query is:
select a.f2, b.f2
from a inner join b on a.f1 = b.f1
where a.f3 = y
if i run either
select * from a where f1 = x or select * from b where f2 = x
i only get one record. however when i join the two in the first query,
i get four records.
|||But that are different queries.
Image the sample data
Table 1
=====
f1 f2 f3
x x y
a u y
b u y
c u y
d u y
select * from a where f1 = x --> 1
Table 2
======
f1 f2 f3
u x y
a a y
b b y
c u y
d u y
select * from b where f2 = y --> 1

> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
--> 4
Try checking your join !
HTH, Jens Smeyer
http://www.sqlserver2005.de
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113516845.847086.23020@.g14g2000cwa.googlegro ups.com...
> let me assure you, there are no outer joins. basically the query is:
> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
> if i run either
> select * from a where f1 = x or select * from b where f2 = x
> i only get one record. however when i join the two in the first query,
> i get four records.
>
|||sorry, i should have left off the where a.f3 = y (or changed it to
where a.f1 = x) . my real query has a similar condition in it but does
not contribute to the duplicate records.
in any case, we're getting sidetracked. i am 100% sure there is nothing
wrong with the query - we have 2 boxes with the exact same data and one
returns duplicate data, the other doesn't.
like i said, i remember something about a bad service pack or dop
setting that causes this to happen.
arthur
|||If you are concerned about processor settings try this:
Try adding
<SelectQuery>
OPTION (MAXDOP 1)
or look here if it doesnt work
http://support.microsoft.com/kb/266372/EN-US/
HTH, Jens Smeyer
http://www.sqlserver2005.de
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113593450.443922.169900@.l41g2000cwc.googlegr oups.com...
> sorry, i should have left off the where a.f3 = y (or changed it to
> where a.f1 = x) . my real query has a similar condition in it but does
> not contribute to the duplicate records.
> in any case, we're getting sidetracked. i am 100% sure there is nothing
> wrong with the query - we have 2 boxes with the exact same data and one
> returns duplicate data, the other doesn't.
> like i said, i remember something about a bad service pack or dop
> setting that causes this to happen.
> arthur
>
|||thanks! that got rid of the duplicate records. very odd since this is
not a complex query at all. do you know if this issue is related to any
service pack/hot fix?

Bad Duplicate Records

we're running a query that returns duplicate records when it
_absolutely_ should not, that is, the data should have only unique
records. i remember reading some similar scenario that was either
service pack or dop related. this is only happening on one of our
boxes. this ring a bell w/ anyone? thanks,
arthurDuplicate records ring the bell of "using outer joins where not expected"
Kind of fuzzy for us to help you without any DDL / DML or sample data.
Jens Smeyer.
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113512308.829100.162530@.g14g2000cwa.googlegroups.com...
> we're running a query that returns duplicate records when it
> _absolutely_ should not, that is, the data should have only unique
> records. i remember reading some similar scenario that was either
> service pack or dop related. this is only happening on one of our
> boxes. this ring a bell w/ anyone? thanks,
> arthur
>|||let me assure you, there are no outer joins. basically the query is:
select a.f2, b.f2
from a inner join b on a.f1 = b.f1
where a.f3 = y
if i run either
select * from a where f1 = x or select * from b where f2 = x
i only get one record. however when i join the two in the first query,
i get four records.|||But that are different queries.
Image the sample data
Table 1
=====
f1 f2 f3
x x y
a u y
b u y
c u y
d u y
select * from a where f1 = x --> 1
Table 2
======
f1 f2 f3
u x y
a a y
b b y
c u y
d u y
select * from b where f2 = y --> 1

> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
--> 4
Try checking your join !
HTH, Jens Smeyer
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113516845.847086.23020@.g14g2000cwa.googlegroups.com...
> let me assure you, there are no outer joins. basically the query is:
> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
> if i run either
> select * from a where f1 = x or select * from b where f2 = x
> i only get one record. however when i join the two in the first query,
> i get four records.
>|||sorry, i should have left off the where a.f3 = y (or changed it to
where a.f1 = x) . my real query has a similar condition in it but does
not contribute to the duplicate records.
in any case, we're getting sidetracked. i am 100% sure there is nothing
wrong with the query - we have 2 boxes with the exact same data and one
returns duplicate data, the other doesn't.
like i said, i remember something about a bad service pack or dop
setting that causes this to happen.
arthur|||If you are concerned about processor settings try this:
Try adding
<SelectQuery>
OPTION (MAXDOP 1)
or look here if it doesnt work
http://support.microsoft.com/kb/266372/EN-US/
HTH, Jens Smeyer
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113593450.443922.169900@.l41g2000cwc.googlegroups.com...
> sorry, i should have left off the where a.f3 = y (or changed it to
> where a.f1 = x) . my real query has a similar condition in it but does
> not contribute to the duplicate records.
> in any case, we're getting sidetracked. i am 100% sure there is nothing
> wrong with the query - we have 2 boxes with the exact same data and one
> returns duplicate data, the other doesn't.
> like i said, i remember something about a bad service pack or dop
> setting that causes this to happen.
> arthur
>|||thanks! that got rid of the duplicate records. very odd since this is
not a complex query at all. do you know if this issue is related to any
service pack/hot fix?

Bad Duplicate Records

we're running a query that returns duplicate records when it
_absolutely_ should not, that is, the data should have only unique
records. i remember reading some similar scenario that was either
service pack or dop related. this is only happening on one of our
boxes. this ring a bell w/ anyone? thanks,
arthurDuplicate records ring the bell of "using outer joins where not expected"
Kind of fuzzy for us to help you without any DDL / DML or sample data.
Jens Süßmeyer.
--
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113512308.829100.162530@.g14g2000cwa.googlegroups.com...
> we're running a query that returns duplicate records when it
> _absolutely_ should not, that is, the data should have only unique
> records. i remember reading some similar scenario that was either
> service pack or dop related. this is only happening on one of our
> boxes. this ring a bell w/ anyone? thanks,
> arthur
>|||let me assure you, there are no outer joins. basically the query is:
select a.f2, b.f2
from a inner join b on a.f1 = b.f1
where a.f3 = y
if i run either
select * from a where f1 = x or select * from b where f2 = x
i only get one record. however when i join the two in the first query,
i get four records.|||But that are different queries.
Image the sample data
Table 1
=====f1 f2 f3
x x y
a u y
b u y
c u y
d u y
select * from a where f1 = x --> 1
Table 2
======f1 f2 f3
u x y
a a y
b b y
c u y
d u y
select * from b where f2 = y --> 1
> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
--> 4
Try checking your join !
HTH, Jens Süßmeyer
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113516845.847086.23020@.g14g2000cwa.googlegroups.com...
> let me assure you, there are no outer joins. basically the query is:
> select a.f2, b.f2
> from a inner join b on a.f1 = b.f1
> where a.f3 = y
> if i run either
> select * from a where f1 = x or select * from b where f2 = x
> i only get one record. however when i join the two in the first query,
> i get four records.
>|||sorry, i should have left off the where a.f3 = y (or changed it to
where a.f1 = x) . my real query has a similar condition in it but does
not contribute to the duplicate records.
in any case, we're getting sidetracked. i am 100% sure there is nothing
wrong with the query - we have 2 boxes with the exact same data and one
returns duplicate data, the other doesn't.
like i said, i remember something about a bad service pack or dop
setting that causes this to happen.
arthur|||If you are concerned about processor settings try this:
Try adding
<SelectQuery>
OPTION (MAXDOP 1)
or look here if it doesn´t work
http://support.microsoft.com/kb/266372/EN-US/
HTH, Jens Süßmeyer
--
http://www.sqlserver2005.de
--
"arthur" <alangham@.gmail.com> schrieb im Newsbeitrag
news:1113593450.443922.169900@.l41g2000cwc.googlegroups.com...
> sorry, i should have left off the where a.f3 = y (or changed it to
> where a.f1 = x) . my real query has a similar condition in it but does
> not contribute to the duplicate records.
> in any case, we're getting sidetracked. i am 100% sure there is nothing
> wrong with the query - we have 2 boxes with the exact same data and one
> returns duplicate data, the other doesn't.
> like i said, i remember something about a bad service pack or dop
> setting that causes this to happen.
> arthur
>|||thanks! that got rid of the duplicate records. very odd since this is
not a complex query at all. do you know if this issue is related to any
service pack/hot fix?

Bad Design?


Table A
[ID] [SalesmanEmail] [SalesmanName]
Table B
[ID] [QuoteNo] [SalesmanID]
Table C
[ID] [Product] [QuoteNo]

Program A creates the records for Table C. In the process it reads from Table A and Table B.

Program B reads record from Table C. Program B now needs [SalesmanEmail] field.

The suggestion is to add [SalesmanEmail] to Table C. Is this good or bad design?Select TableA.SalesmanEmail From TableA Where TableA.ID In (Select Distinct TableB.SalesmanID From TableB Where TableB.ID = whatever);|||That's how I suggested we do it, however the "senior" programmer was the one who asked me to write out SalesmanEmail directly into Table C.

I tried to explain to him that if there were ever any modifications to SalesmanEmail in Table A or to SalesManID in Table B, every record I wrote in Table C would now be invalid because we wrote it to a column instead of doing a lookup, but he still didn't get what I meant. And his method also wastes DB space by storing the same field in multiple tables.|||The suggestion to put the salesman's email address into Table C violates the Second Normal Form of database normalization. It belongs solely in Table A.

Terri|||:: Table C violates the Second Normal Form of database normalization.

Is that a misdeameanor or felony? or does the SQL cop just write you a ticket?

j/k. :)

Yeah, I just gave a query example how you don't need to add a new field if it's already in another.|||But he would probably be more comforted to see:


SELECT
TableC.Product,
TableC.QuoteNo,
TableA.SalesmanName,
TableA.SalesmanEmail
FROM
TableC
INNER JOIN
TableB ON TableB.QuoteNo = TableC.QuoteNo
INNER JOIN
TableA ON TableA.SalesmanID = TableB.SalesmanID

;-)

Terri
PS - I think the penalty is pocket protector removal

Bad data

Some knuckleheads have used a cutom indexing app to create several thousand records with this problem: The indexing app permitted field entries longer than the db receiving the index. IE, a 100-char entry when the DB only accepted 50 characters. The app had no safeguard against this error.

Does Oracle or PL/SQL have a function that will allow me to select all records where the content exceeds a certain number of characters in length? Actually in this case it would be better to find strings exactly 50 chars long. The data has been imported to our Oracle db, and I need to find these bad records!Try:

select * from myTable
where length(col_x) = 50;

This will list all rows where col_x is 50 characters long.

:cool:|||That's it! Thanks.

bp

Thursday, February 16, 2012

backup, truncation, behavior

Scenario:
I have a database in SQL Server 2000 with 90gb of data. It is growing every day with a some millions records. In order to get superior performance I rebuild the indexes every night. The job takes about 1.5 hours, and sometimes causes to the log file to grow to 30-40 gigs. I do a backup of the log files every 20 minutes, and this causes an log truncation to happen automatically (according to the MS documention).
Problem:
The log files makes the disk goes full. I cannot insert new disks at this point.
Question:
When rebuilding the indexes takes 1.5 hours, and I do a log backup (which also does a truncation) every 20 minutes, will the log backup ever get to do something with the transaction log while its still working on a transaction (doing the rebuild of index)?
Observation:
It seems like the 20 minute log backup is of no help because I see the log files grows so big.

BR
Henrik
NorwayIn the future, you may want to point your SQL specific questions tohttp://www.sqljunkies.com/forums/.

Here is some code I run in one of my jobs to shrink it down:

DBCC SHRINKDATABASE (dbname, 10)

DBCC SHRINKFILE(dbname_Log, 10)
BACKUP LOG Website WITH TRUNCATE_ONLY
DBCC SHRINKFILE(dbname_Log, 10)

This first shrinks the db and then the log file and sets the size to be 10 MB. Then, it backs up the log file and shrinks it again. When I'm done, I get about 8-10 MB off of the db and my log file will be only 1 MB.|||Thanks for you reply and interest.

DBCC SHRINKDATABASE AND SHRINKFILE actually removes data, which is not the intent.

The backup command issued is what I'm talking about. What happens if you issue a CREATE INDEX blabla WITH DROP at 09.00 PM, and this takes 1.5 hours. Then, at 09.20 you issue a BACKUP LOG blabla. Will the last command help on reducing the log file, or will it not do any work before 1.5 hours has past?

DBCC Indexdefrag is neither an option for me.

-Henrik