• Sponsored Links :

SQL SERVER

Q. I have SQL Server 2000 with approximately 45 databases on a single instance. These databases are used by multiple ASP applications running on my public Web site. SQL Server regularly reaches 100 percent CPU utilization and requires reboots. How should I start troubleshooting and repairing this problem?
A Start with rebuilding all of your indexes. It’s easy, assuming you have memory configured correctly. Then run SQL Profiler for a few hours and capture a snapshot of your standard business T-SQL traffic. Run this trace through the Index Tuning Wizard (ITW) that comes built into SQL Server 2000. It will take a look at the queries and make recommendations for indexes to improve performance.
If you have SQL Server 2005 installed, you can use the Database Tuning Advisor (DTA) which is the successor to ITW, to tune your workload. DTA can tune SQL Server 2000 as well. It tunes a database on a production server by offloading most of the tuning load onto a test server. DTA uses the production server hardware configuration information, without actually copying the data from the production server to the test server. It only copies the metadata and necessary statistics.
Q. How can I configure SQL Server to operate over multiple servers that operate virtually as one? In other words, I’d like to have SQL Server load-balanced across several Windows® servers in order to distribute the processing load to improve overall performance when there are many users accessing the database through a Web site. I would also like failover clustering.
A No version of SQL Server supports load balancing in the manner you discuss. However, there are a number of high-availability techniques to separate the processing load onto different servers. For example:
1. Peer-to-peer replication (new in SQL Server 2005)
2. Distributed partitioned views (available in SQL Server 2000 and enhanced in SQL Server 2005)
3. Active/active failover clustering, which is really active/passive and passive/active; in other words, two instances of SQL Server (this technique has been available since the release of SQL Server 7.0)
SQL Server doesn’t support load balancing because Microsoft Cluster Service (MSCS) uses what is known as a "shared nothing" cluster methodology. That is, the nodes operate independently of each other and do not share disks or resources. There is only one owner of each resource at a time.
Q. For security reasons, my HR department doesn’t want the IT department to be able to view the personnel database, which is stored in SQL Server. How can I prevent the IT staff from simply restoring backed-up copies of the database onto another server during the backup process, enabling them to see the data?
A Unfortunately, there is no way to completely prevent DBAs from restoring databases elsewhere unless their physical access is restricted to only certain servers. There are a few things you can do, though, to make the backup process more secure:
1. 1. Encrypt the data in the database. You may take a small performance hit, but the DBAs won’t be able to see the data in plain text (at least not easily).
2. 2. Use a third-party product such as Quest Software’s LiteSpeed for SQL Server and encrypt your backups on the way out so that nobody can view the backup files. Doing so would add an additional layer of protection as standard backups made by the SQL Server engine are not encrypted.
Q. When I run one particular query, CPU usage will reach 100 percent after a few seconds, and the server will be shut down in a few minutes. I can reproduce it every time. We have hundreds of millions of unique values in the table in question.
Is this a hardware issue or a software issue? I can’t get any information from Event Log. The server is an HP AMD 64-bit machine. I am using SQL Server 2005 (64-bit).
A If the system really is crashing (bugchecking) then you will need to take a look in the memory.dmp file. First check the Startup and Recovery settings (in the Control Panel) and make sure it’s set to create at least a kernel memory dump. If that is already set, then find the memory.dmp file and save it to another machine. The next step is to download the Debugging Tools for Windows (windbg) from microsoft.com/whdc/devtools/debugging. Make sure to select the right package for your system (in this case you want the 64-bit versions), and then choose the x64 package.
When the debugger is installed, run windbg and set the symbol path to the Microsoft Symbol Server. To do this, type Ctrl+S and enter a valid symbol path. I usually use something like srv*c:\symcache*http://msdl.microsoft.com/download/symbols. For more information on this, see Debugging Tools and Symbols: Getting Started.
Next, open the memory.dmp file in your favorite Windows debugger and run !analyze -v. This will tell you what the bugcheck was and give more details on why the server crashed.
Q I have a network administrator who is also a database admin for all the SQL Server™ databases in my enterprise. As such, he has administrative access to the data and data structures of those databases. How can I monitor his administrative activities?
A If you’re using SQL Server 2005, you can use Data Definition Language (DDL) triggers to write events to a log table that you can later review. In SQL Server 2000, you can run a profiler, but that can be resource intensive.
Often these monitoring requirements can be met by having a documented change-control process in place and by removing the built-in administrator role from the sysadmin list, replacing it with a custom group that only contains your DBAs. This would prevent your operating system and domain admins from automatically gaining sa access to SQL Server.
However, while SQL Server 2005 DDL triggers can generate information about schema changes, they won’t capture read or write object access. In SQL Server 2005, you can probably get a lot more mileage from event notifications with very little overhead (for further information, see "DDL Events for Use with Event Notifications").
Running a server-side trace (without using the Profiler UI) may incur an acceptable amount of overhead, but that option first needs to be tested and confirmed. You may be able to use a combination of filters on Application Name, DBUserName, LoginName, and NTUserName for capturing events such as RPC:Completed, SP:Completed, SP:StmtCompleted, SQL:StmtCompleted (these are examples from SQL Server 2000).
With a well-configured trace, you can monitor both read and write access in all databases for selected users or for applications (such as Enterprise Manager and Query Analyzer), while limiting tracing so it captures only the info you want and excludes normal application activity.
Just keep in mind that a user with sysadmin privileges on SQL Server can defeat most forms of database monitoring with enough effort and determination.
Q I am running SQL Server 2000 SP4 on Windows Server™ 2003 SP1 and I have been investigating some performance problems. I’ve noticed that the Pages/sec and Page Reads/sec counters are very high. The average Pages/sec is 606 and the max is 6469. The Page Reads/sec average is 507 and the maximum is 2090.
The server has 23.5GB of memory with SQL Server configured to use 21.3GB. I changed the memory allocated to SQL Server to 18GB, but I see no change in the values for the Pages/sec and Page Reads/sec. This is a 32-bit environment and SQL Server is running on a dedicated machine. Could either database fragmentation or bad queries be causing the counters to reach such heights?
A When Pages/sec is greater than 0 or Page Reads/sec is greater than 5, Windows is going to disk to resolve memory references (via a hard page fault), which comes at the price of disk I/O and CPU resources. Do you have any other applications running on the server apart from SQL Server? If you stop SQL Server what happens? If the problem is disk fragmentation, try using the DBCC SHOWCONTIG tool on specific tables that could be causing the issue, rebuilding those tables if required. If this issue occurs only during a specific query, do some query tuning.
The following is further information on these counters that you may find useful in determining your problems:
When looking at Pages/sec, start investigating if you’re getting over 100 on a system with a slow disk. Please note that the stated values of over 20 pages, which appear in many other sources of documentation, are out of date. On a system with a fast disk subsystem, even 500 pages per second may not create an issue.
Always break up this count into pages output and pages input if the counter is above 100 per second.
Pages/sec represents the number of pages read from the disk and written to the disk in order to resolve memory references to pages that were not in memory at the time of the reference. In other words, it is the sum of Pages Input/sec and Pages Output/sec. This counter includes paging traffic on behalf of the system cache used to access file data for applications. It is the primary counter to observe if you are concerned about excessive memory pressure (thrashing) and the excessive paging that may occur as a result.
This counter, however, also accounts for such activity as the sequential reading of memory-mapped files, whether cached or not. Therefore, you must understand the expected behavior of the processes on your server in order to interpret this counter. The typical red flag for a mapped file reading is the following combination of characteristics:
• A high number of Pages/sec
• A normal (average, relative to the system being monitored) or high number of Available Bytes
• A normal or small number for Paging File: % Usage
In the case of a non-cached memory- mapped file, you will also see normal or low cache (cache fault) activity.
The number of Pages Output/sec is only an issue when disk load becomes an issue. This value is completely meaningless unless it is evaluated in concert with the disk throughput capacity.
Remember, these pages are pages written by an application and need to be written to the disk to resolve memory references to pages that were not in memory at the time of the reference. This is not resource intensive, and as long as disk write time for the logical partition does not exceed 30 percent, you should not see any impact on system performance. The proper way to observe the disk’s write time is to look at its inverse counter; in this case, the disk idle time should be 70 percent or greater.
Pages Output will stress the system but the application will remain unaware of this. An application will only wait on Pages Input and the troubleshooter will need to know what the application tolerance for waiting on Pages Input will be. For example, SQL Server and most applications will tolerate very few Pages Input while Exchange Server will do much better. Again, you will need a good baseline for comparison.
If you suspect paging is the issue, the best threshold value to use for diagnosing the problem is disk read time on the logical disk that holds the page file. To rule out paging as an issue, look for disk idle times of greater than 85 percent and transfer times of less than 15 ms.
It is important that you know that the pages in and pages out represent writing to the page file and not some data file reading and writing to another file. This is bit difficult except on an enterprise installation where you have different disk partitions for each type of file—for example, the page file is on a partition by itself or on the system partition, the TEMPDB file is on another partition, and all the SQL Server files on another partition.
Look at the disk counter for logical disks, then take the bytes per second and divide by 4096 to convert to pages per second and see which disk the I/O is going to in order to determine if this is true paging or data reads and writes.
SQL Server uses paging for some memory operations. If none of the disks with page files have I/O then the remaining operations must be data operation, not process paging.
If you have no other information, follow the general guideline that paging 40 Pages Input/sec per spindle will not slow down most applications.
Q In my current transactional replication I replicate the transaction, but not the stored procedure. It was recommended that I replicate the stored proc itself.
I have a few questions regarding this. If I replicate the sproc, how do I tell replication not to replicate the transactions? Before I publish the sproc, do I need to stop replication, take a snapshot, and then restart? And finally, if the sproc refers to a table on a linked server, will this be performed on the subscriber?
A To answer your first question, when the sproc is executed, SQL Server automatically tags the session with proper information so modifications made inside the sproc are not individually marked for replication. As a result, only the sproc exec statement will be replicated, not the individual Data Manipulation Language (DML) within the sproc.
As for stopping replication, the answer is no, you don’t have to do all that. You can simply add the sproc article just as you do with table articles.
Finally, if you publish the stored procedure sp_foo, ‘exec sp_foo ‘bar’’ will get replicated as such. You need to make sure sp_foo can execute on the subscriber just like it executes on the publisher.
Q I have some domain users who need to view a report that I created in SQL Server 2005 Reporting Services. How do I grant permission to those users so that they can view the report?
A You can either script this using the rs tool, or you can use the Report Manager by following these steps:
1. Go to the folder in Report Manager where you have the report and click the Show Details button in the upper right.
2. Click the Edit icon next to the report you want to share, and then click the Security link on the left.
3. In the Group or username field, enter the DOMAIN\groupname for the group you created and assigned the users to. (You can add users individually, but groups are easier to manage.)
4. Put a check next to the Browser role and click OK. And now you’re all set.
Q I need to issue a list of all tables that have more than 10,000 rows for a specific application. In that list, I want to identify possible candidate columns for indexes based on auto-generated statistics. I know I can list these statistics using the sp_helptstats Stored Procedure in SQL Server 2000 and then run DBCC SHOW_STATISTICS to see the data distribution. But I don’t want to do that. Is there an alternative?
A The best way to do something like this is to configure a trace to capture the SP:Starting, SP:Completed, and Showplan:Statistics events. (There are others you may also have to capture to collect the events for remote procedure calls and T-SQL events.)
Run the trace at least three times during your operational day—during slow, moderate, and heavy traffic times. Take the output trace files and feed them to the Index Tuning Wizard (ITW) as workload files. The recommendations made by the ITW are not absolute, but they should point you in the right direction. Running this sort of trace periodically is really the best way to keep the indexing and statistics in tune.
Q I recall there being an issue in SQL Server 2000 that if a max memory setting was configured, SQL Server allocated the entire max amount at startup. Is this problem fixed in SQL Server 2005?
A Unfortunately, this is a misconception. Configuring max server memory alone does not make SQL Server allocate all the memory at startup—not under SQL Server 2000, and not under SQL Server 2005. But, depending on the real load, the configured amount can be consumed very quickly.
There are, however, other settings that would result in switching of the dynamic memory management—for example, turning on address windowing extensions (AWE) support. The good news is this has changed in SQL Server 2005!
If you run SQL Server 2005 under Windows Server 2003 (editions that support AWE), you get dynamic memory management even if you switch on AWE support. On almost all other configurations, SQL Server switches dynamic memory management off as soon as you switch AWE on.
Dynamic AWE memory is only available with SQL Server 2005 on Windows XP and Windows Server 2003 or higher. After changing the AWE option, you need to restart the SQL Server process; the decision of whether to use static AWE memory allocation or dynamic AWE memory allocation is made at server start-up time.
It’s also worth noting that if you’re using SQL Server 2005 on 64-bit systems, and the account used to run SQL Server has the "Lock pages in memory" privilege, SQL Server will default to using AWE memory even though the AWE configuration option is not available on 64-bit systems. See the blog by Slava Oks at blogs.msdn.com/slavao/archive/category/9005.aspx for more details on this.
The difference between static memory allocation and dynamic memory allocation is that with dynamic memory allocation, SQL Server responds to changes to min and max server memory without having to restart SQL Server. SQL Server will also respond to the operating system low memory notifications and release memory back to the OS when necessary.
Q I need to rebuild statistics on very large SQL Server tables that are located in heavy usage environments where there is no server downtime. I would rather not play around with different sample sizes. I’m running SQL Server 2000 and SQL Server 2005. There seems to be less of a need for explicit sampling in SQL Server 2005 since the default sample size has increased dramatically. When is it necessary to do a full scan on a table?
A If the sample is good, there is usually no need to play with the sample size and the default sample will be sufficient.
But, statistical estimations based on sampling depend on the randomness of the sample. A non-random sample will yield bad results, and without randomness, increasing the sampling rate won’t help. The only proven workaround is to avoid sampling altogether and do full scan statistics.
Unfortunately, there is no simple way to determine the randomness of the sample. It usually takes some statistical analysis of the involved data. For instance, to get a sample on column B, you use an index on column A. If A and B are correlated in any way, the sample isn’t random. Other sources of non-randomness come from the fact that you do page sampling for performance reasons.
So, start with default sampling. Try full scan only if you observe performance issues (slow plans). If this works, come up with a plan for periodically refreshing the statistics of the full scan. Also keep in mind that at least on SQL Server 2005, full scan stats get parallelized. Depending on your circumstances and hardware this might dramatically reduce the build time.
Q What is the difference between DELETE and TRUNCATE? Is one faster than the other?

DELETE logs the data for each row affected by the statement in the transaction log and physically removes the row from the file, one row at a time. The recording of each affected row can cause your transaction log grow massively if you are deleting huge numbers of rows. However, when you run your databases in full recovery mode, detailed logging is necessary for SQL Server to be able to recover the database to the most recent state, should a problem arise. The fact that each row is logged explains why DELETE statements can be slow.
TRUNCATE is faster than DELETE due to the way TRUNCATE "removes" rows. Actually, TRUNCATE does not remove data, but rather deallocates whole data pages and removes pointers to indexes. The data still exists until it is overwritten or the database is shrunk. This action does not require a lot of resources and is therefore very fast. It is a common mistake to think that TRUNCATE is not logged. This is wrong. The deallocation of the data pages is recorded in the log file. Therefore, BOL refers to TRUNCATE operations as "minimally logged" operations. You can use TRUNCATE within a transaction, and when this transaction is rolled-back, the data pages are reallocated again and the database is again in its original, consistent state.
Some limitations do exist for using TRUNCATE.
• You need to be db_owner, ddl_admin, or owner of the table to be able to fire a TRUNCATE statement.
• TRUNCATE will not work on tables, which are referenced by one or more FOREIGN KEY constraints.
So if TRUNCATE is so much faster than DELETE, should one use DELETE at all? Well, TRUNCATE is an all-or-nothing approach. You can't specify just to truncate those rows that match a certain criteria. It's either all rows or none.
You can, however, use a workaround here. Suppose you want to delete more rows from a table than will remain. In this case you can export the rows that you want to keep to a temporary table, run the TRUNCATE statement, and finally reimport the remaining rows from the temporary table. If your table contains a column with the IDENTITY property defined on it, and you want to keep the original IDENTITY values, be sure to enabled IDENTITY_INSERT on the table before you reimport from the temporary table. Chances are good that this workaround is still faster than a DELETE operation.
You can also set the recovery mode to "Simple" before you start this workaround, and then back to "Full" one it is done. However, keep in mind that is this case, you might only be able to recover to the last full backup.
Q My application is very INSERT heavy. What can I do to speed up the performance of INSERTs?

Here are a variety of tips that can help speed up INSERTs.
1) Use RAID 10 or RAID 1, not RAID 5 for the physical disk array that stores your SQL Server database. RAID 5 is slow on INSERTs because of the overhead of writing the parity bits. Also, get faster drives, a faster controller, and consider turning on write caching on the controller if it is not already turned on (although this has its disadvantages, such as lost data if your hardware fails).
2) Ensure that your server is not starved for memory.
3) The fewer the indexes on the table, the faster INSERTs will be.
4) Try to avoid page splits. Ways to do this include having an appropriate fillfactor and pad_index for you indexes, rebuilding indexes often, and consider adding a clustered index on an incrementing key for the table (this forces pages to be added one after another, and page splits are not an issue).
5) Keep the columns widths as narrow as possible.
6) If data length in a column is consistent, use CHAR columns, or if data length varies a lot, use VARCHAR columns.
7) Try to batch INSERTs rather than to INSERT one row at a time. But this can also cause problems if the batch of INSERTs is too large.
None of these suggestions will radically speed up your INSERTs by themselves, but put together, they all will contribute to overall faster INSERTs.
Q Does running the SQL Server Index Wizard find all of the tables in my database that need indexing?
Does running the SQL Server Index Wizard in SQL Server 7.0 or 2000, or the Database Engine Tuning Advisor in 2005, find all of the tables in my database that need indexing?
No. While these are handy tools to help identify some of the more obvious missing indexes, they can still miss a lot of useful indexes. Here are some of the reasons why:
• While the Index Wizard is very sophisticated, it still is not smart enough to identify all potential indexes. The 2005 Database Engine Tuning Wizard does a better job than the older tools.
• The number of indexes that are identified is partially based on the settings you choose when running these tools. Some settings do a more thorough, and time-consuming analysis, resulting in fewer or more indexes being identified.
• The trace file you "feed" to these tools for analysis also plays a large part in identifying indexes. Ideally, the trace should be of a representative sample of data. If not, then these tools won't be able to identify all the poorly running queries and be able to suggest appropriate indexes.
I recommend the use of these tools be used to identify the easy to find indexes, and then your next step is to use Profiler to identify any other long running queries, then you must manually review each of these to see if changes in indexes will help their performance. This can be a slow process, but this is just one of the many jobs of the DBA.
Q.I am running SQL Server 2000 on Windows 2000 with 1GB of RAM on a dedicated server, but I am getting the following error message when doing backup and running simple queries like ALTER TABLE. What can be the problem?
Server: Msg 845, Level 17, State 1, Line 1
Time-out occurred while waiting for buffer latch type 4 for page
(1:8450), database ID 8.
Server: Msg 3013, Level 16, State 1, Line 1
I have also run into this same (well, very similar) error message on two different occasions. What the SQL Server is telling you is that the disk I/O system is not able to keep up with SQL Server. Almost always, this message means that you have a driver or hardware problem with your I/O system, and you need to identify the problem and fix it. This error can occur anytime SQL Server is generating a lot of I/O activity.

In one of the cases I found this error, in was with a SAN that was using an outdated driver. On another occasion, it was a physical problem with a SCSI card. Your cause may be the same, or very similar.

You, or a network administrator, needs to check the I/O subsystem and see what is going on. Many have error logs you can view to see if error are being generated
I have been taking a look at the queries run against a particular table, along with the indexes on the table, and I have discovered the following:

1) Query #1 uses a composite index that includes a three column index.

2) Query #2 uses a composite index that includes a two column index.

3) The first two columns of both indexes are identical.

Here's my question. Since the two different indexes overlap the first two columns, is the composite index with the two columns redundant? And if I remove it, will Query #2 automatically use the three column index, even though it has previously only used the two column index?

Q.I have been taking a look at the queries run against a particular table, along with the indexes on the table, and I have discovered the following:
1) Query #1 uses a composite index that includes a three column index.
2) Query #2 uses a composite index that includes a two column index.
3) The first two columns of both indexes are identical.
Here's my question. Since the two different indexes overlap the first two columns, is the composite index with the two columns redundant? And if I remove it, will Query #2 automatically use the three column index, even though it has previously only used the two column index?
You are correct in your assumptions. What you have discovered is a redundant index, and this a more common problem than you might expect. Redundant indexes hurt performance because it takes extra resources to maintain them. By removing the index with only two columns, and leaving the index with three columns, you will be removing a redundant index, and when Query #2 runs, it will now automatically use the three column index, and you won't see any performance degradation because of this.

As a part of your performance tuning strategy, you might want to review heavily indexed tables and see if there are any redundant indexes that you can eliminate.
Is there any significant performance difference when joining tables across different databases on the same server?
This is very easy to test yourself. For example, make a test copy of one of your databases. Then create a query that JOINs two tables from within the same database. Next, create a second JOIN query similar to the first, but modify the second query so that it JOINs a table from the original and test database. Then run both queries and examine their query plans.
In virtually every case, the execution plans are identical, which tells you that performance of the query, whether it is inside a single database, or between two databases on the same server, are more or less identical.
On the other hand, if the databases are on separate servers, performance may suffer greatly due to network latency, etc.
Q. From time to time, my tempdb database grows so large that it runs out of disk space, which causes the connection that is using the tempdb to die. The tempdb can get as large as almost 6 GB, and the server's performance really suffers during this process, until, of course, when the connection is killed. What causes this and how can I prevent it?
I have seen this before, and it can almost always be traced back to a poorly designed query. The tempdb database can be used directly via Transact-SQL, for example, as when temp tables are used; or it can be used indirectly, like it when ORDER BYs or GROUP BYs are run as part of a query.
Whenever I see this problem, I first identify the query that is causing the problem and see what is going on. This can be easy if a user reports the problem and you know what was happening when the query failed. If the user is not helpful, then you can use Profiler to capture the data and analyze it.
When it comes down to it, the reason the tempdb fills up is because the query is returning way too much data, and you need to find out why and fix it. Often, it is because the query allows a user to specify one or more criteria (in the WHERE clause), and in this particular instance, the WHERE clause was not specific enough, and way too much data was returned.
In some cases, you may not be able to easily control the queries that hit your database. For example, perhaps you allow Excel or Access users to query SQL Server data. In this case, you can't prevent users from writing bad queries. In this case, what I do is to limit size of the tempdb, instead of letting it grow automatically (which is the default setting for tempdb). This way, when a bad query runs, it will die much sooner (because enough space can't be accessed, which will help to reduce the server's stress when running such a bad query. Sure, this might make some users a little unhappy when their queries fail, but a bad query should not be allowed to run in the first place.
In fact, it is a good idea to pre-size the tempdb database anyway, so that when SQL Server is restarted, it will automatically be set at a specific size. This way, when the tempdb really does need to be used, time doesn't have to be wasted as the tempdb has to autogrow.
Predicting the ideal size for the tempdb is not easy. Generally, I take a guess, and then watch it, seeing how good my guess is. And if I am wrong, then I will make the necessary adjustments.
Is there any performance difference between using SET or SELECT to assign values in Transact-SQL?

This should be an easy task, but as far as I can tell, there is no really easy way to do this. In the following demo code, I will give an example of how to do this, although it is a lot of trouble.
--First, the following code pins the "authors" table in pubs:
DECLARE @db_id int, @tbl_id int
USE pubs
SET @db_id = DB_ID('pubs')
SET @tbl_id = OBJECT_ID('pubs..authors')
DBCC PINTABLE (@db_id, @tbl_id)
SELECT @tbl_id
--Next, a table is not really pinned until after it is retrieved from disk into the buffer, so you need to do this with the following command:
SELECT * FROM authors
--Now to see if the authors table is pinned or not, you can run the following code. If the table is pinned, you will get a message that it is, if it is not pinned, then you won't get any return information about pinned tables.
IF OBJECTPROPERTY (object_id('authors'),'TableIsPinned') = 1
PRINT 'TableIsPinned'
--When you are done, you will want to unpin your table. To do so, run this example code:
DECLARE @db_id int, @tbl_id int
USE pubs
SET @db_id = DB_ID('pubs')
SET @tbl_id = OBJECT_ID('pubs..authors')
DBCC UNPINTABLE (@db_id, @tbl_id)
Of course, this example code only will check one table at a time. To check all your tables, run the following code:
SELECT * FROM INFORMATION_SCHEMA.Tables WHERE TABLE_TYPE = 'BASE TABLE' AND OBJECTPROPERTY(object_id(TABLE_NAME), 'TableIsPinned') > 0
When creating temp tables within a stored procedure, is it good practice to drop them as soon as they are no longer required, or leave the 'tidy-up' until the end?

Maintaining temp tables uses up SQL Server resources, so the sooner you delete them after you are done using them, the better. For most SQL Servers, there is probably not much of a performance difference if you decide to delete them later than sooner, but on a very busy SQL Server that manages lots of temp tables, the difference might be noticeable.
Ideally, for best SQL Server performance, you should avoid using temp tables in the first place. Instead, consider using one of these "lower overhead" options instead of using a "higher overhead" temp table.
• Rewrite your code so that the action you need completed can be done using a standard query or stored procedure.
• Use a derived table.
• Use the SQL Server 2000 or 2005 "table" datatype.
• Consider using a correlated sub-query.
• Use a permanent table instead.
• Use a UNION statement to mimic a temp table
Q.I have a complex query that, according to the execution plan, uses parallelism, which is what I would expect. But when the query actually runs, it doesn't appear to use all of the available CPUs to their full potential. For example, when I watch Performance Monitor when I run the query, it appears that only one of the CPUs is really being used heavily, and all the rest are not all that busy. Our server has 8 CPUS and 6.5GB RAM. Is there any way to force SQL Server to use all of the available CPUs for running parallel queries in order to fully take advantage of all the CPU power that is available, further speeding up the execution of the query?
First, let me answer your question, then I will discuss a little about what you are experiencing. No, you can't force SQL Server to use more CPUs that it wants to use when executing a parallel query. On the other hand, you can prevent SQL Server from using all of the available CPUs, but the most you can do is to tell SQL Server to use parallelism, and that is it. Only SQL Server can decide how to take advantage of the available CPUs.
Now, let's talk a little bit about how parallelism works and why SQL Server might not take full advantage of all the available CPUs in your server.
By default, if your server has two or more CPUs, and assuming parallelism has not been turned off, and the number of available CPUs for use by SQL Server has not been restricted to a single CPU, and if the Query Optimizer thinks it will take less than 5 seconds (the default value, you can change this if you want) to execute, then the Query Optimizer will consider using parallelism for executing a query.
By parallelism, what we mean is that SQL Server will consider splitting the query into two or more execution threads and running them on multiple CPUs in your server. In many cases, this can speed up the execution of the query, but not in all cases. The use of parallelism has its own overhead, and the time savings of using parallelism must exceed the overhead incurred before the Query Optimizer will use it. This makes sense.
When the Query Optimizer evaluates a query for parallelism, it must consider many factors, such as the current load on the server, the nature of the query, I/O requirements, and so on. This is a very complex decision-making process as you might expect. Once it is done with this process, it decides on the optimum query plan, which may use two or more CPUs. We can't control how many will be used, only the Query Optimizer can do this.
In regards to your query and the use of the CPUs that you saw with Performance Monitor. There are several potential explanations for the behavior you saw. Some of them include:
• The Query Optimizer decided that using more CPUs to execute the query would not provide any performance benefit. For example, the bottleneck to performance for this query may be disk I/O, not CPU, and using more threads may not have helped performance.
• Just because there was only one very busy CPU does not necessarily mean that other CPUs were not also executing part of the query. For example, it is possible that most of CPUs were involved executing the query, but that only one of the execution threads was overly busy, while the other execution threads were running, but just at a much lower level based on the amount of work allocated to them.
• Another thing that can affect your observations with Performance Monitor is that an executing SQL Server thread can jump from CPU to CPU as needed in order to optimize performance. Because of this, watching CPU activity in order to evaluate the performance of parallelism is not always very helpful.
• Perhaps the Query Optimizer made a mistake. This doesn't happen often, but it does happen.
So, unfortunately, there is no way to force SQL Server to use all of the CPU power that is available.
Q.I have a VB-based application that uses SQL Server as its back-end. I have noticed that if I run the exact Transact-SQL queries (taken from the VB app) in Query Analyzer, that they run must faster than they do from within the VB application. Why is this?

application. Here are some reasons, although this is not a comprehensive list.
• VB can introduce additional overhead to Transact-SQL that does not occur in Query Analyzer. And how you call the query in your VB program affects performance differently. For example, using a bound control to execute the query code can be slow. Also, calling the code directly within a recordset object, and not from a stored procedure, can also slow down the code.
• Is the VB app using DAO to access SQL Server? This is very slow. ADO is faster.
• Are you using ODBC or OLEDB to connect? OLEDB offers better performance.
• Is SET NOCOUNT ON being used in your application to reduce network traffic between the application and SQL Server? If not, this can reduce performance.
• What is your setting for the CacheSize property of the ADO Recordset object? If you are using the default value of 1, performance can be very slow due to unnecessary network traffic generated.
This list could go on and on. For more tips, see this page. The main point I want to make is that while SQL Server may perform its tasks quickly (you saw this when you ran the query in Query Analyzer), it is very easy to slow it down with unnecessary overhead from your VB application.
There is another lesson to be learned here also. If you are trying to performance tune a VB application that accesses SQL Server, one of the best ways to help identify if it is the VB app, or SQL Server, causing the problem, is to run each of the queries in your VB application from within Query Analyzer. If the queries run faster in Query Analyzer, then your VB application is the most likely cause of the performance problem. But if the query runs slow in Query Analyzer, then you can blame the query (perhaps it is written poorly), or the indexes (you may have poor or no useful indexes), or perhaps some other functionality of SQL Server is responsible. But until you perform this testing, you really don't know the cause of the performance problem.
Q.I have a table with over 40 million rows. I would like to partition the table base on the creation date. Does SQL Server support table partitioning?
Even with the fastest hardware and optimally-written queries, there comes a point in the size of a table where there are just too many rows to handle quickly. In this user's case, it looks like 40 million rows is that point. For you, it might be more or less, depending on your situation.
Fortunately, in SQL Server 2000 and 2005, there is what is called "partitioned views." Essentially, what a partitioned view does is to allow you to divide your data into multiple tables (instead of one very large table), and to place these tables one one or more SQL Servers. What a partitioned view does is to join the data in all of the tables so that it appears that there is only a single table.
For example, you might divide a large table by year, or month (and year), or using some method that makes logical sense. This way, each table will be much more manageable. These multiple tables might be located on a single physical SQL Server (local partitioned view), or on multiple SQL Servers (distributed partitioned view). A distributed partitioned view is implemented on what is called a SQL Server federation of servers.
Both options allow you to more easily manage data and can contribute to faster performance.

Which is faster when using SQL Server 2000, temp tables or the new table datatype?

Generally speaking, if the data you are dealing with is not large, then the table datatype will often be faster than using a temp table. But if the amount of data is large, then a temp table most likely will be faster. Which method is faster is dependent on the amount of RAM in your server available to SQL Server, and this of course can vary from server to server. The greater the available RAM is in a server, the greater number of records that can be efficiently stored in a table datatype. You may have to test both methods to determine which method is best for your situation.
Here are some reasons why the table datatype, when used with reasonable amounts of data, is generally faster than using a temp table:
• Records are usually stored in memory, not in a temp table in the tempdb database, so performance is much faster.
• Table variables act like local variables and have a well-defined scope. Whenever the batch, function, or stored procedure that created the table variable goes away, the table variable is automatically cleaned up.
• When a table variable is used inside a stored procedure instead of a temp table, fewer recompilations occur, reducing server overhead.
• Table variables require less locking and logging resources when compared to temporary tables, reducing server overhead and boosting concurrency.
If you haven't learned how to use table variables yet, you need to take the time to do so as soon as you can. They can be powerful tools in the correct situations.
Q.We are using SQL Server 2000 as our back-end and an application called VFP as our front-end. At times, the application's performance is very poor. We used Task Manager to identify that the sqlservr.exe process appears to use up memory when running queries, but it appears not to release the memory after the queries are complete. We believe that this may be the cause of our application's performance problems.
Is there any option to release the additional memory that has been used by SQL Server after a query is completed?
Based on what you have told me, I really doubt that your theory is correct. First, I am assuming that your application and SQL Server are running on the same physical server. If this is the case, the best way to reduce performance issues for either the application or SQL Server is to separate them on different servers. This is basic advice I give to everyone. SQL Server should always run on a dedicated SQL Server.
From what you have provided, it is hard for me to guess exactly what your problem is. I recommend you do both a Performance Monitor log (over at least 24 hours) and do a Profiler trace over the same period. The information provided will help you identify the performance problem you see in your application.
By the way, there is no way to tell SQL Server, through a SQL Server option, to give up memory that it has already taken and used when running a query. The closest you can come to this is to change the SQL Server's memory configuration from dynamic (the default) to a fixed amount. This way, you can limit how much memory SQL Server can access on a server. But remember, if you do this, you may be harming SQL Server's overall performance.

Q.My application makes heavy use of temp tables. Should I be creating temp tables as needed, or perhaps should I be using a permanent table over and over instead?
Assuming everything is the same, both tables will produce very similar access speeds. If your application makes heavy use of temp tables, you might consider using a permanent table for these reasons:
• A regular table already exists. A temp table has to be created, which takes time and overhead. For example, if you need to create a temp table 500 times a day, obviously a permanent table would be more efficient.
• A regular table can have preexisting indexes, while a temp table does not, unless you add them, which takes more overhead. Yes, records may take a little longer to INSERT if a preexisting index exists, but overall, the overhead is less than having to INSERT your data into a temp table, and then add an index.
• A temp table, when used, has to be managed in the tempdb database, which is additional overhead as it is created, used, and deleted. A regular table already exists, and doesn't take any extra effort to manage.
• A permanent table acts more like a global temp table than a local temp table. This may or may not be to your advantage. But even if you want the advantages of a local temp table, a permanent table will still work if you include the necessary column to differentiate the rows of different users, and the appropriate code.
Each of these potential benefits of permanent tables over temp tables is small, but in some situations, can add up to big performance savings. The only way to know for sure in your situation is to try both options.
Q.I'm having a huge problem. I had a database in SQL Server 7.0. I've now exported it to SQL Server 2000. Everything looks fine, same data, same indexes, same everything, but I have a stored procedure (select form multiple tables with joins) that runs perfectly (less than 1 sec) in SQL Server 7.0, but takes lot more time running under SQL Server 2000 (27 sec).

Most likely, you need to update the statistics of all the indexes in your new SQL Server database. The easiest way to do this is to create a database maintenance plan and select the "Update statistics used by query optimizer" from the "Update Data Optimization Information" screen of the maintenance wizard, then run it now, or schedule it to run whenever the SQL Server is not being heavily used. Updating statistics is a I/O intensive task, and it can slow down other SQL Server operations when it is running.

Q.The following two queries produce the same results. Is one or the other versions of these two queries more efficient than the other?
SELECT DISTINCT productcode
FROM sales
SELECT productcode
FROM sales
GROUP BY productcode
The goal of both of the above queries is to produce a list of distinct product codes from the sales table. The first query uses SELECT DISTINCT to accomplish this task, and the second query uses GROUP BY.
If you were to run these two identical queries in Query Analyzer, with the Execution Plan option turned on, you would find that not only are the results identical, but the Execution Plans are identical. The SQL Server Query Optimizer has the ability to "decipher" each query, determining what the ultimate results are, and to produce the most efficient Execution Plan possible.
So the answer to your question is that there is no performance advantage to using one form of the query over the other.
Q.Which of the following joins will produce better performance?
ANSI JOIN Syntax
SELECT fname, lname, department
FROM names INNER JOIN departments ON names.employeeid = departments.employeeid
Former Microsoft JOIN Syntax
SELECT fname, lname, department
FROM names, departments
WHERE names.employeeid = departments.employeeid

SQL Server supports two variations of performing JOINs: the ANSI JOIN syntax and the former Microsoft JOIN syntax. Both produce identical results and identical performance. There is no performance reasons to use one form of the JOIN over the other.
On the other hand, there are two good reasons why you should use the ANSI JOIN syntax over the former Microsoft JOIN syntax. First, it is more portable because it is the ANSI standard, and second, because eventually Microsoft may eliminate support of the former JOIN syntax.

Q.We have two SQL Server servers, both with virtually the same hardware. One is a development server, where we create and test Transact-SQL code, and the other is our production SQL Server. Here's my problem. When a particular view runs on the development server, it takes about 30 seconds. But when I run the identical view on the production server, the view takes over 10 minutes to run. The databases on both servers are almost identical. We periodically copy the production database to the development server so that we are working with the same data and the same indexes. What could be causing the difference in time for this particular view to run?
Assuming that the servers and the databases are virtually the same, and the statistics have been updated on both of them, here's what I suggest you check. First, do an Execution Plan of the view that is causing the problem on the development server. Find out if this particular view is using parallelism as part of the execution plan. I am guessing that this is the case.
Now, do an Execution Plan of the view on the production server. Find out if this particular view is using parallelism as part of the execution plan. My guess is that it is not.
If I am write so far, then check the SQL Server Parallelism setting for each server. I bet that it is set to "Use All Available CPUs" on the development server, and set to use only one CPU on the production server. If I am right, then set the production server to "Use All Available CPUs," and the performance difference of the same view on both servers should be resolved, and performance should be virtually identical on both servers.
By default, SQL Server is set to use "Use All Available CPUs" on a server. If this is not your current setting, then this has been changed by someone.
Essentially, this setting tells SQL Server whether or not to try and attempt to execute queries in parallel using multiple processors. If the default "Use All Available CPUs" option is selected, then SQL Server will attempt to run queries in parallel, using multiple CPUs. This can sometimes result in dramatic performance differences in a query's performance. But if this option is turned off, then SQL Server will only use one CPU to execute a query, which can be much slower.
Keep in mind that if SQL Server is set to use parallelism, that it may not always use it. If the server is very busy, it may not be used, but when the server is less busy, it may be used. This means that sometimes the view will be fast, and other times, slow (depending if parallelism is being used or not).
On the other hand, sometimes parallelism can cause queries to take longer to execute than if parallelism is not used. In these cases, you can use a Query Hint to turn off parallelism for the odd-acting query.
Q.I have some Transact-SQL code that I run as a scheduled job in SQL Server, and it seems to run much longer than I would expect. If I run the same code in Query Analyzer (as a script) it takes about 20 seconds to run. If I put the code in a stored procedure, it takes about 10 minutes to run. I have added "set nocount on" in the code and have even recompiled the stored procedure, but it doesn't help. When I check the query plan produced by the script, and check the query plan produced by the stored procedure, they are different. The script's query plan shows that an index is used, but the query plan for the stored procedure does not use the index. What is going on?
If you see different query plans produced by what is essentially the same code, as you have in your script and stored procedure, my guess is that the Query Optimizer is not able to come up with consistent query plans because the index statistics for the table or tables affected by your code are not up to date.
What you need to try is to reindex the tables in your database, and also ensure that the database settings, "Auto Create Statistics" and "Auto Update Statistics" options are on. By doing this, you can ensure that your indexes statistics are up-to-date, and this will help ensure that the Query Analyzer can work properly to select the optimum query plan to make your code perform as fast as it can.
Q.I don't like to delete records in our production database, even though we will never need them again. Because of this, I have included an extra column in every table called the "ACTIVE_FLAG" that is set to one character in length that contains either an "a" for active records or a "d" for deleted records. How does this affect performance?
I have seen some database designs like yours, but generally speaking, it is because there is a fairly good chance that the records will be needed again, and flagging the record is an easy way to toggle access to them or not. But if you don't think the records will ever be needed again, I wouldn't recommend your method because it will negatively affect your server's performance.
Even though you may not need to access the inactive records and are marked accordingly, they still take up disk space. This is space for the record itself, along with any space taken by indexes you have on your tables. Unnecessary data in a table forces SQL Server to perform more I/O than necessary when reading data pages, which in turn slows down performance. Ideally, the only data you want in your database is data that someone needs. If nobody is using data in a database, it should be removed.
If you think there is a possibility that you might need the records again, another option is to archive the records in another database. For example, you could flag a record, like you do now, to render it inactive. Then periodically, such as one a month, or a quarter, or a year, move those records from the production database to another database on the same or different server. This way, the records are still available, but they don't take up unnecessary space in the production database, hindering performance.
Depending on the proportion of inactive records to active records, you may find that after archiving inactive records, that your database's performance increases significantly.
Q.In my stored procedures I have the option of calling the same data from either a table or a view. For best performance, should I be calling a table or a view?
Anytime you call data from a view instead of a table, there is additional overhead, which can hurt performance. Because of this, optimum performance can be had by calling your data directly from a table, not a view.
If you don't use the stored procedure often, the performance difference will not be noticeable if you call a view instead of a table. But if you do call the stored procedure often, then you will indeed be able to see a difference.
Standard views (excluding indexed views in SQL Server 2000 and 2005) are best used as a way to ensure users are only able to see the data from a table that they are supposed to see. Views should be a part of your overall security strategy, and used only for this purpose.
Q.Can the use of NULLS in a database affect performance?
Can the use of NULLS in a database affect performance?
Last Update : 2006/10/31
Rating : Not Rated

Yes, SQL Server's performance can be affected by using NULLS in your database. There are several reasons for this.
First, NULLS that appear in fixed length columns (CHAR) take up the entire size of the column. So if you have a column that is 25 characters wide, and a NULL is stored in it, then SQL Server must store 25 characters to represent the NULL value. This added space increases the size of your database, which in turn means that it takes more I/O overhead to find the data you are looking for. Of course, one way around this is to use variable length fields instead. When NULLs are added to a variable length column, space is not unnecessarily wasted as it is with fixed length columns.
Second, use of the IS NULL clause in your WHERE clause means that an index cannot be used for the query, and a table scan will be performed. This can greatly reduce performance.
Third, the use of NULLS can lead to convoluted Transact-SQL code, which can mean code that doesn't run efficiently or that is buggy.
Ideally, NULLs should be avoided in your SQL Server databases.
Instead of using NULLs, use a coding scheme similar to this in your databases:
• NA: Not applicable
• NYN: Not yet known
• TUN: Truly unknown
Such a scheme provides the benefits of using NULLs, but without the drawbacks.

Q.We have a growing population of Access users, much to my chagrin! My question is whether there is a way of lowering these user's priorities on the server/database. We are hoping that if the performance using Access is bad, that this will dissuade users from using Access.

At our company, we forbid users from accessing any SQL Server databases using Microsoft Access. We do this because of extra overhead put on SQL Server from using Access as a client, and because most of these user's don't really understand what they are doing, and often create queries that unnecessarily use up SQL Server resources.
Unfortunately, there is no way to identify Access users and assign them a lower priority. All connections are created equal, and all connections have the same access to SQL Server resources.
The closest you can come to achieving this is to use the "query governor cost limit option". This option is used to specify the maximum amount of time (in seconds) that a query can run. If the query analyzer determines that if any query will exceed the maximum time, then that query will be aborted by SQL Server. Unfortuantely, this option affects all SQL Server users, not specific users, such as Access users.
The easiest way to keep them out is to deny then access in the first place.
I understand that should avoid the use of SQL Server cursors in order to increase an application's performance. Can you provide some ways that I might be able to avoid using cursors in my applications?

A SQL Server cursor should only be considered in situations where you need to scroll through a set of rows, and then based on criteria you specify, do something potentially different to each row (and in many cases even this can be done using a standard query). If what you need to do to each row is the same, then you should definitely avoid a cursor, and instead use a Transact-SQL query.
Keep in mind that one of the biggest benefits of using a relational database, such as SQL Server, is that is acts on an entire sets of records in one fell swoop. This results in very fast performance. But if you have to perform different actions on each different record, then you often have to use a cursor to accomplish your goal. Because records have to be examined one-at-a-time, cursors often result in poor performance.
While it is true that a query will always outperform a cursor (assuming they are performing the same task), this doesn't always mean that you shouldn't use a cursor in some cases. For example, sometimes I need to perform a fairly simple task on an occasional basis. In these cases, I often use a cursor because they are fairly easy to write, and because performance is not an issue for the task at hand.
On the other hand, if the task is repeated often, and performance is an issue, then you should avoid cursors if at all possible. Some ways to avoid cursors include:
In the past couple of weeks, we have been running some jobs that take a lot of the CPU usage. I would like to know if there is any possible way to :

1. Distribute between processors, the scheduled jobs.
2. Minimize the maximum amount of CPU usage that a job can take.

If there are any other solutions, I would grateful to accept them and learn them.
Your question is a very common one, unfortunately, there is no good answer. While SQL Server has some limited ability to control how many CPUs it uses, and there is the ability to give the mssqlserver service a priority boost over other applications running on the same physical server, there is no way to assign specific SQL Server processes or jobs to a specific CPU. Nor is there any way to assign only a portion of a CPUs resources to a specific process or job.

I assume you are asking this question because you want to more efficiently make use of the CPU resources on your server, and perhaps because the CPU utilization of your server has hit a bottleneck.

Assuming you have a CPU bottleneck, these are some of the most common ways to overcome it:
• Get faster CPUs, or add additional CPUs.
• Get CPUs with a larger L2 cache.
• Tune your application so that it doesn't have SQL Server does not have to access the disk as often. For example, add indexes so table scans aren't needed, normalize your database to eliminate redundant data, etc.)
• Schedule jobs so that they don't overlap.
• Tune your queries to reduce the CPU load.
• Move some of the processing load to another SQL Server.
• Consider turning on Windows NT fibers.
• Be sure that both OLTP and OLAP queries are not being run on the same server. These different database applications should be performed on separate servers.
• Ensure that SQL Server is the only application running on the server (other than server management utilities).
• Remove or disable all unnecessary services.
I mentioned earlier that SQL Server has some limited ability to control the number of CPUs it can use, and that there is the ability to give the mssqlserver service a priority boost over other applications running on the same physical server. Generally, if you are running SQL Server on a dedicated physical server, then there is little or no advantage in taking such action, and I don't personally recommend playing with the SQL Server settings that permit this.

If you are running SQL Server on a shared server (which I don't recommend from a performance perspective) fiddling with the above options might provide some benefits. But be prepared to do a lot of testing to be sure you get the results you expect.
Q.I would like some input on the best, most efficient way to store images for use on a website. The two main options that I see are storing images as a BLOB in SQL Server, or storing the URL of the file in SQL Server, and actually storing the image file on a file share. I have heard that storing in a SQL database as BLOB can be somewhat cumbersome when it comes to retrieval. What do you recommend?
You are correct in stating that storing BLOB objects in SQL Server is cumbersome. It is also slow.
The most commonly accepted way of managing images for a website is not to store the images in the database, but only to store the URL in the database, and to store the images on a file share that has been made a virtual folder of your web server.
Q.I work in a SQL Server environment where we have literally hundreds of stored procedures in one database that are utilized for an OLAP application, and likewise for an OLTP application.
I have seen in many instances where a stored procedure is used solely because one can pass parameters to it and use these parameters in the WHERE portion of a SELECT clause. These stored procedures do nothing more than return a result set.
I was also told that SQL Server has a FIFO queue where only a certain number of pre-compiled stored procedures reside. Furthermore, I was also told that I should consider using user-defined functions that does its SELECT from a view(s) and uses any parameters passed in as delimiters in the WHERE portion of the SELECT clause.
Since functions are not pre-compiled but stored procedures are, is there any performance gain from using user-defined functions with views as opposed to stored procedures?
Q.Before SQL Server 2000, user-defined functions were unavailable. Because of this, stored procedures were often the only way to emulate what a user-defined function can do now. So the question becomes, are there any advantages of converting my current stored procedures to user-defined functions?
I want to break the answer for this question into two parts: performance and convenience issues. Let's start with performance first.
For the most part, rewriting stored procedures as functions will not give you any performance benefits, and quite possibly, they may cause a performance hit because of the extra overhead they incur as compared to stored procedures. The amount of the performance hit, if any, will depend on how the function is written and what it is doing. Keep in mind that user-defined functions are also pre-optimized and compiled similarly to stored procedures (unlike what you have been told). Even so, they have more overhead than corresponding stored procedures.
From a convenience standpoint, there are some reasons to rewrite some stored procedures as user-defined functions, assuming performance is not a major issue. Some of these include:
• The ability for a user-defined function to act like a table gives developers the ability to break out complex logic into shorter code blocks. This will generally provides the additional benefit of making the code less complex, and easier to write and maintain.
• If you want to be able to invoke a stored procedure directly from within a query, then rewriting a stored procedure as a user-defined function would be worthwhile.
In your question, you say, "I was also told that I should consider using user-defined functions that does its SELECT from a view(s) and uses any parameters passed in as delimiters in the WHERE portion of the SELECT clause"
What you were told is not exactly correct. What you described in your question is what is referred to as an inline user-defined table-valued function. An inline function returns a table data type and is an alternative to using a view, as a user-defined function can pass parameters into a T-SQL select command, and in essence, provide you with a parameterized, non-updateable view of the underlying tables. This can be convenient in many cases, but it won't perform faster than performing the same task in SQL Server. In addition, this type of function is more limited in its functionality than a stored procedure.
I also want to mention that stored procedure query plans are not cached in a FIFO (first-in, first-out) manner in the SQL Server data cache. They are actually cycled out of the data cache based on how often they are used. Because of this, often used query plans can stay in cache for a long time, while seldom used query plans will be cycled out, making room for other stored procedures as needed.
Q.I have a "main" table named Property. This table has approximately 60 columns and over 1 million records (this record count will increase significantly over time). Of these 60 columns, approximately 25 are IDs which reference values in multiple different lookup tables. In any given view, query or store procedure, I end up joining 12-25 tables in an effort to return the appropriate data.
I've been under the impression that joins are costly and should be kept to a minimum. Is this true? I have tried several different ways of retrieving this information (i.e. user-defined functions, table datatypes) but nothing is as efficient as joins. I'd like to improve performance if possible. Is there a better way?
Yes, joins can be costly, but of course they cannot be avoided in a relational database. Generally, I prefer to avoid joins that join more than 4 tables. This is not always possible, but that is my goal.

If you have to join from 12-25 tables, this seems to me (and this is just a guess, as I don't know the database) that the database's design is less than optimal. Most likely, you cannot change the design, but if you can, then this would be a great first step to reducing the number of tables in your joins.

One of the easier ways to change the design is to selectively denormalize the tables so that fewer joins are required. This would require the minor changes of some tables, and perhaps any applications that access them, but again, this may not be possible.

Another option would be to create some new tables that are denormalized and that are updated using either triggers (if you need real time updating) or by DTS (if real time is not needed). This way, you can query one or more denormalized tables for the data. Of course if the tables are huge, like yours, these options could present some performance problems.

Another potential option is to create a data warehouse or cube to summarize the data. This may reduce the amount of data stored (much data is summarized), but it would not be real time, although it could be close with regular updates.

Of course, none of the above options may work for you. If that is the case, then joins, using appropriate indexes on the joined columns, are your best bet. Other options, such as using temp tables, sub-selects, and others, probably won't be as fast as the typical join.
Q. Being a developer, I am not an expert at SQL Server (I know just enough to get done what I need); however, many of our clients are asking us for growth estimations of our database. How much MBs/GBs will a database take up? Also, does SQL Server 7 have the ability to estimate how big it will grow from 100,000 records to 200,000?

First of all, you should refrain from messing around with SQL Server's system tables, unless you are told to and are guided by Microsoft. You are running an unsupported system if you modify the system tables!
But if you don't want to listen to the above advice, why would one want to create a trigger on system tables?
The following two reasons seem to be among the most often asked about:
• Creating a trigger on sysobjects in order to take some action when a new object is created or an existing object is modified.
• Creating a trigger on sysusers or sysxlogins in order to take some action when a new user is added or already existing ones are about to be modified.
To be fair, the build-in support in SQL Server 2000 for these reasonable requirements is poor.
Having said that, it is technically possible to create trigger on system tables. Or at least to create trigger on some of the system tables. But they are not guaranteed to fire.
Now, what is meant with "on some of the system tables"? Well, to put it in easy words, there are two types of system tables: those that accept triggers, and those that don't. While:
use msdb
go
create trigger test on dbo.backupfile for insert, update, delete as select 'Hallo Welt'
drop trigger test

The command(s) completed successfully.
obviously finishes successfully, but you will receive:
Server: Msg 229, Level 14, State 5, Procedure test, Line 4 CREATE TRIGGER permission denied on object 'sysobjects', database 'master', owner 'dbo'.
when trying to execute the following statement:
use master
go
create trigger test on dbo.sysobjects for insert, update, delete as select 'Hallo Welt'
So, what can you do in such cases where you need to determine modifications on the system tables? Well, you can run a trace to capture these events. You can also activate SQL Server's build-in C2 auditing, which is likely to be an overkill, since it captures everything, use some third-party tools that offer such functionality, or simply wait for the next version with its' improved functionality in this area.

Q.I have an INSERT INTO SELECT query which appends about 500,000 rows of data to a table. The query takes about 22 minutes to run. I would like it to run in less than 5 minutes, if possible.

One solution I thought of was to append the 500,000 rows of data to an array (instead of a table). And when the array was filled (with 500,000 rows of data), to do a one-time transfer to a table.

My questions are, I don't know how to transfer an array to a table, and I even if I can, I don't know if it will be any faster than using INSERT INTO?
The absolutely fastest way to import data into a table is to use the BULK INSERT command. This command has many options which affect performance, so you will want to examine them closely.
Also, be sure the table you are importing into doesn't have any indexes, as they will slow down the import process. If your table has indexes, then you may want to drop them before the import, then re-add them after the import.
For some additional speed, be sure the file you are importing is on the same physical server, but a different physical drive, as the database you are importing into. This reduces network latency, and by placing it on a drive other than the drive that is holding your database, I/O performance problems will be minimized.
Q.I want to add an identity or unique identifier column to a table in order to ensure that each row in the table is unique. From a performance perspective, which one should I use?

One of the key aspects of database table design is to ensure what is called entity integrity. What this means is that you need to ensure that each row in your database tables is unique. If you don't take the proper precautions, it is possible that one or more rows of your table might be the same. If this happens, what does this really mean (duplicate records)?
One of the more common ways to guarantee entity integrity is to create a primary key. A primary key, which is an index created on one or more columns, is used to guarantee that no duplicate records are entered into a table. If you try to enter a duplicate record, SQL Server won't let you and will give you an error message.
Some tables lend themselves to uniqueness and it is easy to find a column that you know will always be unique, and adding a primary key to the column is no problem. But in other cases, there is no single column that is unique. Instead, it may take two, three, four, or more columns to uniquely identify a column. If you want, you can create a primary key on multiple columns.
Unfortunately, creating a primary key on multiple columns has a major drawback. And that is the index that is created to enforce each row's uniqueness may get very large. The wider the index, the larger the physical index will be. This can hurt performance because it takes SQL Server more resources to maintain or to query wider indexes than it takes it to maintain or to query narrow indexes.
So is there any way to avoid creating primary keys with wide indexes? Yes, and that brings us to the point of our question. Instead of creating a primary key with a wide index, what you can do is to add either an identity or a unique identifier column to the table, and use this column as the primary key. This will greatly reduce the width of the index created, reducing the physical size of the index, and speeding up SQL Server's access to the table.
What an identify field or a unique identify column does is to automatically create a unique value for each row created in your database. You don't have to worry about creating these values for yourself, this is done automatically by SQL Server. But what you must decide is whether to use an identify field or a unique identity column. Ideally, from a performance perspective, you want to use the one that creates the smallest possible physical index in order to maximize performance. So which one of these types of columns should you use?
Let's first look at identify values. An identify value is an automatically incrementing integer that starts counting from a base seed value (such as 1, or 100, or 1000, whatever integer you choose), and increments by a value you specify, which is called an increment (such as 1, -1, 10, whatever integer you choose). The default seed is 1, and the default increment is 1. Because the values are never the same (they increment), an identify value often makes for a good column on which to base a primary key.
When you create an identity column in a table, you must also specify the data type as some form of integer. As you probably know, SQL Server supports several different integer types, each with different sizes (widths). They include:
Bigint 8 bytes -263 to 263-1 (SQL Server 2000 and 2005 only)
Int 4 bytes -2,147,483,648 to 2,147,483,647
Smallint 2 bytes -32,768 to 32,767
Tinyint 1 byte 0 to 255
When creating an identify column, always select the narrowest integer type that will meet your needs. If an Int will do, then don't use Bigint, as you will just unnecessary be increasing the physical size of the index, which can hurt SQL Server's performance.
A unique identifier column, like an identify column, is used to create unique values for each column in your table. But instead of using an incrementing value, instead it creates a unique value based on a special internal algorithm that creates what are called a Globally Unique Identifier (GUID), which is 16 bytes in size.
Right away, you should notice which unique column type — identify or unique identifier — will provide the best performance as a primary key. Because the unique identifier column will always be 16 bytes wide, it will always create a primary key index that is larger than any of the identity column options, and unless you have a special reason that is outside the scope of this question, should not be used as the basis as a primary key.
Instead, use one of the variations of the identity column, selecting the smallest one that meets your data's needs. This way, you can ensure that index created by the primary key is as small as it can, helping to ensure overall better performance of your database, and at the same time ensuring entity integrity of your database tables.
Q.I have learned a lot from your articles on SQL Server clustering. We are moving towards a clustering environment very soon, but there is one problem that I can't solve before we go live. Restoring a Master database in a clustering environment is a big problem. This is especially important in a disaster scenario where the Master database goes corrupt and we need to restore from a previous backup of Master database. To restore a Master database, this requires the server to be started in a single user mode, and that’s where the problem comes in, since Cluster Administrator is the only place to do any restarts of the server, etc. Surprisingly I couldn’t find anything on this subject on the net, and the DELL engineers we asked were clueless as well.
You would think that an important topic like this would be widely covered, but since it is not, here's the relatively easy and straight-forward solution for SQL Server 7.0 and SQL Server 2000. Just follow these steps.
1. Identify which node SQL Server is currently running on, such as node1.
2. From Cluster Administrator, take the SQL Server resources off-line.
3. Go to the node running SQL Server (see step 1) and start it from the command prompt using:

sqlservr.exe -c -m
4. Restore the master database using Query Analyzer or Enterprise Manager.
5. Quit SQL Server from the command prompt.
6. Using Cluster Administrator, restart the SQL Server services.
I am after a some advice for an installation that I am about to start.

The company I am currently working for has just invested heavily in a SAN. It is located in two sites with high speed connections. The solution is fully fault tolerant and each server is configure with 2 x 2GB fibre cards (HBA) which connect to 2 SAN fabrics (each with 2 fibre switches). There are 2 disk controllers in each of the HSG80's. It's a Compaq SAN which uses StorageWorks software to provide the failover for the multiple paths. As it is in the SAN, I can also perform snapshot backups.

I have suggested a SQL 2000 cluster to meet the high availability and business resumption (DR) demands that our company must meet. (In fact, I am about to configure a test environment!)

I have had a heated discussion with the development manager who doesn't understand how a cluster works (but he thinks he does!). He cannot understand why clustering is better than replication. The fact that we can do both doesn't help. He can't see that a Cluster offers anything.

I have tried to explain the pros and cons of replication verses clustering, but haven't had much luck.

In your opinion, what is the best configuration? I need to offer high-speed database performance and data protection. I have thought of using log shipping to another disk on the SAN for data protection, but would clustering be better?
Here are your viable options, along with the pros and cons of each.

1) Clustering
• Offers hardware, OS, and software fault tolerance
• High reliability
• Automatic failover
• Very fast switchover
• No lost data
• Works with most application software
• Does not offer data fault tolerance
• High cost
• Coupled with SAN as shared array, can offer great performance
• Nodes have to be located physically close together
2) SAN Snapshots
• Offers hardware, OS, software, and data fault tolerance (assuming two
servers, primary and hot spare), and some form of SAN snapshot/mirroring is used.
• High reliability
• Manual failover
• Potential for some lost data (time between snapshots). Transaction consistency is not guaranteed.
• Switchover is often 1+ hours
• High Cost
3) Log Shipping
• Offers hardware, OS, software, and data fault tolerance (assuming two
servers, primary and hot spare)
• High reliability
• Potential for some lost data (time between log backups). Transaction consistency is not guaranteed.
• Manual failover
• Switchover is often 1+ hours
• Lower Cost
4) Replication
• Designed for data, not for schema modifications, because of this, hard to
implement for stand-by server
• Replication is unreliable
• Manual failover
• Potential for minimal lost data (time between transactions). Transaction consistency is not guaranteed.
• Switchover is often 1+ hours or more
• Lower Cost
5) Disk Mirroring (SQL Server 2005)
• Offers hardware, OS, SQL Server, and data fault tolerance
• Automatic and manual failover options availables
• Failover can be fast
• Mirrored databases can be at different physical locations
• Does not require any special hardware
• Moderate cost
• Database-, not server-centric
• May not work automatically for all application software
Q. If I have two SQL Server 2000 servers running separately, and want to create a clustered environment with them. Do I need to build the Windows 2000 cluster first, and then re-install SQL Server 2000, or can I cluster the database engine post-installation?

Although it is a pain, here is what I suggest:
• The very first step is to backup your databases and test your backups to be sure you can restore them.
• Verify that the hardware you have is on the Microsoft Hardware Compatibility List.
• Attach the shared disk array to the two servers, and test the setup to see that everything works correctly.
• Most likely, you don't have Windows Windows 2000 Advanced Server on your two servers, which is necessary for clustering. Because of this, you must rebuild the servers from scratch. By this, I mean to wipe out the previous operating system and then install Windows 2000 Advanced Server. You can install the new operating system before or after you install your shared disk array.
• Once you have Windows 2000 Advanced Server properly installed, then install and test the Windows 2000 Clustering Service.
• Now you are ready to install SQL Server 2000 Enterprise Edition, and then test it.
• Finally, restore your database backups.
If, by chance, you already have Windows 2000 Advanced Server installed on these two servers, and if you already have SQL Server 2000 Enterprise Edition installed, you still have a lot of work ahead of you, but a little less. Here's what I recommend in this case:
• Backup your databases and test your backups to be sure you can restore them.
• Verify that the hardware you have is on the Microsoft Hardware Compatibility List.
• Attach the shared disk array to the two servers, and test the setup to see that everything works correctly.
• Uninstall SQL Server 2000 Enterprise Edition from both nodes. Leave the databases files intact so that you can reattach them later, if you like.
• Install the Windows 2000 Clustering Service.
• Reinstall SQL Server 2000 Enterprise Edition using the clustering option in Setup.
• Finally, restore your database backups, or if all the MDBs and LDFs are still intact, you can reattach them instead of restoring your backups.

Q.We are having a problem with our SQL Server cluster. Everything was originally set up and configured properly, then for some reason the SQL Server Fulltext service failed on the primary node, and will not run on this node any longer. I changed the configuration of Fulltext so that this failure would not prevent the group from failing over, and this works.

However, I'm trying to get to the bottom of why Fulltext failed, and in the event log for the node I see the following message:

'An error occurred during the online operation for instance : 80040154 - Class not registered'

So at this point I am thinking its time to reinstall the SQL cluster. The problem is that the SQL Setup process no longer recognizes the Virtual Server! How should I go about uninstalling at this point?
While I have not seen your specific problem, I have seen similar problems, especially the problem of SQL Server setup not being able to see the virtual server. In these cases, even though you will not like what I have to say, the best option is to rebuild your cluster from scratch.
The purpose of a cluster is to ensure high reliability, and you can't guarantee this if a cluster begins acting up like you describe. Whenever I see the slightest hint of a problem, I start over from scratch, reinstalling the operating system SQL Server, and service packs as appropriate.
Only when I get a clean install, with no odd behavior, do I believe the cluster is ready for production.
Q.We currently have five SQL Server servers managing over 100 databases. We would like upgrade to the newest version, and at the same time incorporate clustering. What are our clustering options, and what would you recommend?
It looks like you have a big project ahead of you. There are several options, so let's take a look at each, exploring their pros and cons.
One option would be to purchase two very large SQL Servers and create an Active/Active cluster where the 100 databases are split more or less evenly between the two nodes of the cluster. The pros of this would be less cost and less administrative headache. The cons are that these two servers, even though they are large, may not be large enough to handle the entire load. In addition, whenever a failover occurred, performance would really suffer because a single node would have to support all 100 databases.
A second option would be to purchase four to six very large SQL Servers boxes, creating a multi-node cluster, where you have one node as the backup for the others in the cluster. This way, you could split the 100 databases among mutiple servers instead of only two. The pros of this are the ability to distribute the load over more nodes. The cons of this include the complex set up and administration, and it is very expensive for the hardware and the software licenses.
A third option is to purchase five two-node active/passive clusters. This should give you the server resources you need to handle the 100 databases and give you the option to more easily expand should you find that the number of databases, or the load on the servers, grows. The pros of this include: the capacity to easily handle the current load; 2) The capacity to more easily grow into new capacity by adding additional two-node clusters if necessary; 3) Greater performance if a failover should occur, as the passive node will take over fully for the active node (assuming hardware is the same). Cons of this option include: 1) Potentially high cost, 2) Some greater administrative needs, as you have more servers to monitor and manage. A variation of this would be to use two or three Active/Active nodes instead, as this would save some hardware costs. But this would mean that if a failover occurs, that performance during this time would suffer.
Something else you have to keep in mind is how to handle the shared array for each of the clusters. Your best bet, in any of the options described above, is to use a fibre-attached SAN device. This will offer the greatest performance, flexibility, and cost-benefits.
As you might guess, the best solution for you will depend on many factors, some of which I have not described here. My belief is that you want to remain as flexible as possible, leaving room for growth, even if this means you may have to spend a little extra on hardware and software to accomplish this goal.
Q.I have two Microsoft 2000 Advanced servers that I want to cluster. Is it advisable to run both Terminal Services (application mode) and SQL Server on the same cluster together?

Technically speaking, what you want to do is possible, but I wouldn't recommend it. One problem is that Terminal Services (Application Mode) has some limitations under clustering, and is not fully cluster aware.
But the biggest problem you will run into is performance. Terminal Services (Application Mode) is a huge resource hog. The same is true for a very busy SQL Server. If your SQL Server activity is low, and your Terminal Services sessions are few, then they might be able to co-exist together, but I still would shy away from it because of the added complexity of putting both on the same cluster. Clusters still are a little tricky to set up and manage, and adding Terminal Services and SQL Server into the mix at the same time is just asking for trouble.
I have a question on how to uninstall a SQL Server 2000 instance installation from Windows 2000 cluster. I tried to use the setup program on the install CD, but I am not getting the option to uninstall when I run the setup program.
Let me give you brief scenario as what has happened:
1. Windows 2000 cluster installed successfully.
2. Installed SQL Server 2000 instance successfully.
3. Created a new group and added all SQL-related resources to it.
4. Decided to remove the Full Text Search resource from the new SQL group created above, as we are not using this service.
5. I deleted the Full Text Search resource and said OK without realizing that this has dependency on the SQL server resource.
6. This lead to deletion of SQL server, the SQL virtual IP, and SQL Virtual Name resources.
7. I have tried to re-add each resource manually, but this doesn't work. And now, I can't even install SQL Server 2000.
Q.Any suggestions to repair or uninstall SQL Server 2000 will be greatly appreciated.
While I can't say this with 100% confidence, I believe the registry on your server has become corrupted, and because of this, the uninstall option of the SQL Server Setup program believes that SQL Server is no longer installed, even though it is.
I had a similar problem once, where the registry got corrupted, and my "solution" was to completely reinstall Windows 2000, Windows clustering, and SQL Server clustering from scratch. This might seem a little radical, but it was the only option I felt comfortable with. While I have spent more than my share of time fiddling with the registry, I don't like to on servers that are as mission critical as a SQL Server cluster.
Of course, if you have an image backup of your server before you installed SQL Server 2000, then you probably would be able to restore that backup and save you some time. But if you don't have a backup, I would suggest you reload your software from the very beginning.
So how can you prevent this from happening again? As you may know, for some reason that only someone at Microsoft knows, when SQL Server 2000 instance is installed on a cluster, it automatically installs the SQL Server Fulltext resource. You don't have a choice.
If you don't plan to use this feature, don't try to delete the resource, as you did, instead, do the following steps:
• Start Cluster Administrator.
• Display all of the SQL Server services on the right side of the screen.
• Right-click on the "SQL Server Fulltext" resource, and choose "Properties."
• Select the "Advanced" tab from the SQL Server Fulltext Properties screen.
• Deselect the checkbox next to "Affect the group."
• Click "OK."
• (Optional) Right-click on the "SQL Server Fulltext" resource, then choose "Take offline".
By taking the above steps (except the last one), you have told the Cluster Service not to automatically failover all of the SQL Server resources should the Full-Text service fail for any reason. If it should fail, you will see a warning message in Cluster Administrator, but no other cluster resources will be affected. This is a very important step you should take, otherwise you might find your SQL Server services failing over for no apparent reason should the Full Text Service fail.
The last step, the one I call "optional," can be done if you want to take this unused resource offline. Doing so will slightly reduce overhead on the SQL Server cluster. I describe this last step as optional because the amount of overhead it will save is minimal, and because it will cause the name of the group that includes this resource to to display a red "x" next to it, which may be disconcerting to some people, although it is no cause for worry.

Q.I have recently set up a two-node cluster using Windows 2000. I have a single RAID 5 shared disk array that is sliced into two logical disks which appear in Windows 2000 Disk Manager as Disk 1 and Disk 2. I have also created one primary partition on each disk and have assigned drive letters of Q: & W: respectively. I have used Drive Q: for the Quorum Resource and W: for the data files.
I have successfully installed one default instance of SQL Server on the cluster. My intention is to run multiple instances of SQL Server on the cluster in an Active/Active configuration. However, when I try to installing the second instance of SQL Server, I find that Disk W: is no longer available as a disk resource, and that the only disk available is Q: How can I create more than one physical disk resource using the same disk? Or, do I have to slice the array further and use logical disks 3,4,5 and so on?
In an Active/Active configuration, each instance of SQL Server must have its own dedicated shared array space, that is why you are seeing the problem you are.

If you want to run more than one instance of SQL Server in an Active/Active configuration, you must have at least two shared disk arrays available, one for each instance of SQL Server. These disk arrays must be physical, not logical. This is because they must run independently of each in order for failover to occur properly. You can't create more than one physical drive out of one physical drive.
You have no choice but to reconfigure your current array so that you end up with at least two physical arrays, or you must get more physical drives.
In addition, the Quorum drive should also be on its own physical disk array, which means you should have at least three shared physical disk arrays for your Active/Active SQL Server clustering configuration.
Q.I was told by a friend, that at his last company they had a SQL Server environment which contained four clustered servers that were load balanced. So in effect, when they pulled the plug on one server, the other three would adjust and pick up the load.

From my experience of setting up and using Windows and SQL Server clustering, I cannot for the life of me understand how this is possible. Could you please correct my thinking here and let me know if this is possible using native Microsoft clustering, or load balancing, or both?
You are correct, and your friend is confused. Windows clustering, nor SQL Server clustering, does not natively support load balancing. Clustering's sole purpose is for boosting fault tolerance. When a cluster fails over, all that happens is that the instance of SQL Server running on the failed server is moved to another server. If the server that is being failed over to already has an instance of SQL Server running on it, it will now have two instances, and performance mostly likely will suffer.
Q.We have a SQL Server 2000 cluster. I have been trying to get SQLMail and the SQLAgentMail to work with it, but have failed. What am I doing wrong?

SQLMail is not officially supported under SQL Server7.0 clustering.
SQLMail in SQL Server 2000 is supposed to work under clustering, although I have not personally tried it. In fact, I don't recommend it as you must add mail clients to both servers, and I don't like the idea of adding mail clients (with all the security and other reliability issues they represent) on a production cluster. But that is your choice.
To use SQLMail with SQL Server 2000 clustering, keep the following in mind:
• Each node must have a mail client, such as Outlook, installed on it.
• Each instance of SQL Server 2000 in the cluster must use the same service account.
• Each node in the cluster must have a MAPI profile with an identical name and settings.
Even if you setup everything correctly, SQLMail under SQL Server 2000 clustering may still not work. I understand that there is a problem with SP2 that prevents it from working correctly, and that Microsoft has a hotfix for the problem. So if you cannot get SQLMail to run correctly, you may need to contact Microsoft Support.
In SQL Server 2005, SQLMail is no longer used, and has been replaced with the new SMTP mail option. This is supported under clustering and does not require that a mail client to be installed on each node of the cluster.
Q.We are planning a new network with a SQL 2000 cluster. We intend to have the two servers separated from each other by 100 meters. Will the cluster's shared disk array be a single point of failure for us.

A cluster is designed to provide fault tolerance for the physical servers, the operating system, and the application software, but not for data. Fault tolerance for data should be handled by a fault tolerant shared disk array, log shipping, or database mirroring (in SQL Server 2005).
So in a sense, the shared disk array is a single point of failure should the entire array fail. But if the shared array is well designed and employs fault tolerance features, it should be very reliable. This is a good reason why you should not skimp on spending money on a shared disk array for a SQL Server database cluster.
If you intend to separate the two nodes of the cluster by 100 meters, you will have no choice but to use Fibre Channel connections to the shared array, as SCSI is not supported for such a long distance.
Recently, due to corporate requirements, we had to change all the IP numbers on all our servers, including our clustered SQL server. Since this change, we cannot set up an ODBC connection or connect to Enterprise Manager using the virtual IP address or the virtual name of the server, although we can using the IP address or name of Node 1 of the cluster to make connections.
We can ping the name and number of the server. We can even map a drive to the server, but that's it. Did we miss something when we changed IP addresses? I've spent hours scanning over them and have found the numbers to be correct as far as I can tell.
Q.We're using SQL Server 2000. We changed the IP's by going into Cluster Administrator, then changing the IP addresses for each node, as well as the server itself. I was not present at the time of the change, but have checked all the addresses in the cluster administrator, and they appear to be correct.
The problem you are experiencing is occurring because you did not change the virtual IP address of the cluster as per Microsoft's instructions. You cannot change the virtual IP address of SQL Server clustering using the Cluster Administrator. How you change the virtual IP address depends on if you are using SQL Server 7.0 or SQL Server 2000.
If you are running SQL Server 7.0, you must uninstall, and then reinstall SQL Server 7.0 clustering. This is a pain, but the only option.
If you are running SQL Server 2000, you have to "Advanced\Maintain Virtual Server for Failover Clustering"option of the Windows 2000 setup program.
Q.I followed your advice regarding the installation order for clustering SQL Server (and am pleased to see my own experiences verified, as it were). I wondered at which point in the order do you install Outlook (for e-mail profiles) as I have had some problems with this?

SQLMail is not officially supported under SQL Server7.0 clustering.
SQLMail in SQL Server 2000 is supposed to work under clustering, although I have not personally tried it. In fact, I don't recommend it as you must add mail clients to both servers, and I don't like the idea of adding mail clients (with all the security and other reliability issues they represent) on a production cluster. But that is your choice.
To use SQLMail with SQL Server 2000 clustering, keep the following in mind:
• Each node must have a mail client, such as Outlook, installed on it.
• Each instance of SQL Server 2000 in the cluster must use the same service account.
• Each node in the cluster must have a MAPI profile with an identical name and settings.
Even if you setup everything correctly, SQLMail under SQL Server 2000 clustering may still not work. I understand that there is a problem with SP2 that prevents it from working correctly, and that Microsoft has a hotfix for the problem. So if you cannot get SQLMail to run correctly, you may need to contact Microsoft Support.
In SQL Server 2005, SQLMail is no longer used, and has been replaced with the new SMTP mail option. This is supported under clustering and does not require that a mail client to be installed on each node of the cluster.
Q.What are the implications of server clustering on replication? Is it advisable, or is there a better way to prevent downtime given the current replication setup? Once the two node cluster has been set up, would replication be configured in the same way as before?

Replications works with SQL Server 7.0, SQL Server 2000, and SQL Server 2005 clustering. But as you might well imagine, configuring and maintaining a SQL Server cluster that is also involved in replication can be a headache.
Whether or not you decide to cluster SQL Servers involved in replication has to do with how important the servers are to your enterprise. If they are mission critical and your business cannot function without them, you will probably want to cluster your SQL Servers. At the same time, you will also have to accept the extra added dimension of complexity that is involved.
If you are going to implement replication and the SQL Server cluster will participate as a Publisher and a Distributor, use a file share located on the cluster disk resource as the snapshot folder. This way, replication will failover when SQL Server fails over.
When you configure a clustered SQL Server as a Distributor, SQL Server needs to have access to a snapshot folder as a temporary holding place during the replication process. In order to ensure that replication still works when failover occurs, this folder must be located on a shared folder on the cluster's shared disk resource. If it is not, then when failover occurs, replication may stop working.
You will have to manually create the required folder on the shared disk resource, and additionally create the necessary share with appropriate permissions. In addition, you will have to configure the shared folder using Cluster administrator as a clustered shared folder.
If you don't want to do the above, one option is to not make the clustered SQL Server a Distributor, but only a Publisher, and locate the Distributor on a non-clustered SQL Server. This way, the snapshot folder will exit on a non-clustered SQL Server, negating the above advice.
I want to install SP1 for SQL Server 2000 on my clustered SQL Server. Are there any special steps or procedures I should follow?
Before we actually look at the answer, let's recall some basics of the IDENTITY property and SQL Server's numerical data types.
You can define the IDENTITY property on columns of the INT data type and on DECIMAL with scale 0. This gives you a range of:
TINYINT 0 - 255
SMALLINT -32.768 - 32.767
INT -2.147.483.648 - 2.147.483.647
BIGINT -2^63 - 2^63-1
When you decide to use the DECIMAL datatype you have a potential range from -10^38 to 10^38-1.
So, keeping this in mind, we're now ready to answer the original question here. What happens when an INTEGER IDENTITY value is about to run out of scope?
CREATE TABLE id_overflow ( col1 INT IDENTITY(2147483647,1) )
GO
INSERT INTO id_overflow DEFAULT VALUES INSERT INTO id_overflow DEFAULT VALUES SELECT * FROM id_overflow
DROP TABLE id_overflow

(1 row(s) affected) Server: Msg 8115, Level 16, State 1, Line 2 Arithmetic overflow error converting IDENTITY to data type int. Arithmetic overflow occurred.
This script creates a simple table with just one column of type INT. We have also created the IDENTITY property for this column. But instead of now adding more than 2 billion rows to the table, we rather set the seed value to the positive maximum value for an INTEGER. The first row inserted is assigned that value. Nothing unusual happens. The second insert, however, fails with the above error. Apparently SQL Server does not start all over again or tries to fill the maybe existing gaps in the sequence. Actually, SQL Server does nothing automatically here. You have to do this by yourself. But what can you do in such a case?
Probably the easiest solution is to alter the data type of the column to BIGINT, or maybe right on to DECIMAL(38,0) like so:
CREATE TABLE id_overflow ( col1 INT IDENTITY(2147483647,1) )
GO
INSERT INTO id_overflow DEFAULT VALUES ALTER TABLE id_overflow ALTER COLUMN col1 BIGINT INSERT INTO id_overflow DEFAULT VALUES
SELECT * FROM id_overflow
DROP TABLE id_overflow

col1
--------------------
2147483647 2147483648
(2 row(s) affected)
If you know in advance that your table needs to keep that many rows, you can do:
CREATE TABLE bigint_t ( col1 BIGINT IDENTITY(-9223372036854775808, 1) )
GO
INSERT INTO bigint_t DEFAULT VALUES
SELECT * FROM bigint_t
DROP TABLE bigint_t

col1
--------------------
-9223372036854775808
(1 row(s) affected)
Or the DECIMAL(38,0) variation:
CREATE TABLE decimal_t ( col1 DECIMAL(38,0) IDENTITY(-99999999999999999999999999999999999999, 1) )
GO
INSERT INTO decimal_t DEFAULT VALUES
SELECT * FROM decimal_t
DROP TABLE decimal_t

col1
----------------------------------------
-99999999999999999999999999999999999999
(1 row(s) affected)
One might be distressed in one's aesthetical taste by those negative numbers, but it's a fact, that one now shouldn't have to worry about running out of scope for quite some time.
Q.I understand that should avoid the use of SQL Server cursors in order to increase an application's performance. Can you provide some ways that I might be able to avoid using cursors in my applications?

A SQL Server cursor should only be considered in situations where you need to scroll through a set of rows, and then based on criteria you specify, do something potentially different to each row (and in many cases even this can be done using a standard query). If what you need to do to each row is the same, then you should definitely avoid a cursor, and instead use a Transact-SQL query.
Keep in mind that one of the biggest benefits of using a relational database, such as SQL Server, is that is acts on an entire sets of records in one fell swoop. This results in very fast performance. But if you have to perform different actions on each different record, then you often have to use a cursor to accomplish your goal. Because records have to be examined one-at-a-time, cursors often result in poor performance.
While it is true that a query will always outperform a cursor (assuming they are performing the same task), this doesn't always mean that you shouldn't use a cursor in some cases. For example, sometimes I need to perform a fairly simple task on an occasional basis. In these cases, I often use a cursor because they are fairly easy to write, and because performance is not an issue for the task at hand.
On the other hand, if the task is repeated often, and performance is an issue, then you should avoid cursors if at all possible. Some ways to avoid cursors include:
• Rewriting the cursor as a normal query. Some people write cursors that perform the same task over and over on a set of records. This is a waste of server resources because this could be easily handled by a standard query. And even if what you need to do to each row is conditional on data in a row, you still may be able to to use a standard query using a CASE statement.
• Rewriting the cursor as a derived query. See this article for more information.
• Rewriting the cursor using temporary tables in a query. See this article for more information.
• Rewriting the cursor using table variables in a query (SQL Server 2000 or 2005).
If you find that you have to use a cursor, then try to use a FAST-FORWARD, READ-ONLY cursor, which is the cursor that uses the least resources. See these tips for speeding the use of cursors.
Q.My manager has asked me to put together a plan for a backup server for our production server running SQL Server. Should I use log shipping or clustering?
Most of the production SQLServersI have run across don't have any easy or quick way to failover to a working server should they fail. Generally, most people hope their production SQL Server never fails. And in the back of their minds, they think they know what to do to recover should the worst happen, but for the most part, they don't have any formal plans, nor have they tested how they actually plan to failover. In these cases, you can assume that it might take a day or longer for them to get back into production should their SQL Server fail.

If your production SQL Server is not critical to the success of your business, and you can afford to be down a day or two, then not having a formal failover plan is not much of a problem

On the other hand, if your production SQL Server is critical to the success of your organization, then you must have a tested failover plan in place. Two popular ways to implement a failover plan is to either implement SQL Server log shipping or clustering. Which of these is your best choice?

The question you have to ask yourself is what you mean by best?

If by best, you mean a failover solutionthat fails over automatically and quickly (a minute or so), the SQL Server clusteringis the answer.

If by best you mean a failover solution is that somewhat easy to implement and less expensive, then log shipping is the answer.

Unfortunately, there is no best answer. You have to wade through the pros and cons and decide for yourself what is the best compromise. Here are the major pros and cons of log shipping vs. clustering.

Pros of Log Shipping
• Relatively speaking, less expensive.
• Relatively speaking, less work to set up.
• Relatively speaking, less on-going maintenance.
• Requires less experienced DBAs and networkadministrators.

Cons of Log Shipping
• Failover is not quick, may take from 15 - 60 minutes or more of downtime.
• Failover is not automatic, it must be manually implemented.
• Failed over server must be renamed to the name of the failed server.
• Failing back to the original server, once it is fixed, is a manual process that can require 15 - 60 minutes or more of downtime.

Pros of Clustering
• Failover is quick, usually resulting in less than minute or two of downtime.
• Failover is automatic, no manual intervention is required.
• Clustering doesn't require any servers to be renamed.
• Failing back is quick, usually resulting in less than minute or two of downtime.

Cons of Clustering
• More expensive than log shipping.
• Requires more set up than log shipping.
• Requires more on-going maintenance.
• Requires more experienced DBAs and network administrators.

kamlesh7880's picture