Posts

Showing posts from April, 2011

VB and VB.NET - Difference

This is an old issue but very frequently we come across the same query. Particularly when we require the application (OCX) migration from VB. What  are some of the basic differences between VB & VB.NET? Answer: The following section enlists some of the issues I came across when encountered with a similar migration requirement. These differences are mostly in reference to VS2005. 1. VB.NET provides managed code execution that runs under the Common Language Runtime (CLR) resulting in robust, secure and stable applications. All features of the .NET framework are readily available in VB.NET. 2. VB.NET is entirely object oriented. This is a major addition that VB6 didn’t have. 3. The .NET framework comes with ADO.NET, which follows the disconnected paradigm, i.e. once the required records are fetched the connection no longer exists. It also retrieves the records that are expected to be accessed in the immediate future. This enhances Scalability of the...

SEO - Directory Submission

It was very recent days that I became very interested in writing myself down. Now, in todays world it is very much presumable that I would do the entire writing stuff online. Definitely being involved with IT related work, it is very much expected from me as well. In the above, process I noticed the importance of SEO and its one of the techniques called Directory Submission. A short note of my understanding on this topic. Introduction The key objective of every web site owner is to increase the amount of traffic directly received. This traffic is increased by improving their rank with search engines like as Google and Yahoo! The search engines determine the position of the website in a search on the basis of a formula. One of the key factors of this formula is the number of back links a web site has. The quality of incoming links improves the search engine rankings and internet visibility of a website. Directory submission is a proven method for gaining quality one-way incoming bac...

Teradata - Month End Date

Sql Server: Calculating the last day of the month in SQL Server is pretty simple. Run the following piece of code and its done. SELECT DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,GETDATE())+1,0)) CurrentMonthEnd_Day ResultSet: CurrentMonthEnd_Day ———————– 2010-08-31 23:59:59.000 Teradata: However, with reference to Teradata SQL, in our day to day queries in a datawarehouse environment very frequently we need to play with the month end dates that are mostly kept in the date dimension table with one field, i.e. the month end date field. You might require to fetch the month end date which is 4 years back. usually the simplest way in this case will be to calculate 48 months back from the current month end date (in case the monthe end date that long is not stored into the database). This is fair. The piece of code required to do is simple. SELECT (ADD_MONTHS(CAST('2012/08/31' AS DATE),-48)) Result: 2008/08/31 Now trying this simple thing will not work if its a leap year a...

Managing Memory

Let us discuss on some on some aspects of performance tuning and optimization of sql queries. The following terms are quiet relevant in this regard. Although we do not always think about these counters unlesss we are seriously having issues with the sql server memory or the performance of a strong query. Buffer Cache Hit Ratio  The buffer cache is a syndicate or collection of buffer pages. The data pages occupies the biggest part of the memory and pages are mostly read into this cache. In OLTP systems this counter value should be as high as possible. Ideally, 99 percent in most cases. It is exactly the reverse in OLAP or DW environment where cache requirement although is there but as most of it is bulk data (billions of data records), cache requirement is less. Also frequency of requests coming in can be much lower than e.g. transactions in bank. If Buffer cache hit ratio value is low that means, only a few requests could be served out of the buffer cache, with the rest of the ...

Technology Day @Microsoft

It was some long time back..almost 2 years back I attended a session from my company. I understand technology is travelling to fast for us to keep par with. Still a small note: Microsoft Innovation Day (Full Day Session) Topics: v     SaaS (Software as a Service) v     Cloud Computing v     Ajax v     Query Tuning v     Debugging tools with .NET v     Live Mesh v     Internet Explorer 8 1.     SaaS (Software as a Service) The key characteristics of SaaS software, according to IDC, include: a) Network-based access to, and management of, commercially available software. b) Activities are managed from central locations rather than at each customer's site, enabling customers to access applications remotely via the Web. c) Application delivery, typically is closer to a one-to-many model (single instance, multi-tenant architecture) than to a one-to-one model, incl...

Automating stored procedure execution

The sp_procoption system stored procedure can be used to mark the user stored procedure for automatic execution when the SQL Server starts. Only objects in the master database owned by dbo can have the startup setting changed and this option is restricted to objects that have no parameters. USE master EXEC sp_procoption 'indRebuild', 'startup', 'true'

Select count(*)

Select COUNT(*) from a table usually does a full table scan to return the total row count. The time taken can be considerable for large tables. Instead it is advisable to use the other approach in most cases. You can use sysindexes system table. The ROWS column in the sysindexes table contains the total row count for each table in your database. So, you can use the following select statement instead of SELECT COUNT(*): SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name')      AND indid < 2 The speed can be improved manifold. 

Data Mocking & Data Masking

I am currently working on an article as an anecdote to Data Mocking (what we call Mockups) and Data Masking. In the database world we, very frequently come up with these terms. I can understand that, you guys being database enthusiasts might be very much familier with these terms. However, not everyone would be like you all and the target audience for the article would be those guys.

SQL Optimizer Parameterization

Image
Plan Recompilation Once the Query Optimizer has come up with a plan, which may have taken a considerable amount of work, SQL Server does its best to ensure that you can leverage all that costly work again. It does this by caching the plan it just created, and taking steps to ensure that the plan is reused as widely as possible. It does this by using parameterization options. Parameterization Parameterization is a process whereby SQL Server will take the T-SQL you entered and look for ways to replace values that may be variables with a token, so that if a similar query is processed, SQL Server can identify it as being the same underlying query, apart from some string, or integer values, and make use of the already cached plan. For example, the following is a basic T-SQL query to return data from the Adventure Works 2008 database:    select *    from person.person    where lastname = 'fluffy' The parameterization of this query would result in the string...

What is Normalization??

Normalization in the simplest terms: Normalization is the process of organizing data in a database. The target is to eliminate redundant data (for example, storing the same data in more than one table) and ensure sensible data dependencies in other words, only storing related data in a table. First Normal Form (1NF) First normal form (1NF) is the basic step:     * Eliminate duplicate columns from the same table (mind it not duplicate rows)     * Create separate tables for each group of related data and identify each row with a unique column i.e. set of columns (primary key). Second Normal Form (2NF) Second normal form (2NF) further removes duplicate data:     * Should meet the first normal form.     * Remove subsets of data that apply to multiple rows of a table and place them in separate tables.     * Create relationships between the new tables and their parents through the use of foreign keys. Thi...

Database pages simplified..

Database Pages A Page is the unit by which SQL Server data is stored. SQL Server 2000 views each data file and breaks it down to 8K byte pages. There are 128 pages stored in 1MB. A page can only belong to one object. The beginning of each page has a header that contains information about the characteristics of that page. The page header takes up the first 96 bytes of each page, leaving 8060 bytes for actual data. The header contains information such as PageID, Previous Page, Next Page, amount of free space available, the object ID of the object that owns the page, etc. By checking the header, SQL Server can quickly determine the status of each page, making it easily identifiable. To keep track of allocations, SQL Server stores allocation information in certain page types. Lets have a look at them: 1) Global Allocation Map (GAM) Pages A GAM page reports whether an extent is free or allocated. It has a bit (1 or 0) to keep track of extent allocations in the interval it covers. A bit ...

Orphaned Distributed Transaction SPID

Explanation: This example is of an Orphaned Distributed Transaction SPID which specifically involves MSDTS , the Microsoft Distributed Transaction Coordinator. The is what happens when a transaction involves data which resides on more than one server (such as when a database record is replicated out to two or more servers where MSDTC needs to become involved) and the following happens: a) A server drops off the network b) There’s a power outage at the distributor server. The Negative SPID (-2) in SQL Server Phenomenon: May be you have been investigating a performace issue in SQL Server or exploring the error log file for sometime. But to add to your astonishment you notice that there is a negative SPID = -2 at the top of the chain and is the cause of all deadlocks. What next? Definitely you would like to wait (not sure how long?) or try to kill the process. Here is what you have when you try any of them: Wait : No use waiting as the process will not complete/refresh u...

SQL ResultSet in CSV Format

There might be a requirement very often to dump the query results straightaway into CSV file format or in other words into a .CSV file. Here is what we need to do to achieve the same. 1) In case we are using OSQL through JCL to call the stored proc, then we need to add a new parameter (-s) as "," or ";" to indicate the character separator. E.g. SET PARM5= -s %";"% The other parameter (-o) needs to be modified to specify (.CSV) as output file format instead of (.TXT). 2) This is not all. The stored proc needs to be modified as well to accommodate the changes. Please refer to these different approaches taken. SELECTED APPROACH: a) SELECT 'col_1' + ',' + 'col_2' + ',' + ' '     UNION ALL     SELECT CONVERT (varchar, col_1) + ','              + CONVERT (varchar, col_2) + ',' + ' ' ALTERNATIVE APPROACH: a) SELECT 'col_1', ',' , 'col_2'     UNION ...

Table Variable OR Temporary Table

A few considerations: Table variables will perform poorly with large record sets, especially when doing joins because there can be no indexes other than a primary key. Beware, though, when many users start using table variables -- large amounts of RAM are used because all temporary variables are stored and processed directly in memory. Table variables should hold no more than 2 Megs to 3 Megs of data each (depending on user load and system memory). There are some more limitations to the usage to be kept in mind:   • Table variables do not maintain statistics like temporary tables can. Statistics cannot be created on table variables through automatic creation or by using the CREATE STATISTICS statement. Therefore, for complex queries on large tables, the lack of statistics may deter the optimizer to determine the best plan for a query, thus affecting the performance of that query. • The table definition cannot be changed after the initial DECLARE statement. • Tables variables ca...