If you are looking for free and complete guide to SQL tutorial, you have landed to the right place. As I had promised that I will guide you through
acquiring skillsets required to become a successful business intelligence
analyst, I am starting my tutorial with SQL. Knowledge of SQL is very important if you are
aspiring for BI analyst. SQL (Structured Query Language)
is used to create and modify databases along with retrieving, modifying and
deleting the business related data residing in the databases.
To start with, you should install Microsoft SQL Server 2008 Express in your computer. Following link will guide you step by step to install SQL on your system which is FREE!
Data which has to be analyzed for
understanding and predicting the business is stored in the databases. Understanding RDBMS is very important. Please find
following link which explains databases from a beginner’s perspective. Even if you already know, brush up little bit.
When database is created, it is used to store
data on daily business e.g. sales data for a retail store,bank transactions, phone calls by a caller for mobile companies etc. We need to extract this
data and analyze it. Hence, knowledge of SQL is very important for analysis.
Case Study:
I
will start with a very simple example. You are given a Database DB_PhoneSales which
contains table, 'Phone' and the data in the table is based on sales of 3 types of Phones. The sales data
contains phone short name, city, month and quantity sold.
Name
|
City
|
Month
|
Quantity
|
A
|
New york
|
Jan
|
169
|
A
|
Seattle
|
Feb
|
399
|
B
|
Florida
|
Jan
|
172
|
B
|
New york
|
Jan
|
185
|
B
|
Seattle
|
March
|
322
|
C
|
Seattle
|
Jan
|
294
|
C
|
New york
|
March
|
145
|
A
|
Florida
|
Feb
|
393
|
A
|
Florida
|
March
|
155
|
B
|
New york
|
Jan
|
135
|
B
|
Florida
|
March
|
151
|
C
|
New york
|
March
|
224
|
B
|
Seattle
|
Feb
|
228
|
A
|
Florida
|
Feb
|
499
|
You are required to find out the following:
a..
Which phone is sold the most?
b. Which city has minimum number of phones sold?
How sql will be used to solve the above case study? These are the topics
which you need to know:
1. Data Type:
Business data can be in form of string(names,
address), integer(age, phone number), decimals(interest rate e.g. 1.2, price$2.5),
date (25/10/1994), etc.. You should understand this very carefully because choosing
right data type in the beginning saves you from lot of problems and reworks which can
arise later.
Case study
In the above case study you can
observe that Name, City and Month are strings whereas Quantity is an integer.
Following link will give you all the data types used.
2.
DML statements: Data Manipulation Language (DML)
statements are used to retrieve, update, insert and delete the data. W3S School is a very good website for learning
SQL. You can brush up, every time, very quickly for your interview by going
through this website. Please go to the link and explore DML, SQL functions,
joins etc.
Case Study
Now I assume that till now you have
installed SQL in your computer. If not please do it, what is stopping you? After that you have gone through the data types and DML statement tutorials. You
can easily tell that following statement will be used to retrieve data from
database:
Select name,city,month,quantity from Phone
3. Aggregating
Statements:
In business intelligence you will be grouping and ordering data most of
the time. Generally, the data which you receive is at a very detailed level. You
need to group it so that you can see the overall picture of your business.
Case Study:
In our case study we can see that quantity
of phones sold is at monthly and city level. But you want to know overall which
phone is being sold the most. So you sum the quantity for the phone and see
which is the highest. Please find the
SQL statements used to arrive at required result.
SELECT NAME, SUM(QUANTITY) AS
TOTAL_QUANTITY FROM PHONE ORDER BY TOTAL_QUANTITY DESC
Name
|
Total_Quantity
|
A
|
2004
|
B
|
1839
|
C
|
973
|
SELECT CITY, SUM(QUANTITY) AS TOTAL_QUANTITY FROM PHONE ORDER BY TOTAL_QUANTITY
City
|
Total_Quantity
|
Seattle
|
1382
|
New york
|
1702
|
Florida
|
1732
|
Now the top row will give you the required result.
4. Joining tables
In databases you must have read about relational database. You will have
a database with tables which are related rationally. RDMS breaks the data and
stores them in a very meaningful manner but most of the times, for reporting purpose, you need to combine the broken pieces of data. Here joining tables come into picture. Please complete the below tutorial and practice to get hold of SQL joins.
Tutorial:
Tutorial:
Case Study
We already know that the database
DB_PhoneSales has a table phone. Now you to know the
corresponding name of the phones. The second table PhoneName contains corresponding phonename.
Name
|
City
|
Month
|
Quantity
|
A
|
New york
|
Jan
|
169
|
A
|
Seattle
|
Feb
|
399
|
B
|
Florida
|
Jan
|
172
|
B
|
New york
|
Jan
|
185
|
B
|
Seattle
|
March
|
322
|
C
|
Seattle
|
Jan
|
294
|
C
|
New york
|
March
|
145
|
A
|
Florida
|
Feb
|
393
|
A
|
Florida
|
March
|
155
|
B
|
New york
|
Jan
|
135
|
B
|
Florida
|
March
|
151
|
C
|
New york
|
March
|
224
|
B
|
Seattle
|
Feb
|
228
|
A
|
Florida
|
Feb
|
499
|
Code
|
PhoneName
|
A
|
AlphaPhone
|
B
|
BetaPhone
|
C
|
CellPhone
|
The SQL query will be:
SELECT B.PHONENAME, SUM(A.QUANTITY)
AS TOTAL_QUANTITY FROM PHONE AS A INNER JOIN PHONENAME B ON A.NAME=B.CODE ORDER
BY TOTAL_QUANTITY DESC
Name
|
Total_Quantity
|
AlphaPhone
|
2004
|
BetaPhone
|
1839
|
CellPhone
|
973
|
5. Nulls
Understanding of Nulls takes you one step higher than a beginner. It is very common to mistake null with zero. However null means nothing. you cannot add 5 and null. However you can add 5 and 0, which will give you 5.
There will be many instances when the dataset which you will receive will have null values. They create problems when you try to sum the values, join the tables, etc. Please go through the following link to gain thorough understanding on NULL values.
There will be many instances when the dataset which you will receive will have null values. They create problems when you try to sum the values, join the tables, etc. Please go through the following link to gain thorough understanding on NULL values.
6. Index
The data which we generally get in
the industry is very huge. It can have thousands to millions rows. Hence when
you are searching data, it can be a very slow process. Indexing the database can makes the search
process faster. However it has side effects also. For full information please go
to the following links:
7. Analytics functions
Now days you can get inbuilt analytical functions which will help you a
lot in analysis processes. These have become very famous in Business
Intelligence Interviews lately. Please go to the following links to learn them.
I guess If you properly go through these links, study them and assimilate
properly, you are good to go with SQL skills for business analysis.
So remember these websites which are very helpful in learning SQL. Ihave
already mentioned them above but would like to summarize.
And ofcourse always take help of MSDN. It is a huge repository, filled
with knowledge on Microsoft technologies. For further practice you can install AdventureWorks database, which is a fictitious company database provided by Microsoft, which is very helpful in practicing and learning SQL. Please find the link:
http://msdn.microsoft.com/en-us/library/ms124501(v=sql.90).aspx
Please give me your review comments so that I can make this blog better.
http://msdn.microsoft.com/en-us/library/ms124501(v=sql.90).aspx
Please give me your review comments so that I can make this blog better.
Happy Studying J
Next is MS excel. So easy but such a powerful tool for analysis.
Next is MS excel. So easy but such a powerful tool for analysis.
Very comprehensive and useful.
ReplyDelete