What is SQL?
"See - kwal", "SQL", whatever you want to call it, SQL is a powerful tool used by businesses, governments, and researchers to find insights into their data. It helps groups answer questions such as, "what were the top 5 selling products among young customers in the last quarter?" or " what was the largest spending category within the city population in the last year?" With a little learning, it's an easy to use tool to unlock the secrets within big data.
Why do we need SQL?
In larger organizations, data on consumers, clients, employees, etc... is stored in databases. This can be something like MicrosoftSQL Server, PostgreSQL, Oracle Database, or MangoDB.
Organizations use these "databases" because they often deal with very large data. You couldn't just store it in an excel file, excel maxes out at a little over 1,000,000 rows. So you need something a little bigger.
Unfortunately, with such a large amount of data, it's hard to look at it and gain anything of real value. Imagine looking at an excel sheet with 50 columns and 1 billion rows. It would be hard to pick anything out.
This is where SQL steps in: SQL is the communication language used to talk to databases. It allows us to pick out little nuggets from our data to answer real life questions which can help us drive improvements.
Imagine SQL as a waiter. You tell the waiter what you want in the form of a query and SQL runs to the database (the kitchen). SQL then returns with your order and you can enjoy the tasty food that was not accessible to you just a second ago behind the kitchen doors.
How do we use SQL?
As stated previously, we write queries which are basically short blocks of code to order some data from the database. You can use a SQL graphical interface such as pgAdmin, Microsoft SQL Management Studio, or Toad. It looks a little something like this:
SQL code is quite simple so let's go through a quick example to see how it works:
Recommended by LinkedIn
In SQL, SELECT is what we use to indicate what columns (called "fields") we want. In this example, we wanted to pick out the columns named "customer_id" and "payment_id". Then we write the FROM statement to indicate what table we want it from. In this case, these columns are located in the table called "payment".
SQL is used for what we call "relational databases". Basically, relational databases are comprised of all tables. So we have a bunch of columns (fields) and rows (records). SELECT and FROM are used in all queries but the code can get more advanced as you ask harder questions.
For example, what if we wanted to know: all the customers in our database, how much they've spent in their total lifetime, the month of their first order, and the amount of their first order. Even though we don't have any columns in our data that tell us this, with a little fiddling, we can get the answer like this...
SQL basically lets us interact with big data like a big pivot table. We can even do more than a pivot table can as we can use things like Regular Expressions (RegEx), CASE statements (IF statements), and more to find deeper insights.
SQL is a very powerful tool and fairly easy to learn. Would highly recommend taking the course SQL for Business Intelligence on the O'Reilly website. It's made by a guy named Jeff; he's very funny.
Hope you had a great July 4th!