The Key: an Intro to SQL

Magnus PS
4 min read · Aug 14, 2021

Data is the commodity of the future.

If the disruption caused by tech and the past decade’s demand for data storage and analysis are any indication of the future, they are a strong one.

From 2013 to 2025, the amount of generated data will have grown by 25x. That is exponential growth.

And with this exponential growth in the amount of data we generate comes the demand for qualified individuals to gather and make sense of it.

Across industries, oceans, from the private to the public sector …

At an individual, enterprise and national level …

Those who can best gather and analyze this swell of data can and will hold a serious competitive advantage over those who cannot.

But before we can step through the door, before we can apply advanced tools and algorithms, we’ve first got to open the door. We’ve got to open the door between us and our data.

The key is oft-used and oft-overlooked.

Photo by Jorien Loman on Unsplash

Rather than building a storefront or app to get at data, the alternative and often preferred avenue for accessing data stored in a database management system (DBMS) is SQL.

Let’s go over a couple of background points before returning to the topic at hand:

  • A database is an organized collection of structured information, or data. Databases are like a warehouse for data. They’re where we store our data.
  • Our data (and databases) are typically stored in a computer system controlled by a Database Management System (DBMS). The DBMS handles incoming data, and it’s our communication with the DBMS (via SQL) that enables us to access and manage that data. Some popular DBMSs are MySQL, PostgreSQL, and SQLite.
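
To make these two points concrete, here is a minimal sketch that uses SQLite (one of the DBMSs named above) through Python's built-in sqlite3 module. The fruit table, its columns, and its rows are made up purely for illustration, and a real database would usually live in a file or on a server rather than in memory.

```python
import sqlite3

# An in-memory SQLite database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")

# "Manage": define a table and store some rows (hypothetical names and values).
conn.execute("CREATE TABLE fruit (name TEXT, color TEXT, price REAL)")
conn.executemany(
    "INSERT INTO fruit (name, color, price) VALUES (?, ?, ?)",
    [("apple", "red", 0.50), ("banana", "yellow", 0.25), ("cherry", "red", 3.00)],
)

# "Manage": change and remove existing rows.
conn.execute("UPDATE fruit SET price = 0.60 WHERE name = 'apple'")
conn.execute("DELETE FROM fruit WHERE name = 'cherry'")

# "Access": read the data back out, sorted by name.
remaining = conn.execute("SELECT name, price FROM fruit ORDER BY name").fetchall()
print(remaining)
```

Every statement passed to conn.execute is plain SQL. Python is only the messenger that hands it to the DBMS and collects the answer.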

Now, back to SQL.

SQL (pronounced “S.Q.L.” or “sequel”) is short for Structured Query Language and is the language used to communicate with databases. It’s the language used to access and manage our data.

In other words, if the data (and insights therein) were on the other side of a locked door, SQL would be the key.

https://www.memecreator.org/meme/you-get-a-key-you-get-a-key-everyone-gets-a-key/
Meme source: memecreator.org

SQL is foundational. If we were to imagine the skills involved in data analysis or science as a triangle, SQL would form the base.

Before applying any advanced algorithms (ahem, machine learning, buzzword, buzzword), we must first be able to gather and analyze our data.

A few years back, Forbes reported that 80% of data scientists’ time was spent on data prep.

“Data prep” includes locating and transforming data, much of which is done in … yup, you guessed it … SQL.
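
As a rough illustration of what locating and transforming data can look like in SQL, here is a small sketch run through Python's built-in sqlite3 module. The raw_sales table, its columns, and its messy values are invented for this example and are not from any real dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw data with inconsistent casing, stray spaces, and a missing value.
conn.execute("CREATE TABLE raw_sales (fruit TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_sales VALUES (?, ?)",
    [(" Apple ", 2.0), ("apple", 3.0), ("BANANA", 1.5), ("banana", None)],
)

# Locate: keep only the rows that actually have an amount.
# Transform: trim and lowercase the names so " Apple " and "apple" match,
# then total the amounts for each cleaned name.
query = """
SELECT LOWER(TRIM(fruit)) AS clean_name, SUM(amount) AS total
FROM raw_sales
WHERE amount IS NOT NULL
GROUP BY clean_name
ORDER BY clean_name
"""
cleaned = conn.execute(query).fetchall()
print(cleaned)
```

The four messy rows collapse into one tidy total per fruit, which is the kind of table later analysis steps usually expect.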

While the percentage of time spent on data prep may be up for debate, the point of the matter is that whether our data guru is an Analyst, Scientist or Engineer, they’re going to spend the majority of their time gathering and cleaning data.

Thus, a strong grasp of SQL is essential for anyone looking to make headway in the wide, wide world of data.

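To familiarize ourselves, let’s check out a very simple SQL example: a three-line query that reads SELECT * on the first line, FROM fruit on the second, and WHERE name = 'apple' on the third.

We’ll go line-by-line: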

  1. We select all columns. The “*” specifies that we want to return every field / column in our table.
  2. We specify the table (fruit) from which we’d like to extract our data.
  3. We provide a condition by which we filter our data. The fruit table has a field / column “name”, and here we keep only the rows / observations where that field matches the specified string (apple).
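
The three steps above can be tried out against a tiny database. The sketch below uses Python's built-in sqlite3 module. The table name fruit and the column name come from the walkthrough, while the color column and the sample rows are assumptions added so there is something to filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: only the "name" column is mentioned in the text.
conn.execute("CREATE TABLE fruit (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?)",
    [("apple", "red"), ("banana", "yellow"), ("apple", "green")],
)

# Line 1: return every column. Line 2: read from the fruit table.
# Line 3: keep only the rows whose name is exactly 'apple'.
query = """
SELECT *
FROM fruit
WHERE name = 'apple'
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Both apple rows come back with all of their columns, and the banana row is filtered out. Changing the string in the WHERE line is a quick way to get a feel for how the condition works.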

This is an incredibly rudimentary query. If you’re new to data analysis, DataCamp can be a great place to start learning SQL.

datacamp.com
Beginner course available on Datacamp

Data has played and will continue to play a large role in our collective future.

As such, we’ve got to better equip ourselves to make sense of this data.

Whether it’s the betterment of society or simply a career move, the ‘Why?’ may differ. What doesn’t differ is the fact that SQL is absolutely essential.

SQL allows us to access and manage our data, and for any data team or process this is foundational.

Magnus PS

Writer | Data Analyst | Project Manager | "Health Nut"