Everything you need to master Structured Query Language — from file systems and database theory to advanced queries, stored procedures, PostgreSQL, and SQL Server.
require SQL
by Indeed.com
covered
What is SQL?
SQL (Structured Query Language) — pronounced either as “S-Q-L” or “sequel” — is the standard language for interacting with relational databases. It allows users to create, read, update, and delete data in a structured, table-based format. SQL is not merely a programming language; it is a declarative query language, meaning you describe what you want without specifying how to get it. The underlying Database Management System (DBMS) handles the execution logic internally.
Think of SQL as a common language between a user and a database — like two people speaking English to communicate. Without SQL (or a similar language), a user would have no standardized way to instruct the database to store, retrieve, or manipulate data.
SQL was developed at IBM in the early 1970s and has since grown into the backbone of data management. According to Indeed.com‘s analysis of over 32,000 data job listings, SQL is listed in 42.7% of them — making it the single most in-demand technical skill for data professionals globally.
Traditional File System — The Problem SQL Solves
Before databases and SQL, data was stored in flat files using a traditional file system. While this works for small amounts of data, it fails catastrophically at scale due to four major problems:
| Problem | Description | Impact |
|---|---|---|
| Data Redundancy | Same data stored in multiple locations leads to duplication | Increased storage costs, inconsistency, security vulnerabilities |
| Limited Data Sharing | Multiple users cannot access the same file simultaneously | Bottlenecks, access restrictions, reduced productivity |
| Slow Data Retrieval | No indexing or query mechanism — data must be searched linearly | Slow performance, inability to run complex queries |
| Data Dependence | Changing one file format requires updating all related programs | Inflexibility, expensive maintenance, risk of application failure |
SQL and the relational database model elegantly solve all four problems through structured tables, referential integrity, concurrent access support, and powerful querying.
History & Evolution of SQL
SQL has a rich history spanning more than five decades. Understanding its origins helps appreciate why it became the universal standard for relational databases.
| Year | Milestone |
|---|---|
| Early 1970s | SQL developed at IBM by Donald D. Chamberlin and Raymond F. Boyce; initially called SEQUEL (Structured English QUEry Language) |
| 1974 | Renamed to SQL because “SEQUEL” was a trademarked name of a UK-based engineering company |
| 1979 | Oracle becomes the first commercially available RDBMS using SQL |
| 1986 | ANSI (American National Standards Institute) and ISO officially recognize SQL as the standard language for relational databases |
| 1989 | SQL-89 standard published with referential integrity and persistent stored modules |
| 1992 | SQL-92 (SQL2) — massive expansion; joins, cascades, transactions standardized |
| 1999 | SQL:1999 adds recursive queries, triggers, and object-oriented features |
| 2003+ | SQL:2003 introduces XML-related features, window functions, sequences |
| Present | SQL remains the #1 data query language worldwide, continuously evolving with SQL:2016, SQL:2019, and SQL:2023 |
SQL is a declarative language — you tell it what result you want, and the DBMS figures out how to retrieve it. This abstraction from implementation details is what makes SQL so powerful and beginner-friendly simultaneously.
SQL Basics & Core Concepts
What is a Database?
A database is an organized collection of structured data stored and accessed electronically from a computer system. Think of it like a library: the library is the database, and each book on the shelves is a piece of data. The librarian (the DBMS) knows exactly where every book is and how to retrieve it efficiently.
What is a Table?
A table is the fundamental unit of storage in a relational database. It organizes data into:
- Rows (Tuples): Each row represents a single record or entity instance
- Columns (Attributes): Each column represents a specific field or attribute of that entity
- Cell: The intersection of a row and column — stores a single value
-- Example: STUDENT table structure CREATE DATABASE edureka; USE edureka; CREATE TABLE student ( student_id INT NOT NULL PRIMARY KEY, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, address VARCHAR(200), city VARCHAR(100), marks INT );
Core SQL Queries with Examples
1. SELECT Statement
The SELECT statement is the most frequently used SQL command. It retrieves data from one or more tables.
-- Select all columns from student table SELECT * FROM student; -- Select specific columns only SELECT first_name, last_name, marks FROM student; -- Select with condition SELECT first_name FROM student WHERE city = 'Goa';
2. WHERE Clause with AND, OR, NOT
-- AND operator: Both conditions must be true SELECT * FROM student WHERE first_name = 'Bharat' AND last_name = 'Singh'; -- OR operator: At least one condition must be true SELECT * FROM student WHERE first_name = 'Bhuvi' OR last_name = 'Kumar'; -- NOT operator: Condition must be false SELECT * FROM student WHERE NOT first_name = 'Ashok';
3. INSERT INTO
-- Insert a new record into the student table INSERT INTO student (student_id, first_name, last_name, address, city, marks) VALUES (7, 'Manoj', 'Sharma', '07 MG Road', 'Jaipur', 438);
4. UPDATE
-- Update a student's name where student_id = 1 UPDATE student SET first_name = 'Amar', last_name = 'Kumar' WHERE student_id = 1;
5. DELETE
-- Delete a student record from a specific city DELETE FROM student WHERE city = 'Mumbai';
6. ORDER BY, GROUP BY, HAVING
-- Sort results in descending order by marks SELECT * FROM student ORDER BY marks DESC; -- Count students per city SELECT city, COUNT(student_id) AS student_count FROM student GROUP BY city; -- Show cities with more than 2 students SELECT city, COUNT(student_id) AS student_count FROM student GROUP BY city HAVING COUNT(student_id) > 2;
7. BETWEEN, IN, LIKE, NULL
-- BETWEEN: Range filter SELECT * FROM student WHERE marks BETWEEN 400 AND 500; -- IN: Multiple value filter SELECT * FROM student WHERE city IN ('Delhi', 'Goa', 'Bengaluru'); -- LIKE: Pattern matching SELECT * FROM student WHERE first_name LIKE 'A%'; -- Names starting with 'A' -- IS NULL: Check for empty values SELECT * FROM student WHERE marks IS NULL; -- ALIAS: Rename columns or tables temporarily SELECT student_id AS id, first_name AS name FROM student AS s;
SQL Command Categories
All SQL commands fall into one of four major categories, each serving a distinct purpose in database management:
| Category | Full Name | Commands | Purpose |
|---|---|---|---|
| DDL | Data Definition Language | CREATE, ALTER, DROP, TRUNCATE, RENAME | Define and manage the database schema and structure |
| DML | Data Manipulation Language | SELECT, INSERT, UPDATE, DELETE, MERGE | Access, retrieve, and manipulate data stored in tables |
| DCL | Data Control Language | GRANT, REVOKE | Control user access rights and permissions |
| TCL | Transaction Control Language | COMMIT, ROLLBACK, SAVEPOINT, SET TRANSACTION | Manage transactions and ensure data integrity |
DDL — Data Definition Language
DDL commands deal with the structure of the database. They define how data will be stored — think of them as blueprints.
-- Create a new table with constraints CREATE TABLE employee ( emp_id INT PRIMARY KEY NOT NULL, first_name VARCHAR(50) NOT NULL, last_name VARCHAR(50) NOT NULL, department VARCHAR(100), salary DECIMAL(10, 2), hire_date DATE ); -- Add a new column to existing table ALTER TABLE employee ADD email VARCHAR(100); -- Change data type of a column ALTER TABLE employee MODIFY salary FLOAT; -- Delete all rows but keep table structure TRUNCATE TABLE employee; -- Permanently remove a table DROP TABLE employee;
TCL — Transaction Control Language
Transactions group multiple SQL operations into a single unit of work, following the ACID properties:
| Property | Meaning |
|---|---|
| Atomicity | Transaction is all-or-nothing — either all steps succeed or none do |
| Consistency | Database remains in a valid state before and after the transaction |
| Isolation | Concurrent transactions don’t interfere with each other |
| Durability | Once committed, changes persist even through system failures |
BEGIN TRANSACTION; UPDATE accounts SET balance = balance - 500 WHERE account_id = 101; UPDATE accounts SET balance = balance + 500 WHERE account_id = 202; -- If everything is correct, save changes COMMIT; -- If something went wrong, undo all changes ROLLBACK; -- Create a checkpoint to rollback to SAVEPOINT sp1; ROLLBACK TO sp1;
SQL Operators
SQL offers a rich set of operators for filtering, comparing, and combining data:
| Operator Type | Operators | Use Case |
|---|---|---|
| Arithmetic | +, -, *, /, % | Mathematical calculations on numeric columns |
| Comparison | =, !=, <, >, <=, >= | Comparing values in WHERE clauses |
| Logical | AND, OR, NOT, IN, BETWEEN, LIKE, EXISTS, ALL, ANY | Combining multiple conditions |
| Bitwise | &, |, ^ | Bitwise operations on integer values |
| Compound | +=, -=, *=, /= | Shorthand for arithmetic update operations |
| Set | UNION, INTERSECT, MINUS/EXCEPT | Combining results from multiple queries |
| String | + (concatenation), LIKE, %, _ | String manipulation and pattern matching |
LIKE Operator — Wildcard Pattern Matching
-- % matches zero or more characters SELECT * FROM student WHERE first_name LIKE 'A%'; -- Starts with A SELECT * FROM student WHERE first_name LIKE '%a'; -- Ends with a SELECT * FROM student WHERE first_name LIKE '%ar%'; -- Contains 'ar' -- _ matches exactly one character SELECT * FROM student WHERE first_name LIKE '_____'; -- Exactly 5 chars SELECT * FROM student WHERE city LIKE 'Che___i'; -- Che + 3 chars + i -- CASE expression (like IF-THEN-ELSE) SELECT city, CASE WHEN city = 'Chennai' THEN 'Tamil Nadu' WHEN city = 'Bengaluru' THEN 'Karnataka' ELSE 'Unknown' END AS state FROM department;
Aggregate Functions
Aggregate functions perform calculations on a set of values and return a single result. They are almost always used with GROUP BY and HAVING clauses.
| Function | Syntax | Description | NULL Handling |
|---|---|---|---|
| COUNT() | COUNT(column) or COUNT(*) | Returns number of rows matching criteria | Ignores NULLs (except COUNT(*)) |
| SUM() | SUM(column) | Returns total sum of a numeric column | Ignores NULLs |
| AVG() | AVG(column) | Returns average value of a numeric column | Ignores NULLs |
| MIN() | MIN(column) | Returns smallest value in the selected column | Ignores NULLs |
| MAX() | MAX(column) | Returns largest value in the selected column | Ignores NULLs |
-- Count of all student IDs SELECT COUNT(student_id) AS total_students FROM student; -- Average marks SELECT AVG(marks) AS avg_marks FROM student; -- Total marks of all students SELECT SUM(marks) AS total_marks FROM student; -- Student with minimum marks SELECT first_name, MIN(marks) AS min_marks FROM student; -- Student with maximum marks SELECT first_name, MAX(marks) AS max_marks FROM student; -- Count students grouped by city, show cities with count > 500 SELECT city, COUNT(student_id) AS count FROM student GROUP BY city HAVING COUNT(student_id) > 500 ORDER BY count DESC;
Database Normalization (1NF → BCNF)
Database Normalization is a systematic technique of organizing tables in a relational database to eliminate data redundancy and ensure data dependencies make logical sense. It was formalized by Edgar F. Codd along with his relational model.
Why Normalize?
Poor database design leads to anomalies — problems that occur during data manipulation:
| Anomaly | Problem | Example |
|---|---|---|
| Insertion Anomaly | Cannot insert data without adding unrelated data | Can’t add a new department without assigning an employee first |
| Update Anomaly | Updating one record requires updating multiple rows | Changing a department name requires editing 1000 employee rows |
| Deletion Anomaly | Deleting one record accidentally removes other data | Deleting the last employee in a department deletes all department info |
The Normal Forms
First Normal Form
Each cell must contain a single (atomic) value. No repeating groups or multi-valued attributes allowed.
Second Normal Form
Must be in 1NF AND have no partial dependencies — every non-prime attribute depends on the whole primary key.
Third Normal Form
Must be in 2NF AND have no transitive dependencies — non-key attributes don’t depend on other non-key attributes.
Boyce-Codd NF
Stricter version of 3NF. For every functional dependency A → B, A must be a Super Key of the table.
Normalization Step-by-Step Example
Consider a messy table with rental data: customer name, address, movies rented, and category — all in one row with repeating values.
-- BEFORE (Unnormalized — has repeating groups) -- | CustomerName | Address | MoviesRented | Category | -- | Alice Johnson| 1st St, No 3 | Mission Impossible, Clash | Action, Sci-Fi| -- 1NF — Atomic values, one movie per row CREATE TABLE rentals_1nf ( customer_name VARCHAR(100), address VARCHAR(200), movie_rented VARCHAR(100), category VARCHAR(50) ); -- 2NF — Split into Customer and Rentals tables CREATE TABLE customers_2nf ( customer_id INT PRIMARY KEY, salutation VARCHAR(10), full_name VARCHAR(100), address VARCHAR(200) ); CREATE TABLE movies_rented_2nf ( rental_id INT PRIMARY KEY, customer_id INT REFERENCES customers_2nf(customer_id), movie_name VARCHAR(100) ); -- 3NF — Separate salutations into lookup table CREATE TABLE salutations_3nf ( salutation_id INT PRIMARY KEY, salutation VARCHAR(10) -- Mr=1, Miss=2, Mrs=3, Dr=4 );
SQL Triggers
A trigger is a stored SQL code that automatically executes in response to specific events (INSERT, UPDATE, DELETE) on a particular table. Triggers enforce data integrity and automate repetitive tasks.
Imagine a marketing manager who needs to send a welcome email every time a new customer is added to the database. Instead of manually doing it, she creates a trigger — once a new row is inserted into the customer table, the welcome email logic fires automatically. Like dominoes falling in sequence.
Trigger Syntax
-- General syntax for creating a trigger CREATE TRIGGER trigger_name {BEFORE | AFTER} {INSERT | UPDATE | DELETE} ON table_name FOR EACH ROW BEGIN -- SQL statements to execute END; -- Example: Add 100 bonus marks to every new student CREATE TRIGGER calculate_marks BEFORE INSERT ON student FOR EACH ROW BEGIN SET NEW.marks = NEW.marks + 100; END; -- AFTER INSERT trigger: Store total marks in another table CREATE TRIGGER store_total_marks AFTER INSERT ON student FOR EACH ROW BEGIN INSERT INTO final_marks (student_id, total_marks) VALUES (NEW.student_id, NEW.marks); END; -- Drop a trigger DROP TRIGGER trigger_name; -- Show all triggers in a database SHOW TRIGGERS IN edureka;
| Trigger Type | When It Fires | Can Modify Data? |
|---|---|---|
| BEFORE INSERT | Before a new row is inserted | Yes — modifies NEW row |
| AFTER INSERT | After a new row has been inserted | Read-only (cannot update inserted row) |
| BEFORE UPDATE | Before an existing row is updated | Yes — modifies NEW values |
| AFTER UPDATE | After an existing row has been updated | Read-only |
| BEFORE DELETE | Before a row is deleted | Yes — can cancel the deletion |
| AFTER DELETE | After a row has been deleted | Read-only |
✅ Advantages of Triggers
- Enforce security approvals on database tables automatically
- Provide an additional layer of data integrity checking
- Counter-react invalid or unauthorized data changes
- Handle errors at the database layer before they reach the application
- Useful for auditing — automatically logging all data changes
- Reduce code duplication across multiple application layers
❌ Disadvantages of Triggers
- Cannot use NOT NULL, UNIQUE, or CHECK constraints inside triggers
- Increase the overhead of the database server
- Difficult to troubleshoot — execute invisibly in the background
- Risk of infinite loops if a trigger fires another trigger (nested triggers)
- Can lead to unexpected behavior if not documented properly
- Not visible to client applications — can cause confusion
SQL Joins — Complete Guide
SQL Joins combine rows from two or more tables based on a related column between them. They are one of the most powerful and commonly tested features of SQL, especially when dealing with one-to-many or many-to-many relationships.
INNER JOIN
Returns only the rows that have matching values in both tables. The intersection of both sets.
LEFT JOIN
Returns all rows from the left table and matching rows from the right table. Non-matching rights get NULL.
RIGHT JOIN
Returns all rows from the right table and matching rows from the left table. Non-matching lefts get NULL.
FULL JOIN
Returns all rows from both tables. Non-matching rows get NULL on the opposite side.
-- INNER JOIN — only matching records SELECT e.emp_id, e.first_name, p.project_name FROM employee e INNER JOIN project p ON e.emp_id = p.emp_id; -- LEFT JOIN — all employees, even those without a project SELECT e.first_name, e.last_name, p.project_name FROM employee e LEFT JOIN project p ON e.emp_id = p.emp_id ORDER BY e.first_name; -- RIGHT JOIN — all projects, even those with no assigned employee SELECT e.first_name, p.project_id, p.project_name FROM employee e RIGHT JOIN project p ON e.emp_id = p.emp_id; -- FULL JOIN — all records from both tables -- (MySQL workaround using UNION) SELECT e.first_name, p.project_name FROM employee e LEFT JOIN project p ON e.emp_id = p.emp_id UNION SELECT e.first_name, p.project_name FROM employee e RIGHT JOIN project p ON e.emp_id = p.emp_id; -- SELF JOIN — join a table to itself (e.g., supervisor structure) SELECT e.first_name AS employee, s.first_name AS supervisor FROM employee e LEFT JOIN employee s ON e.super_id = s.emp_id; -- NATURAL JOIN — automatically joins on columns with same name SELECT * FROM employee NATURAL JOIN department; -- CROSS JOIN — Cartesian product (every row × every row) SELECT e.emp_id, d.dept_name FROM employee e CROSS JOIN department d;
When you have a many-to-many relationship (e.g., employees working on multiple projects, and projects having multiple employees), you need a junction/bridge table (e.g., works_on) and two JOIN statements to link all three tables together correctly.
Built-in SQL Functions
SQL provides a rich library of built-in functions organized into several categories:
String Functions
-- UPPER and LOWER — Change case SELECT UPPER('hello sql'); -- 'HELLO SQL' SELECT LOWER('HELLO SQL'); -- 'hello sql' -- LTRIM, RTRIM — Remove whitespace SELECT LTRIM(' Hello'); -- 'Hello' (left spaces removed) SELECT RTRIM('Hello '); -- 'Hello' (right spaces removed) -- SUBSTRING — Extract part of a string SELECT SUBSTRING('ABCDEF', 2, 3); -- 'BCD' (start at 2, length 3) -- REPLACE — Replace characters SELECT REPLACE('ABCDE', 'CDE', 'XXX'); -- 'ABXXX' -- CONCAT — Join strings SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employee; -- LEN — Length of string SELECT LEN('Edureka'); -- 7 -- REVERSE — Reverse a string SELECT REVERSE('ABC'); -- 'CBA' -- LEFT and RIGHT — Extract from left/right SELECT LEFT('Edureka', 3); -- 'Edu' SELECT RIGHT('Edureka', 3); -- 'eka'
Math Functions
SELECT ABS(-10); -- 10 (absolute value) SELECT ROUND(4.567, 2); -- 4.57 (round to 2 decimal places) SELECT FLOOR(4.9); -- 4 (largest integer ≤ value) SELECT CEILING(4.1); -- 5 (smallest integer ≥ value) SELECT SQRT(16); -- 4 (square root) SELECT POWER(2, 4); -- 16 (2 to the power of 4) SELECT SQUARE(4); -- 16 (square of 4) SELECT EXP(4); -- e^4 exponential value SELECT RAND(); -- Random value between 0 and 1
Date & Time Functions
SELECT GETDATE(); -- Current date and time SELECT CURRENT_TIMESTAMP; -- Same as GETDATE() SELECT DATEPART(YEAR, GETDATE()); -- Current year SELECT DATEPART(MM, GETDATE()); -- Current month number SELECT DATEDIFF(MM, '2020-01-01', GETDATE()); -- Months between dates SELECT DATEADD(MM, 2, GETDATE()); -- Add 2 months to today
Conversion Functions
-- CAST — Convert value to specified data type SELECT CAST(10 AS INT) * 20 AS cast_result; -- 200 -- CONVERT — Similar to CAST but with more options SELECT CONVERT(INT, 10) * 20 AS convert_result; -- 200 -- TRY_CAST — Returns NULL on failure instead of error SELECT TRY_CAST('A' AS INT); -- NULL (no error) -- Logical Functions SELECT CHOOSE(3, 'test', 'just', 'rest'); -- 'rest' (3rd value) SELECT IIF(1 > 10, 'true', 'false'); -- 'false'
Stored Procedures
A stored procedure is a precompiled set of SQL statements stored in the database under a name and executed as a unit. Think of them as functions in traditional programming languages — write once, call many times.
-- Create a stored procedure without parameters CREATE PROCEDURE get_department_details AS BEGIN SELECT * FROM department; END; -- Execute a stored procedure EXEC get_department_details; -- Stored procedure with input parameter CREATE PROCEDURE get_emp_by_dept @dept_no INT AS BEGIN SELECT * FROM employee WHERE department_no = @dept_no; END; EXEC get_emp_by_dept @dept_no = 20; -- Stored procedure with output parameter CREATE PROCEDURE get_emp_count @dept_no INT, @emp_count INT OUTPUT AS BEGIN SELECT @emp_count = COUNT(*) FROM employee WHERE department_no = @dept_no; END; -- Exception handling with TRY-CATCH CREATE PROCEDURE safe_divide AS BEGIN TRY SELECT 10 / 0; -- Will throw divide by zero error END TRY BEGIN CATCH SELECT ERROR_NUMBER() AS error_num, ERROR_MESSAGE() AS error_msg, ERROR_LINE() AS error_line; END CATCH;
✅ Advantages of Stored Procedures
- Reusability: Write once, call from anywhere in the application
- Performance: Compiled and cached — subsequent calls are faster
- Security: Users can execute without needing direct table access
- Reduced Network Traffic: Multiple SQL operations in a single call
- Easier Maintenance: Business logic centralized in one place
- Modular Programming: Break complex logic into manageable units
❌ Disadvantages of Stored Procedures
- Database-Specific: Syntax varies between MySQL, SQL Server, PostgreSQL
- Increased Memory Use: Stored on server and consume database resources
- Debugging Complexity: Harder to debug than application-level code
- Version Control Challenges: Harder to manage with Git and CI/CD pipelines
- Limited Portability: Migrating between database engines may break procedures
User Defined Functions (UDFs)
User Defined Functions (UDFs) are custom functions written by the user to encapsulate complex logic. Unlike stored procedures, functions must return a value and can be used directly inside SQL statements.
| Type | Returns | Usage | Can Use in SELECT? |
|---|---|---|---|
| Scalar Function | Single value (int, varchar, etc.) | Calculations, formatting | ✅ Yes |
| Inline Table-Valued | Table (single SELECT statement) | Parameterized views | ✅ Yes (as table) |
| Multi-Statement Table-Valued | Table (multiple statements) | Complex table-returning logic | ✅ Yes (as table) |
-- 1. Scalar Function — returns a single value CREATE FUNCTION get_full_name ( @first_name VARCHAR(50), @last_name VARCHAR(50) ) RETURNS VARCHAR(100) AS BEGIN RETURN @first_name + ' ' + @last_name; END; -- Call scalar function SELECT dbo.get_full_name('John', 'Smith') AS full_name; -- 2. Inline Table-Valued Function — returns a table CREATE FUNCTION get_employee_info (@emp_id INT) RETURNS TABLE AS RETURN ( SELECT * FROM employee WHERE emp_id = @emp_id ); -- Call table-valued function SELECT * FROM dbo.get_employee_info(1003); -- Use UDF in WHERE clause SELECT * FROM employee WHERE salary > dbo.get_avg_salary();
SQL vs MySQL — Key Differences
One of the most common points of confusion for beginners is the difference between SQL and MySQL. Simply put: SQL is the language; MySQL is a database management system that uses SQL.
- A query language, not software
- Standardized by ANSI/ISO
- Used to interact with relational databases
- Commands remain static (fixed standard)
- Not a product — has no installation
- Works with any compliant RDBMS
- Core of relational databases
- A software application that runs SQL
- Open source, developed by Oracle
- Provides multi-user database access
- Receives frequent version updates
- Installable on Windows, Linux, macOS
- Has its own implementation dialect
- Cross-platform compatible RDBMS
| Feature | SQL | MySQL |
|---|---|---|
| Type | Language | RDBMS Software |
| Created by | IBM (Chamberlin & Boyce) | MySQL AB (now Oracle) |
| Year | 1970s | Early 1990s |
| Cost | Free (standard) | Open source (Community) + paid Enterprise |
| Data Security | Secured via RDBMS | Can be more vulnerable (open source access) |
| Language support | N/A | PHP, Python, Java, Perl, Ruby, C++ |
| Platform | OS-independent | Windows, Linux, macOS |
| Speed | Depends on RDBMS | Very fast, benchmark-proven |
SQL vs NoSQL — The Ultimate Comparison
SQL databases are relational — they store data in structured tables with predefined schemas. NoSQL databases are non-relational — they store data in flexible formats like documents, key-value pairs, graphs, or wide-column stores.
| Parameter | SQL (Relational DB) | NoSQL (Non-Relational DB) |
|---|---|---|
| Database Type | Relational — tables with rows & columns | Non-relational — documents, graphs, key-value, wide-column |
| Schema | Predefined, rigid schema | Dynamic, flexible schema |
| Query Language | SQL (standardized) | No standard — varies per DB (MongoDB uses BSON/JSON queries) |
| Scalability | Vertically scalable (better hardware) | Horizontally scalable (add more servers) |
| Relationships | Strong support — foreign keys, joins | Minimal relationships — embed data instead |
| Complex Queries | Excellent — nested queries, joins | Limited — no standard query interface |
| Hierarchical Data | Not ideal — too many tables | Excellent — nests naturally in documents |
| ACID Properties | Fully supported | Partial — follows CAP theorem instead |
| Online Processing | OLTP (transactional systems) | OLAP (analytical systems, big data) |
| External Support | Extensive vendor and community support | Growing community; limited enterprise support |
| Examples | MySQL, PostgreSQL, Oracle, SQL Server, SQLite | MongoDB, Redis, Apache HBase, Cassandra, DynamoDB |
Use SQL when: You have structured data, need complex queries, require strong data integrity, or handle financial/transactional systems with ACID compliance.
Use NoSQL when: You have unstructured/semi-structured data, need horizontal scaling for massive datasets, build real-time apps, or require flexible/evolving schemas.
Installation Guide — All Platforms
Follow these step-by-step installation guides to set up MySQL (the most popular SQL database) on your operating system.
- Visit dev.mysql.com/downloads/installer
- Download MySQL Installer for Windows — choose the full package (~450 MB) or web installer
- For 32-bit: Download mysql-installer-community-x.x.x-32bit.msi
- For 64-bit: Download mysql-installer-community-x.x.x-64bit.msi
- Run the installer, choose Developer Default setup type
- Select components: MySQL Server, MySQL Workbench, MySQL Shell
- Configure: set root password, choose authentication method
- Set Windows Service name (default:
MySQL80) - Execute configuration — wait for green checkmarks
- Open MySQL Workbench to verify installation
- Test:
mysql -u root -pin Command Prompt
- Method 1 — Homebrew (Recommended):
- Install Homebrew:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)" - Run:
brew install mysql - Start MySQL:
brew services start mysql - Secure installation:
mysql_secure_installation - Method 2 — DMG Package:
- Visit dev.mysql.com/downloads/mysql
- Choose macOS 13 (x86_64) for Intel or macOS 13 (ARM64) for M1/M2/M3
- Download and open the
.dmgfile - Follow the wizard, set root password
- Add MySQL to PATH:
export PATH="/usr/local/mysql/bin:$PATH" - Install MySQL Workbench from official site for GUI access
- Ubuntu / Debian (64-bit):
- Update packages:
sudo apt update - Install:
sudo apt install mysql-server - Start service:
sudo systemctl start mysql - Enable on boot:
sudo systemctl enable mysql - Secure setup:
sudo mysql_secure_installation - Login:
sudo mysql -u root -p - Ubuntu 32-bit (legacy):
- Add i386 architecture:
sudo dpkg --add-architecture i386 - Install 32-bit libs:
sudo apt install mysql-server:i386 - CentOS / RHEL:
sudo yum install mysql-serversudo systemctl start mysqld- Get temp password:
sudo grep 'temp' /var/log/mysqld.log
- Download Termux from F-Droid (recommended) or Play Store
- Update Termux:
pkg update && pkg upgrade - Install MariaDB (MySQL compatible):
pkg install mariadb - Initialize database:
mysql_install_db - Start server:
mysqld_safe & - Wait 5 seconds, then login:
mysql -u root - Set root password:
ALTER USER 'root'@'localhost' IDENTIFIED BY 'yourpassword'; - Flush privileges:
FLUSH PRIVILEGES; - ARM 32-bit note: Termux on 32-bit Android supports MariaDB; use
pkg install mariadb - ARM 64-bit: Full support — same steps apply
- Test:
mysql -u root -pthenSHOW DATABASES; - Install GUI:
pkg install adminerfor a web-based interface
PostgreSQL Installation
# Windows — Download installer from postgresql.org/download/windows/ # Choose version, select 64-bit or 32-bit, run wizard # macOS — via Homebrew brew install postgresql@16 brew services start postgresql@16 psql postgres # Ubuntu/Debian Linux sudo apt update sudo apt install postgresql postgresql-contrib sudo systemctl start postgresql sudo -u postgres psql # CentOS/RHEL sudo yum install postgresql-server sudo postgresql-setup initdb sudo systemctl start postgresql # Termux (Android) pkg install postgresql initdb $PREFIX/var/lib/postgresql pg_ctl -D $PREFIX/var/lib/postgresql start psql -U $(whoami) postgres
SQL Server Installation (Windows)
# Step 1: Download SQL Server Developer Edition (free) # Visit: microsoft.com/en-us/sql-server/sql-server-downloads # Step 2: Run installer, choose installation type: # Basic — default config, fastest setup # Custom — choose features, useful for enterprise # Download Media — offline installer # Step 3: Accept license terms, set installation directory # Step 4: Configure authentication: # Windows Authentication Mode — uses Windows login # Mixed Mode — SQL + Windows (recommended for development) # Step 5: Name your instance (e.g., SQLEXPRESS or custom) # Step 6: Download and install SQL Server Management Studio (SSMS) # Visit: docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms # Step 7: Connect via SSMS # Server Name: localhost\INSTANCE_NAME # Authentication: Windows Authentication or SQL Authentication
Verifying Your SQL Installation
-- MySQL / MariaDB version check mysql --version SELECT VERSION(); -- PostgreSQL version check psql --version SELECT version(); -- SQL Server version check SELECT @@VERSION; -- Show all databases SHOW DATABASES; -- MySQL/MariaDB \l -- PostgreSQL SELECT name FROM sys.databases; -- SQL Server
PostgreSQL — Deep Dive
PostgreSQL (also called Postgres) is a powerful, open-source object-relational database system with over 30 years of active development. It is often described as the world’s most advanced open-source relational database and is widely adopted in enterprise environments.
| Feature | Description |
|---|---|
| Data Types | Supports all standard types PLUS JSON, XML, Geometric (Point, Line, Circle, Polygon), Arrays, and custom user-defined types |
| Data Integrity | Primary keys, foreign keys, UNIQUE, NOT NULL, CHECK, EXCLUSION constraints, explicit locking |
| Performance | Indexing, sophisticated query planner, MVCC (Multi-Version Concurrency Control), JIT compilation, table partitioning |
| Reliability | Write-Ahead Logging (WAL), replication, point-in-time recovery, active standbys, tablespaces |
| Security | Row-level security, column-level encryption, SSL/TLS, role-based access control, audit logging |
| Extensibility | Stored functions in PL/pgSQL, Python, Perl, Tcl; foreign data wrappers; custom operators; 1000+ extensions via PGXN |
PostgreSQL Unique Features vs MySQL
-- Create schema (namespace for tables) CREATE SCHEMA company; SET search_path TO company; -- PostgreSQL supports JSON natively CREATE TABLE orders ( order_id SERIAL PRIMARY KEY, data JSONB ); INSERT INTO orders (data) VALUES ('{"customer": "Alice", "items": ["book", "pen"], "total": 45.99}'); -- Query JSON fields SELECT data->>'customer' AS customer_name FROM orders; -- Window functions (unique to PostgreSQL among open-source DBs) SELECT emp_id, first_name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank, AVG(salary) OVER () AS company_avg FROM employee; -- UUID generation CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; SELECT uuid_generate_v4(); -- Generates unique UUID each call -- Creating a stored procedure (PostgreSQL style) CREATE OR REPLACE PROCEDURE insert_employee( p_first_name TEXT, p_last_name TEXT ) LANGUAGE plpgsql AS $$ BEGIN INSERT INTO employee(first_name, last_name) VALUES (p_first_name, p_last_name); END; $$; CALL insert_employee('John', 'Smith');
Microsoft SQL Server — Complete Guide
Microsoft SQL Server is a full-featured enterprise Relational Database Management System (RDBMS) that uses Transact-SQL (T-SQL) — Microsoft’s proprietary extension of SQL — for querying and modifying data. It integrates deeply with the Microsoft ecosystem including Azure, Power BI, and .NET.
Key Components of SQL Server
| Component | Role |
|---|---|
| Database Engine | Core service for storing, processing, and securing data; handles transactions, locking, and concurrency |
| SQL Server Agent | Windows service for scheduling and automating jobs (backups, maintenance plans, scripts) |
| SQL Server Browser | Routes incoming client requests to the appropriate SQL Server instance |
| SSAS (Analysis Services) | OLAP engine for multidimensional data analysis, data mining, and machine learning |
| SSRS (Reporting Services) | Report authoring, deployment, and management platform; integrates with Hadoop |
| SSIS (Integration Services) | ETL (Extract, Transform, Load) platform for data integration and workflow applications |
| Full-Text Search | Full-text queries against character-based data stored in SQL tables |
| SQL Server VSS Writer | Enables backup/restore of data files without running the SQL Server |
T-SQL Specific Features
-- TOP keyword (limit results in SQL Server) SELECT TOP 10 * FROM employee ORDER BY salary DESC; -- PIVOT — rotate rows into columns SELECT customer_id, [11], [22], [33] FROM ( SELECT customer_id, cost FROM supplier ) AS src PIVOT (AVG(cost) FOR customer_id IN ([11], [22], [33])) AS pvt; -- OFFSET-FETCH (pagination) SELECT * FROM employee ORDER BY emp_id OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY; -- MERGE statement MERGE target_table AS target USING source_table AS source ON (target.id = source.id) WHEN MATCHED THEN UPDATE SET target.name = source.name WHEN NOT MATCHED BY TARGET THEN INSERT (id, name) VALUES (source.id, source.name) WHEN NOT MATCHED BY SOURCE THEN DELETE; -- Temp tables (local vs global) CREATE TABLE #local_temp (id INT); -- Local: session only CREATE TABLE ##global_temp (id INT); -- Global: visible to all sessions -- Variables and conditional logic DECLARE @employee_count INT; SET @employee_count = (SELECT COUNT(*) FROM employee); PRINT @employee_count;
SQL for Data Science
Data science is fundamentally about extracting insights from data. SQL is the language that makes this possible at scale. Without SQL, a data scientist would have no efficient way to query, filter, aggregate, and prepare data from databases containing millions or billions of rows.
Why Data Scientists Need SQL
We generate more than 2.5 quintillion bytes of data every day. Before any machine learning model can be trained or any business decision made, this data must be extracted, cleaned, and transformed. SQL is the primary tool for all of these operations.
Common Data Science SQL Patterns
-- Explore dataset: See all employee records SELECT * FROM employee_details; -- Find unique job roles in the organization SELECT DISTINCT title FROM employee_details; -- Find unique department-job combinations SELECT DISTINCT department, title FROM employee_details; -- Filter: All employees in Fire Department SELECT * FROM employee_details WHERE department = 'Fire'; -- Employees NOT in Police Department SELECT * FROM employee_details WHERE department NOT IN ('Police'); -- Employees hired before year 2000 SELECT * FROM employee_details WHERE original_hire_date < '2000-01-01'; -- Average salary of surgeons SELECT AVG(salary_annual) AS avg_surgeon_salary FROM employee_details WHERE title = 'Surgeon'; -- Employees eligible for 25% raise above $30,000 SELECT * FROM employee_details WHERE 1.25 * salary_annual > 30000; -- Salary range filter SELECT * FROM employee_details WHERE salary_annual BETWEEN 24000 AND 50000; -- View employee name + salary + hire date (who joined before 2008) SELECT name, salary_annual, original_hire_date FROM employee_details WHERE original_hire_date < '2008-02-01';
Why MySQL is Preferred for Data Science
| Reason | Detail |
|---|---|
| Open Source & Free | No licensing costs — ideal for research and startups |
| Easy to Learn | Basic SQL knowledge is sufficient to start querying immediately |
| Scalability | Handles databases with 50+ million rows; up to 4 GB by default |
| Python Integration | Seamlessly integrates with pandas, SQLAlchemy, and Jupyter notebooks |
| Cross-Platform | Works on Windows, Linux, macOS — same queries everywhere |
| Speed | Benchmark-proven fast for read-heavy analytical workloads |
| Client-Server Architecture | Multiple applications can connect simultaneously — enables shared data workflows |
| API Support | APIs for C, C++, Java, Perl, PHP, Python, Ruby — perfect for ML pipelines |
Advantages of SQL
| # | Advantage | Explanation |
|---|---|---|
| 1 | Well-Defined Standards | SQL follows ANSI/ISO standards — queries are written in a consistent, unambiguous way across vendors |
| 2 | Easy to Learn | SQL uses natural English-like syntax (SELECT, FROM, WHERE) making it one of the most beginner-accessible languages |
| 3 | Multiple Views Support | SQL can create virtual tables (views) that protect data integrity and expose only necessary data to users |
| 4 | Portability | SQL queries can be executed on any compliant RDBMS with minimal modification as long as environments match |
| 5 | Interactive Language | Write complex queries to communicate with databases in near real-time; results are immediate and human-readable |
| 6 | Declarative Nature | You describe WHAT you want, not HOW to get it — the DBMS optimizes execution automatically |
| 7 | Data Security | DCL commands (GRANT, REVOKE) provide fine-grained access control; roles and permissions protect sensitive data |
| 8 | Data Integrity | Constraints (PRIMARY KEY, FOREIGN KEY, CHECK, UNIQUE) enforce business rules at the database level |
| 9 | Multi-User Access | Concurrent users can read and write data simultaneously thanks to transaction isolation |
| 10 | Scalability | SQL databases can handle enormous amounts of data with proper indexing, partitioning, and optimization |
| 11 | Aggregation & Analysis | Powerful aggregate functions (SUM, AVG, COUNT, MIN, MAX) enable complex data analysis in a single query |
| 12 | Join Operations | SQL’s JOIN mechanism allows complex relationships between tables to be queried efficiently |
| 13 | Standardized Transactions | ACID properties ensure data remains consistent even during concurrent access or system failures |
| 14 | Industry Demand | SQL is the #1 most requested data skill — 42.7% of all data job postings require it (Indeed.com, 2023) |
| 15 | Integration Ecosystem | SQL integrates with virtually every programming language, BI tool, ETL platform, and cloud service |
Disadvantages of SQL
| # | Disadvantage | Explanation |
|---|---|---|
| 1 | Complex Interface | SQL’s interface can be complex for some users — certain operations require intricate nested queries or joins |
| 2 | Not Suited for Unstructured Data | SQL struggles with unstructured data (images, videos, social media posts) — NoSQL handles these better |
| 3 | Rigid Schema | Schema changes are expensive operations — adding a column to a billion-row table can take hours |
| 4 | Vertical Scaling Limitations | SQL databases scale primarily by upgrading hardware (vertical), which has physical and economic limits |
| 5 | Vendor-Specific Dialects | MySQL, PostgreSQL, SQL Server, and Oracle each have different syntax extensions — portability is not perfect |
| 6 | Object-Relational Impedance Mismatch | SQL’s tabular model doesn’t map naturally to object-oriented programming — ORM tools add overhead |
| 7 | Cost at Scale | Enterprise RDBMS licenses (Oracle, SQL Server) are extremely expensive for large-scale deployments |
| 8 | Overhead for Simple Operations | For extremely simple key-value lookups, SQL’s overhead can be slower than NoSQL solutions like Redis |
| 9 | Limited Support for Big Data | SQL databases are not natively designed for the distributed computing models required by Hadoop/Spark ecosystems |
| 10 | No Native Full-Text Search | While some SQL databases support full-text search, it’s generally inferior to dedicated search engines like Elasticsearch |
| 11 | Difficult to Handle Hierarchical Data | Representing tree-like or nested structures in relational tables requires complex recursive CTEs |
| 12 | Performance on Complex Analytics | Very complex analytical queries (multiple joins, subqueries on large datasets) can be slow without careful optimization |
Top SQL Interview Questions & Answers
These are the most frequently asked SQL interview questions for data analysts, data engineers, backend developers, and database administrators.
SELECT MAX(salary) FROM employee WHERE salary < (SELECT MAX(salary) FROM employee);. Alternatively using TOP: SELECT TOP 1 salary FROM (SELECT DISTINCT TOP 2 salary FROM employee ORDER BY salary DESC) AS temp ORDER BY salary ASC;.SQL Career Guide
SQL is one of the most valuable skills in the technology and data industry. Its applications span virtually every domain — healthcare, finance, retail, government, education, and more.
Industries Using SQL Daily
| Industry | SQL Use Cases | Example Systems |
|---|---|---|
| Education | Student records, exam results, attendance, online assessments | LMS databases, university ERPs |
| Healthcare | Patient records, doctor-staff coordination, inventory, appointments | Hospital Management Systems |
| Retail / E-Commerce | Customer data, product inventory, personalized recommendations, sales analytics | Amazon, Flipkart, Shopify backends |
| Financial Services | Transaction processing, fraud detection, portfolio management, risk assessment | Banking core systems, trading platforms |
| Government | Census data, tax records, public service databases | National ID systems, revenue departments |
| Technology | User data, application logs, A/B test results, feature flags | SaaS product databases |
| Logistics | Shipment tracking, warehouse management, route optimization | Supply chain management systems |
Job Roles That Require SQL
SQL Learning Roadmap
-
Foundation: SQL Basics (Week 1-2)
Master SELECT, WHERE, ORDER BY, INSERT, UPDATE, DELETE. Practice on a sample database like the Sakila or Northwind database.
-
Intermediate: Joins & Aggregations (Week 3-4)
Deep dive into all JOIN types, GROUP BY, HAVING, aggregate functions, and subqueries. Build a multi-table project.
-
Advanced: Schema Design & Optimization (Week 5-7)
Study normalization (1NF–BCNF), indexing strategies, query execution plans, and performance tuning techniques.
-
Professional: Stored Procedures, Triggers & Views (Week 8-10)
Write complex stored procedures, triggers for automation, views for data abstraction, and user-defined functions.
-
Specialization: Choose Your Path (Week 11+)
Pick a direction: PostgreSQL for web development & open-source, SQL Server for Microsoft/enterprise environments, MySQL for general web apps, or BigQuery/Snowflake for cloud data warehousing.
-
Certification: Get Validated
Pursue certifications like Edureka SQL Certification, Microsoft Certified: Azure Data Fundamentals, Oracle Database SQL Certified Associate, or the Google Data Analytics Certificate.
SQL is the foundational language for working with data in a world powered by information. From small startups to the world’s largest enterprises, SQL drives the systems that store, protect, and surface data insights. Whether you’re querying a 10-row test table or a billion-row data warehouse, the core principles remain the same — and mastering them opens doors to virtually every technology career path in existence today.
🎓 Ready to Master SQL?
Start practicing with free tools like MySQL Workbench, pgAdmin, or DBeaver. The best way to learn SQL is by writing queries on real datasets.
Install SQL Now →
0 Comments