SQL Basics: Convention, Commands, Constraints, and Operators

Soo Kim
9 min readOct 5, 2022

--

https://www.wevolver.com/article/making-databases-faster

Currently, I work as a Flutter front-end developer, but I am keenly interested in becoming a full-stack developer and having been looking for opportunities to increase my exposure to back-end development.

Thankfully, my coworker, a senior back-end developer, has opened up that door so that on our next project, I can dabble in some back-end related work. He suggested that I start off by doing some SQL practice (which is so much more complicated than the simple sqflite package I’ve been using in Flutter). Here’s my thoughts on some conventions and brief explanation + examples of commands, constraints, operators, etc.

Table of Index

  • Style convention
    - Snake-case (lowercase_underscore) for column and table names
    - Capitalization for keywords and reserved words
    - Prefix Ambiguous Terms
    - Singular/Plural table names and singular column names
    - Foreign Key
  • Commands
    - DDL
    (Data Definition Language): create, drop, alter, rename, truncate
    - DML (Data Manipulation Language): insert into, update, delete from, select from
    Functions — count, min, max, sum, avg, concat(SELECT)
    Keywords — distinct, order by, limit, as (SELECT)
    Operators — and, or, not, is null, is not null (used with WHERE clause)
    - DQL (Data Query Language): select
    - DCL (Data Control Language):
    - TCL (Transaction Control Language):
  • Constraints — not null, unique, primary key, foreign key, check, default, create index
  • Others — like, in, between, wildcards
  • Join
  • How to deal with lists?

Style Convention

I think the “best convention” is to have a consistent style throughout your code — whatever the style may be — because technically, there’s no right or wrong when it comes to styles. That being said, these are some practices that I decided to follow because it made sense to me (aside from the obvious ones like name conflicts).

Snake-case (lowercase_underscore) for column and table names

This helps separate the field/table names from SQL style convention of capitalizing statement names/clauses/keywords (although yes, you can use lowercase). Honestly, a matter of personal choice (or following the convention at work).

SELECT name, phone_number, department_id FROM employees;

Capitalization for keywords and reserved words & lowercase for data types

Some do not like their CODE SCREAMING AT THEM, but I personally think it’s not too many characters that would be SCREAMING AT THEM. I think it also depends on what style you use for columns and table names. If you use PascalCase, it might make sense not to capitalize the keywords and reserved words. Again, this is a matter of personal choice or following work convention.

Prefix Ambiguous Terms & Do not Use Unconventional Abbreviation

If you have two columns that use the same word, you should definitely use a prefix, such as employee_id and department_id. Although some people prefer all column fields to be prefixed (eg. employee_name, employee_phone_number) for when you are joining two tables, I don’t think that’s necessary because you can use the tablename.column_name to distinguish them. But definitely, be as precise as possible (job_position is better than position).

Singular/Plural table names and singular column names

I’ve read different opinions on singular and plural table names…but I think it all depends on the table itself. For example, if a table is a collection of rows that describe multiple entities (like employees), it should be plural. On the other hand, if it collectively describes one entity (like marketing_team), then it should be singular. As for columns, if it were to be plural, I think it’d be better to make a table of its own.

Foreign Key

Foreign Key should be the Primary Key of another table and it should not be the primary key and foreign key at the same time (exception could be tables with a one-to-one relationship). However, I’ve decided that unless absolutely necessary, for now, I won’t be using much foreign keys because it gets slightly annoying to modify the primary key that is used as a foreign key somewhere.

Todo One-to-one relationship: 
CREATE TABLE employees (
employee_id int PRIMARY KEY AUTO_INCREMENT,
name varchar(50) NOT NULL,
phone_number varchar(25),
age int NOT NULL
);
CREATE TABLE marketing_team_employees (
employee_id int PRIMARY KEY,
job_position varchar(50) NOT NULL,
FOREIGN KEY (employee_id) REFERENCES employees(employee_id)
);

Commands

The four main categories of SQL commands are as follows:
1. DDL — Data Definition Language
2. DML — Data Manipulation Language
3. DQL — Data Query Language
4. DCL — Data Control Language
5. TCL — Transaction Control Language

DDL (Data Definition Language)

As the name suggests, DDL is used to define — create, modify or remove — the structure of objects (not the data) in a database. It deals with database schema and can be used to create new tables, indexes, sequences etc.

The 3 most common DDL command types are CREATE, DROP, and ALTER and can be used with multiple commands (e.g. CREATE TABLE). Commands include: database, table, location, sequence, index, alias, view etc. Other command types include TRUNCATE, which basically retains the structure of a table but removes all records and RENAME.

// Copy another table 
CREATE TABLE employees_copy AS
SELECT name, phone_number
FROM employees;
CREATE TABLE employees_copy AS SELCT * FROM employees;ALTER TABLE employees
ADD department varchar(50); // add new columm
ALTER TABLE employees
MODIFY COLUMN department int; // modify data type
ALTER TABLE employees
RENAME COLUMN department TO department_id; // modify column name
ALTER TABLE employees
DROP COLUMN department_id; // remove column
DROP TABLE employees; // remove table completely

DML (Data Manipulation Language)

DML commands are used to modify the data in a database.

  • INSERT INTO <table_name> (column list) VALUES (values);
    if adding values for all columns: INSERT INTO <table_name> VALUES;
INSERT INTO employees (name, age)
VALUES ("Soo Kim", 30);
INSERT INTO employees
VALUES (1, "Soo Kim", "123-456-7890", 30);
  • UPDATE <table_name> SET column_name = value WHERE condition;

WHERE can be used with operators — AND, OR, NOT, IS NULL, IS NOT NULL, LIKE, etc.

UPDATE employees 
SET phone_number = "000-111-222" WHERE name = "Soo Kim";
  • DELETE FROM <table_name> (WHERE condition);
DELETE FROM employees 
WHERE name = "Soo Kim" AND NOT employee_age <= 30;
  • SELECT * FROM <table_name>; // all columns
    (technically DQL but when you add FROM or WHERE, it becomes part of DML)
SELECT * FROM employees
WHERE department_id = 1;
SELECT DISTINCT department_id FROM employees; // returns only different values of department_idSELECT COUNT(employee_id) AS "Marketing Team Member Count" FROM employees
WHERE department_id = 1;
// returns the number of rows (COUNT)
// return column name: Marketing Team Member Count (AS) = also known as aliases
SELECT COUNT(DISTINCT department_id) FROM employees;
// returns the number of departments
SELECT name AS youngest_employee FROM employees
ORDER BY age
LIMIT 1;
// sort in ascending order
// return only the first record
SELECT * FROM employees
ORDER BY age DESC, name;
// sorted by age descending order, and then name in ascending order
SELECT MIN(age) FROM employees // vs MAX(...)
WHERE department_id = 1;
// returns smallest value of age
SELECT AVG(age) FROM employees AS average_age;
// returns average age
SELECT SUM(order_quantity) FROM orders; // returns sum of ordersSELECT customer_id, name, CONCAT(address, ", ", city, " ", country, " ", postal_code) AS address
FROM customers; //
SELECT employees.employee_id, bonuses.bonus_amount, employees.bank_name, employees.bank_account FROM employees, bonuses
WHERE employees.employee_id = bonuses.employee_id
ORDER BY employees.employee_id;

DQL (Data Query Language)

DQL commands are used to get data within the schema objects in a database and perform queries, and it includes SELECT.

DCL (Data Control Language)

DCL commands deal with rights, permissions, and other ways to “control” the database system (GRANT and REVOKE).

TCL (Transaction Control Language)

if you’re using the terminal to do sql, you would need to use TCL commands, but if you’re using a platform to work with your database, commands like rollback and commit should be automatically carried out.

  • ROLLBACK: Rollbacks a transaction in case of any error occurs.
  • SAVEPOINT: Sets a savepoint within a transaction.
  • SET TRANSACTION: Specify characteristics for the transaction.
  • COMMIT: Commits a transaction

Constraints

Constraints (rules for the data) are declared after the datatype of a column when creating or altering a table.

  • NOT NULL
  • UNIQUE — all values in a column are different
  • PRIMARY KEY — unique key that identifies each row (NOT NULL + UNIQUE)
  • FOREIGN KEY — prevents actions that would destroy links between tables; refers to the primary key of another table
    - table with foreign key = child table
    - table with primary key = referenced / parent table
  • CHECK — limit the values in a column
  • DEFAULT — a default value for a column if no value is specified
CREATE TABLE young_adult_users (
user_id int PRIMARY KEY AUTO_INCREMENT,
age int NOT NULL CHECK(age>=18 AND age<30),
has_subscribed_to_email tinyint(1) DEFAULT true
);
  • CREATE INDEX — to create and retrieve data from db very quickly; only create indexes on columns that will be frequently searched, because updating tables with indexes takes longer than updating those without
CREATE INDEX index_name 
ON table_name (column_name);

Wildcards, Other Operators (Like, In, Between), and Aliases

A wildcard character can be used to substitute characters in a string and is used with the LIKE operator, which is used to search for a specified pattern.

  • % : 0+ characters
SELECT * FROM employees 
WHERE name LIKE "s%"; // start with s (eg. Soo)
WHERE name LIKE "%s"; // end with s (eg. Lewis)
WHERE name LIKE "%oo%"; // have "oo" in any position (eg. Soo)
WHERE name LIKE "d%l"; // start with d and end with l (eg. Daniel)
  • _ : single character
SELECT * FROM employees 
WHERE name LIKE "_e%"; // have e in 2nd position (eg. Lewis)
WHERE name LIKE "da__%"; // start with da, and are at least 3 characters in length (eg. Daniel)

The IN operator, basically a shorthand for multiple OR operations, allow you to specify multiple values in a WHERE clause (or NOT IN).

SELECT * FROM employees
WHERE employee_id IN (1, 2, 3);
SELECT * FROM employees
WHERE department_id IN (SELECT department_id FROM employees WHERE name = 'Soo Kim'); // select all employees that have the same department_id as employee with name "Soo Kim"

The BETWEEN operator selects values within a given range (inclusive — begin and end values included), and it works with numbers, text, or dates.

SELECT * FROM employees
WHERE age BETWEEN 20 AND 30;
SELECT * FROM burgers
WHERE price NOT BETWEEN 10 AND 20
AND burger_id NOT IN (1, 2, 3);

Joins

A JOIN clause is used to combine rows from multiple tables based on a common column. There are a few types of joins: inner, left (outer), right (outer) and full (outer).

  • (INNER) JOIN — returns all records that have matching values
SELECT * FROM employees
INNER JOIN bonuses WHERE employees.employee_id = bonuses.employee_id
ORDER BY employees.employee_id;
// returns all column & its values from both employees and bonuses if employee_id matches in both tables
// records from employees and then bonuses
  • LEFT (OUTER) JOIN — returns all records from left table, and matched records from right table; if no records exist with the same matching value on the right table, it’ll show up as NULL
  • RIGHT (OUTER) JOIN — returns all records from the right table, and matched records from the left table
  • FULL (OUTER) JOIN — returns all records when there is a match in either left or right table

How to Deal With Lists?

Unlike NoSql, databases that use sql do not support lists, which can become frustrating. There are two ways which I think are viable when dealing with lists of values.

  • junction table: have separate tables identifying each entity, and have a junction table contains primary keys of the tables you want to relate. That way, you can find and analyze information related to each primary key.
CREATE TABLE students (
student_id ...,
name ...,
birthday ...,
);
CREATE TABLE classes (
class_id ...,
name ...,
professor_id ...,
professor_name ...,
semester ...,
academic_year ...,
);
CREATE TABLE professors (
professor_id ...,
name ...,
birthday ...,
);
CREATE TABLE students_classes (
id ...,
student_id ...
class_id ...
);
SELECT COUNT(*) FROM students_classes
WHERE student_id = (SELECT student_id FROM students WHERE name = "Soo Kim");
// count how many classes student "Soo Kim" is taking
SELECT COUNT(*) FROM students_clases
WHERE class_id = (SELECT class_id FROM classes WHERE academic_year = 2022 AND semester = "FALL" AND name = "French 101");
// count how many students are class "French 101" in 2022 Fall semester
CREATE TABLE professors_classes (
id ... ,
professor_id ...,
class_id ...,
);
  • separator: this is the most primitive yet easy way. Use a separator (like comma, semi-colon, hyphen, underscore etc) between the values. However, I would only use this method for values will not need information regarding the values, and 1:M relationships.
UPDATE employees_copy
SET siblings = "Hannah, Shawn"
WHERE employee_id = 1;
// assumes that no other information for siblings will be provided or needed.

--

--