SQL Tutorial
Part II
Summary: In part II of the tutorial we review several basic statements of the SQL language
Introduction
This is the second part of the SQL course. In this part, we will focus on various SQL commands from the creation of a table, its modification, and/or deletion.
We will focus, above all, on the SELECT command, which is, in my judgment, the most important of all.
I hope that this second part will be enjoyable and instructive for you.
Creating a table
Like we have seen in the first installment, to create a table, the CREATE command with the TABLE qualifier is used. In effect, the CREATE command serves to create:
- users: CREATE USER
- tables: CREATE TABLE
- indexes: CREATE INDEX
- views: CREATE VIEW
The CREATE command indicates to the manager that something is going to be created. Later we will cover what and how.
What we are interested in now is the creation of a table:
Syntax
CREATE TABLE name ( column type [DEFAULT value] [NOT NULL], ...
[INHERITS (inherits, ...)]
[CONSTRAINT constraints CHECK (test), CHECK (test)] );
Where:
name: |
Is the name that is given to the table and how it is referenced by any command |
Column: |
Is the name of the column |
Type: |
Is the type of data(varchar, char, int, date, time, timestamp), Postgres has other types of data, but they aren't compatible with ANSI SQL |
Value: |
The value that it will have by default |
Inherits: |
This is Postgres's own. It defines an inheritance from another table. This will create an entity that contains the columns of the table that we are creating and the ones inherited |
Nom_cons: |
This defines a rule of integrity to meet each time that a row is modified |
Test: |
Conditions to check |
Example:
CREATE TABLE countries (
cod_country integer NOT NULL,
name varchar(30))
CONSTRAINT cod_raro CHECK (cod_country > 0 AND cod_country < 154);
With this example we have created a table of countries. Each time that we insert a new row, these conditions will be meet:
Note
What does NULL mean? In SQL there exist two states, data and no data. In effect, we can be interested that a field of rows doesn't have data, like zero (0) as well as the spaces are data. SQL introduces the concept of NULL and works with it. A practical example:
I have a table of bills, with the following fields: customer, value, date_issued, date_paid
When I create the row, I will insert the data: customer, value, date_issued
I will leave the date paid null; in this way I will know which bills need to be paid with the following command:
SELECT * FROM bills WHERE date_paid IS NULL;
One can claim that a zero (0) in the field date_paid would do the same thing. It's true, except that zero (0) is not a date, and it prevents creating date_paid of date type and applying appropriate date functions.
Examples of creating with NULL:
insert into countries values (15, NULL);
Or:
insert into countries (cod_country) values (27);
The absence of the "name" field implies that this will receive a NULL value.
Modifying a table
In PostgreSQL, modification only deals with adding new columns.
ALTER TABLE table ADD name type;
Where:
Table |
Name of the table to modify |
Name |
Name of the column to add |
Type |
Type of data (see CREATE TABLE) |
Inserting data into a table
Now we will insert data into our table: SYNTAX:
INSERT INTO table [(column, column, ...)] VALUES (value-1, value-2,
...)
or else:
INSERT INTO table [(column, column, ...)] SELECT ....
As we have seen, there are two forms of inserting data into a table, either line by line or the result of a sub-select that can return one or more rows.
When we insert lines into a table, we will ALWAYS put data in all the columns including those that we don't mention, these are created with NULL values.
If, in the command, we don't specify which columns we are going to fill, it is understood that we are going to give data to all, example:
INSERT INTO countries VALUES (34, 'Spain');
This would be incorrect:
INSERT INTO countries VALUES (34);
But, this would be correct:
INSERT INTO countries (cod_country) VALUES (34);
I recommend that a command embedded in "C" programs or in database functions ALWAYS specify the columns that we are going to touch, in effect, if we add a new column to the table (ALTER TABLE), the next insert an error will occur, Example:
INSERT INTO countries VALUES (34, 'Spain');
INSERT 18301 1
ALTER TABLE countries add population integer
INSERT INTO countries VALUES (34, 'Spain');
This will give a parser error, given that the population data is missing
Note
PostgreSQL, doesn't generate an error. It creates the line with the "population" field NULL. This is only a particularity of PostgreSQL, any other SQL manager would give an error.
We still have another type of INSERT, which is fed from a sub-select.
This type of insert is used very often to create temporary tables or tables to carry out a concrete task of speculative calculations.
The part replaced is that which touches the data itself, this comes from the SELECT instructions that were carried out previously and the inserting of the data. The instruction SELECT can return one or more rows, this instruction SELECT has the same restrictions of the same SELECT.
Selecting data
I wanted to arrive at this point! :-))
We have covered required SQL commands, the SQL language without SELECT would be like beans without sausage.
The SELECT command allows us to access data, but with the reservation that searches, unions of tables, functions with the data, and with the search rules can be carried out.
An example:
select * from countries;
Another example:
SELECT a.name, SUM(population)
FROM countries a, states b, counties c
WHERE b.cod_country = a.cod_country
AND (c.cod_country = b.cod_country
AND c.state_code = b.state_code)
AND population IS NOT NULL
GROUP BY a.name
ORDER BY sum ASC;
Let me explain, I requested the populations of all of the countries ordered by the population in an ascending order.
For this I have added a new column (population) to the counties table. This would be like this:
create table counties (cod_country int,
state_code int,
county_code int,
county_name varchar(60),
population int);
insert into counties values (1, 1, 1, 'Country 1, State 1, County 1',
5435);
insert into counties values (2, 1, 1, 'Country 2, State 1, County 1',
7832);
insert into counties values (3, 1, 1, 'Country 3, State 1, County 1',
4129);
insert into counties values (1, 2, 1, 'Country 1, State 2, County 1',
76529);
insert into counties values (2, 2, 1, 'Country 2, State 2, County 1',
9782);
insert into counties values (3, 2, 1, 'Country 3, State 2, County 1',
852);
insert into counties values (1, 3, 1, 'Country 1, State 3, County 1',
3433);
insert into counties values (2, 3, 1, 'Country 2, State 3, County 1',
7622);
insert into counties values (3, 3, 1, 'Country 3, State 3, County 1',
2798);
insert into counties values (1, 1, 2, 'Country 1, State 1, County 2',
7789);
insert into counties values (2, 1, 2, 'Country 2, State 1, County 2',
76511);
insert into counties values (3, 1, 2, 'Country 3, State 1, County 2',
98);
insert into counties values (1, 2, 2, 'Country 1, State 2, County 2',
123865);
insert into counties values (2, 2, 2, 'Country 2, State 2, County 2',
886633);
insert into counties values (3, 2, 2, 'Country 3, State 2, County 2',
982345);
insert into counties values (1, 3, 2, 'Country 1, State 3, County 2',
22344);
insert into counties values (2, 3, 2, 'Country 2, State 3, County 2',
179);
insert into counties values (3, 3, 2, 'Country 3, State 3, County 2',
196813);
insert into counties values (1, 1, 3, 'Country 1, State 1, County 3',
491301);
insert into counties values (2, 1, 3, 'Country 2, State 1, County 3',
166540);
insert into counties values (3, 1, 3, 'Country 3, State 1, County 3',
165132);
insert into counties values (1, 2, 3, 'Country 1, State 2, County 3',
0640);
insert into counties values (2, 2, 3, 'Country 2, State 2, County 3',
65120);
insert into counties values (3, 2, 3, 'Country 3, State 2, County 3',
1651462);
insert into counties values (1, 3, 3, 'Country 1, State 3, County 3',
60650);
insert into counties values (2, 3, 3, 'Country 2, State 3, County 3',
651986);
insert into counties values (3, 3, 3, 'Country 3, State 3, County 3',
NULL);
commit work;
Now, we can't do it with ALTER TABLE, but should use UPDATE and I haven't explained it, so use the "cut & paste" and everyone will be happy :-))
Now we can execute the QUERY, and the results should be:
name | sum
- ---------+-------
country 1| 705559
country 2|1212418
country 3|2804018
(3 rows)
Now we will verify this:
select sum(population) from counties where cod_country = 1;
Results:
sum
- ------
791986
(1 row)
!!!!!! One Difference !!!!!!
Let's look at the states table, it is missing the state 3, we do:
INSERT INTO states VALUES (3, 1, 'State 3, Country 1');
INSERT INTO states VALUES (3, 2, 'State 3, Country 2');
INSERT INTO states VALUES (3, 3, 'State 3, Country 3');
and repeat the command, resulting in:
name | sum
- ---------+-------
country 1| 791986
country 2|1872205
country 3|3003629
We are missing the state 3 from each country.
Now for those that have been lost, remember that the joins between tables were EXACT, it only extracts data if the condition are exact.
Let's look at the first part of the WHERE: b.cod_country = a.cod_country
This means that I join the country table with the states where the country code is equal, now remember the country data that we inserted:
Don't execute this, is only used as an illustration.
create table countries (cod_country integer, name varchar(30));
insert into countries values (1, 'country 1');
insert into countries values (2, 'country 2');
insert into countries values (3, 'country 3');
commit work;
Now the states data:
create table states (state_code int,
cod_country int,
state_name varchar(30));
insert into states values (1, 1, 'State 1, Country 1');
insert into states values (2, 1, 'State 2, Country 1');
insert into states values (1, 2, 'State 1, Country 2');
insert into states values (2, 2, 'State 2, Country 2');
insert into states values (1, 3, 'State 1, Country 3');
insert into states values (2, 3, 'State 2, Country 3');
commit work;
All of the states 3 from every country are missing, but in the country table the corresponding data of the states 3, so it is normal that we don't add the country data with the states with code 3 to be discarded in the second part where:
AND (c.cod_country = b.cod_country
AND c.state_code = b.state_code)
The state exists in the counties table but NOT in the states table.
For those that haven't understood, take an aspirin, go walk your dog (if you don't have a dog, go walk without a dog), breath a little fresh air and comeback to begin from the first exercise.
Is is very important to understand how the joining of data is carried out, without this the development that we do can have unpredictable results.
Let's shift gears and start with the SELECT command syntax.
SELECT [DISTINCT] expression1 [AS nom-attribute] {, expression-i [as
nom-attribute-i]}
[INTO TABLE classname]
[FROM from-list]
[WHERE where-clause]
[GROUP BY attr_name1 {, attr_name-i....}]
[ORDER BY attr_name1 [ASC | DESC ] [USING op1 ] {,
nom-attribute-i...}]
[UNION {ALL} SELECT ...]
Step by step:
DISTINCT: |
This is to eliminate duplicate row returned |
expression1: |
what we want returned, normally a column for a table from the list FROM |
AS nom-attribute: |
an alias for the name of a column, example:
manu=> select cod_country from countries;
cod_country
- -----------
1
2
3
(3 rows)
Now with the alias:
manu=> select cod_country as countr from countries;
countr
- ------
1
2
3
(3 rows)
|
INTO TABLE: |
allow the resulting row to be inserted directly into another table (see INSERT ... SELECT...) |
FROM: |
list of input tables |
WHERE: |
selection statement (union and selection criteria). |
GROUP BY: |
group criteria, certain functions that are used in expressions might need grouping |
ORDER BY: |
Order criteria of the returned rows, ASC ascending order, DESC descending order, USING if the column that defines the order is not in the list (expression) |
UNION ALL SELECT: |
This says to add to the results of the first SELECT this second SELECT that can be different tables, but return the same number of columns. |
We have seen that the commands SELECT not only returns items from the DB but it can also modify them:
SELECT SUM(salary * 1.1) - SUM(salary) AS increment FROM employees;
This will return the pay increment of a pay increase more then 10%.
Let's look at the functions that we have available:
COUNT(): |
returns the number of rows that aren't NULL |
SUM(): |
returns the total sum of a column of numbers |
AVG(): |
returns the average of a column of numbers |
MIN(): |
returns the minimum value in a column |
MAX(): |
returns the maximum value in a column |
FLOAT(int): |
returns a FLOAT8, FLOAT(12345) |
FLOAT4(int): |
returns a FLOAT4, FLOAT4(12345) |
INT(float): |
returns an INT from a FLOAT/4, INT(123.456) |
LOWER(text): |
returns lower-case text |
UPPER(text): |
returns upper-case text |
LPAD(text, long, char): |
fill to the left with char the length of long the column text |
RPAD(text, long, char): |
fill to the right with char the length of long the column text |
LTRIM(text, char): |
remove from the left of text all of the characters char |
RTRIM(text, char): |
remove from the right of text all of the characters char |
POSITION(string IN text): |
extract from text the position of string, but IT DOESN'T WORK |
SUBSTR(text,from[,to]): |
extract the substring of text, from the position from to the position of to or the end of the string |
DATETIME(date, hour): |
converts a datetime format to date (YYYY-MM-DD) and an hour (HH:MM) |
These were a few of the functions that exist in SQL, these are those that are defined in ANSI SQL and are also present in Postgres95.
Details of WHERE
Until now, we have seen that the section of WHERE from the SELECT we put things like:
AND column = value
This is a small example of what we put or combine:
AND, OR, NOT, IN, IN ALL, =, !=, >, <, (SELECT....), LIKE also the parenthesis are relevant, example:
WHERE
column IN (SELECT DISTINCT column FROM table WHERE ....)
column IN ('value1','value2','value3',...)
(column = 'value' and column = 'other_value' OR column != 'value')
!= the same as saying NOT EQUAL
LIKE permits searching for a string within a column with wildcards:
WHERE column LIKE '%Pepito%'
The % is a wildcard, in the example, it's true if "Pepito" is in the string
WHERE column LIKE 'Pepito%'
will be true if "Pepito" is at the beginning of the string
WHERE column LIKE '%Pepito'
will be true if "Pepito" is at the end of the string
There's not enough time here to list all of the possible options used with WHERE, the limit is in the imagination of the programmer or the limits of the individual process parser.
Now we can leave the SELECT command and concentrate on the last two.
UPDATE command
The UPDATE command allows modification of one or more rows, depending on the defined condition in WHERE SYNTAX:
UPDATE table SET column-1 = expression-1
[, column-i = expression-i]
[WHERE condition]
Where:
table: |
the table to modify, only one table can be modified at a time |
column: |
the column that will be modified |
expression: |
the value that the column will receive. This value can be static or the results of a function |
condition: |
the condition that define the modification boundaries, here the defined rules are applied for the SELECT |
DELETE commandThe DELETE command allows modification of one or more row of a table.
SYNTAX
DELETE FROM table
[WHERE condition]
Where:
table: |
the table where rows will be deleted, only one table can be deleted from at a time |
condition: |
the condition that define the modification boundaries, here the defined rules are applied for the SELECT
NOTE: without the WHERE ALL of the rows will be deleted from the table
|
|