Using PostgreSQL

  1. What is PostgreSQL?
  2. PostgreSQL is an open-source relational database management system based on the SQL (Structured Query Language) standard for relational databases. For more information, see http://www.postgresql.org.

    In the CS department, PostgreSQL is running under Redhat Linux, and its interface is accessible from any Solaris 7 machine.

  3. How do I connect to a PostgreSQL database?
  4. PostgreSQL databases can be accessed in many ways. This document covers only the psql command-line interface; see the PHP Tutorial for information on accessing PostgreSQL databases with PHP. For more information about the many other interfaces, which include several procedural languages such as Perl, as well as ODBC support, see http://www.postgresql.org.

  5. A Brief Introduction to SQL
  6. This section presumes a basic understanding of relational databases. There are many introductions to relational databases on the web; for a starting point, try http://www.uk.research.att.com/pub/docs/att/tr.98.2.pdf. Some of the information in this section has been adapted from the SQL commands section of the PostgreSQL User Guide; throughout this document, many explanations have links to specific sections of this guide for more information.

    1. Creating, deleting, and modifying tables and indices
    2. To create a table:

      CREATE TABLE [TEMP] <tablename> (<field1> <type1>,<field2> <type2>,...,<fieldn> <typen>)

      This command will create a table with the named fields and data types; if the TEMP parameter is specified, the table only exists until the current database session ends. For example:

      CREATE TABLE customers (firstname varchar(20), lastname varchar(20) NOT NULL, address varchar(40), city varchar(20), state char(2) DEFAULT "TN", zip int4);

      would create a table named customers with five text fields of various sizes, and an integer field. The NOT NULL keyword specifies that when a record is inserted into the table a value must be provided for the lastname field; the DEFAULT keyword specifies that the string that follows, TN, will be filled into the state field if it is left blank when a record is inserted.

      There are many data types available in PostgreSQL, including some complex data types beyond those specified in the SQL standard. In particular, PostgreSQL stands out for the special data types it provides for manipulation of geometric data and IP addresses. More information about all the data types available can be found in the user's guide; a few of the most commonly used are:

      char(n)

      Character string with a fixed length of n characters

      varchar(n)

      Character string with a variable length of up to n characters

      text

      Variable-length character string

      int2, int4, int8

      Signed 2, 4, or 8 byte integer

      float4, float8

      4 or 8 byte floating-point number

      date

      Date without time of day

      time

      Time of day

      timestamp

      Date and time

      serial

      Used to create a unique sequence of numbers; each item added to the table will receive the first unused number in this field when added.

      If a table has a lot of data in it, lookups into it may become very time-consuming. To reduce the search time, you can create an index on one or more (but no more than 16) fields of the table. An index will allow rapid access of records based on indexed fields; however, it will make inserting and deleting records slower, as the index must be updated each time the table is modified. An index can also be used to guarantee uniqueness of an item in a table; for example, if you had a "customers" table and wanted to be sure each customer had a unique customer number, you could create a unique index on the customer number field.

      To create an index:

      CREATE [UNIQUE] INDEX indexname ON tablename (column1, column2, ..., columnn);

      Note that a table can have more than one index; often, a table will have a unique index on one or more fields and also have another, non-unique index on one or more other fields to allow ease of searching on fields that do not need to have unique values.

      To delete a table:

      DROP TABLE tablename

      This will remove a table and any associated indices from the database.

      Note that if the table contained any fields of serial type, the sequences created in the database for these fields is NOT automatically removed; you will have to explicitly issue the drop sequence command:

      DROP SEQUENCE tablename_fieldname.

      To delete an index without deleting the table it is associated with:

      DROP INDEX indexname

      If a large amount of data is to be loaded into or unloaded from a table, it may be advantageous to drop the indices from that table so that they do not need to be updated for each record; note however that unique indices should be generally retained to guarantee data integrity.

      To modify an existing table:

      ALTER TABLE tablename ADD columnname columntype;
      adds a column named columnname with type columntype to table tablename.
      ALTER TABLE tablename RENAME columnname TO newcolumnname;
      changes the name of column columnname to newcolumnname
      ALTER TABLE tablename ALTER columnname SET DEFAULT value;
      sets the default value of column columnname to value
      ALTER TABLE tablename ALTER columnname DROP DEFAULT;
      removes the current default value from column columnname
      ALTER TABLE tablename RENAME TO newtablename;
      changes the name of table tablename to newtablename

      Note that PostgreSQL does NOT currently support removal of a column from a table; you must drop and recreate the table.

    3. Selecting records from tables
    4. To select records from a single table:

      SELECT column1, column2, ..., columnn
      FROM tablename
      WHERE criteria;

      For the following examples, assume that three tables have been created; how to actually insert records into a table will be covered in the next section.

      The tables:

      items:

      item_number

      description

      type

      color

      1

      Red Widget

      widget

      red

      2

      Blue Widget

      widget

      blue

      3

      Red Shmoo

      shmoo

      red

      4

      Blue Shmoo

      shmoo

      blue

      components:

      item_number

      component

      1

      part A

      1

      part B

      1

      part C

      2

      part A

      2

      part D

      3

      part E

      3

      part F

      3

      part G

      4

      part A

      4

      part H

      4

      part I

      stock:

      component

      quantity_on_hand

      part A

      120

      part B

      30

      part C

      20

      part D

      20

      part E

      10

      part F

      30

      part G

      15

      part H

      25

      part I

      5

      Example 1: Selecting all records from a table

      The wildcard * can be used in place of the names of columns in a select statement if all columns are to be returned. So, the statement:

      SELECT *
      FROM items;

      Returns:

      item_number

      description

      type

      color

      1

      Red Widget

      widget

      red

      2

      Blue Widget

      widget

      blue

      3

      Red Shmoo

      shmoo

      red

      4

      Blue Shmoo

      shmoo

      blue

      Example 2: Selecting certain columns

      Any columns from a table can be selected in any order. So, the statement:

      SELECT type, item_number
      FROM items;

      Returns:

      type

      item_number

      widget

      1

      widget

      2

      shmoo

      3

      shmoo

      4

      Example 3: Selecting certain rows, part 1

      To select rows based on certain criteria, specify a condition in the where clause of the select statement.

      For example,

      SELECT *
      FROM items
      WHERE type = 'widget';

      Returns:

      item_number

      description

      type

      color

      1

      Red Widget

      widget

      red

      2

      Blue Widget

      widget

      blue

      The condition specified can also be >, <, >=, or <= as appropriate.

      Example 4: Selecting certain rows, part 2

      If you only want to see rows that satisfy more than one condition, you can append as many conditions as desired using AND. For example,

      SELECT *
      FROM stock
      WHERE quantity_on_hand > 10
      AND quantity_on_hand < 30;

      Returns:

      Component

      quantity_on_hand

      part C

      20

      part D

      20

      part G

      15

      part H

      25

      Example 5: Using OR

      If you want to see the logical union of multiple sets of records, you can join conditions together with OR. (Note that SQL does have a UNION keyword, but it doesn't mean the same thing; see the PostgreSQL documentation on UNION .) For example:

      SELECT *
      FROM stock
      WHERE quantity_on_hand < 10
      OR quantity_on_hand >30;

      Returns:

      Component

      quantity_on_hand

      part A

      120

      part I

      5

      Example 6: Subqueries

      Suppose you wanted to know the quantity on hand for components used in item 1. To find this out, you would first need to find the components used from the components table, then look up the quantities for those components in the stock table. SQL provides an easy way to combine these lookups with subqueries, which allow you to select rows having values drawn from a SELECT from another table. For example:

      SELECT *
      FROM stock
      WHERE component IN
      (SELECT component
      FROM components
      WHERE item_number = 1);

      Returns:

      Component

      quantity_on_hand

      part A

      120

      part B

      30

      part C

      20

      Example 7: Selecting from multiple tables

      Now suppose you wanted to know what the quantity on hand was for components used in widgets. You could use two subqueries:

      SELECT *
      FROM stock
      WHERE component IN
      (SELECT component
      FROM components
      WHERE item_number IN
      (SELECT item_number
      FROM items
      WHERE type = 'widget'));

      Fortunately SQL provides a powerful mechanism, known as a join, for crossing multiple tables that makes long nested queries like this unnecessary. To perform a join, specify all the tables you will be selecting from in the FROM clause and specify in the WHERE clause the columns that correspond in the tables. So, to perform the above query:

      SELECT stock.component, stock.quantity_on_hand
      FROM items, components, stock
      WHERE items.item_number = components.item_number
      AND components.component = stock.component
      AND items.type = 'widget';

      Returns:

      Component

      quantity_on_hand

      part A

      120

      part A

      120

      part B

      30

      part C

      20

      part D

      20

      One advantage of the nested subquery method is that while with a join, the same row is returned twice in this case, it would appear only once with the subquery method. See the section on selecting distinct values below for a way to avoid this problem.

      Note that whenever two tables in a query have columns with the same name, the column name must be specified as tablename.columnname to indicate which table the column is to come from. To save typing when table names are long, you can assign aliases to the tables in the FROM clause and use the aliases in place of the table names. For example, the above query could also be written:

      SELECT s.component, s.quantity_on_hand
      FROM items i, components c, stock s
      WHERE i.item_number = c.item_number
      AND c.component = s.component
      AND i.type = 'widget';

      Example 8: Ordering results

      If a query returns a lot of rows, it may be convenient to order them by one or more columns. For example, suppose you wanted to know what parts the stock was low on. You could order a selection from the stock table by quantity, like this:

      SELECT *
      FROM stock
      ORDER BY quantity_on_hand;

      Results:

      Component

      quantity_on_hand

      part I

      5

      part E

      10

      part G

      15

      part C

      20

      part D

      20

      part H

      25

      part B

      30

      part F

      30

      part A

      120

      You can also order by multiple columns, which will order by the first column, then by the second column where identical in the first column, etc. If you want to see the results in descending order, just add the keyword DESC after the column names in the ORDER BY clause.

      Example 9: Counting records

      If you only want to know how many records match a set of criteria, you can use the COUNT(*) function:

      SELECT COUNT(*)
      FROM components
      WHERE item_number = 3;

      Returns:

      count(*)

      3

      Example 10: Aggregates

      SQL also includes several aggregate functions, including SUM(), MAX(), and MIN().

      To use these, the GROUP BY clause is added to the end of the select statement to specify how records should be grouped together. For example:

      SELECT c.item_number, MIN(s.quantity_on_hand)
      FROM components c, stock s
      WHERE c.component = s.component
      AND c.item_number = 3
      GROUP BY c.item_number;

      Returns:

      item_number

      quantity_on_hand

      3

      10

      Example 11: Selecting distinct values

      Suppose you only wanted to see what different colors there are for products. You could select the color column from all rows, but you would get multiple copies of each result – not a big deal with four rows, but quite tedious for a large table. SQL remedies this problem with the DISTINCT keyword:

      SELECT DISTINCT color
      FROM items;

      Returns:

      color

      blue

      red

    5. Inserting, Updating, and Deleting Records
    6. To insert a record:

      INSERT INTO tablename (column1, column2, ..., columnn) VALUES(value1, value2, ..., valuen);

      Value1 will be placed into column1, value2 in column2, and so on, regardless of the order of the columns in the table; columns may be omitted and will be filled in with nulls (an error will occur if a column specified as NOT NULL is omitted). If values are provided for all columns in the table in the order in which they appear in the table, the list of columns may be omitted.

      To delete a record or records:

      DELETE FROM tablename
      WHERE condition;

      Any valid condition for a WHERE clause can be used to specify records to delete.

      To update a record or records:

      UPDATE tablename
      SET column1 = value1,
      column2 = value2,
      ...,
      columnn = valuen
      WHERE condition;

      Any or all of the columns in a table may be given new values with an update statement, and again, any valid condition for a WHERE clause may be used. For example, if this statement were executed:

      UPDATE items
      SET description = 'Green Shmoo',
      color = 'green'
      WHERE item_number = 4;

      Then the table would look like this afterwards:

      item_number

      description

      type

      color

      1

      Red Widget

      widget

      red

      2

      Blue Widget

      widget

      blue

      3

      Red Shmoo

      shmoo

      red

      4

      Green Shmoo

      shmoo

      green

    7. Moving data to and from files
    8. PostgreSQL provides a COPY command as an addition to the SQL standard for moving data between tables and files. However, the COPY command can only be used by database administrators because of its direct interaction with the database. To give file access functionality to other users, psql provides a wrapper to the COPY command, \copy.

      To use the copy command:

      \copy tablename [from|to] filename

      Copying to a file will create a file with the given name that includes all the rows of the table, one row per line, with the fields separated by tabs. To copy from a file, the file must be in the same format; data in the file will be added to the table.

       

     


    Text Copyright Anne-Scott Whitmire, 2001.