Relational database management systems (RDBMS), including Oracle, organize data values into table columns, each with a specific data type. Each data type associates its own properties with its values, so that Oracle treats values of one data type differently from values of another. For example, a string value enclosed in a pair of quotes, say '23-NOV-91', will be interpreted as a textual value, when it is stored as a character data type. However, the same value will represent a time-related information when it is stored in a datetime column.
The native data types supported by Oracle can be roughly classified into: numeric, character, datetime, intervals, and large object (LOB). Here's a closer look at each.
Numeric Data Types
Numeric data types are used to represent numerical values, including whole numbers and fractional parts. Oracle's numeric data type has four subtypes: NUMBER, FLOAT, BINARY_FLOAT, and BINARY_DOUBLE[1]. These types vary in their storage requirements and precision with which they represent values. NUMBER is the most commonly used type in practical applications while the other three types are designed to provide flexibility for accommodating specific, yet infrequent, requirements.
NUMBER(p, s)
NUMBER is the Oracle's most versatile and widely used numeric data type. Its constructor accepts both precision (p) and scale (s) as parameters, which together define the number of digits that can be stored in a table column.
- Precision (p): A value between 1 and 38, specifying the total number of digits, with a default of 38. Entering a value exceeding the specified precision will result in an error.
- Scale (s): A value between -84 and 127, specifying the number of digits to the right of the decimal point. If s is negative, it represents the number of digits to the left of the decimal point. When a number with more decimal digits than the specified s is entered, it is rounded accordingly. Particularly, if s is negative, rounding applies to the digits to the left of decimal point.
Here are some examples:
Data Type | Value Entered | Value Stored As | Description |
---|---|---|---|
NUMBER | 123.54 | 123.54 | Stored as entered. |
NUMBER(3) | 123.54 | 124 | s defaults to 0 and is rounded to nearest integer. |
NUMBER(3,2) | 123.54 | Error ORA-01438 | p is 3, yet five digits are entered. Value will not be stored and Oracle raises an error. |
NUMBER(5,2) | 123.54 | 123.54 | Stored as entered. |
NUMBER(7,1) | 123.54 | 123.5 | s is 1. Entered value is rounded to the first decimal. |
NUMBER(7,-1) | 123.54 | 120 | s is -1. Entered value is rounded to the nearest tenth. |
NUMBER(7,-1) | 125.54 | 130 | s is -1. Entered value is rounded to the nearest tenth. |
NUMBER(4,5) | 0.1234 | Error ORA-01438 | Value will not be stored and Oracle raises an error. |
NUMBER(4,5) | 0.01234 | 0.01234 | Stored as entered. |
NUMBER(4,7) | 0.01234 | Error ORA-01438 | Value will not be stored and Oracle raises an error. |
NUMBER(4,7) | 0.0001234 | 0.0001234 | Stored as entered. |
NUMBER(3,7) | 0.0001234 | Error ORA-01438 | Value will not be stored and Oracle raises an error. |
NUMBER(3,7) | 0.0000123 | 0.0000123 | Stored as entered. |
NUMBER(3,7) | 0.00001234 | 0.0000123 | The number of digits up to the 7th decimal point is 123; 4 in the 8th decimal point is rounded. |
Note: The acceptable value range for p and s does not represent the largest values the database can store; rather, they define the largest and smallest specifications for the values you can have.
When you see that precision is greater than scale (e.g., NUMBER(4,5)), it means three things:
- The number will not have an integer part, only fractional part. The leading 0 in the integer part is not counted as part of the precision. However, if a non-zero digit is entered in the integer part (e.g., entering 1.0234 for NUMBER(4,5)), Oracle will raise an error.
- The scale represents the total number of digits in the fractional part. For example, when entering 0.0001234 into a NUMBER(4,7) column, the scale of 7 indicates the 7 digits after the decimal point: 0001234.
- The fractional part has two parts: significant digits and zeros. The significant digits are determined by the precision, and the minimum number of zeros is determined by (scale - precision). For example, NUMBER(4,7) column has precision of 4 and scale of 7. Thus, the number of significant digits should be 4 and there must be at least \(7-4 = 3\) zeros in the fractional part. Thus, 0.0001234 is valid for the column. Similarly, 0.00001234 is also valid, but will be rounded to 0.0000123. On the other hand, 0.01234 is not a valid value for the column as it has only one zero in the fractional part.
In practice, you'll typically use NUMBER without any parameters. It is so versatile that it can dynamically adjust scales of the entered value (parameter p defaults to 38, which is the maximum supported by Oracle). In fact, the precision and scale parameters are not meant to define "allowable values"; rather, they are used on the purpose of imposing restrictions on the number of digits, so that developers can have more control over storage.
FLOAT, BINARY_FLOAT, and BINARY_DOUBLE
Aside from NUMBER, Oracle offers three additional subtypes to store and manage numerical data. These types, however, virtually have no practical usage in many cases.
FLOAT is a numeric data type that represents values only with the precision p. The default precision for FLOAT is 53 and can range from 1 to 126, but it cannot be 0 or negative. So, this data type is designed for decimal values between, but not including, -1 and 1. Internally, a FLOAT value is represented as NUMBER; however, while NUMBER is based on the decimal system, FLOAT is based on the binary system. This requires extra caution when converting FLOAT type data into NUMBER, which makes FLOAT less commonly used in practice.
BINARY_FLOAT and BINARY_DOUBLE are used even less frequently than FLOAT. These two types can store 32-bit floating point number and 64-bit floating point numbers, requiring 4 and 8 bytes of storage, respectively. However, Oracle is primarily designed for business systems--such as HR, CRM, or accounting--rather than scientific computing. So, there is nearly no point to employ BINARY_FLOAT or BINARY_DOUBLE in many real practices. In fact, columns defined with these two types lead to precision issues when performing arithmetic operations for typical business applications.
Aside: Using Comma As a Decimal Point
Suppose you live in a locale where the conventions are reversed, using a dot as a thousand separator and a comma as the decimal separator. If you insert a value, 1234,567, for example, to a NUMBER(7,3) column, it won't work as intended. This is because, by default, Oracle expects the comma to separate one column from another. So, it will interpret the value, 1234,567, as two separated values: 1234 and 567.
In Oracle, NLS (National Language Support) parameters are the settings that define the language, territory, character set, and sorting order for each database session. To accommodate a numeric value with reversed separator conventions, you should run the following command:
ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ',.';
The ALTER SESSION command manages session settings within the current Oracle session and the NLS_NUMERIC_CHARACTERS parameter determines the characters to be used for the decimal and group separator. By default, this parameter is set to '.,', with a dot (.) as the decimal separator and a comma (,) as the group separator for thousands. So the command above temporarily modifies current session to reverse the current NLS_NUMERIC_CHARACTERS, allowing '1234,567'[2] to be interpreted as a single value.
Character Data Types
In Oracle SQL, character strings refers to any textual values enclosed by a pair of single quotations. These strings can contain any character supported by the database character set, including:
- Letters: Uppercase (A-Z) and lowercase (a-z)[3]
- Numbers: Digits 0-9
- Special characters: These include commonly used symbols like spaces, punctuation marks (!, @, #, $, %, ^, &, *, -, +, =, , |, {, }, [, ], ., <, >, ?, /)[4]
Oracle offers four character subtypes to store textual data: CHAR, VARCHAR2, NCHAR, and NVARCHAR2.
CHAR(n)
Columns defined with the CHAR data types store fixed-length character strings, meaning any character strings shorter than the specified length will be padded with blank spaces. Parameter n determines the length of the stored values, with a minimum of 1 and a maximum of 2000 characters[5].
For example:
Observe that the given text string, 'Hello, Oracle' is padded with blank spaces to reach the length of 20. On the other hand, attempting to enter a value longer than n will result in an error. For example:
Alternatively, you can specify the length in characters instead of bytes[6] by adding the CHAR keyword after n. This is particularly useful when you expect the entered values to potentially contain multi-byte characters. For example:
In the SQL script shown above, my_column has a data type of CHAR(5 CHAR), meaning it can store character strings with 5 characters (not bytes). Three character strings, each with five characters, are then entered. Hangul and Hiragana characters take 2 bytes each, while Roman alphabet characters take 1 byte each. Regardless, all three strings are successfully inserted into the table as they have five characters.
Note: When using CHAR(n CHAR) for multi-byte characters, there can be storage overhead. For example, '안녕하세요' (meaning "Hello") contains 5 Hangul characters, each occupying 2 bytes. If stored in a CHAR(5 CHAR) column, there are 5 bytes storage overhead, so that the byte length of the stored string is 15. Similarly, when storing 'ä½ å¥½' (meaning "Hello" and requiring 2 bytes each) to the same column, byte length of the stored value will be \(2 \times 2 + 5 = 9\):
VARCHAR2(n)
The term VARCHAR2 stands for "variable character," signifying its ability to store character strings with variable lengths. The parameter n defines the maximum allowable length for values stored in the column. Similar to CHAR data type, you can explicitly specify whether n represents bytes of characters by putting the keywords BYTE or CHAR after n. If omitted, the defaults is n BYTE. However, unlike CHAR, specifying n is mandatory when declaring a table column with the VARCHAR2 data type, with a valid range from 1 to 4000. Additionally, VARCHAR2 does not pad its values with blanks; instead, its length adjusts dynamically to match the actual string length.
For example:
NCHAR(n) and NVARCHAR2(n)
NCHAR and NVARCHAR2 are designed to store Unicode characters, with each character occupying a fixed two bytes. These data types support various languages and symbols, overcoming the limitations of single-byte character sets.
- NCHAR stores Unicode characters with a fixed length of n characters, where n is the size defined during column creation (e.g., NCHAR(10) allocates space for 10 characters). The maximum length of an NCHAR column is 2000 bytes, with a default size of 1. It is important to note that NCHAR uses fixed number of bytes per character. Even if a character represented by one logical character, it will still consume 2 bytes of storage. If you expect the input to consist of single-byte characters (e.g., plain English text), consider using CHAR instead, to optimize storage.
- NVARCHAR2 supports Unicode characters with a variable length, defined by the size specified during column creation (e.g., NVARCHAR2(20) can hold strings up to 20 characters.) The maximum length of an NVARCHAR2 column is 4000 bytes.
Datetime and Interval Data Types
Oracle supports the DATE, TIMESTAMP, and INTERVAL data types to efficiently handle such operations.
DATE
Columns defined with DATE data type are used to store date information, including the year, month, and day. Values entered into a DATE column must adhere to valid formats specified by the NLS_DATE_FORMAT parameter[7]. By default, Oracle uses the format 'DD-MON-YY' for DATE columns, e.g., '23-NOV-91'. Below are the range of valid values for each date field:
- YEAR: -4712 to 9999 (BC 4712 to AD 9999)
- MONTH: 01 to 12
- DAY: 01 to 31 (depending on the month and year)
For example:
TIMESTAMP(n)
The TIMESTAMP data type in Oracle extends the functionality of the DATE data type by including HOUR, MINUTE, and SECOND (includes fractional seconds, defaulting to milliseconds). These fields should appear after the date-related fields, delimited by colons (:). For example:
Observe that fractional seconds exceeding the pre-specified n digits are truncated.
Below are the range of valid values for the time-related fields:
- HOUR: 00 to 23
- MINUTE: 00 to 59
- SECOND: 00 to 59.9(n) where 9(n) is the precision of time in fractional seconds.
TIMESTAMP(n) WITH TIME ZONE
The TIMESTAMP WITH TIME ZONE further extends the functionality of the TIMESTAMP data type by including time zone information. It stores the date, time, fractional seconds, and a time zone offset from UTC of the location of the database server (or a specified region name). So, this data type is particularly useful for applications that operate across different geographical regions, as it ensure consistent handling of time-sensitive data regardless of time zone differences.
The time zone offset is represented as +/-HH:MI, while region-based time zones account for daylight saving time changes. Time zones can be modified for the session using the following command:
ALTER SESSION SET TIME_ZONE = 'time-zone';
To list the valid string values for the 'time-zone' in the command above, you can query the V$TIMEZONE_NAMES view as follows:
SELECT * FROM V$TIMEZONE_NAMES;
TIMESTAMP(n) WITH LOCAL TIME ZONE
This variation differs from TIMESTAMP WITH TIME ZONE in that the time zone offset is not stored with the column's value. Instead, when the value is retrieved, the time zone information is presented to the user in their local time zone. While the offset itself is not retained within TIMESTAMP WITH LOCAL TIME ZONE, the system dynamically computes the offset when the value is accessed from a different time zone. Thus, end users perceive the time in their respective local time zone, irrespective of the time zone where database server is located. The value of n is representing the fractional decimals; if omitted, it defaults to 6.
For example:
Oracle Database supports time zone representations when dealing with TIMESTAMP data type. Here's a summary of key concepts of time zones in Oracle:
- Database Time Zone: The default time zone for the database, typically set to the server's location. It governs the default interpretation of TIMESTAMP stored in the database, if not specific time zone is provided.
- Session Time Zone: The time zone used for the current user session. It determines how timestamps are displayed and manipulated within the session and can be specified with ALTER SESSION SET TIME_ZONE = 'time_zone_region'.
- Time Zone Names: Oracle recognizes various time zone formats, including named regions (e.g., America/Los_Angeles) and offsets from UTC(e.g., UTC-08:00).
- Daylight Saving Time: Oracle handles DST automatically based on the chosen time zone and its corresponding rules.
In Oracle SQL, TIMESTAMP WITH TIME ZONE stores date, time, and associated time zone. For example:
Just as when defining a TIMESTAMP, you could also add a number in parentheses to specify fractional decimals. For example:
When a column in a table is defined as TIMESTAMP WITH TIMEZONE, you should also specify the time difference. For example:
In the figure shown above, the TIMESTAMP value '2024-01-16 18:40:55' is based on the default time zone, which is the geolocation in where the database is located. On the other hand, when you specified the specific time zones, either based on UTC offset or the time zone names, the records will represent the associated time zone information:
Alternatively, you can also define a column as TIMESTAMP(n) WITH LOCAL TIME ZONE. This variation differs from TIMESTAMP WITH TIME ZONE in that the time zone offset is not stored with the column's value. Instead, when the value is retrieved, it is presented to the user in their local time zone: while the offset itself is not retained within TIMESTAMP WITH LOCAL TIME ZONE, the system dynamically computes the offset when the value is accessed from a different time zone. Thus, end users perceive the time in their respective local time zone, irrespective of the time zone where database server is located. The value of n is representing the fractional decimals; if omitted, it defaults to 6.
Time Zone Variables, Operators, and Functions
Oracle also provides some built-in variables, operators, and functions to handle time zone values. The DBTIMEZONE is a variable that stores the database time zone, whereas SESSIONTIMEZONE stores the time zone of the current session. LOCALTIMESTAMP is another variable that stores the current timestamp in the session time zone.
Note: TIMESTAMP WITH TIME ZONE and TIMESTAMP WITH LOCAL TIME ZONE are data types, not variables. Don't get confused with LOCALTIMESTAMP, which is a variable.
One of the operators that I find useful in practice is AT TIME ZONE. It converts a TIMESTAMP value on its left-hand side to the time zone on its right-hand side. For example:
Oracle built-in function FROM_TZ converts a timestamp literal to a TIMESTAMP WITH TIME ZONE value. For example:
Aside: NLS_DATE_FORMAT and NLS_TERRITORY
The default date format for each Oracle session is defined by NLS_DATE_FORMAT and NLS_TERRITORY. Thus, to change the default date format of 'DD-MON-YY', you should redefine these two NLS parameters. For example, in the United Kingdom, date literals formatted as 'DD-MON-RR', where DD is the two-digit day, MON is the three-letter abbreviation for the month, and RR is the two-digit year. Note that values of RR ranging from 00 to 49 are assumed to be in the 21th century (2000 - 2049), while RR value ranging from 50 to 99 are assumed to be in the 20th century (1950 - 1999).
To see the value specified for the initialization parameter NLS_TERRITORY in your database, use SHOW PARAMETERS NLS_TERRITORY:
To see the current format of date literal, use the following query:
To modify the date literal format for the current session, you should run the ALTER SESSION command. For example, the following command sets it to 'YYYY-MM-DD':
Below is a table listing the elements of the NLS_DATE_FORMAT parameter that you can use for DATE type data:
Element | Description | Element | Description |
---|---|---|---|
AD/A.D. BC/B.C. | Anno Domini or Before Christ indicator, with or without periods. | AM/A.M. PM/P.M. | Morning or afternoon hours, with or without periods. |
CC/SCC | Century. | D | Day of the week, 1 through 7. |
DAY | The name of the day spelled out. | DD | Day of the month, 1 through 31. |
DDD | Day of the year, 1 through 366. | DL | Long date format. Appearance determined by NLS_TERRITORY and NLS_LANGUAGE, e.g., sample AMERICAN_AMERICA output is 'Saturday, November 23, 1991.' |
DS | Short date format. Appearance determined by NLS_TERRITORY and NLS_LANGUAGE, e.g., sample AMERICAN_AMERICA output is '11/23/1991.' | DY | Abbreviated name of day, e.g., SUN, MON, TUE, etc. |
E | Abbreviated era name. | EE | Full era name. |
FF | Fractional seconds. | FM | Used in combination with other elements to direct the suppression of leading or trailing blanks. |
FX | Exact matching between the character data and the format model. | IW | Week of the year, 1 through 53. |
I IY IYY | Last one, two, or three digits of the ISO year[7]. | J | Julian day, counted as the number of days since January 1, 4712 B.C. |
MON | Abbreviated name of month, e.g., JAN, FEB, MAR, etc. | MONTH | Name of month spelled out. |
PR | If negative, numbers are enclosed with angle brackets (<>). If positive, returned with leading and trailing spaces. PR follows specification, e.g., 9999PR. | RM | Roman numeral month. |
RR | Two digit years, where 00 through 49 interpreted as 21st century, 50 through 99 interpreted as 20th century. | RRRR | Four digit years. If only two digits are provided, it will be interpreted as RR. |
Q | Quarter of year. | WW | Week of the year, 1 through 53. Week 1 starts on the 1st day of the year and ends on the 7th day of the year. |
W | Week of the month, 1 through 5. Week 1 starts on the 1st day of the month and ends on the 7th day of the month. | X | Local radix character. This is the character used in a numeric representation to separate an integer from its fractional part. |
Y,YYY | Year with comma in position. | YEAR SYEAR | The year spelled out in English. The S version causes BC dates to display with a minus sign prefix. |
YYYY SYYYY | Four-digit year. The S version causes BC dates to display with a minus sign prefix | YYY YY Y | The last three digits, two digits, or one digits of the year. |
- / , . ; : | Punctuation that is accepted in place and passed through as is. | "text" | Literal value. Display as is. |
For example:
By default, timestamp literals should be formatted as 'DD-MON-YYYY HH:MM:SS.999999', where 9 represents the fractional digits. Similar to date literals, you can also modify the timestamp literal formats, using ALTER SESSION SET NLS_DATE_FORMAT command. List of NLS_DATE_FORMAT elements that you can use for the timestamp include those for date literals and extends to:
Element | Description | Element | Description |
---|---|---|---|
HH HH12 | Hour of the day, 1 through 12 (both are identical). 12 midnight is represented as 12. | HH24 | Hour of the day, 1 through 24. 12 midnight is represented by 00. |
MI | Minute, 1 through 59. | SS | Seconds 0 through 59. |
SSSS | Seconds past midnight, 0 through 86399. | TS | The short time format. Allowable only when specified with the DL or DS format model element, separated by blank space. |
TZD | Abbreviated time zone with daylight saving time. Valid only in timestamp and interval formats, 00 through 12. | TZH | Time zone hour. Valid only in timestamp and interval formats, 00 through 12. |
TZM | Time zone minute. Valid only in timestamp and interval formats, 00 through 59. | TZR | Time zone region information. Valid only in timestamp. |
One common mistake is using MM for minutes. In NLS_DATE_FORMAT parameters, MM represents double digit month, not minute! Minutes are represented by MI.
INTERVAL YEAR(n) TO MONTH
Unlike DATE and TIMESTAMP, which pinpoint specific moments in time, intervals allow you to represent durations between two different time points. Oracle offers two main INTERVAL data types: INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND. These two types are often used to store the results from adding or subtracting two DATE or TIMESTAMP columns.
This data type stores a duration defined by year and month values. The precision of the YEAR value, denoted by n, can range from 0 to 9 digits. For example, setting n to 9 in INTERVAL YEAR(n) TO MONTH allows representing durations up to nine figures of years (999,999,999 years). If omitted, n defaults to 2. Here is an example:
In script shown above, yr_to_mth is defined as INTERVAL YEAR(3) data type, allowing it to store any intervals ranging from -999 to 999 years. For example, '123-11' can be stored in this column, representing 123 years and 11 months. When displayed in the data table, it appears as '+123-11'. Note that the hyphen between the year and month components is not subtraction; it is delimiter between year and month.
Negative values in the interval represent durations in the past. For example, '-23-11' denotes a of minus 23 years and 11 months, equivalent to 287 months ago. Here, the minus sign at the beginning indicates years ago, while the hyphen between year and month is the delimiter.
Intervals can also be expressed solely in terms of years or months. For example, in line 10, there is a direct entry of 50 years. Similarly, in line 11, 1123 months are specified, which equals 93 years and 7 months.
INTERVAL DAY(n1) TO SECOND(n2)
This data type stores a time span defined in days, hours, minutes, and seconds, with precision for both days (n1) and seconds (n2). The valid range for n1 is 0-9, with a default setting of 2, determining the number of digits allowed when specifying the size of DAY component. On the other hand, the value for n2 represents the fractional seconds precision for the SECOND component, with an acceptable range of 0-9 and default value of 6. For example:
Here, the column day_to_sec is defined as INTERVAL DAY(3) TO SECOND(4). So this column can accommodate any number of days between -999 days and 999 days, along with up to 4 fractional seconds. Similar to INTERVAL YEAR TO MONTH, you can also input values specified in a single unit of day, hour, minutes, or second. For example, line 10 and 11 inserts '-50' hours and '1123' minutes, respectively. Oracle automatically converts the inserted values into number of days, hours, minutes, and seconds.
Large Objects
In addition to the major four categories of the data types we've discussed so far, Oracle SQL also has large objects, or LOBs, to handle unstructured data types. LOBs can generally be used similarly to other data types, except that they cannot be used as primary keys, and they are not compatible with DISTINCT, GROUP BY, ORDER BY, or joins. Tables may have multiple columns with LOB data types.
The LOBs include the following: BLOB, CLOB, and NCLOB.
BLOB is designed for storing large binary objects, such as image or video files. Declaration of BLOB does not require any precision or scale. The maximum size calculation involves several factors such as a starting size of 4GB, a parameter known as CHUNK, and the database block size setting.
CLOB is used for large text data elements. Similar to BLOB, its declaration doesn't involve precision or scale. The maximum size is calculated using a formula similar to that of the BLOB data type.
Lastly, NCLOB is designed for storing CLOB data in Unicode. The maximum size calculation aligns with that of the BLOB data type. Unicode is a character set offering a more universal standard than ASCII, supporting major languages more comprehensively. Given its widespread adoption in databases and web technologies, its usage is recommended, especially in applications with global and multilingual requirements.
0 Comments