Thursday, November 15, 2012

Difference in Space Allocation information from DBC views

DBC.DiskSpace[X] This view gives AMP vproc information about disk space usage for any database or account. It gets this information from the database-level summary ('All') row, so it reports on the database as a whole rather than on individual tables.

DBC.TableSize[X] This view gives AMP vproc information about disk space usage (excluding spool) for any table or account.

DBC.AllSpace[X] This view gives AMP vproc information about disk space usage (including spool) for any database, table, or account. It also includes the summary row for the pseudo-table 'All', which can lead to confusion: summing the view without excluding that row double-counts the space.
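For a side-by-side look, queries such as the following summarize the space picture from two of these views (they assume SELECT access on the DBC views; DatabaseName, TableName, MaxPerm, and CurrentPerm are standard columns of these views):

```sql
-- Total vs. used perm space per database, summed across AMP vprocs
SELECT DatabaseName, SUM(MaxPerm), SUM(CurrentPerm)
FROM DBC.DiskSpace
GROUP BY 1 ORDER BY 3 DESC;

-- Per-table perm space (spool excluded), for spotting the largest tables
SELECT DatabaseName, TableName, SUM(CurrentPerm)
FROM DBC.TableSize
GROUP BY 1, 2 ORDER BY 3 DESC;
```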

Tuesday, August 28, 2012

All about COLLECT STATS

To see which statistics the optimizer would find helpful for a query, run the query after issuing the command below.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

Statistical information is vital for the optimizer when query plans are built. But collecting statistics can involve time and resources. By understanding and combining several different statistics gathering techniques, users of Teradata can find the correct balance between good query plans and the time required to ensure adequate statistical information is always available.

Collect Full Statistics
  • Non-indexed columns used in predicates
  • All NUSIs with an uneven distribution of values * 
  • NUSIs used in join steps
  • USIs/UPIs if used in non-equality predicates (range constraints)
  • Most NUPIs (see below for a fuller discussion of NUPI statistic collection)
  • Full statistics always need to be collected on relevant columns and indexes on small tables (less than 100 rows per AMP)
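For illustration, full statistics on the kinds of columns listed above are collected with statements like these (Sales_Tbl and its columns are hypothetical names, using the pre-TD14 COLLECT syntax):

```sql
COLLECT STATISTICS ON Sales_Tbl COLUMN (Store_Id);   -- non-indexed predicate column
COLLECT STATISTICS ON Sales_Tbl INDEX  (Txn_Date);   -- a NUSI used in join steps
HELP STATISTICS Sales_Tbl;                           -- review what has been collected
```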


Can Rely on Random AMP Sampling   

  • USIs or UPIs if only used with equality predicates
  • NUSIs with an even distribution of values
  • NUPIs that display even distribution, and if used for joining, conform to assumed uniqueness (see Point #2 under “Other Considerations” below)
  • See “Other Considerations” for additional points related to random AMP sampling


Option to use USING SAMPLE

  • Unique index columns
  • Nearly-unique columns or indexes**


Collect Multicolumn Statistics

  • Groups of columns that often appear together in conditions with equality predicates, if the first 16 bytes of the concatenated column values are sufficiently distinct. These statistics will be used for single-table estimates.
  • Groups of columns used for joins or aggregations, where there is either a dependency or some degree of correlation among them.***  With no multicolumn statistics collected, the optimizer assumes complete independence among the column values. The more correlated the actual combinations of values are, the greater the value of collecting multicolumn statistics.


Other Considerations

1. Optimizations such as nested join, partial GROUP BY, and dynamic partition elimination will not be chosen unless statistics have been collected on the relevant columns.
2. NUPIs that are used in join steps in the absence of collected statistics are assumed to be 75% unique, and the number of distinct values in the table is derived from that.  A NUPI that is far off from being 75% unique (for example, it’s 90% unique, or on the other side, it’s 60% unique or less) will benefit from having statistics collected, including a NUPI composed of multiple columns regardless of the length of the concatenated values. However, if it is close to being 75% unique, then random AMP samples are adequate. To determine what the uniqueness of a NUPI is before collecting statistics, you can issue this SQL statement:
EXPLAIN SELECT DISTINCT nupi_column FROM table_name;

3. For a partitioned primary index table, it is recommended that you always collect statistics on:
  • PARTITION. This tells the optimizer how many partitions are empty, and how many rows are in each partition. This statistic is used for optimizer costing.
  • The partitioning column. This provides cardinality estimates to the optimizer when the partitioning column is part of a query’s selection criteria.
4. For a partitioned primary index table, consider collecting these statistics if the partitioning column is not part of the table’s primary index (PI):
  • (PARTITION, PI). This statistic is most important when a given PI value may exist in multiple partitions, and can be skipped if a PI value only goes to one partition. It provides the optimizer with the distribution of primary index values across the partitions. It helps in costing the sliding-window and rowkey-based merge join, as well as dynamic partition elimination.
  • (PARTITION, PI, partitioning column). This statistic provides the combined number of distinct values for the combination of PI and partitioning columns after partition elimination. It is used in rowkey join costing.
5. Random AMP sampling has the option of pulling samples from all AMPs, rather than from a single AMP (the default). All-AMP random AMP sampling has these particular advantages:
  • It provides a more accurate row count estimate for a table with a NUPI. This benefit becomes important when NUPI statistics have not been collected (as might be the case if the table is extraordinarily large), and the NUPI has an uneven distribution of values.
  • Statistics extrapolation for any column in a table will not be attempted for small tables or tables whose primary index is skewed (based on full statistics having been collected on the PI), unless all-AMP random AMP sampling is turned on. Because a random AMP sample is compared against the table row count in the histogram as the first step in the extrapolation process, an accurate random AMP sample row count is critical for determining if collected statistics are stale, or not.
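Points 3 and 4 above translate into COLLECT statements like the following for a PPI table (Sales_Hist, Store_Id, and Sale_Date are hypothetical names; assume Store_Id is the PI and Sale_Date the partitioning column):

```sql
COLLECT STATISTICS ON Sales_Hist COLUMN (PARTITION);                      -- point 3
COLLECT STATISTICS ON Sales_Hist COLUMN (Sale_Date);                      -- point 3
COLLECT STATISTICS ON Sales_Hist COLUMN (PARTITION, Store_Id);            -- point 4
COLLECT STATISTICS ON Sales_Hist COLUMN (PARTITION, Store_Id, Sale_Date); -- point 4
```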

 * Uneven distribution exists when the High Mode Frequency (ModeFrequency column in interval zero) in the histogram is greater than the average rows-per-value (RPV) by a factor of 4 or more.  RPV is calculated as Number of Rows / Number of Uniques.
** Any column that is over 95% unique is considered a nearly-unique column.
*** Correlated columns within a multicolumn statistic are columns where the value in one may influence, or predict, the value in the other. For example, in a nation table there is a tight correlation between nationkey and nationname. In a customer table there might be a correlation, though a somewhat looser one, between customer zip code and customer income band.
Dependent columns are columns where the value in one column directly influences the value in the other. An example is customer zip code, which depends on the customer state; if both appeared in a multicolumn statistic, there would be a dependency between them. Another example is job title, which is sometimes dependent on industry segment, if both were in a multicolumn statistic.

To get the syntax for a command in Teradata

After logging on to Teradata through BTEQ or SQL Assistant, type the following command:

exec dbc.getsyntax('sql','create table');

getsyntax is a macro that accepts two parameters.

The first parameter is the type of utility the command belongs to. Valid values include the following:

PMPC
BULKLOAD
DUMP
ARCHIVE
MULTILOAD
VERSION
SPL
FASTEXPORT
DATA_TYPE
HELP
SQL
FASTLOAD
TPCCONS

The second parameter is the command name for which we need to find the syntax.

Examples include:
exec dbc.getsyntax('FASTLOAD','DEFINE');
exec dbc.getsyntax('MULTILOAD','.IMPORT');
exec dbc.getsyntax('SQL','UPDATE');

Thursday, August 09, 2012

Linux Command Reference

apropos whatis   # Show commands pertinent to string. See also threadsafe
man -t ascii | ps2pdf - > ascii.pdf   # Make a pdf of a manual page
which command   # Show full path name of command
time command   # See how long a command takes
time cat   # Start stopwatch. Ctrl-d to stop. See also sw
dir navigation
cd -   # Go to previous directory
cd   # Go to $HOME directory
(cd dir && command)   # Go to dir, execute command and return to current dir
pushd .   # Put current dir on stack so you can popd back to it
alias l='ls -l --color=auto'   # Quick dir listing
ls -lrt   # List files by date. See also newest and find_mm_yyyy
ls /usr/bin | pr -T9 -W$COLUMNS   # Print in 9 columns to width of terminal
find -name '*.[ch]' | xargs grep -E 'expr'   # Search 'expr' in this dir and below. See also findrepo
find -type f -print0 | xargs -r0 grep -F 'example'   # Search all regular files for 'example' in this dir and below
find -maxdepth 1 -type f | xargs grep -F 'example'   # Search all regular files for 'example' in this dir
find -maxdepth 1 -type d | while read dir; do echo $dir; echo cmd2; done   # Process each item with multiple commands (in while loop)
find -type f ! -perm -444   # Find files not readable by all (useful for web site)
find -type d ! -perm -111   # Find dirs not accessible by all (useful for web site)
locate -r 'file[^/]*\.txt'   # Search cached index for names. This re is like glob *file*.txt
look reference   # Quickly search (sorted) dictionary for prefix
grep --color reference /usr/share/dict/words   # Highlight occurrences of regular expression in dictionary
archives and compression
gpg -c file   # Encrypt file
gpg file.gpg   # Decrypt file
tar -c dir/ | bzip2 > dir.tar.bz2   # Make compressed archive of dir/
bzip2 -dc dir.tar.bz2 | tar -x   # Extract archive (use gzip instead of bzip2 for tar.gz files)
tar -c dir/ | gzip | gpg -c | ssh user@remote 'dd of=dir.tar.gz.gpg'   # Make encrypted archive of dir/ on remote machine
find dir/ -name '*.txt' | tar -c --files-from=- | bzip2 > dir_txt.tar.bz2   # Make archive of subset of dir/ and below
find dir/ -name '*.txt' | xargs cp -a --target-directory=dir_txt/ --parents   # Make copy of subset of dir/ and below
( tar -c /dir/to/copy ) | ( cd /where/to/ && tar -x -p )   # Copy (with permissions) copy/ dir to /where/to/ dir
( cd /dir/to/copy && tar -c . ) | ( cd /where/to/ && tar -x -p )   # Copy (with permissions) contents of copy/ dir to /where/to/
( tar -c /dir/to/copy ) | ssh -C user@remote 'cd /where/to/ && tar -x -p'   # Copy (with permissions) copy/ dir to remote:/where/to/ dir
dd bs=1M if=/dev/sda | gzip | ssh user@remote 'dd of=sda.gz'   # Backup hard disk to remote machine
rsync (Network efficient file copier: Use the --dry-run option for testing)
rsync -P rsync://rsync.server.com/path/to/file file   # Only get diffs. Do multiple times for troublesome downloads
rsync --bwlimit=1000 fromfile tofile   # Locally copy with rate limit. It's like nice for I/O
rsync -az -e ssh --delete ~/public_html/ remote.com:'~/public_html'   # Mirror web site (using compression and encryption)
rsync -auz -e ssh remote:/dir/ . && rsync -auz -e ssh . remote:/dir/   # Synchronize current directory with remote one
ssh (Secure SHell)
ssh $USER@$HOST command   # Run command on $HOST as $USER (default command = shell)
ssh -f -Y $USER@$HOSTNAME xeyes   # Run GUI command on $HOSTNAME as $USER
scp -p -r $USER@$HOST: file dir/   # Copy with permissions to $USER's home directory on $HOST
scp -c arcfour $USER@$LANHOST: bigfile   # Use faster crypto for local LAN. This might saturate GigE
ssh -g -L 8080:localhost:80 root@$HOST   # Forward connections to $HOSTNAME:8080 out to $HOST:80
ssh -R 1434:imap:143 root@$HOST   # Forward connections from $HOST:1434 in to imap:143
ssh-copy-id $USER@$HOST   # Install public key for $USER@$HOST for password-less log in
wget (multi purpose download tool)
(cd dir/ && wget -nd -pHEKk http://www.pixelbeat.org/cmdline.html)   # Store local browsable version of a page to the current dir
wget -c http://www.example.com/large.file   # Continue downloading a partially downloaded file
wget -r -nd -np -l1 -A '*.jpg' http://www.example.com/dir/   # Download a set of files to the current directory
wget ftp://remote/file[1-9].iso/   # FTP supports globbing directly
wget -q -O- http://www.pixelbeat.org/timeline.html | grep 'a href' | head   # Process output directly
echo 'wget url' | at 01:00   # Download url at 1AM to current dir
wget --limit-rate=20k url   # Do a low priority download (limit to 20KB/s in this case)
wget -nv --spider --force-html -i bookmarks.html   # Check links in a file
wget --mirror http://www.example.com/   # Efficiently update a local copy of a site (handy from cron)
networking (Note ifconfig, route, mii-tool, nslookup commands are obsolete)
ethtool eth0   # Show status of ethernet interface eth0
ethtool --change eth0 autoneg off speed 100 duplex full   # Manually set ethernet interface speed
iwconfig eth1   # Show status of wireless interface eth1
iwconfig eth1 rate 1Mb/s fixed   # Manually set wireless interface speed
iwlist scan   # List wireless networks in range
ip link show   # List network interfaces
ip link set dev eth0 name wan   # Rename interface eth0 to wan
ip link set dev eth0 up   # Bring interface eth0 up (or down)
ip addr show   # List addresses for interfaces
ip addr add 1.2.3.4/24 brd + dev eth0   # Add (or del) ip and mask (255.255.255.0)
ip route show   # List routing table
ip route add default via 1.2.3.254   # Set default gateway to 1.2.3.254
host pixelbeat.org   # Lookup DNS ip address for name or vice versa
hostname -i   # Lookup local ip address (equivalent to host `hostname`)
whois pixelbeat.org   # Lookup whois info for hostname or ip address
netstat -tupl   # List internet services on a system
netstat -tup   # List active connections to/from system
windows networking (Note samba is the package that provides all this windows specific networking support)
smbtree   # Find windows machines. See also findsmb
nmblookup -A 1.2.3.4   # Find the windows (netbios) name associated with ip address
smbclient -L windows_box   # List shares on windows machine or samba server
mount -t smbfs -o fmask=666,guest //windows_box/share /mnt/share   # Mount a windows share
echo 'message' | smbclient -M windows_box   # Send popup to windows machine (off by default in XP sp2)
text manipulation (Note sed uses stdin and stdout. Newer versions support inplace editing with the -i option)
sed 's/string1/string2/g'   # Replace string1 with string2
sed 's/\(.*\)1/\12/g'   # Modify anystring1 to anystring2
sed '/ *#/d; /^ *$/d'   # Remove comments and blank lines
sed ':a; /\\$/N; s/\\\n//; ta'   # Concatenate lines with trailing \
sed 's/[ \t]*$//'   # Remove trailing spaces from lines
sed 's/\([`"$\]\)/\\\1/g'   # Escape shell metacharacters active within double quotes
seq 10 | sed "s/^/      /; s/ *\(.\{7,\}\)/\1/"   # Right align numbers
sed -n '1000{p;q}'   # Print 1000th line
sed -n '10,20p;20q'   # Print lines 10 to 20
sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'   # Extract title from HTML web page
sed -i 42d ~/.ssh/known_hosts   # Delete a particular line
sort -t. -k1,1n -k2,2n -k3,3n -k4,4n   # Sort IPV4 ip addresses
echo 'Test' | tr '[:lower:]' '[:upper:]'   # Case conversion
tr -dc '[:print:]' < /dev/urandom   # Filter non printable characters
tr -s '[:blank:]' '\t' </proc/diskstats | cut -f4   # Cut fields separated by blanks
history | wc -l   # Count lines
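As a small end-to-end example of chaining the filters above (the input text is made up):

```shell
# Strip comment and blank lines, then squeeze runs of blanks into single tabs
printf '# sample config\n\nname  value\n' \
  | sed '/^ *#/d; /^ *$/d' \
  | tr -s '[:blank:]' '\t'
# prints the single line: name<TAB>value
```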
set operations (Note you can export LANG=C for speed. Also these assume no duplicate lines within a file)
sort file1 file2 | uniq   # Union of unsorted files
sort file1 file2 | uniq -d   # Intersection of unsorted files
sort file1 file1 file2 | uniq -u   # Difference of unsorted files
sort file1 file2 | uniq -u   # Symmetric Difference of unsorted files
join -t'\0' -a1 -a2 file1 file2   # Union of sorted files
join -t'\0' file1 file2   # Intersection of sorted files
join -t'\0' -v2 file1 file2   # Difference of sorted files
join -t'\0' -v1 -v2 file1 file2   # Symmetric Difference of sorted files
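A quick sanity check of the unsorted-file recipes above, using two throwaway files:

```shell
cd "$(mktemp -d)"
printf 'a\nb\nc\n' > f1
printf 'b\nc\nd\n' > f2
sort f1 f2 | uniq         # union: a b c d
sort f1 f2 | uniq -d      # intersection: b c
sort f1 f1 f2 | uniq -u   # difference f2 - f1: d
sort f1 f2 | uniq -u      # symmetric difference: a d
```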
math
echo '(1 + sqrt(5))/2' | bc -l   # Quick math (Calculate φ). See also bc
seq -f '4/%g' 1 2 99999 | paste -sd-+ | bc -l   # Calculate π the unix way
echo 'pad=20; min=64; (100*10^6)/((pad+min)*8)' | bc   # More complex (int) e.g. This shows max FastE packet rate
echo 'pad=20; min=64; print (100E6)/((pad+min)*8)' | python   # Python handles scientific notation
echo 'pad=20; plot [64:1518] (100*10**6)/((pad+x)*8)' | gnuplot -persist   # Plot FastE packet rate vs packet size
echo 'obase=16; ibase=10; 64206' | bc   # Base conversion (decimal to hexadecimal)
echo $((0x2dec))   # Base conversion (hex to dec) (shell arithmetic expansion)
units -t '100m/9.58s' 'miles/hour'   # Unit conversion (metric to imperial)
units -t '500GB' 'GiB'   # Unit conversion (SI to IEC prefixes)
units -t '1 googol'   # Definition lookup
seq 100 | (tr '\n' +; echo 0) | bc   # Add a column of numbers. See also add and funcpy
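The two base-conversion one-liners above round-trip cleanly (bc prints hex digits in upper case):

```shell
echo 'obase=16; ibase=10; 64206' | bc   # -> FACE
echo $((0xFACE))                        # -> 64206
```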
calendar
cal -3   # Display a calendar
cal 9 1752   # Display a calendar for a particular month year
date -d fri   # What date is it this friday. See also day
[ $(date -d '12:00 +1 day' +%d) = '01' ] || exit   # Exit a script unless it's the last day of the month
date --date='25 Dec' +%A   # What day does xmas fall on, this year
date --date='@2147483647'   # Convert seconds since the epoch (1970-01-01 UTC) to date
TZ='America/Los_Angeles' date   # What time is it on west coast of US (use tzselect to find TZ)
date --date='TZ="America/Los_Angeles" 09:00 next Fri'   # What's the local time for 9AM next Friday on west coast US
locales
printf "%'d\n" 1234   # Print number with thousands grouping appropriate to locale
BLOCK_SIZE=\'1 ls -l   # Use locale thousands grouping in ls. See also l
echo "I live in `locale territory`"   # Extract info from locale database
LANG=en_IE.utf8 locale int_prefix   # Lookup locale info for specific country. See also ccodes
locale -kc $(locale | sed -n 's/\(LC_.\{4,\}\)=.*/\1/p') | less   # List fields available in locale database
recode (Obsoletes iconv, dos2unix, unix2dos)
recode -l | less   # Show available conversions (aliases on each line)
recode windows-1252.. file_to_change.txt   # Windows "ansi" to local charset (auto does CRLF conversion)
recode utf-8/CRLF.. file_to_change.txt   # Windows utf8 to local charset
recode iso-8859-15..utf8 file_to_change.txt   # Latin9 (western europe) to utf8
recode ../b64 < file.txt > file.b64   # Base64 encode
recode /qp.. < file.qp > file.txt   # Quoted printable decode
recode ..HTML < file.txt > file.html   # Text to HTML
recode -lf windows-1252 | grep euro   # Lookup table of characters
echo -n 0x80 | recode latin-9/x1..dump   # Show what a code represents in latin-9 charmap
echo -n 0x20AC | recode ucs-2/x2..latin-9/x   # Show latin-9 encoding
echo -n 0x20AC | recode ucs-2/x2..utf-8/x   # Show utf-8 encoding
CDs
gzip < /dev/cdrom > cdrom.iso.gz   # Save copy of data cdrom
mkisofs -V LABEL -r dir | gzip > cdrom.iso.gz   # Create cdrom image from contents of dir
mount -o loop cdrom.iso /mnt/dir   # Mount the cdrom image at /mnt/dir (read only)
cdrecord -v dev=/dev/cdrom blank=fast   # Clear a CDRW
gzip -dc cdrom.iso.gz | cdrecord -v dev=/dev/cdrom -   # Burn cdrom image (use dev=ATAPI -scanbus to confirm dev)
cdparanoia -B   # Rip audio tracks from CD to wav files in current dir
cdrecord -v dev=/dev/cdrom -audio -pad *.wav   # Make audio CD from all wavs in current dir (see also cdrdao)
oggenc --tracknum='track' track.cdda.wav -o 'track.ogg'   # Make ogg file from wav file
disk space (See also FSlint)
ls -lSr   # Show files by size, biggest last
du -s * | sort -k1,1rn | head   # Show top disk users in current dir. See also dutop
du -hs /home/* | sort -k1,1h   # Sort paths by easy to interpret disk usage
df -h   # Show free space on mounted filesystems
df -i   # Show free inodes on mounted filesystems
fdisk -l   # Show disks partitions sizes and types (run as root)
rpm -q -a --qf '%10{SIZE}\t%{NAME}\n' | sort -k1,1n   # List all packages by installed size (Bytes) on rpm distros
dpkg-query -W -f='${Installed-Size;10}\t${Package}\n' | sort -k1,1n   # List all packages by installed size (KBytes) on deb distros
dd bs=1 seek=2TB if=/dev/null of=ext3.test   # Create a large test file (taking no space). See also truncate
> file   # Truncate data of file or create an empty file
monitoring/debugging
tail -f /var/log/messages   # Monitor messages in a log file
strace -c ls >/dev/null   # Summarise/profile system calls made by command
strace -f -e open ls >/dev/null   # List system calls made by command
strace -f -e trace=write -e write=1,2 ls >/dev/null   # Monitor what's written to stdout and stderr
ltrace -f -e getenv ls >/dev/null   # List library calls made by command
lsof -p $$   # List paths that process id has open
lsof ~   # List processes that have specified path open
tcpdump not port 22   # Show network traffic except ssh. See also tcpdump_not_me
ps -e -o pid,args --forest   # List processes in a hierarchy
ps -e -o pcpu,cpu,nice,state,cputime,args --sort pcpu | sed '/^ 0.0 /d'   # List processes by % cpu usage
ps -e -orss=,args= | sort -b -k1,1n | pr -TW$COLUMNS   # List processes by mem (KB) usage. See also ps_mem.py
ps -C firefox-bin -L -o pid,tid,pcpu,state   # List all threads for a particular process
ps -p 1,$$ -o etime=   # List elapsed wall time for particular process IDs
last reboot   # Show system reboot history
free -m   # Show amount of (remaining) RAM (-m displays in MB)
watch -n.1 'cat /proc/interrupts'   # Watch changeable data continuously
udevadm monitor   # Monitor udev events to help configure rules
system information (see also sysinfo) ('#' means root access is required)
uname -a   # Show kernel version and system architecture
head -n1 /etc/issue   # Show name and version of distribution
cat /proc/partitions   # Show all partitions registered on the system
grep MemTotal /proc/meminfo   # Show RAM total seen by the system
grep "model name" /proc/cpuinfo   # Show CPU(s) info
lspci -tv   # Show PCI info
lsusb -tv   # Show USB info
mount | column -t   # List mounted filesystems on the system (and align output)
grep -F capacity: /proc/acpi/battery/BAT0/info   # Show state of cells in laptop battery
#dmidecode -q | less   # Display SMBIOS/DMI information
#smartctl -A /dev/sda | grep Power_On_Hours   # How long has this disk (system) been powered on in total
#hdparm -i /dev/sda   # Show info about disk sda
#hdparm -tT /dev/sda   # Do a read speed test on disk sda
#badblocks -s /dev/sda   # Test for unreadable blocks on disk sda
interactive (see also linux keyboard shortcuts)
readline   # Line editor used by bash, python, bc, gnuplot, ...
screen   # Virtual terminals with detach capability, ...
mc   # Powerful file manager that can browse rpm, tar, ftp, ssh, ...
gnuplot   # Interactive/scriptable graphing
links   # Web browser
xdg-open .   # Open a file or url with the registered desktop application

Teradata Datatypes

This chart lists the data types that Teradata currently supports as ANSI standard:

Data Type: INTEGER
Description: Signed whole number
Range: -2,147,483,648 to 2,147,483,647

Data Type: SMALLINT
Description: Signed smaller whole number
Range: -32,768 to 32,767

Data Type: DECIMAL(X,Y), where X = 1 thru 18 is the total number of digits and Y = 0 thru 18 is the number of digits to the right of the decimal
Description: Signed decimal number, up to 18 digits on either side of the decimal point
Range: largest value DEC(18,0); smallest value DEC(18,18)

Data Type: NUMERIC(X,Y)
Description: Synonym for DECIMAL
Range: same as DECIMAL

Data Type: FLOAT
Description: Floating point format (IEEE)
Range: approximately <value>x10^-308 to <value>x10^308

Data Type: REAL
Description: Stored internally as FLOAT

Data Type: PRECISION
Description: Stored internally as FLOAT

Data Type: DOUBLE PRECISION
Description: Stored internally as FLOAT

Data Type: CHARACTER(X) / CHAR(X), where X = 1 thru 64000
Description: Fixed length character string, 1 byte of storage per character; pads to length with spaces
Range: 1 to 64,000 characters

Data Type: VARCHAR(X) / CHARACTER VARYING(X) / CHAR VARYING(X), where X = 1 thru 64000
Description: Variable length character string, 1 byte of storage per character plus 2 bytes to record the length of the actual data
Range: 1 to 64,000 characters maximum; the system stores only the characters presented to it

Data Type: DATE
Description: Signed internal representation of YYYMMDD, where YYY is the number of years from 1900 (e.g. 100 for year 2000)
Range: currently to the year 3500 as a positive number, and back into AD years as a negative number

Data Type: TIME
Description: Identifies a field as a TIME value with hour, minutes, and seconds

Data Type: TIMESTAMP
Description: Identifies a field as a TIMESTAMP value with year, month, day, hour, minute, and seconds

This chart lists the data types that Teradata currently supports as extensions:
Data Type: BYTEINT
Description: Signed whole number
Range: -128 to 127

Data Type: BYTE(X), where X = 1 thru 64000
Description: Binary data
Range: 1 to 64,000 bytes

Data Type: VARBYTE(X), where X = 1 thru 64000
Description: Variable length binary data
Range: 1 to 64,000 bytes

Data Type: LONG VARCHAR
Description: Variable length character string
Range: 64,000 characters maximum; the system stores only the characters provided, not trailing spaces

Data Type: GRAPHIC(X), where X = 1 thru 32000
Description: Fixed length string of 16-bit characters (2 bytes per character)
Range: 1 to 32,000 KANJI characters

Data Type: VARGRAPHIC(X), where X = 1 thru 32000
Description: Variable length string of 16-bit characters
Range: 1 to 32,000 characters maximum; the system stores only the characters provided
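As a quick reference, a table definition exercising several of the types above might look like this (the table and column names are made up):

```sql
CREATE TABLE customer_demo
( cust_id    INTEGER
, cust_name  VARCHAR(60)
, balance    DECIMAL(18,2)
, score      FLOAT
, joined     DATE
, created_at TIMESTAMP(0)
, status     BYTEINT
)
UNIQUE PRIMARY INDEX (cust_id);
```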

How to Fix Data Skew

What can data skew cause?

  • Why can't I insert 50 GB of evenly distributed data into a database that has 500 GB of unused space?
  • Why can't we move even 10% of the unused space out of a database?
  • Why do most of my requests run out of spool space?
  • Why do most of the queries running on this database show high CPU skew?

Because

  • Most of the database's allocated space is wasted.
  • On at least one AMP, almost all of the space allocated to the database is already occupied.
  • The portion of incoming data that hashes to the congested AMP finds no room there, so the load fails even though the database as a whole has free space.
  • If space needs to be moved, equal space is taken from every AMP, so the activity fails if any single AMP does not have free space of at least (total space to be moved / number of AMPs).
  • While processing requests on the database, spool-out is likely on the congested AMP, because the spool available there ("MaxPerm - CurrPerm") is small due to the high skew.
  • In some cases, SQL requests on the skewed tables show high CPU skew, because the uneven data distribution requires different amounts of CPU work on different AMPs.
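To see whether a database is skewed, the per-AMP rows of DBC.DiskSpace can be inspected directly; a large spread between the smallest and largest CurrentPerm values indicates skew ('HCI' is just an example database name):

```sql
SELECT Vproc, CurrentPerm, MaxPerm
FROM DBC.DiskSpace
WHERE DatabaseName = 'HCI'
ORDER BY CurrentPerm DESC;
```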




To find row counts of all tables within a DB

------------------------------------------------------------------------------------
--
-- tbl-rowcount.sql
--
-- find the tables within the specified database and get a rowcount
-- run section (1) alone
-- then copy the answer set from section (1) into section (2)
-- then run sections (2) and (3) together
-- finally print the answer set containing the rowcount values
-----------------------------------------------------------------------------------
--(1)
Drop table lidlbjh.Table_Row_Count;
Create    Volatile Table lidlbjh.Table_Row_Count
(
Tablename VarChar(61)  /* wide enough to hold "databasename.tablename" */
,RowCount Integer
)
On Commit Preserve Rows;

Select    'Insert into lidlbjh.Table_Row_Count Select ' || '''' || trim(databasename) || '.' || trim(tablename) || ''',' || ' Count(*) From ' || trim(databasename) || '.' ||
trim(tablename) || ';' (Title '')
From dbc.tables
Where tablekind =  't'
and databasename = 'hci'
--and ( tablename like 'v_rals%' or tablename like 'whs%' or tablename like 'ms%' or tablename like 'retro%')
order by tablename;

--(2)
/*------------ copy answer set here and then execute ----------------------------------------------------*/
  
Delete from lidlbjh.Table_Row_Count all; 
 
Insert into lidlbjh.Table_Row_Count Select 'HCI.T_ALLPROV_CONTEXT', Count(*) From HCI.T_ALLPROV_CONTEXT;
Insert into lidlbjh.Table_Row_Count Select 'HCI.T_ALLPROV_NOCONTEXT', Count(*) From HCI.T_ALLPROV_NOCONTEXT;
Insert into lidlbjh.Table_Row_Count Select 'HCI.T_BU_NORMS', Count(*) From HCI.T_BU_NORMS;


--(3)
select current_date as "_As_of_Date_"
  ,tablename
  ,rowcount
 from table_row_count
order by rowcount desc;

Teradata DBC Errorcodes and Description

ErrorCode=======ErrorText
0      =======**** ErrMsgs file as of 07:11:26 04/05/2004.
2507=======Out of spool space on disk.
2566=======Processor nnnn failed to respond
2585=======An I/O error on an MLOAD table caused a crash.
2587=======Segment overflow -- FastExport request is too complex to process.
2589=======Request block is not found.
2606=======Bad argument involving %TVMID.%FLDID for LOG function.
2608=======Bad argument involving %TVMID.%FLDID for LN function
2610=======Internal error:  field maintenance error
2627=======Internal error:  invalid operand for UpCaseOp
2629=======Internal error:  bad data type for operation.
2648=======Internal error:  could not find spool header.
2650=======Numeric processor operand error.
2669=======Outrow reference to source is illegal case (in MoveRow).
2671=======Unable to find a source constant (in MoveRow).
2688=======Numeric Processor Operand Error.
2690=======Zero table length in Unload.
2709=======SSSWAIT has timed out.
2711=======Inconsistent AMP/PE Vproc count.
2728=======Aborted due to a non-valid reference index on %DBID.%TVMID.
2730=======Warning: RI is not valid on one or more AMPs.
2732=======Warning: Skipping parent key validation in level 3 checking.
2749=======Primary and fallback USI counts do not match
2751=======Data row out of order.
2770=======Data row id referenced by NUSI is on wrong AMP.
2772=======NUSI row indexes non existent data row.
2791=======Primary and fallback data row checksums do not match.
2793=======MLoad MARK DUPLICATE INSERT (dup row)
2810=======Internal error:  permanent journal error found.
2812=======Internal error:  invalid record found on transient journal.
2814=======Data size exceeds the maximum specified.
2831=======No row was found for the event in the Recovery Control Catalog.
2833=======Row already exists with different content.
2852=======Inconsistent check point records found on Restored journal.
2854=======Specified checkpoint row does not exist in Journal table %TVMID.
2871=======Missing table header for DBC.DBCInfoTbl, updates bypassed.
2873=======Pending operation on DBC.DBCInfoTbl prevented updates.
2875=======Task Lock cannot be granted - resource busy
2892=======Null value eliminated in set function
2894=======The cursor spool row specified is invalid.
2913=======A specific-AMP recovery was executed out of order.
2915=======The journal for the updated table does not exist
2934=======Input is in new row format -- expected old format.
2936=======No row was found for Bulk Load delete/update.
2955=======Invalid before image on transient journal.
2974=======Invalid backup message received.
2976=======Message with invalid kind has been received.
2995=======Duplicate field in into specification.
2997=======Requested field does not exist in row.
3016=======Userid, password or account > 30 characters.
3018=======RAR response is malformed.
3035=======DBC.Sessiontbl is corrupted, session cannot be recovered.
3117=======The continue request contains invalid parcels.
3119=======Continue request submitted but no response to return.
3136=======Internal Error: The request cannot be aborted, session is in-doubt.
3138=======Waiting to load in-doubt session information
3140=======Request has already been committed
3157=======DQM Cache invalid or size exceeded
3159=======DQM internal error
3180=======Service console programming fault detected.
3220=======RCT program fault.
3239=======Invalid RepBegin message received.
3241=======MDS database does not exist.
3260=======The LOGOFF SESSIONS option must have a value of Y,y,N,n,BLANK or NULL.
3262=======The LOCAL CHANGE option must have a value of Y,y,N,n,BLANK or NULL.
3279=======There are no more Monitor Sessions available.
3281=======Internal Error: Inconsistent Monitor software condition detected.
3283=======Internal Error:  No free context blocks are available.
3403=======Request parcel contains fewer session ids than specified.
3405=======Session %VSTR not found or does not have an in-doubt transaction.
3504=======Selected non-aggregate values must be part of the associated group.
3506=======Internal error: Tree segment contains errors.
3525=======The user cannot create or drop a PRIMARY index.
3527=======Format string '%VSTR' has combination of  numeric, character  and GRAPHIC values.
3544=======Partial string matching requires character operands.
3546=======Default values may not be parameterized.
3548=======The user cannot run FastLoad and MLoad at the same time.
3565=======Database has tables without After Image Journaling.
3567=======Deleting database 'DBC' is not allowed.
3586=======Processor specification error(too many or invalid values).
3588=======A User may not be assigned a NULL password.
3607=======Too many values in a subquery.
3609=======Excessively complex set conditional.
3626=======Internal error:  WITH BY clause contains multiple tables.
3628=======Only identical Aggregated Expressions can use the DISTINCT option.
3647=======NULL may not be used as the argument for an aggregate function.
3649=======Compressed NULL and NOT NULL cannot coexist for column %VSTR.
3668=======A table listed in the Dictionary is not present.
3670======='%VSTR' is not a User.
3687=======Special characters may not be used in passwords.
3689=======END EXPORT statement is valid only if in FastExport Task.
3708=======Syntax error, %VSTR should be deleted.
3710=======Insufficient memory to parse this request, during %VSTR phase.
3729=======Invalid use of non-SQL field definition(s) in ALTER TABLE.
3731=======The user must use IS NULL or IS NOT NULL to test for NULL values.
3748=======Parcel kind or ordering is invalid.
3750=======Illegal Kanji object name %VSTR.
3752=======Too many digits in exponent.
3769=======Checksum locking modifier must reference the same table as the SELECT statement that follows it.
3771=======Illegal expression in WHEN clause of CASE expression.
3790=======GRANT/REVOKE LOGON not allowed TO DBC.
3792=======Conflicting CASESPECIFIC options.
3809=======Column '%VSTR' is ambiguous.
3811=======Column '%VSTR' is NOT NULL.  Give the column a value.
3813=======The positional assignment list has too many values.
3832=======A 2PC parcel must not directly follow an Options parcel.
3851=======Macro is greater than %VSTR after name expansion.
3853======='%VSTR' is not a table.
3872=======Illegal use of an aggregate operation in an UPDATE statement.
3874=======No journaling options have been specified.
3891=======Check text for check constraint (explicit or derived from partitioning expression) is greater than 8192 characters.
3912=======Invalid message kind received.
3914=======ABORT request transaction ids do not match.
3933=======The Maximum Possible Row Length in the Table is too Large.
3935=======Session and/or request number does not match an aborted task.
3954=======Non-GRAPHIC data cannot be assigned to column %VSTR.
3956=======Improper representation of hexadecimal GRAPHIC data.
3973=======Cannot specify multiple constraints in an Alter Table statement.
3975=======Table '%VSTR' is an illegal referential constraint target.
3994=======Alias name defined should be used instead of table name%VSTR.
3996=======Right truncation of string data.
5014=======A CI with no DBDs has been found.
5016=======The table header passed in memory is invalid.
5033=======An invalid first sector number has been detected.
5035=======An unexpected FSG error has occurred.
5054=======An invalid SRD reference array index was detected.
5056=======An unfree (uninitialized) cylinder has been allocated.
5073=======The DB returned by IdxUpgradeDB is not the original one.
5075=======An invalid DBKOPERATION was specified.
5077=======An invalid deltasectors was specified.
5096=======Sort attempted to generate an invalid row uniq value.
5115=======An attempt was made to generate an invalid row uniq.
5117=======dbkmoddbs could not perform the block split indicated by dbkfit.
5136=======The CI was unlocked to avoid an undetectable deadlock.
5138=======An unexpected error has occurred in a UNIX service routine.
5157=======TID: %d %d %d  First RID: %d %d %d %d %d  Last Partition/Hash: %d %d %d\n
5159=======System test forced tpareset\n
5176=======The data block checksum is invalid.
5178=======The file information block checksum is invalid.
5300=======Foreign Key column '%VSTR' cannot reference itself.
5319=======The specified DATABLOCKSIZE value is invalid
5321=======Constraint, primary index, or secondary index with the same name '%VSTR' already exists in table.
5338=======More than 1,000 volatile tables in current session.
5340=======Database name, if specified, must be the login user name for a volatile table.
5342=======Table option not allowed for temporary/volatile table.
5361=======The compress value must be in the repertoire of the character column.
5363=======The default value must be in the repertoire of the character column.
5380=======Internal error: Unexpected character Data type.
5382=======The word APPLY is not spelled correctly in the Release Mload statement.
5401=======Number of fractional digits out of range.
5403=======WITH TIME ZONE must follow either TIME or TIMESTAMP.
5420=======No triggers exist for table '%VSTR'.
5422=======A triggered action statement cannot update the subject table.
5424=======A triggered action statement is the wrong kind.
5441=======Replicated table status cannot be INITIATED for Define Role statement.
5443=======The specified role of a replicated table is either missing or invalid.
5462=======No DDL statements will be allowed until MDS is ready to accept them.
5464=======Error in Join Index DDL, %VSTR.
5483=======Illegal use of Ordered Analytical Function operation in an UPDATE statement.
5485=======MLinReg can have only one entry in the order specification list.
5502=======The name specified as an argument does not match the corresponding parameter name '%VSTR'.
5523=======Invalid MULTITSR parcel received.
5525=======LOCKING modifier cannot be specified with CALL SQL statement.
5544=======The OPTCOST GDO is uninitialized.
5546=======Sampleid cannot be nested inside Ordered Analytical and AGGREGATE Functions.
5563=======The table specified in the UPSERT statement contains triggers that could be fired.
5565=======The UPDATE specified in the UPSERT statement is a complex update.
5584=======Data exception: numeric value out of range.
5586=======Random not allowed in inserted PI column value.
5605=======A NEW_TABLE or OLD_TABLE reference is not allowed to be passed to a macro.
5607=======Check output for possible warnings encountered in compiling UDF.
5624=======The Branch offset is very long.
5628=======Column %FSTR not found in %VSTR.
5645=======The value of '%VSTR' supplied is out of range.
5647=======Internal error: Data dictionary table UDFInfo in inconsistent state.
5666=======LOBs are not allowed in indexes.
5687=======REFERENTIAL INTEGRITY is not allowed on LOB columns.
5689=======LOBs are not allowed to be compared.
5706=======Cannot BEGIN QUERY LOGGING for user because ALL rule exists.
5708=======Table header size exceeds limit.
5727=======Missing the last range end.
5729=======Primary index for the table is not partitioned.
5769=======The Locator is invalid due to an internal error.
5788=======%VSTR currently do not support Identity Columns.
5790=======UPSERT on a table with an Identity Column primary index is illegal.
5809=======SET keyword in ANSI UPDATE is required.
5811=======MINUS of queries is not in Entry Level ANSI.
5828=======PRIVILEGES keyword is required in ANSI.
5830=======ROLLBACK WORK cannot indicate a condition or return text to user in ANSI.
5832=======LOCKING modifier is not ANSI.
5849=======INSTEAD OF triggers are not supported.
5851=======BEFORE triggers having data changing statements are not supported.
6013=======Vproc Number is not defined in the Physical Configuration.
6015=======The host number is invalid.
6032=======Warning: The number of Amps in a cluster is less than one-half of the Default Cluster value specified.
6034=======Incompatible operation with the Vproc type in current config map.
6036=======The range is invalid.
6053=======Unable to rebuild reference index, needs to be rebuilt.
6114=======Message Kind is invalid in the message header.
6135=======Reconfig is aborted at a reversible state.
6137=======The hash bucket arrays on an AMP are invalid.
6156=======Encountered disk error, causing too many Amps down in cluster.
6158=======Recovery is still active; Reconfig will not run until recovery is done.
6604=======A changed data row was received whose contents are in error.
6606=======Not enough space was provided to receive the data into.
6665=======RSG failed to locate the TD AMP Worker Task's well-known mailbox.
6667=======Contents of RSG configuration file are incorrect.
6705=======An illegally formed character string was encountered during translation.
6707=======Internal error: A move of overlapping strings was requested of the ISF subsystem.
6766=======Syntax error in SDF.
6768=======Invalid numeric format.
6770=======Invalid CCD value.
7420=======An invalid hash bucket has been encountered.
7439=======Row references non-existent session.
7441=======The date does not correspond to an existing era.
7443=======The length of argument is incorrect.
7460=======Cannot continue utility operation on %DBID.%TVMID because system configuration has changed.
7462=======Operation not allowed: pending replication copy operation on table %DBID.%TVMID.
7481=======Operation with same utility Id and table Id already running.
7483=======Warning: RI checks bypassed for Referential Constraints.
7502=======A system trap was caused by UDF %DBID.%TVMID for %VSTR
7504=======in UDF %DBID.%TVMID: SQLSTATE %VSTR
7521=======Primary LOB row is on wrong AMP.
7523=======Fallback LOB row is in wrong subtable.
7544=======Cannot logon utility session.
7563=======Source row is deleted while update is in progress.
7565=======One or more rows found for table not in DBC.TVM. Deleted.
7624=======%FSTR: Invalid message class (%VSTR) encountered.
7626=======%FSTR: Invalid (%VSTR) step.
7643=======%FSTR: Error Initiating a new task; tskget failed with return code (%VSTR).
7645=======%FSTR: Error writing into file; fwrite failed with return code (%VSTR).
7704=======The RepPhase1 message indicates fewer RepDDL than already received.
7706=======Multiple RepBegTrans received for the same transaction before RepPhase1.
7708=======Transaction previously asynchronously aborted.  Now transaction aborted with RepPhase1Rsp message.
7725=======MDS received more DDL chunks than expected for a transaction.
8011=======The LAN message Format field is invalid.
8013=======The LAN message MessageLength field is invalid.
8051=======No network protocol exists.
8112=======Secondary index is not unique.
8114=======Update is illegal.
8215=======ROLLBACK recovery stopped.
8217=======Maximum number of tables exceeded.
8234=======Duplicate rows will not be restored to table.
8236=======Cross Platform Migration is currently not supported.
8257=======Improperly formatted directed request.
8501=======LOB end of data encountered.
8503=======Invalid LOB ID
8522=======LOT could not allocate a PDE mailbox
8541=======LOT send was passed conflicting values in the descriptor
8543=======LOT got an unexpected return code discarding a scratch segment

Date and time manipulation on Teradata

Timestamp

The TIMESTAMP data type does not support a FORMAT clause in V2R4, but V2R5 does. The quickest way to build the current date and time as a timestamp in V2R4 is:

select cast(current_date as TimeStamp(2)) 
        + ((current_time - time '00:00:00') hour to second);
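The same build-a-timestamp-from-date-and-time idea can be sanity-checked off-database. Here it is sketched in Python (illustrative only, not Teradata code):

```python
from datetime import date, time, datetime

# Combine a date with a time-of-day into one timestamp value, mirroring
# CAST(current_date AS TIMESTAMP) + (current_time - time '00:00:00').
d = date(2012, 11, 15)
t = time(10, 30, 0)
ts = datetime.combine(d, t)
print(ts)  # 2012-11-15 10:30:00
```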

SYS_CALENDAR.CALENDAR


Teradata provides a handy calendar table for date lookups called sys_calendar.calendar. Using this table you can calculate useful dates such as week-ending dates, quarter-ending dates, and so on.

 

Find the previous Friday from today:


select *
from   sys_calendar.calendar
where  day_of_week = 6
and    calendar_date between date - 6 and date;
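To verify the day-of-week filtering logic outside the database, the same "previous Friday" walk can be sketched in Python. Note the numbering difference: sys_calendar uses 1 = Sunday (so Friday is 6), while Python's weekday() uses 0 = Monday (so Friday is 4).

```python
from datetime import date, timedelta

def previous_friday(today):
    # Look back over today and the previous six days -- the same window as
    # "calendar_date between date - 6 and date" -- and return the Friday.
    for delta in range(7):
        d = today - timedelta(days=delta)
        if d.weekday() == 4:  # Friday in Python's numbering
            return d

print(previous_friday(date(2012, 11, 15)))  # a Thursday; prints 2012-11-09
```

As in the SQL version, a Friday input returns itself, because the window includes today.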

Find the first Monday one week after the last Friday of previous month:


select a.calendar_date
from sys_calendar.calendar a,
     (select max(calendar_date) + 10 as calendar_date
      from sys_calendar.calendar
      where extract(year from add_months(date, -1)) = year_of_calendar
      and   extract(month from add_months(date, -1)) = month_of_year
      and   day_of_week = 6) b
where a.calendar_date = b.calendar_date;

Extracting Date Components

Select extract(year from foo_date), extract(month from foo_date) from db.snafu;

Simple Date Arithmetic

Days difference:


Select date - old_date from foo;

Calculate the Julian date


select current_date (format 'yyyyddd') (char(7));
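The 'yyyyddd' Julian layout is easy to cross-check in Python, where strftime's %j directive gives the zero-padded day of the year:

```python
from datetime import date

# Same 'yyyyddd' layout as the Teradata FORMAT string above.
julian = date(2012, 11, 15).strftime('%Y%j')
print(julian)  # 2012320
```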


Adding or Subtracting Months


This is as simple as it gets:

select add_months(date_col, 1) from foo;
select add_months(date_col, -10) from foo;
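One subtlety: when the source day does not exist in the target month, ADD_MONTHS adjusts to the last valid day of that month (e.g. Jan 31 plus one month gives Feb 29 in a leap year). A rough Python model of that clamping behavior, for illustration only:

```python
import calendar
from datetime import date

def add_months(d, n):
    # Shift by n months, then clamp the day to the length of the target
    # month -- a rough model of Teradata's ADD_MONTHS end-of-month handling.
    months = d.year * 12 + (d.month - 1) + n
    y, m = divmod(months, 12)
    last_day = calendar.monthrange(y, m + 1)[1]
    return date(y, m + 1, min(d.day, last_day))

print(add_months(date(2012, 1, 31), 1))     # 2012-02-29
print(add_months(date(2012, 11, 15), -10))  # 2012-01-15
```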

Calculating Date Intervals

These examples will not work over ODBC unless you select type IIA in the ODBC configuration. They will work in BTEQ as given.

SELECT (CAST((END_DT (DATE, FORMAT 'YYYY-MM-DD'))(CHAR(10)) || ' ' || END_TIME AS TIMESTAMP(0))
      - CAST((START_DT (DATE, FORMAT 'YYYY-MM-DD'))(CHAR(10)) || ' ' || START_TIME AS TIMESTAMP(0))
       ) DAY(3) TO SECOND AS TIMEDIFF
FROM whatever;
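The end-minus-start subtraction itself is easy to check in Python, where subtracting two datetimes yields the interval (a timedelta) directly:

```python
from datetime import datetime

start = datetime(2012, 11, 14, 9, 0, 0)
end = datetime(2012, 11, 15, 10, 30, 0)
diff = end - start
print(diff)  # 1 day, 1:30:00
```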

Teradata Builtin Values and Functions

Built in Value               Value Returned            Data Type
ACCOUNT                      Current User Account      VARCHAR(30)
CURRENT_DATE                 Current Date              DATE
CURRENT_TIME [(n)]           Current Time              TIME(n) WITH TIMEZONE
CURRENT_TIMESTAMP [(n)]      Current Date/Time         TIMESTAMP(n) WITH TIMEZONE
DATABASE                     Current Default DB        VARCHAR(30)
DATE                         Current Date              DATE
NULL                         The NULL value            As required
PARTITION                    The Index Partition       INTEGER
ROWID                        The Internal Row Id       INTEGER
SESSION                      Current Session Number    INTEGER
TIME                         Current Time              FLOAT
USER                         Current User Id           VARCHAR(30)
      Descriptor Functions                Value Returned            Data Type
BYTE[S]     (arg)                     Length of arg              INTEGER
CHAR[ACTERS] (string)                  Length of str              INTEGER
MCHAR[ACTERS] (string)                 Length (Mbyte Chars)       INTEGER
CHAR[ACTER]_LENGTH(string)             Length of str (Chars)      INTEGER
OCTET_LENGTH (string [charset])        Length of str (Bytes)      INTEGER
FORMAT      (arg)                     Format of arg              VARCHAR(30)
NAMED       (arg)                     Name assigned to arg       VARCHAR(30)
TITLE       (arg)                     Title on arg               VARCHAR(60)
TYPE        (arg)                     Data Type of arg           VARCHAR(??)

         Math Functions                   Value Returned            Data Type
ABS         (num)                      Absolute value             Same as arg
ACOS        (num)                      ArcCosine                  FLOAT
ACOSH       (num)                      Hyperbolic ArcCosine       FLOAT
ASIN        (num)                      ArcSine                    FLOAT
ASINH       (num)                      Hyperbolic ArcSine         FLOAT
ATAN        (num)                      ArcTangent                 FLOAT
ATANH       (num)                      Hyperbolic ArcTangent      FLOAT
COS         (num)                      Cosine                     FLOAT
COSH        (num)                      Hyperbolic Cosine          FLOAT
EXP         (num)                      e to the power arg         FLOAT
LOG         (num)                      Base 10 Logarithm          FLOAT
LN          (num)                      Base e  Logarithm          FLOAT
SIN         (num)                      Sine                       FLOAT
SINH        (num)                      Hyperbolic Sine            FLOAT
SQRT        (num)                      Square Root                FLOAT
TAN         (num)                      Tangent                    FLOAT
TANH        (num)                      Hyperbolic Tangent         FLOAT

     Conversion Functions                 Value Returned            Data Type
ADD_MONTHS  (date, n)                  Add ‘n’ months to date     DATE
            (timestamp, n)                                         TIMESTAMP
CASE  WHEN exp1 THEN val1              Value substitution         Same as valn
     [WHEN exp2 THEN val2]
           ... [ELSE valn] END
CASE_N      (expr [.. ,expr] [,NO CASE [OR UNKNOWN]] [,UNKNOWN])   INTEGER
CAST        (expr AS typeinfo)         Type/Format conversion     as in typeinfo
CHAR2HEXINT (string)                   HEX display of string      VARCHAR
COALESCE    (expr1, expr2 [.. ,exprn]) First non-Null expr        Same as expr
DATE        'YYYY-MM-DD'               YYYY-MM-DD as a date       DATE
EXTRACT     (part FROM date)           Day, Hour, Minute etc      ‘part’ type
HASHROW     (expr, expr)               The Row Hash
HASHBUCKET  (expr)                     The Hash Bucket
HASHAMP     (expr)                     The Primary AMP
HASHBACKAMP (expr)                     The Backup AMP
INDEX       (string, substr)           Start pos. of substr       INTEGER
LOWER       (string)                   String in LowerCase        CHAR
MINDEX      (MBstring, substr)         Start pos. of substr       INTEGER
NULLIF      (expr1, expr2)             Null if e1=e2 else e1      Same as expr1
NULLIFZERO  (num)                      NULL if arg is Zero        Same as arg
(start,end) OVERLAPS (start,end)       Dates/intervals overlap?   BOOLEAN
POSITION    (str1 IN str2)             Start pos of str1 in str2  INTEGER
RANGE_N     (expr BETWEEN start [AND end] [EACH size] [..., ]
            [,NOT IN RANGE [OR UNKNOWN]] [,UNKNOWN] )              INTEGER
RANDOM      (low-bound, high_bound)    A random number            FLOAT
SOUNDEX     ( )
SUBSTR      (string, start, len)       Sub-String of string       CHAR
       [BOTH    ]
TRIM ([[LEADING ] [chr] FROM] string)  Remove blanks (or chr)     CHAR
       [TRAILING]
UPPER       (string)                   String in UpperCase        CHAR
WIDTH_BUCKET ( )
ZEROIFNULL  (num)                      Zero if arg is NULL        Same as arg

Aggregate and OLAP Functions                              Description
AVG         ([DISTINCT] arg)                              Average value
CORREL      (y, x)                                        Correlation
COUNT       ([DISTINCT] arg)                              Number occurrences
COUNT       (*)                                           Number of rows
COVARIANCE  (y, x)                                        Covariance
COVAR_SAMP  (y, x)                                        Sample Covariance
CSUM        (col, sort-expr [… ,sort-expr] )              Cumulative Sum
GCOUNT      (col)                                         ?
GSUM        (col)                                         ?
KURTOSIS    (arg)                                         Kurtosis               
LINREGSLOPE (y, x)                                        Slope: Linear Reg
LINREGINTERCEPT(y, x)                                     Intercept: Linear Reg
MAX[IMUM]   (arg)                                         Maximum value
MIN[IMUM]   (arg)                                         Minimum value
MAVG        (col, #rows, sort-expr [… ,sort-expr] )       Moving Average
MDIFF       (col, #rows, sort-expr [… ,sort-expr] )       Moving Difference
MLINREG     (col, #rows, sort-expr )                      Linear Regression
MSUM        (col, #rows, sort-expr [… ,sort-expr] )       Moving Sum
PERCENT_RANK() OVER ([PARTITION BY spec] ORDER BY spec [ASC | DESC])
QUANTILE    (#partitions, sort-expr [… ,sort-expr] )      Quantile position
RANK        (sort-expr [… ,sort-expr] )                   Rank Position
RANK        () OVER ([PARTITION BY spec] ORDER BY spec [ASC | DESC])
REGR_AVGX   (y, x)                                        Avg of x values in regression
REGR_AVGY   (y, x)                                        Avg of y values in regression
REGR_COUNT  (y, x)                                        # non-null pairs in regression
REGR_R2     (y, x)                                        R squared of Regression
REGR_SXX    (y, x)                                        Sum of squares of x in regression
REGR_SXY    (y, x)                                        Sum of products of x and y in regression
REGR_SYY    (y, x)                                        Sum of squares of y in regression
SKEW        ([DISTINCT] arg)                              Skew                   
STDEV       (arg)                                         Standard Deviation     
STDEVP      (arg)                                         Standard Deviation (pop)  
STDDEV_POP  ([DISTINCT] arg)                              Population Std Deviation  
STDDEV_SAMP ([DISTINCT] arg)                              Sample Std Deviation      
SUM         ([DISTINCT] arg)                              Sum of values          
SUM         (arg) OVER ( [PARTITION BY spec]
                         [ORDER BY     spec]
                         ROWS {windowsize} PRECEDING [ASC | DESC] )
                              {UNBOUNDED }
VARIANCE      (arg)  Variance     
VARIANCEP     (arg)  Population Variance
VAR_POP       ([DISTINCT] arg)     Population Variance 
VAR_SAMP      ([DISTINCT] arg)     Sample Variance