- csv2td
- kvpairs2td
- mrkv2td
- td2html
- td2kvpairs
- td2mrkv
- td-add-headers
- td-alter
- td-collapse
- td-disamb-headers
- td-expand
- td-filter
- td-gnuplot
- td-keepheader
- td-lpstat
- td-ls
- td-pivot
- td-ps
- td-rename
- td-select
- td-sort
- td-trans
- td-trans-fixcol
- td-trans-group
- td-trans-gshadow
- td-trans-ls
- td-trans-mount
- td-trans-passwd
- td-trans-shadow
- vcf2td
csv2td - Transform CSV to tabular data format.
Read CSV data on STDIN. Output tabular data to STDOUT.
Any option which Text::CSV(3pm) takes.
See Text::CSV->known_attributes for an extensive list.
Example:
csv2td --sep=';' --blank-is-undef=0 --binary
becomes:
Text::CSV->new({sep=>";", blank_is_undef=>0, binary=>1})
Why is there no td2csv?
Why would you go back to ugly CSV when you have nice shiny Tabdata?
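For intuition, the simplest case of this transformation (no quoting, no embedded separators) is just a separator swap; a rough stand-in with tr(1), not csv2td itself:

```shell
# Simple, quote-free CSV to TAB-separated tabdata (a sketch only;
# csv2td additionally handles quoting and embedded separators):
printf 'NAME,AGE\njohn,19\n' | tr , '\t'
```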
kvpairs2td - Transform lines of key-value pairs to tabular data stream
-
-i, --ignore-non-existing-columns
Do not fail when encountering a new field after the first record.
-
-w, --warn-non-existing-columns
Only show a warning when encountering a new field after the first record, but do not fail.
-
-c, --column COLUMN
Indicate that there will be a column by the name COLUMN. This is useful if the first record does not have COLUMN. This option is repeatable.
-
-r, --restcolumn NAME
Name of the column where the rest of the input line will be put which is not part of key-value pairs. Default is _REST.
-
-u, --unknown-to-rest
Put unknown (non-existing) fields in the "rest" column (see -r option).
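The transformation can be sketched with awk(1) for the simple case where every line carries the same two keys (a stand-in for illustration, not the real tool):

```shell
# key=value pairs to tabdata; assumes the keys "name" and "age" on every line.
printf 'name=john age=19\nname=jacob age=22\n' |
awk '{ split("", f)
       for (i = 1; i <= NF; i++) { split($i, kv, "="); f[kv[1]] = kv[2] }
       if (NR == 1) print "name\tage"
       print f["name"] "\t" f["age"] }'
```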
td2mrkv(1), td2kvpairs(1)
mrkv2td - Transform multi-record key-value (MRKV) stream to tabular data format.
Since the tabular data format presents field names at the start of the stream, mrkv2td(1) infers them from the first record only; so there is no need to buffer the whole dataset to find all fields, and it is usual for all records to have the same fields anyway.
-
-s, --separator REGEXP
Regexp which separates the field name from the cell data in the MRKV stream. Default is TAB (\t).
-
-g, --multiline-glue STRING
td2mrkv(1)
td2html - Transform tabular data stream into a HTML table.
td2html
Takes a tabular data stream on STDIN and outputs an HTML table enclosed in <table>...</table> tags.
td2kvpairs - Transform tabular data into key-value pairs
-
-r, --prefix-field NAME
Put this field's content before the list of key-value pairs. Default is _REST. The prefix and the key-value pairs are separated by a space character, if there is any prefix.
td2mrkv(1), kvpairs2td(1)
td2mrkv - Transform tabular data into multi-record key-value (MRKV) format.
-
-s, --separator STR
String to separate the field name from the content. Default is TAB (\t).
getent passwd | tr : "\t" | td-add-headers USER PW UID GID GECOS HOME SHELL | td-select +ALL -PW | td2mrkv
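The core of the transformation can be sketched in awk(1) (TAB as separator, a blank line between records; a stand-in, not the real tool):

```shell
# Tabular data to multi-record key-value: one "FIELD<TAB>value" line per
# cell, records separated by an empty line.
printf 'USER\tSHELL\nroot\t/bin/bash\n' |
awk -F '\t' 'NR == 1 { for (i = 1; i <= NF; i++) h[i] = $i; next }
             { for (i = 1; i <= NF; i++) print h[i] "\t" $i; print "" }'
```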
td-add-headers - Add headers to the tabular data stream and pass through the rows.
td-add-headers COLNAME_1 COLNAME_2 ...
Add a header row to the tabular data stream. Header names will be the ones specified as command line arguments, assigned to columns one by one from the left.
If there are more fields in the first data row than names given, additional columns will be named "COL4", "COL5", etc. after the 1-based index number of the column. This may be prevented by the --no-extra-columns option.
-
-x, --extra-columns
Give a name also to those columns which are not given a name in the command parameters. This is the default behavior.
-
-X, --no-extra-columns
Do not add more columns than specified in the command parameters.
who | td-trans | td-add-headers USER TTY DATE TIME COMMENT
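Conceptually, for already TAB-delimited input, the operation amounts to prepending a header row (a stand-in sketch with plain shell, without the extra-column handling):

```shell
# Prepend a header row to headerless TAB-separated data.
printf 'alice\ttty1\nbob\ttty2\n' | { printf 'USER\tTTY\n'; cat; }
```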
td-alter - Add new columns and fields to tabular data stream, and modify value of existing fields.
td-alter COLUMN=EXPR [COLUMN=EXPR [COLUMN=EXPR [...]]]
On each data row, set the field in COLUMN to the value resulting from the Perl expression EXPR.
In EXPR, you may refer to other fields by $F{NAME}, where NAME is the column name, or by $F[INDEX], where INDEX is the 0-based column index number.
Furthermore, you may refer to uppercase alphanumeric field names simply by the bareword COLUMN, enclosed in parentheses like (COLUMN) to avoid parsing ambiguity in Perl.
This is possible because these column names are set up as subroutines internally.
The topic variable ($_) is initially set to the current value of COLUMN in EXPR.
So, for example, N='-$_' makes the field N the negative of itself.
You can create new columns simply by referring to a COLUMN name that does not exist yet. You can refer to an earlier defined COLUMN in subsequent EXPR expressions.
Add new columns: TYPE and IS_BIGFILE. IS_BIGFILE depends on previously defined TYPE field.
ls -l | td-trans-ls | td-alter TYPE='substr MODE,0,1' IS_BIGFILE='SIZE>10000000 && TYPE ne "d" ? "yes" : "no"'
Strip sub-seconds and timezone from DATETIME field:
TIME_STYLE=full-iso ls -l | td-trans-ls | td-alter DATETIME='s/\..*//; $_'
-
-H, --no-header
do not show headers
-
-h, --header
show headers (default)
"Alter" in td-alter comes from SQL. td-alter(1) can change the "table" column layout. But contrary to SQL's ALTER TABLE, td-alter(1) can modify the records too, so it is akin to SQL UPDATE as well.
td-collapse - Collapse multiple tabular data records with equivalent keys into one.
td-collapse [OPTIONS]
It goes row-by-row on a sorted tabular data stream, and if the first (key) cells of 2 or more subsequent rows are the same, collapses those rows into one. This is done by joining the corresponding cells' data from each row into one cell, effectively keeping every column's data in the same column.
If you want to group by another column, not the first one, first reorder the columns with td-select(1), e.g. td-select KEYCOLUMN +REST.
-
-g, --glue STR
Delimiter character or string between joined cell data. Default is space.
-
-u, --distribute-unique-field FIELD
Take the FIELD column's cells from the first collapsed group, and replicate all other columns as many times as there are rows in this group, in a way that each cell goes under a new column corresponding to that cell's original row. FIELD's cells need to be unique within each group.
If an unexpected value is found while processing the 2nd row group or later, i.e. a value which was not there in the first group, it is not distributed into a new column, since the header has already been sent, but is left in the original column just as if the -u option were not in effect. See "pause" and "resume" in the example below.
Example:
| ID | EVENT  | TIME  | STATUS |
| 15 | start  | 10:00 |        |
| 15 | end    | 10:05 | ok     |
| 16 | start  | 11:00 |        |
| 16 | end    | 11:06 | err    |
| 16 | pause  | 11:04 |        |
| 16 | resume | 11:05 |        |
td-collapse -u EVENT
| COUNT | ID | EVENT        | TIME        | TIME_start | TIME_end | STATUS | STATUS_start | STATUS_end |
| 2     | 15 |              |             | 10:00      | 10:05    |        |              | ok         |
| 4     | 16 | pause resume | 11:04 11:05 | 11:00      | 11:06    |        |              | err        |
-
-s, --distributed-column-name-separator STR
When generating new columns as described at -u option, join the original column name with each of the unique field's values by STR string. See example at -u option description. Default is underscore
_
.
This pipeline shows which users are using each of the configured default shells, grouped by shell path.
# get the list of users
getent passwd |\
# transform into tabular data stream
tr : "\t" |\
td-add-headers USER X UID GID GECOS HOME SHELL |\
# put the shell in the first column, and sort, then collapse
td-select SHELL USER | td-keepheader sort | td-collapse -g ' ' |\
# change header name "USER" to "USERS"
td-alter USERS=USER | td-select +ALL -USER
Output:
| COUNT | SHELL | USERS |
| 4 | /bin/bash | user1 user2 nova root |
| 5 | /bin/false | fetchmail hplip sddm speech-dispatcher sstpc |
| 1 | /bin/sync | sync |
| 1 | /sbin/rebootlogon | reboot |
| 6 | /usr/sbin/nologin | _apt avahi avahi-autoipd backup bin daemon |
Input data must be sorted first.
Group key is always the first input column.
If a row in the input data has more cells than the number of columns, those are ignored.
td-expand(1) is a kind of an inverse to td-collapse(1).
td-collapse(1) roughly translates to SELECT COUNT(*) + GROUP_CONCAT() + GROUP BY in SQL.
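The collapse step itself (joining cells of adjacent rows sharing the same key, plus the COUNT column) can be sketched in awk(1) for a 2-column input; a stand-in for illustration, not the real tool:

```shell
# Collapse adjacent rows with equal first (key) cells; prepend the group size.
printf '/bin/bash\troot\n/bin/bash\tnova\n/bin/sh\tsync\n' |
awk -F '\t' '
  $1 != key { if (key != "") print n "\t" key "\t" users
              key = $1; users = $2; n = 1; next }
  { users = users " " $2; n++ }
  END { print n "\t" key "\t" users }'
```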
td-disamb-headers - Disambiguate headers in tabular data
Change column names in the input tabular data stream by appending a sequential number to duplicated column names. The first occurrence is kept as-is. If a particular column name already ends with an integer, it gets incremented.
echo "PID PID PID2 PID2 USER CMD" | td-disamb-headers
Output:
PID PID3 PID2 PID4 USER CMD
td-expand - Generate multiple rows from each one row in a Tabular data stream.
td-expand [-f FIELD] [-s SEPARATOR]
It goes row-by-row, splits the given FIELD at SEPARATOR chars, creates as many output rows as there are parts FIELD was split into, fills the FIELD column in each of those rows with one of the parts, and fills all other columns with the corresponding column's data from the input row.
More illustratively:
| SHELL | USERS |
| /bin/bash | user1 user2 |
| /bin/dash | user3 user4 |
| /bin/sh | root |
td-expand -f USERS -s ' ' | td-alter USER=USERS | td-select +ALL -USERS
| SHELL | USER |
| /bin/bash | user1 |
| /bin/bash | user2 |
| /bin/dash | user3 |
| /bin/dash | user4 |
| /bin/sh | root |
-
-f, --field FIELD
Which field to break up. Default is the first one.
-
-s, --separator PATTERN
Regexp pattern to split FIELD at. Default is space.
td-collapse(1) is a kind of inverse to td-expand(1).
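The expansion step can be sketched in awk(1) for a 2-column input (a stand-in, not the real tool):

```shell
# Split the 2nd column at spaces; emit one row per part, repeating the 1st column.
printf '/bin/bash\tuser1 user2\n/bin/sh\troot\n' |
awk -F '\t' '{ n = split($2, parts, " ")
               for (i = 1; i <= n; i++) print $1 "\t" parts[i] }'
```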
td-filter - Show only those records from the input tabular data stream which match to the conditions.
td-filter [OPTIONS] [--] COLUMN OPERATOR R-VALUE [[or] COLUMN OPERATOR R-VALUE [[or] ...]]
td-filter [OPTIONS] --perl EXPR
Pass through those records which match at least one of the conditions (inclusive OR). A condition consists of a triplet of COLUMN, OPERATOR, and R-VALUE. You may put together conditions conjunctively (AND) by chaining multiple td-filter(1) commands by shell pipes. Example:
td-filter NAME eq john NAME eq jacob | td-filter AGE -gt 18
This gives the records with either john or jacob, and all of them will be above 18.
The optional word "or" between triplets makes your code more explicit.
In the second form, td-filter(1) evaluates the Perl expression EXPR and passes through records only if the result is true-ish in Perl (non-zero, non-empty string, etc.). Each field's value is in @F by index and in %F by column name. You can implement more complex conditions this way.
-
-H, --no-header
do not show headers
-
-h, --header
show headers (default)
-
-i, --ignore-non-existing-columns
do not treat non-existing (missing or typo) column names as failure
-
-w, --warn-non-existing-columns
only show warning on non-existing (missing or typo) column names, but don't fail
-
-N, --no-fail-non-numeric
do not fail when a non-numeric r-value is given to a numeric operator
-
-W, --no-warn-non-numeric
do not show warning when a non-numeric r-value is given to a numeric operator
These operators are supported, semantics are the same as in Perl, see perlop(1).
== != <= >= < > =~ !~ eq ne gt lt
For your convenience, not to bother with escaping, you may also use these operators as alternatives to the canonical ones above:
-
is
-
= (single equal sign)
string equality (eq)
-
is not
string inequality (ne)
-
-eq
numeric equality (==)
-
-ne
numeric inequality (!=)
-
<>
numeric inequality (!=)
-
-gt
numeric greater than (>)
-
-ge
numeric greater or equal (>=)
-
-lt
numeric less than (<)
-
-le
numeric less or equal (<=)
-
match
-
matches
regexp match (=~)
-
does not match
-
do not match
-
not match
negated regexp match (!~)
Other operators:
-
is [not] one of
-
is [not] any of
R-VALUE is split into pieces at commas (,) and equality to at least one of them is required. Equality to none of them is required if the operator is negated.
-
contains [whole word]
Substring match. Plural form "contain" is also accepted. Optional whole word is a literal part of the operator.
-
contains [one | any] [whole word] of
Similar to is one of, but substring match is checked instead of full string equality. Plural form "contain" is also accepted. Optional whole word is a literal part of the operator.
-
ends with
-
starts with
Plural forms are also accepted.
Operators may be preceded by not, does not, or do not to negate their effect.
If there is no COLUMN column in the input data, it is silently considered empty. td-filter(1) does not need R-VALUE to be quoted or escaped, though your shell may require it.
td-filter(1) is analogous to SQL WHERE.
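A condition triplet like AGE -gt 18 corresponds roughly to this awk(1) sketch, with the column name already resolved to an index (a stand-in for illustration, not the real tool):

```shell
# Pass the header through, then keep rows whose 2nd (AGE) cell exceeds 18.
printf 'NAME\tAGE\njohn\t19\njacob\t17\n' |
awk -F '\t' 'NR == 1 || $2 > 18'
```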
td-gnuplot - Graph tabular data using gnuplot(1)
td-gnuplot [OPTIONS]
Invoke gnuplot(1) to graph the data represented in Tabular data format on STDIN. The first column is the X axis, the rest of the columns are data lines.
Default is to output an ascii-art chart to the terminal ("dumb" output in gnuplot).
td-gnuplot guesses the data format from the column names. If the first column's name matches "date" or "time" (case-insensitively), the X axis will be a time axis. If it matches "time", a unix epoch timestamp is assumed. Otherwise specify the date/time format in use with e.g. the --timefmt=%Y-%m-%d option.
Plot data read from STDIN is buffered in a temp file (provided by File::Temp->new(TMPDIR=>1) and immediately unlinked, so no waste product is left around), because gnuplot(1) needs to seek in it when plotting more than 1 data series.
-
-i
Output an image (PNG) to the STDOUT, instead of drawing to the terminal.
-
-d
Let gnuplot(1) decide the output medium, instead of drawing to the terminal.
-
--SETTING
-
--SETTING=VALUE
Set any gnuplot setting, optionally setting its value to VALUE. SETTING is a setting name as used in set ... gnuplot commands, except spaces are replaced with dashes. VALUE is always passed to gnuplot enclosed in double quotes. Examples:
--format-x="%Y %b" --xtics-rotate-by=-90 --style-data-lines
Gnuplot equivalent commands:
set format x "%Y %b"
set xtics rotate by "-90"
set style data lines
-
-e COMMAND
Pass arbitrary gnuplot commands to gnuplot(1). This option may be repeated. These are passed to gnuplot(1) on the command line (its -e option) after td-gnuplot(1)'s own sequence of gnuplot setup commands and after the --SETTING settings are applied, so you can override them.
td-keepheader - Plug a non header-aware program in the tabular-data processing pipeline
td-keepheader [--] COMMAND [ARGS]
ls -l | td-trans-ls | td-select NAME +REST | td-keepheader sort | tabularize
td-lpstat - lpstat(1) wrapper to output printers status in Tabular Data format
td-ls - ls(1)-like file list but more machine-parseable
td-ls [OPTIONS] [PATHS] [-- FIND-OPTIONS]
OPTIONS, ls(1)-compatible
- -A, --almost-all
- -g
- -G, --no-group
- -i, --inode
- -l (implied)
- -n, --numeric-uid-gid
- -o
- --time=[atime, access, use, ctime, status, birth, creation, mtime, modification]
- -R, --recursive
- -U (implied, pipe to sort(1) if you want)
OPTIONS, not ls(1)-compatible
-
--devnum
-
-H, --no-header
-
--no-symlink-target
-
--add-field FIELD-NAME
Add extra fields by name. See field names by --help-field-names option. May be added multiple times.
-
--add-field-macro FORMAT
Add extra fields by a find(1)-style format specification. For valid FORMATs, see the -printf section in find(1). May be added multiple times. Putting \0 (backslash-zero) in FORMAT screws up the output; don't do that.
-
--help-field-names
Show valid field names to be used for --add-field option.
Columns are similar to good old ls(1): PERMS (symbolic representation), LINKS, USERNAME (USERID if -n option is given), GROUPNAME (GROUPID if -n option is given), SIZE (in bytes), time field is either ATIME, CTIME, or default MTIME (in full-iso format), BASENAME (or RELPATH in --recursive mode), and SYMLINKTARGET (unless --no-symlink-target option is given).
Column names are a bit different from those td-trans-ls(1) produces, but this is intentional, because the fields of these 2 tools have slightly different meanings. td-trans-ls(1) is less smart: it just transforms ls(1)'s output and does not always know exactly what is in the input; while td-ls(1) itself controls what data goes to the output.
No color support.
Output format is tabular data: a table, in which fields are delimited by TAB and records by newline (LF).
Meta chars may occur in some fields (path, filename, symlink target, etc), these are escaped this (perl-compatible) way:
| Raw char | Substituted to |
|-----------|----------------|
| ESC | \e |
| TAB | \t |
| LF | \n |
| CR | \r |
| Backslash | \\ |
Other control chars (charcode below 32 in ASCII) including NUL, vertical-tab, and form-feed are left as-is.
-
TIME_STYLE
TIME_STYLE is ignored, as is the --time-style option. Date-time is always shown in %F %T %z strftime(3) format; it's simply the most superior. This is equivalent to TIME_STYLE=full-iso.
td-select(1), td-filter(1), td-trans-ls(1)
td-pivot - Switch columns for rows in tabular data
td-pivot
Must read and buffer the whole STDIN before outputting any data, so it is impractical on large data.
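A transpose sketch in awk(1) shows why the whole input must be buffered: the first output row needs the first cell of the last input row (a stand-in, not the real tool):

```shell
# Transpose a TAB-separated table; buffers everything before printing.
printf 'A\tB\n1\t2\n3\t4\n' |
awk -F '\t' '{ for (i = 1; i <= NF; i++) cell[i, NR] = $i; rows = NR; cols = NF }
  END { for (i = 1; i <= cols; i++) {
          line = cell[i, 1]
          for (j = 2; j <= rows; j++) line = line "\t" cell[i, j]
          print line } }'
```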
td-rename - Rename tabular data columns
td-rename OLDNAME NEWNAME [OLDNAME NEWNAME [OLDNAME NEWNAME [...]]]
conntrack -L | sd '^(\S+)\s+(\S+)\s+(\S+)' 'protoname=$1 protonum=$2 timeout=$3' | kvpairs2td | td-rename _REST FLAGS
Not to be confused with rename.td(1), which renames files, not columns.
td-select - Show only the specified columns from the input tabular data stream.
td-select [OPTIONS] [--] [-]COLUMN [[-]COLUMN [...]]
-
-H, --no-header
do not show headers
-
-h, --header
show headers (default)
-
-i, --ignore-non-existing-columns
do not treat non-existing (missing or typo) column names as failure
-
-w, --warn-non-existing-columns
only show warning on non-existing (missing or typo) column names, but don't fail
-
--strict-columns
warn and fail on non-existing (missing or typo) column names given in the parameters, even if prefixed with a hyphen, i.e. when the user wants to remove the named column from the output
COLUMN is either a column name, or one of these special keywords:
-
+ALL
all columns
-
+REST
the rest of columns not given yet in the parameter list
COLUMN may be prefixed with a minus sign (-), in which case the given column is not shown, i.e. it is removed from the output columns.
So if you want to show all columns except one or two:
td-select +ALL -PASSWD
If you want to put a given column (say "KEY") in the first place and leave the others intact:
td-select KEY +REST
ls -l | td-trans-ls | td-select -- NAME +REST -INODE -LINKS -MAJOR -MINOR
"Select" in td-select comes from SQL. Similarly to SQL, td-select(1) is to choose some of the columns and return them in the given order.
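The name-to-index resolution at the heart of td-select can be sketched in awk(1) (a stand-in for illustration, without the +ALL/+REST keywords):

```shell
# Resolve column names from the header, then print only USER and UID.
printf 'USER\tPW\tUID\nroot\tx\t0\n' |
awk -F '\t' -v OFS='\t' '
  NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i }
  { print $col["USER"], $col["UID"] }'
```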
td-sort - Sort tabular data by the columns given by name
td-sort OPTIONS
All options accepted by sort(1), except that you don't need to refer to columns by ordinal number; use names instead.
-
-k, --key=KEYDEF
sort(1) defines KEYDEF as F[.C][OPTS][,F[.C][OPTS]], where F is the (1-based) field number. With td-sort(1), however, you may refer to fields by name. But since F then no longer consists only of digits and is an arbitrary string, it may be ambiguous where the name ends; so you may enclose it in round/square/curly/angle brackets. Choose the one which does not occur in the column name. You don't even need to type -k, because a lone COLUMN-NAME is interpreted as "-k F", where F is the corresponding field number.
td-sort(1) is analogous to SQL ORDER BY.
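Without td-sort, the same effect needs the header split off by hand and the column referenced by number (what td-keepheader(1) plus sort(1) do; td-sort additionally maps the column name to -k):

```shell
# Keep the header in place; sort the body numerically by the 2nd (UID) column.
printf 'USER\tUID\ncarol\t3\nalice\t1\n' |
{ IFS= read -r header
  printf '%s\n' "$header"
  sort -t "$(printf '\t')" -k2,2n; }
```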
td-trans - Transform whitespace-delimited lines into TAB-delimited ones, ignoring surrounding whitespace.
-
-m, --max-columns NUM
Maximum number of columns. The NUM-th column may contain any whitespace. By default NUM is the number of fields in the header (first line).
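The normalization itself can be sketched in awk(1), where reassigning a field rebuilds the record with the output separator (a stand-in without the --max-columns logic):

```shell
# Collapse whitespace runs to single TABs and drop surrounding whitespace.
printf '  john   tty1   10:00  \n' |
awk -v OFS='\t' '{ $1 = $1; print }'
```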
td-trans-fixcol - Transform a table-looking text, aligned to fixed columns by spaces, into tabular data.
First line is the header consisting of the column names. Each field's text must start in the same terminal column as the column name.
-
-m, --min-column-spacing NUM
Minimum spacing between columns. Default is 2. This allows the input data to have column names with single spaces.
arp -n | td-trans-fixcol
td-trans-ls - Transform ls(1) output into a fixed number of TAB-delimited columns.
ls -l | td-trans-ls
Supported ls(1) options which affect its output format:
- --human-readable
- --inode
- --recursive
- --time-style={iso,long-iso,full-iso}
Unsupported options:
- --author
- -g
- -o
- --time-style=locale
td-trans-mount - Transform mount(1) output to tabular data stream.
Supported mount(1) options which affect output format:
- -l (show labels)
mount | td-trans-mount
mount -l | td-trans-mount
vcf2td - Transform VCF to tabular data format.
-
-c, --column COLUMN
Indicate that there will be a column by the name COLUMN. Useful if the first record does not contain all the fields which otherwise occur in the whole data stream. By default, vcf2td(1) recognizes the fields which are in the first record of the VCF input, and does not read ahead more records before sending the header. This option is repeatable.
-
-i, --ignore-non-existing-columns
Don't fail and don't warn when encountering new field names.
The tabular data format declares all field names in the column headers, so it cannot introduce new columns later on in the data stream (unless some records were buffered, which currently they are not). In VCF, however, each record may have fields different from those in the first record. That's why vcf2td(1) by default fails when it encounters a field it cannot convert to tabular.
-
-w, --warn-non-existing-columns
Only warns on new fields, but don't fail.
-
-g, --multivalue-glue STR
A string to glue repeated fields' values together when the repeated fields are handled by uniting their content into one tabdata column. Default is newline.
Note: even though newline is the default glue, if you want to be explicit about it (or want to set another glue STR usually expressed by some backslash sequence), then
vcf2td -g "\n" ...
probably won't quite work as one may expect (depending on one's shell), because the shell passes the 2-character string "backslash" + "n" instead of a string consisting of just one newline character. So, in bash, write it as
vcf2td -g $'\n' ...
-
N
N is for a contact's name, with its parts separated by semicolons (;). vcf2td(1) simplifies the N field by removing excess semicolons. If you need one or more name parts precisely, request the N.family, N.given, N.middle, N.prefixes fields with the -c option; but note that this name-partitioning method is not very useful internationally, so use the FN (full name) field for persons' names as much as you can.