
tr/// (aka y///)

Perl's tr/// is the sleeper op 🔧: blazing fast for byte/char mapping, deleting, and squeezing. It’s not a regex; it’s a transliterator.


Quick facts

  • Syntax: $str =~ tr/SEARCH/REPLACE/flags
  • Returns: number of characters matched, i.e. replaced or deleted (super handy!); with /r it returns the new string instead.
  • Modifies the string in place (use /r to return a copy instead).
  • Flags:

    • d = delete matched chars that have no counterpart in REPLACE
    • c = complement the search set (operate on everything except it)
    • s = squeeze runs of the same replaced char into one
    • r = (Perl ≥5.14) return result, don’t modify original
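A minimal sketch of what each flag does to the same input. The sample string and variable names are made up for illustration, and /r needs Perl 5.14 or later:

```perl
use strict;
use warnings;

my $s = "aabbccdd";

# No flags: a->x, b->y; everything else is left alone.
(my $plain = $s) =~ tr/ab/xy/;        # "xxyyccdd"

# /d: matched chars with no counterpart in REPLACE are deleted.
(my $deleted = $s) =~ tr/abc/x/d;     # a->x, b and c deleted: "xxdd"

# /c: complement, so operate on everything NOT in SEARCH.
(my $comp = $s) =~ tr/ab/-/c;         # "aabb----"

# /s: squeeze runs of the same translated char into one.
(my $squeezed = $s) =~ tr/ab/xy/s;    # "xyccdd"

# /r: return the result and leave the original untouched.
my $copy = $s =~ tr/a-d/A-D/r;        # "AABBCCDD"

print join(" ", $plain, $deleted, $comp, $squeezed, $copy, $s), "\n";
```

The `(my $x = $s) =~ tr/.../.../` idiom copies first and then edits the copy, which is the pre-5.14 way of getting what /r does.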

Greatest hits

1) Count occurrences (fast)

my $n = ($s =~ tr/a/a/);        # how many 'a'?
my $digits = ($s =~ tr/0-9//);  # how many digits?

2) Whitelist filter (keep only allowed chars)

$s =~ tr/a-zA-Z0-9_.-//cd;      # delete everything NOT in the set

3) Delete specific chars

$s =~ tr/\r\n//d;               # strip CR/LF
$s =~ tr/ //d;                  # remove spaces

4) Squeeze runs (collapse duplicates)

$s =~ tr/ / /s;                 # collapse multiple spaces to one
$s =~ tr/\n/\n/s;               # collapse runs of newlines into one (drops blank lines)

5) ROT13 in-place (classic)

$s =~ tr/A-Za-z/N-ZA-Mn-za-m/;
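A quick round-trip sketch (the sample string is just an illustration): because ROT13 is its own inverse, running the same tr twice should give back the original.

```perl
use strict;
use warnings;

my $s = "Hello, World!";

(my $rot  = $s)   =~ tr/A-Za-z/N-ZA-Mn-za-m/;   # encode
(my $back = $rot) =~ tr/A-Za-z/N-ZA-Mn-za-m/;   # same mapping decodes

print "$rot\n";    # Uryyb, Jbeyq!
print "$back\n";   # Hello, World!
```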

6) Map one set to another (1:1 by position)

$s =~ tr/äöü/aou/;              # ä→a, ö→o, ü→u (needs "use utf8" and a decoded string)
$s =~ tr/+-/_./;                # replace '+'→'_', '-'→'.'

7) Non-destructive variant

my $t = $s =~ tr/0-9//dr;       # copy with digits removed; $s unchanged (note the /d)
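A small sketch of /r in a few common spots (Perl 5.14 or later; the sample data and variable names are invented for illustration). Deleting still needs /d alongside /r, otherwise the copy comes back unchanged:

```perl
use strict;
use warnings;

my $s = "Order #12-34";

my $digits_only = $s =~ tr/0-9//cdr;    # delete all non-digits: "1234"
my $upper       = $s =~ tr/a-z/A-Z/r;   # ASCII-only uppercase: "ORDER #12-34"

# /r makes tr usable inside map without clobbering the source list.
my @raw   = ("a b", " c ");
my @clean = map tr/ //dr, @raw;         # ("ab", "c")

print "$digits_only | $upper | @clean | $s\n";
```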

8) Fast “has only allowed chars?” check

if ( ($s =~ tr/a-zA-Z0-9_.-//c) == 0 ) { ... }  # 0 affected ⇒ all allowed
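The same check wrapped as a function, as a sketch; `only_allowed` is a hypothetical helper name, not something from a module:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str contains only [A-Za-z0-9_.-].
sub only_allowed {
    my ($str) = @_;
    # /c with an empty REPLACE and no /d only counts the chars
    # outside the set; nothing is modified.
    return ($str =~ tr/a-zA-Z0-9_.-//c) == 0;
}

print only_allowed("report_2024.txt") ? "ok\n" : "rejected\n";   # ok
print only_allowed("bad name!")       ? "ok\n" : "rejected\n";   # rejected
```

Note that an empty string also passes, since it has zero disallowed characters; add a length check if that matters.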

Gotchas

  • It’s positional mapping, not pattern matching: tr/abc/xyz/ maps a→x, b→y, c→z.
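  • If REPLACE is shorter, the last char is reused unless /d is set, e.g. tr/abc/x/ maps a→x, b→x, c→x. Use /d to delete the extras: tr/abc/x/d maps a→x, deletes b and c. If REPLACE is empty and /d is not set, SEARCH is reused as REPLACE, which is why tr/0-9// only counts.
  • For Unicode, tr/// always works on characters, so decode your input first (e.g. with Encode or an :encoding layer); on undecoded UTF-8 each byte counts as its own character. Literal non-ASCII chars inside the tr itself also need use utf8. Ranges like A-Z are by code point, not locale rules.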
  • Want case folding? Use lc/uc/fc unless you explicitly want simple mapping.
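A sketch that makes two of these gotchas visible: the short-REPLACE rule, and characters versus bytes. The sample strings are invented, and Encode (a core module) is used only to produce an undecoded byte string for comparison:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $s = "abcabc";

(my $reuse = $s) =~ tr/abc/x/;     # last REPLACE char reused: "xxxxxx"
(my $del   = $s) =~ tr/abc/x/d;    # a->x, b and c deleted:    "xx"
my $count  = ($s =~ tr/abc//);     # empty REPLACE, no /d: counts 6, $s unchanged

# Characters vs bytes: count everything outside ASCII.
my $chars = "h\x{e9}llo";              # decoded: e-acute is ONE character
my $bytes = encode('UTF-8', $chars);   # encoded: e-acute is TWO bytes

my $n_chars = ($chars =~ tr/\x00-\x7F//c);   # 1
my $n_bytes = ($bytes =~ tr/\x00-\x7F//c);   # 2

print "$reuse $del $count $n_chars $n_bytes\n";
```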

Tiny patterns you’ll actually use

# keep only printable ASCII plus newline/tab (drops control chars and all non-ASCII):
$s =~ tr/\x20-\x7E\n\t//cd;

# normalize whitespace to single spaces:
$s =~ tr/\t\r\n / /s;

# filename-safe slug (lowercase + whitelist + squeeze underscores + trim):
$s = lc $s; $s =~ tr/a-z0-9-/_/c; $s =~ tr/_/_/s; $s =~ s/^_+|_+$//g;

# “does the string contain any vowels?” (the return value is the count, so nonzero = yes)
if ( $s =~ tr/aeiouAEIOU// ) { ... }
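The slug recipe as a reusable sketch; `slugify` is a hypothetical helper name. It folds the replace and squeeze steps into a single tr with /cs, which should behave the same because every non-whitelisted char maps to the same '_':

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase, whitelist [a-z0-9-], squeeze, trim.
sub slugify {
    my ($s) = @_;
    $s = lc $s;
    $s =~ tr/a-z0-9-/_/cs;   # anything outside the set -> '_', runs squeezed
    $s =~ s/^_+|_+$//g;      # trim leading/trailing underscores
    return $s;
}

print slugify("  My Report (final).TXT "), "\n";   # my_report_final_txt
```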