To help me pick tags for my posts, I wrote a little bash script that lists the tags already used on my website.


I can’t remember whether I’ve ever really talked about the format of the repository that stores the content of this site.

To start, all the content of the website is written in reStructuredText (RST), which is a markup language with very minimal syntax. RST is ideal for this type of thing because it can be easily converted into other formats like LaTeX, wiki markup, EPUB and HTML.

RST has a way to list some metadata fields in each file. The metadata are basically just key/value pairs in the format ":key: value". So, for example, the date of each post is written like ":date: 2016-01-05 09:30" and the tags are listed like ":tags: tag1, tag2, tag3".
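
As a rough illustration of that format, here is a minimal sketch that pulls a single field out of a file. The get_field helper and the sample post are made up for this example and are not part of my actual script; it also assumes each field sits at the start of its own line.

```shell
#!/bin/bash

# Hypothetical helper: print the value of one ":key: value" field in a file.
get_field () {
    local key=$1 file=$2
    # Print only the lines that start with ":key: ", with that prefix removed
    sed -n "s/^:${key}: //p" "$file"
}

# A made-up post, written to a temporary file purely for illustration
post=$(mktemp)
cat > "$post" <<'EOF'
My Example Post
###############

:date: 2016-01-05 09:30
:tags: bash, script, site

Body text goes here.
EOF

get_field date "$post"   # prints: 2016-01-05 09:30
get_field tags "$post"   # prints: bash, script, site
rm -f "$post"
```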

One challenge that I have come across is that there is now enough content that I can’t keep track of all the tags that I’ve used in the past. I also wish to avoid having silly duplicate tags like "code" vs "coding" so I needed a way to view the tags that have already been used. The More Tags page was a start because it does list all of the tags. However, I’m not always sitting in front of a PC with my browser open.

So instead I wrote a bash script that would grep through the repository and list out all of the tags. At first that was all it did, and it printed all the tags in a single column. As time went on and the number of tags grew that became cumbersome again and I needed a better solution. Here is what I came up with:

#!/bin/bash

# Parse the tags out of the articles in the repository
tags=$(grep -hIsri --exclude get_tags.sh ":tags: " * | \
       sed -e 's/:tags: //g' -e 's/,//g' -e 's/ /\n/g' | \
       sort | uniq)

# Find the length of the longest tag
# this becomes the minimum per-tag output width
min_width=0
for tag in $tags; do
    if [[ ${#tag} -gt $min_width ]]; then
        min_width=${#tag}
    fi
done

# Add a few extra spaces for visual separation
min_width=$((min_width + 4))

# print tags with trailing padding to meet the minimum output width
function print_tags () {
    for tag in $tags; do
        printf "%-${min_width}s" "$tag"
    done
}

# Format the tag output so it looks nice
print_tags | fmt

I think everything in the script is fairly straightforward. First I use grep, sed, sort and uniq to get an alphabetical list of all of the tags with any duplicates removed.
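
To see that pipeline work on its own, here is a small sketch that feeds it two made-up ":tags:" lines instead of the grep output. The extract_tags name is hypothetical, and I have swapped in tr and sort -u for the sed newline replacement and uniq so that it does not depend on GNU sed.

```shell
#!/bin/bash

# Hypothetical helper: read ":tags: ..." lines on stdin and print each
# unique tag on its own line, sorted.
extract_tags () {
    # 1. sed strips the ":tags: " field name and the commas
    # 2. tr turns every run of spaces into a single newline
    # 3. sort -u sorts the tags and drops duplicates
    sed -e 's/:tags: //g' -e 's/,//g' |
        tr -s ' ' '\n' |
        sort -u
}

# Two made-up metadata lines; "bash" appears twice on purpose
printf '%s\n' ':tags: bash, script, site' ':tags: linux, bash' | extract_tags
```

With that input it prints bash, linux, script and site, one per line.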

That was where the first version of the script ended. The improved version then goes on to loop through all of the tags and get the length of the longest one. This lets the script pad each tag with trailing space characters so that every tag takes up the same width. The final step is to call the fmt tool, which wraps the stream of uniform-width strings into nice columns so that it is easier to read through.
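
The padding trick is easier to see with a handful of tags. The following is only a sketch: pad_tags is a hypothetical helper, the tags and the 50 character line width are arbitrary, and it assumes GNU fmt, which keeps the spacing between words when it wraps lines.

```shell
#!/bin/bash

# Hypothetical helper: print every argument after the first, left-justified
# and padded with trailing spaces to the width given as the first argument.
pad_tags () {
    local width=$1
    shift
    local tag
    for tag in "$@"; do
        # "%-Ns" left-justifies the tag and fills the rest with spaces
        printf "%-${width}s" "$tag"
    done
}

# The longest made-up tag is 7 characters, so 7 + 4 gives a width of 11.
# fmt then breaks the single long line into rows of at most 50 characters.
pad_tags 11 bash beer blog books btrfs code coffee concert | fmt -w 50
```

Because every padded tag has the same width, each wrapped row starts its tags at the same offsets, which is what produces the columns.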

That’s it! Now I can easily glance through and choose tags for my articles.

Following is an example of the output if you’re curious:

2015                 3d                   7-string             AC
android              announcement         apartment            apple
apt                  avengers             bands                banff
bash                 beer                 bitbucket            blog
bodom                books                bose                 breakfast
browser              btrfs                c++                  cell
chores               chromecast           circuits             cleaning
code                 coffee               cold                 commute
concert              crunchbang           crypto               cube
curling              day-off              delayed              dhcp
donation             dropbox              ed-bounty-hunting
editor               ed-ships             ed-trading
elite-dangerous      email                encryption
fall                 family               fedora               fiction
filesystem           file-transfer        flying               forgetful
game                 games                gaming               gift
git                  github               google-chrome        gpg
graphics             grinder              gtd                  gtx970
guitar               haggis               headphones           htc
ikea                 ip                   iptables             jedi
joystick             juice                kelowna              ksp
landlord             laptop               leak                 learning
linux                long-weekend         luggage              lunch
lvm                  magnific-popup       markdown             massage
memory               metal                monday               monitor
mosfet               motivation           move                 movie
movies               moving               mtp                  multiarch
music                mutt                 neighbors            networking
networkmanager       new-year             nic                  nm-applet
nvidia               office-renos         old                  optometrist
packing              partitioning         PC                   pelican
php                  pics                 podcast              police
policykit            poop                 procrastination      projects
pun                  python               rain                 reading
recording            robbie-burns         rpm                  rsync
scotch               script               security             sgit
shoes                shower               sick                 site
snow                 software-release     sore                 sore-back
sore-feet            soundcloud           space                spring
standing-desk        star-wars            steam                story
strings              sudoers              superheroes          systemd
tablet               terminal             theater              tires
tmpfs                tmux                 travel               tv
twitter              udev                 urxvt                vacation
vim                  virtualbox           virtualenv           vm
vmdk                 w3m                  wallet               warranty
weather              weird-people         westjet              windows
winter               work                 xanth                xmas
yoga                 yuml