I made a program called Thoughts — here's a line-by-line description of what its code does

Early in lockdown, I made a program for writing short text posts in a terminal and putting them on the internet. It's called Thoughts. It was a super fun experience, and I use it to this day. This is an overly detailed explanation of how it works.

What is Thoughts?

As its README description reads, Thoughts is “a POSIX-compliant shell program for making less-than-a-blog-sized text posts from a terminal.”

Like many, during early lockdown my Twitter usage approximately quadrupled. This led to bird-site burnout, and I wanted a different outlet for spouting off on the internet in a way that was linkable, timestamped, and low-friction. I couldn't find any existing tools that worked the way I wanted, and I desperately needed something engaging to spend all my newly-free time on, so I decided to make the thing.

Before writing code, I knew what I wanted the workflow to look like:

  1. Type a command in a terminal and press enter
  2. A terminal-based text editor opens
  3. Type the text you want to post on the internet
  4. Save and exit the editor
  5. The text is on the internet

I also knew that I wanted Thoughts not to depend on an existing platform to work. There are already lots of ways to tweet from a terminal, and corollaries for other social media platforms too. But the point of Thoughts isn't to enable a user to post something from a terminal — its point is to enable a user to put their thoughts on the internet in a way that's easily shareable but otherwise immune to the freaky virality economy of a platform like Twitter. So Thoughts just emits a web page and gives each post a unique link. This means it's self-hosted, which is unfortunate. It would be great if something like Thoughts existed that was more accessible to non-technical users.

That's more than enough backstory. Let's talk about code!

How it works

Thoughts is just POSIX-compliant shell code, HTML, CSS, and some AWK. Initially, I didn't imagine that Thoughts would be something anyone else would use, so I didn't put much forethought into its architecture. This was definitely the most enjoyable part of working on it: it was just nothing matters, look it works, wow that's so fun, let's keep going. If I could go back and do it again, I'd do the exact same thing.

Thoughts is comprised of a number of little files. Let's look at them:

[Screenshot: the root directory of the Thoughts git repository on GitHub]

.gitignore is the gitignore file, LICENSE is the MIT license it uses, and README.md is the readme (and the source of the breaking change referenced in that most recent commit). None of those files are actually part of the program.

.foot.html and .head.html are among the quirkier parts of Thoughts' design. They hold basically everything that will eventually go in the web page that Thoughts outputs, other than the user's actual posts. Head becomes the beginning of the HTML document, so it contains meta tags, CSS, and everything else that comes before the posts. Foot becomes the end, and mostly contains closing tags and a link to the source code:

<hr>
<p style="text-align:center">
<a href=https://github.com/thwidge/thoughts>source</a>
</p>
</body>
</html>

install.sh

install.sh is the installer, and it looks like this:

#!/bin/sh
set -euf

if [ -z "${1+x}" ]; then
  cmd='default'
else
  cmd="$1"
fi

binDir=$HOME/.local/bin
stuffDir=$HOME/.local/share/thoughts

if [ -d "$stuffDir" ]; then
    printf "Thoughts is already installed. Reinstall? [y/n]: "
    read -r reply
    if [ ! "$reply" = "y" ]; then
        echo
        echo "OK, nothing's been installed."
        exit 0
    fi
fi

# copy all the program files to the right places
# and also write the local ignore file
#
# haha I just realized that this doesn't have to
# write the ignore file, it could just rename an existing file
mkdir -p "$stuffDir"/bin
cp parse.awk "$stuffDir"/bin
cp update.sh "$stuffDir"/bin
cp README.md "$stuffDir"
cp .head.html "$stuffDir"
cp .foot.html "$stuffDir"
touch "$stuffDir"/.rawthoughts.html
echo '*' > "$stuffDir"/.gitignore
echo '!thoughts.html' >> "$stuffDir"/.gitignore
echo '!.gitignore' >> "$stuffDir"/.gitignore
echo '!.rawthoughts.html' >> "$stuffDir"/.gitignore
echo '!.head.html' >> "$stuffDir"/.gitignore
echo '!cloudbuild.yaml' >> "$stuffDir"/.gitignore
echo '!Dockerfile' >> "$stuffDir"/.gitignore

mkdir -p "$binDir"
cp thoughts "$binDir"
chmod +x "$binDir"/thoughts

if [ "$cmd" = "another" ]; then
    printf "What's the git clone URL for your existing thoughts repository?: "
    read -r reply
    git clone "$reply" "$stuffDir"/thoughts-temp
    cp -r "$stuffDir"/thoughts-temp/. "$stuffDir"
    rm -rf "$stuffDir"/thoughts-temp
    echo
    echo 'Done! Add $HOME/.local/bin to your PATH'
    exit 0
fi

echo
echo 'Done! Add $HOME/.local/bin to your PATH, and create a git repo: https://github.com/marenbeam/thoughts#first-install'

The logic is:

  1. Check if the user issued ./install.sh or ./install.sh another
  2. Check whether Thoughts has already been installed
  3. Copy all the necessary program files into the correct places in the user's $HOME, and write a gitignore file
  4. If the user issued ./install.sh another, clone their existing thoughts page from a remote git repository

That's it! It's pretty simple, and there aren't many guardrails or choices.
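The one slightly cryptic bit in the installer is the ${1+x} test at the top. Here's a minimal sketch of what it does, pulled out into a made-up helper called pick_cmd (that name isn't in Thoughts; it's only here for illustration):

```shell
#!/bin/sh
set -eu

# Hypothetical helper: report which command the arguments select.
# ${1+x} expands to "x" if $1 is set (even to an empty string) and to
# nothing if it is unset, so the test is safe even under `set -u`.
pick_cmd () {
    if [ -z "${1+x}" ]; then
        echo 'default'
    else
        echo "$1"
    fi
}

pick_cmd            # prints: default
pick_cmd another    # prints: another
```

A shorter spelling would be cmd=${1:-default}, though that also treats an empty first argument as "default", which the original test does not.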

update.sh

update.sh is run when the user wants to update Thoughts to the newest version, and it's broken out into its own program so that it can run the way it needs to. Here's what it does:

#!/bin/sh
set -euf

stuffDir=$HOME/.local/share/thoughts
binDir=$HOME/.local/bin

###
### This script must handle *all* "update" steps
### because it reinstalls its own caller (thoughts itself)
###

cp "$stuffDir"/thoughts-temp/.foot.html "$stuffDir"
echo "copied footer"
cp "$stuffDir"/thoughts-temp/README.md "$stuffDir"
echo "copied readme"
cp "$stuffDir"/thoughts-temp/parse.awk "$stuffDir"/bin
echo "copied parse"
cp "$stuffDir"/thoughts-temp/thoughts "$binDir"
echo "copied thoughts itself"
chmod +x "$binDir"/thoughts
echo "chmod thoughts"

# Handle the possibility of overwriting user's custom CSS
echo
if ! diff "$stuffDir"/thoughts-temp/.head.html "$stuffDir"/.head.html; then
    echo
    echo "WARNING:"
    echo "The CSS in this release is different than what you currently have."
    echo "It could be upstream updates, or maybe you made some customizations."
    echo "Check out the diff above."
    echo
    echo "If you haven't made custom CSS changes, you can safely overwrite and install."
    echo "If you HAVE made CSS changes, just select 'n' and the new CSS will be written somewhere else."
    echo
    printf "DO YOU WANT TO OVERWRITE YOUR CSS? [y/n]:"
    read -r reply
    if [ "$reply" = "y" ]; then
        cp "$stuffDir"/thoughts-temp/.head.html "$stuffDir"
        echo
        echo "CSS overwritten and installed. You're good to go!"
    else
        cp "$stuffDir"/thoughts-temp/.head.html "$stuffDir"/.head-new.html
        echo
        echo "New CSS is in $HOME/.local/share/thoughts/.head-new.html"
        echo "If you want, you can update the CSS yourself with \"thoughts style\""
    fi
fi
rm -rf "$stuffDir"/thoughts-temp
echo
echo "Done updating!"
exit 0

And here's where it's called from the main Thoughts program:

# "update" command
update () {
    git clone https://github.com/thwidge/thoughts.git "$stuffDir"/thoughts-temp
    cd "$stuffDir"/thoughts-temp
    cp "$stuffDir"/thoughts-temp/update.sh "$stuffDir"/bin/update.sh
    chmod +x "$stuffDir"/bin/update.sh
    
    # run update.sh from the new release
    sh "$stuffDir"/bin/update.sh
}

update.sh is broken into its own program for two reasons:

  1. So that it's possible for an update to Thoughts to involve changes to the update script itself (on update, thoughts runs update.sh from the newly cloned repo rather than from the already-installed version)
  2. Because whatever thing is doing the updating must delete the previous version of thoughts itself. I'm not necessarily sure that a shell program can't delete its own file and then continue running, and I'm sure there's a clear answer to this question that involves understanding subshells and processes, but I just made it this way and then moved on. If the way I made it is actually broken, feel free to send me an email.
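For the curious, here's a small experiment on the "can a script delete its own file and keep running" question. It's only a sketch: the scratch directory and the selfdelete.sh name are made up, and mktemp -d is widely available but not strictly POSIX. On the Unix-like systems I know of, removing a file only removes its name, and a process that already has the file open keeps going.

```shell
#!/bin/sh
set -eu

# Scratch directory so nothing real gets deleted.
demo=$(mktemp -d)

# Write a tiny script that removes its own file and then keeps going.
cat > "$demo/selfdelete.sh" <<'EOF'
#!/bin/sh
rm -- "$0"
echo "still running"
EOF

output=$(sh "$demo/selfdelete.sh")
echo "$output"

if [ ! -e "$demo/selfdelete.sh" ]; then
    echo "and the script file is gone"
fi
```

Note that this only covers deleting. Overwriting a running script in place (which is what cp onto an existing file does) is a different situation, and as far as I know it can behave unpredictably because the shell may read the rest of the script from the changed file.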

thoughts

Finally! The actual thing that runs each time you post a thought.

One of the most fun parts of working on Thoughts was using a bunch of coreutils I'd never used before, in ways I'd never used them. I decided that if Thoughts was going to be largely inaccessible to people who weren't already pretty computery, then I'd at least make it as accessible to computery people as possible. That meant I needed to strive for POSIX compliance and edge-case portability everywhere so that Thoughts could work well on any UNIX-adjacent system.

This introduced some really fun constraints.

Let's look at just one block of code, since a lot of the logic is duplicated in each command:

default () {
    cd "$stuffDir"
    # get an editor so we can type our thought.
    # generate a random temp filename to avoid collisions.
    # who knows what's in there!
    rand=$(date | cksum | tr -d ' ')
    
    "${EDITOR:-vi}" "$rand".txt
    if [ ! -f "$rand".txt ]; then
        echo
        echo "you don't always have to share your thoughts"
        exit 0
    fi
    # If this thought doesn't have a trailing newline, add one
    tail -c 1 "$rand".txt | read -r _ || echo >> "$rand".txt
    
    # replace some newlines with <br>
    # and convert codeblock tag into real one
    # and linkify things outside of codeblocks
    awk -f "$stuffDir"/bin/parse.awk "$rand".txt > temp.txt && mv temp.txt "$rand".txt
    # get the last 5 bytes from the file ("<br>" plus the final newline);
    # $(...) strips the newline. if what's left is "<br>", delete it.
    br=$(tail -c 5 "$rand".txt)
    if [ "$br" = '<br>' ]; then
        sed '$ s/.\{4\}$//' "$rand".txt > temp.txt && mv temp.txt "$rand".txt
    fi
    thought=$(cat "$rand".txt)
    
    now=$(date +"%I:%M %p | %Y-%m-%d")
    dateHash=$(date | cksum | tr -d ' ')
    blob="<section class=\"thought\"><div class=\"thought-date\"><a class=\"thought-date\" id=\"$dateHash\" href=\"#$dateHash\">\n$now</a></div><div class=\"thought\">\n$thought\n</div></section>\n"
    
    git pull
    echo "$blob" | cat - .rawthoughts.html > "$dateHash".html && mv "$dateHash".html .rawthoughts.html
    cat .head.html .rawthoughts.html .foot.html > thoughts.html
    git add .
    git commit -m "update thoughts"
    git push
    rm "$rand".txt
    echo
    echo "your thoughts have been shared"
}

First, thoughts generates a very-much-not-random value that's misleadingly named rand. We're going to use this value in a few places:

rand=$(date | cksum | tr -d ' ')

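This command says “output a checksum of the date, then delete the space in cksum's output.” cksum prints two numbers separated by a space, a checksum and a byte count, and tr -d ' ' removes only the space, so the two get squashed together into one long integer. We're doing this because we just want a POSIXy way to get a unique-ish integer; we don't care about the actual checksum of the date. Here's what we get without tr -d ' ':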

thwidge@uwc:~$ date
Mon 26 Oct 2020 10:07:25 PM EDT
thwidge@uwc:~$ date | cksum
927889335 32

Now that we've got our unique-ish rand integer, we want to open a file <rand>.txt in a text editor. The goal is to start writing in a new text file with a randomish name rather than a hard-coded name, just in case something weird happened and the user has an unpublished thought saved in this directory as a result of a previous crash.

"${EDITOR:-vi}" "$rand".txt
# If $rand.txt doesn't exist, the user quit without saving or something
# Handle and exit -- they did not write a thought
if [ ! -f "$rand".txt ]; then
    echo
    echo "you don't always have to share your thoughts"
    exit 0
fi

The first line "${EDITOR:-vi}" "$rand".txt took me an unreasonably long time to figure out because I was thinking too hard. It says “open $rand.txt using the program specified by $EDITOR, or fall back to vi if $EDITOR is unset or empty”. For many people this is vim, for others it's nano, for others it's neovim or ed or emacs. Great! We can all live in harmony and post our thoughts on the internet 😉

If after we return from the editor $rand.txt does not exist, then we know that the user exited the editor without saving anything. We print a friendly message to the terminal and exit the whole program.
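To see the fallback in isolation without actually launching an editor, here's a minimal sketch. The chosen_editor helper is made up for this example; it just prints what would be run.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the editor that would be launched.
# ${EDITOR:-vi} uses $EDITOR when it is set and non-empty, otherwise "vi".
chosen_editor () {
    printf '%s\n' "${EDITOR:-vi}"
}

# Each case runs in a subshell so the changes to EDITOR don't leak out.
( unset EDITOR; chosen_editor )    # prints: vi
( EDITOR=''; chosen_editor )       # prints: vi
( EDITOR=nano; chosen_editor )     # prints: nano
```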

If the user did write a thought, we need to be sure there's a newline at the end of it. Not all editors always add a newline to the end of a file (VSCode most notably), but the tools we run next expect every line of a text file, including the last one, to end with a newline (that's how POSIX defines a line). We do that here:

tail -c 1 "$rand".txt | read -r _ || echo >> "$rand".txt

This line says “pipe the last character of the file containing the user's thought into read, and if read doesn't exit with zero then append nothing to the file with echo.”

I honestly don't completely understand how this line is working, and it depends on a few coreutils hacks.

  1. read consumes one line of input, and a line ends with a newline. If the one character we've sent it isn't a newline, read hits the end of its input before finding one and exits non-zero, so the left side of the OR test evaluates to false
  2. If the left side of the OR test evaluates to false, we can just add a newline by using echo to append “nothing” to the file. This actually appends a newline, because echo always writes a newline after its arguments, even when there are none. This is standard behavior: POSIX specifies that echo with no arguments writes only a newline.
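Here's a minimal sketch that exercises the one-liner on two scratch files, one with a trailing newline and one without. The ensure_trailing_newline name and the file names are made up for this example, and mktemp -d is widely available but not strictly POSIX.

```shell
#!/bin/sh
set -eu

# Hypothetical helper wrapping the one-liner from Thoughts.
# read returns non-zero when it hits end-of-input before seeing a newline,
# and echo with no arguments writes just a newline.
ensure_trailing_newline () {
    tail -c 1 "$1" | read -r _ || echo >> "$1"
}

demo=$(mktemp -d)
printf 'no newline' > "$demo/a.txt"       # 10 bytes, no trailing newline
printf 'has newline\n' > "$demo/b.txt"    # 12 bytes, already terminated

ensure_trailing_newline "$demo/a.txt"
ensure_trailing_newline "$demo/b.txt"

wc -c < "$demo/a.txt"   # 11: one newline was appended
wc -c < "$demo/b.txt"   # 12: unchanged
```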

Overall, one of the biggest things I learned while working on Thoughts is that dealing with newlines is pretty hard.

Now that we've made sure the user's thought has a newline at the end, we're ready to convert what they've typed into valid HTML that's ready to get posted on the internet. We use an AWK script to do that here:

awk -f "$stuffDir"/bin/parse.awk "$rand".txt > temp.txt && mv temp.txt "$rand".txt

Originally I was going to do a walkthrough of this AWK script here, but I've decided to break it out into another blog post, since AWK is its whole own thing!

Much of the time, an unnecessary <br> tag is added to the last line of the post by the AWK script, so we need to check for this and remove it:

br=$(tail -c 5 "$rand".txt)
if [ "$br" = '<br>' ]; then
    sed '$ s/.\{4\}$//' "$rand".txt > temp.txt && mv temp.txt "$rand".txt
fi

First we grab the last five characters with tail, because we know that the last character is a newline and we want the four characters that precede it. If those four characters equal <br>, then we delete the last four characters of the line with sed '$ s/.\{4\}$//'. The five-versus-four mismatch comes from how each tool treats the newline: tail -c counts it as one of the five bytes (and $(...) then strips it before the comparison), while sed works on the contents of each line without its trailing newline, so to sed the last four characters are <br>. Like I said above, newlines are pretty hard.

I'm going to cheap out on breaking down the syntax of that sed regex, because it was hard to figure out and I don't have to understand it anymore! 😅 But I won't leave you totally hanging. Here's a link to an awesome site with great information about all things relating to Unix coreutils: https://www.grymoire.com/Unix/Sed.html
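For anyone who does want the breakdown, here's a minimal sketch with the pieces of the sed expression annotated. The strip_trailing_br helper and the post.txt file are made up for this example, and mktemp -d is widely available but not strictly POSIX.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: drop a trailing "<br>" from the last line of a file.
strip_trailing_br () {
    # tail -c 5 grabs "<br>" plus the final newline; command substitution
    # strips trailing newlines, leaving four characters to compare.
    last=$(tail -c 5 "$1")
    if [ "$last" = '<br>' ]; then
        # $         -> only act on the last line of the file
        # .\{4\}$   -> any four characters at the end of that line
        # s/...//   -> replace them with nothing
        sed '$ s/.\{4\}$//' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    fi
}

demo=$(mktemp -d)
printf 'first line<br>\nlast line<br>\n' > "$demo/post.txt"
strip_trailing_br "$demo/post.txt"
cat "$demo/post.txt"
# prints:
# first line<br>
# last line
```

Only the last line loses its tag. The <br> on the first line is untouched because of the $ address in front of the substitution.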

Finally, we're ready to package our thoughts post up and add it to the top of the existing file that has all of this user's previous posts in it. We do that here:

thought=$(cat "$rand".txt)
    
now=$(date +"%I:%M %p | %Y-%m-%d")
dateHash=$(date | cksum | tr -d ' ')
blob="<section class=\"thought\"><div class=\"thought-date\"><a class=\"thought-date\" id=\"$dateHash\" href=\"#$dateHash\">\n$now</a></div><div class=\"thought\">\n$thought\n</div></section>\n"

git pull
echo "$blob" | cat - .rawthoughts.html > "$dateHash".html && mv "$dateHash".html .rawthoughts.html
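One portability wrinkle worth knowing about: inside double quotes, the \n sequences in blob are a literal backslash and n. They only turn into newlines if echo interprets backslash escapes, and POSIX leaves that implementation-defined (dash's echo does, bash's doesn't by default). Here's a sketch of an alternative that builds the same HTML with printf instead. The render_thought helper is made up for this example and isn't how Thoughts does it.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: build the HTML for one post.
# $1 = id, $2 = date stamp, $3 = already-processed thought text.
# printf interprets \n in its format string in every POSIX shell, and %s
# inserts the arguments untouched, backslashes included.
render_thought () {
    printf '<section class="thought"><div class="thought-date"><a class="thought-date" id="%s" href="#%s">\n%s</a></div><div class="thought">\n%s\n</div></section>\n\n' \
        "$1" "$1" "$2" "$3"
}

render_thought 274890470929 '03:20 PM | 2020-12-05' 'A path like C:\new stays intact.'
```

With the echo version, a thought containing a backslash sequence could also get rewritten on shells whose echo interprets escapes, which is the other reason %s is a little safer here.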

Here, we've already done all the processing needed to get the actual text that the user typed ready for the web. Now we just need to get it into the HTML file with all the previous thoughts.

First we get the contents of $rand.txt into a variable we can work with more easily (thought). Then we generate our right-now date stamp, now. (We already called date once to make rand, but we want a fresh reading in case the user started typing this post a long time ago and that earlier one is now very inaccurate.) Then we generate another unique-ish number, dateHash, which we use as the unique id attribute for linking to this specific thought. Finally, we package it all into blob with all the surrounding HTML it needs:

<section class="thought"><div class="thought-date"><a class="thought-date" id="274890470929" href="#274890470929">
03:20 PM | 2020-12-05</a></div><div class="thought">
Here's an example of one complete thoughts post. <br>
<br>
This is what the variable <code>blob</code> holds in the above code block.
</div></section>

.rawthoughts.html contains all of the user's previous thoughts posts, so we add this new thought to the file with the old thoughts with this:

echo "$blob" | cat - .rawthoughts.html > "$dateHash".html && mv "$dateHash".html .rawthoughts.html
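The prepend-through-a-temporary-file pattern in that line can be tried on its own. Here's a minimal sketch; the prepend helper and the scratch directory are made up for this example, and mktemp -d is widely available but not strictly POSIX.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: put new text at the top of a file.
# A file can't safely be read and overwritten in the same redirection, so
# the result goes to a temporary file that then replaces the original.
prepend () {
    printf '%s\n' "$2" | cat - "$1" > "$1.new" && mv "$1.new" "$1"
}

demo=$(mktemp -d)
printf 'older post\n' > "$demo/.rawthoughts.html"
prepend "$demo/.rawthoughts.html" 'newer post'
cat "$demo/.rawthoughts.html"
# prints:
# newer post
# older post
```

The - argument tells cat to read standard input first, which is why the new post lands above the old ones.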

If you'll notice though, we do a git pull before doing this. This is part of the work of handling the “thoughts can be installed on N computers” feature. If the user has posted thoughts from another computer between the last time they posted from this computer and now, then we need to get that updated .rawthoughts.html from the remote before publishing this new thought. What's more, since we do this right before actually adding the new thought, the user could even have:

  1. Started typing a thought on this computer, but not finished it
  2. Posted a thought from another computer
  3. Come back to this computer and then finished their previously half-finished thought and posted it

And everything should still work! Very nice 🤓

After this, we're almost done publishing this thought. First we package all the thoughts, including the new one, into the final version of the webpage with:

cat .head.html .rawthoughts.html .foot.html > thoughts.html
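To make the head, posts, and foot split concrete, here's a minimal sketch that glues three stand-in fragments into one page. The fragment contents and the scratch directory are made up for illustration (and mktemp -d is widely available but not strictly POSIX); only the file names and the cat order come from Thoughts.

```shell
#!/bin/sh
set -eu

# Scratch directory so the sketch doesn't touch a real install.
demo=$(mktemp -d)
cd "$demo"

# Stand-in fragments; the real .head.html and .foot.html are much longer.
printf '<html>\n<body>\n' > .head.html
printf '<section class="thought">hello</section>\n' > .rawthoughts.html
printf '</body>\n</html>\n' > .foot.html

# The page is just the three fragments concatenated in order.
cat .head.html .rawthoughts.html .foot.html > thoughts.html
cat thoughts.html
```

Because the page is rebuilt from scratch on every post, thoughts.html never needs to be edited in place.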

And then push to remote and clean up a bit:

git add .
git commit -m "update thoughts"
git push
rm "$rand".txt
echo
echo "your thoughts have been shared"

I think this covers most of the interesting parts of Thoughts, and I'm running out of steam on working on this post, so that's all for now! Feel free to send me an email with any thoughts or questions you might have :)