TorWAL - tracking work hours with Python

Like many others, I suddenly found myself working primarily from home. The lines between working and not working were fading. I suspected I worked too many hours (spoiler; 1), and also that real development work got overshadowed by an increasing amount of Slack and video meetings (spoiler; 2). 😬

So I created this Window Activity Logger (WAL, not an overloaded term at all!) - a tool to collect and make stats of your application use (in Linux). It was previously known as window-activity-logger, but as a friend pointed out, TorWAL was more fitting. I only use it for tracking work hours (more on the how later), but it is also applicable to tracking and categorizing all your activities.

Example

Looking back at September, for example:

$ torwal stats --since 2021-09-01 --before 2021-10-01
--- Top 10 uncategories ---
[...]
--- Top 10 active windows ---
[...]
--- Top 10 categories ---
 49h42m (31%) of Slack
 27h55m (18%) of Firefox (uncategorized)
 18h47m (12%) of Video Meeting
 15h19m (10%) of Firefox - Business internal documentaion tool
 10h36m (7%) of Terminal (uncategorized)
 10h13m (6%) of VIM (in dev folders)
 09h52m (6%) of Terminal (in dev folders)
 07h42m (5%) of Firefox - Monitoring stack tools
 04h54m (3%) of Firefox - Google Cloud Platform
 03h55m (2%) of Firefox - GitHub Pull Request
--- Active time (at all hours) ---
 2021-09-01 (Wed):  08h04m ( 00h34m) |                    ▃▃▇▇▇▇▇▇▇▇▇▅▆▇▇▅▄▁▇▇▅       |
 2021-09-02 (Thu):  06h40m (-00h49m) |                    ▆▇▇▇▇▆▇▇▇▇▇▇▅▇▇▂            |
 2021-09-03 (Fri):  05h43m (-01h46m) |                     ▇▇▇▆▇▅▂▇▇▇▇▇▄▃ ▅           |
 2021-09-04 (Sat):  00h00m
 2021-09-05 (Sun):  00h23m ( 00h23m) |                                        ▄▂      |
 2021-09-06 (Mon):  06h17m (-01h12m) |                     ▅▂▇▇▇▇▇▇▇▇▇▇▇▇▇▇▁          |
 2021-09-07 (Tue):  07h03m (-00h26m) |                      ▄▇▇▅▇▇▇▇▇▇▇▇▇▇▇▃▆         |
 2021-09-08 (Wed):  06h40m (-00h49m) |                     ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▄          ▁ |
 2021-09-09 (Thu):  06h43m (-00h47m) |                    ▁▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇  ▂         |
 2021-09-10 (Fri):  06h24m (-01h05m) |                    ▁▇▇▇▇▇▇▇▇▇▇▇▇▇▂             |
 2021-09-11 (Sat):  00h00m
 2021-09-12 (Sun):  00h00m
 2021-09-13 (Mon):  08h26m ( 00h56m) |                    ▃▇▇▇▅▂▇▇▇▇▇▇▇▇▇▇▅▇▇▃        |
 2021-09-14 (Tue):  08h07m ( 00h37m) |                    ▁▇▇▇▇▆▇▇▇▇▇▇▇▇▁▆▇▇▇▂▁       |
 2021-09-15 (Wed):  08h56m ( 01h26m) |                      ▁▆▇▇▇▇▇▇▇▇▇▇▅▇▅▇▇▇▄   ▆▅  |
 2021-09-16 (Thu):  08h08m ( 00h38m) |                    ▁▇▇▇▇▇▇▇▇▇▇▇▇▇▅▇▇▇          |
 2021-09-17 (Fri):  05h03m (-02h26m) |                     ▇▁▂▇▇▇▇▇▇▇▇▇▇▇▂            |
 2021-09-18 (Sat):  00h00m ( 00h00m) ||
 2021-09-19 (Sun):  00h51m ( 00h51m) |                                             ▆▁ |
 2021-09-20 (Mon):  08h54m ( 01h24m) |    ▇▇▇▁              ▇▆▄▇▇▇▇▇▇▇▇▇▇▇▇▇▇ ▄▂      |
 2021-09-21 (Tue):  06h59m (-00h30m) |                       ▅▇▇▇▇▇▆▇▇▇▇▇▇▇▇          |
 2021-09-22 (Wed):  08h13m ( 00h43m) |                      ▄▇▇▆▇▇▇▇▇▇▇▆▅▂    ▄▇▇▇▇▆▁▁|
 2021-09-23 (Thu):  08h08m ( 00h38m) |                     ▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▅▂    ▂▇▁   |
 2021-09-24 (Fri):  07h30m ( 00h00m) |                     ▃▇▇▇▇▇▇▂▆▇▃▇▇▇▇▇▇▇▃▁▁      |
 2021-09-25 (Sat):  02h59m ( 02h59m) |                               ▃            ▄▁▇▇|
 2021-09-26 (Sun):  03h22m ( 03h22m) |    ▇▇▇▇▆           ▄▇▄  ▁                      |
 2021-09-27 (Mon):  08h13m ( 00h43m) |                    ▂▇▇▇▆▆▇▇▇▇▇▇▇▇▇▆▇▄▄▃        |
 2021-09-28 (Tue):  04h53m (-02h36m) |                       ▃▄▇▃▇▇▇▇▇▇▇▅             |
 2021-09-29 (Wed):  07h49m ( 00h19m) |                      ▅▇▇▇▆▇▇▇▇▇▂▃▆▆▇▇▅     ▃▇▆ |
 2021-09-30 (Thu):  09h39m ( 02h09m) |                     ▄▇▇▇▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▅     |

 170h19m total
 05h19m off balance during this period

What you see first is a list of categories I spend my time in. Slack is shockingly high. I think it is a result of poor adoption of async communication and the supporting nature of being a Reliability Engineer. Slack is the new open-plan office. 🤷

Secondly, you get a list of days, each with the sum of all active time and whether or not I've hit my target of 7h30m. Each day also holds a nice ASCII histogram of the full day, where one full bar is 30 min of activity.

As you can see, I had on-call duties in the second-to-last week. 🤙
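The per-day balance and the row of bars can be sketched roughly like this. It is a minimal illustration, not TorWAL's actual code: the names `fmt`, `bar_row`, `BARS` and `SLOT_SECONDS` are made up for the example, and the rounding is an assumption, so the numbers may differ by a minute from the real output.

```python
from datetime import timedelta
from typing import Sequence

TARGET = timedelta(hours=7, minutes=30)  # daily target from the post
BARS = " ▁▂▃▄▅▆▇"                        # empty slot up to a full half hour
SLOT_SECONDS = 30 * 60                   # one column per 30 minutes


def fmt(delta: timedelta) -> str:
    """Format a duration like the stats output, e.g. '08h04m' or '-00h50m'."""
    total_minutes = int(abs(delta).total_seconds()) // 60
    sign = "-" if delta < timedelta(0) else ""
    return f"{sign}{total_minutes // 60:02d}h{total_minutes % 60:02d}m"


def bar_row(active_seconds_per_slot: Sequence[int]) -> str:
    """Turn active seconds per half-hour slot into one row of bars."""
    row = ""
    for seconds in active_seconds_per_slot:
        filled = min(seconds, SLOT_SECONDS) / SLOT_SECONDS
        row += BARS[round(filled * (len(BARS) - 1))]
    return f"|{row}|"


# Balance for a day with 8h04m of active time
print(fmt(timedelta(hours=8, minutes=4) - TARGET))
```

A full day would pass 48 slot values to `bar_row`, one per half hour.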

How does it work?

You'll find the code here https://github.com/torvald/TorWAL.

Every 10 seconds (one tick), I save a row in a SQLite3 database with the title of the currently active window and how long I've been inactive.
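To make the tick idea concrete, here is a small sketch of what storing one tick could look like. The table name `ticks` and its columns are assumptions for illustration; the real schema lives in the TorWAL repository and may look different. Fetching the window title and idle time from the desktop is left out, so the function takes them as arguments.

```python
import sqlite3
import time
from typing import Optional

TICK_SECONDS = 10  # tick length mentioned in the post


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) ticks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS ticks (
               ts INTEGER NOT NULL,           -- Unix time of the tick
               window_title TEXT NOT NULL,    -- title of the active window
               idle_seconds INTEGER NOT NULL  -- seconds since last input
           )"""
    )
    return conn


def log_tick(conn: sqlite3.Connection, window_title: str,
             idle_seconds: int, ts: Optional[int] = None) -> None:
    """Store one tick; the caller supplies title and idle time."""
    when = int(time.time()) if ts is None else ts
    conn.execute(
        "INSERT INTO ticks (ts, window_title, idle_seconds) VALUES (?, ?, ?)",
        (when, window_title, idle_seconds),
    )
    conn.commit()
```

A logger loop would call `log_tick` once every `TICK_SECONDS` and sleep in between.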

A tick counts as active time if you have been idle for less than an idle threshold (I use 5 min). If I attend a virtual meeting I use a slightly higher threshold, as virtual meetings tend to lead to some inactivity (from your window manager's point of view).

IDLE_TIME_GENERAL = 5 * 60  # Five min
IDLE_TIME_VIDEO_CONFERENCING = 20 * 60  # Twenty min
VIDEO_CONFERENCING_APP_PATTERN = "%<insert tool here>%"
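The threshold logic can be sketched as follows. This assumes the `%` in the patterns is an SQL `LIKE`-style wildcard (only `%` is handled here, not `_`), and the helpers `like` and `is_active` are names invented for this example. The pattern value is a stand-in for whatever tool you use.

```python
import re

IDLE_TIME_GENERAL = 5 * 60
IDLE_TIME_VIDEO_CONFERENCING = 20 * 60
VIDEO_CONFERENCING_APP_PATTERN = "%Video Meeting%"  # stand-in value


def like(pattern: str, text: str) -> bool:
    """Match a LIKE-style pattern where '%' means 'any characters'."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, text, flags=re.IGNORECASE | re.DOTALL) is not None


def is_active(window_title: str, idle_seconds: int) -> bool:
    """A tick is active when idle time is below the relevant threshold."""
    if like(VIDEO_CONFERENCING_APP_PATTERN, window_title):
        return idle_seconds < IDLE_TIME_VIDEO_CONFERENCING
    return idle_seconds < IDLE_TIME_GENERAL
```

With these values, ten idle minutes in a meeting window still count as active, while ten idle minutes in a browser do not.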

Now that I'm working from the office more and more, I added this post-pandemic feature where I also track which SSID I'm connected to. This way, I know exactly how long I've been at the office, and I count all of that as active time.

SSIDS_PATTERNS = ['%office_SSID%', '%old_office_SSID%']

I personally check if Slack is running, as a signal of whether I'm «working» or not. If I close Slack, I don't register active time. This is configurable. It has the side effect of making me more mindful about closing work-related programs when I'm actually not working.

ACTIVITY_FILTER_CMD = "ps aux | grep 'usr/lib/slac[k]/'"
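One plausible way to evaluate such a command is to run it through a shell and look at the exit status; that interpretation is an assumption here, and `filter_passes` is a name made up for this sketch. A pipeline's exit status is that of its last command, so the example above succeeds only when `grep` finds a match. The `slac[k]` bracket trick keeps `grep` from matching its own entry in the `ps` output.

```python
import subprocess


def filter_passes(cmd: str) -> bool:
    """Return True when the shell command exits with status 0."""
    result = subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


# Any shell command works as a filter; these two are the trivial cases
print(filter_passes("true"), filter_passes("false"))
```

Since the command string is executed by a shell, it should only ever come from your own config file.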

To create an incentive not to overspend time on activities that are not strictly work, I can add these to the list of IGNORE_PATTERNS. That also works well as an escape hatch if I suddenly find myself neck-deep in YouTube videos of Honda Monkeys.

IGNORE_PATTERNS = ["%WeeChat%", "%Reddit%", "%1,000 Miles in Baja on Honda Monkeys%"]

Bottom line

If not fully restored, my work-life balance has at least improved drastically. I've become way more mindful about where I spend my time. Also, as before, I could pull through a couple of days with some extra hours, but without knowing exactly how many, it was hard to justify taking a day off. That is now justified by a quick torwal stats, without even touching my conscience.


  1. I did! 

  2. I did! 

Cowshed configuration optimization simulation

cowsim

You know when you have a traditional barn full of cows, and the modern age forces you to make the choice whether to switch careers or to invest in a milking robot? That feeling. Making the jump to a milking robot system requires you to redesign your cowshed from a static stall-based model to a dynamic free-range model. In the former, the milking equipment moves from cow to cow, but in the latter, the cows need to move to the equipment.

So not everyone has faced this problem, but I kinda did, and I immediately thought it sounded like a fun project to simulate! Not that it would be of any help, but it would be fun!

DeLaval VMS Source: DeLaval Voluntary Milking System™

There are a handful of products on the market to solve this problem, but most of them require you to buy and install a gigantic milking robot in a fixed location in your barn and start pushing cows through it. News at eleven; cows are not smart. They don't understand that they need to be milked, much less how. Instead, what you do is redesign your barn to incentivize cows to move a lot, and when they pass – what we call – «the smart gate», they are diverted into the milking robot.

The cows actively want a couple of things: food (in a few different flavors) and sleep, and the system takes advantage of that. Most of all they want concentrate; it's like candy for them. You design the system so that you get the perfect balance of physical movement between these activities, and – if a cow has not been milked the last 12 hours or so – you divert it into the milking robot. It will be fed concentrate there as well, so the cow is happy either way.
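The smart gate rule boils down to a single comparison. The sketch below is purely illustrative: cowsim never got a milking robot, so `Cow`, `smart_gate` and the exact 12 hour cut-off are assumptions based on the description above.

```python
from dataclasses import dataclass

MILKING_INTERVAL_HOURS = 12  # "12 hours or so" in the text


@dataclass
class Cow:
    name: str
    hours_since_milking: float


def smart_gate(cow: Cow) -> str:
    """Decide where the gate sends a cow that walks through it."""
    if cow.hours_since_milking >= MILKING_INTERVAL_HOURS:
        return "milking robot"
    return "pass through"


print(smart_gate(Cow("Dagros", hours_since_milking=13.5)))
```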

I built this so it would be possible to run it as a genetic algorithm, where each cowshed configuration would equal a possible solution, and the production of milk (and maybe number of dead cows x_x) would act as the fitness function. I never got that far. Neither did I get to the point of adding the actual milking robot. But I had fun.
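For the curious, a genetic algorithm over cowshed configurations could look roughly like the sketch below. None of this exists in cowsim. The encoding of a configuration as a list of integers is hypothetical, and the fitness function is supplied by the caller; a real one would run the simulation and return the milk produced.

```python
import random
from typing import Callable, List

Config = List[int]  # hypothetical encoding, e.g. number of groups per agent type


def evolve(
    fitness: Callable[[Config], float],
    genes: int = 6,
    population_size: int = 20,
    generations: int = 30,
    seed: int = 0,
) -> Config:
    """Return the best configuration found after a number of generations."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 50) for _ in range(genes)] for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: the better half survives unchanged and become parents
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: pick each gene from one of the two parents
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: nudge one gene up or down, never below zero
            i = rng.randrange(genes)
            child[i] = max(0, child[i] + rng.choice([-1, 1]))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(config: Config) -> float:
    """Stand-in for 'milk produced': prefers every gene close to 10."""
    return -sum(abs(gene - 10) for gene in config)


best = evolve(toy_fitness)
```

Because the surviving half is carried over unchanged, the best configuration never gets worse from one generation to the next.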

So what are we looking at here? Here is some early testing in a poor screengrab.

cowsim

The cows are obvious. Then there are water trays, spots where grass is served, and concentrate feeders. Less interesting are the walls. More interesting are the one-way gates and the cow beds(!) - the cow's preferred chilling area.

# Create a random cowshed configuration
place_random_agents("wall", groups=50, max_agents=10, min_agents=3)
place_random_agents("onewaygate", groups=25, max_agents=1)
place_random_agents("grass", groups=5, max_agents=5)
place_random_agents("water", groups=5, max_agents=3)
place_random_agents("feeder", groups=5, max_agents=2)
place_random_agents("bed", groups=5, max_agents=20, min_agents=20, cluster=True)

# Place some cows
for cow in self.cows:
    x, y = None, None
    while True:
        x, y = self.random_pos()
        if self.barn.is_cell_empty((x, y)):
            break
    self.barn.place_agent(cow, (x, y))

# Run it!
for s in range(self.config["steps"]):
    self.step = s
    for cow in self.cows:
        if cow.alive:
            cow.step()

# For each step, have the cow figure out what it wants the most and have it
# move towards that objective
def step(self):
    self._update_state()
    new_objective = self._calc_objective()
    if new_objective != self.current_objective:
        self._update_target(new_objective)
    # move toward current target
    self.move()
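The two helpers behind `step()` are not shown above, so here is a guess at their core idea in standalone form. `calc_objective` and `move_toward` are hypothetical stand-ins for `_calc_objective` and `move`, the need values are made up, and walls and one-way gates are ignored entirely.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]


def calc_objective(needs: Dict[str, float]) -> str:
    """Pick the agent type that matches the cow's strongest need."""
    return max(needs, key=needs.get)


def _toward(current: int, goal: int) -> int:
    """Move one unit along a single axis, or stay put when already there."""
    if goal > current:
        return current + 1
    if goal < current:
        return current - 1
    return current


def move_toward(pos: Position, target: Position) -> Position:
    """Take one grid step toward the target cell (no obstacle handling)."""
    return _toward(pos[0], target[0]), _toward(pos[1], target[1])


needs = {"water": 0.2, "grass": 0.5, "feeder": 0.9, "bed": 0.1}
objective = calc_objective(needs)
next_pos = move_toward((0, 0), (3, -2))
```

In the real simulation the movement has to route around walls and respect one-way gates, which is where most of the complexity hides.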

Using the settings above as the initial random board, you get solutions like this. cowsim

Without the heatmap, here is a simulation at full speed. cowsim

Some bonus images where things are running a little bit slower. cowsim cowsim

Horrendous code quality, but it is available at https://github.com/torvald/cowsim.

Anki-like note reminders

I use my simple note system to scribble down thoughts, project descriptions, reminders from meetings and the usual note kinda stuff.

Despite their intrinsic value, most notes get written down and never revisited. Here is a short snippet that, based on the notes' dated titles, emails me each note after 2, 7, 14 and 30 days. The repetitions are carefully chosen spaced intervals after the Anki method, optimized for making stuff easier to remember.

#!/bin/bash

notes_dir="$HOME/sync/notes/"
interval="2 7 14 30"

email="note-reminder@torvald.no"

for i in $interval; do
    # Notes are matched on the date in their file name
    date=$(date --date="$i days ago" +"%Y-%m-%d")
    while read -r file; do
        # Drop the notes directory prefix from the subject line
        subject="$i days since ${file#"$notes_dir"}"
        mail -r "Note reminder <$email>" -s "$subject" "$email" < "$file"
    done < <(find "$notes_dir" -type f | grep "$date")
done
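The date matching is the only real logic in the script, and it is easy to try out without sending any mail. Below is a rough Python equivalent of just that part; `due_notes` and the example file names are made up for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple

INTERVALS = (2, 7, 14, 30)  # days, same as in the script


def due_notes(paths: Iterable[str], today: date) -> List[Tuple[int, str]]:
    """Return (days_ago, path) for every note whose path holds a due date."""
    paths = list(paths)
    due = []
    for days in INTERVALS:
        stamp = (today - timedelta(days=days)).isoformat()  # YYYY-MM-DD
        due.extend((days, path) for path in paths if stamp in path)
    return due


notes = ["2021-10-13-standup.md", "2021-10-08-idea.md", "2021-10-14-misc.md"]
reminders = due_notes(notes, today=date(2021, 10, 15))
```

Here the note from two days ago and the one from seven days ago are due, while yesterday's note is not.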

I simply have this snippet in my crontab.

screenshot A test email.

A hacker's replacement for Dropbox (and that old NAS in the basement with dying hard drives)

TL;DR My answer here is rclone (mount) + insert-favorite-cloud-storage (I use Jottacloud) + Seafile on top. Multiple levels of accessibility, no more Dropbox and worrying about hard drives. Scroll to the bottom for a best effort Draw.io diagram. (-:

Perhaps you can relate; I had a Dropbox account on a free tier, always 90% full — and I had images, videos and random stuff lying around on multiple laptops and servers, wherever stuff made sense in the moment. Everything in my Dropbox was encrypted with encfs, but apart from that, the situation was not ideal.

I decided to make a change for the better when I bought myself a new mobile phone and found out that Dropbox had introduced a max limit on the number of connected devices. So short story long;

I needed to

  • buy or find something to replace Dropbox, meaning; have a folder on each personal computer and server that is
    • synced across devices,
    • actual files on the file systems and
    • encrypted at third party.
  • find some kind of archive system to dump archive-grade stuff without making it inaccessible.
    • I really would like to not worry about physical hard drives.
    • Bonus if I can mount the archive as a local file system.

I was discussing Plex with a colleague; he had all of his media content encrypted in his Google Drive account, mounted – like sshfs – to his server with rclone! Interesting!

I am really not an advocate for Google's products, but rclone supports a range of backends and one of them happens to be Jottacloud! There is nothing unique about Jottacloud, but they are Norwegian and it feels good to support Norwegian cloud providers when I can! The $7.5 per month plan gives me «unlimited storage» 1, and with rclone I will have a layer of indirection in case I ever want to swap backend.

This fixed my archive problem, but did not work well as a Dropbox replacement. Files that are mounted (very) remotely can sometimes take multiple seconds to open and write. And every now and then I want to access my files while I am offline or on a low bandwidth connection.

/r/selfhosted suggested Seafile! I have used it briefly in the past, at a former workplace, but it looks a lot more mature now.

So my life now looks like this! ↓

diagram

In short; I now have two folders on every client (desktops and servers) I use: sync, a Seafile folder with offline content, and archive, a slower, albeit unlimited, file system folder. With this I can edit and use my notes and daily files via the former, and run borg backup 2, image/video gallery etc. in the latter.

torvald@gauda ~/sync $ ls
bin  certs  docs  dotfiles  gpg  scrape  mailattachments  notes  projects  tickets  work  www
torvald@gauda ~/jotta/archive $ ls
backups  gitrepos  lekvam_cam  pictures  videos  www

Run rclone config to add your preferred cloud storage resources. Add additional layers with encryption. I divided the two use cases, but one remote folder with two subfolders is also a possibility. 3

torvald@gauda ~ $ rclone listremotes --long
jotta-archive-encrypted: crypt
jotta-archive-raw:       jottacloud
jotta-sync-encrypted:    crypt
jotta-sync-raw:          jottacloud

Further, rclone lets you mount external resources to your file system with rclone mount. Make it persistent with systemd.

[Unit]
Description=Mount jotta crypt archive
Wants=network-online.target
After=network-online.target

[Service]
Type=forking
User=torvald
Group=torvald
WorkingDirectory=/home/torvald
ExecStart=/usr/bin/rclone mount --allow-other --daemon jotta-archive-encrypted: /home/torvald/jotta/archive

[Install]
WantedBy=multi-user.target
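If the mount is not up, anything that writes to the archive folder quietly ends up on the local disk instead. A small guard like the one below can catch that before a backup job starts. It is only a sketch: `is_mounted` is a made-up helper, and the path is the mount point used in the unit above.

```python
import os

ARCHIVE_MOUNT = os.path.expanduser("~/jotta/archive")  # mount point from the unit


def is_mounted(path: str) -> bool:
    """True when path is a mount point, i.e. a file system is mounted there."""
    return os.path.ismount(path)


if not is_mounted(ARCHIVE_MOUNT):
    print(f"{ARCHIVE_MOUNT} is not mounted, skipping backup")
```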

Seafile comes with well-written guides for both server setup and the desktop and CLI clients. Seafile's data directory is simply pointed to my rclone mounted sync folder. The default SQLite did not work well with this approach, so I set up Seafile with MySQL. No PostgreSQL support is a bummer.

Bonus: The Seafile Android app also allows for auto photo upload, archiving my photos to a «selfhosted» encrypted location automatically.


  1. They limit bandwidth when you exceed 5TB, but I guess I'm good for a couple of years as is. 

  2. A friend has already blogged about this 

  3. You can instead encrypt your content with Seafile if you want, but if you want to add unencrypted Seafile libraries as well (e.g. for sharing), rclone encryption is a nice default anyhow.