Categories
Code Linux

Query installed packages

Sometimes I just want to quickly find an installed package and show its version. Maybe I don’t even care if it’s a deb or a snap package. Maybe I’d just like know if it is installed at all. And I’d like to search for it using words.

Here is a shell function qi (query installed) that can be added to ~/.bashrc or similar, which will list all installed deb and snap packages with version in a friendly format. You can easily narrow down the results by supplying words to filter. It mostly uses awk(1) to accomplish the task.

function qi() {
  (
    dpkg-query -W -f '${db:Status-Abbrev} ${Package} ${Version}\n'
    snap list|sed -e '/^Name/d' -e 's/^/snap /'
  ) | awk -v argline="$*" \
      'BEGIN { split(argline, fwords, / +/) }
       function include(package) {
         for (i in fwords) {
           if (index(package, tolower(fwords[i])) == 0) {
             return 0
           }
         }
         return 1
       }
       /^ii/ && include($2 $3) {
         printf("%-30s %s\n", $2, $3)
       }
       /^snap/ && include($2 $3 "[snap]") {
         printf("%-30s %-20s [snap]\n", $2, $3)
       }
      '
}

The function consists of two main parts:

  1. A subshell which lists all installed deb and snap packages in a particular format. Its output stream is piped directly to an awk program.
  2. The awk program which does the filtering (highlighted in dark blue).

List all installed packages

$ qi
accountsservice                0.6.55-0ubuntu12~20.04.5
acl                            2.2.53-6
acpi-support                   0.143
acpid                          1:2.0.32-1ubuntu1
adduser                        3.118ubuntu2
adwaita-icon-theme             3.36.1-2ubuntu0.20.04.2
aisleriot                      1:3.22.9-1
alsa-base                      1.0.25+dfsg-0ubuntu5
alsa-topology-conf             1.2.2-1
alsa-ucm-conf                  1.2.2-1ubuntu0.13
[... 1945 more lines not shown here]

List packages matching words

$ qi chrom
chrome-gnome-shell             10.1-5
libchromaprint1                1.4.3-3build1
chromium                       107.0.5304.121       [snap]

Snap packages can be distinguished from debs by the [snap] marker in the third column.

$ qi image linux
linux-image-5.14.0-1054-oem    5.14.0-1054.61
linux-image-5.15.0-53-generic  5.15.0-53.59~20.04.1
linux-image-generic-hwe-20.04  5.15.0.53.59~20.04.21
linux-image-oem-20.04d         5.14.0.1054.52

Use multiple words to narrow down, and the ordering does not matter. All supplied words must be substrings of the package name and version for a match to occur.

More examples

List only snap packages:

$ qi \[snap\]
bare                           1.0                  [snap]
chromium                       107.0.5304.121       [snap]
core                           16-2.57.4            [snap]
core18                         20221103             [snap]
core20                         20221027             [snap]
cups                           2.4.2-4              [snap]
docker                         20.10.17             [snap]
[...]

Search within -dev packages:

$ qi -dev ssl
libssl-dev                     1.1.1f-1ubuntu2.16
$ qi gtk dev
libgtk-3-dev                   3.24.20-0ubuntu1.1

If you want grep filtering, just use grep:

$ qi|grep -E 'ssl|tls'
libcurl3-gnutls                7.68.0-1ubuntu2.14
libgnutls30                    3.6.13-2ubuntu1.7
libio-socket-ssl-perl          2.067-1
libneon27-gnutls               0.30.2-4
libnet-smtp-ssl-perl           1.04-1
libnet-ssleay-perl             1.88-2ubuntu1
libssl-dev                     1.1.1f-1ubuntu2.16
libssl1.1                      1.1.1f-1ubuntu2.16
openssl                        1.1.1f-1ubuntu2.16
[...]

Other package formats

It should be a simple matter to add support for other package formats. Just change or add to the commands which supplies the package lists in the first sub shell, keeping in mind the common output format. Then possibly add prefix patterns for the awk program to recognize and match on lines from other package managers.

Categories
Code

Navigating Maven projects on the command line

Or how to avoid..

~/dev/myproject/src/main/java/com/example/foo $ cd ..
~/dev/myproject/src/main/java/com/example $ cd ..
~/dev/myproject/src/main/java/com $ cd ..
~/dev/myproject/src/main/java $ cd ../../..
~/dev/myproject $

This following Bash shell function, which is called pom.. (yes with two dots in the name), will allow you to navigate up to the closest ancestor directory containing a pom.xml file (closest module) with one command. Put it in your ~/.bashrc:

function pom..() {
    local start_dir="$(pwd)" prev_dir= rel_dir="$1"
    while [ "$prev_dir" != "$(pwd)" ]; do
        prev_dir="$(pwd)"
        cd ..
        if [ -f pom.xml ]; then
            if [ -d "$rel_dir" ]; then
                cd "$rel_dir"
            elif [ "$rel_dir" ]; then
                echo >&2 "Directory not found relative to pom.xml: $(pwd)/$rel_dir"
                cd "$start_dir"
                return 1
            fi
            pwd|sed "s#^$HOME#~#"
            return 0
        fi
    done
    echo >&2 "No pom.xml found in ancestor directories."
    cd "$start_dir"
    return 1
}

So you don’t have to waste any more time typing “cd ..” multiple times when navigating upwards to a Maven module root on the command line. Just type pom.. once.

~/dev/myproject/src/main/java/com/example/foo $ pom..
~/dev/myproject
~/dev/myproject $ 

It also accepts an optional argument, which is a desired directory relative to the nearest POM:

~/dev/myproject/src/main/java/com/example/foo $ pom.. target
~/dev/myproject/target
~/dev/myproject/target $ pom.. src/test
~/dev/myproject/src/test
~/dev/myproject/src/test $

The strategy will work for any style of hierarchically organized source code project where a typical marker file or directory exists at certain source code roots. Just be creative and modify the code.

Categories
Code Linux

Locking critical sections in shell scripts

A critical section is some piece of code which, due to its nature and effects, should be executed by at most one thread or process at a time. If such code is executed concurrently, the results often become undefined and arbitrary. Atomically locking a shared resource is a common pattern to synchronize execution of critical sections and ensure mutually exclusive access.

Shell scripts mostly deal in processes and files, and there are several common scenarios where code is actually a critical section. If such code is run in several processes concurrently, it could introduce race conditions and arbitrary results. Consider a script that starts a background process if it is not already running – a typical pattern to start a singleton daemon process. Such code is a critical section, because you can end up with two running daemons if the code runs concurrently (and the daemon does not check for other instances of itself). Another good example is multiple scripts writing to a shared file, or even a shared directory structure.

There are a few strategies to implement locking in shell scripts, some are better than others. In this post, I will focus on one of the most robust ways: using flock(1). This nice tool gives you access to kernel level file locking from you shell. It has some clear advantages over traditional existence based file locking:

  1. It is truly atomic.
  2. The kernel manages the locks and releases them automatically when lock owning processes die. So no more stale lock files to clean up.
  3. You can block and wait for lock, indefinitely or with a timeout, and instantly get it when another process frees it. (Avoid lock polling loops with sleeping.)

It is important to understand that file locks are tied to both a file on the file system and running processes with open file descriptors to it. Even if a file used for locking exists on the file system, it does not mean the lock is taken ! The file system acts as a namespace of shared resources on which we can attach locks. Also note that the locks are advisory only – if a process does not care to check for locks, it will not participate in any synchronization and can do whatever it pleases.

Shell script with locking functions

We will look at a script which needs to protect a critical section with locking. The locking shall be done on a common file job.lock, which means any process with access to that file can obtain or check for a lock. To code along, you can copy the script to your own file and run the examples.

job.sh

#!/bin/bash

lock_acquire() {
    # Open a file descriptor to lock file
    exec {LOCKFD}>job.lock || return 1

    # Block until an exclusive lock can be obtained on the file descriptor
    flock -x $LOCKFD
}

lock_release() {
    test "$LOCKFD" || return 1
    
    # Close lock file descriptor, thereby releasing exclusive lock
    exec {LOCKFD}>&- && unset LOCKFD
}

lock_acquire || { echo >&2 "Error: failed to acquire lock"; exit 1; }

# --- Begin critical section ---

if [ -f job.dat ]; then
    value=$(<job.dat)
else
    value=0
fi
value=$((value + 1))

echo $value >job.dat

# --- End critical section ---

lock_release

The lock_acquire function uses flock -x N to obtain an exlusive lock on a file descriptor N. Since the file descriptor is opened by the script process itself, it will be the owner of the lock after flock exits. Flock is able to lock the descriptor because it is inherited from the shell process that started it. The critical section reads a number from a file if it exists, increments it by one, and writes the updated number back to the file.

Testing

First we’ll run the job script once:

$ bash job.sh 
$ ls
job.dat  job.lock  job.sh
$ cat job.dat 
1

A job.dat file is produced with a value of 1, which is entirely expected and not very interesting.

Next we’ll start 100 job processes asynchronously as fast as possible in the background, which means that many of them will run concurrently. We do this two times:

$ rm job.dat 
$ (for i in {1..100}; do bash job.sh & done; wait)
$ cat job.dat 
100
$ (for i in {1..100}; do bash job.sh & done; wait)
$ cat job.dat 
200

The for loop is started in a sub shell to avoid job control messages. The data has been incremented exactly 100 times after the first run, and incremented again by 100 after the second. If you look at the code in the critical section, it both reads, updates and then writes to the shared file, and doing this without locking would not work consistently.

Actually, let us try that, by commenting out the lock_acquire call in the script:

[...]

#lock_acquire || { echo >&2 "Error: failed to acquire lock"; exit 1; }

# --- Begin critical section ---

Then we run the test again:

$ rm job.dat
$ (for i in {1..100}; do bash job.sh & done; wait)
$ cat job.dat
3
$ (for i in {1..100}; do bash job.sh & done; wait)
$ cat job.dat
14

Which ends with a final result of 14, clearly incorrect and arbitrary. The results will vary with each run and depend on things like the speed of your computer.

Releasing the lock ?

In this case the script actually does not need to release the lock right before it exits, because the kernel will automatically do that when the process exits anyway. We will try it by re-enabling the locking call, and commenting out the lock_release call:

[...]

lock_acquire || { echo >&2 "Error: failed to acquire lock"; exit 1; }

# --- Begin critical section ---
[...]
# --- End critical section ---

#lock_release

And run the test:

$ rm job.dat
$ (for i in {1..100}; do bash job.sh & done; wait)
$ (for i in {1..100}; do bash job.sh & done; wait)
$ cat job.dat
200

It still works fine.

Starting a daemon process from your shell init scripts

A common use case is the need to start a single daemon process from your shell init scripts, unless one is already running. So you only ever want one such thing running. Consider the following:

if ! ps -ef|grep some-daemon|grep -qv grep; then
    some-daemon & pid=$!
    echo Started some-daemon with pid $pid
fi

This code is racy unless it is protected by locking. If you were to start two terminals more or less simultaneously, both executing your shell init scripts, you could possibly end up with two running daemon processes.

To protect this with flock, you could do the following:

if exec {bashrc_fd}<~/.bashrc && flock -nx $bashrc_fd; then
    # --- Begin critical section ---
    if ! ps -ef|grep some-daemon|grep -qv grep; then
        some-daemon & pid=$!
        echo Started some-daemon with pid $pid
    fi
    # --- End critical section ---

    flock -u $bashrc_fd && exec {bashrc_fd}>&-
fi

Here we open a read-only file descriptor to ~/.bashrc and then try to grab an exclusive lock on it, but we do it non-blocking with option -n. If some other bash process is already executing that part of the init file, flock will not succeed and immediately exit with a non-zero code, so the block is skipped. It has the effect that only one bash process will execute the code, and others running at the same time will skip it.

You may notice that we explicitly release the lock using flock -u $bashrc_fd after the critical section. Normally it is enough to close the file descriptor used for locking, but when starting child processes, those may inherit and keep such descriptors open. So the parent process closing its copy of the descriptor may not be enough actually release lock. Therefore we do it explicitly.

Closing notes

The manual page for the flock command lists a few good examples of how you can use it your scripts. However, none of those examples show how you can make the current shell process own and control the locks without using sub shells or flock invocations to wrap critical sections/commands.

The manual page for the flock(2) system call is a good read if you are interested in more details about how it works.

Read more about handling file descriptors with Bash in this part of the bash(1) manual.