Solution to os-prober not finding other operating systems (Windows & Linux)

Occasionally os-prober fails to detect operating systems installed on other disks or partitions. While this can stem from complex installation or boot issues, there is a workaround that usually remedies it. In this article, however, I'd like to go further and share the information that solves os-prober issues once and for all, along with a fundamental understanding of how this extremely popular package functions.

We will look at different approaches for when os-prober fails: detecting other UNIX-like installations (including Linux and macOS) and other Windows installations (which use the NTFS file system).

Let’s consider how os-prober functions first.

How does os-prober work?

os-prober is an impressive tool that detects operating systems and adds them to GRUB's boot list. But how does os-prober actually work? The package is often blamed for things it doesn't do, and it gets a bad reputation for occasionally failing to find OS installations. Understanding how os-prober works will give you more confidence when troubleshooting boot issues with Linux or Windows.

I remember the first time I used os-prober: when it detected a bunch of operating systems on my disk, I was amazed. It felt like magic, because for some reason I thought it did so without mounting any disks or partitions.

The reality is that it's not magic. os-prober attempts to mount each disk/partition read-only in order to scan it for operating systems, and if Linux, Windows or macOS installations are found, it creates an additional record in the boot configuration. In doing so, os-prober exclusively uses grub-mount (this part is critical knowledge) to mount partitions and disks.

Let's look at os-prober's workflow. First of all, on a Linux system a disk can't be read before it is mounted. Until it is mounted, a disk (or its partitions) is just a board attached to the computer; no matter how advanced the software is, the disk must be mounted before it can be read.

  • So first, os-prober temporarily mounts all the available disks under /var/lib/os-prober
  • Then it strategically checks telltale OS directories on them, such as /etc

That’s what happens when you run the following command:

grub-mkconfig -o /boot/grub/grub.cfg

And the operating systems which are found will be listed in the Grub boot menu.
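You can also run os-prober directly to inspect the raw detection output before regenerating the GRUB config. The exact lines vary per system, but the output is one entry per detected OS in a device:name:label:boot-type format, roughly like the comment below:

sudo os-prober
# e.g. /dev/sda3:Windows 10:Windows:chain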

To have os-prober activated along with the grub-mkconfig command, you need to uncomment the following line in the /etc/default/grub file (remove the leading #):

#GRUB_DISABLE_OS_PROBER=false

It ships commented out on many distributions, which means os-prober is disabled by default.

Introducing grub-mount

os-prober utilizes grub-mount to ensure disks (especially NTFS disks) are mounted read-only and never written to. This prevents potential corruption of the file system and the partition tables, which can be fragile in some cases.

Here is the grub-mount command and its options as a list:

grub-mount [OPTION...] IMAGE1 [IMAGE2 ...] MOUNTPOINT
  • -C, --crypto: Mount crypto devices.
  • -d, --debug=STRING: Set a debug environment variable.
  • -K, --zfs-key=FILE|prompt: Load a ZFS crypto key.
  • -r, --root=DEVICE_NAME: Set the root device.
  • -v, --verbose: Print verbose messages.
  • -?, --help: Display the help list.
  • --usage: Display a short usage message.
  • -V, --version: Print the program version.

Usually you don’t have to worry about how partitions or disks are mounted as it all takes place in the background while os-prober detects and adds operating systems. But as you probably know, it often fails to do so as well.

It’s crucial to find out if grub-mount is actually functioning properly to ensure os-prober can mount and detect operating systems under the hood.

Quick Workaround 1: Forcing os-prober to detect other Linux Operating Systems

There is a quick workaround that often helps os-prober detect other Linux installations: manually mount the root partitions of the other Linux systems before running os-prober or grub-mkconfig (on Debian-based systems the wrapper command is called update-grub).

Typically you would do something like this:

(Replace /dev/nvme0n1p5 with the partition or disk holding the other operating system's root.)

!! Important: Don't casually mount Windows installations read-write on Linux systems, as this can corrupt the fragile NTFS file system and your Windows installation.

mount /dev/nvme0n1p5 /mnt

grub-mkconfig -o /boot/grub/grub.cfg
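Once GRUB has picked up the other installation, you can release the manually mounted partition (assuming the /mnt mount point used above):

sudo umount /mnt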

Quick Workaround 2: Forcing os-prober to detect Windows on NTFS file system

Windows typically runs on the NTFS file system, which is quite outdated in my opinion, but that's another discussion.

You can run the commands below; remember to replace the example partition with your exact one, such as /dev/nvme0n1pX or /dev/sdaX.

Note: You will need the ntfs-3g driver installed.

"Carefully" Mounting Windows Partitions or Disks in Linux File Systems

Below you can find the exact commands needed to safely mount a Windows partition from inside a Linux operating system with specific options that are necessary to ensure protecting the Windows OS’ integrity.

Here is what those ntfs-3g and mount options mean (they ensure a secure, read-only Windows partition mount, but it's quite interesting to see each option's function):

  • -t ntfs-3g: This option specifies the file system type to be mounted, which is ntfs-3g in this case. ntfs-3g is a third-party NTFS driver that provides read and write support for NTFS partitions on Linux systems.
  • -o ro,noexec,nodev,nosuid: These options specify various mount options for the NTFS file system:
  • ro: Mounts the file system in read-only mode, ensuring that no modifications can be made to the files on the NTFS partition.
  • noexec: Prevents the execution of binaries or programs from the mounted NTFS partition, enhancing security by disallowing the execution of potentially malicious code.
  • nodev: Disables the interpretation of special device files on the NTFS partition, ensuring that no special device files can be created or accessed.
  • nosuid: Prevents the execution of setuid programs from the mounted NTFS partition, ensuring that no programs can be executed with elevated privileges.

/dev/sdX#: This is the device file representing the specific NTFS partition you want to mount. You should replace sdX# with the actual device identifier and partition number of the NTFS partition you wish to mount. For example, /dev/sda1 represents the first partition on the first hard disk.

/mnt: This is the mount point directory where you want to mount the NTFS partition. In this example, it is set to /mnt. You can choose a different directory if desired.
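If you are unsure which device holds the Windows partition, you can first list block devices with their file system types; this is a generic check, not specific to os-prober. Once you know the device, use the mount command below.

lsblk -f
# or, filtering for NTFS:
sudo blkid | grep -i ntfs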

sudo mount -t ntfs-3g -o ro,noexec,nodev,nosuid /dev/nvme0n1pX /mnt

grub-mkconfig -o /boot/grub/grub.cfg

Fixing the Root Causes of os-prober Problems

In most cases os-prober fails to detect other OS installations because one of its dependencies isn’t functioning properly.

In the majority of cases, fixing os-prober means fixing grub-mount, one of the major background actors os-prober relies on.

Try to isolate the issue by manually using grub-mount to mount a partition.

sudo grub-mount /dev/nvme0n1p2 /mnt

You might get an error like:

Error: grub-mount: error while loading shared libraries: libfuse3.so.3: cannot open shared object file: No such file or directory

A missing libfuse3.so.3 means exactly that: the library isn't on your system. To confirm, you can run a find command over the root directory.

sudo find / -name libfuse3.so.3 2> /dev/null

The last part silences irrelevant errors, such as ones from accessing /proc. Most likely nothing will be found, because libfuse3 doesn't exist on your system, and that in turn is most likely because fuse3 isn't installed.
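Another generic way to spot missing shared libraries for any binary, grub-mount included, is ldd:

ldd $(which grub-mount) | grep "not found"
# prints nothing when all required libraries are present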

If we look at the dependencies of grub:

Name : grub
Version : 2:2.06.r591.g6425c12cd-1
Description : GNU GRand Unified Bootloader (2)
Architecture : x86_64
URL : https://www.gnu.org/software/grub/
Licenses : GPL3
Groups : None
Provides : grub-common grub-bios grub-emu grub-efi-x86_64
Depends On : sh xz gettext device-mapper
Optional Deps : freetype2: For grub-mkfont usage [installed]
fuse3: For grub-mount usage
dosfstools: For grub-mkrescue FAT FS and EFI support [installed]
lzop: For grub-mkrescue LZO support
efibootmgr: For grub-install EFI support [installed]
libisoburn: Provides xorriso for generating grub rescue iso using grub-mkrescue
os-prober: To detect other OSes when generating grub.cfg in BIOS systems [installed]
mtools: For grub-mkrescue FAT FS support

You can see that fuse3 is listed as an optional dependency of grub, needed for grub-mount. Even if you have fuse or fuse2, you will want to install fuse3 so that grub-mount, and therefore os-prober, can work.

Depending on your Linux distribution you can use one of the following commands:

sudo pacman -S fuse3
sudo apt-get install fuse3
sudo yum install fuse3
sudo zypper install fuse3
sudo dnf install fuse3

You can also compile fuse3 yourself. See Installing AUR Packages.

Once you clear the grub-mount-related issues you will see that it functions properly. You can list the mount point's contents with sudo ls to confirm; the mount might not appear in lsblk output, since grub-mount uses FUSE rather than a regular block-device mount. You will also see that os-prober now works as it should.

sudo grub-mount /dev/nvme0n1p2 /mnt
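To confirm the mount actually worked, list the mount point's contents and then release it (same mount point as above):

sudo ls /mnt
sudo umount /mnt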

In my experience, most os-prober failures boil down to issues with executing the grub-mount command or with the dependencies and libraries involved, such as fuse3 or libfuse3.

Hopefully, fixing the root cause as above provides a sensible alternative to manually mounting those partitions, so that os-prober can pick them up and add them to GRUB on its own.

You can apply the suggested solutions above to detect and add boot records for Debian, Arch Linux, Fedora, Ubuntu, Deepin, Manjaro, CentOS, Linux Mint, EndeavourOS and many others, as well as Windows 10 and Windows 11. Please mount Windows partitions carefully, as explained above: writing to a Windows NTFS disk from a Linux system can corrupt your Windows installation.

Fixing GRUB on an Encrypted LUKS Disk

Deciding on Different Boot Solutions

Fixing GRUB on an encrypted disk can be similar to fixing an unencrypted Linux installation, or it can be quite different.

It's important to understand your exact situation first, and it mainly comes down to:

  • Mount points
  • Encryption

Boot Partition Mount Points

Normally, you will have a structure similar to this:

  • /boot: the Linux kernel and initramfs live here.
  • /boot/grub: grubenv and the GRUB configuration live here.
  • /boot/efi: the EFI bootloader lives here.

Encryption of Disc or Partitions

It’s common to encrypt only the root partition with cryptsetup and leave the boot partitions unencrypted.

This is useful especially if you sometimes use other bootloaders or dual-boot with Windows. It's slightly less secure, obviously, since your boot partition is unencrypted; but unless you are Ethan Hunt or Edward Snowden it should be fine.

Logic of Boot Partition Mounting Points

Understanding both sections is crucial: if you mount your boot partition only at /boot/efi, then grub-install will install the EFI bootloader there, but the rest will go into /boot, which is a directory on the encrypted root partition.

If, however, you mount the boot partition at /boot, then the GRUB installation as well as /boot/efi and the Linux kernel (plus initramfs) will all be on the unencrypted boot partition.

A mini-test to understand this: if /boot is the boot partition's mount point, then after unmounting the boot partition the root partition's local /boot directory should be completely empty. Otherwise, when you unmount the boot partition, only /boot/efi would be empty.
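You can check which layout you have without unmounting anything by asking findmnt (part of util-linux) which paths are real mount points; a quick sketch:

findmnt /boot
findmnt /boot/efi
# The path that reports a separate device is where your boot partition is mounted.
# No output for a path means it is just a directory on the root partition.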

Implications of it all

Hopefully it's clear that we have two scenarios: one where GRUB can freely read all boot files, including the kernel, versus one where GRUB can only read the EFI bootloader entry and then needs the encryption password to proceed.

The solutions differ: the former is less complicated and generally convenient, while the latter is more secure. Rest assured, in both cases you can save your operating system even if you completely wipe the boot partition.

Re-generating the vmlinuz Linux Kernel and initramfs Files

If you have also accidentally deleted your boot directory's contents, replacing those files is actually an easy fix.

First chroot into your root partition using a live USB.

On a Linux computer, crucial system files such as the initramfs, the vmlinuz-linux kernel, and the efi and grub folders typically reside inside the boot directory.

Although these are seriously important system files for a computer, they can actually be restored quite conveniently in most cases.

To regenerate the initramfs files for the installed kernel (on Arch-based systems), simply run this command:

mkinitcpio -P

This will recreate the initramfs files under the boot directory:

  • initramfs-linux-fallback.img
  • initramfs-linux.img

(The other files there, vmlinuz-linux and intel-ucode.img, come from the linux and intel-ucode packages respectively; reinstalling those packages restores them, as shown later.)

Solutions for a LUKS-encrypted Root Partition Only

Most tutorials suggest the following mount points when installing GRUB (root first, so that the /mnt/boot/efi mount point exists):

sudo mount /dev/sdXX /mnt
sudo mount /dev/sdXY /mnt/boot/efi

However, when you are dealing with a LUKS-encrypted root partition, you will want to mount as below:
(See below for more explanation on this.)

sudo mount /dev/sdXX /mnt
sudo mount /dev/sdXY /mnt/boot

This is mainly because GRUB and the Linux kernel are installed inside the /boot directory, and we don't want them to fall outside the unencrypted EFI partition. If they land on the root partition instead, they won't be accessible during early boot, because they will sit on the encrypted partition.

LUKS-encrypted systems are in most cases laid out like this:

  • sda
    • sda1 (Buffer for Legacy)
    • sda2 (EFI Partition)
    • sda3 (Luks Encrypted Root)

In 90%+ of cases users don't have to encrypt the boot partitions, as there won't be any personal data there (though there can still be security implications).

This means uncommenting #GRUB_ENABLE_CRYPTODISK=y is usually not necessary.

If you are encrypting the whole disk including the boot partitions, however, that is also possible; in that case you will need to uncomment the line. See the section below.

When grub configuration goes wrong on an encrypted setup users often get errors like below while attempting to reinstall grub:

grub-install: error: attempt to install to encrypted disk without cryptodisk enabled. Set `GRUB_ENABLE_CRYPTODISK=1' in file `/etc/default/grub'.

Don't uncomment this line:

#GRUB_ENABLE_CRYPTODISK=y

!! Leave this option commented unless your EFI/boot partition is itself encrypted (LUKS). !!

However, let's reiterate: unless your boot partition is encrypted as well as your root partition, you don't need to enable cryptodisk in the GRUB configuration. But why the error, then?

The answer is that in most cases it is a mounting issue. Let's clarify a couple of things to solve this issue once and for all.

Normally the root partition is mounted at /mnt while the EFI partition is mounted at /mnt/boot/efi.

On a LUKS-encrypted system, however, it makes more sense to mount the EFI partition (for example /dev/sda2) at /boot instead of /boot/efi.

This is because while you install GRUB to /boot/efi, the Linux kernel and initramfs remain in the /boot directory. And if /boot resides under the encrypted root mounted at /mnt, i.e. /mnt/boot, the kernel and initramfs files fall outside the unencrypted EFI partition and end up inside the LUKS partition. In this case grub-install will think you are trying to install GRUB to an encrypted disk and suggest enabling cryptodisk, although you don't actually need that if you mount properly.

Long story short:

  • mount the EFI partition at /mnt/boot
  • create /mnt/boot/efi
  • install GRUB into the efi directory with the --efi-directory=/boot/efi argument

## !! ATTENTION: change sdaX as needed; the values below are hypothetical !!
# Assuming sda3 is encrypted root partition and sda2 is unencrypted EFI partition.

sudo cryptsetup luksOpen /dev/sda3 cryptdisc
mount /dev/mapper/cryptdisc /mnt
mount /dev/sda2 /mnt/boot

Chroot into /mnt, then install and configure GRUB with the right mount points, as discussed above.

Make sure /etc/fstab has the correct mount points as well for after the reboot.

arch-chroot /mnt

grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=GRUB
grub-mkconfig -o /boot/grub/grub.cfg

Now, if you need to regenerate the vmlinuz-linux kernel and initramfs files, you can simply run the command below and they will be created inside the boot directory, which is unencrypted since the EFI partition is mounted there.

You might have to reinstall the linux kernel package if it was deleted.

sudo pacman -S linux

mkinitcpio uses the installed Linux kernel to create the initramfs, or initial RAM disk file system.

mkinitcpio -P

After running mkinitcpio, run grub-mkconfig as above once more; then you can unmount everything and reboot.
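A sketch of those closing steps, assuming the mount points used above:

exit              # leave the chroot
umount -R /mnt    # recursively unmount root and the boot partition
reboot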

The -P flag signifies all presets, which can be found inside the /etc/mkinitcpio.d/linux.preset file, as below:

# mkinitcpio preset file for the 'linux' package

#ALL_config="/etc/mkinitcpio.conf"
ALL_kver="/boot/vmlinuz-linux"
ALL_microcode=(/boot/*-ucode.img)

PRESETS=('default' 'fallback')

#default_config="/etc/mkinitcpio.conf"
default_image="/boot/initramfs-linux.img"
#default_uki="/efi/EFI/Linux/arch-linux.efi"
#default_options="--splash /usr/share/systemd/bootctl/splash-arch.bmp"

#fallback_config="/etc/mkinitcpio.conf"
fallback_image="/boot/initramfs-linux-fallback.img"
#fallback_uki="/efi/EFI/Linux/arch-linux-fallback.efi"
fallback_options="-S autodetect"

Finally, this will all result in a harmonious GRUB > kernel > initramfs > root pipeline, as explained in the boot-flow overview below.

Solutions for LUKS-encrypted Root & Boot Partitions

In this case the solutions won't be too different from the above. Edit the /etc/default/grub file to uncomment the line below. This allows GRUB to recognize and boot from an encrypted boot directory/partition. Obviously, you will get a password prompt twice:

  • once in GRUB, before boot
  • once before the system's root is mounted

#GRUB_ENABLE_CRYPTODISK=y

If you enable cryptodisk but are not actually using an encrypted boot partition, you might get an error like the one below:

Error: grub-install: error: attempt to install to encrypted disk without cryptodisk enabled. Set `GRUB_ENABLE_CRYPTODISK=1' in file `/etc/default/grub'.

The solution is either to not use GRUB_ENABLE_CRYPTODISK=y or to actually encrypt your boot partition. If it's a bit confusing, make sure to understand the mount points described at the beginning of this tutorial.

Uncommenting the line above only tells GRUB about the encrypted boot partition; it doesn't solve all your boot problems. If something is wrong with your GRUB installation or configuration, you will still see errors like this one:

Enter passphrase for (hd,gpt):

Error : Invalid passphrase.
Error : access denied.
Error : no such cryptodisk found.
Error : disk ‘cryptouuid/……’ not found.
Entering rescue mode…
grub rescue>

To solve this, make sure grub-install was executed properly for /boot/efi and that /boot contains valid Linux kernel and initramfs files along with the grub folder.

If you have somehow managed to corrupt or remove the encryption headers at the beginning of the encrypted disk, in most cases there is not much you can do to save that system. However, getting the boot errors and the invalid-passphrase message above doesn't necessarily mean this is the case; often just fixing the boot issues resolves all errors, unless the encryption data really is corrupted. You can read more about LUKS encryption below:

LUKS Encryption Headers

LUKS (Linux Unified Key Setup) is a disk encryption specification used in Linux-based systems. It provides a standard format for storing encrypted data on disk, allowing for secure storage of sensitive information.

In the context of LUKS, the “LUKS encryption header” refers to a specific data structure that contains important information about the encrypted disk or partition. This header is located at the beginning of the encrypted device and holds essential metadata needed to access and decrypt the data.

The LUKS encryption header includes the following information:

Key Slots: The header contains several key slots, which store encryption keys used to decrypt the data. Each key slot can hold a passphrase or a key file that grants access to the encrypted disk. Multiple key slots allow for the use of different passphrases or multiple users with distinct encryption keys.

Cipher and Encryption Parameters: The header stores information about the encryption algorithm, key size, and other parameters used for the encryption process. These parameters determine the strength of the encryption and the algorithm used to protect the data.

UUID and Label: The LUKS header may also include a Universally Unique Identifier (UUID) and a user-defined label. The UUID uniquely identifies the encrypted device, while the label provides a human-readable identifier for easier identification and management.

Anti-Forensic Information: LUKS supports anti-forensic features to mitigate attacks aimed at recovering remnants of key material. Key material in the header is stored using anti-forensic information splitting, which inflates and scatters it across the disk so that old or revoked keys are harder to recover.

The LUKS encryption header is crucial for the proper functioning of the encrypted disk or partition. It contains the necessary information for the system to validate and access the encrypted data. During the boot process or when mounting an encrypted device, the system reads and verifies the header, prompts for a passphrase or key, and uses the provided information to decrypt the data.

It’s important to ensure the security and integrity of the LUKS encryption header. Any damage or corruption to the header could lead to the loss of data or difficulties in accessing the encrypted device. Therefore, it is recommended to keep backup copies of the LUKS header and store them securely to ensure the ability to recover the encrypted data if needed.
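Since the paragraph above recommends keeping backups of the LUKS header, here is how you can create and inspect one with cryptsetup (assuming /dev/sda3 is the LUKS partition, as in the earlier example):

sudo cryptsetup luksHeaderBackup /dev/sda3 --header-backup-file luks-header.img
sudo cryptsetup luksDump /dev/sda3   # shows key slots and cipher parameters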

Flow of a Boot Process in Linux Systems

GRUB, or the GRand Unified Bootloader, is a commonly used bootloader in Linux systems. Its main function is to load and start the operating system. While the Linux kernel and initramfs files are essential components of the operating system, they need a bootloader like GRUB to be loaded into memory and executed.

Here’s a brief overview of how GRUB functions when Linux is installed on a machine:

Powering on the machine: When you power on a computer, the BIOS (Basic Input/Output System) or UEFI (Unified Extensible Firmware Interface) firmware initiates the boot process.

GRUB installation: GRUB is typically installed in the Master Boot Record (MBR) or the EFI system partition, depending on the boot mode (legacy or UEFI).

GRUB initialization: Once the firmware locates and loads GRUB, GRUB initializes itself and presents the user with a boot menu (if configured).

Kernel and initramfs loading: From the boot menu, the user selects the desired Linux kernel and initramfs. GRUB then reads the necessary files from the configured boot partition.

Passing control to the kernel: GRUB transfers control to the loaded Linux kernel by passing the necessary parameters and memory location details.

Kernel execution: The Linux kernel takes control and starts initializing the hardware, loading drivers, and setting up the essential components of the operating system.

Initramfs usage: The initramfs (initial RAM file system) is a temporary root file system that the kernel mounts early during the boot process. It contains essential drivers and utilities required to mount the actual root file system. The kernel uses initramfs to load the necessary modules and prepare the system for the handover to the real root file system.

How to Properly Serve Python Apps with PM2 and Gunicorn on any Port and any Server

Green Unicorn (Gunicorn) for Python Flask Apps

Normally you can serve a Python Flask app using Gunicorn like this:

gunicorn -b 0.0.0.0:8080 app:app

This binds all network interfaces (0.0.0.0) on port 8080 to the app object of the app.py program. The first app is the module (file name without the .py extension) and the second is the application object inside the Python code.

You may also need to open the port in the firewall (a firewalld example):

sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --reload

Here is a common directory structure of a basic Flask app (a typical layout; templates and static are Flask's conventional folder names):
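- project_directory
  - app.py
  - requirements.txt
  - templates
    - index.html
  - static
    - css
    - js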

Using Gunicorn with Pm2

With pm2, gunicorn usage is slightly more complicated; the whole gunicorn command is passed as a quoted string:

pm2 start "gunicorn -b 0.0.0.0:8080 app:app"

Or give the app a name for monitoring purposes, like below:

pm2 start "gunicorn -b 0.0.0.0:8080 app:app" --name PythonApp

The difference between using 0.0.0.0 and localhost as the bind address in the Gunicorn command can affect the accessibility of the server.

When you use gunicorn -b 0.0.0.0:8000 app:app, Gunicorn binds to all network interfaces, allowing connections from any IP address, including external IP addresses. This makes your server accessible from both the local machine and other machines on the network.

However, when you use gunicorn -b localhost:8000 app:app, Gunicorn only binds to the loopback interface, also known as 127.0.0.1. This means that the server is only accessible from the local machine itself. Connections from other machines on the network or external IP addresses will be blocked.

Add Ingress Rule to allow Inbound Traffic

You should configure the server to allow inbound traffic on the specific port you'd like to use, such as 8000, 8080, 80, 81, 3000, 5000, etc.

Port 80 is the universally accepted HTTP port and browsers automatically look for port 80 when a root domain URL or IP is accessed.

You need to open the port for listening under VCN (Virtual Cloud Networks) > Subnet > Default Security List for vcn-xyz > Add Ingress Rules. (This example uses Oracle Cloud's console; other clouds have equivalent security-group settings.)

Listing Port Listening Activity

You can use lsof to show a list of open files, filter it to internet files only, and force it to show ports instead of service names:

lsof -i -P -n

-i: Filters the output to show only Internet-related open files (network connections).
-P: Prevents the conversion of port numbers to service names.
-n: Skips the conversion of IP addresses to hostnames.

Alternatively, you can use netstat to do something similar:

netstat -tln  # Show only listening TCP ports
netstat -uln  # Show only listening UDP ports

If you don't see your desired port here, the first place to look is the firewall rules.

Additionally, you will want to allow your cloud instance or server to accept inbound traffic from those ports if you want to achieve access from other machines to the specific port.

A good test for this is the curl command.

If

curl localhost:8080

works from inside the SSH session but you still can't access port 8080 through the public IP, that means inbound traffic is not being allowed.

Serving Flask App Using Gunicorn via Pm2

An even classier way to serve a Flask app is to create a JS configuration file and instruct pm2 to use gunicorn to serve it.

The file (pm2's conventional name is ecosystem.config.js) can be created as below:

module.exports = { 
  apps: [ 
    { 
      name: 'Flaskmyapp', 
      script: 'gunicorn', 
      args: 'app:app -b 0.0.0.0:5000', 
      interpreter: 'python', 
      interpreter_args: '-u', 
      instances: 1, 
      autorestart: true, 
      watch: false, 
      max_memory_restart: '1G', 
      env: { 
        FLASK_APP: 'app.py', 
        FLASK_ENV: 'production', 
      }, 
    }, 
  ], 
}; 
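You can then start the app from that file, and optionally persist the process list across pm2 restarts:

pm2 start ecosystem.config.js
pm2 save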

Nginx Reverse Proxy

It's common to reverse proxy to specific ports with Nginx. You can instruct Nginx to reverse proxy multiple ports, not just port 80, like this:

server {
  listen 80;
  server_name servername;

  location / {
    proxy_pass http://localhost:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}


server {
  listen 81;
  server_name servername;

  location / {
    proxy_pass http://localhost:5000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}

In this example port 80 on public IP will forward to port 3000 while port 81 on public IP forwards to port 5000.

Deploying Node.js Apps on Google App Engine: Typical Server Errors & Solutions


In recent years, I've witnessed the transformative impact of serverless computing on the evolution of DevOps. By embracing serverless technologies, you can unlock scalability and simplify your deployment processes while getting more bang for your buck. The modular architecture and reduced infrastructure management allow teams to innovate rapidly, delivering exceptional value with speed and efficiency.

In this article, I will showcase some of the most confusing errors that commonly happen during Node.js deployments on Google's serverless App Engine, along with their solutions. Node.js is the most commonly used JavaScript runtime environment, and JavaScript (JS) is the most commonly used programming language for web apps (if you set aside the WordPress effect in the statistics, WordPress being a widely adopted, mostly no-code CMS built on PHP). Throughout the years, JS has remained many developers' first choice for web app development.

Let’s check out the errors.

How to Debug Google App Engine Errors

First of all, let's briefly discuss how you can actually identify errors when deploying apps to Google App Engine. While some errors will be clearly visible in your command line or directly in the browser (such as 500 Internal Server Error), others will be hard to identify without the right tool.

App Engine Logs are a great example of an error-identification tool (with info about many other things too), and you can access them through multiple links/windows.
  • If you click the App Engine’s Navigation Menu (very top left, hamburger button), you will then see
    • Logging Menu > Logs Explorer.
  • Alternatively, you can click Navigation Menu again and then App Engine > Services
    • Your default service will have a last column called Diagnose under which you can see the link to Logs.

You might need to adjust the date and time in the Logs Explorer, or regenerate the errors by attempting to run the app or its functions; still, Logs Explorer is a very powerful debugging tool that will show you detailed information about what is wrong and what types of errors your app is facing.
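If you prefer the command line, the gcloud CLI can stream the same logs; a quick sketch, assuming your app runs in the default service:

gcloud app logs tail -s default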

Let’s start checking out the actual errors that might be occurring.

Alternative Debugging Method

Alternatively, you can always go to your browser's developer tools and see the logs and errors on the front end of the application, but many server-side errors won't actually show up there, hence App Engine's Logs Explorer is a lot more relevant in this case.

Google App Engine Errors & Solutions Provided Below

The errors I’ve dealt with can be listed as:

  • TypeError
  • Cannot Get / Error
  • syntaxerror missing ) after argument list
  • Google App Engine 503 Service Unavailable Error
  • ERROR: (gcloud.app.deploy)
  • ERROR: (gcloud.app.deploy) INVALID_ARGUMENT
  • Infinite Page Loading due to port error
  • 500 Internal Server Error

Below you can find solutions to all these typical App Engine Errors.

Server.js or Index.js related TypeError

This error typically means the server.js file is not declaring the app properly, hence the issue with the app.use call: the app object doesn't have a use method.

Make sure you have imported the necessary modules correctly. In many cases, this error occurs when the express module is not imported or initialized properly.

Verify that you have created an instance of the Express application using const app = express(). This creates an Express application object that has the use method and other middleware functions. Make sure you have the following code at the beginning of your server.js file, before using app.use:

const express = require('express');

const app = express();

Check for typos or syntax errors in your code. Express's app object is used as middleware plumbing here, and any typo or syntax error can cause a similar app.use TypeError on the server.

Additionally, ensure dependencies are defined properly so that the necessary Express modules are not missing or improperly installed.

By reviewing and addressing these points, you should be able to resolve the TypeError: app.use is not a function issue in your server.js file.

Here is an example stack trace for this error:

TypeError: app.use is not a function
at Object.<anonymous> (/workspace/server.js:6:5)
at Module._compile (node:internal/modules/cjs/loader:1255:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1309:10)
at Module.load (node:internal/modules/cjs/loader:1113:32)
at Module._load (node:internal/modules/cjs/loader:960:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
at node:internal/main/run_main_module:23:47

For example, the Node.js app below assigns the app variable to require('express'): instead of creating an app, this line merely imports the express library. As a result, app isn't created properly, the program can't call the use method on it, and we get a textbook TypeError.

const app = require('express');

app.get('/', (req, res) => {
  res.send('Hello from App Engine!');
});

app.use(express.static(__dirname + '/public'));

// Listen to the App Engine-specified port, or 8080 otherwise
const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}...`);
});

In this case, it is pretty easy to fix the error by creating the app variable properly. Here is the fixed code, where we import Express and create an Express app the right way:

const express = require('express');
const app = express();

app.use(express.static(__dirname + '/public'));

app.get('/', (req, res) => {
  res.send('Hello from App Engine!');
});

// Listen to the App Engine-specified port, or 8080 otherwise
const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}...`);
});

In addition to TypeError, Node.js has the error classes below, which can be useful when debugging applications and deployments.

  • Class: AssertionError
  • Class: RangeError
  • Class: ReferenceError
  • Class: SyntaxError
  • Class: SystemError

Cannot GET / error

This error signifies that your app is not able to serve the public folder. The most likely culprit is a wrong static-folder definition, or issues with the public folder and its front-end content (typically index.html).

Simply fix it by adding a static folder entry to the server-side code (typically index.js or server.js), as below:

app.use(express.static('public'));

Here is a common directory structure of a very basic Node.js app:

- project_directory
  - public
    - index.html
  - index.js
  - package.json
- app.yaml

And here is a common directory structure of a very extensive Node.js app:

- project_directory
  - src
    - controllers
    - models
    - routes
    - services
    - utils
  - public
    - css
    - js
    - images
  - views
  - config
  - tests
  - node_modules
  - index.js
  - package.json
- app.yaml

503 Service Unavailable error

This is probably the least likely error to occur. 503 indicates something is wrong with the availability of the Google Cloud or Google App Engine services. Luckily, Google Cloud is a very robust service, and 503 errors occur very seldom and for very short periods of time.

Other than waiting or reaching out to Google Cloud Customer Support, you can monitor Google Cloud’s System Health page where they list the status of Google Services across many regions.


syntaxerror missing ) after argument list error

Another peculiar error is a SyntaxError pointing to a missing parenthesis. In reality, there can be many reasons for this error.

First of all, obviously make sure your index.js, server.js, etc. files aren't missing a parenthesis. In my experience it can be a missing curly bracket as well, and sometimes App Engine will throw the same syntax error.

Additionally, it's not uncommon for a version conflict to cause this error, since different versions of Node.js can interpret the files differently. Make sure to define your runtime in app.yaml:

runtime: nodejs20

Also, if you need to pin the Node.js engine version, do it in package.json via the engines field:

"engines": { "node": "20.3.0" }

Here is a code sample that produces this kind of error:
import express, { Request, Response } from "express";

const app = express();
const port = 3000;

app.get("/", (req: Request, res: Response) => {
  res.send("Hello, World!");
});

app.listen(port, () => {
  console.log(`Server is running on port ${port}`);
};

For example, the JS code sample above uses Express, but there is a rookie syntax error. The last line is supposed to be

});

rather than just };

These little mistakes happen all the time.

(gcloud.app.deploy) error

If you forget a crucial statement or make syntax errors in your configuration files such as app.yaml or package.json, you will get a somewhat generic error: ERROR: (gcloud.app.deploy). You can see the actual error statement below.

ERROR: (gcloud.app.deploy) An error occurred while parsing file: […/app.yaml]

In this case, make sure the mandatory statements are included and there are no syntax errors in your files. One example is stating the application's runtime, which can be done simply by including this line in your app.yaml:

runtime: nodejs20

If you make syntax mistakes similar to the ones below, or put in something App Engine doesn't work with, you will still get a deploy error because the App Engine platform can't parse app.yaml.

runtime nodejs20 (Colon is missing)

:runtime: nodejs20 (Colon syntax error)

(gcloud.app.deploy) INVALID_ARGUMENT error

In the app.yaml file, one thing you are required to include is the runtime variable. As mentioned above, for JavaScript applications it will typically be something like: runtime: nodejs19

The issue here is that nodejs19 is not an available runtime environment in App Engine.

Available Node.js runtimes on Google App Engine are:

  • nodejs10
  • nodejs12
  • nodejs14
  • nodejs16
  • nodejs18
  • nodejs20

If you specify something like nodejs19, you will get the error below.

Simply assign an available runtime and the error will be resolved.

ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: Invalid runtime ‘nodejs19’ specified. Accepted runtimes are: [php, php55, python27, java, java7, java8, nodejs18, php72, nodejs16, php74, python310, php73, python311, nodejs10, ruby30, nodejs14, nodejs12, go120, ruby32, java17, php82, php81, java11, go118, go119, go116, go114, go115, nodejs20, go112, python38, go113, python37, go111, python39, ruby27, ruby26, ruby25]

Here is a correct app.yaml example:

runtime: nodejs20

(gcloud.app.deploy) error due to missing app.yaml

This one is quite self-explanatory: App Engine needs an app.yaml file in the application's root folder, even if it contains just a one-line runtime assignment. If this file is missing entirely, you will get a deployment error.

ERROR: An app.yaml (or appengine-web.xml) file is required to deploy this directory as an App Engine application. Create an app.yaml file using the directions at https://cloud.google.com/appengine/docs/flexible/reference/app-yaml (App Engine flexible environment) or https://cloud.google.com/appengine/docs/standard/reference/app-yaml (App Engine standard environment) under the tab for your language.

ERROR: (gcloud.app.deploy) [/home/GAE-gcloud-tests] could not be identified as a valid source directory or file.

The solution is simple: check that you have an app.yaml file with the right file name and extension, and that it's where it's supposed to be, i.e. the root folder of your app.

Infinite Page Loading due to port error

Typically, App Engine serves applications on port 8080. Normally you don't have to specify any other port; just use the following port value in your server-side app (such as server.js or index.js). A mismatch between the application's port assignment and Google App Engine's port will cause an error due to the request deadline being exceeded without any activity.

process.env.PORT || 8080

After the timeout you can go to the Logs section of your Google Cloud account, either via the main Logging section (usually in the left-hand service menu) or via the "See Logs" shortcut near your related service under App Engine's Services section. In the logs you will see an error like the one below.

Error: Process terminated because the request deadline was exceeded. Please ensure that your HTTP server is listening for requests on 0.0.0.0 and on the port defined by the PORT environment variable. (Error code 123)

Assigning a different port number will cause the app to load infinitely, since Google App Engine serves the application on a different port than the one the application is listening on. Google App Engine's port is fixed and cannot be changed by the individual application.

So, if your browser wheel keeps turning for an unexpectedly long time and the request eventually times out without anything happening, the most likely culprit is a wrong port value in the app's server-side JavaScript code.

This can be a very tricky error, especially for inexperienced DevOps engineers, since there won't be any specific errors before or during the deployment, nor during the execution of the app. The app will deploy fine and work fine locally (perhaps on port 3000, commonly used when developing Node.js apps), the browser won't give any errors, the browser's developer tools and debuggers won't show anything, and eventually the timeout will occur.

const PORT = 3000;
app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}...`);
});

Above, as a common practice, the port is assigned to 3000 for local development. Although nothing is wrong with this code, it is the wrong setup for an App Engine deployment, which uses port 8080 by default.

Simply solve the infinitely-loading application and the timeout by reading the port from the environment variable App Engine provides, as below:

const PORT = process.env.PORT || 8080;
app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}`);
});

Google App Engine 500 Internal Server Error

500 Internal Server Error is an umbrella error code for all server errors. You might have heard the saying in IT that there is nothing the user can do about a 500 Internal Server Error. But if you think about it, when you are deploying a Node.js app on Google App Engine, your app is the server. Node.js is, after all, the most commonly used JavaScript runtime environment and takes care of the server side of the application.

So if there is any incompatibility, conflict, error, glitch or bug anywhere in your code, your app's back end or front end, or any configuration, you might see this annoying error. You might think: how am I supposed to find the exact issue? Below, I point out methods to pinpoint the most common causes and provide solutions to them.

Here are 2 configuration-related bugs that will cause App Engine deployment to throw 500 Internal Server Error:
(In most cases, fixing either one will eliminate 500 Internal Server Error but following both practices won’t cause any damage.)

Error: Server Error
The server encountered an error and could not complete your request.

Please try again in 30 seconds.

Entrypoint causing 500 Internal Server Error

A pretty common cause of 500 Internal Server Error with Google App Engine is having an entrypoint name different from the default and not specifying it appropriately.

For example, with the nodejs runtime, App Engine looks for a server.js file as the default entrypoint. The entrypoint is the main server-side program that initiates serving the application and its front end. Many developers choose a file name such as index.js for their application's main server-side code. This is fine as long as the entrypoint is specified in the app.yaml file that instructs Google App Engine. It's quite a simple fix; here is the code example:

runtime: nodejs20
entrypoint: node ./index.js

Start Script (package.json) causing 500 Internal Server Error

This solves the 500 Internal Server Error caused by App Engine trying to initiate a nonexistent server.js file. It's equally important to assign a start script pointing at the same file via package.json. The package.json file acts as a configuration file for your Node.js project: it contains metadata about your project, including its name, version, dependencies, and various scripts that can be executed using npm. App Engine runs npm start by default to initiate JavaScript apps in the Node.js runtime environment, and when you assign the start script, npm start automatically knows what to launch.

But when you don't include the start script in package.json, the application can only be run with the node index.js command, which isn't something App Engine does by default. This is why neglecting the start script in package.json causes a 500 Internal Server Error with App Engine. You can easily address this by adding the following start line under the scripts section of your app's package.json file.

  "scripts": {
    "start": "node index.js",
  }
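You can sanity-check the start script locally before deploying; this assumes the app listens on port 8080 as in the earlier examples:

npm start
# in a second terminal:
curl localhost:8080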

500 Internal Server Error due to wrong file references

Another example of why a 500 Internal Server Error may occur in your app is when you reference style or script files, such as CSS or JavaScript, that don't exist, or whose file paths or names are wrong.

If you look at the simple index.html below, we are calling two external files: a CSS file and a JS file. Since styles.css is misnamed as stylez.css, and stylez.css doesn't exist, the server will look for the wrong file and throw an Internal Server Error. The solution is to fix the file name; it is also smart to first deploy a bare-bones application without any file references, to make it easier to isolate errors.

<!DOCTYPE html>
<html>
<head>
  <link rel="stylesheet" type="text/css" href="css/stylez.css">
  <script src="js/script.js"></script>
</head>
<body>
  <h1>Hello World</h1>
</body>
</html>

If you fix the styles.css file name above, or completely remove the line for testing purposes, the error will disappear (or you'll be able to catch other errors).

500 Internal Server Error due to await command inside a non-async function

If you use the await keyword inside a non-async function, it will result in a syntax error. The await keyword can only be used inside an asynchronous function marked with the async keyword.

The purpose of the await keyword is to pause the execution of an asynchronous function until a Promise is resolved. It allows you to write asynchronous code that looks and behaves more like synchronous code.

When you use await inside an asynchronous function, it waits for the Promise to settle (either resolve or reject) and then continues the execution of the function. The await expression itself returns the resolved value of the Promise.

If you attempt to use await outside of an async function, the JavaScript engine will raise a syntax error, because it doesn't expect the await keyword in that context. To use await, you need to wrap it inside an asynchronous function; in the sample below, declaring the handler as app.get('/', async (req, res) => { ... }) would fix the error.

const express = require('express');
const app = express();
const port = 3000;

function fetchData() {
  return new Promise((resolve) => {
    setTimeout(() => resolve('Data has been fetched!'), 2000);
  });
}

app.get('/', (req, res) => {
  res.send('Hello World!');

  try {
    const result = await fetchData(); // Syntax Error: await is only valid in async function
    console.log(result);
  } catch (error) {
    console.error(error);
  }
});

app.listen(port, () => {
  console.log(`App listening at http://localhost:${port}`);
});

Summary

In this Node.js App Engine debugging tutorial we covered a number of errors that can be quite puzzling. The article doesn't cover all HTML or server-related errors extensively, but it aims to provide guidance and solutions for some of the most confusing and least self-explanatory errors you can get after deploying on App Engine. Good examples of such confusing errors are the generic 500 Internal Server Error and the infinite page load until timeout (or just a white screen with nothing loading).

Inserting Python Command’s Output in VIM

1) r!

:r! python test.py

Let’s say test.py is a file with print(123) inside it.

This runs the file with Python and inserts its output into the current Vim buffer. Here is the breakdown:

  • : enters command-line mode
  • r reads the command's output into the buffer
  • ! executes a Linux command
  • python test.py runs the test.py file with Python

2) !! (Bang Bang)

:.! python test.py

Alternatively, if you type !! in normal mode, Vim will pre-fill the command line with :.!, the same as above, and the same outcome can be achieved. Here is the breakdown:

  • . targets the current line (it gets replaced by the output)
  • ! executes a system command
  • python test.py runs the test.py file with Python

3) Feed Python with a Quick Command

:.! python <<< "print(123)"

The previous outcome can be matched with the quick command-execution style shown here:

  • <<< (a here-string) feeds the quoted string to the Python interpreter's standard input
  • "" the quotes contain the Python code to run
  • print(123) the actual Python command itself

This is incredible, as so many systems are used on the fly and in harmony: Vim, the Linux OS and Python are utilized in two words and a few characters, and the program's result is written into the same file.

4) Feed Python with the Existing File

:.! python %

Finally, you can feed the existing Vim file to Python directly and insert the output at the cursor's line. For this task, you feed % to Python, representing the whole current file.

  • . enter to current line
  • ! execute system command
  • python runs Python
  • % feeds current file to Python

This is a really next-level scripting pipeline. I have always appreciated Vim's intuitive style for writing code and human language; when mixed and matched with Python or Linux commands it gets even more impressive.

Fixing Pacman Errors When Updating Arch Linux After A While

There is a specific error you might get when updating/upgrading Arch Linux after a long break (or sometimes even after a short break, if you're unlucky with the intervals) via sudo pacman -Syu. No worries though: Linux in general, and Arch Linux specifically, are incredibly robust and repairable systems, so there are many ways to address the issue.

Table of Contents
  • Introduction and Pacman Errors
    1. Updating the Archlinux Keyring
    2. Updating the Pacman Mirrorlist
    3. Removing the Pacman Cache Files
    4. Internet restriction and VPN or Hotspot Solutions
    5. Compiling and Installing pacman-static
    6. Booting Via Live Arch Installation USB & Chroot
  • References

The errors thrown by Arch Linux's package manager, pacman, look something like this:

error: openssl: signature from “Pierre Schmitz <pierre@archlinux.org>” is marginal trust
:: File /var/cache/pacman/pkg/openssl-3.0.7-4-x86_64.pkg.tar.zst is corrupted (invalid or corrupted package (PGP signature)).
Do you want to delete it? [Y/n] Y
error: failed to commit transaction (invalid or corrupted package)
Errors occurred, no packages were upgraded.

Starting from the least complicated and the least invasive, here are some solutions to these pacman errors.

Disclaimer: This tutorial is created for journaling & educational purposes only. The author cannot be responsible for any damages due to the guidance provided in this post. Please proceed at your own risk and always keep backups of your files.

1. Update the Archlinux Keyring first

While partial updates aren't recommended, as they can mess up dependencies, SSL/TLS encryption mechanisms, and the sync between packages and the package manager, you can usually get away with updating archlinux-keyring first. If you haven't run any updates/upgrades in a while, the update operation might be interrupted due to an out-of-date Arch Linux keyring.

You can simply do this by running the following command:

sudo pacman -S archlinux-keyring

After that, the keyring will be upgraded and your problems regarding package signatures and marginal trust might go away. You can then proceed with a full system upgrade using the following command:

sudo pacman -Syu

If everything goes well, issue is sorted out and you don’t have to follow the following suggestions.

Note about openssl: The openssl package is a dependency of the GNU Core Utilities package, coreutils. In this specific case, due to an out-of-date archlinux-keyring package, the required version of the openssl package can't be verified / appears corrupted, and as a result the core database cannot be retrieved.

Another way to fix broken keyrings is to update the keyring with these commands:

pacman -S archlinux-keyring
pacman-key --init
pacman-key --populate archlinux
pacman-key --refresh-keys


(According to Arch Wiki: this is something you want to do in the chroot of your installation (arch-chroot /mnt))

This section is inspired by Nan Xiao’s post about How to update Archlinux keyring in old systems.

2. Update the Pacman Mirrorlist

You can also try updating the pacman mirrorlist by adding any entries from the official Arch mirrorlist (linked below) to the file on your system at /etc/pacman.d/mirrorlist.

Here is the mirrorlist link: Arch Pacman Mirrorlist.

After updating the pacman mirrorlist, give it a try by attempting the update/upgrade command again.

If that doesn't work, move on to the next step and then retry this command:

sudo pacman -Syu



3. Removing Pacman cache files

Every time you update your system, pacman keeps cache files of the old packages in case there is a need to downgrade. These files cumulatively take up a lot of space, and if they hold the wrong version they can also confuse pacman's update mechanism. Cache files are kept under /var/cache/pacman/pkg and /var/lib/pacman.

Cache files can be removed by running the following command:

sudo pacman -Sc

You’ll be prompted twice to delete the cache files.

You can then try to update the system with pacman again by running sudo pacman -Syu. There is also a way to remove ALL cache files, including those of packages that are still installed:

sudo pacman -Scc

You will be prompted twice; one of the prompts has no (N) as the default choice, so make sure to enter y (for yes) if you want the full clean.

Cache directory: /var/cache/pacman/pkg/
:: Do you want to remove ALL files from cache? [y/N]

Database directory: /var/lib/pacman/
:: Do you want to remove unused repositories? [Y/n]

sudo pacman -Syu

4. Internet Restriction & VPN usage

In some rare cases there can be access restrictions if you are on a restricted internet connection. The server will return a 403 access-denied error; in those cases you know it's most likely something to do with the internet connection. You will see errors similar to the below:

error: failed retrieving file ‘core.db’ from archmirror.it : The requested URL returned error: 403
error: failed retrieving file ‘extra.db’ from archmirror.it : The requested URL returned error: 403
error: failed retrieving file ‘community.db’ from archmirror.it : The requested URL returned error: 403
warning: too many errors from archmirror.it, skipping for the remainder of this transaction

In such cases, temporarily connecting to a VPN or a hotspot and re-running the system update can help.

5. Compiling a Static Pacman (pacman-static) from AUR


Here is the AUR repository for pacman-static, where you can find the git clone URL. You can then compile pacman-static following the steps below:
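A sketch of the usual AUR build steps, assuming the standard AUR clone URL and an installed base-devel group:

git clone https://aur.archlinux.org/pacman-static.git
cd pacman-static
makepkg -si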

Once pacman-static is installed, you will be able to update openssl or other crucial system packages that need to be repaired. In many cases it's the openssl package, which can be updated/installed with the following command:

sudo pacman-static -S openssl

You can also see the version of your openssl with the following command.

pacman -Q openssl

In the case of partial updates, one important package gets upgraded while others don’t. Since those other system packages depend on the correct version of the upgraded package, they will stop functioning properly, which can break the whole system.

Here is the Arch Wiki entry for using pacman-static.

6. Firing Up the Live USB (Arch ISO)

If all else fails, it’s still fairly easy to boot with a live Arch installation USB. It’s also a good idea to always have a live USB ready for such circumstances. You can then chroot and install the necessary packages to your local installation to solve the more complex issues. It’s probably also a good idea to back up what’s needed at this step while you still can. Here is the command to chroot:

arch-chroot /mnt

Fixing Broken Pacman

If partial upgrades, broken libraries, or missing SSL/TLS encryption are causing bigger issues, you can follow this procedure:

  • Boot through the Arch ISO USB
  • Connect to the internet using iwctl
  • Mount the broken system on /mnt
  • Use pacman --sysroot /mnt … to reinstall/update the broken or outdated packages (see the example below this list)
  • If that’s not enough, install pacman-static:
    • either via the AUR as explained above,
    • or, if the system files are too broken to makepkg/install, use the extracted binary compiled by official Arch sources.
  • Use pacman-static instead of pacman to reinstall or update the broken or outdated packages.
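For example, here is a minimal sketch; the package names are assumptions for illustration, adjust them to whatever is broken on your system:

pacman --sysroot /mnt -Syu archlinux-keyring openssl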

Please note, you will need to mount the necessary system partitions before chrooting into the root of your local installation as above; a sketch follows below.
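The device names below are assumptions for illustration; use lsblk to identify your actual partitions:

mount /dev/sda2 /mnt
mount /dev/sda1 /mnt/boot
arch-chroot /mnt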

Official files and mirrors for Arch Linux can be found here.

References

[1] Arch Wiki: chroot

Recursive Grep

In this bash tutorial we will check out recursive grep usage examples and how it can be extremely beneficial in some use cases.

Table of Contents

  • Basics of grep -R
    1. Excluding Directories
    2. How to remove “Permission denied” messages
    3. Ignoring case sensitivity
    4. You can combine grep with regex
  • Benefits of recursive grep

grep -R, Recursive Query Function

grep -R can be used for recursive text search across many files in a directory. It can be extremely useful and can be applied in many different use cases.

1. Excluding Directories

Sometimes specific folders can cloud the query results when using grep recursively. For example, you might have specific folders full of the thing you are searching for, but you might want to look for your query everywhere else under the same structure.

A typical occurrence is when you search under the root folder and the proc directory becomes a nuisance in the results. This folder includes process IDs and CPU- and kernel-specific information which is often irrelevant for search results. Here is an example of excluding the proc directory when using grep recursively, but you can use it for any directory you like.

grep -R --exclude-dir=proc "import requests" /

grep recursively looks for "import requests" under / (root) while excluding the proc directory.
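The flag can be repeated to exclude several directories at once; sys and proc here are just typical candidates:

grep -R --exclude-dir=proc --exclude-dir=sys "import requests" /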

2. How to remove "Permission denied" messages

grep -R "import" /

Sometimes you might have to use grep without sudo, or some system directories might be unreachable even for the admin. In those cases you might get results with hundreds of lines similar to the ones below, which are quite hard to read.

Check out an excerpt of the results from the command above.

/usr/share/doc/libjpeg-turbo/TJExample.java:import java.io.*;
grep: /proc/net/ip_tables_names: Permission denied
grep: /proc/net/ip_tables_matches: Permission denied
grep: /proc/net/ip_tables_targets: Permission denied
grep: /proc/sys/fs/binfmt_misc/register: Permission denied
grep: /proc/sys/kernel/cad_pid: Permission denied
grep: /proc/sys/kernel/usermodehelper/bset: Permission denied
grep: /proc/sys/kernel/usermodehelper/inheritable: Permission denied
grep: /proc/sys/net/core/bpf_jit_harden: Permission denied
grep: /proc/sys/net/core/bpf_jit_kallsyms: Permission denied
grep: /proc/sys/net/core/bpf_jit_limit: Permission denied
grep: /proc/sys/net/ipv4/route/flush: Permission denied
grep: /proc/sys/net/ipv4/tcp_fastopen_key: Permission denied
grep: /proc/sys/net/ipv6/conf/all/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/default/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/lo/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/wlp58s0/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/route/flush: Permission denied
grep: /proc/sys/vm/compact_memory: Permission denied
grep: /proc/sys/vm/drop_caches: Permission denied
grep: /proc/sys/vm/mmap_rnd_bits: Permission denied
grep: /proc/sys/vm/mmap_rnd_compat_bits: Permission denied
grep: /proc/sys/vm/stat_refresh: Permission denied
grep: /proc/tty/driver: Permission denied
grep: /proc/kmsg: Permission denied
grep: /proc/mtrr: Operation not permitted
grep: /proc/kcore: Permission denied
grep: /proc/slabinfo: Permission denied
grep: /proc/kpagecount: Permission denied
grep: /proc/kpageflags: Permission denied
grep: /proc/timer_list: Permission denied
grep: /proc/kpagecgroup: Permission denied
grep: /proc/vmallocinfo: Permission denied
grep: /proc/pagetypeinfo: Permission denied
grep: /proc/sysrq-trigger: Permission denied
"Permission denied" output can be excluded by redirecting errors with 2> /dev/null added to the query. Here is an example:

grep -R "import" / 2> /dev/null
/usr/share/doc/libjpeg-turbo/TJExample.java:import java.io.*;

3. Ignoring case sensitivity

grep -R  --ignore-case "IMPORT" /

You can also disable case sensitivity when using grep as above.

4. You can combine grep with regex

Regex is beyond the scope of this tutorial, but you can combine grep with regular expressions to create very specifically tailored queries. This can be useful in server administration, cloud migration, cloud management, devops operations and similar industry-level tasks.

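As a hedged sketch, the command below uses extended regex (-E) to match either of two Python imports; the path and the pattern are illustrative:

grep -R -E "import (requests|numpy)" /home 2> /dev/null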

Benefits of Recursive Grep

Using grep recursively can be incredibly useful. Here are a few examples.

  • You can find files without knowing the file names. Imagine you recovered some data but the file names are a mess. Let’s say you have 100K files with weird names. You could use grep -R to search recursively and find specific files by their content. For example, you can search for a query like import requests or another piece of Python code and you will get Python files.
  • You can find specific text strings across many text files. You can search for an Einstein quote or price information without opening each text file.
  • Applying a similar logic you can filter HTML files or even query images. Most images include EXIF information which can be used to find specific images as well (see the sketch after this list).
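Since EXIF data is text embedded inside a binary file, grep can often match it with the -a flag. Here is a rough sketch; the directory and the camera string are assumptions:

grep -Rla "Canon" ~/Pictures

The -a flag treats binary files as text and -l lists only the names of matching files.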

grep -R treats each file as text, goes through every single file in the specified folder, and outputs the query results.

References

[1] Grep Command: GNU Grep Manual

Launching desktop environments with startx

You can have total control over launching your operating system’s graphic server in Linux.

Most desktop environments use a graphic server called either X (or Xorg) or Wayland. X is approximately 40 years old and it is the legacy graphic server that serves its purpose really well. X still dominates the market.

Wayland on the other hand is a modern approach that’s been gaining traction and it also has lots of support and is being used by quite a few desktop environments in Linux such as KDE.

I find it very useful to be able to manually launch and quit desktop environments in Linux and shut down (or never launch) the graphic server when I want to.

This Linux tutorial  will explain how to take control over launching the Xorg graphic server in almost any Linux distro and the reasoning behind it.

Table of Contents

  • X Graphic Server Terminology
  • Launching Desktop Environments with startx
    1. Gnome .xinitrc
    2. XFCE .xinitrc
    3. Cinnamon .xinitrc
    4. KDE Plasma .xinitrc
    5. Qtile .xinitrc
  • Handling the Login Manager
  • Benefits of Manually Launching Graphic Environment in Linux
  • Troubleshooting startx & .xinitrc

X Graphic Server Terminology

Here is a quick breakdown of the terminology around the X graphic server:

Xinit: the program which launches the Xorg graphic server.

Xorg: the X graphic server itself. Xorg is synonymous with X server or sometimes just X.

startx: the command used to launch the graphic environment, which defaults to the .xinitrc file for its instructions.

.xinitrc: a file under the home directory that instructs the startx command.

startplasma-x11 start: an example .xinitrc entry. See below for specific entries for each Desktop Environment or Window Manager.

Screenshot of Endeavour OS based on Arch Linux and KDE Plasma Desktop

1. Launching desktop environments with startx

Here is a breakdown of the most commonly used desktop environments and the codes you can use to start them with startx which will initiate X graphic server/client through Xinit (Server Launcher Program) and Xorg (Graphic Server).


You can create .xinitrc under the user’s home directory and add just a one-line entry such as:

startplasma-x11 start

So the code goes inside .xinitrc, and the directory structure would be:

/home/userxyz/.xinitrc

Make sure the .xinitrc file has a dot before its name as shown; xinit looks for this file directly under the home directory. (The dot makes the file hidden; hidden files can be listed with the ls -la command.)
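For instance, a minimal way to create the file and launch a session, using the KDE Plasma entry from above as the example:

echo "exec startplasma-x11" > ~/.xinitrc
startx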

Here are the specific codes to include in the .xinitrc file for each desktop environment.

There are also a few points down below which might be helpful for troubleshooting or avoiding pitfalls if you are doing this kind of thing for the first time. You can also read about the benefits of manually launching desktop environments, or at least my reasoning for it. If you have comments, feel free to drop them at the bottom.

Please review “Handling the Login Manager” down below before starting to use the startx command.

Gnome .xinitrc

You can instruct startx either by prefixing the entry with exec or by appending start at the end.

gnome-session start

or

exec gnome-session

XFCE .xinitrc

startxfce4 start

or

exec startxfce4

Cinnamon .xinitrc

cinnamon-session start

or

exec cinnamon-session

KDE Plasma .xinitrc

startplasma-x11 start

or

exec startplasma-x11

Qtile .xinitrc

qtile start

or

exec qtile


Starting desktop environment with startx

2. Handling the Login Manager

If you are building a Linux system from the ground up, you might not have a login manager, but if you are modifying an existing distro such as Endeavour OS, Linux Mint, Manjaro, Fedora, CentOS, Ubuntu, openSUSE etc., then you will want to disable the existing login manager.

Starting and Stopping the Login Manager

You can also immediately start and stop the SDDM Login Manager service using the commands below. (Stopping the login manager on the go might be tricky sometimes but disabling fixes that problem for the next boot.)

  • For starting SDDM:
    • systemctl start sddm
  • For stopping SDDM:
    • systemctl stop sddm

Enabling and Disabling the Login Manager

First you need to find out which login manager your system is using, if there is one. Login Managers (or Display Managers, DMs) are graphical login screens that can be useful in some cases. I personally don’t find them to be very useful. They take resources and complicate things, and running desktop environments from the tty is perfectly minimalistic and aesthetically pleasing anyway.

Here are some commands to list login managers, find them, enable/disable them and start/stop them…

You can list the services enabled at systemd level via systemctl by using either of the following commands.

The first is the native way of filtering enabled services with systemctl’s own flag, while the second makes use of the grep command for a similar output.

systemctl list-unit-files --state=enabled

systemctl list-unit-files | grep enabled
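Most login managers also register themselves under the display-manager.service alias, so on distros that follow this systemd convention you can identify yours directly:

systemctl status display-manager.service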

SDDM Login Manager

So disabling and enabling login manager at startup level is quite easy with systemd. Here are the example commands for SDDM login manager.

For disabling SDDM:

systemctl disable sddm

For enabling SDDM:

systemctl enable sddm

Make sure you are comfortable working with the CLI (command line interface) before disabling the login manager, because afterwards there will be no login manager triggering the graphic interface of the desktop environment automatically behind the scenes.

Login Managers are quite persistent, meaning even if you stop them they might auto-launch and get in the way when you are trying to manually launch your graphic environment (through desktop environments, window managers, or sometimes specific individual apps).

If you disable your login manager and reboot your Linux system, you will just get a TTY command line without graphics server activated. Normally you will get TTY1.

You can usually navigate between multiple virtual terminals using Ctrl+Alt+F2 (or F3 for TTY3, F4 for TTY4 and so on; F1 takes you back to TTY1).

LightDM Login Manager

You can enable or disable LightDM as below:

For disabling LightDM:

systemctl disable lightdm

For enabling LightDM:

systemctl enable lightdm

3. Exiting Xorg Graphic Server and Desktop Environments

After launching the desktop environment from the tty command line, you might want to go back to the tty environment and/or leave your computer on without the Xorg graphic server.

Termination of Xorg and desktop environments can be conveniently achieved when there is no Login Manager such as SDDM running. That’s because SDDM can be persistent and auto launch itself which can make it difficult to kill without disrupting the system.

To kill the desktop environment just execute the appropriate one for your setup from the following commands.

Commands to kill various Desktop Environments


For killing Gnome session:

killall gnome-session

For killing XFCE session:

killall startxfce4

For killing Cinnamon session:

killall cinnamon-session

For killing KDE Plasma session:

killall startplasma-x11

For killing Qtile session:

killall qtile

It’s also perfectly possible to map the commands above to a key combination to kill the desktop environment. One combination that works perfectly for me is Ctrl+Meta+Q. It’s usually a free combo and has the Q (quit) in it so it’s easy to remember. It’s also not easy to accidentally hit it.

Benefits of Manually Launching Graphic Environment in Linux


  • Achieving focus: Being immersed in a graphic environment has its pros and cons. Having the graphic environment available can be counterproductive when you need to focus on certain tasks. For tasks that can be done in the command line, I find it helpful not to launch a desktop environment at all. There is something special about not having the graphic server running at all versus merely ignoring the GUI programs available in the environment. That being said, sometimes the graphic environment can of course be very useful and even inevitable. Some tasks I find myself performing through the CLI are:
    • Writing code
    • Writing scripts
    • Playing music (cmus)
    • Taking notes
    • Writing articles
    • Reading notes/books
  • Increased RAM & CPU resources: In most computers, the most resource-hungry components are often the graphic environments. The graphic server communicates with every input and output at the pixel level, which understandably creates a burden on RAM and CPU resources.
  • Increased battery life: Manually launching the X graphic server only when needed can optimize battery life enormously. I forgot my Dell XPS on for one week and the battery was still at 90%+. Since there’s no graphic server launched, the processes necessary to keep the OS running are at an absolute minimum. The X graphic server also comes with many support libraries which can bloat the system.
  • Having more control: Being able to launch and terminate the graphic server and all the desktop environment and other related programs in a very practical way gives you more power over using your computer’s operating system and even its hardware. It’s also helpful to understand the inner workings of the operating system’s graphic components.

graphical.target vs multi-user.target

You can also control this behavior at the systemd level: the default boot target decides whether the system boots into a graphical session (graphical.target) or a plain console (multi-user.target). Check the current default with:

systemctl get-default

To boot into the graphical login manager by default:

systemctl set-default graphical.target

To boot into a TTY without the graphic server:

systemctl set-default multi-user.target

Troubleshooting startx and .xinitrc

If startx fails, the most common causes are the following:

1- startx is not meant to be run as root.

2- .xinitrc is not under the home directory.

3- .xinitrc has incorrect entries, for example:

  • start startplasma-x11 instead of startplasma-x11 start
  • entries placed after the exec line (exec replaces the shell, so anything after it never runs)
  • forgetting the & character at the end of a line when there are multiple entries (see the sketch below this list)

4- Having multiple .xinitrc files in multiple locations, which will confuse the Xorg server or make it ignore the intended .xinitrc.

5- The user doesn’t have access to the video group.

6- Not having the necessary Xorg packages installed.

7- Not having the necessary graphic firmware installed.
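Here is a hedged sketch of a multi-entry .xinitrc; picom and nm-applet are hypothetical background programs, substitute your own:

picom &
nm-applet &
exec startplasma-x11

The background programs end with & so they don’t block the rest of the file, while the session itself runs last via exec, without &.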

References

[1] X Graphic Server Documentation: https://www.x.org/wiki/UserDocumentation/
[2] Arch Xorg Documentation: https://wiki.archlinux.org/title/xorg
[3] Debian’s Xinitrc Documentation: https://wiki.debian.org/Xinitrc

Installing AUR Packages with Pacman

We will compile and install software packages from source code.

Content Table:

  • AUR vs Arch Packages
  • Installing packages from source
    • Step 1: Cloning the source code
    • Step 2: Compiling the package
    • Step 3: Installing the package
  • Post Installation: Updating & removing external packages
  • Troubleshooting makepkg
Software packages can be compiled from source code in Arch Linux.

AUR - Arch User Repository

AUR or Arch User Repository is a repo which holds PKGBUILDs or source code of various computer programs.

AUR can be used to find source code that’s ready to be compiled into packages. You will notice that while the AUR lists these packages, the source code itself is often hosted on GitHub.

Let’s get to it.

AUR vs Arch Packages

87K packages can be found on the AUR as of December 2022, while there are about 13K packages in the official Arch repositories.

Arch packages can be installed directly using the pacman package manager.

sudo pacman -S <packagename>

Installing packages through AUR requires a couple of extra steps.

In this mini-tutorial we will learn how to install packages directly from source code using 2 methods:

  • makepkg
  • pacman

1. Cloning the source code

git clone https://aur.archlinux.org/librewolf.git

Make sure git clone isn’t run as root, a common mistake…

git clone will create a folder with the necessary program files along with the PKGBUILD (a script used to build packages). You can then navigate to this folder and continue with the instructions in the following steps from inside it.

cd <newpackagefolder>

2. Compiling the package

We will use makepkg to compile source code. Makepkg is a convenient package build utility. You can read more about it here.

We can compile a package using makepkg -s.

-s flag: stands for syncing; it installs the dependencies of the package being compiled so you don’t have to handle them one by one.

makepkg -s

You can also use the following command to continue with installation of the package built from source code.

makepkg -si

If you’ve used makepkg -si, you don’t need to follow the 3rd step to install the package using pacman.

3. Installing the package

I prefer this method because I am used to using pacman to install packages.

sudo pacman -U <.zst file>
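Continuing the librewolf example from step 1, the built package lands in the build directory and can be installed like this; the exact file name depends on the version and architecture:

sudo pacman -U librewolf-*.pkg.tar.zst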

Post Installation (Updating or Removing Packages)

You can still manage packages installed through compiling source code or other external sources to an extent.

You can’t update them automatically since they were compiled manually and aren’t included in pacman’s official Arch repositories. But you can still use pacman for the related operations below:

  • remove manually installed packages
    • pacman -R <packagename>
  • get information about manually installed packages
    • pacman -Qi <packagename>

If you find that your package’s version is becoming outdated, you can update it using the same compiling and installation steps explained above, and the package will be updated.

Troubleshooting makepkg

Here are a few of the most common makepkg errors and their solutions. It’s usually a simple fix to solve makepkg errors.

1- Pacman failed to install missing dependencies

When makepkg -s is used, dependencies will be installed given they exist in the pacman repository. Sometimes this is not the case and dependencies can’t be satisfied, in which case you’ll get an error similar to the one below:

==> ERROR: 'pacman' failed to install missing dependencies.
==> Missing dependencies:
  -> ocaml-ocamlsdl
  -> ocaml-graphics
  -> lablgtk2
==> ERROR: Could not resolve all dependencies.

In that case the dependencies also need to be compiled manually. You can find them on the AUR and repeat the steps above. Once you have the dependencies installed, you can continue with makepkg -s to compile the main package you intended to install.

2- You do not have write permission for the directory $BUILDDIR

This is usually caused by wrong directory privileges, particularly the package directory being owned by root.

If you run sudo git clone you might get a write error.

The solution is to just remove the directory and rerun git clone without sudo, or change the directory ownership to your user. Simple as that.

==> ERROR: You do not have write permission for the directory $BUILDDIR (/tmp/librewolf-bin).
Aborting...

If you run git clone as root, you would be required to run makepkg as root as well. The problem is that running makepkg as root is not allowed. You can see the explanation for that specific error in the following section.

3- Running makepkg as root is not allowed as it can cause permanent damage

Makepkg should never be run as root and the system won’t allow you to do so. If you attempt running makepkg as root you will get the following error.

==> ERROR: Running makepkg as root is not allowed as it can cause permanent, catastrophic damage to your system.

The rationale behind this restriction is that if you compile source code acquired from external sources as root, you completely expose the computer and the Linux OS to intentional and unintentional risks, as these packages are created by users and added to the AUR (or another repo, or GitHub).

Resources

[1] Pacman Package Manager Documentation: https://wiki.archlinux.org/title/pacman

[2] AUR (Arch User Repository): https://aur.archlinux.org/packages

[3] Makepkg package build utility documentation: https://wiki.archlinux.org/title/makepkg

Here Comes AI

It’s May 25, 2022. Two days ago, on Monday, May 23, 2022 at 17:42:53 UTC, Google Brain published their text-to-image diffusion model implementation: Imagen.

Imagen is comparable to Dall-E (from OpenAI) which was initially launched on January 5, 2021 and then greatly improved on via Dall-E 2 on April 6, 2022. Creatively, Dall-E derives its name from a combination of Salvador Dali and Wall-E.

Dall-E 2 brought great improvements over Dall-E, and the image outcomes were just much more photorealistic and impressive.

Dall-E vs Dall-E 2 results

Imagen from Google Brain

Now, just weeks after Dall-E 2, we have Google’s Imagen published, which makes Dall-E 2’s results look primitive. Imagen’s image outcomes are so impressive that they look beyond the artistic capabilities of human designers and illustrators. Of course art and design are subjective terms, but every single Imagen image published is truly mind-blowingly accurate, well-designed and photorealistic.

Samples produced by Imagen have higher image quality and better text-image alignment.

I think it’s a tie since OpenAI pioneered this implementation and Google Brain took even Dall-E 2’s achievement to a complete next level. Scoreboard shows OpenAI 1 – 1 Google.

Both Imagen and Dall-E 2 have specific AI characteristics. They are:

  • Generative: Generative vs Discriminative Machine Learning Models.
  • Transformative: Transformative models contribute to the outcomes with nuances and perspectives similar to humans’, such as generating a visually pleasant and photorealistic image rather than just combining values. [1]
  • Diffusion Models: Diffusion models denoise images. For example a low resolution or pixelated area can be improved with diffusion models. The image with noise (particularly Gaussian noise) is converted to higher quality samples hence the pixels (and noise) are diffused resulting in a higher quality image. Diffusion models yield great quality results and are computationally more efficient than alternative methods such as autoregressive models. UC Berkeley’s Ho et al. have a fantastic research paper on denoising with diffusion models.
Youtuber Marques Brownlee has a pretty informative and entertaining video on Dall-E 2 posted in May 2022. It’s great and offers a sneak peek at some unofficial text-to-image experiments, since Marques was granted permission by OpenAI to run some tests with the model. There aren’t many videos on Imagen yet, but expect a crazy amount of discussion and content on Imagen in the next couple of years.
 
Imagen’s academic paper is also a gem for Artificial Intelligence enthusiasts.

OpenAI states, “We recognize that work involving generative models has the potential for significant, broad societal impacts.”

Google Imagen Samples: Images created based on text descriptions

Digital Image Basics

Pretty much all of these incredibly exciting advancements are based on digital image fundamentals. If it all seems too confusing, you can look into how images are represented by numbers at the pixel level and how they can be represented and manipulated numerically.

Once you are able to see images as matrices of numerical values (usually a combination of RGB: red, green and blue values, and sometimes RGBA with an alpha transparency value) covering the resolution array of the image, your whole perspective on how AI and ML can be implemented through computer vision models changes.

Societal Impacts

It’s starting to feel like we are there. AI’s real-world successes are being increasingly felt. Every new applied-AI milestone leaves your mouth open. It’s hard not to imagine the societal impacts: millions of humans who derive much of their self-worth and satisfaction from their work suddenly becoming unemployed or, even worse, irrelevant.

At this rate of AI innovation, it’s not hard to imagine the next decade being truly disrupted by AI implementations. By 2032 we will probably have an army of trained AI algorithms that can walk, talk, drive, and write better than humans. And the way AI works, we are not talking about slightly better; they will be crushingly better, and the gap will widen exponentially.

Here comes AI, and the world is definitely not ready.

Economically, emotionally, physically, socially, or theologically.

References

[1] Forecasting Transformative AI: An Expert Survey: https://arxiv.org/ftp/arxiv/papers/1901/1901.08579.pdf