Recursive Grep

In this bash tutorial we will check out recursive grep usage examples and how it can be extremely beneficial in some use cases.

Table of Contents

  • Basics of grep -R
    1. Excluding Directories
    2. How to remove “Permission denied” messages
    3. Ignoring case sensitivity
    4. You can combine grep with regex
  • Benefits of recursive grep

grep -R, Recursive Query Function

grep -R can be used for recursive text search across many files in a directory. It can be extremely useful and can be applied to many different application cases.

1. Excluding Directories

Sometimes specific folders clutter the results when you use grep recursively. For example, some folders may be full of the text you are searching for, while you actually want to look for it everywhere else under the same structure.

A typical case is searching under the root folder, where the proc directory becomes a nuisance in the results. It contains per-process information along with CPU and kernel details, which is usually irrelevant to the search. Here is an example that excludes the proc directory, but the same option works for any directory you like.

grep -R --exclude-dir=proc "import requests" /

grep recursively looks for “import requests” under / (root) while excluding the /proc directory. GNU grep documents --exclude-dir as matching a subdirectory's base name, so the name is given without a leading slash. Note that this skips every directory named proc, not only the top-level one.
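
If you would rather try the option on something smaller than the whole filesystem first, here is a minimal sketch. The folder and file names (src, cache, app.py, old.py) are made up for the demo and are not part of the original example.

```shell
# Build a small throwaway tree (hypothetical names) to try --exclude-dir safely.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/cache"
echo "import requests" > "$demo/src/app.py"
echo "import requests" > "$demo/cache/old.py"

# Search everything under $demo, skipping any directory whose base name is "cache".
result=$(grep -R --exclude-dir=cache "import requests" "$demo")
echo "$result"

# Clean up the throwaway tree.
rm -rf "$demo"
```

Only the file under src should be listed, even though cache holds a file with the same content.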

2. How to remove "Permission denied" messages

grep -R  "import" /

Sometimes you have to run grep without sudo, or some system directories are unreadable even for the admin. In those cases a command like the one above can return hundreds of error lines similar to those below, which makes the results quite hard to read.

Here is an excerpt of the output from the command above.

/usr/share/doc/libjpeg-turbo/TJExample.java:import java.io.*;
grep: /proc/net/ip_tables_names: Permission denied
grep: /proc/net/ip_tables_matches: Permission denied
grep: /proc/net/ip_tables_targets: Permission denied
grep: /proc/sys/fs/binfmt_misc/register: Permission denied
grep: /proc/sys/kernel/cad_pid: Permission denied
grep: /proc/sys/kernel/usermodehelper/bset: Permission denied
grep: /proc/sys/kernel/usermodehelper/inheritable: Permission denied
grep: /proc/sys/net/core/bpf_jit_harden: Permission denied
grep: /proc/sys/net/core/bpf_jit_kallsyms: Permission denied
grep: /proc/sys/net/core/bpf_jit_limit: Permission denied
grep: /proc/sys/net/ipv4/route/flush: Permission denied
grep: /proc/sys/net/ipv4/tcp_fastopen_key: Permission denied
grep: /proc/sys/net/ipv6/conf/all/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/default/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/lo/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/conf/wlp58s0/stable_secret: Permission denied
grep: /proc/sys/net/ipv6/route/flush: Permission denied
grep: /proc/sys/vm/compact_memory: Permission denied
grep: /proc/sys/vm/drop_caches: Permission denied
grep: /proc/sys/vm/mmap_rnd_bits: Permission denied
grep: /proc/sys/vm/mmap_rnd_compat_bits: Permission denied
grep: /proc/sys/vm/stat_refresh: Permission denied
grep: /proc/tty/driver: Permission denied
grep: /proc/kmsg: Permission denied
grep: /proc/mtrr: Operation not permitted
grep: /proc/kcore: Permission denied
grep: /proc/slabinfo: Permission denied
grep: /proc/kpagecount: Permission denied
grep: /proc/kpageflags: Permission denied
grep: /proc/timer_list: Permission denied
grep: /proc/kpagecgroup: Permission denied
grep: /proc/vmallocinfo: Permission denied
grep: /proc/pagetypeinfo: Permission denied
grep: /proc/sysrq-trigger: Permission denied

Permission denied messages are written to standard error, so they can be hidden by appending 2> /dev/null to the command. Here is an example.

grep -R "import" / 2> /dev/null

/usr/share/doc/libjpeg-turbo/TJExample.java:import java.io.*;
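
Here is a small sketch comparing two ways to silence these messages: the redirect, and grep's own -s (--no-messages) flag. The file names are hypothetical, and chmod 000 only produces a "Permission denied" error when you are not running as root.

```shell
# Throwaway folder with one readable and one locked file (hypothetical names).
demo=$(mktemp -d)
echo "import os"  > "$demo/readable.py"
echo "import sys" > "$demo/locked.py"
chmod 000 "$demo/locked.py"   # unreadable for non-root users

# Option 1: discard everything grep writes to standard error.
# "|| true" is there because grep exits non-zero when it hits an unreadable file.
quiet=$(grep -R "import" "$demo" 2> /dev/null) || true

# Option 2: -s (--no-messages) asks grep itself not to report unreadable files.
silent=$(grep -R -s "import" "$demo") || true

echo "$quiet"

# Restore permissions so the folder can be removed, then clean up.
chmod 644 "$demo/locked.py"
rm -rf "$demo"
```

Both variables should hold the same matches. The redirect hides every error grep prints, while -s only covers messages about nonexistent or unreadable files.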

3. Ignoring case sensitivity

grep -R --ignore-case "IMPORT" /

You can also make the search case-insensitive, as above. The short form of --ignore-case is -i.
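
To see the difference, this short sketch counts matching lines with and without the flag. The sample file and its contents are invented for the demo.

```shell
# One sample file with the same word in three different casings (made-up content).
demo=$(mktemp -d)
printf 'IMPORT x\nimport y\nImport z\nexport w\n' > "$demo/sample.txt"

# -c prints "file:count". By default the match is case-sensitive.
sensitive=$(grep -R -c "import" "$demo")

# -i (same as --ignore-case) also matches IMPORT and Import.
insensitive=$(grep -R -i -c "import" "$demo")

echo "$sensitive"     # ends in :1
echo "$insensitive"   # ends in :3

rm -rf "$demo"
```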

4. You can combine grep with regex

Regex is beyond the scope of this tutorial, but you can combine grep with regular expressions to create very specifically tailored queries. This can be useful in server administration, cloud migration, cloud management, devops operations and similar industry-level tasks. For example, the following illustrative command only matches lines that start with import requests or from requests:

grep -R -E "^(import|from) requests" /
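
As one more hedged illustration of a tailored query, the sketch below pulls dollar amounts out of text files with an extended regular expression. The file content and the price format (digits, a dot, two decimals) are assumptions made for the demo.

```shell
# Sample notes with two prices and one line that only contains a plain number.
demo=$(mktemp -d)
cat > "$demo/notes.txt" <<'EOF'
Coffee costs $4.50 today.
The invoice total was $1200.00.
No price on this line, just 450 units.
EOF

# -E: extended regex, -o: print only the matched text, -h: leave out file names.
# \$ is a literal dollar sign, [0-9]+ is one or more digits, {2} means exactly two.
prices=$(grep -R -E -o -h '\$[0-9]+\.[0-9]{2}' "$demo")
echo "$prices"

rm -rf "$demo"
```

The third line is skipped because 450 is not preceded by a dollar sign and has no decimals.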

Benefits of Recursive Grep

Using grep recursively can be incredibly useful. Here are a few examples.

  • You can find files without knowing their names. Imagine you recovered some data but the file names are a mess, say 100K files with meaningless names. You could use grep -R to search them recursively and pick out files by their content. For example, searching for import requests or another snippet of Python code will surface the Python files.
  • You can find specific text strings across many text files. You can search for an Einstein quote or a price without opening each file.
  • Applying similar logic, you can filter HTML files or even query images. Many images carry EXIF metadata, which can be used to find specific images as well.

grep -R treats the contents of each file as text, goes through every file under the specified folder and prints the matching results.
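
The "messy file names" case can be sketched like this. The -l flag prints only the names of files that contain a match, which is usually what you want when recovering files by content. All file names below are made up.

```shell
# Files with meaningless names, two of which contain Python code.
demo=$(mktemp -d)
mkdir "$demo/sub"
echo "import requests"  > "$demo/x81f.dat"
echo "plain text notes" > "$demo/zz_03.tmp"
printf 'import requests\nimport json\n' > "$demo/sub/q7"

# -l lists each matching file once, no matter how many lines match.
found=$(grep -R -l "import requests" "$demo" | sort)
echo "$found"

rm -rf "$demo"
```

The result is a plain list of paths, so it can be piped into other commands, for example to copy or rename the files that were found.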


Here Comes AI

It’s May 25, 2022. Two days ago, on Monday, May 23, 2022 at 17:42:53 UTC, Google Brain published their text-to-image diffusion model implementation: Imagen.

Imagen is comparable to Dall-E (from OpenAI), which was initially launched on January 5, 2021 and then greatly improved upon with Dall-E 2 on April 6, 2022. Creatively, Dall-E derives its name from a combination of Salvador Dalí and Wall-E.

Dall-E 2 was a big improvement over Dall-E: its images were far more photorealistic and impressive.

Dall-E vs Dall-E 2 results

Imagen from Google Brain

Now, less than two months after Dall-E 2, Google has published Imagen, which makes Dall-E 2 results look primitive. Imagen’s images are so impressive that they seem to go beyond the artistic capabilities of human designers and illustrators. Of course, art and design are subjective, but every single published Imagen image is mind-blowingly accurate, well-designed and photorealistic.

Samples produced by Imagen have higher image quality and better text-image alignment.

I think it’s a tie, since OpenAI pioneered this approach and Google Brain took even Dall-E 2’s achievement to a whole new level. The scoreboard shows OpenAI 1 – 1 Google.

Both Imagen and Dall-E 2 share specific AI characteristics. They are:

  • Generative: Both produce new data (here, images) rather than classifying existing data; see Generative vs Discriminative Machine Learning Models.
  • Transformative: Transformative models contribute to the outcomes with nuances and perspectives similar to those of humans, such as generating a visually pleasant and photorealistic image rather than just combining values. [1]
  • Diffusion Models: Diffusion models generate images by denoising. During training, Gaussian noise is gradually added to images and the model learns to reverse that process. At generation time it starts from noise and removes it step by step until a high-quality sample remains. The same idea can be used to improve a low-resolution or pixelated image. Diffusion models yield great quality results and can be computationally more efficient than alternative methods such as autoregressive models. UC Berkeley’s Ho et al. have a fantastic research paper on denoising with diffusion models.
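
For readers who like to see the math, here is a rough sketch of the noising and denoising idea in the notation of the Ho et al. paper. The symbols (the noise schedule β_t, the network ε_θ) come from that paper and are not defined elsewhere in this post, and Imagen's exact formulation may differ in its details.

```latex
% Forward (noising) step: add a little Gaussian noise at each step t
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)

% Closed form after t steps, with \alpha_t = 1-\beta_t and \bar\alpha_t = \prod_{s=1}^{t}\alpha_s
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)

% Simplified training objective: the network learns to predict the noise that was added
L_{\text{simple}} = \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2\right]
```

In words: the image x_0 is blended with more and more noise as t grows, and the model is trained to guess which noise was mixed in, which is what lets it run the process backwards.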

YouTuber Marques Brownlee has a pretty informative and entertaining video on Dall-E 2, posted in May 2022. It’s great and offers a sneak peek at some unofficial text-to-image experiments, since OpenAI granted Marques permission to test the model. There aren’t many videos on Imagen yet, but expect a crazy amount of discussion and content on Imagen in the next couple of years.

Imagen’s academic paper is also a gem for Artificial Intelligence enthusiasts.

OpenAI states “We recognize that work involving generative models has the potential for significant, broad societal impacts.”

Google Imagen Samples: Images created based on text descriptions

Digital Image Basics

Pretty much all of these incredibly exciting advancements are based on digital image fundamentals. If this seems confusing, you can look into how images are represented by numbers at the pixel level, and how they can be manipulated numerically, through these tutorials:

Once you are able to see images as matrices of numerical values (usually a combination of RGB: red, green and blue, and sometimes RGBA with an alpha transparency value) covering the resolution of the image, your whole perspective on how AI and ML can be implemented through computer vision models changes.
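
To make "images are just numbers" concrete, here is a tiny sketch that writes a 2x2 image in the plain-text PPM (P3) format and reads the pixel values back. The file name and the four colors are chosen only for illustration.

```shell
# A 2x2 image in plain-text PPM: "P3", width and height, max value, then R G B per pixel.
demo=$(mktemp -d)
cat > "$demo/tiny.ppm" <<'EOF'
P3
2 2
255
255 0 0     0 255 0
0 0 255     255 255 255
EOF

# Skip the 3 header lines, put one number per line, then regroup as one R G B triple per line.
pixels=$(tail -n +4 "$demo/tiny.ppm" | tr -s ' ' '\n' | paste -d' ' - - -)
echo "$pixels"

rm -rf "$demo"
```

The four output lines are the red, green, blue and white pixels. A real photo is the same idea with millions of triples instead of four.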

Societal Impacts

It’s starting to feel like we are there. AI’s real-world successes are increasingly being felt. Every new applied-AI milestone leaves you open-mouthed. It’s hard not to imagine the societal impacts: millions of people who derive much of their self-worth and satisfaction from their work suddenly becoming unemployed or, even worse, irrelevant.

At this rate of AI innovation, it’s not hard to imagine the next decade being truly disrupted by AI. By 2032 we will probably have an army of trained AI algorithms that can walk, talk, drive and write better than humans. And the way AI works, we are not talking about slightly better: they will be crushingly better, and the gap will widen exponentially.

Here comes AI, and the world is definitely not ready.

Not economically, emotionally, physically, socially or theologically.

References

[1] Forecasting Transformative AI: An Expert Survey: https://arxiv.org/ftp/arxiv/papers/1901/1901.08579.pdf