Automate README updates with GitHub Actions

Today, well, yesterday actually, I learned about TILs (Today I Learned). It had always been in the back of my head that I should keep a journal of all the tiny things I learn and all the small scripts I use for one-off tasks: a database dump, scraping a webpage, resolving messed-up merge conflicts, anything. I even started doing this for my data structures and algorithms study, but it was pen and paper (very boring) and I couldn't sustain it. I started this blog so that anyone making the same Google searches I once did would land on what I needed to learn. But writing about every small thing I learn in a blog format is very time-consuming and requires a lot of research, so again, it's not sustainable. I might still write complete articles on some of my TILs. But for now, let's write some TILs and automate updating their index in the GitHub README.

Automating the update

Let me walk you through how I automated the process. It involves two parts: the workflow and a Python script. The workflow sets up an environment for running the script, runs the script, and then commits the results.

  • Here's what the GitHub Actions workflow file looks like:

  • The workflow is named "Build README" and it's triggered every time I push to the main branch. It has write permission to my repo's contents.

  •   name: Build README
    
      on:
        push:
          branches:
          - main
    
      permissions:
        contents: write
    
  • The workflow has a single job, build, made up of several steps, starting with checking out the repo into a directory named main.

  •   jobs:
        build:
          runs-on: ubuntu-latest
          steps:
          - name: Check out repo
            uses: actions/checkout@v3
            with:
              fetch-depth: 0
              path: main
    
  • Next, it sets up a Python environment pinned to version 3.10.9, then installs the Python dependencies listed in the requirements.txt file.

  •       - uses: actions/setup-python@v4
            with:
              python-version: 3.10.9
    
          - name: Install Python dependencies
            run: |
              python -m pip install --upgrade pip
              pip install -r main/requirements.txt
    
  • The next step updates the README using the update_readme.py script. The script overwrites README.md only when the --rewrite flag is passed; the step then prints the contents of README.md to the log so I can verify that the update was successful.

  •       - name: Update README
            run: |-
              cd main
              python update_readme.py --rewrite
              cat README.md
    
  • Finally, if the README file was changed, the workflow commits the change and pushes it to the main branch. The commit is attributed to a bot identity, configured with a global email of "actions@users.noreply.github.com" and a name of "README-bot".

  •       - name: Commit and push if README changed
            run: |-
              cd main
              git diff
              git config --global user.email "actions@users.noreply.github.com"
              git config --global user.name "README-bot"
              git diff --quiet || (git add README.md && git commit -m "Updated README")
              git push
    
  • The script fetches file titles and creation dates, then recreates README.md with an updated index (table of contents).

    1. Import the necessary libraries: re, pathlib, sys, timezone from datetime, and git (GitPython).

       from datetime import timezone
       import re
       import pathlib
       import sys
      
       import git
      
    2. Set the root path of the current script using pathlib.Path(__file__).parent.resolve(). Define two regular expression patterns, index_pattern and count_pattern, which match everything between a pair of HTML comment markers in the README, and a string template, count_template, for the count section.

       root_path = pathlib.Path(__file__).parent.resolve()
      
       index_pattern = re.compile(r"<!\-\- index starts \-\->.*<!\-\- index ends \-\->", re.DOTALL)
       count_pattern = re.compile(r"<!\-\- count starts \-\->.*<!\-\- count ends \-\->", re.DOTALL)
      
       count_template = "<!-- count starts -->{}<!-- count ends -->"
      
    3. get_file_created_and_updated_times: a function that takes a repository path and a reference (default "main") and returns a dictionary mapping each file path to its creation and last-updated times in ISO format, each in both the commit's own timezone and UTC. It uses the GitPython (git) library to walk the commits from oldest to newest, recording a file's creation time the first time the file appears and refreshing its updated time on every commit that touches it.

       def get_file_created_and_updated_times(repo_path, ref="main"):
           file_times = {}
           repo = git.Repo(repo_path, odbt=git.GitDB)
           commits = list(repo.iter_commits(ref))[::-1]
           for commit in commits:
               commit_time = commit.committed_datetime
               affected_files = list(commit.stats.files.keys())
               for file_path in affected_files:
                   if file_path not in file_times:
                       file_times[file_path] = {
                           "created": commit_time.isoformat(),
                           "created_utc": commit_time.astimezone(timezone.utc).isoformat(),
                       }
                   file_times[file_path].update(
                       {
                           "updated": commit_time.isoformat(),
                           "updated_utc": commit_time.astimezone(timezone.utc).isoformat(),
                       }
                   )
           return file_times
      
    4. regenerate_readme: a function that generates the table of contents for the TIL repository. It calls get_file_created_and_updated_times to get the file times and builds a list, index, that holds the table of contents. It then loops through the topic folders under the root path and the markdown files in each, takes the title from the first line of each file, adds an entry to index, and finally appends the index end marker.

      
       def regenerate_readme(repo_path):
           file_times = get_file_created_and_updated_times(repo_path)
      
           index = ["<!-- index starts -->"]
      
           for folder in root_path.glob("*/"):
               if not folder.is_dir() or folder.stem.startswith("."):
                   continue
      
               folder_path = str(folder.relative_to(root_path))
               topic = folder_path.split("/", maxsplit=1)[0]
      
               index.append(f"## {topic}\n")
      
               for file_path in root_path.glob(f"{topic}/*.md"):
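                   # file_path is absolute here, so link to its path relative to the repo root
                   file_url = f"https://github.com/Azanul/til/blob/main/{file_path.relative_to(root_path).as_posix()}"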
      
                   with file_path.open() as f:
                       title = f.readline().lstrip("#").strip()
                       date = file_times[str(file_path.relative_to(root_path))]["created_utc"].split("T")[0]
      
                   index.append(f"* [{title}]({file_url}) - {date}")
      
           index.append("<!-- index ends -->")
      
    5. Finally, the function reads README.md and substitutes the new index for everything between the index markers. If the --rewrite flag is set, it overwrites the old README.md; otherwise it prints the new index to standard output.

           readme = root_path / "README.md"
           # read_text/write_text open and close the file for us
           readme_contents = readme.read_text()
      
           # Swap everything between the index markers for the freshly built index
           index_txt = "\n".join(index).strip()
           rewritten = index_pattern.sub(index_txt, readme_contents)
           if "--rewrite" in sys.argv:
               readme.write_text(rewritten)
           else:
               print(index_txt)
      
    6. Call the function when the file is executed as a script.

       if __name__ == "__main__":
           regenerate_readme(root_path)
      
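To make the created/updated bookkeeping from step 3 easier to follow, here is a minimal sketch of the same idea without GitPython. It assumes the history is already available as a list of (commit time, changed paths) pairs ordered oldest first; the function name first_and_last_seen, the sample paths, and the sample dates are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def first_and_last_seen(commits):
    """Map each path to its first and latest commit time, in UTC ISO format.

    `commits` is an iterable of (commit_time, paths) pairs, oldest first.
    This is a stand-in for walking repo.iter_commits(); it is not GitPython.
    """
    file_times = {}
    for commit_time, paths in commits:
        stamp = commit_time.astimezone(timezone.utc).isoformat()
        for path in paths:
            # setdefault stores "created_utc" only the first time a path is seen
            entry = file_times.setdefault(path, {"created_utc": stamp})
            # "updated_utc" is overwritten by every later commit touching the path
            entry["updated_utc"] = stamp
    return file_times


# Hypothetical history, committed from a UTC+05:30 machine
ist = timezone(timedelta(hours=5, minutes=30))
history = [
    (datetime(2023, 1, 10, 9, 0, tzinfo=ist), ["python/pydantic.md"]),
    (datetime(2023, 1, 12, 1, 0, tzinfo=ist), ["python/pydantic.md", "git/rebase.md"]),
]

times = first_and_last_seen(history)
print(times["python/pydantic.md"])
print(times["git/rebase.md"]["created_utc"].split("T")[0])
```

Note that the second commit happens on 12 January local time but on 11 January in UTC, so an index built from the UTC date, as the script does, can show a date one day off from the local calendar.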

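The marker-based replacement in step 5 may be easier to see on a tiny README. The sketch below is only an illustration: the README text, the entry title, and the example.com URL are invented, and the pattern is written without the escaped hyphens, which should match the same text.

```python
import re

# Same idea as index_pattern in the script: match the markers and everything between them
index_pattern = re.compile(r"<!-- index starts -->.*<!-- index ends -->", re.DOTALL)

# A made-up README with a stale index
readme = """# TIL

<!-- index starts -->
old entries
<!-- index ends -->

Footer text stays untouched.
"""

# A made-up freshly generated index
new_index = "\n".join(
    [
        "<!-- index starts -->",
        "## python\n",
        "* [Pydantic basics](https://example.com/python/pydantic.md) - 2023-01-10",
        "<!-- index ends -->",
    ]
)

# Passing a function as the replacement means backslashes in new_index
# are inserted literally instead of being read as regex escapes
rewritten = index_pattern.sub(lambda _match: new_index, readme)
print(rewritten)
```

Because the markers are HTML comments, they stay invisible in the rendered README, and everything outside them survives each regeneration. The lambda is a small design choice of this sketch, not something the script does: it only matters if a TIL title ever contains a backslash.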
Writing the TIL

We're going to create a blank repo named til. Let's say I learned about Pydantic today. I'm going to create a folder for the topic it falls under and, inside it, a markdown file (first time writing a .md file? visit). Then write down whatever you learned. Here's what I wrote.
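As a rough sketch of the layout the script expects, the snippet below builds one topic folder with one TIL inside a temporary directory and then reads it back the way regenerate_readme does: the folder name becomes the topic and the first line of the file becomes the title. The folder name python, the file name pydantic-basics.md, and the file contents are assumptions for illustration, not my actual TIL.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)

    # Hypothetical topic folder and TIL file
    topic_dir = root / "python"
    topic_dir.mkdir()
    til_file = topic_dir / "pydantic-basics.md"
    til_file.write_text(
        "# Pydantic basics\n\nPydantic validates data using type hints.\n",
        encoding="utf-8",
    )

    entries = []
    for folder in sorted(root.iterdir()):
        # Skip plain files and hidden folders such as .git or .github
        if not folder.is_dir() or folder.name.startswith("."):
            continue
        for md in sorted(folder.glob("*.md")):
            with md.open(encoding="utf-8") as f:
                # The first line is the heading; strip the leading "#" characters
                title = f.readline().lstrip("#").strip()
            entries.append((folder.name, title, md.relative_to(root).as_posix()))

print(entries)
```

So a TIL only needs two things to show up in the index: it lives one folder deep, and its first line is a heading.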

Let's make a small change in our TIL and see if everything is working fine or not.

Seems like everything is working, awesome!

Never stop learning

So, there you have it! Automating the update of my TILs on my GitHub profile has made the process of keeping track of my learning journey so much more sustainable. Now, every time I learn something new, I can quickly add it to my TILs without having to worry about updating my GitHub profile manually.

To get notified of similar articles follow me on Twitter at AzanulZ or connect with me on LinkedIn, Azanul Haque.
