Analyze Markdown Documents Used In Static Sites For SEO

Installation

pip install mdseo

Usage

mdseo provides CLI tools that check various statistics and metadata in markdown files. If an unwanted property is discovered, an error is raised. An overview of the CLI tools is below:

!mdseo_dupe_title -h
usage: mdseo_dupe_title [-h] [--srcdir SRCDIR]

Check for duplicate titles. Ignore with front matter `mdseo-ignore:
[dupe_title]`

optional arguments:
  -h, --help       show this help message and exit
  --srcdir SRCDIR  directory of files to check (default: .)
!mdseo_len -h
usage: mdseo_len [-h] [--n N] [--srcdir SRCDIR]

Check if docs contain less than `n` words. Ignore with front matter `mdseo-
ignore: [length]`

optional arguments:
  -h, --help       show this help message and exit
  --n N            minimum number of words a document should contain (default:
                   50)
  --srcdir SRCDIR  directory of files to check (default: .)
!mdseo_chk_fm -h
usage: mdseo_chk_fm [-h] [--srcdir SRCDIR] [--minlen MINLEN] [--maxlen MAXLEN]
                    {description,slug,image,authors}

Check front matter for various rules.

positional arguments:
  {description,slug,image,authors}  front matter field to check

optional arguments:
  -h, --help                        show this help message and exit
  --srcdir SRCDIR                   directory of files to check (default: .)
  --minlen MINLEN                   the minimum character length allowed for the
                                    field
  --maxlen MAXLEN                   the maximum character length allowed for the
                                    field
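
To run several of these checks in one step, for example in a CI job, one option is a small Python wrapper around the commands above. This is a minimal sketch and not part of mdseo: the `CHECKS` list and the `run_checks` helper are hypothetical names. It assumes the mdseo commands are on your `PATH` and that a failed check exits with a non-zero status (which is what an uncaught Python exception produces).

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical selection of checks. Each entry is an argv list built only
# from the commands and flags shown in the help output above.
CHECKS: List[List[str]] = [
    ["mdseo_dupe_title"],
    ["mdseo_len", "--n", "50"],
    ["mdseo_chk_fm", "description", "--minlen", "50", "--maxlen", "300"],
    ["mdseo_chk_fm", "slug", "--maxlen", "45"],
]


def _exit_code(argv: Sequence[str]) -> int:
    """Run one command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def run_checks(srcdir: str = ".",
               runner: Callable[[Sequence[str]], int] = _exit_code) -> List[str]:
    """Run every check against `srcdir` and return the commands that failed."""
    failed = []
    for check in CHECKS:
        argv = check + ["--srcdir", srcdir]
        if runner(argv) != 0:
            failed.append(" ".join(argv))
    return failed
```

Calling `run_checks("docs")` returns the list of failing commands, so a CI script can print them and exit non-zero when the list is not empty. The `runner` parameter exists only so the loop can be exercised without the commands installed.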

Examples

Check that description is between 50 and 300 characters:

!mdseo_chk_fm description --minlen 50 --maxlen 300
Traceback (most recent call last):
  File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
    sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
  File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
    tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 101, in chk_fm
    return _checker(partial(_min_len_err, key=key, n=minlen),
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
    if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files have the field `description` in their front matter that is less than 50 characters:
	./test_files/front_matter3.md

Check that the front matter slug exists:

!mdseo_chk_fm slug
Traceback (most recent call last):
  File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
    sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
  File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
    tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 107, in chk_fm
    _checker(partial(_missing_fm, key=key), f"do not have the field `{key}` in their front matter", srcdir)
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
    if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files do not have the field `slug` in their front matter:
	./CONTRIBUTING.md
	./test_files/false_fm2.md
	./test_files/false_fm.md
	./test_files/test_docs.md

Check that the front matter slug is no longer than 45 characters:

!mdseo_chk_fm slug --maxlen 45
Traceback (most recent call last):
  File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
    sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
  File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
    tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 104, in chk_fm
    return _checker(partial(_max_len_err, key=key, n=maxlen),
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
    if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files have the field `slug` in their front matter that is greater than 45 characters:
	./test_files/front_matter_test_docs.md

Check that the front matter authors exists:

!mdseo_chk_fm authors
Traceback (most recent call last):
  File "/Users/hamel/opt/anaconda3/bin/mdseo_chk_fm", line 33, in <module>
    sys.exit(load_entry_point('mdseo', 'console_scripts', 'mdseo_chk_fm')())
  File "/Users/hamel/github/fastcore/fastcore/script.py", line 113, in _f
    tfunc(**merge(args, args_from_prog(func, xtra)))
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 107, in chk_fm
    _checker(partial(_missing_fm, key=key), f"do not have the field `{key}` in their front matter", srcdir)
  File "/Users/hamel/github/mdseo/mdseo/core.py", line 89, in _checker
    if fnames: raise Exception(f"The following files {msg}:\n\t{files}")
Exception: The following files do not have the field `authors` in their front matter:
	./CONTRIBUTING.md
	./test_files/front_matter2.md
	./test_files/false_fm2.md
	./test_files/front_matter_test_docs.md
	./test_files/false_fm.md
	./test_files/test_docs.md
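
The three kinds of rule shown in these examples (field is present, field is at least `minlen` characters, field is at most `maxlen` characters) can be illustrated with a short Python sketch. This is not mdseo's implementation: `read_front_matter` and `field_problem` are hypothetical helpers, and the parser only understands flat `key: value` lines, whereas real front matter is YAML and is better handled by a YAML parser.

```python
from typing import Dict, Optional


def read_front_matter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing fence: treat the file as having no front matter


def field_problem(text: str, key: str,
                  minlen: Optional[int] = None,
                  maxlen: Optional[int] = None) -> Optional[str]:
    """Return a description of the first rule the field breaks, or None."""
    fields = read_front_matter(text)
    if key not in fields:
        return f"missing `{key}`"
    n = len(fields[key])
    if minlen is not None and n < minlen:
        return f"`{key}` is less than {minlen} characters"
    if maxlen is not None and n > maxlen:
        return f"`{key}` is greater than {maxlen} characters"
    return None
```

A file passes when `field_problem` returns `None`. Collecting the file names for which it returns a message gives a report similar in spirit to the exception messages above.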

Ignoring Checks

You may wish to ignore checks on individual files. There are two ways to do this: (1) through a special front matter field called mdseo-ignore, or (2) by placing the text mdseo-ignore-all in your markdown file.

With Front Matter

To ignore a check via front matter, supply the proper value(s) in the mdseo-ignore field in your front matter. For example, to ignore the duplicate title check (mdseo_dupe_title) and the image check in a particular markdown file, add the following front matter:

---
mdseo-ignore: [dupe_title, image]
---

You can find these values by consulting the help of the appropriate CLI command. For example, mdseo_dupe_title -h says:

... Ignore with front matter `mdseo-ignore: [dupe_title]`

If you want to ignore all SEO rules, you can also pass all, like so:

---
mdseo-ignore: all
---

There is a generic command, mdseo_chk_fm, that checks the presence, minimum length, and maximum length of a front matter field. You can ignore any check conducted by this command by adding the appropriate field names to mdseo-ignore. These are the fields that you can ignore:

description, slug, image, authors

For example, if you wanted to ignore all of these fields, you could put the following in your front matter:

---
mdseo-ignore: [description,slug,image,authors]
---
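
To make the two forms of the mdseo-ignore value concrete (a bracketed list of names, or the single word all), here is a small Python sketch of how such a value could be interpreted. It is an illustration only, not mdseo's code: `parse_ignore` and `is_ignored` are hypothetical names, and the sketch works on the raw text of the value rather than on parsed YAML.

```python
from typing import Set


def parse_ignore(value: str) -> Set[str]:
    """Turn the raw text of an `mdseo-ignore` value into a set of names."""
    value = value.strip()
    # Strip the surrounding brackets of a list such as `[dupe_title, image]`.
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    return {item.strip() for item in value.split(",") if item.strip()}


def is_ignored(check: str, ignore_value: str) -> bool:
    """True when `check` is listed, or when the value contains `all`."""
    names = parse_ignore(ignore_value)
    return "all" in names or check in names
```

For example, `is_ignored("image", "[dupe_title, image]")` is true, while `is_ignored("slug", "[dupe_title, image]")` is false.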

With The Keyword mdseo-ignore-all

Some markdown files may not have front matter, or it may not be appropriate to add front matter to a file. In this case, you can place the text mdseo-ignore-all anywhere in the file and all checks will be ignored. The most common way to add this keyword is with a markdown comment:

<!-- mdseo-ignore-all -->
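
Because the keyword only needs to appear somewhere in the file, the test for it can be a plain substring search. The following Python sketch shows the idea under that assumption; `skip_all_checks` is a hypothetical helper, not a function exported by mdseo.

```python
IGNORE_ALL_KEYWORD = "mdseo-ignore-all"


def skip_all_checks(markdown_text: str) -> bool:
    """True when the keyword appears anywhere in the file's text."""
    return IGNORE_ALL_KEYWORD in markdown_text
```

Note that the front matter form `mdseo-ignore: all` does not contain this keyword, so the two mechanisms stay independent in this sketch.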