Moving Site from GitHub

Generate the redirects for all pages

Hi All,

So yesterday I decided to move my site from github.io to one of my personal domains. I am not sure why I decided this; GitHub provides a great free service, and I had lots of automation in place. In this article I will just cover how I went about updating all pages with redirects from GitHub to the new domain.

The Goal

My goal was to leave my old site up on GitHub for a few weeks to give Google some time to update its index, with every old page redirecting to its new counterpart. This requires a redirect in place for each page. After that I will remove the GitHub site repo.

Limitations with GitHub

From my understanding, there is no support for server-side redirects. This left me with the option of using HTML/JavaScript redirects.

Resources and Research

I did a fair amount of reading before tackling this mini project.

Implementation

I chose to go with the HTML rewrite approach.
One of the reasons for this is that I create my site using two GitHub repos. One is private, with some workflow actions that populate the second, public repo. The public repo only has the generated webpage code.

Initial Setup

Since I am focusing on the rewrite script, I am going to assume you have done the following things.

  1. Got the new site up and running on the new domain.
  2. Disabled Google Analytics and AdSense on the old site, or already moved these properties to the new domain if you are using them.

Getting things into place.

In my case this was pretty straightforward. My site is served from a repo that only contains the page code. So, working in a separate area from my normal local copy of the private repo, I simply cloned the public site to my local machine.

git clone git@github.com:Bas-Man/bas-man.github.io.git

Within this directory, I then removed files and folders that would not be needed.

  • css
  • images
  • fonts

Those sorts of things.

The script components

New HTML file contents

I decided to go with the following HTML code, with some adjustments.

<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="refresh" content="0;url={{THE_NEW_URL}}" />
        <link rel="canonical" href="{{THE_NEW_URL}}" />
    </head>
    <body>
        <h1>
            The page has been moved to <a href="{{THE_NEW_URL}}">{{THE_NEW_URL}}</a>
        </h1>
    </body>
</html>

Within this local repo I created a file called source with the following contents:

<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="refresh" content="0;url=https://bas-man.dev/NAME" />
        <link rel="canonical" href="https://bas-man.dev/NAME" />
    </head>
    <body>
        <h1>
            The page has been moved to <a href="https://bas-man.dev/NAME">https://bas-man.dev</a>
        </h1>
    </body>
</html>

The value NAME is just a placeholder. It will be replaced in each file using the good old trusty sed command.
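To make the substitution concrete, here is a minimal sketch that applies it to a single line rather than the whole source file. The one-line template and the variable names are only for illustration.

```shell
# Hypothetical one-line template; the real "source" file holds the full HTML page.
template='<link rel="canonical" href="https://bas-man.dev/NAME" />'
page='categories/coding/index.html'

# "|" is used as the sed delimiter so the "/" characters in the path need no escaping.
result=$(printf '%s\n' "$template" | sed "s|NAME|${page}|g")
printf '%s\n' "$result"
# prints: <link rel="canonical" href="https://bas-man.dev/categories/coding/index.html" />
```

This assumes the path contains no characters that sed treats specially in a replacement, such as & or |.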

Traversing the Directory Structure

For this I chose the following.

find . -print0 | while IFS= read -r -d '' file
do 
    echo "$file"
done

I did some testing, which produced a huge amount of output.

--snip--
./categories/coding/page/4/index.html
./categories/coding/page/3
./categories/coding/page/3/index.html
./categories/coding/page/2
./categories/coding/page/2/index.html
./categories/coding/page/5
./categories/coding/page/5/index.html
./categories/coding/index.html
./categories/automation
./categories/automation/page
./categories/automation/page/1
./categories/automation/page/1/index.html
./categories/automation/index.html
./categories/front-end
./categories/front-end/page

Straight away I noticed there would be an issue with the ./ at the start of each line. But I will get to that.
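For reference, here is a minimal sketch of removing that prefix from one sample path taken from the listing. It uses the POSIX ${file#./} expansion, which is an alternative to the offset-based expansion in the final script. The variable names are only illustrative.

```shell
# Sample path as printed by find.
file='./categories/coding/index.html'

# "${file#./}" removes a leading "./" if one is present and leaves other paths untouched.
stripped="${file#./}"
printf '%s\n' "$stripped"
# prints: categories/coding/index.html
```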

What do I actually want to do

What I want to do is replace the contents of every file ending in .html with the contents of source, where NAME has been replaced with the path to that HTML file. Let’s take a look at a simple example from above.

./categories/coding/index.html needs to have its contents replaced with the contents of source but with NAME replaced with categories/coding/index.html.

Example
<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="refresh" content="0;url=https://bas-man.dev/categories/coding/index.html" />
        <link rel="canonical" href="https://bas-man.dev/categories/coding/index.html" />
    </head>
    <body>
        <h1>
            The page has been moved to <a href="https://bas-man.dev/categories/coding/index.html">https://bas-man.dev</a>
        </h1>
    </body>
</html>

The final Script and Break Down

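# Requires bash: ${file: -4} and ${file:2} are bash substring expansions.
find . -print0 | while IFS= read -r -d '' file
do
    # Only rewrite paths whose last four characters are "html".
    if [ "${file: -4}" == "html" ] ; then
       # Fill in the template, dropping the leading "./" from the path.
       cat ./source | sed "s|NAME|${file:2}|g" > "${file}"
    fi
done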
The if statement
  • This checks whether the file name ends in html. To do this we use ${file: -4}, which returns the last 4 characters of file. If they match, we replace the contents of the file.
Re-write the file contents
  • Replacing the contents is done using classic old tools: cat, sed and >, the output redirect.
    • I pass the contents of source to sed.
    • sed then replaces NAME with the value of file starting from the 3rd character. Offsets count from 0, so ${file:2} skips characters 0 and 1 (the leading ./) and starts at the 3rd.
    • I use double quotes since I am doing interpolation.
    • I use | instead of / as my separator since my string also contains / characters.
    • I use the g flag since I need to replace all occurrences of NAME. g stands for global: every match on a line is replaced, not just the first.
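As a variation on the same idea, here is a hedged sketch that lets find select the .html files and uses only POSIX shell features. It is not the script I ran. generate_redirects is a hypothetical helper, and the demo runs against a throwaway directory with a one-line template so it can be tried safely.

```shell
# Sketch only: generate_redirects is a hypothetical helper.
# $1 = site directory, $2 = absolute path to a template containing NAME.
generate_redirects() {
    site_dir=$1
    template=$2
    (
        cd "$site_dir" || exit 1
        # find picks the files, so no suffix test is needed in the loop.
        find . -type f -name '*.html' -exec sh -c '
            template=$1; shift
            for file in "$@"; do
                # Strip the leading "./" and fill in the placeholder.
                sed "s|NAME|${file#./}|g" "$template" > "$file"
            done
        ' sh "$template" {} +
    )
}

# Throwaway demo tree: one HTML page and one file that must stay untouched.
demo=$(mktemp -d)
mkdir -p "$demo/site/categories/coding"
echo 'old page' > "$demo/site/categories/coding/index.html"
echo 'body { }' > "$demo/site/style.css"
printf '%s\n' '<meta http-equiv="refresh" content="0;url=https://bas-man.dev/NAME" />' > "$demo/source"

generate_redirects "$demo/site" "$demo/source"
cat "$demo/site/categories/coding/index.html"
```

Keeping the template outside the site directory means it never needs to be excluded from the rewrite. As with the original, this assumes file names contain no characters that are special in a sed replacement, such as & or |.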

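With this done, I took the easy way and ran the script by passing it directly to the shell. The script was simply called update. Because it relies on bash substring expansion, it needs to be run with bash; on systems where sh is dash (such as Debian or Ubuntu), sh update fails with a bad substitution error.

bash update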

And all the HTML files are updated with the correct references to the exact same pages, but to the new domain.
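A quick sanity check can confirm that nothing was missed. The sketch below lists any .html file that does not mention the new domain. list_unconverted is a hypothetical helper, and the demo directory exists only to show the behaviour.

```shell
# Sketch only: print every .html file under $1 that lacks a link to the new domain.
list_unconverted() {
    find "$1" -type f -name '*.html' -exec sh -c '
        for file in "$@"; do
            # -F treats the pattern as a fixed string, -q suppresses output.
            grep -qF "https://bas-man.dev/" "$file" || printf "%s\n" "$file"
        done
    ' sh {} +
}

# Throwaway demo: one converted page and one that was missed.
check_dir=$(mktemp -d)
printf '%s\n' '<a href="https://bas-man.dev/index.html">moved</a>' > "$check_dir/done.html"
printf '%s\n' '<p>old content</p>' > "$check_dir/missed.html"

list_unconverted "$check_dir"
```

In the demo only missed.html is listed. Against the real checkout, empty output would mean every page carries a redirect.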

Update the old site

From here it was a simple matter to update the repo with some git commands and push the updates back up to GitHub.

git add .
git reset -- update
git reset -- source
git commit -m "my comment"
git push -u origin

The resets were just to remove the two files I didn’t need pushed to the live site.

Closing

That’s about it. It seems to have worked pretty well. I just need to keep an eye on things and see if I need to make any changes.

Hope this might help others.

