How To Make an XML Sitemap for Google/Yahoo

First we must sign up for Google Webmaster Tools and Yahoo Site Explorer. Live Search still doesn’t have a webmaster console where you can add sitemaps (they did announce that one is in the making).

Then we verify ourselves using their verification methods. Once verified we could add our sitemap, but first we have to construct it, so let's make it. We begin by making a template group called ‘sitemap’. Why, you ask? Since Google and Yahoo allow you to add multiple sitemaps, you can make a separate sitemap for each section or weblog. This is useful because a single sitemap can hold a maximum of 50,000 URLs or 10 MB in size, and this way you can manage different sitemaps within one template group. The same method works when you want to assign different priorities to different sections of your site: make a sitemap for each section with its own priority and settings.
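
To get a feel for the 50,000-URL limit, here is a rough sketch in Python (outside ExpressionEngine) that splits a list of URLs into groups small enough for one sitemap each. The function name split_for_sitemaps and the example URLs are made up for illustration, and the sketch only looks at the URL count, not the 10 MB size limit.

```python
MAX_URLS_PER_SITEMAP = 50_000  # URL limit per sitemap mentioned above


def split_for_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks small enough for one sitemap each."""
    if max_urls < 1:
        raise ValueError("max_urls must be at least 1")
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


# Hypothetical example: 120,000 entry URLs need three sitemaps.
urls = [f"http://www.mydomain.com/index.php/site/article/entry_{n}" for n in range(120_000)]
chunks = split_for_sitemaps(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

In ExpressionEngine terms, each chunk would correspond to one template in the ‘sitemap’ template group.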

Now that we have made our template group, we can start making our template. Press the ‘New Template’ link to make a new template. We will name it ‘articles-sitemap’, set the template type to ‘RSS Page’ and hit submit.

Click on the newly created template so we can start editing. Paste the following inside of your template:

{assign_variable:weblog_name="default_site"}<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc>{homepage}</loc>
<lastmod>{exp:stats}{last_entry_date format="%Y-%m-%dT%H:%i:%s%Q"}{/exp:stats}</lastmod>
<changefreq>always</changefreq>
<priority>1.0</priority>
</url>
{exp:weblog:entries weblog="{weblog_name}" limit="500" disable="categories|custom_fields|member_data|pagination|trackbacks" rdf="off" dynamic="off" status="Open"}
<url>
<loc>{title_permalink="{template_group_name}/comments"}</loc>
<lastmod>{gmt_edit_date format='%Y-%m-%dT%H:%i:%s%Q'}</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
{/exp:weblog:entries}
</urlset>

 

Now let's break that up:

{assign_variable:weblog_name="default_site"}
Here we set our weblog short name, so the variable {weblog_name} will contain, in this case, ‘default_site’. We can also assign multiple weblogs here using the pipe character: "webloga|weblogb|weblogc". That way we can make one sitemap for all our weblogs.

<loc>{homepage}</loc>
Here we set our weblog homepage using the {homepage} variable. (We set that in the control panel: Admin › System Preferences › General Configuration)

<lastmod>{exp:stats}{last_entry_date format="%Y-%m-%dT%H:%i:%s%Q"}{/exp:stats}</lastmod>
Here we say when the site was last modified. To do this we use the {exp:stats} tag pair and its {last_entry_date} variable. Note that ‘%Y-%m-%dT%H:%i:%s%Q’ is the way to display the date/time in your sitemap; otherwise your sitemap will not validate.
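
To show what that format string should produce, here is a small Python sketch (again outside ExpressionEngine). It assumes that %Q renders the timezone offset as +hh:mm, so the result looks like 2008-03-01T14:30:00+01:00. The function name w3c_datetime and the example date are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def w3c_datetime(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss+hh:mm."""
    if moment.tzinfo is None:
        raise ValueError("moment must carry a timezone")
    return moment.isoformat(timespec="seconds")


# Hypothetical entry date in a UTC+1 timezone.
example = datetime(2008, 3, 1, 14, 30, 0, tzinfo=timezone(timedelta(hours=1)))
print(w3c_datetime(example))  # 2008-03-01T14:30:00+01:00
```

If the <lastmod> values in your rendered sitemap look different from this, check the format string in the template first.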

<changefreq>always</changefreq>
Here we are saying the page is ‘always’ changing. The values that can be used are: always, hourly, daily, weekly, monthly, yearly, never. (Seriously, who would use ‘never’?)

<priority>1.0</priority>
Here we set the priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0. This value has no effect on how your pages compare to pages on other sites; it only lets the search engines know which of your pages you deem most important, so they can order the crawl of your pages the way you would like. The priority you assign to a page also has no influence on the position of your URLs in a search engine's result pages.
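
If you generate several sitemaps with different settings, a quick sanity check of the values can save a failed submission. The following Python sketch checks a changefreq and priority pair against the values listed above; the function name check_url_settings is hypothetical and this is not part of ExpressionEngine.

```python
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_url_settings(changefreq, priority):
    """Return a list of problems found; an empty list means both values are valid."""
    problems = []
    if changefreq not in VALID_CHANGEFREQ:
        problems.append(f"invalid changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        problems.append(f"priority out of range: {priority}")
    return problems


print(check_url_settings("always", 1.0))  # []
print(check_url_settings("often", 1.5))   # reports both values as invalid
```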

{exp:weblog:entries weblog="{weblog_name}" limit="500"
disable="categories|custom_fields|member_data|pagination|trackbacks"
rdf="off" dynamic="off" status="Open"}
Here we begin looping through our entries with the {exp:weblog:entries} tag. Normally I set a limit of 500 links; adjust this depending on your needs. Beware of setting the value too high, as you can run out of memory. You can check how much memory you have by looking at your PHP info (Admin › Utilities › PHP Info, look for ‘memory_limit’).

<loc>{title_permalink="site/article"}</loc>
Here we set the URL to the full entry page. In this example ‘site’ is the template group and ‘article’ is the template. This will render like this: http://www.mydomain.com/index.php/site/article/a_entry_title
But I prefer to use {comment_url_title_auto_path}, which you can find in the docs.

<lastmod>{gmt_edit_date format='%Y-%m-%dT%H:%i:%s%Q'}</lastmod>
Here we say when the page/entry was last modified. Note that ‘%Y-%m-%dT%H:%i:%s%Q’ is the way to display the date/time in your sitemap; otherwise your sitemap will not validate.

<changefreq>daily</changefreq>
<priority>0.5</priority>
Here we set the change frequency and priority of the entry URLs that we are going to submit. Previously we set them for the homepage. Normally I use the standard priority of ‘0.5’ and a change frequency of ‘daily’. You can adjust these to your wishes.

 

Placing the sitemap in your server's root level

According to Google's and Yahoo's new sitemap rules, a sitemap may only list URLs at or below its own directory level. A sitemap that lives in a subdirectory is not accepted for URLs higher up or elsewhere on the site:
http://www.mydomain.com/sitemap.php (ACCEPTED)
http://www.mydomain.com/sitemaps/sitemap.php (NOT ACCEPTED)
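
The rule can be sketched in Python as a simple path comparison. This is only my reading of the rule, not the check the search engines actually run, and the function name sitemap_covers is hypothetical.

```python
from urllib.parse import urlsplit


def sitemap_covers(sitemap_url, page_url):
    """Return True if page_url sits at or below the sitemap's directory on the same host."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Directory of the sitemap, e.g. "/sitemaps/" for "/sitemaps/sitemap.php".
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    return (page.path or "/").startswith(base_dir)


entry = "http://www.mydomain.com/index.php/site/article/a_entry_title"
print(sitemap_covers("http://www.mydomain.com/sitemap.php", entry))           # True
print(sitemap_covers("http://www.mydomain.com/sitemaps/sitemap.php", entry))  # False
```

This is why a sitemap served from a template URL below index.php/ cannot simply be submitted as it is, and why it needs to be reachable from the root.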

A simple workaround makes our day.
Step 1: Create a ‘sitemap.php’ file (you can name it anything you like, but keep the .php extension).
Step 2: Paste the following code into the sitemap.php file:

<?php
// Prevent the content from being cached
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");  // Content expired in the past
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT"); // Content is always modified
// Inform the user agent that the content is XML and UTF-8 encoded
header('Content-type: text/xml; charset=UTF-8');
// Read the content from the template and output it
@readfile('http://www.yoursite.com/index.php/weblog/sitemap/');
?>

Step 3: Replace http://www.yoursite.com/index.php/weblog/sitemap/ with the URL of the sitemap template we created earlier.
Step 4: Save the file and upload it to the root directory of your site.
Step 5: Go to the Google/Yahoo webmaster console and add your sitemap (the URL must point to the sitemap.php you have just uploaded: http://www.mydomain.com/sitemap.php).
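
Before submitting, it can help to check that sitemap.php really returns XML. The Python sketch below takes a Content-Type header and a response body, checks that the body is a well-formed urlset and counts the <loc> entries. The function name count_sitemap_urls and the sample response are hypothetical; fetching the real URL is left out so the example runs on its own.

```python
import xml.etree.ElementTree as ET


def count_sitemap_urls(content_type, body):
    """Check that a response looks like an XML sitemap and return its <loc> count."""
    if "xml" not in content_type.lower():
        raise ValueError(f"unexpected Content-Type: {content_type}")
    root = ET.fromstring(body)  # raises ParseError if the XML is not well formed
    # Compare local names so the check works with any sitemap namespace.
    if root.tag.rsplit("}", 1)[-1] != "urlset":
        raise ValueError(f"unexpected root element: {root.tag}")
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "loc")


# Hypothetical response, shaped like the output of the template above.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url><loc>http://www.mydomain.com/</loc><priority>1.0</priority></url>
<url><loc>http://www.mydomain.com/index.php/site/article/a_entry_title</loc></url>
</urlset>"""
print(count_sitemap_urls("text/xml; charset=UTF-8", sample))  # 2
```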

This is a quick way to generate a sitemap for your site. If your site is very big, there is always room to make more custom sitemaps. We can also generate a sitemap for static pages or regular templates, but I will cover that in another tutorial.