I have been using the excellent Sharetribe framework to build a marketplace for food businesses and commercial kitchens for my new startup, The Food Corridor. However, it didn’t have support for generating a sitemap.xml file for all the listings available.
How is someone going to find the right kitchen space through Google if we don’t have a sitemap to keep Google apprised of all the options?
This wouldn’t do. So, I added the ability to generate a sitemap for all the listings in the marketplace.
First off, install the gem. I used sitemap_generator, as it seemed to do what I needed: it lets me call out certain routes and add them to my sitemap. Then you need to create a configuration file at config/sitemap.rb. Mine looks like:

    SitemapGenerator::Sitemap.default_host = "https://" + APP_CONFIG.domain

    SitemapGenerator::Sitemap.create do
      Listing.where(deleted: false, open: true).find_each do |listing|
        add listing_path(listing), :lastmod => listing.updated_at
      end
    end

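
For the install step mentioned above, a minimal sketch of the Gemfile entry, assuming gems are managed with Bundler (which the `bundle exec` command in this post implies). No version constraint is shown because the post doesn't state which version was used:

```ruby
# Gemfile
# Adds the sitemap_generator gem; run `bundle install` afterwards.
gem 'sitemap_generator'
```
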
Then I just ran bundle exec rake sitemap:refresh:no_ping and a sitemap.xml.gz was generated in my public directory.
If you are running on AWS or someplace else with a persistent filesystem, you can skip to the text starting with “Then, I scheduled”.
If you are running on a PaaS like Heroku, where you don’t get a persistent filesystem, you’ll want to push this generated file to a persistent place. I chose S3. Since Sharetribe already has paperclip as a dependency, I used the instructions here and here, with a few modifications for Sharetribe.
My rake task to upload the sitemap file was:

    require 'aws'

    namespace :sitemap do
      desc 'Upload the sitemap files to S3'
      task upload_to_s3: :environment do
        s3 = AWS::S3.new(
          access_key_id: ENV['aws_access_key_id'],
          secret_access_key: ENV['aws_secret_access_key']
        )
        bucket = s3.buckets[ENV['s3_bucket_name']]

        # Local file written by sitemap:refresh, and its destination key in the bucket.
        file = File.join(Rails.root, "public", "sitemap.xml.gz")
        path = "sitemap/sitemap.xml.gz"

        # Upload the file, then make it publicly readable so crawlers can fetch it.
        # Any error propagates and fails the task.
        object = bucket.objects[path]
        object.write(file: file)
        object.acl = :public_read
      end
    end

I then run the refresh and upload_to_s3 tasks in the same Heroku scheduled task: rake sitemap:refresh:no_ping sitemap:upload_to_s3. If you don’t do that (and instead run them on separate dynos), the upload task won’t have access to the file, because it will have been generated on the first dyno’s filesystem.
You also need to add a sitemap controller to redirect from yourdomain.com/sitemap.xml.gz to the S3 bucket (again, as outlined in the articles linked above).
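
The redirect itself can be quite small. Here is a rough sketch, not the exact code from the linked articles. The `sitemap_s3_url` helper, the `SitemapsController` name and the route are all hypothetical, and the virtual-hosted S3 URL form is an assumption. The bucket variable and object key match the upload task above.

```ruby
# Hypothetical helper: builds the public URL of the uploaded sitemap.
# Assumes the virtual-hosted S3 URL form and the same key the upload task uses.
def sitemap_s3_url(bucket, key = "sitemap/sitemap.xml.gz")
  "https://#{bucket}.s3.amazonaws.com/#{key}"
end

# Hypothetical controller; it is only defined when loaded inside a Rails app.
if defined?(ApplicationController)
  class SitemapsController < ApplicationController
    # Sends crawlers asking for /sitemap.xml.gz to the copy stored on S3.
    def show
      redirect_to sitemap_s3_url(ENV['s3_bucket_name'])
    end
  end
end

# Assumed route, in config/routes.rb:
#   get "/sitemap.xml.gz", to: "sitemaps#show"
```

Keeping the URL construction in one small method means the bucket or key can change without touching the controller action.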
Then, I scheduled a daily refresh of the sitemap.xml file and submitted the file to relevant search engines.
Things I didn’t do:
- handle more than 50k URLs
- support multiple communities (not really needed for me, but I bet if the folks behind sharetribe.com wanted to use this, they’d want such support).
- add the sitemap.xml file to my robots.txt file, as outlined here.