Best Desktop and Cloud Based SEO Crawlers (2016)

A website crawler is a tool that every SEO needs to do their job effectively. SEOs are expected to know how the major search engines crawl a particular site and to identify errors, redirects, spider traps, and other issues. We’ve come a long way from GSiteCrawler, Xenu’s Link Sleuth, browser extensions, and one-page SEO audit graders, which were the few options available in the past. This post is intended to build a greater understanding of the most popular web crawlers out there today.

 


Botify (Paid)
Pricing: The lower-tier package starts at $569 per month (for 5M URLs per month); however, they also advertise SMB pricing options as low as $49 per month.
Why Botify: Crawl a high volume of URLs for on-page data at the lowest cost per URL.

It’s a cloud-based solution, so you don’t have any of the memory limitations of desktop software. Assuming you can afford it and your server can handle it, Botify can crawl up to 100 URLs per second. That’s roughly 5M URLs in 14 hours! They claim to be the most powerful and comprehensive crawler and analysis tool for your website, and based on the feature set, it’s hard to argue against that.
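To sanity check the "5M URLs in 14 hours" figure, here is a quick back-of-the-envelope sketch in Python. The function names are my own, and the rate and price are simply the numbers quoted above; real crawls rarely hold a constant speed, so treat this as rough arithmetic only.

```python
def crawl_hours(total_urls, urls_per_second):
    """Hours needed to crawl total_urls at a constant crawl rate."""
    return total_urls / urls_per_second / 3600


def cost_per_thousand_urls(monthly_price, monthly_urls):
    """Dollars per 1,000 URLs if the full monthly allowance is used."""
    return monthly_price / monthly_urls * 1000


# Figures quoted in this post: 100 URLs/second, $569 per month for 5M URLs.
hours = crawl_hours(5_000_000, 100)
cost = cost_per_thousand_urls(569, 5_000_000)

print(f"{hours:.1f} hours")            # 13.9 hours
print(f"${cost:.2f} per 1,000 URLs")   # $0.11 per 1,000 URLs
```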

Notable Features:

  • Spider the site from internal linking or crawl by top pages via Google Analytics
  • Compare changes and performance from previous crawl
  • Advanced segmentation and URL rewriting abilities to properly handle internal linking tracking parameters that don’t change the page content
  • API (included with enterprise package)
  • Logfile analysis (included with enterprise package)
  • Chrome plugin
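If the URL rewriting feature sounds abstract, the idea is to treat URLs that differ only by tracking parameters as the same page. Below is a minimal Python sketch of that idea. It is not how Botify does it; the `utm_` prefix list and the function name are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters starting with these prefixes never change page content.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return the URL with tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


clean = strip_tracking(
    "https://example.com/shoes?color=red&utm_source=nav&utm_medium=menu"
)
print(clean)  # https://example.com/shoes?color=red
```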

 


DeepCrawl (Paid)
Pricing: Starting at $80 a month for up to 100K URLs.
Why DeepCrawl: Crawl a high volume of URLs for robust on-page data reporting and comparison with previous crawls.

They also sell pay-as-you-go packages, allowing you to buy a set number of URL credits to use at your leisure. Despite having an interface that is a bit clunkier than Botify’s, DeepCrawl offers better comparison views of previous crawls and a greater number of visualizations to slice the crawl data.

Notable Features:

  • Crawler IP, User Agent, speed, and customizable scheduling
  • Advanced segmentation and URL rewriting abilities to properly handle internal linking tracking parameters that don’t change the page content
  • Regex extraction
  • Compare changes and performance from the previous crawl
  • Hreflang markup checker
  • Open Graph markup checker
  • Proprietary internal link weight measurement calculated in a similar way to PageRank
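DeepCrawl’s internal link weight metric is proprietary, so I can’t show how it is actually calculated. For a feel of what "similar to PageRank" means, here is a textbook-style PageRank sketch in Python over a tiny, made-up internal link graph. The damping factor, iteration count, and the site structure are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simple iterative PageRank.

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its weight across all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page site.
site = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/", "/pricing"],
    "/pricing": ["/"],
}
weights = pagerank(site)
for page, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{page}: {weight:.3f}")
```

In this toy graph the home page ends up with the most weight because both other pages link to it, which is the intuition behind using a metric like this to spot under-linked pages.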

 


Import.io (Paid w/free version)
Pricing: Not sure; they advertise custom pricing based on particular needs.
Why Import.io: Turn the web into a spreadsheet

Import.io is a cloud-based web data extraction tool. They offer a WYSIWYG interface to select the elements you’re looking to extract from any given web page. This includes images, links, copy, and more. The free version is very robust, with the ability to quickly scrape pricing information from an eCommerce product listing page or extract content from a blog or news source.

Notable Features:

  • User-friendly interface to highlight page elements to extract
  • Build your own APIs based on the data extracted
  • Export data in any format

 


Screaming Frog (Paid, with free version available)
Pricing: Crawl up to 500 URLs free, 99 Euros per year for premium
Why Screaming Frog: Low-cost, ad hoc URL crawling for on-page data.

Screaming Frog is known for its low cost, high flexibility, and the speed with which you can dig into issues on the fly. They continue to improve the feature set while the price remains static. The issue that larger or enterprise-level SEOs run into is that this tool drains memory and maxes out at 30K-100K URLs, depending on the local memory you have allotted. Of course, you can set up a remote machine to VPN into and run your crawls without slowing down your daily workflow. Alternatively, a few folks have used cloud installs to run Screaming Frog in the cloud.

Notable features:

  • Premium version offers custom selectors to extract contents from source code via CSS, XPath, or regex
  • Crawl mode for list upload, spider, and SERP
  • Respect canonicals and/or robots.txt
  • Include/exclude URL strings with regex patterns
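To show what regex include/exclude rules do in practice, here is a small Python sketch that filters a URL list the same general way. This is not Screaming Frog’s code, and its exact matching rules may differ (this sketch matches a pattern anywhere in the URL); the function name and sample URLs are made up.

```python
import re


def filter_urls(urls, include=None, exclude=None):
    """Keep URLs that match the include pattern and do not match the exclude pattern."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    kept = []
    for url in urls:
        if include_re and not include_re.search(url):
            continue  # not in the section we want to crawl
        if exclude_re and exclude_re.search(url):
            continue  # explicitly excluded
        kept.append(url)
    return kept


urls = [
    "https://example.com/blog/seo-crawlers",
    "https://example.com/blog/seo-crawlers?utm_source=newsletter",
    "https://example.com/cart",
]
blog_only = filter_urls(urls, include=r"/blog/", exclude=r"\?utm_")
print(blog_only)  # ['https://example.com/blog/seo-crawlers']
```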

 


SiteCondor (Paid)
Pricing: Starting at $19 per month for 60K URLs
Why SiteCondor: Not sure.

SiteCondor is a cloud-based solution, although they don’t advertise the ability to handle URL volumes on the scale of DeepCrawl or Botify.

Notable Features:

  • WordPress Plugin
  • API
  • Cloud-based

 


URL Profiler (Paid, with free 14-day trial)
Pricing: $15.95 per month (max 5K URL import) to $31.95 per month (max 250K URL import)
Why URL Profiler: Speedy, low-cost way to pull HTTP status codes and extensive off-page metrics.

URL Profiler is a desktop solution that focuses on off-page factors such as social shares and inbound links associated with the particular pages of interest. URL Profiler can quickly rip through a high volume of URLs using local memory. Part of this speed comes from extracting less on-page information than Screaming Frog does.

Notable Features:

  • Include Majestic, Mozscape, Ahrefs, Google (page speed), Copyscape, and SEMrush data with credentials
  • Social sharing data
  • Google Analytics integration
  • Import Screaming Frog data to speed up report building

 


Visual SEO Studio (beta version is free for now)
Pricing: Not sure, although it’s still free while in beta
Why Visual SEO Studio: Free alternative to Screaming Frog without the 500-URL cap

This is a desktop solution that spiders URLs and returns basic data such as title/meta, outbound links, external CSS files, JS files, and more. The logo and feature set look very much like those of Screaming Frog, but they are currently offering the full beta version of the software for free, with the key sent over email.

Notable Features:

  • HTML and URL SEO suggestions
  • Page-Speed performance suggestions
  • URL Screenshots

Please let me know if this overview was helpful, or if there are additional tools of note.

Manage Your Personal Budget With Google Sheets

The beginning of the year is a common time to make promises to yourself, and hopefully, this can help you keep those promises related to personal finance and budgeting.

This document started as a way for my wife and me to manage our budget while focusing resources on getting our student loans paid off. Trust me, I’m aware there are a number of (free-ish) tools out there, so if you already have a system that works, great. If not, or if you’re looking for a free alternative to the budgeting tool you’re using, then this post is for you.

The Personal Budget template described can be found here:

https://docs.google.com/spreadsheets/d/1fUjNr-n5u-GLzvmWLJk2_m9CYOIHSpmS-A8m3Jjz1-c/edit?usp=sharing

The following steps will help you through the process, to get your budget set up in around 20 minutes.

Create a Copy

When you open this URL, you’ll quickly see that you can’t edit the shared version. You’ll need to sign in to your Google Account (create an account if you don’t have one already) and create a copy for your personal use.

personal budget template

Choose Spending Categories

Click on the “Spending Categories” tab, and review the pre-populated list of expense and income categories. Add, edit, and/or delete categories to align with your own spending patterns and budget goals. If you haven’t done a budget before, start with generalized categories, and revisit once you have a better feel for your spending patterns.

Export Bank Transactions

From your bank’s website, export your transactions to CSV, starting from the beginning of the month in which you plan to start your budget.

bank transaction export

Click on the “Bank Data” tab, and clear the placeholder data. Your bank may order its columns differently, so paste your transaction data into the “Bank Data” tab so that it lines up with the Date, Description, Debit, and Credit columns. Some banks separate the Debit and Credit values into two columns, and some list all values in the same column. If the values are combined in one column, just paste them into column D, under the “Debit” heading, and the formulas will take care of the rest.
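If you’d rather tidy up the bank export before pasting, here is a rough Python sketch that reshapes a single signed "amount" column into the Date, Description, Debit, Credit layout. The sample column names (Posted, Memo, Amount) and the sign convention (negative means money out) are assumptions about a hypothetical bank, and this is not a copy of the sheet’s own formulas.

```python
import csv
import io


def split_amount(amount):
    """Return (debit, credit) for one signed amount.

    Assumption: negative values are money leaving the account.
    """
    if amount < 0:
        return (-amount, 0.0)
    return (0.0, amount)


def normalize(csv_text, date_col, desc_col, amount_col):
    """Convert a bank CSV with one signed amount column into budget rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        debit, credit = split_amount(float(record[amount_col]))
        rows.append({
            "Date": record[date_col],
            "Description": record[desc_col],
            "Debit": debit,
            "Credit": credit,
        })
    return rows


# Hypothetical export from a bank that uses a single Amount column.
sample = """Posted,Memo,Amount
01/02/2017,GROCERY STORE,-54.10
01/03/2017,PAYROLL,2000.00
"""
rows = normalize(sample, "Posted", "Memo", "Amount")
for row in rows:
    print(row)
```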

Categorize Bank Transactions

Click the right corner of any cell in column “A” on the “Bank Data” tab, and you’ll see a list populated from the categories specified in the “Spending Categories” tab. Go through each row and categorize each transaction. This is important, since the “Monthly Budget” tab uses these labels to determine how much is spent in each category.
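To make it clear why the labels matter, here is a small Python sketch of the kind of roll-up the labels make possible: adding up spending per category. The sheet does this with its own formulas, which I’m not reproducing here; the function name and sample transactions are made up.

```python
from collections import defaultdict


def spent_by_category(transactions):
    """Sum debit amounts per category label.

    transactions is a list of (category, debit_amount) pairs.
    """
    totals = defaultdict(float)
    for category, debit in transactions:
        totals[category] += debit
    return dict(totals)


# Hypothetical categorized transactions.
transactions = [
    ("Groceries", 54.10),
    ("Dinner/Out", 32.00),
    ("Groceries", 20.90),
]
totals = spent_by_category(transactions)
for category, total in totals.items():
    print(f"{category}: ${total:.2f}")
```

A transaction without a label would not land in any category, which is why it’s worth going through every row.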

categorize transactions

Monthly Budget

Each month is broken out into Expenses and Income.

Carryover – Under income, you’ll notice that the first entry is labeled “Carryover.” Here, paste in the dollar value in your checking account at the beginning of the month. This will help you monitor your current balance as transactions are added and ensure that it lines up with the account balance reported by your bank.

Next, if you changed the “Income” or “Gift/Award” categories, you will need to use the drop-down arrow to select the income categories you want. Then use column “B” to add the estimated income for each category group. In this example, the budget shows a monthly income of $4,200, resulting from two $2,000 payments and two $100 payments. Disregard the cash flow for now; we’ll cover that shortly.

planned income

 

Moving on to the expenses. Here, you’ll assign a monthly value to each expense category. Recurring costs are simple to budget, as you can look at the previous month’s bill for the exact figure, whereas the groceries and dinner/out budgets will take additional thought to arrive at an accurate figure. As the estimated expenses are populated, the running balance will adjust to account for the total planned income and the total planned expenses for the month. It all starts with the plan. This is your chance to assign money to the categories that are important to your life, and not to expenses that are short-lived and spontaneous.
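The planning math is simple enough to sketch in a few lines of Python. The income figures are the ones from the example above (two $2,000 payments and two $100 payments); the expense categories and amounts are hypothetical placeholders, and the function is my own illustration rather than the sheet’s formula.

```python
def planned_balance(income_items, expense_items):
    """Planned income minus planned expenses for the month."""
    return sum(income_items) - sum(expense_items.values())


planned_income = [2000, 2000, 100, 100]  # from the example in this post

# Hypothetical planned expenses; use your own categories and amounts.
planned_expenses = {
    "Rent": 1200,
    "Groceries": 450,
    "Dinner/Out": 150,
    "Student Loans": 1000,
}

total_income = sum(planned_income)
balance = planned_balance(planned_income, planned_expenses)

print(total_income)  # 4200
print(balance)       # 1400
```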

Be honest with yourself, and accept the spending and cuts you plan to make.

monthly expenses

 

Deleting Duplicates

Once the initial budget setup is done, you’ll need to frequently (daily/weekly/bi-weekly) paste in the latest transaction data to keep your budget current. Use the same process as before, but paste the data below the last transaction, rather than overwriting previous transactions. Sometimes transactions take multiple days to clear, so you might miss transactions that hadn’t processed in the day or two before your last import. To solve this issue, I recommend pasting in data that overlaps the previous import date by 7 days. Using this method, you’ll have duplicates that need to be removed (luckily I built a script for that).

On the far right of the menu, click on “Duplicates” to remove duplicates from the “Bank Data” tab. The first time you run the script, Google will ask you to authorize the script to run and to connect to the Google Sheet.

google script authorization

Once it completes, any duplicates will be removed. If you are authorizing the script for the first time, you’ll need to re-run the Remove Duplicates script for it to fully execute.
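For the curious, the logic behind removing duplicates from overlapping imports can be sketched in Python as below. This is not the script attached to the sheet (that one runs inside Google Sheets); it’s an illustration that assumes two rows are duplicates when their date, description, debit, and credit all match. The sample rows are made up.

```python
def remove_duplicates(rows):
    """Drop repeated transactions, keeping the first occurrence of each.

    Keeping the first copy preserves any category already assigned to
    the earlier import.
    """
    seen = set()
    unique = []
    for row in rows:
        key = (row["Date"], row["Description"], row["Debit"], row["Credit"])
        if key in seen:
            continue  # same transaction pasted again in an overlapping import
        seen.add(key)
        unique.append(row)
    return unique


# Hypothetical data: the second import overlaps the first by one transaction.
rows = [
    {"Category": "Groceries", "Date": "01/02/2017", "Description": "GROCERY STORE", "Debit": 54.10, "Credit": 0.0},
    {"Category": "", "Date": "01/02/2017", "Description": "GROCERY STORE", "Debit": 54.10, "Credit": 0.0},
    {"Category": "", "Date": "01/05/2017", "Description": "COFFEE SHOP", "Debit": 4.25, "Credit": 0.0},
]
deduped = remove_duplicates(rows)
print(len(deduped))  # 2
```

One caveat with this approach: two genuinely separate purchases with the same date, description, and amount would be collapsed into one, so it’s worth a quick glance after each run.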

The rest is on you to manage your budget… It takes time, but it’s worth it.