In case you don’t get to the thread (link below), I would add one comment about validating your HTML: if your page doesn’t validate, search engines might have trouble crawling your site.
As of 1/27/2007 the list compiled by Fribble stood as:
- Reciprocal link request pages coupled with existing reciprocal links to non-relevant sites. (Thanks, thecoalman)
- Outdated copyright date or last modified date visible on the pages. (Debatable)
- Error pages that don’t send 404 headers, or that send content regardless of the page requested or query string entered.
- Massive numbers of incoming links from link farms.
- Dead/404ing links.
- High link churn.
- No published contact address, email address or phone number. (Debatable) (site-dependent?)
- A high bounce rate (surfers clicking back on their browser and selecting another search result).
- Too much duplicate content.
- Whois info for the domain that matches other domains previously penalized or banned. (Could also be true of AdSense publisher/affiliate IDs and other identifiable footprints.)
- Use of/links to affiliate programs that are known scams
- Domains previously used for spam or that are blacklisted.
- Excessively long URIs/URLs (query strings or folder and file names).
- A high percentage of affiliate links vs regular outbound links.
- No / very few outbound links (depending on the site’s type/niche).
- No / very few inbound links (depending on the site’s type/niche).
- All inbound links are to the homepage only. (Debatable) (site-dependent?)
- Outbound links to questionable/spammy/crap sites.
- [Matt Probert] The inappropriate or gratuitous use of profanity.
- Too many spelling errors.
- Contains unrelated subjects (e.g., a site that reviews toys and tries to sell insurance or Viagra).
- Lack of interest from social bookmarking sites.
- MySQL or PHP errors in the pages
- [tedster] No real menu or information architecture, just a laundry list of links going on down the page.
- [abbeyvet] Relatively short pages/articles with unnaturally high keyword density for their topic, which almost always contain 2 or more large AdSense or other advertising units.
- [steveb] More than 25% of links coming from blogs.
- [steveb] No links to the site from any domain in the top 100 for a query where the page ranks in the top 20 for that query.
- [mattg3] Small font size text framed by ads, or big AdSense blocks. I avoid them because I am sure my users think they are cr@p too. Just disregard the heatmap.
- [buckworks] The drop-down list for the user to select their credit card expiry date still includes last year.
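One item on the list, error pages that don’t send 404 headers, is easy to check for yourself. Here is a minimal sketch in Python, assuming you simply request a made-up path and look at the status code that comes back. The function names `classify_missing_page` and `probe` and the labels they return are my own and not from the thread, and the random path is only a guess at a URL your site does not serve.

```python
import uuid
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen


def classify_missing_page(status: int) -> str:
    """Classify the status code returned for a URL that should not exist.

    The labels are illustrative only:
      "ok"       -> the server correctly reports 404 or 410
      "soft-404" -> the server answers with a 2xx success code
      "other"    -> anything else (server error, auth wall, ...)
    """
    if status in (404, 410):
        return "ok"
    if 200 <= status < 300:
        return "soft-404"
    return "other"


def probe(base_url: str, timeout: float = 10.0) -> str:
    """Request a random, presumably nonexistent page and classify the reply."""
    # Hypothetical bogus path; a random hex name is very unlikely to exist.
    bogus_url = urljoin(base_url, f"/{uuid.uuid4().hex}.html")
    try:
        # urlopen follows redirects, so a redirect to the homepage
        # ends up here as a plain 200 response.
        with urlopen(bogus_url, timeout=timeout) as response:
            status = response.status
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx replies; the code is on the error.
        status = err.code
    return classify_missing_page(status)
```

Calling `probe("https://example.com")` with your own domain in place of the placeholder should return `"ok"` on a site whose error page is set up properly. A `"soft-404"` result suggests the server is sending content for pages that do not exist.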
Join the discussion and throw your two cents in. If you are new to Webmaster World, you will be happy that you discovered the site.