Optimize Non-HTML Documents
Written by Atila on August 26, 2008 – 7:39 am

Search-engine algorithms were originally capable of indexing only HTML documents. Search engines now also index other document types, such as Microsoft Word, Microsoft Excel, Microsoft PowerPoint, Rich Text Format (RTF), and Adobe Portable Document Format (PDF) files. It is therefore acceptable to include non-HTML documents in your Web site content; however, these files must be optimized to improve the results when search engines index them. For SEO purposes, you should fill your pages primarily with textual content rather than image files.

A simple rule is that even if your image contains information that humans can view, search engines cannot read it unless the same information also appears as plain text. Image files might be visually stimulating and eye-catching to a reader, but they are of little value as far as search engines are concerned. Although some search engines can read parts of PDF files and other non-HTML documents, they cannot read text or messages embedded in image files, and tend to treat images as empty space on your site. It is prudent to fill your site with text that search-engine spiders can recognize. Include a call-to-action phrase in a text box, or use a creative textual format that stands out to the reader. Textual content may not be as visually pleasing as images, but it garners the necessary attention from search engines. Bottom line: Text lends itself to SEO; images do not.
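As a rough way to see how a page balances text against images, the sketch below counts visible words and image tags in a piece of HTML using only Python's standard library. The names `TextImageAudit` and `audit_page` are made up for this illustration, and the counts are only a crude indicator, not a measure any search engine is known to use.

```python
from html.parser import HTMLParser


class TextImageAudit(HTMLParser):
    """Counts visible words and <img> tags in an HTML page."""

    SKIPPED = {"script", "style"}  # their contents are not visible text

    def __init__(self):
        super().__init__()
        self.word_count = 0
        self.image_count = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.image_count += 1
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count words that a reader (and a spider) would see as text.
        if self._skip_depth == 0:
            self.word_count += len(data.split())


def audit_page(html):
    """Return (visible_word_count, image_count) for an HTML string."""
    parser = TextImageAudit()
    parser.feed(html)
    parser.close()
    return parser.word_count, parser.image_count
```

A page that reports many images and only a handful of words is a candidate for moving its message out of the graphics and into plain text.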

If you choose to post non-HTML files such as Word, Excel, PowerPoint, RTF, or PDF documents to your Web site, you can optimize each document to improve the results when search engines index it. The basic rule when optimizing a non-HTML document for search engines is to make sure that it contains readable text. Text that exists only inside an image cannot be read and therefore is not properly indexed by search-engine spiders. Optimizing a PDF document is very much like optimizing a Web page.

Treat each PDF file as if it were a separate Web page, and link to it from your sitemap the same way you would link to an HTML page. Whenever possible, the anchor text of the link to the PDF file should contain keywords. Make sure your PDF file contains text, with a good blend of your target keywords and phrases. Include your target keywords in the title of your PDF document. Consider breaking your PDF document into several smaller documents if it is very large or if it covers multiple topics. You can also avoid the indexing pitfalls associated with large PDF files by creating an HTML-formatted abstract of the PDF file that links to the PDF. Although most leading search engines can now read and index the content of a PDF file, some have restrictions and may index only the first thousand or so characters of the document. Therefore, PDF files should be used sparingly, and large PDF files should be kept to a minimum. Sometimes you may want to convert HTML into PDF, and other times PDF into HTML. For converting HTML into PDF, HTMLDOC 1.8.27 from Easy Software Products, located at www.easysw.com/htmldoc, is a tool that provides a free 21-day demo license for first-time users.
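The anchor-text advice above can be checked mechanically. The following is a minimal sketch, assuming your links point directly at files whose URLs end in a document extension; the extension list, the class `DocumentLinkCollector`, and the function `links_missing_keywords` are all hypothetical names chosen for this example. It lists the document links on a page whose anchor text contains none of your target keywords.

```python
from html.parser import HTMLParser

# Assumed set of non-HTML document extensions; adjust to your own site.
DOCUMENT_EXTENSIONS = (".pdf", ".doc", ".xls", ".ppt", ".rtf")


class DocumentLinkCollector(HTMLParser):
    """Collects (href, anchor_text) pairs for links to non-HTML documents."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the document link currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(DOCUMENT_EXTENSIONS):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def links_missing_keywords(html, keywords):
    """Return document links whose anchor text contains none of the keywords."""
    collector = DocumentLinkCollector()
    collector.feed(html)
    collector.close()
    wanted = [k.lower() for k in keywords]
    return [
        (href, text)
        for href, text in collector.links
        if not any(k in text.lower() for k in wanted)
    ]
```

A link such as "Click here" pointing at a PDF would be reported, while "Blue widget buying guide" would pass for the keyword "blue widget". URLs that carry a query string after the file name are not recognized by this simple extension check.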

PDF files can be used to help your Web site generate leads. For example, you can set up your PDF files so that an unregistered user can read only the first page or two of the file. If that user wants to read the remainder of the document, you can require contact information in exchange, such as an e-mail address, name, address, and anything else that you want to track. Also, because many search engines refuse or are unable to index very large PDF files, you can break the PDF into several separate files so that search engines will read and index them. If you have a very popular PDF file, you may want to consider structuring it as an e-book and charging your visitors to view it. An e-book is the equivalent of a conventional printed book, but it is made available only online or through special e-book readers. The downside of an e-book is that because you restrict access to it, search engines do not index it, and therefore your Web site is not ranked based on the content of the book.
