Bloggers, Beware Page Jacking!

I was accused a while ago of putting someone's blog on a "porn aggregator." Heck, I wouldn't know how even if I wanted to. And I dislike porn; it's very upsetting for me to see. But now I do know how that other site got on a porn aggregator -- and that I had zero to do with it!

FROM THIS SITE

Security on the Internet means more than just protecting your personal computer from online predators and viruses. If you own a domain or have a website you may need to protect your online reputation as well as your content.

The web pages you see when you browse the Internet with Internet Explorer or Firefox are not all there is to see out there. Your web browser reads pages written in a language called HTML and interprets its instructions for formatting, placement, and file retrieval. The browser shows you the interesting, full-color, animated, graphical version of what is really a text file full of references. The text and images you see are not the whole story. Hidden within the HTML pages are keywords and instructions aimed specifically at search engines like Google, Yahoo, and MSN Search.

These hidden sections can sit in a few places within the HTML file. Search engines look there for relevant data that lets them index each site and return the page, and links to it, when a user searches for the right keywords. Some porn sites exploit this practice by using non-porn keywords, or even stealing non-porn content, to force their sites to appear in more searches.
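To make those hidden sections concrete, here is a minimal sketch in Python that pulls the meta tags out of a page, roughly the way an indexing program might read them. The sample page, the class name MetaTagParser, and the function extract_meta are made up for illustration; real search engines do far more than this.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs, which browsers never display."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html):
    """Return a dict of meta tag names to their content (hypothetical helper)."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta


# Invented sample page for the camping site used as an example in this article.
SAMPLE_PAGE = """<html><head>
<title>Camping Gear</title>
<meta name="keywords" content="camping, tents, northeast trails">
<meta name="description" content="Gear reviews and campsites in the northeast US.">
</head><body><p>Welcome!</p></body></html>"""

if __name__ == "__main__":
    for name, content in extract_meta(SAMPLE_PAGE).items():
        print(name, "=>", content)
```

None of the keywords or the description show up in the browser window, yet a program reading the file sees them plainly.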

Suppose for instance you own a site dedicated to camping gear and the ideal camping locations in the northeast United States. Now, suppose your website gets a fairly high amount of traffic from legitimate users who are actively seeking the information you provide.

Now suppose you perform the same search your users might and you see links to porn sites on the third, fourth, or even the first page of results. How did those porn sites become associated with your content? Chances are, it is not your fault and it is not a coincidence. One way for your site to become associated with porn sites is called Page Jacking.


Page Jacking works like this:

1) A porn site designer creates or steals a legitimate web page with content non-porn consumers might want. This information is usually stolen in part or in whole from other sites simply because it is easier to steal content than devise your own.

2) The porn site then uses non-porn keywords in the meta-tag section of the HTML file and submits the legitimate content with meta-tags to the various search engines for indexing.

3) When web pages are indexed, the web spiders review the information submitted and index it according to the stolen content and meta-tags.

4) A user performs a search for the legitimate information, maybe even using some form of the camping site URL, and begins clicking the results.

5) An unsuspecting user, basing his clicking decisions on the innocuous nature of the search terms and the first few pages of results, may click a link that looks promising and be redirected to a porn site, completely different from what he wanted.
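The steps above can be sketched with a toy keyword index in Python. This is only an illustration of why copied keywords land a page in the same results. The URLs, the PAGES data, and the functions build_index and search are all invented, and no real search engine is this simple.

```python
from collections import defaultdict


def build_index(pages):
    """Map each keyword to the set of URLs that claim it (toy indexer)."""
    index = defaultdict(set)
    for url, keywords in pages.items():
        for keyword in keywords:
            index[keyword.lower()].add(url)
    return index


def search(index, term):
    """Return every URL indexed under the search term, sorted alphabetically."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical pages: the second one copied the first one's keywords.
PAGES = {
    "http://camping.example/gear": ["camping", "tents", "northeast"],
    "http://jacked.example/landing": ["camping", "tents", "northeast"],
    "http://recipes.example/": ["pasta", "dinner"],
}

if __name__ == "__main__":
    index = build_index(PAGES)
    print(search(index, "camping"))
```

Because the toy index trusts whatever keywords a page submits, the copycat page is returned right alongside the real camping site.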

What makes this particular scenario so difficult is that search engine developers cannot easily police such content. The original indexing method, which relied strictly on the meta tags to determine what content a site holds, has been changed to scanning and indexing the first few lines of text in the user-viewable HTML.

This is where Page Jacking becomes insidious.

The porn site owners know the first few lines of their content are scanned for relevant information, so they use those sections to store non-porn content for indexing before presenting their porn content. Some designers may even put both porn and non-porn content on the same HTML page by hiding the non-porn content from view. This tricks the search engines into believing the site is non-porn related and presenting only the content they think you requested.
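One way content can be hidden from view is with an inline style such as display:none. The Python sketch below separates the text a visitor would see from text hidden that way. It is a rough heuristic under stated assumptions: the markup is well-formed (every opened tag is closed), only inline styles are checked (not stylesheets or scripts), and the class HiddenTextParser, the function split_text, and the sample page are all hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be tracked on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextParser(HTMLParser):
    """Sorts page text into visible and hidden lists using inline styles only."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one flag per open element: is it inside hidden markup?
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        hides = "display:none" in style or "visibility:hidden" in style
        parent_hidden = self.stack[-1] if self.stack else False
        self.stack.append(parent_hidden or hides)

    def handle_startendtag(self, tag, attrs):
        pass  # self-closing tags hold no text, so there is nothing to track

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        hidden = self.stack[-1] if self.stack else False
        (self.hidden if hidden else self.visible).append(text)


def split_text(html):
    """Return (visible_text, hidden_text) lists for a well-formed page."""
    parser = HiddenTextParser()
    parser.feed(html)
    parser.close()
    return parser.visible, parser.hidden


# Invented example: stolen camping text hidden above unrelated visible content.
SAMPLE_PAGE = """<html><body>
<div style="display: none">Best tents and camping gear for northeast trails</div>
<h1>Welcome</h1>
<p>Unrelated content shown to visitors.</p>
</body></html>"""

if __name__ == "__main__":
    visible, hidden = split_text(SAMPLE_PAGE)
    print("Visitors see:", visible)
    print("Hidden from view:", hidden)
```

A page whose hidden text has nothing to do with its visible text is worth a closer look, though hidden elements also have plenty of innocent uses, such as menus.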


The only way to combat this type of theft is to police the search results. Most search engines have some method of doing this, but they cannot catch everything. Many website owners have taken to performing their own searches using their URL, their own meta-tag keywords, and various other content-specific search terms to monitor and catch porn sites using their content. If found, these porn sites can be reported to the search engine companies and are generally removed from the indexes.
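Once a suspicious page turns up in your own searches, you may want a rough measure of how much of your text it copied before you report it. The Python sketch below compares overlapping five-word runs ("shingles") between two pieces of text. The technique is a common one for spotting copied text, but it is my illustration rather than anything the search engines are known to use; the function names and the sample sentences are invented.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    own = shingles(original, size)
    if not own:
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)


if __name__ == "__main__":
    # Invented sample text standing in for your page and a suspect page.
    mine = "The best campsites in the northeast are close to fresh water and shade."
    suspect = "Click here! " + mine + " More inside."
    print(overlap_ratio(mine, suspect))
```

A ratio near 1 means nearly all of your text appears on the other page; a ratio near 0 means the two share little beyond common words.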

However, this is a very reactive approach: all the porn site owners have to do is steal more content from a different site, and the process begins again. At this time there is no definitive way to prevent this type of theft. Like all methods of home and PC security, it requires diligence and attention to detail on your part to protect your content and the associations others make with it.

Help prevent this by reporting to your state and local officials…

Comments

Anonymous said…
Good G-d Barbara that is horrible. I was not aware you were accused of such a vicious incident. Terrible - just terrible. I am glad that you are now vindicated.

Had I known about this, I could have supplied you with this information myself, because what you present here is how, in many cases, sites are hacked and put out of commission.

Pathetic!

Barbara said…
Yes - I was. But I don't engage in "lashon hara" -- so I decided to give the benefit of the doubt.

Sometimes those that accuse unjustly or with incorrect facts shoot themselves in the foot.
