April 7th, 2008
You didn’t come here from Google
That’s because I’m currently not being indexed by Google.
Two weeks ago Google dropped this site from their search results because of the number of spam links I had on my pages. I did not put them there, of course, and I tried multiple times to get rid of them, but hackers kept coming back to the site and thoughtfully adding them back again.
It turns out that they were the result of someone exploiting a well-known WordPress vulnerability. Hackers were literally overwriting PHP scripts on my web server so that when people requested pages, tons of spam links were inserted into the result. Several people told me that this was happening, so I diligently overwrote the files in question (header.php, footer.php). Then, the next day someone else would tell me that it was happening (again), and I would fix it (again). I want to thank everyone who helped out… I truly appreciate you letting me know when this was occurring.
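Catching this kind of overwrite does not have to depend on readers noticing it. Here is a minimal sketch, assuming you can run Python on the server and can take a snapshot of the theme files while they are known to be clean. The function names (`file_digest`, `snapshot`, `changed_files`) are made up for illustration and are not part of WordPress.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(root, pattern="*.php"):
    """Map each matching file under root (as a relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def changed_files(baseline, current):
    """List files that were added, removed, or altered since the baseline."""
    names = set(baseline) | set(current)
    return sorted(n for n in names if baseline.get(n) != current.get(n))
```

Take one `snapshot` right after a clean install, store it somewhere the web server cannot write to, and compare it against a fresh snapshot on a schedule. A changed header.php or footer.php would then show up in `changed_files` the same day rather than whenever someone happens to report it.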
So then when Google crawled the site, they assumed I was a spam site and de-listed me. Literally overnight, I’m losing 300-400 visits per day. If you search on “Joshua Porter” or “bokardo”, you will not find this site. I’ve been the top link for those two terms for years.
From a technical standpoint, this is quite boring. But from a social design standpoint, the situation is quite interesting.
Obviously, if I met these hackers in real life, I would have choice words for them. Among other things, I would ask them to stop. And, if their identities were known, their behavior would also likely stop. Not just because of any warning or threat I could come up with, but because they would feel pressure from society to stop. Social norms would moderate their behavior better than anything I could do.
This is how behavior normally works. Most people don’t act out badly because of fear of getting caught and punished…although that certainly has an effect. Most people behave well because of how they’ll be treated afterward by society when they do something bad. The effect of our social groups is as strong as any threat of punishment.
In order to design software where people behave, the best practice is to tie identity to behavior. Once these two things are known, and related to each other, then social norms can kick in.
My question: if everything we do on the web is recorded somewhere…why can’t this sort of thing be stopped? I’m sure it’s a very small population of people who are doing this…why can’t someone reverse engineer the request to my web server and find out where it came from? Is that possible?
My guess is that it is possible, but involved. It probably gets done when a serious crime is committed, but not when the pain is seeing some spam links or getting de-indexed from Google. I would probably have to pay someone to do it.
Also, I would presume that other people using WordPress are also having this problem. Couldn’t we set up a tracking system (much like a bug-tracking system) that catalogues these breaches? Then we would see how widespread they are, and we might gain some momentum in combatting them. Maybe the perpetrators are a small group or a small number of individuals?
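To make the idea concrete, here is a rough sketch of what one entry in such a catalogue might hold and how the entries could be tallied. Everything here is hypothetical: the `BreachReport` fields and the `reports_by_version` helper are my own guesses at a useful shape, not an existing system.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BreachReport:
    """One reported breach, roughly analogous to one ticket in a bug tracker."""
    site: str          # domain of the affected site
    reported_on: date  # when the owner noticed or reported the problem
    software: str      # platform the site was running
    version: str       # platform version at the time of the breach
    symptom: str       # short free-text description of what was seen


def reports_by_version(reports):
    """Count reports per (software, version) pair to show how widespread each is."""
    return Counter((r.software, r.version) for r in reports)
```

Even a tally this simple would show whether the breaches cluster around a few versions, which is the kind of evidence that might build momentum.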
At this point, however, I’m not sure what I can do. I’ve upgraded to the latest WordPress, and that seems to have stemmed the flow of hacks temporarily. But this is going to happen again and again. Do I simply get into an arms race with the hackers and hope that I can outpace them?
How can I be sure that Google is even going to return to the site? Perhaps they’ve seen spam so long on my site that it’s now blacklisted?
Anyway, I think the root of this problem is social as much as technological. The available solutions, however, seem mostly technological. Any advice/thoughts you have on the matter would be greatly appreciated.
Comments
1. Eric DeLabar 1:39pm, Mon 7th, 2008
Since your site is obviously not spam, you simply need to file a Google reconsideration request through the webmaster tools. You basically tell them that you were hacked, the spam has been removed, and you’ve done something to keep it from happening again. More details at Matt Cutts’ blog. Hope it helps! This site is a great resource and it would be a shame if people wouldn’t be able to find it!
2. karl 1:45pm, Mon 7th, 2008
I’ve also been on the bad end of a PHP hack and Google blacklisting. Instead of spam, however, the hacker left a link to a virus. It took a good month or so before Google’s ban on my site was lifted. Interestingly enough, I eventually discovered the online identity of the hacker who got me after they took credit for it in exchange for “points” on a competitive Russian hacking site. Sad to say, I wasn’t worth very much.
3. Jelmer de Jong 1:47pm, Mon 7th, 2008
It is smart to contact Google about this problem. You are not the first to be removed from the Google index due to spam placed on a blog by hackers. They will re-evaluate your website and put you back in the index (normally with your old PageRank and positions).
Beyond this, prevent these things from happening in the future. Always update immediately to the latest version of WordPress, change your FTP password regularly and keep an eye on your log files.
Good luck!
4. Tim Howland 1:57pm, Mon 7th, 2008
The attack can be trivially tracked back to its originating IP address. Unfortunately, one of two things will almost certainly be true:
1) The attacking computer will belong to a little old lady from Petaluma who has no idea her computer has been compromised by a spammer.
2) The attacking computer will belong to a university in Romania, South Korea, or some other geography difficult to reach.
I’m in favor of free enterprise here; Congress issuing letters of marque and reprisal seems like the only way to actually take the bastards out.
5. Patrick 2:11pm, Mon 7th, 2008
Did upgrading to WordPress 2.5 help at all or are these exploits still there?
6. Patrick 2:16pm, Mon 7th, 2008
Ack. Sorry hadn’t read all the way through. You can delete these two comments if you wish.
7. Mark 2:18pm, Mon 7th, 2008
Well, reverse engineering, like you said, is hard, and can be nearly impossible.
You’d first have to figure out exactly when the attack occurred and then determine the IP address linked to the attack. Then you’d have to trace the IP address back to its place of origin: likely an ISP (and likely an ISP outside the US). From there you’d likely run into a brick wall. Since most IP addresses are assigned dynamically by the ISPs, you’ll never be sure who the individual is on the other end. You’d have to sue the ISP to give up the individual’s name, but even then, you can’t know with 100% certainty who the individual is (see the RIAA’s lawsuits against file sharers and the arguments against them). This assumes you are even able to sue the ISP. That wouldn’t be likely in a country like Russia or China.
Then there is the possibility that the attacker used a proxy machine, where they routed their attack through another computer so that you see the IP address of that computer. Now you have no way of tracing it back.
It’s part of the power of the internet. It’s decentralized to such a degree that it’s easy to be anonymous and hard to pin down identity. It is also why it’s nearly impossible to censor the internet with 100% accuracy. Anyone with enough technological know-how can get around it.
So your best bet is to keep your software up-to-date and to use good security practices, like having difficult passwords. Not that reassuring I know, but currently there is no good solution.
Most of the attackers who are caught are caught because they left a bread crumb pointing back to themselves, not because someone reverse engineered the technology back to them.
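For what the first step might look like in practice (finding which IP addresses sent requests around the time of the attack), here is a minimal sketch. It assumes the server writes access logs in the common Apache style, with the client IP first, a bracketed timestamp, and the quoted request line; other servers and custom formats would need a different pattern. `requests_in_window` is a made-up helper name.

```python
import re
from datetime import datetime

# Matches the leading fields of an Apache common/combined access log line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def requests_in_window(lines, start, end, method="POST"):
    """Return (time, ip, path) for requests with this method between start and end.

    start and end must be timezone-aware datetimes, because the log
    timestamps carry a UTC offset. Lines that do not match are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m["method"] != method:
            continue
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        if start <= when <= end:
            hits.append((when, m["ip"], m["path"]))
    return hits
```

This only narrows things down to an address. For the reasons above, that address may well be a proxy or a compromised machine rather than the attacker.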
8. Mark 2:26pm, Mon 7th, 2008
Another thought: One of my coworkers, who has a blog much more popular than mine, turns off comments for articles 30 days or older. He found that after 30 days, articles typically get very few valid comments, but can attract tons of spam comments. So turning off comments for those articles helps keep the spam at bay.
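The rule itself is tiny. A sketch, assuming the blog software exposes each article's publication date and lets you hook in a check before accepting a comment (`comments_open` is a hypothetical helper, not a WordPress function):

```python
from datetime import date, timedelta


def comments_open(published, today, max_age_days=30):
    """Return True while an article is younger than max_age_days.

    Articles that are max_age_days old or older are closed to new comments.
    """
    return (today - published) < timedelta(days=max_age_days)
```

The 30-day cutoff is just the number from my coworker's experience; a blog with slower-moving discussions might want a longer window.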
9. Tony Wright 2:54pm, Mon 7th, 2008
Hey Josh– same thing happened to me… Right at the time where I was going thru YCombinator and fundraising for my startup (which is a time when I might get googled a lot more than normal). I’m back now, after a WP upgrade, but it was very inconvenient timing.
You were savvy enough to detect this– so was I. But you gotta figure that there are a pile of long-tail bloggers out there who have no idea why Google backhanded them out of the results.
The problem is the currency of links– you have a valuable property and links from it are SEO gold.
If it happens a lot, you could switch to hosted WordPress (which has its own set of issues, but at least it’s secure), but I don’t think there is a solution to hackers on the horizon.
10. Scott 3:25pm, Mon 7th, 2008
Josh – I feel your pain; I’m sure most of us who spend any amount of time online with our own websites can relate. One of the reasons I stopped using WordPress in favor of another blog/CMS was the fact that hackers know full well how to target and exploit the software. It is an arms race, and as we learned from watching War Games with Matthew Broderick, in the end no one wins.
Regarding tying identity to behavior… I hear you, but somehow feel that you are being unduly optimistic. As mentioned by Karl in the comments, this type of bad behavior is shared, bragged about and rewarded.
11. Ed Finkler 3:25pm, Mon 7th, 2008
If you stick with WordPress, yes, it seems that way. You might consider another system that has done a better job in terms of security architecture. WP does do a very good job on the user side, but it has a terrible history of security problems, and I feel that the app’s architecture is inherently prone to security mistakes.
12. Josh G 3:39pm, Mon 7th, 2008
Appreciably an escalation on the technological arms race side of things, but I’ve found that using the Subversion repositories for WordPress allows for very easy upgrading. I subscribe to the WP news RSS feed so I know when there’s a new version out, and then one command and one site visit later everything is upgraded. Obviously it doesn’t help if there’s an exploit in a current version, but it’s something that makes staying current a lot easier.
Also, those of us who subscribe are definitely still here.
Good luck with the re-listing!
13. Jim Jeffers 3:42pm, Mon 7th, 2008
It’s very important to keep WordPress up to date. This happened to my blog before, and as a result I make a solid effort to keep updating my WordPress install in a timely manner. There are also some extra security precautions you can take, such as renaming the admin account (which you have to do directly in the DB because WordPress will not let you do it from within the environment), amongst other things.
14. Colin Scroggins 3:53pm, Mon 7th, 2008
Talk about your timely posts…
Look at what Google posted today!
15. Seyora 4:40pm, Mon 7th, 2008
As great as that would be, my guess is that overreacting privacy advocates (probably the same ones who complain about Gmail’s contextual ads) will blow this out of proportion and start quoting Orwell all over again.
Best of luck to getting back on the indexes. In either case, you’ve still got your loyal readers who’re more than willing to link you from their websites (:
16. Matt Cutts 5:06pm, Mon 7th, 2008
Hi Joshua, I’m a software engineer at Google. I hope it makes sense why we remove hacked pages from our index. Instead of serving up spammy links, your web server could easily have been distributing malware that would infect users, for example.
As far as communicating with Google, I would check out http://www.google.com/webmasters/ and register that you own bokardo.com if you haven’t already. We left a message for you in our webmaster console on March 19th about the hacked content that you can see once you verify your site. That’s also the right location to communicate back to Google when the hacked content is completely gone–look for the “Request reconsideration” link. Colin in the comments above already noticed that we had a blog post scheduled about this very topic that came out today.
You raise some interesting questions about the nature of web hacking. If the web were less anonymous it would be easier to track down crackers that do evil things like hack sites, but that would raise other tough issues as well.
17. Josh 7:17pm, Mon 7th, 2008
Thanks to everyone who shared their wisdom in the comments here. I truly appreciate it! It sounds like my frustration is a relatively common occurrence out there, which is both good and bad…somewhat nice to know that others share the pain but also unfortunate that this problem is so widespread.
I’ve taken steps to get relisted, per @Matt’s suggestions. Thanks Matt, I appreciate you taking the time to stop by and help.
18. satts 10:25pm, Mon 7th, 2008
Josh
It is sad that your Google listing was compromised; the comments helped me understand how to get the SEO rating back. I follow your posts in GReader, though.
19. soong 1:37am, Tue 8th, 2008
I came here from Google Reader, if that counts.
20. Jaakko 1:49am, Tue 8th, 2008
I feel for you, Josh: losing visitors due to hacking is really unfair. But wasn’t it up to Google to decide whether to block you or not? The ultimately unfair thing is that Google has gained so much power that blacklisting from their search becomes such an issue.
21. Steve Mills 5:36am, Tue 8th, 2008
For what it’s worth, Josh, I just did a search for bokardo on Google and found the site… maybe they have added you back.
22. Marty Alchin 12:05pm, Tue 8th, 2008
What’s funny though is that now that you’re back in Google’s good graces, when I search for either “Joshua Porter” or “bokardo”, I’m greeted with the following snippet under the link:
23. David 12:06pm, Tue 8th, 2008
I feel for you. I still remember when I got hacked and spammed so bad that I had to pull my site down (I had free hosting at the time – a couple of years ago – and it was not fair to other users to have the server bombarded by the spambots hitting my site.) Thankfully I’ve never had that problem since I moved to WordPress.
24. J Wynia 4:09pm, Tue 8th, 2008
Happened to me this week too. I’m down 2700 page views a day from before the blacklisting. As far as I can tell, only a single post had the spammer content from the WordPress vulnerability. But, because it was on the front page, that’s all it took.
I submitted for reconsideration, but am waiting and can’t be 100% sure that I actually got it all.
Given how my domain is also my name, it’s like 90% of my online identity disappeared in one fell swoop.
25. Ian Kallen 2:22pm, Wed 9th, 2008
I’ve been actively chasing down the WordPress vulnerability issues (and their consequences), taking measures to reduce their impact on Technorati’s data & systems and evangelizing measures to thwart the perps. Latest post on the topic is here.
thanks,
-Ian
Technorati
26. Theo 7:16am, Fri 11th, 2008
Well, I came here from Google.
Welcome back
27. Erin Hawk 5:12pm, Fri 11th, 2008
I got here from Google (4/11)! You’re back! (Thanks for the great workshop today at the Summit.)
28. Sebastian Lewis 1:23am, Sun 13th, 2008
Does this have anything to do with your feed stopping? I came onto this site to obtain the link (copying it from the address bar is faster than most methods) and noticed this new post that isn’t in my RSS reader.
Just a note, you might want to add a CAPTCHA to the comments instead of the less noticeable and probably easily hacked SPAM check you have at the bottom.
Sebastian
29. Mehdi 1:48am, Sun 13th, 2008
em..
Actually, the people you are calling hackers are just a bunch of kids who know nothing about hacking.
All they do is browse the well-known exploit sites, which offer ready-to-use exploits or how-tos, and then use them on any site that is famous or that they are jealous of.
So they are not really hackers, and just by calling them hackers, you let them think they are something.
But they are not; they are just a bunch of stupid kids who think this is hacking.
One more thing: change your CAPTCHA. This one is stupid, and bots can easily find it.
30. Mike Schinkel 12:26pm, Sun 13th, 2008
I proposed a potential solution for spam but didn’t have much uptake. Maybe your readers can either give it some momentum or explain why it would not work:
http://blog.welldesignedurls.org/2007/02/08/rel-spam-to-fight-comment-spam/
31. Alexander 2:50pm, Sun 13th, 2008
I actually did come here from Google, seems like your problem has been resolved.
32. Goos 4:58am, Tue 15th, 2008
I feel your pain as well, Josh. If, like me, you are not a PHP guru, platforms like WordPress and Joomla are the only tools you can use to create a site, and at the same time you have no control over the security. I saw you are back in Google; keep the blog going!
33. Ton Keuken 9:53am, Wed 16th, 2008
How troublesome. Glad it’s resolved, dropped by from google search.
A WordPress site of mine was also hacked about a year ago. Had to call my host to fix it, what a mess that was.
34. Gegen Haarausfall 12:32pm, Wed 16th, 2008
yes, came here from google too. your site is being found again
but a friend of mine has the same problem. his site disappeared from the google search and he says that the number of visitors has decreased as a result.
35. smaaz 7:35am, Fri 18th, 2008
bad to hear that – i had a similar problem too last year, but after sending a reinclusion request my site was back within 2 weeks.
36. Rajesh Anandakrishnan 9:21pm, Fri 25th, 2008
This has been rectified by google now
37. Warren 11:33am, Mon 19th, 2008
glad to see you back on google. Let us know what the vulnerability was if you find out.
38. Niels 3:59pm, Sat 24th, 2008
Hi Josh,
I had a similar experience with a site that was running on Joomla. It took me more than two weeks before everything was back to normal. And then it took me another two weeks before I was listed again in Google. How long did it take in your case?