The State of Web Spam: Human-Posted Spam is on the Rise

Even though we have lots of tools to detect blog comment spam these days, spammers always tend to be one step ahead of our algorithms. While early blog spam was often posted by robots and easily detectable, today’s blog spammers are smarter. The team behind Automattic’s Akismet spam filter reports that, instead of relying on robots, spammers now often have their comments written by low-paid workers in India, South-East Asia and Turkey.

The “best written spam,” according to Akismet, comes from South-East Asia. As the Akismet team notes, SEO firms will often hire these low-paid workers and set them up to work out of Internet cafes and local universities.

Detecting Human-Posted Spam is Hard

We have definitely seen this increase in human-posted spam here at ReadWriteWeb over the last two years or so. While early comment spam was easily detectable because it had nothing to do with the actual post, we now have to take a closer look at all the links our commenters use in their personal profiles in order to weed out the spammers. Often, comments that look perfectly legit will include a link to a Viagra or SEO site in the profile link.

What About Regular Spam?

Besides the rise of human-powered spam, traditional spam is still going strong as well. Akismet notes that “old-fashioned” pill, porn and malware spam still tends to originate from Eastern Europe and the Russian Federation. Spammers there still operate huge networks of malware-infected machines that run spambots.

According to Akismet, fake blog networks on services like Blogspot, Weebly, Tumblr, Ning and WordPress are also becoming more frequent and more highly organized. Instead of just abusing other people’s blogs, these spammers create their own blog networks. Other forms of blog-related spam that are on the rise are auto-blog pingbacks from people using auto-blogging plugins (mostly for WordPress sites), as well as hijacked blogs and wikis.

From Porn and Pills to Pet Food and Roofing

Akismet also notes that while early blog spammers used to focus on the traditional (and highly lucrative) niches around pornography, pills and malware, today’s spammers are often more interested in search engine optimization than in hawking fake Viagra. Because of this, modern blog spam often includes links to “dentists, roofing and pet food.”

Top 10 Mobile Trends of 2010, Part 3: Emerging Markets

In preparation for the upcoming ReadWriteWeb Mobile Summit, we’re outlining the 10 leading trends of the Mobile Web in a 3-part series of posts. In this, the final instalment, we look at three markets for mobile which promise to be hugely valuable: commerce, cloud computing and health. As a reminder, in Part 1 we covered design and development issues and in Part 2 we looked at trending mobile apps such as geo-location and AR.

We’ll explore these and other trends with you at the ReadWriteWeb Mobile Summit, a 1-day event we’re running on Friday 7 May, in Mountain View, California. That’s the day after Web 2.0 Expo (2-6 May), so we hope you’ll extend your trip to the West Coast to help us define the future of mobile! To be certain of getting a ticket, we invite you to register now.

Commerce

As more and more consumers use smart phones, how can businesses utilize this channel? That’s one question we will analyze at the RWW Mobile Summit.

Consider these statistics: nearly one quarter of the mobile web, according to a recent report from mobile search engine Taptu, is made up of shopping and services. Taptu surveyed about 326,000 sites that are optimized for touch-screen browsing and found that the largest concentration of these sites falls into Taptu’s “shopping and services” category. In total, Taptu found 83,000 mobile-enabled commerce sites, ranging from mobile shopping assistants to banks and mobile real estate sites. According to Taptu, mobile shopping and services sites make up close to 25% of all mobile-friendly sites in the company’s index, followed by sites in the “photo and design” category (17.7%). Social sites rank third with 9.2%.

Top 10 Mobile Trends of 2010: Part 1: Design & Development; Part 2: Apps, Apps, Apps

In a recent report, Morgan Stanley analyst Mary Meeker claimed that mobile will revolutionize e-commerce. She cited location-based services, push notifications, transparent pricing, and instant mobile delivery as four potential areas where this will occur.

Mobile advertising is also a growing segment. In November, Google acquired AdMob, a mobile display ad serving platform, for $750 million. In January, Apple acquired Quattro, a relatively unknown mobile advertising network, for an estimated $275 million. Later in January, Opera bought AdMarvel. In April, Apple announced an advertising platform called iAd.

Cloud Computing

According to a recent study from Juniper Research, the market for cloud-based mobile applications will grow by roughly 88% a year from 2009 to 2014. The market was just over $400 million this past year, says Juniper, but by 2014 it will reach $9.5 billion. Driving this growth will be the adoption of the new web standard HTML5, increased mobile broadband coverage and the need for always-on collaborative services for the enterprise.

As ReadWriteWeb’s Sarah Perez explained in February, “there are already a few well-known mobile cloud apps out there including Google’s Gmail and Google Voice for iPhone. When launched via iPhone homescreen shortcuts, these apps perform just like any other app on the iPhone, but all of their processing power comes from the cloud.”

Health

Mobile health applications will play a large and important role in shaping the future of the health care system, wrote Mike Kirkwood at the mHealth initiative conference in February. He wrote that mobile and wireless health applications “directly impact the individual’s health and have the promise of ensuring that when a patient leaves a doctor visit, they don’t become ‘lost’ in the system. It allows consumers to be engaged with health and wellness in their daily lives and connect back to their health care provider.”

Mobile services will change health care not only from within the health system, but also through the applications that consumers are downloading to their smart phones. In February I surveyed the latest health and fitness apps on the iPhone platform. For example, an iPhone app called Diamedic allows diabetics to record their blood sugar levels and insulin doses.

We’d love to discuss these and other mobile topics with you at our ReadWriteWeb Mobile Summit 2010. See our announcement post for more details. If you’re a company in the Mobile Internet market, you may be interested in becoming a sponsor for this event. Please contact our COO Sean Ammirati for more information about sponsor packages. And a big thank-you to our current event sponsors: CallFire, WorldMate, Alcatel-Lucent and Ipevo.
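
A footnote on the Juniper Research figures in the cloud computing section: the 88% growth figure only squares with the dollar amounts if it is read as a yearly rate. The short check below is a rough sketch, assuming the 2009 market was exactly $400 million (the report says "just over") and that growth compounds over the five years to 2014; the function name is ours, not Juniper's.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


# Assumption: "just over $400 million" is treated as exactly $400 million.
market_2009 = 400e6
market_2014 = 9.5e9

annual_rate = compound_annual_growth(market_2009, market_2014, 2014 - 2009)
total_growth = market_2014 / market_2009 - 1

print(f"Implied annual growth: {annual_rate:.0%}")
print(f"Implied total growth over five years: {total_growth:.0%}")
```

Under these assumptions the implied yearly rate comes out at about 88%, while the total growth over the whole period is far larger.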

TwitterClaims: Be First In The Twitter Username Land Rush

Last week after Twitter’s Chirp conference, Danny Sullivan of Search Engine Land asked Twitter Co-founder Evan Williams when we would begin to see the release of inactive and deleted Twitter usernames back into the wild. The answer turns out to be soon for some and later for others, but the question remains: how will we know when that name is finally available? Well, two developers, Blake Crosley and Luke Woodard, have jumped onto this goldrush and created TwitterClaims.

According to Sullivan, Twitter is still trying to figure out the proper way to handle the situation, as some usernames have been used but have recently sat inactive, while others were swept up in mass name claims by squatters and others still have simply been abandoned. (Sullivan notes an anecdote by Williams of one person who registered more than 10,000 names in one fell swoop but has done nothing with them.)

So if you’ve been eyeballing that perfect Twitter username, just watching it sit there and do nothing, TwitterClaims claims to have the answer. Simply enter your email address and give the site up to ten names that you’re hoping to get. The service checks once an hour to see if each name is available and, once it is, emails you to let you know. Simple.

It looks like anyone can claim a name, so once it becomes available and the notification is sent, it’s on. You’ll still have to get there first, and others may be getting the same notification about that same username.
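
For the curious, here is a rough sketch of how a service like this might work. It is not TwitterClaims’ actual code: the `Watchlist` class, the `is_available` check and the `notify` mailer are all hypothetical stand-ins, and the only details taken from the service’s description are the ten-name limit and the idea of a periodic check that emails everyone waiting on a name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set

MAX_NAMES_PER_EMAIL = 10  # the service accepts up to ten names per email address


@dataclass
class Watchlist:
    # Hypothetical hooks: a real service would query Twitter and send real email.
    is_available: Callable[[str], bool]
    notify: Callable[[str, str], None]  # called as notify(email, username)
    watchers: Dict[str, Set[str]] = field(default_factory=dict)  # username -> emails

    def register(self, email: str, names: Iterable[str]) -> None:
        """Add usernames to watch for one email address, enforcing the limit."""
        wanted = {name.lower() for name in names}
        already = {u for u, emails in self.watchers.items() if email in emails}
        if len(already | wanted) > MAX_NAMES_PER_EMAIL:
            raise ValueError("at most ten usernames per email address")
        for name in wanted:
            self.watchers.setdefault(name, set()).add(email)

    def run_check(self) -> List[str]:
        """One polling pass; a scheduler would call this once an hour."""
        freed = [u for u in sorted(self.watchers) if self.is_available(u)]
        for username in freed:
            # Everyone watching the same name is told at the same time.
            for email in sorted(self.watchers.pop(username)):
                self.notify(email, username)
        return freed
```

Note that every watcher of a freed name is notified in the same pass, which is exactly why getting the email is no guarantee of getting the name.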

The Modigliani Test: The Semantic Web’s Tipping Point

In our recent posts about Structured Data, we’ve emphasized that most of the current initiatives have been around uploading new data to the Web, whatever the format. The U.S. and U.K. governments have led the way with their ‘open data’ websites, but much of that data isn’t ‘linked’ yet. In other words, it’s online, but siloed. So how do we get to the next stage of the Semantic Web, linking disparate data sets together so that people can begin to use that data?

The tipping point for the long-awaited Semantic Web may be when you can query a set of data about someone not too famous, and get a long list of structured results in return. I’ve decided to term this ‘The Modigliani Test.’

Amedeo Modigliani is one of my favorite artists. He was moderately famous during the early 20th century and has something of a cult following nowadays. But he’s not Da Vinci or Picasso famous. What I’d like to do in a Semantic Web is type the following query into a search engine and get back a large list of results: tell me the locations of all the original paintings of Modigliani.

As of today, there’s no place to type that query in and get a list of structured data. The closest I can find to doing that is the Artcyclopedia entry for Modigliani, which has a list of locations for Modigliani artworks. It’s great that they have the location data listed on one web page. However, it’s not structured data, so we can’t query it. There’s also not much order to the data, we have no idea whether the list is comprehensive, the data isn’t verified, and so on.

In summary, there’s a lot of data on the Web about the location of original art works, but much of it is in traditional ‘document’ web pages. What we’re after is a giant database of art works, which anybody can query and re-use.

Here’s an early, overly geeky view of what a Linked Data set of painting locations would look like (hat-tip @dakoller):

[Figure: Linked Data query results listing art works by Hieronymus Bosch]

The above is a far from comprehensive list of art works by Hieronymus Bosch (a search for Modigliani, by the way, brought up zero results). Plus, of course, we need a much more intuitive UI, so that non-geeks can use it too.

What do you think: when will The Modigliani Test be passed on the Web?
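
To make the difference between a ‘document’ page and queryable data a little more concrete, here is a toy sketch of the idea. It assumes the data is stored as simple subject-predicate-object statements, which is the general shape Linked Data takes; everything else is made up for illustration. The identifiers below (`painting:A`, `museum:X` and so on) are placeholders, not real records, and the function name is ours.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Placeholder statements only; none of these correspond to real art works.
TRIPLES = [
    Triple("painting:A", "createdBy", "artist:Modigliani"),
    Triple("painting:A", "locatedIn", "museum:X"),
    Triple("painting:B", "createdBy", "artist:Modigliani"),
    Triple("painting:B", "locatedIn", "museum:Y"),
    Triple("painting:C", "createdBy", "artist:Bosch"),
    Triple("painting:C", "locatedIn", "museum:X"),
]


def locations_of_works_by(artist: str, triples: List[Triple]) -> List[Tuple[str, str]]:
    """Answer 'where are all the works by this artist?' from the statements."""
    works = {t.subject for t in triples
             if t.predicate == "createdBy" and t.obj == artist}
    return sorted((t.subject, t.obj) for t in triples
                  if t.predicate == "locatedIn" and t.subject in works)


for painting, place in locations_of_works_by("artist:Modigliani", TRIPLES):
    print(painting, "is in", place)
```

The point is not the code itself but that the question becomes a two-step lookup once the facts are stored as data rather than as sentences on a page. The hard part, of course, is getting comprehensive, verified statements into the data set in the first place.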

Bing Keeps Getting Smarter: Adds More Info About Cars, Sports Teams

Bing now knows a lot more about cars and will also give a select group of users the option to compare the performance of different sports teams. Microsoft just announced these updates at the Search Engine Strategies event in New York. The new comparison answers for sports will be rolled out to only about 5% of Bing’s users at first. In addition, Microsoft will begin to roll out some minor design changes to a small group of users today that will better highlight Bing’s assets like weather and travel search.

Starting today, mobile users will also see improvements to Bing’s autosuggest feature, which will now include answers for things like stock quotes right in the autosuggest box.

Domain Task Pages for Cars

Whenever a user searches for cars and car-related topics (“2010 Toyota Camry specs,” for example), Bing will now bring up a page with all of the car’s specs instead of directing you to another site with this info. This page will also include links to additional images and videos about the car, as well as the ability to restrict the search query by different trims and links to the specs of cars in the same class.

In Microsoft parlance, these pages are called “domain task pages” and chances are that, if they prove successful, the company will roll out more of them for additional topics in the near future. The task pages are part of Bing’s efforts to provide users with specialized answers for popular queries in verticals like weather. According to a recent job posting, other topics for these pages that Microsoft plans to launch in the future could include “movies, music, games and other high-volume domains.”

Given that Bing bills itself as a “decision engine,” it only makes sense for Microsoft to try to capture as many popular searches as possible and present its users with relevant answers right on Bing.com instead of sending them on to other sites.

Sports Comparison and UI Changes

A small number of Bing users will now also be able to compare the performance of sports teams by simply typing the names of two teams in the search form. As Stefan Weitz, Microsoft’s Director of Bing, told us during a briefing earlier this month, about 0.7% of all queries on Bing are comparison searches, and the company hopes to capture more of these in the future and present the right answer in Bing instead of sending users to multiple sites.

The same share of users (about 5%) will now also see a new user interface for the boxes at the top of the page that Bing will often display for popular topics. For searches related to cities and towns, for example, these “Bing boxes” will now include info about local weather, a relevant link to Bing Maps, as well as airfare info from your current location (based on your IP address). For popular artists, these boxes will now also include information about upcoming concerts and other relevant information. Sadly, this concert info is ordered chronologically and doesn’t take a user’s current location into account.

Overall, these are interesting updates, not necessarily because Bing now knows a lot more about cars and sports, but because they show the direction the Bing team is going in. As a “decision engine,” the Bing team’s intent is to give users more information directly on the site instead of just presenting them with a couple of links. While these links can be relevant (and Bing still shows them most of the time, too), the Bing team wants to reduce the number of queries that result in links and increase the number of times the software can present users with direct answers that Microsoft has sourced from its own databases or from sources across the Internet.