Filed under: Uncategorized
You probably don’t even realize that when you solve a two-word CAPTCHA (a distorted word image created to prevent spam), you are actually helping the world to digitize, and thus preserve, old, crumbling books.
Ahhhh, but so much is always going on under the surface! Read on and keep on CAPTCHA-ing!
Typing in two words correctly results in the digitization of one word
By Paul Rubens
A weapon used to fight spammers is now helping university researchers preserve old books and manuscripts.
Many websites use an automated test to tell computers and humans apart when signing up to an account or logging in.
The test consists of typing in a few random letters in an image and is designed to fight spammers.
Carnegie Mellon University (CMU) is using this test to help decipher words in books that machines cannot read, by letting websites use those words to authenticate log-ins.
The test, known as a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), was originally designed at Carnegie Mellon to help keep out automated programs known as “bots.”
Spam messages
Bots are designed by spammers to post advertisements in discussion forums or to sign up for large numbers of e-mail addresses which are later used to send spam messages.
A CAPTCHA consists of an image containing letters or numbers which have been heavily distorted, making it hard or impossible for a bot to “read.”
By requiring web site visitors to type in the contents of the CAPTCHA before being allowed in to the site, humans can be admitted while all but the smartest bots are rebuffed.
CAPTCHAs are unpopular with many Internet users because the words they contain are often so heavily distorted to foil bots that many humans struggle to read them.
This means potential visitors’ time is wasted while they make repeated attempts to decipher the CAPTCHA they are presented with.
But the CMU research team, based in Pittsburgh, Pennsylvania, has devised an ingenious system to put the time used interpreting CAPTCHAs to good use.
Text files
The team is involved in digitising old books and manuscripts supplied by a non-profit organisation called the Internet Archive, and uses Optical Character Recognition (OCR) software to examine scanned images of texts and turn them into digital text files which can be stored and searched by computers.
But the OCR software is unable to read about one in 10 words, due to the poor quality of the original documents.
The only reliable way to decode them is for a human to examine them individually – a mammoth task since CMU processes thousands of pages of text every month.
To solve this problem the team takes images of the words which the OCR software can’t read, and uses them as CAPTCHAs.
These CAPTCHAs, known as reCAPTCHAs, are then distributed to websites around the world to be used in place of conventional CAPTCHAs.
When visitors decipher the reCAPTCHAs to gain access to the web site, the answers – the results of humans examining the images – are sent back to CMU.
Every time an Internet user deciphers a reCAPTCHA, another word from an old book or manuscript is digitised.
Deciphered correctly
To ensure that the reCAPTCHAs are deciphered correctly, website visitors are actually presented with images of two words to examine, the contents of one of which is already known.
“If a person types the correct answer to the one we already know, we have confidence that they will give the correct answer to the other,” says Luis von Ahn, a Professor at CMU.
“We send the same unknown words to two different people, and if they both provide the same answer then effectively we can be sure that it is correct.
If they don’t agree then we send it to a lot more people to examine.”
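The checking scheme von Ahn describes can be sketched in a few lines of Python. This is only an illustration of the idea as the article explains it, not CMU's actual code: the function names, the case-insensitive comparison and the `required_agreement=2` threshold are all assumptions made for the example.

```python
from collections import Counter


def check_submission(control_answer, typed_control, typed_unknown, votes):
    """Accept a visitor only if the known (control) word was typed correctly.

    When the control word matches, the visitor's reading of the unknown
    word is recorded as one vote. Returns True if the visitor passes.
    """
    if typed_control.strip().lower() != control_answer.strip().lower():
        return False  # failed the known word, so the other answer is ignored
    votes.append(typed_unknown.strip().lower())
    return True


def resolve_word(votes, required_agreement=2):
    """Return the agreed reading once enough votes match, otherwise None.

    None means the image should be shown to more people (assumed rule).
    """
    if not votes:
        return None
    word, count = Counter(votes).most_common(1)[0]
    return word if count >= required_agreement else None
```

A word only counts as deciphered once two trusted visitors agree on it. A single vote, or two votes that disagree, leaves `resolve_word` returning `None`, which stands in for "send it to more people."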
Thanks to the adoption of reCAPTCHAs by popular websites like Facebook, Twitter and StumbleUpon, the system is helping to decipher about one million words every day for CMU’s book archiving project, according to von Ahn.
Given that it takes about 10 seconds to decipher a reCAPTCHA and type in the answer, this represents the equivalent of almost three thousand man hours a day spent deciphering words that CMU’s computers find illegible.
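The "almost three thousand" figure is easy to check. The short Python sketch below uses only the two numbers quoted in the article (one million words a day, about 10 seconds each); the function and constant names are made up for the example.

```python
WORDS_PER_DAY = 1_000_000   # figure quoted by von Ahn
SECONDS_PER_WORD = 10       # rough time to solve one reCAPTCHA


def person_hours_per_day(words=WORDS_PER_DAY, seconds_each=SECONDS_PER_WORD):
    """Convert a daily word count into hours of human effort."""
    return words * seconds_each / 3600  # 3600 seconds in an hour


print(round(person_hours_per_day()))  # 2778
```

Ten million seconds works out to roughly 2,778 hours, which matches the article's "almost three thousand."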
A handy extra benefit of this system is that reCAPTCHAs are particularly good at foiling bots while remaining legible to people.
“Firstly, we are starting with words that we know our computers can’t read,” says von Ahn. “These words have also been distorted naturally over time, and the number of ways they have been distorted is very large.
‘Distorted further’
“The more ways they are distorted, the harder it is for spammers to write software which can read them.”
To make it even harder for bots, these words are then distorted further.
“What we do is the equivalent of placing the image on a rubber sheet and pulling it to distort the geometry,” he says.
Using the reCAPTCHA system von Ahn’s team is digitising documents and manuscripts as fast as the Internet Archive can supply them, and the good news for book lovers (and bad news for spammers) is that the supply of reCAPTCHAs is not likely to dry up any time soon.
“There’s no danger of us running out of words,” says von Ahn. “There’s still about 100 million books to be digitised, which at the current rate will take us about 400 years to complete.”
Story from BBC NEWS: http://news.bbc.co.uk/2/hi/technology/7023627.stm
Feeling a little frustrated trying to get good Internet coverage for your new business website? Confused about how to show up on Google and other search engines at all, let alone rank competitively?
Market yourself and get quick Search Engine visibility using these simple methods and resources:
- Be sure to submit your site to the major search engines. Not sure which engines to focus on? The engines with the most web traffic are Google, MSN, Yahoo, Ask and AOL.
- List your business with Google Maps for more local business traffic (from people searching for products and services in your geographic region).
- Have a qualified Internet professional assess your website structure and content to be sure that there are no barriers or issues preventing the search engines from “seeing” all of your content. (For instance, sites built in frames and navigation buttons in Flash can present big problems.)
- Be sure to use plenty of relevant key phrases in your website content. This is the most important, powerful way to cultivate visibility. My examples: Oregon coast website design, Manzanita Internet marketing, Nehalem web site design, Oregon coast search engine optimization. Develop a set of your 10 most important phrases and use them as much as possible (without making your content too spammy).
- Be sure to include links to your website and other materials in every single posting, online listing, or comment that you create, as well as in the signature at the end of all your email messages.
- Utilize free social networking sites like Twitter, Facebook, and MySpace, and create a free blog or two on WordPress with plenty of content, keywords and links to your main site. Offer valuable tips and info to visitors…not just a sales pitch, so they will read your posts and share them. All of these free sites get listed very fast on search engines…sometimes overnight.
- Post ads on sites like Craigslist and Kijiji, and list your business on open directory sites like DMOZ. All offer free listings and rank high with search engines.
- If you have the budget for a few paid listings, do a quick Google search for your major key phrases, first. See what sites consistently rank well. If your competitor’s paid ads turn up in those rankings, consider buying ad space with the same site.
- Keep posting new updates to your blogs, MySpace, website and other sites. The search engines love fresh content! Perhaps even chronicle your marketing efforts and share your newfound wisdom and success with others. Oh, and be sure to include plenty of links and key phrases.
- Keep a record of your rankings for main key phrases on Google and other search engines. Use this as a barometer of your progress.
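If a spreadsheet feels like too much work, the ranking record from the last tip can be as simple as a small CSV file. Here is a minimal Python sketch, assuming you look up each position by hand and type it in. The file layout and the function names `log_ranking` and `progress` are my own invention for illustration, not part of any SEO tool.

```python
import csv
from datetime import date


def log_ranking(path, phrase, engine, position, day=None):
    """Append one manually observed ranking to a CSV file."""
    day = day or date.today()
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([day.isoformat(), engine, phrase, position])


def progress(path, phrase, engine):
    """Return (first, latest) recorded positions for a phrase, or None."""
    with open(path, newline="") as f:
        rows = [r for r in csv.reader(f) if r[1] == engine and r[2] == phrase]
    if not rows:
        return None
    rows.sort(key=lambda r: r[0])  # ISO dates sort chronologically as text
    return int(rows[0][3]), int(rows[-1][3])
```

Call `log_ranking("rankings.csv", "Manzanita Oregon SEO", "Google", 14)` each time you check, and `progress(...)` gives you the first and most recent positions to compare. A lower number means you have moved up.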
Filed under: Uncategorized
#1 Direct Selling SEO
#1 Direct Selling Search Engine Optimization
#2 Direct Selling Web Development
#3 Direct Selling Web Design
Now that I am also serving small businesses on the Oregon coast I am targeting new keywords. That work is in progress but I have noticed that this blog is already showing up for:
Manzanita Oregon SEO
I’ll post more search rankings as well as info on what is working and what isn’t in a few weeks.
Filed under: Uncategorized
Community members and friends often contact me with their general computer questions. I’m sure that every web professional has this happen…neighbors who call when their email isn’t working, friends who call when their computer dies, clients who email with word processing questions… I thought about charging for questions…but generally, I prefer to spread goodwill (and good Karma) and help out those who are less fortunate, so to speak.
I decided to start posting the most popular questions and answers to my blog, though. Here is the first in my series:
How to change the default font in Microsoft Word
Change the font style, size and color that shows up in every new MS Word document you create. (This works for Word for Macs, too.)
- In the menu bar, choose “Format” and then “Font.”
- Select the font and size you want, then look for a button at the bottom left of the font window that says “Default” (see example below).
- Click the “default” button.
You don’t have to save the document but every new doc you open will magically use the new font for paragraph text.
Filed under: fun stuff, marketing, oregon coast, tips | Tags: andy norris, dawn shears, event, flash mob, health, independent filmmaker, laneda, laugh in, manzanita, mother nature's natural foods, oregon, publicity, wall street journal
I live on the beautiful, northwest Oregon coast in an area that boasts 3 little towns (Manzanita, Nehalem, Wheeler) on a 6 mile stretch along the 101. Rainforest, sand dunes, farms, rivers and bays abound…it’s a natural paradise but quite wild and rugged, too.
I believe there are fewer than 2,000 full-time residents in all three towns. Tourism is an important source of revenue and most homes are second homes. Winter months can be trying for the residents who live here year-round… As a new full-time resident, I can fully attest to that!
Financial challenges, isolation and stormy, rainy weather can test even the stoutest…and the current economic climate has added a challenging twist for us all.
That being said, we are not easily discouraged. Residents remain optimistic and I have found that they can be quite creative with efforts to bolster morale. Well, hell, they are just creative no matter what. The latest of these creative endeavors has turned out to be quite newsworthy.
Andy Norris, a resident filmmaker, had the idea to gather a group of locals in a visible, public place every week to stand there and just laugh for a few minutes. This smacks of the Flash Mob phenomenon that I’ve read about in the past. The funny thing really is that what started out as a whim and a fun idea ended up capturing more PR and media interest than expected. I gathered that Andy would have loved to have had this kind of media attention when he released his last (excellent) film, “Source to Sea.”
What can we learn from this about marketing? I think that if you follow your instincts and your bliss you cannot go wrong. If you are doing what you love to do, and talking and writing about it, it’s bound to eventually make waves and gather attention.
Read about Andy Norris’ Manzanita, Oregon Laugh In here:
The Oregonian
The Wall Street Journal
The weekly Laugh In is held every Thursday at 12:05 pm in front of Mother Nature’s Foods: 298 Laneda, Manzanita, Oregon. Costumes are encouraged.
Looks like Google is trying a low-carbon approach to mowing the fields around their Mountain View offices. They rent about 200 goats to graze their grounds every month…and it costs just about the same as the mowing services they formerly used. They think the goats are cuter to watch, though.
Filed under: Uncategorized
Geesh…
About time. I have had to warn clients about putting their sites and navigation in Flash, as it totally kills their search engine rankings. The search engines simply cannot access the content contained in the Flash files and they cannot follow links to the pages in the animated navigation systems.
This may help. Although I still think that following simple, proven, best-practices is a good credo.
Personally, I use hard data to determine whether these things are working. I will do the same for Flash items on my client sites. The data will tell if this is, indeed, true…
So, here is the article in the news:
Uncloaking ‘invisible’ Flash Web content
Adobe announced late Monday night that it was providing optimized Adobe Flash Player technology to Google and Yahoo to help them better index dynamic Web content and rich Internet applications that include the Shockwave Flash file (SWF) format.
It sounds exciting, but what exactly does it mean for Web searchers, Webmasters, and Flash creators? CNET News.com asked Adobe, Google, and Yahoo and got some answers.
Q: What is Adobe doing?
A: Adobe is providing Google and Yahoo with optimized Adobe Flash Player technology so that their search engine spiders will be able to find and index SWF content, including Flash “gadgets” such as buttons or menus and self-contained Flash Web sites.
Q: How does this work?
A: When a search engine spider hits a normal HTML page and encounters Flash content it will load it in an optimized Flash player on the search engine server. Google has developed an algorithm that explores Flash files in the same way a person would, such as by clicking on buttons and entering input. The algorithm then indexes all the text it encounters through the navigation.
Q: How will the search experience change as a result?
A: The text that people see when they interact with Flash files, such as captions and introductions, will now be used when Google generates a snippet that appears below the URL on the search results page. The words that appear in the Flash files can now be used to match query terms in Google searches. In addition, the URLs that appear in Flash files will be fed into Google’s crawling system and be indexed.
Overall, more content will be indexed and search engine result rankings will change to reflect the additional content and its relevance. The snippets will give better information about the page on the search results. You can also expect search engine optimizers to figure out ways to improve rankings of Flash-based Web sites just like they do with HTML-based sites.
Q: Why is this necessary?
A: More than 98 percent of the Internet-connected desktops have Flash Player installed and Flash is hugely popular. Until now, the search engines were able to index some static text and links within SWF files, but much of the content was not getting indexed because of the dynamic aspect of the rich media files. With this change, all the content that was essentially invisible to the search engines will appear in the search results. Until now, the small amount of content that did get indexed appeared on the search results page as jumbled words and code that were of no use to the Web searcher.
“Now, you are losing all the context of what content was near each other and running at the same time,” says Justin Everett-Church, a senior product manager for Adobe Flash Player. He likened the impact to the difference between reading the index of a book and reading the contents of the book.
So, who would ever have thought that anyone would want DawnShears.com? I admit, I manage many domain names but can be a bit lazy when it comes to renewing my own. Since I’m using my name, it never seemed likely that anyone would want to purchase my domain from under me. And “Dawn Shears” is a pretty unique name.
Seems I was very, very wrong…
I had meant to renew my DawnShears.com domain and it just slipped off my radar, somehow. Well, it seems that somebody in Turkey bought the domain out from under me the very day it went up for renewal. Whaaaa? So, take a look at DawnShears.com, now: www.DawnShears.com. You’ll see a landing page where he has swiped some of my keywords, actually uses my name, and links to his sites. What is happening is that he is using my keyword traffic on the search engines to drive clicks to his sites.
The part about him using my name, well, that’s infringement but I don’t think there is any easy way for me to do much about that. He’s in Turkey, for one. I did a WHOIS lookup to find his info and here it is:
Registrant:
Burak Dogan
catalarmut mah samsun no 22
samsun, samsun 55234
Turkey
Registered through: GoDaddy.com, Inc. (http://www.godaddy.com)
Domain Name: DAWNSHEARS.COM
Created on: 29-Mar-06
Expires on: 29-Mar-09
Last Updated on: 12-May-08
Administrative Contact:
Dogan, Burak eyle@windowslive.com
catalarmut mah samsun no 22
samsun, samsun 55234
Turkey
902553423232 Fax –
Technical Contact:
Dogan, Burak eyle@windowslive.com
catalarmut mah samsun no 22
samsun, samsun 55234
Turkey
902553423232 Fax –
Domain servers in listed order:
NS5.SERVETR.COM
NS6.SERVETR.COM
So, this gives me an email address to contact him. It looks like he purchased the domain name and his hosting from GoDaddy, too, so GoDaddy may be able to help me if he refuses to stop using my name. There is even a phone number if I wanted to call Turkey.
For now, I sent an email requesting that he either provide a link to my new site, if he is going to use my name in his site content, or remove my name from his site text.
Stay tuned…I’ll let you know if anything happens.
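One lesson from all this: keep an eye on your own expiry dates. As a rough sketch, here is a little Python that pulls the “Expires on:” date out of WHOIS text laid out like the record above and counts the days left. WHOIS output varies a lot between registrars, so treat the “Expires on:” label and the `29-Mar-09` date style as assumptions that only fit this particular layout. The function names are made up for the example, and the month abbreviation parsing assumes an English locale.

```python
from datetime import datetime, date


def parse_expiry(whois_text):
    """Find the 'Expires on:' line and parse a date such as 29-Mar-09."""
    for line in whois_text.splitlines():
        if line.strip().lower().startswith("expires on:"):
            value = line.split(":", 1)[1].strip()
            # %d-%b-%y matches day, abbreviated month name, 2-digit year
            return datetime.strptime(value, "%d-%b-%y").date()
    return None  # no expiry line in this layout


def days_until_expiry(whois_text, today):
    """Return the number of days from `today` to the expiry date, or None."""
    expiry = parse_expiry(whois_text)
    return None if expiry is None else (expiry - today).days
```

Paste the text of a WHOIS lookup into `days_until_expiry(text, date.today())` and set yourself a reminder well before the number gets small.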
People ask me all the time to research and purchase domain names for them. The research and purchase process is very straightforward and you can easily do this part yourself. (Note: knowing what to do with the domain name to get it working is not so straightforward, though.)
When choosing a domain name, my advice is to use some of your important key phrases in it, keep it as simple as possible, and keep it easy to remember.
For instance, my friends named their software business, Party Plan Solutions, as they build software for direct selling party plan businesses. Party Plan Software and Party Plan Solutions are two of their top key phrases. After this and my SEO tune-up of their website, they rank very high for their chosen key phrases…the domain name helps.
To research and purchase domain names go to GoDaddy >>
(I also highly recommend their hosting services)
I’d like to share another great tool; one that I regularly use to help choose my clients’ key phrases. Like all of the tools I use/promote here, it’s free…
If you enter your key phrase, for example, Oregon Coast Vacation Home Rentals, this tool not only suggests other key phrases, it also shows which ones are more popular in Google searches. You can also filter the results in a number of ways if you play around a little with the tool.
So, obviously, if you are trying to decide your top 3 key phrases out of a long list, you can use this tool to choose ones with the highest number of searches, among other factors.
Click here to go to the Google Keyword Tool >>
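If you copy the search counts from the tool into a list, picking your top 3 by popularity is a one-liner. This is a minimal Python sketch: the function name `top_phrases` is invented for the example, and you would supply your own counts from the tool. It only ranks by search volume, so the “other factors” (like competition) are still up to your own judgment.

```python
def top_phrases(volumes, n=3):
    """Return the n key phrases with the highest search counts.

    `volumes` maps each key phrase to its search count, as copied
    by hand from a keyword tool.
    """
    return sorted(volumes, key=volumes.get, reverse=True)[:n]
```

Pass it a dictionary such as `{"phrase one": 900, "phrase two": 1200}` and it hands back the phrases in order, most searched first.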

