Every Wednesday, Daniel Russell, a researcher working with Google, posts a search question on his search & research blog. The search question for 26 September 2012 related to differences between the coastlines on the East and West coasts of the USA. Attempting to answer the question, I typed [Atlantic islands] into Google. Instead of the usual list I’d expected, I got this:
The images at the top of my search were a surprise. Clicking on the arrows gave me further images – totalling 55 island pictures. I tried a few other searches [Pacific Islands], [Indian Ocean Islands], etc. and found similar results. Yet most searches such as [Scottish Islands] gave me the normal type of listing.
Intrigued, I contacted a couple of colleagues – Karen Blakeman of RBA Information Services and Marydee Ojala, Editor of Online magazine (and the Online Insider blog). Both Karen and Marydee are also members of the Association of Independent Information Professionals and like so many AIIP members, are expert searchers. (All three of us are presenting at the forthcoming Internet Librarian Conference in London and led the London Websearch Academy in 2011).
Marydee admitted to being bemused but guessed it was connected to Google’s Knowledge Graph initiative – the new service that puts details on a search topic to the right of the search results – as with this example search for [Albert Einstein].
Knowledge Graph was launched by Google in May 2012 and aims to give instant answers to many encyclopedia-type search queries. However, this didn’t explain what I’d found. Marydee looked a bit further and found that the TechCrunch blog had discovered this earlier in September.
I mentioned that I’d found it because of Dan Russell’s blog and Marydee asked him about the new feature. Dan responded that the “carousel” of images is triggered whenever Google knows about a collection or group of connected items such as “Atlantic Islands”. The group is then summarised and made available at the top of the results list – allowing searchers to quickly recognise the collection and the other group members.
So that’s it then! It’s a new feature giving a “carousel” of images. If you search for [knowledge graph carousel] you get the above TechCrunch link and also Google’s own search blog on the topic. (There’s a lesson here – always check Google’s own blog posts if you spot what looks like odd Google behaviour.) A search for [Knowledge graph] gives Google’s own description of the feature, including a YouTube video explaining it.
Dan Russell’s reply, however, said more:
What it triggers on is a bit more problematic. Answer: only collections we know about, which can be a bit odd. [moons of Saturn] but not [U.S. presidents]. [famous jazz composers] works, but not [cities in UAE]
This seems to explain why not all searches show the carousel. [Atlantic Islands] does. So does [Pacific Islands] but [Islands] doesn’t. [Greek Islands] is mentioned as an example in the YouTube video – but the less touristy [Scottish Islands] fails to show the carousel. It’s not just islands that give oddly inconsistent results. [Famous Jazz composers] results in the carousel appearing but [famous composers] gives a normal display. [20th century composers] works as does [19th century composers]. Bizarrely [18th century composers] doesn’t work and nor does [20th century artists] or [19th century artists]. Yet [impressionist artists] and [surrealist artists] do work. The results definitely seem surreal!
The TechCrunch blog tested the feature looking at rides at the Cedar Point theme park in Northern Ohio. I decided to ride the carousel on Disney parks. Again the results were odd – but a pattern seemed to emerge. [Disneyland rides], [Epcot rides], [Magic Kingdom Rides] all worked but [Disneyworld rides] didn’t. I then tried [Disney Paris Rides]. That works. So does [Disney California Rides]. However [Disney Florida Rides], [Disney Tokyo Rides] and [Disney Hong Kong Rides] all failed to work.
It seems as if there are two factors playing out here. The first is whether Google knows enough about the topic to create a set of common images. My guess is that Disney Hong Kong and Tokyo fail on that count – and possibly this explains why 18th century composers also fails. That can’t, however, explain the difference between Disney California and Paris, compared to Disney Florida. That brings in the second factor: the number of items in the collection. There are several Disney World theme parks for Disney Florida – Epcot, Magic Kingdom and more. I suspect that there are too many rides to be displayed in a meaningful manner. The aim of the Carousel is to encourage exploration – and a never-ending list tends to do the opposite: like a carousel that goes too fast, there is a risk that people may fall off.
At the end of April 2012, the official web-site for the 2012 London Olympics was launched – listing participating countries. The list contained embarrassing errors – which illustrate how political and geographical ignorance overcame factual accuracy and even elementary school knowledge.
As an example, the web-site gives Asia as the location for Palestine but the country next door – Israel – is in Europe. A quick check on any atlas will show that Israel is located in Asia – as are its neighbours (Palestine, Jordan, Syria, Egypt).
When the web-site launched, this failure to check facts was even more inept. Originally the country profiles included the country capitals, population and currency. However, the site initially left the capital of Israel blank and named Israel’s capital city, Jerusalem, as the capital of Palestine. When this was pointed out (as reported by the Times of Israel) this was reversed – with Palestine’s capital left out. Meanwhile the US Dollar was listed as the official currency for Palestine.
It’s not difficult to check such facts. There are numerous web-sites that list country capitals, currency and much more. For example, About.com has a geography section listing capitals. About.com is compiled by subject experts and is a good first stop when looking for general information – whether about geography, science, or many other school curriculum topics. Wikipedia also has a page listing country capitals. A quick search on WolframAlpha lists Jerusalem as Israel’s capital although an equivalent search fails on Palestine – perhaps because Palestine is not yet a country. There is also WorldCapitals.info – another listing, and the CIA Factbook, which correctly names Jerusalem as Israel’s capital. Whatever one may think about the CIA as an organisation, its website giving information on the countries of the world is generally reliable and an excellent site for anybody trying to find geographical information.
Accepting that finding the above may be beyond the average Olympic bureaucrat, why not do a simple Google search to check the facts? Putting in [Israel capital city] as the search term quickly gives the answer: Jerusalem.
This failure hints that in fact the error may not have been just ineptitude but also included an element of political dogma that should be missing from the Olympics. I’m suggesting this because of a related error that appeared in the Guardian newspaper recently.
The Guardian states (note my emphasis):
The caption on a photograph featuring passengers on a tram in Jerusalem observing a two-minute silence for Yom HaShoah, a day of remembrance for the 6 million Jews who died in the Holocaust, wrongly referred to the city as the Israeli capital. The Guardian style guide states: “Jerusalem is not the capital of Israel; Tel Aviv is”
This, as can be seen by the previous listings, is hogwash and is a political attempt by the Guardian to redefine a country’s right to name its own capital city. While it is true that most countries position their embassies in Tel Aviv, this is because of the disputed nature of Jerusalem – despite it being the location for Israel’s government and other national institutions. Failure to give the truth is a disservice to the Guardian’s readers and discredits its position as a leading UK newspaper.
When newspapers such as the Guardian and bodies such as Britain’s National Olympic Committee start to rewrite facts (or fail to check facts) then what hope is there for a genuine peace agreement between Israel and Palestine? What is worrying is that this constant repetition of false information relating to Israel and Palestine is an example of what is commonly termed the Big Lie (Große Lüge). Hitler wrote in Mein Kampf:
“But the most brilliant propagandist technique will yield no success unless one fundamental principle is borne in mind constantly and with unflagging attention. It must confine itself to a few points and repeat them over and over. Here, as so often in this world, persistence is the first and most important requirement for success.” (Volume 1, Chapter 6).
The false information relating to Israel in the press now often outweighs the truth. Even the terminology used has become the accepted dictum – and as Hitler counselled, is repeated over and over and over.
As an example, it is rare that the term used for the areas captured by Israel in the 1967 war is anything other than “occupied territories”. In fact, the Gaza strip was given back to Palestinian rule in 2005 and is no longer under Israeli control, and much of the captured territory (that Jordan annexed following the 1948 war) is under Palestinian rule (as had been the objective of the 1947 UN partition plan). Actual ownership of this land is disputed as there is no clear-cut international agreement on who owns the territory. Thus the correct term should be “disputed territories”. Anything else (i.e. “occupied” – as used by anti-Israel protagonists or “liberated” as used by the Israeli right-wing) is inaccurate.
Another example of a propaganda lie used against Israel is the word “apartheid” with Israel accused of adopting apartheid policies to discriminate against the Palestinians. Wikipedia states that
the crime of apartheid is defined by the 2002 Rome Statute of the International Criminal Court as inhumane acts of a character similar to other crimes against humanity “committed in the context of an institutionalized regime of systematic oppression and domination by one racial group over any other racial group or groups and committed with the intention of maintaining that regime.”
Proof that the apartheid claim is a lie is not hard to find. Arab citizens of Israel have full voting rights, rights of employment, education and free movement (which was not the case for black Africans during the apartheid regime in South Africa). There are Arab members of the Israeli parliament (who have included Arab ministers such as Ayoob Kara and Raleb Majadale) and Arab supreme court judges (for example Salim Joubran). Israel’s giving control of Gaza to the Palestinians shows that Israel’s intention is not to maintain dominance over the Palestinians. Yet the lie that Israel is an apartheid State is repeated over and over and over again – just as Hitler counselled for false propaganda.
Using propaganda to make political points makes sense in war but doesn’t make sense when seeking peace. Peace requires honesty, together with an attempt to seek common ground and compromise without propaganda lies, so that reconciliation and trust can be built leading to bridges that end conflict. This applies to all parties – whether involved in the conflict or on the sidelines.
A failure to identify falsehood by basic checking of facts – such as the location of Israel’s capital – does the opposite and prolongs the state of conflict, reinforcing those who choose to believe propaganda over truth. In this, the Guardian and the London2012 websites are both guilty – as continuously repeating such lies (taking Hitler’s advice) aims to delegitimize Israel’s right to exist as a sovereign State, and the right of Jews to live freely and govern themselves in Israel.
AWARE has had a web-site since 1995 and our current domain (www.marketing-intelligence.co.uk) has been active since 1997. When we started there were fewer than 100,000 companies on the web. Google’s founders had not yet met each other, and even venerable search engines such as AltaVista had not yet started.
Over the years, we’ve made an effort to ensure that our web content was not copied and used on other sites without our permission.
Although doing manual checks by searching for key phrases is one way of checking for plagiarism and copyright theft, there are a number of dedicated plagiarism checking sites. One example is Plagium. Plagium’s drawback, shared with several similar services, is that you have to paste in the text you want to test rather than just enter the URL. Such services are generally aimed at helping teachers and college professors detect student cheating.
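For anyone doing the manual check, the fiddly part is choosing which phrases to search for. The following is only a rough sketch of one way to do it in Python, not how Plagium or any other service works. It assumes that longer sentences are more distinctive, and the function name `key_phrases` and its parameters are made up for illustration. It turns the opening words of the longest sentences into quoted exact-phrase queries that can be pasted into a search engine.

```python
import re


def key_phrases(text, words_per_phrase=8, max_phrases=3):
    """Build quoted exact-phrase queries from the longest sentences in text."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Assumption: a longer sentence is less likely to appear elsewhere by chance.
    sentences.sort(key=lambda s: len(s.split()), reverse=True)
    queries = []
    for sentence in sentences:
        words = re.findall(r"[\w'-]+", sentence)
        if len(words) < words_per_phrase:
            continue  # too short to be a distinctive phrase
        # Quote the opening words so the search engine matches them exactly.
        queries.append('"' + " ".join(words[:words_per_phrase]) + '"')
        if len(queries) == max_phrases:
            break
    return queries


if __name__ == "__main__":
    sample = ("Short one. This is a much longer sentence that contains "
              "plenty of distinctive words for searching.")
    for query in key_phrases(sample, words_per_phrase=5, max_phrases=1):
        print(query)
```

Each query still has to be run by hand, which is why the dedicated services are quicker for anything more than an occasional spot check.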
Although some services (such as Plagium) are free, most are not and may involve downloading dedicated software. Others only check a limited number of known “essay” type sites where students can download essays written by others. (We’ve found some of our content on such sites – evidently students who use them don’t care where they steal their content from. Once used successfully they then try to reuse it by uploading their A+ essay to the site.)
Of all plagiarism detection websites probably the easiest and best is CopyScape. CopyScape’s aim is not only to help academics detect student cheating. It also allows webmasters to search for copied content in general. It doesn’t require users to paste in the suspect text. Instead web-site owners simply need to enter their URLs and get a report on other sites that use similar or identical wording. It’s sufficiently powerful that there is even a flippant web-page on ContentBoss’s website giving advice on how to bypass CopyScape and copy with impunity. (ContentBoss promises to provide unique content at a low monthly fee. Their bypass CopyScape tool uses a technique that will convert content into HTML guaranteed not to be picked up by plagiarism detectors. The catch is, as pointed out by ContentBoss, that using such content is also a guarantee that the site will be banned by search engines for spam content.)
We’ve used CopyScape periodically over the years, and the miscreants it has uncovered include a competitor site that copied multiple pages from our site. We asked the site owner to change his pages and were ignored. We then took stronger action and within a couple of days the site was taken down. Another example involved an article published in a professional journal that took, almost verbatim, the content of our brief guide to competitive intelligence. We notified the publisher who ensured that the payment made to the “author” was recovered, and an apology published. The author said that he thought that material published on the web was copyright free. He was shown to be wrong.
Our most recent trawl for examples of copyright theft from AWARE’s pages turned up further examples where wording we’ve used has been stolen. The following images should show how effective the tool is – while at the same time naming and shaming the companies that are too weak, lazy or incompetent to produce their own copy and have to steal from others. (I’ve named them – but won’t give them the satisfaction of a link as this could help their search engine optimisation efforts – if they have any!)
The first example shows how text that appears on the footer of most of our pages is plagiarised.
This is the original text.
CopyScape found several sites had copied this text almost verbatim – for example Green Oasis Associates based in Nigeria:
or ICM Research from Italy and Pearlex from Virginia in the USA.
The ICM Research example is in fact the worst of these three, as their site has taken content from several other AWARE web-site pages.
The problem is that a company that is willing to steal content from other businesses is unethical – breaking the rule against misrepresenting who you are. If they are willing to steal content from others, they may also take short-cuts in the services provided and as a result should not be trusted to provide a competent service.
The page that is most often plagiarised is the Brief Guide to Competitive Intelligence Page, mentioned above. Clicking on a link found by CopyScape highlights the copied portions as seen in the following examples from AGResearch, Emisol and Wordsfinder.
Generally sites do not copy whole pages (although this does happen) but integrate chunks of stolen text into their pages – as seen in the AGResearch example, below – where 12% of the page is copied, and Wordsfinder where 13% has been copied.
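A “percentage copied” figure like those quoted above can be approximated with a simple word-run comparison. The Python sketch below is an illustration only and is not CopyScape’s actual method, which isn’t described here. The function name `percent_copied`, the five-word run length and the sample sentences are all assumptions. It counts the share of words on a suspect page that sit inside a run of five consecutive words also found in the original.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def percent_copied(original, suspect, n=5):
    """Percentage of words in suspect that lie inside an n-word run
    which also appears somewhere in original."""
    orig_words = tokenize(original)
    susp_words = tokenize(suspect)
    if len(susp_words) < n:
        return 0.0
    # Every run of n consecutive words in the original page.
    orig_runs = {tuple(orig_words[i:i + n])
                 for i in range(len(orig_words) - n + 1)}
    covered = [False] * len(susp_words)
    for i in range(len(susp_words) - n + 1):
        if tuple(susp_words[i:i + n]) in orig_runs:
            # Mark every word in the matching run as copied.
            for j in range(i, i + n):
                covered[j] = True
    return 100.0 * sum(covered) / len(susp_words)


if __name__ == "__main__":
    original = ("Competitive intelligence is the process of gathering "
                "and analysing information")
    suspect = ("We believe competitive intelligence is the process of "
               "gathering data for you")
    print(round(percent_copied(original, suspect), 1))
```

A shorter run length catches more lightly reworded copying but also flags more coincidental matches, so the result is best treated as a rough indicator rather than proof.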
The Emisol example below stole less – although it copied key parts of the guide page:
Copyright theft is a compliment to the author of the original web-page, as it shows that the plagiarizing site views their competitor as top quality. However, the purpose of writing good copy is to stand out and show one’s own capabilities. Sites that steal other sites’ work remove this advantage as they make the claims seem anodyne and commonplace. They devalue both the copier – who cannot come up with their own material (and so is unlikely to be able to provide a competent service anyway) – and the originator, as most people won’t be able to tell who came first. Fortunately search engines can, and when they detect duplication, they are likely to downplay the duplicated material, meaning that such sites are less likely to appear high up in search engine rankings. The danger is that both the originator of the material and the plagiarizer may get penalised by search engines – which is another reason to ensure that copyright thieves are caught and stopped. CopyScape is one tool that really works in protecting authors from such plagiarism.