What are the Search Quality Analyzers?
Search quality is a very common yet very vague notion.
Each user has their own search habits and their own favorite types of
search queries.
If, for example, you mostly use navigational queries (i.e., when you
need some place on the Net rather than some particular piece of
information), then you will likely prefer Google, as this query type
is its core competence. Conversely, if you like natural-language
queries, you may prefer the Ask search engine or some
other search engine.
The popularity of a search engine (SE) does not directly reflect the
quality of its search, because popularity is also a product of marketing and PR.
In order to build an independent automatic test of search quality, we
developed a set of analyzers, one for each type of search query. Each
analyzer uses special sets of sample queries and sample sites. We
measure the quality of navigational and informational search, the level of
pornography on SE results pages, etc.
We hope that our tests are (or eventually will be) an objective and
reliable source of information on search quality. Enjoy.
How do the Analyzers work?
To estimate search quality for various types of search queries, we use
special sets of sample queries and sample sites. For example, to measure
navigational search quality, we use a set of approximately 5000 sample
queries and the corresponding set of sample sites.
If a user enters the query 'CNN', she evidently (or, statistically)
wants to see the site www.cnn.com in first place on the search engine
results page. Thus, www.cnn.com would be the organic result for the query 'CNN'.
To prevent the Analyzers from being compromised by search engine
developers, we use a sliding window of several tens of sample
queries every other day, and constantly replenish and refine the original
sample set of queries.
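The navigational check described above can be sketched roughly as follows. This is only an illustration under assumptions: the `search` callable, the rule by which the window shifts with the day number, the host-only comparison and the tiny sample data are all hypothetical, not the Analyzers' actual implementation.

```python
from urllib.parse import urlparse


def host(url):
    """Return the lowercased host of a URL (a scheme is added if missing)."""
    if "://" not in url:
        url = "http://" + url
    return urlparse(url).netloc.lower()


def daily_window(samples, day, size):
    """Assumed window rule: `size` consecutive samples, offset shifts with the day."""
    start = (day * size) % len(samples)
    return [samples[(start + k) % len(samples)] for k in range(size)]


def navigational_score(window, search):
    """Share of queries whose expected site is in first place."""
    hits = 0
    for query, expected_site in window:
        results = search(query)  # ranked list of result URLs
        if results and host(results[0]) == host(expected_site):
            hits += 1
    return hits / len(window)


# Hypothetical sample set and a fake engine standing in for a real one.
SAMPLES = [("CNN", "www.cnn.com"), ("BBC", "www.bbc.co.uk"),
           ("NASA", "www.nasa.gov"), ("IBM", "www.ibm.com")]
FAKE_ENGINE = {
    "CNN": ["http://www.cnn.com/", "http://news.example.org/"],
    "BBC": ["http://news.example.org/", "http://www.bbc.co.uk/"],
    "NASA": ["https://www.nasa.gov/"],
    "IBM": [],
}

window = daily_window(SAMPLES, day=0, size=3)
score = navigational_score(window, lambda q: FAKE_ENGINE.get(q, []))
```

Here the fake engine puts the expected site first for two of the three queries in the window, so `score` is two thirds.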
You can find a description of the methods used in a particular analyzer on
the page where that analyzer's data is shown.
We highly appreciate any corrections and welcome any criticism.
Please feel free to report errors, suggest new sample queries,
criticize the method, etc.
Analyzer of navigational search
Analyzer of subject search
Analyzer of correct hints
Typo resistance analyzer
Humans are not machines and are bound to make mistakes, including mistakes made while typing a search query: an adjacent key pressed by accident ("quety" instead of "query"), a missed or doubled character ("qury" or "queery"), or a word typed 'by ear' without knowing the correct spelling ("yandax" instead of "yandex").
In this case, the search engine can choose one of the following strategies:
1) no processing: search for the exact spelling only;
2) recognize the typo, but still search for the entered query, adding a hint: "perhaps you were looking for ?";
3) recognize the typo and search for the correct spelling immediately.
Depending on the chosen strategy, the user either never learns of the mistake, or notices it and may make an extra click (this is up to the user), or gets the correct results without ever noticing the mistake.
This analyzer compares the search results for a "correct" query with the results for several of its possible mistypings, and evaluates how similar the latter are to the former.
Besides deliberate typo correction, matches can arise in four cases:
1) by accident;
2) the page contains both the correct and the mistyped spelling;
3) incorrect reaction of the engine's morphology (e.g., the unknown word "mushroomz", a typo of "mushrooms", is corrected to "mushroom");
4) the same sites are promoted for both the correct and the incorrect spelling of a query.
All of these cases produce noise for this analyzer: accidental matches of results. The similarity is evaluated in the same way as for the update analyzer, but with a different set of queries.
The more matching results are registered, the higher the search engine's index for this analyzer. This index determines the order of search engines in the informer for this analyzer.
In the future, the query sets with typos will be rotated, drawing on a wider pool.
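A minimal sketch of such a comparison, assuming that "similarity" is taken as one minus the displacement measure used by the update analyzer, averaged over the mistyped forms. That reading, the function names and the sample result lists are assumptions made for illustration only.

```python
def displacement(reference, other, top=10):
    """Update-analyzer style distance between two rankings (0 = identical)."""
    other_top = other[:top]
    total = 0
    for i, url in enumerate(reference[:top], start=1):
        if url in other_top:
            total += abs(i - (other_top.index(url) + 1))
        else:
            total += top  # a page missing from the other list counts as 10
    return total / (top * top)


def typo_resistance(correct_results, results_by_typo):
    """Average similarity of each mistyped query's results to the correct ones."""
    sims = [1 - displacement(correct_results, r) for r in results_by_typo.values()]
    return sum(sims) / len(sims)


# Hypothetical Top 10 lists for a correct query and two of its mistypings.
correct = [f"site{i}.example" for i in range(1, 11)]
by_typo = {
    "quety": list(correct),                               # typo fully corrected
    "qury": [f"other{i}.example" for i in range(1, 11)],  # no overlap at all
}
index = typo_resistance(correct, by_typo)
```

With one mistyping fully corrected and one returning unrelated results, the index lands halfway between the two extremes.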
Quotation search quality analyzer
Analyzer of search spam level
The company "Ashmanov and Partners" studies the phenomenon of search spam: the methods and technologies that reduce the quality of search results and interfere with the operation of search engines.
Search spam is text, a URL, a technology, program code or any other web element created by a webmaster for the sole purpose of promoting a site in search engine results, rather than for fast and reliable search based on complete and authentic information.
Experts regularly check the Top 10 results for a set of search queries, marking the sites which, in their opinion, contain elements of search spam. The collected data is entered into the informer, which shows the percentage of marked sites among all sites in the Top 10 for the analyzed queries.
The source of information on the spam status of a given URL is the internal data of the anti-spam lab of "Ashmanov and Partners". The following categories of search spam are used:
* doorway - definite spam: doorways leading the user to other pages,
* spamcatalog - definite spam: spammer catalogues,
* spamcontent - definite spam: spammers' stolen content,
* pseudosite - definite spam: site disguised as corporate (pseudo-company),
* catalog - catalogues,
* board - bulletin boards,
* domainsale - domains for sale,
* secondary - secondary (stolen) content,
* partner - any partner programs,
* linksite - link support site,
* spamforum - forum containing spam,
* techspam - technical spam,
* searchres - search results.
The aggregate indicator is the share of spam sites in the search results. The best search engine has the lowest indicator. This determines the order of search engines in the informer for this analyzer.
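The aggregate indicator can be sketched as a simple share. The text does not say which categories count toward it, so this sketch takes them as a parameter and, purely as an assumption, defaults to the four categories labelled "definite spam". The URLs and labels are made up.

```python
# Assumption: only the categories marked "definite spam" count by default.
DEFINITE_SPAM = frozenset({"doorway", "spamcatalog", "spamcontent", "pseudosite"})


def spam_share(top10_by_query, category_by_url, spam_categories=DEFINITE_SPAM):
    """Percentage of Top 10 entries whose URL is labelled with a spam category."""
    total = marked = 0
    for urls in top10_by_query.values():
        for url in urls[:10]:
            total += 1
            if category_by_url.get(url) in spam_categories:
                marked += 1
    return 100.0 * marked / total if total else 0.0


# Hypothetical expert labels and (shortened) result lists.
labels = {"a.example": "doorway", "c.example": "catalog", "e.example": "spamcontent"}
results = {
    "query one": ["a.example", "b.example", "c.example", "d.example"],
    "query two": ["e.example", "f.example"],
}
share = spam_share(results, labels)
```

In this example two of the six listed sites carry a "definite spam" label, so the share is about 33%.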
SEO-pressing analyzer
Many queries are ambiguous, for instance "design", "cars", "sports", etc. Such queries are defined as informational. The best result for such a query is a selection of links to resources corresponding to the different meanings of the query. Thus, the output for the query "design" should contain links to sites on web design, landscape design, interior design, etc.
It is not easy to compile a good multi-topic set of links, especially since site optimizers abuse popular informational queries to promote their customers' sites. Because of this SEO "pressing", the top is taken over by the resources whose promotion is most profitable, so the results become monotonous, consisting of sites with the same kind of commercial offers.
The analyzer searches the titles and snippets of the top 10 search results for similar lines. The summary index is the percentage of similar lines relative to the overall number of sites found in the top 10 results for the analyzed queries. The higher this index, the stronger the SEO pressing on the given search engine. Typical words and phrases in the title or snippet are considered an indication of monothematicity. The percentage of search results that include such "marker phrases" is the aggregate indicator. The best search engine has the lowest aggregate indicator for this analyzer. This determines the order of search engines in the informer for this analyzer.
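As a rough illustration of the marker-phrase count, the sketch below flags a result when its title or snippet contains any phrase from a list. The plain substring matching, the phrase list and the sample results are assumptions. The analyzer's real matching rules are not described here.

```python
def seo_pressing(results_by_query, marker_phrases):
    """Percentage of top 10 results whose title or snippet has a marker phrase."""
    markers = [m.lower() for m in marker_phrases]
    total = flagged = 0
    for results in results_by_query.values():
        for title, snippet in results[:10]:
            text = f"{title} {snippet}".lower()
            total += 1
            if any(m in text for m in markers):
                flagged += 1
    return 100.0 * flagged / total if total else 0.0


# Hypothetical marker phrases and (title, snippet) pairs for one query.
markers = ["low prices", "order now"]
results = {
    "design": [
        ("Web design studio", "Order a site today, low prices"),
        ("Landscape design basics", "An introduction to garden planning"),
        ("Interior design - order now", "Free estimate"),
        ("History of design", "Encyclopedia article"),
    ],
}
pressing = seo_pressing(results, markers)
```

Two of the four sample results contain a marker phrase, giving an indicator of 50%.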
Analyzer of 'adult sites' entries in the search results
Recall analyzer
The Recall analyzer estimates the relative index sizes of a given set of Internet search engines.
We believe that the figures search engines report for the number of web pages in their indices are not reliable, for the following reasons:
a. The total number of indexed documents reported by an SE itself cannot be used to estimate index size, because different search engines count documents differently; for example, some include duplicate documents in the count while others do not. Including duplicates may inflate the reported index size by a factor of two or even more.
b. The number of documents found for a particular query does not reflect the real number of documents on the Net containing the query. For example, some SEs show in the search results not only the documents where all the words of the query are found, but also add to the list documents containing only single query words. These "tail" documents are by no means relevant to the query, but they certainly increase the total number of results.
c. Almost every frequent or semi-frequent query returns at least tens of thousands of results in all search engines. But the user is never allowed to see them all: the search session is cut off after the first hundreds of pages. This means that the exact number of web pages found can be obtained and verified only when there are tens of results, that is, for very rare words.
d. Index size is a very PR-sensitive issue, as it is one of the very few simple notions in the SE area that journalists understand easily. This means that the bigger the index size you report, the better press you get.
The situation gets even worse when one realizes that there are no standard methods for counting documents in an SE index, and that even the definition of a web document is vague and arguable. In order to obtain independent and more reliable data on the relative index sizes of popular search engines, we developed a simple automatic method based on a set of sample queries.
We gathered a set of very rare words, each of which occurs on the Net in several tens of instances. Then, once a day, we count how many of these instances a given search engine finds, and compare the data with the same data for the other SEs. Thus we obtain the comparison data. To make the data steadier, we use a sliding window of several sample queries from the whole sample set, so every day we use a different set of sample queries. Of course, this is only a very indirect estimate of the relative recall of a given set of SEs, and we understand all the limitations and drawbacks of this approach. But it is better than nothing, and far better than marketing bullshit from the PR departments of popular SEs. The set of sample queries is constantly replenished by our linguists. If you know more rare words and want to help us cover some 'rare' and 'faraway' areas of the Net that our test does not yet cover, please send us these words, and we will consider including them in the list of sample queries.
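The daily count could look roughly like this. The sketch assumes that the known instances of each rare word are available as sets of URLs, and it scales the counts so that the best engine gets 1.0. That scaling, the names and the sample data are illustrative assumptions, not the method actually used.

```python
def instances_found(known_by_word, found_by_word):
    """Count known instances of the rare words that the engine returned."""
    return sum(len(known & found_by_word.get(word, set()))
               for word, known in known_by_word.items())


def relative_recall(known_by_word, found_by_engine):
    """Found-instance counts per engine, scaled so the best engine is 1.0."""
    counts = {name: instances_found(known_by_word, found)
              for name, found in found_by_engine.items()}
    best = max(counts.values())
    return {name: (c / best if best else 0.0) for name, c in counts.items()}


# Hypothetical rare words with their known instances, and two engines' results.
known = {
    "rareword1": {"u1", "u2", "u3", "u4"},
    "rareword2": {"v1", "v2"},
}
found = {
    "engine_a": {"rareword1": {"u1", "u2", "u3", "zzz"}, "rareword2": {"v1", "v2"}},
    "engine_b": {"rareword1": {"u1", "u2"}},
}
recall = relative_recall(known, found)
```

Results that are not among the known instances (such as "zzz") are ignored, so an engine cannot raise its figure by returning extra "tail" documents.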
Update analyzer
An update of a search engine is the process of renewing its search results. Some sites make it into the top 10, others "sink". Every search engine has its own update style, which becomes apparent in the corresponding analyzer. The update analyzer monitors the top ten links for 140 queries daily to check how many sites changed their positions, and by how much.
The summary index of the update is calculated as follows.
Example 1
For Day 1, a certain query has the following Top 10: C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
For Day 2, the same query has this Top 10: Cn, C1, C2, C3, C4, C5, C6, C7, C8, C9.
Example 2
For Day 1, a certain query has the following Top 10: C1, C2, C3, C4, C5, C6, C7, C8, C9, C10.
For Day 2, the same query has this Top 10: Cn1, Cn2, Cn3, Cn4, Cn5, Cn6, Cn7, Cn8, Cn9, Cn10.
Let $N_{2i}$ be the position in the second Top 10 of the page that was $i$-th in the first Top 10. If the $i$-th page is absent from the second Top 10, then $N_{2i} = 11$. The update indicator is calculated using the formula:
$$\frac{1}{100} \sum_{i=1}^{10} \mathrm{Abs}(i - N_{2i})$$
where Abs is the absolute value of a number, with one exception: if $N_{2i} = 11$ (the $i$-th page from the first Top 10 did not make it into the second Top 10), the term $\mathrm{Abs}(i - N_{2i})$ is taken to be 10.
For the two examples above, this calculation method gives the following results:
Example 1: ((2-1)+(3-2)+(4-3)+...+(10-9)+10)/100 = 0.19 (19%)
Example 2: 10*10/100 = 1.00 (100%)
In addition, further parameters are calculated: the number of sites which disappeared from the search results and the number of sites which changed their positions.
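The calculation above can be checked with a short sketch. The function and variable names are ours. The rule that a missing page contributes 10 follows the definition in the text, and the two extra counters follow the "additional parameters" mentioned there.

```python
def update_stats(day1, day2, top=10):
    """Return (update indicator, pages that disappeared, pages that moved)."""
    day2_top = day2[:top]
    total = disappeared = moved = 0
    for i, page in enumerate(day1[:top], start=1):
        if page not in day2_top:
            total += top  # a page missing on Day 2 contributes 10
            disappeared += 1
            continue
        n2i = day2_top.index(page) + 1  # position on Day 2, counted from 1
        total += abs(i - n2i)
        if n2i != i:
            moved += 1
    return total / (top * top), disappeared, moved


day1 = [f"C{i}" for i in range(1, 11)]
example1 = ["Cn"] + day1[:9]                  # everything shifts down by one
example2 = [f"Cn{i}" for i in range(1, 11)]   # a completely new Top 10

indicator1, gone1, moved1 = update_stats(day1, example1)
indicator2, gone2, moved2 = update_stats(day1, example2)
```

For Example 1 this gives 0.19 with one page gone and nine moved, and for Example 2 it gives 1.0 with all ten pages gone, matching the figures in the text.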
This analyzer has no "valuation": the results can be interpreted in two ways. A search engine with frequent large updates can be considered more up-to-date; a search engine with rare updates can be considered more stable and predictable. The informer for this analyzer sorts the search engines in ascending order of update level.
Click analyzer
The click-share analyzer is not a "qualitative" analyzer; it only shows the popularity and usage of the search engines. It relies on data from Liveinternet.ru, so we only take into account clicks on sites that have a Liveinternet.ru counter installed.
This analyzer reflects the percentage of clicks from search engines to RuNet sites. It uses the statistics of the Liveinternet.ru counter on visits coming from search engines. A few comments are necessary here:
A. We take the slice of LiveInternet counter statistics for Russian users (Russian IP addresses). This filters out the noise produced by so-called "idiot clicks", i.e. random clicks by western users of "big" search engines such as Google, MSN Live Search and Yahoo. These are not really users of Russian search engines, but they can significantly distort the statistics (since the western Internet is vast, and the number of such random users is high).
B. These numbers are usually treated as the search engines' market shares, but this is not quite correct, for the following reasons:
a) The LiveInternet counter only shows clicks on the sites where it is installed. Some bigger sites do not install it. This can distort the statistics when they are extrapolated to the whole of RuNet.
b) The exact meaning of the percentage of clicks from a search engine, and its relation to the engine's true popularity, are unclear. We can assume that a search engine's traffic and its number of clicks correlate, but this is only an assumption. What if, using a "bad" search engine, the user has to click on multiple search results before finding the right site, while using a "good" one the user finds what is needed with the first click? The "bad" search engine would then generate many clicks per user while the "good" one would generate only one. In general, the exact connection between popularity and clicks is unknown, which is why we call this a technical parameter. A substantial change in the percentage of clicks (say, 5 to 10 points or more) would probably reflect a real change in a search engine's traffic. The same cannot be said about smaller (1-2%) fluctuations.
C. We must not forget that these figures do not represent absolute traffic or absolute numbers of clicks, but percentages (whose sum is constant and equals 100%). This produces the effect of mirror lows clearly visible on the monthly graph of Yandex, and the induced highs of Google: Yandex's traffic decreases on weekends, while Google's suffers much less (the reason is unknown to us). Since Yandex's share is high, its decrease results in a corresponding growth of Google's share on weekends (since the sum of all search engines' shares remains constant). For Rambler, the weekend decrease is just as pronounced, so its share does not rise the way Google's does.
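The mirror effect follows directly from the arithmetic of shares, as this small sketch with invented click counts shows. The numbers are hypothetical and chosen only so that Yandex and Rambler lose 20% of their clicks on the weekend while Google loses 5%.

```python
def shares(clicks):
    """Convert absolute click counts into percentages that sum to 100."""
    total = sum(clicks.values())
    return {name: 100.0 * n / total for name, n in clicks.items()}


# Hypothetical absolute click counts (not real LiveInternet data).
weekday = {"Yandex": 500, "Google": 300, "Rambler": 200}
weekend = {"Yandex": 400, "Google": 285, "Rambler": 160}

weekday_shares = shares(weekday)
weekend_shares = shares(weekend)
```

Google's absolute clicks fall on the weekend, yet its share rises, because the larger engines fall faster. The shares of Yandex and Rambler drop even though both lose the same fraction of their clicks.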
For this analyzer, the search engine with the most clicks is considered the best. This determines the order of search engines in the informer for this analyzer.