In part one of this article, we set out to document the little-known Ask web search API by providing background information. In this continuation, we’ll look at the actual API details.
Update: Ask disabled access to their API on 6 March 2007. We are working on obtaining additional information. Write us if you would like to be notified of further developments.
The following information was determined by observation and conjecture. Write us if you want to be notified when we update this page with more complete information. We are assuming the reader has already worked with REST queries and is familiar with parsing XML data.
Request URL
The request URL is formed by appending query parameters and their values to a base URL using the format parameter=value. Successive parameters are separated by an &.
Base URL: http://xml.teoma.com/e?
Request URL parameters should be URL encoded.
Before considering the data elements in detail, try a query using the Ask Web Search API. This example will return the first 20 results for “Italy”. If you have trouble viewing the XML results in your browser, try Firefox.
Unless indicated otherwise, Ask parameters, tag attributes and the tags themselves are suppressed when their values are 0 or empty.
Request Parameters
| Parameter | Value | Description |
|---|---|---|
| a | integer | ? |
| f | integer; default 0 | First result number (offset) to return. Default is 0. Used to “page” through results. |
| i | integer | ? |
| p | integer; default 0 | Treat query words as a phrase |
| t | string. required. | Query terms. |
| u | integer; default 10. Takes values 1 to 200. | Maximum number of results to return |
| y | integer | ? |
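As a minimal sketch of how a request URL could be assembled from the documented parameters (t, f, u and p), the following Python uses only the standard library. The function name `build_request_url` and the choice to omit parameters left at their defaults are our own assumptions, not part of the Ask API. Since access to the API has been disabled, the resulting URL is for illustration only.

```python
from urllib.parse import urlencode

# Base URL as given above; the service itself is no longer reachable.
BASE_URL = "http://xml.teoma.com/e?"


def build_request_url(terms, first=0, count=10, phrase=False):
    """Assemble a request URL from the documented parameters t, f, u and p.

    The function name and keyword arguments are hypothetical; only the
    single-letter parameter names come from the table above.
    """
    if not terms:
        raise ValueError("t (query terms) is required")
    if not 1 <= count <= 200:
        raise ValueError("u must be between 1 and 200")
    params = {"t": terms}
    # Assumption: parameters left at their defaults can simply be omitted.
    if first:
        params["f"] = first
    if count != 10:
        params["u"] = count
    if phrase:
        params["p"] = 1
    # urlencode takes care of URL encoding the values.
    return BASE_URL + urlencode(params)


print(build_request_url("Italy", count=20))
```

The unknown parameters a, i and y are deliberately left out, since their meaning has not been determined.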
Ask advanced query parameter modifiers
The following advanced query modifiers are documented for general use with Ask search queries. They appear to work with the web services API as well.
| Parameter | Value | Description |
|---|---|---|
| site: | string, e.g. domain name | Restrict query to specific site. Must be used with another query term. To query an entire site, use site:www.mydomain.com inurl:www.mydomain.com. |
| intitle: | string | limit query to page titles containing text |
| inurl: | string | limit query to page URLs |
| lang: | language code. (NL Dutch; EN English; FR French; DE German; IT Italian; PT Portuguese; ES Spanish) | limit query to specific language. |
| geoloc: | Region code. (CA Central America; EU Europe; IA India / Asia; NA North America; OC Oceania; SA South America) | limit query to specific geographic region. We noticed that Ask’s spelling system is not aware of geoloc as a query parameter. Ask suggests an alternate spelling! |
| inlink: | string | limit query to links containing a string. |
| last: | week, 2weeks, month, 3months, 6months, year, 2years | limit query to pages indexed in a specific time frame. |
| afterdate: | afterdate:yyyymmdd | limit query to pages modified after date e.g. afterdate:20061015 |
| beforedate: | beforedate:yyyymmdd | limit query to pages modified before date, e.g. beforedate:20060312 |
| betweendate: | betweendate:yyyymmdd,yyyymmdd | limit query to pages modified during a date range, e.g. betweendate:19960115,20030412 |
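The modifiers in the table are part of the query text itself, so they would travel inside the t parameter. The sketch below shows one way to compose such a query in Python. The helper name `with_modifiers` is hypothetical, and we assume modifiers are separated from the query terms by spaces, as they are when typed into the Ask search box.

```python
from urllib.parse import urlencode


def with_modifiers(terms, **modifiers):
    """Append modifier:value pairs (e.g. site, lang, last) to the query terms.

    Hypothetical helper: the modifier names come from the table above,
    the function itself is only an illustration.
    """
    parts = [terms] + [f"{name}:{value}" for name, value in modifiers.items()]
    return " ".join(parts)


query = with_modifiers("rolling stones", site="bbc.co.uk", lang="EN")
print(query)

# The whole string, modifiers included, is URL encoded as the value of t.
print("http://xml.teoma.com/e?" + urlencode({"t": query}))
```

Note that URL encoding turns the colon into %3A and each space into +.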
Response Fields
The response is delivered as an XML 1.0 document using UTF-8 encoding. The entire response is wrapped in a SEARCHRESULTS tag. Tags may contain data, attributes or both.
| Tag | Attributes | Value | Description |
|---|---|---|---|
| RESULTSET | | | Wrapper tag for both topic and web search result sets. |
| QUERY | | string | Contains the processed query string. |
| ESTIMATERESULT | | integer | Estimate of total matching results in Ask. |
| TOTALRESULT | | integer | Total results available for queries. Maximum of 200. |
| FIRSTRESULT | | integer | 0-based offset, used for looping through “pages” of results. |
| NUMRESULTS | | integer | Count of results in result set. Default is 10; maximum is TOTALRESULT, i.e. 200. |
| SORT | | string; default: rank | Sort order. Other values? |
| MORERESULTS | | string; true or false | |
| STOPWORDS | | string | List of stop words (e.g. the, a, for) filtered from query. |
| | COUNT | integer | Number of stop words excluded from query. |
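Because at most 200 results are available and the offset is 0 based, paging comes down to simple arithmetic on the f and u request parameters. The following Python sketch lists the f values needed to walk through a result set. The function name `page_offsets` is our own, and the 200 default simply reflects the TOTALRESULT maximum noted above.

```python
def page_offsets(page_size=10, total_available=200):
    """Return the f values needed to walk through all available results.

    page_size corresponds to the u parameter; total_available would come
    from TOTALRESULT in a real response (never more than 200).
    """
    if not 1 <= page_size <= 200:
        raise ValueError("u must be between 1 and 200")
    return list(range(0, total_available, page_size))


print(page_offsets(50))      # [0, 50, 100, 150]
print(page_offsets(20, 45))  # [0, 20, 40]
```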
Topics
In the relentless pursuit of relevant results, search engines strive to find patterns and relationships in the relatively unstructured data that is the web. Most search engines are able to group similar results from the same site, displaying just one or two results with an option to see more results from the same site. This grouping is called site level topic clustering.

The Teoma search engine, integrated into the current Ask, was one of the first to apply topic clustering to the greater web. Teoma determined clusters by identifying communities, that is, web pages which link to each other, and automatically named the clusters based on common phrases appearing in the group. The clustering was dynamic, performed for each query (although presumably search engines cache popular queries).
At one point, Teoma displayed topic clusters as folders above the main search results under the heading Web Pages Grouped by Topic. Later they were called “Refine”. Users could click on a topic to refine their search by organizing results into specific sub-topic categories based on the search query.
Topics are not currently displayed in Ask search query results. Fortunately, this information is still available via the web services API.
At most 10 topics are returned.
Topic Response Header
| Tag | Attributes | Value | Description |
|---|---|---|---|
| TOPICGROUPS | NUMGROUPS | integer; 1 to 10 | Number of topic clusters returned. |
| | MORERESULTS | string; true or false | More results are available than those returned? |
Topic Response Details
| Tag | Attributes | Value | Description |
|---|---|---|---|
| TOPICGROUP | ID | integer, 0 based | Detail item wrapper tag. Contains item id, an integer starting at 0. |
| NAME | | string | Contains keyword or keyword phrase used to name topic group. |
| | URL | string; partial query URL | A concatenation of t= and the words appearing in name, each separated with “+” as required to create a URL |
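The topic URL value described above can be reproduced from a topic name, which is handy for checking what a follow-up query would look like. This is a small Python sketch under that description. The function name `topic_query` and the sample topic name are made up for illustration.

```python
from urllib.parse import quote_plus

# Base URL as given earlier in this article.
BASE_URL = "http://xml.teoma.com/e?"


def topic_query(name):
    """Build the partial query URL for a topic: t= plus the name's words joined by +."""
    return "t=" + quote_plus(name)


# "italy travel guide" is a made-up topic name, not taken from a real response.
print(BASE_URL + topic_query("italy travel guide"))
```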
Web Search
Web Search Response Header
| Tag | Attributes | Value | Description |
|---|---|---|---|
| PREVWEBPAGE | | string | Tag appears if prior “pages” of results are available. Contains query string to append to base URL which will return previous results. |
| | A | integer; default 1. | ? |
| | F | integer | First result number (offset) to return in the next result set. Is the sum of the current first result ID and the number of records requested. |
| | P | integer; default: 0. 1 if phrase. | Treat query words as a phrase for proximity searching. |
| | U | integer; default 10. Values 1 to 200. | Number of records requested / to request. |
| MOREWEBPAGE | | string | Tag appears if more “pages” of results are available. Contains query string to append to base URL which will return next results. |
| | A | integer; default 1. | ? |
| | F | integer | First result number (offset) to return in the next result set. Is the sum of the current first result ID and the number of records requested. |
| | P | integer; default: 0. 1 if phrase. | Treat query words as a phrase for proximity searching. |
| | U | integer; default 10. Values 1 to 200. | Number of records requested / to request. |
| RELATED | | string | Query string to perform related query |
| | A | integer; default 1. | ? |
| | I | integer; default 0. Not suppressed as with other attributes. | ? |
| | Y | integer; default 1. | ? |
| WEBPAGE | | | Wrapper tag for web search results detail records |
Web Search Response Details
| Tag | Attributes | Value | Description |
|---|---|---|---|
| RESPONSE | ID | integer | 0 based offset |
| SCORE | | decimal; values 0.01 to 1.00 | Site rank for query |
| PARTITION | | x,y integer pair | ? …worth some reflection. |
| IND | | integer, tag not present or 1 | Appears to be a presentation tag, meaning “indent”. Appears when a site cluster result is present. |
| DOCTYPE | | string, default: tag not present. PDF appears to be the only currently used value. | MIME type. Ask officially supports Flash (swf) and PDF documents in addition to standard text. In reality, Flash documents are not highlighted in Ask search results, nor is the DOCTYPE tag populated. Ask does not yet support a doctype search filter, although it was promised in the past. inurl:pdf or inurl:swf work to a degree as a workaround. |
| TITLE | | string (truncated after the first ~65 characters) | Document title |
| URL | | string | Document URL |
| ABSTRACT | | string (up to ~140 characters) | Document abstract. Based on user query. |
| SITE | | string, default not present. Value is a URL | More results from current site. Present when IND is 1; is the URL to query the site for the given keywords |
| CACHED | KEY | string; e.g. 00*knpldsckmkez | Teoma 3.0 introduced cached versions of “popular” sites. A document may not be cached if the site is not popular or the document has used the robots “noarchive” meta tag. |
| | URL | string | Complete URL to access cached document. |
Teoma Experts’ Links / Resources
Teoma offered Expert Links, later called Resources, which we have not encountered in the XML API.
Expert links are Web sites created by individual enthusiasts, or “fans”, containing lists of resources relevant to the search topic. For example, an amateur golfer might have created a page devoted to his personal collection of favorite golfing sites. Teoma’s expert identification technology discovers and presents these pages as “Expert Links.”
Query the Ask Web Search API with our Perl example program
We have created a sample Perl program to query Ask’s Web Search API, saving the returned XML file and placing the results in a spreadsheet file for analysis. Note that this program is only a sample; it is not intended for production use. Be kind to Ask: don’t kill their servers with incessant automated queries.
Download and uncompress ask-search-1.0.pl.tgz. Read the licence terms and warnings at the beginning of the program. Note that the name of this file will change. Link to this page, not the file! To use, type
perl ask-search.pl <query terms>
For example:
perl ask-search.pl the rolling stones site:bbc.co.uk
In this case, Ask will ignore “the”, a stop word. All sites with the domain suffix bbc.co.uk will be queried for rolling stones. Three files will be created:
- ask-query-response.xml - contains the raw results from Ask, in XML.
- ask-query-results.txt - contains a processed view of the XML, as produced by the Perl module XML::Simple
- ask-search.xls - contains the query results in an Excel format spreadsheet. The first tab contains the search results. The second tab contains up to 10 topics.
Known Limitations
- “Encyclopedia” entries, such as those from Wikipedia, are not integrated in this data.
- “Narrow your search” options, as in the current Ask.com interface, are not the same as the topic group options.
- Sponsored ads are not present in the XML data.
- The binocular preview option URL is not present.
Unfortunately Ask’s current crawling frequency of web sites appears to be much lower than Ask’s competitors Google, Yahoo! and Microsoft’s MSN Search and Windows Live. Consequently, the base data processed by Ask’s algorithms is more likely to be out of date, clearly impacting the quality of search results.
Ask Maintains Separate Regional Search Engines?
Our initial analysis indicates that the results available from xml.teoma.com are the same as those from the US version of Ask, www.ask.com. It appears that localized versions of Ask, such as Ask Italia, are probably using a different underlying database. We see richer results for Italian sites in Ask Italia compared to Ask.com. With other search engines, a simple modification of language and/or region code, in this case lang and geoloc, usually solves the problem. Not with Ask. We also note the lack of an advanced query options page for Ask Italia.
External Resources
Last updated: 2009-06-17