Targeted Crawlers

We have perfected the art of crawler writing.

A targeted crawler extracts information from web content and stores the data in a structured format.

For example, the temperature of each city on each day is available on the internet; however, you cannot analyze temperature distributions or trends until all of this data is captured in a single database.

In this case, a targeted crawler could automate the collection of data from sites such as weather.com and automatically save the temperature by date and city. Our intelligent crawler would learn the structure of the target website and automatically capture data for all of the cities. Our targeted crawlers are even capable of grabbing other parameters, such as region or country, and storing them in a properly structured hierarchy. Now, with a detailed historical temperature database, you can do all kinds of analysis!
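As a rough illustration of the idea, the sketch below parses a small page fragment and saves each reading by country, city and date. The markup, the attribute names (data-city, data-country, data-date) and the table layout are all assumptions made up for this example; a real site would need its own extraction rules, and this is not a description of any particular production crawler.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment; real sites use different markup.
SAMPLE_HTML = """
<table>
  <tr class="reading" data-city="Toronto" data-country="Canada" data-date="2024-01-15"><td class="temp">-4.5</td></tr>
  <tr class="reading" data-city="Lima" data-country="Peru" data-date="2024-01-15"><td class="temp">24.0</td></tr>
</table>
"""

class ReadingParser(HTMLParser):
    """Collects (country, city, date, temperature) tuples from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None    # attributes of the <tr> being parsed
        self._in_temp = False   # True while inside a <td class="temp">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr" and attrs.get("class") == "reading":
            self._current = attrs
        elif tag == "td" and attrs.get("class") == "temp" and self._current:
            self._in_temp = True

    def handle_data(self, data):
        if self._in_temp:
            row = self._current
            self.rows.append(
                (row["data-country"], row["data-city"], row["data-date"], float(data))
            )
            self._in_temp = False

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_temp = False
        elif tag == "tr":
            self._current = None

def store(rows, conn):
    """Save readings keyed by country, city and day (assumed schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS temperature (
               country TEXT, city TEXT, day TEXT, celsius REAL,
               PRIMARY KEY (country, city, day))"""
    )
    conn.executemany("INSERT OR REPLACE INTO temperature VALUES (?, ?, ?, ?)", rows)
    conn.commit()

parser = ReadingParser()
parser.feed(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")
store(parser.rows, conn)
```

Because the primary key covers country, city and day, re-running the crawler on the same page overwrites existing readings instead of duplicating them.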

In addition to downloading information from the public domain, crawlers also work as general purpose data transformers. We have developed crawlers that convert data among various database systems, Excel spreadsheets, XML feeds and plain text files. These data conversions often involve schema transformations, too.
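A conversion with a schema transformation can be as simple as renaming fields while changing the container format. The sketch below turns CSV text into an XML feed; the column names, the element names and the FIELD_MAP mapping are hypothetical and exist only for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from source column names to target element names.
FIELD_MAP = {"City Name": "city", "Temp (C)": "celsius", "Date": "day"}

def csv_to_xml(csv_text):
    """Convert CSV rows into <reading> elements, renaming fields on the way."""
    root = ET.Element("readings")
    for record in csv.DictReader(io.StringIO(csv_text)):
        node = ET.SubElement(root, "reading")
        for source, target in FIELD_MAP.items():
            ET.SubElement(node, target).text = record[source].strip()
    return ET.tostring(root, encoding="unicode")

sample = "City Name,Temp (C),Date\nToronto,-4.5,2024-01-15\n"
xml_text = csv_to_xml(sample)
```

Keeping the mapping in a plain dictionary means the same function can serve a different source layout by swapping the mapping, without touching the conversion logic.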

Writing a crawler can be challenging; it is often regarded as a black art. Websites are generated by many different server-side technologies such as JSP, .NET or PHP frameworks. In addition to having an intimate knowledge of the inner workings of each technology, crawler writers must also reverse-engineer the original author's intent.

A crawling system typically consists of the following components:

Content Fetching
Content Scraping
Browser Action Emulation
Authentication
Site Traversal
Storage Structuring
Entity Reconciliation
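To make two of these components concrete, the sketch below combines content fetching and site traversal in a breadth-first walk that stays on the starting host. The fetch function is passed in as a parameter, and here it is backed by a made-up in-memory site (FAKE_SITE on example.test) so that the example runs without network access; a real crawler would plug in an HTTP client and add politeness delays, error handling and authentication.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first traversal limited to the start URL's host.

    `fetch` is any callable that maps a URL to its HTML, or None on failure.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

# Made-up site used in place of real HTTP requests.
FAKE_SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="http://other.test/">ext</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="b">B</a>',
    "http://example.test/b": "leaf page",
}
pages = crawl("http://example.test/", FAKE_SITE.get)
```

The seen set prevents the crawler from looping on pages that link back to each other, and the host check keeps it from wandering onto external sites.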

Antradar has gained tremendous experience in the above areas by delivering a portfolio of ambitious data projects. We have extracted information from major retail sites, classifieds, knowledge databases and advertising networks. Our work has played a vital role in helping our clients understand their target markets, bootstrap core business data, identify and avert risks, and gain competitive intelligence.

When we deliver crawlers, we also give our customers the sustained ability to acquire and consume external information.