Automate web content collection

A company specializing in handling equipment wants to build an "argus" (a reference price guide) from second-hand listings on specialized sites around the world. Data from some of these sites has already been retrieved, standardized and consolidated into a single file.

Some attributes are already extracted, but identifying the equipment still requires a business operator, because the same model can be named in several different ways. Without a standard reference list, models are therefore difficult to identify.

Modules used

From the web to standardized database content

The solution

The scraping module retrieves ad content from all the specialized sites. The function module standardizes equipment names according to the company's own reference system. The results are loaded into a database, which is used to build the "argus" and track prices daily.
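As an illustration only, the standardization and loading steps could look something like the sketch below. It is not the actual function module: the reference list, the model names, the `prices` table and the helpers `standardize` and `store` are all hypothetical, and fuzzy matching with Python's standard `difflib` is simply one plausible way to map free-text names onto a reference list.

```python
import difflib
import re
import sqlite3
from typing import Optional

# Hypothetical reference list; the company's real reference system is not shown here.
REFERENCE_MODELS = ["ACME FL-2500", "Borel T15-E", "Norvik X40"]


def _key(name: str) -> str:
    """Lowercase the name and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Map each simplified key back to its official reference name.
_KEY_TO_MODEL = {_key(model): model for model in REFERENCE_MODELS}


def standardize(raw_name: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest reference model, or None if nothing is close enough."""
    matches = difflib.get_close_matches(
        _key(raw_name), list(_KEY_TO_MODEL), n=1, cutoff=cutoff
    )
    return _KEY_TO_MODEL[matches[0]] if matches else None


def store(conn: sqlite3.Connection, ads) -> None:
    """Insert (raw_name, price, date) ads under their standardized model name.

    Ads that match no reference model are stored with a NULL model,
    so they can be picked out later for manual review.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (model TEXT, price REAL, seen_on TEXT)"
    )
    rows = [(standardize(name), price, day) for name, price, day in ads]
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    conn.commit()
```

In this sketch, spelling variants such as "acme fl 2500" and "ACME FL2500" both resolve to the same reference entry, while a name that matches nothing closely enough is left unlabeled rather than guessed.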

Source data and workflow




Web collect


Return on investment

Automated data collection, cleansing and standardization.

Automated information gathering from a wide variety of sources

Information cleansing, standardization, control and consolidation

Automatic labeling of records via a Machine Learning module

Reduced risks associated with human intervention
