Posts Tagged ‘HTML’

A Benchmark Comparison Of Content Extraction From HTML Pages

Introduction

Content extraction is the task of separating boilerplate, such as comments, navigation bars, social media links, and ads, from the main body text of an article formatted as HTML. The main content typically accounts for only a small portion of a page’s source code (highlighted in red in the image below). Extraction is…
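To make the task concrete, here is a minimal sketch of tag-based content extraction using only Python's standard-library `html.parser`. It is not the method benchmarked in the post. The set of tags treated as boilerplate, the class name `MainTextExtractor`, and the helper `extract_main_text` are all assumptions made for illustration. Real extractors rely on far richer signals than tag names.

```python
from html.parser import HTMLParser

# Tags treated as boilerplate in this sketch (an illustrative assumption,
# not a rule taken from the post).
BOILERPLATE_TAGS = {"nav", "header", "footer", "aside", "script", "style", "form"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside a boilerplate tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a boilerplate element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text found outside boilerplate elements.
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Hypothetical helper: return the non-boilerplate text of a page."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = """
<html><body>
  <nav><a href="/">Home</a> <a href="/blog">Blog</a></nav>
  <article><h1>Title</h1><p>Main body of the article.</p></article>
  <footer>Contact us</footer>
</body></html>
"""
print(extract_main_text(page))  # prints "Title Main body of the article."
```

A fixed tag list like this fails on pages that build navigation and ads out of generic `div` elements, which is one reason extraction quality varies enough between tools to be worth benchmarking.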


Our mission

Skim’s mission is to empower people to use data more effectively and to demystify artificial intelligence. Rather than repeating the common narrative of machines replacing humans, we focus on how machines can help people live easier lives and run better businesses.


Contact

London office
27 Finsbury Circus,
London EC2M 5NT

Portugal office
R. de Cândido dos Reis 81,
4050-152 Porto, Portugal

+44 207 129 7497
sales@skimtechnologies.com
