Blog

Linked Data Now!

23 Aug 2011

In February 2009 Tim Berners-Lee, renowned as the inventor of the World Wide Web, announced his vision of the next phase of the web. His original concept grew from the fact that there were mountains of documents stored on people’s computers which contained vital information. The web became the place to publish, share and find those documents.

In the last ten years it’s developed into a much more interesting place to be. Technology and some very insightful people have created what has been dubbed Web 2.0, a place where documents have become interactive, contributory, and in a huge range of formats. But the internet is now a huge library of things without any kind of categorisation, structure or definition.

Granted, organisations like Google have made amazing efforts to help us find the information we need immediately, but a search result is still only a best guess. The secret to doing better is having data about the documents we’re publishing online.

Tim Berners-Lee’s vision of the next phase of the web aims to solve this. In essence he wants us to publish any data we have online in well-structured, well-defined formats which are recognised globally. He’s called it the Linked Data web, and he set out the idea in his talk at TED.

So, that’s the science out of the way. How can this benefit you and drive traffic to your site?

GoodRelations – The most powerful Web vocabulary for e-commerce
The GoodRelations vocabulary enables web shops to publish their products automatically on the Google Shopping site as well as on a host of RDFa-aware search engines, shopping comparison sites, and mobile services.

Effectively, the data is scraped by Google and republished in a structured format, at google.com/shopping for instance.
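
To give a flavour of what that structured data looks like, here is a minimal sketch in Python that renders a single offer as an RDFa snippet. The product, the price and the `offer_to_rdfa` helper are made up for illustration. The vocabulary terms are taken from the GoodRelations namespace, but do check them against the official GoodRelations documentation before publishing anything.

```python
from html import escape

# Namespaces for the GoodRelations vocabulary and XML Schema datatypes.
GR_NS = "http://purl.org/goodrelations/v1#"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def offer_to_rdfa(name: str, price: float, currency: str) -> str:
    """Render one product offer as an RDFa snippet (hypothetical helper)."""
    price_text = f"{price:.2f}"  # always two decimal places
    return (
        f'<div xmlns:gr="{GR_NS}" xmlns:xsd="{XSD_NS}" typeof="gr:Offering">\n'
        f'  <span property="gr:name">{escape(name)}</span>\n'
        f'  <div rel="gr:hasPriceSpecification">\n'
        f'    <div typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency" content="{escape(currency)}"></span>\n'
        f'      <span property="gr:hasCurrencyValue" datatype="xsd:float">{price_text}</span>\n'
        f'    </div>\n'
        f'  </div>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Example product and price are invented for the demonstration.
    print(offer_to_rdfa("Tea & Biscuits", 4.5, "GBP"))
```

A shop would typically generate a block like this from its product database and embed it in each product page, so that the same page serves both shoppers and RDFa-aware crawlers.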

Web shops such as Amazon and Play.com currently use this service. Setting it up can be a really simple process, and it can drive a significant amount of traffic to your site. For more information, please contact us.


Monitor Changes – Keep up to date with changes on a website

7 Mar 2011

Our industry-leading data mining techniques can scrape web content from any site you need to monitor and let you know when something has changed on the page. Here are the top three uses for an application which can monitor the activity on a page.

1. Price Comparison
Perhaps you’ve got a vested interest in knowing when a competitor changes the price of a product you both sell. We can monitor each competitor’s page for a particular product and send you an email notification when something changes.
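
As a rough illustration of the idea, the sketch below pulls a price out of a page’s HTML and compares it with the last price seen. It is a simplified example under stated assumptions, not a description of our production system: the pattern assumes prices are written with a pound sign, and the function names are hypothetical.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed format: a pound sign followed by digits, e.g. £19.99 or £1,299.99
PRICE_RE = re.compile(r"£\s*(\d[\d,]*(?:\.\d{2})?)")


def extract_price(html: str) -> Optional[Decimal]:
    """Return the first price found in the page, or None if there is none."""
    match = PRICE_RE.search(html)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


def describe_change(previous: Optional[Decimal], html: str) -> Optional[str]:
    """Return a notification message if the price differs from the last one seen."""
    current = extract_price(html)
    if current is None or current == previous:
        return None  # nothing found, or nothing changed
    if previous is None:
        return f"Price first seen at {current}"
    return f"Price changed from {previous} to {current}"


if __name__ == "__main__":
    # The HTML here is an invented stand-in for a fetched competitor page.
    print(describe_change(Decimal("19.99"), '<span class="price">£17.49</span>'))
```

In practice the returned message would be handed to whatever sends the email notification, and the new price stored for the next comparison.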

2. Keeping a directory up to date
If you’ve got a large marketing database, a directory of suppliers, a database of clients, or even just an .xls spreadsheet or Microsoft Access file of data, you’ll understand the time it takes to keep it up to date.

We can make that nice and simple. If there’s a website linked to each record in your database, we can return to the website and check that the contact details in the record match those on the site. If something has changed we’ll reflect that in your database, keeping everything up to date and freeing up your time for more important things.
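
The check itself can be sketched in a few lines of Python. This is an illustrative example only: the record layout (`phone` and `email` keys) and the function names are assumptions, and the contact details are invented.

```python
import re
from typing import Dict, List


def normalise_phone(raw: str) -> str:
    """Keep digits only, so differences in spacing and brackets are ignored."""
    return re.sub(r"\D", "", raw)


def find_stale_fields(record: Dict[str, str], page_text: str) -> List[str]:
    """List the fields of a record that no longer appear on the linked page."""
    stale = []
    # Crude but simple: compare against every digit on the page run together.
    # A stricter version would extract phone numbers one at a time.
    page_digits = normalise_phone(page_text)
    phone = normalise_phone(record.get("phone", ""))
    if phone and phone not in page_digits:
        stale.append("phone")
    email = record.get("email", "").lower()
    if email and email not in page_text.lower():
        stale.append("email")
    return stale


if __name__ == "__main__":
    record = {"phone": "0121 496 0000", "email": "info@example.com"}
    page_text = "Call 0121 496 0999 or email sales@example.com"
    print(find_stale_fields(record, page_text))
```

Any field reported as stale would then be queued for review or updated from the details found on the site.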

3. Reusing a website’s data on your site
Re-publishing someone else’s data can be a great way to keep your own site up to date, but if there are no RSS feeds, how can you grab the data?

We can parse the website once a day looking for new content, and if new articles have been published or new pages have been added we can then publish them on your site.
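
Spotting new pages can be as simple as comparing the links found today with the links seen on earlier visits. Here is a minimal sketch using only Python’s standard library. The class and function names are hypothetical, and a real crawler would also fetch each page, resolve relative URLs and filter out navigation links.

```python
from html.parser import HTMLParser
from typing import List, Set


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def new_links(html: str, seen: Set[str]) -> List[str]:
    """Return links in the page that were not seen before, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    fresh: List[str] = []
    for href in collector.links:
        if href not in seen and href not in fresh:
            fresh.append(href)
    return fresh


if __name__ == "__main__":
    # Invented page content standing in for today's download.
    page = '<a href="/news/1">One</a><a href="/news/2">Two</a>'
    print(new_links(page, seen={"/news/1"}))
```

Each link returned would then be fetched, cleaned and published, and added to the set of links already seen.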

Screen Scraping sites with JavaScript

28 Jan 2011

Most web scraping systems on the market can’t function on websites that require JavaScript to view them. As the web has developed, people have demanded more and more interactivity from sites, and often the simplest way of achieving this is to deliver large chunks of content using technologies such as JavaScript, iframes, DHTML, or Flash.

Even search engines such as Google can’t spider large portions of websites that are built with these technologies, so a lot of content is discounted.

In contrast, our industry-leading web scraping framework enables us to work around the issues that these technologies present, so we can get content that most systems cannot handle.

Our screen scraping technology can spider over sites and grab content which is dynamically created using client-side scripts such as JavaScript. Whether you require statistics from a website which updates periodically using JavaScript, want to grab content from a website built entirely in JavaScript, jQuery or MooTools, or require content from a site which presents a barrier to server-side scripts, our system can help.
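
One common way to reach script-driven content without running a browser is to read the data a page ships inside its `<script>` tags. The sketch below assumes a page that embeds its data as JSON in a `<script type="application/json">` element. That layout and the function name are assumptions for illustration, and this is one general technique rather than a description of our framework. Pages that build their content entirely at runtime generally need a real browser engine instead.

```python
import json
import re
from typing import Any, List

# Matches the body of every <script type="application/json"> element.
SCRIPT_JSON_RE = re.compile(
    r'<script[^>]*type="application/json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_embedded_json(html: str) -> List[Any]:
    """Return every JSON payload embedded in the page's script tags."""
    results = []
    for body in SCRIPT_JSON_RE.findall(html):
        try:
            results.append(json.loads(body))
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
    return results


if __name__ == "__main__":
    # The visible <div> is empty until a script fills it in the browser,
    # but the data is already present in the HTML that was downloaded.
    page = (
        '<div id="stats"></div>'
        '<script type="application/json" id="data">{"visitors": 1200}</script>'
    )
    print(extract_embedded_json(page))
```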

Please contact us to find out more about how we can help you get the data from any site, clean it and make it ready for republishing.

Click here now to contact us for a no-obligation quote

Tel: +44 (0) 121 572 6472
