What Is Data Scraping?
Data scraping, also known as web scraping, is the process of importing information from a website into a spreadsheet or local file saved on your computer. It’s one of the most efficient ways to get data from the web, and in some cases to channel that data to another website. Popular uses of data scraping include:
- Research for web content/business intelligence
- Pricing for travel booker sites/price comparison sites
- Finding sales leads/conducting market research by crawling public data sources (e.g. Yell and Twitter)
- Sending product data from an e-commerce site to another online vendor (e.g. Google Shopping)
And that list’s just scraping the surface. Data scraping has a vast number of applications: it’s useful in just about any case where data needs to be moved from one place to another.
The basics of data scraping are relatively easy to master. Let’s go through how to set up a simple data scraping action using Excel.
Data scraping with dynamic web queries in Microsoft Excel
Setting up a dynamic web query in Microsoft Excel is an easy, versatile data scraping method that enables you to set up a data feed from an external website (or multiple websites) into a spreadsheet.
Watch this excellent tutorial video to learn how to import data from the web to Excel or, if you prefer, use the written instructions below:
- Open a new workbook in Excel
- Click the cell you want to import data into
- Click the ‘Data’ tab
- Click ‘Get external data’
- Click the ‘From web’ icon
- Note the little yellow arrows that appear at the top-left of the web page and alongside certain content
- Paste the URL of the web page you want to import data from into the address bar (we recommend choosing a page where data is shown in tables)
- Click ‘Go’
- Click the yellow arrow next to the data you wish to import
- Click ‘Import’
- An ‘Import data’ dialogue box pops up
- Click ‘OK’ (or change the cell selection, if you like)
If you’ve followed these steps, you should now be able to see the data from the website set out in your spreadsheet.
The great thing about dynamic web queries is that they don’t just import data into your spreadsheet as a one-off operation. They feed it in, meaning the spreadsheet is regularly updated with the latest version of the data as it appears on the source website. That’s why we call them dynamic.
To configure how regularly your dynamic web query updates the data it imports, go to ‘Data’, then ‘Properties’, then select a frequency (“Refresh every X minutes”).
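For readers who would rather script this than click through Excel, the same idea (pull a table out of a web page and save it in spreadsheet form) can be sketched in Python. This is only a minimal illustration using the standard library; the `TableParser` class, the `html_table_to_csv` function and the sample HTML are all made up for this example, and the sketch assumes simple, non-nested tables.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by row and table."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html, table_index=0):
    """Return the chosen table from an HTML string as CSV text."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.tables[table_index])
    return buffer.getvalue()


# Hypothetical page content, inlined so the sketch runs without a network call.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

To work on a live page you would first download its HTML (for instance with `urllib.request.urlopen`) and pass the decoded text to `html_table_to_csv`. Running the script on a schedule would play the role of Excel’s refresh interval.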
Automated data scraping with tools
Getting to grips with using dynamic web queries in Excel is a useful way to build up an understanding of data scraping. However, if you intend to use data scraping regularly in your work, you may find a dedicated data scraping tool more effective.
Here are our thoughts on a few of the most popular data scraping tools on the market:
Data Scraper slots straight into your Chrome browser extensions, allowing you to choose from a range of ready-made data scraping “recipes” to extract data from whichever web page is loaded in your browser.
This tool works especially well with popular data scraping sources like Twitter and Wikipedia, as the plugin includes a greater variety of recipe options for such sites.
We tried Data Scraper out by mining a Twitter hashtag, “#journorequest”, for PR opportunities, using one of the tool’s public recipes. Here’s a flavour of the data we got back:
As you can see, the tool has provided a table with the username of every account which had posted recently on the hashtag, plus their tweet and its URL.
Having this data in this format would be more useful to a PR rep than simply seeing the data in Twitter’s browser view, for a number of reasons:
- It could be used to help create a database of press contacts
- You could keep referring back to this list and easily find what you’re looking for, whereas Twitter continuously updates
- The list is sortable and editable
- It gives you ownership of the data, which could be taken offline or changed at any moment
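As a rough illustration of the “sortable and editable” point, here is a small Python sketch that turns an exported table of usernames, tweets and URLs into a de-duplicated, alphabetically sorted contact list. The column names, the `build_contact_list` function and the placeholder rows are assumptions made for this example, not the tool’s actual export format.

```python
import csv
import io

# Hypothetical export: placeholder rows standing in for scraped tweets.
SCRAPED_CSV = """username,tweet,url
journalist_b,Looking for retail experts,https://example.com/status/2
journalist_a,Need a quote on pricing,https://example.com/status/1
journalist_b,Still looking for retail experts,https://example.com/status/3
"""


def build_contact_list(csv_text):
    """Return one row per username, sorted by name, keeping the first tweet seen."""
    contacts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        contacts.setdefault(row["username"], row)
    return [contacts[name] for name in sorted(contacts)]


for contact in build_contact_list(SCRAPED_CSV):
    print(contact["username"], contact["url"])
```

Because the data now lives in your own file, you can re-sort it, annotate it or merge it with other lists at any time.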
We’re impressed with Data Scraper, even though its public recipes are sometimes slightly rough around the edges. Try installing the free version on Chrome, and have a play around with extracting data. Be sure to watch the intro video they provide to get an idea of how the tool works and some simple ways to extract the data you want.
How are marketers using data scraping?
As you will have gathered by this point, data scraping can come in handy just about anywhere information is used. Here are some key examples of how the technology is being used by marketers:
Gathering disparate data
“The spectrum of use cases for this is infinite.”
FeedOptimise offers a wide variety of data scraping and data feed services, which you can find out about at their website.
Try finding a list of useful contacts on Twitter, and import the data using data scraping. This will give you a taste of how the process can fit into your everyday work.
Outputting an XML feed to third party sites
“Data scraping can output your XML feed for Google Shopping,” says Target Internet’s Marketing Director, Ciaran Rogers. “I have worked with a number of online retailers who were continually adding new SKUs to their site as products came into stock. If your e-commerce solution doesn’t output a suitable XML feed that you can hook up to your Google Merchant Centre so you can advertise your best products, that can be an issue. Often your latest products are potentially the best sellers, so you want to get them advertised as soon as they go live. I’ve used data scraping to produce up-to-date listings to feed into Google Merchant Centre. It’s a great solution, and actually, there is so much you can do with the data once you have it. Using the feed, you can tag the best converting products on a daily basis so you can share that information with Google Adwords and ensure you bid more competitively on those products. Once you set it up, it’s all fairly automated. The flexibility of a good feed that you control in this way is great, and it can lead to some very definite improvements in those campaigns, which clients love.”
It’s possible to set up a simple data feed into Google Merchant Centre for yourself. Here’s how it’s done:
How to set up a data feed to Google Merchant Centre
Using one of the techniques or tools described previously, create a file that uses a dynamic website query to import the details of products listed on your site. This file should automatically update at regular intervals.
- Upload this file to a password-protected URL
- Go to Google Merchant Centre and log in (make sure your Merchant Centre account is properly set up first)
- Go to Products
- Click the plus button
- Enter your target country and create a feed name
- Select the ‘scheduled fetch’ option
- Add the URL of your product data file, along with the username and password required to access it
- Select the fetch frequency that best matches your product upload schedule
- Click Save
- Your product data should now be available in Google Merchant Centre. Just make sure you click on the ‘Diagnostics’ tab to check its status and ensure it’s all working smoothly.
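If you are producing the product file yourself, the sketch below shows one way it could be generated in Python as an RSS 2.0 document using Google’s `g:` namespace. Treat it as an illustration under assumptions: the `build_feed` function, the `PRODUCTS` sample data and the choice of fields are hypothetical, and a real feed will need more attributes than shown here (check Google’s product data specification for the required set). The optional `custom_label_0` field is one possible way to flag best-converting products, as described in the quote above.

```python
import xml.etree.ElementTree as ET

G_NS = "http://base.google.com/ns/1.0"  # namespace for Google's g: attributes
ET.register_namespace("g", G_NS)


def build_feed(shop_title, shop_url, products):
    """Build an RSS 2.0 product feed and return it as an XML string."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = shop_title
    ET.SubElement(channel, "link").text = shop_url
    ET.SubElement(channel, "description").text = "Product feed"

    for product in products:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, f"{{{G_NS}}}id").text = product["id"]
        ET.SubElement(item, "title").text = product["title"]
        ET.SubElement(item, "link").text = product["link"]
        ET.SubElement(item, f"{{{G_NS}}}price").text = product["price"]
        ET.SubElement(item, f"{{{G_NS}}}availability").text = product["availability"]
        # Optional label, e.g. for flagging best-converting products.
        if product.get("label"):
            ET.SubElement(item, f"{{{G_NS}}}custom_label_0").text = product["label"]

    return ET.tostring(rss, encoding="unicode")


# Hypothetical products; in practice these would come from your scraped data.
PRODUCTS = [
    {"id": "sku-1", "title": "Widget", "link": "https://example.com/widget",
     "price": "9.99 GBP", "availability": "in_stock", "label": "best_seller"},
    {"id": "sku-2", "title": "Gadget", "link": "https://example.com/gadget",
     "price": "24.50 GBP", "availability": "out_of_stock"},
]

print(build_feed("Example Shop", "https://example.com", PRODUCTS))
```

The resulting string can be written to a file and uploaded to the password-protected URL that the scheduled fetch points at.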
The dark side of data scraping
There are many positive uses for data scraping, but it does get abused by a small minority too.
The most prevalent misuse of data scraping is email harvesting: the scraping of data from websites, social media and directories to uncover people’s email addresses, which are then sold on to spammers or scammers. In some jurisdictions, using automated means like data scraping to harvest email addresses with commercial intent is illegal, and it is almost universally considered bad marketing practice.
Many web users have adopted techniques to help reduce the risk of email harvesters getting hold of their email address, including:
- Address munging: changing the format of your email address when posting it publicly, e.g. typing ‘patrick[at]gmail.com’ instead of writing the address with the ‘@’ symbol. This is an easy but slightly unreliable approach to protecting your email address on social media. Some harvesters will search for various munged combinations as well as emails in a normal format, so it’s not entirely airtight.
- Contact forms: using a contact form instead of posting your email address(es) on your website.
- Images: if your email address is presented in image form on your website, it will be beyond the technological reach of most people involved in email harvesting.
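To see why address munging is described above as only slightly reliable, here is a minimal Python sketch. The `munge` and `demunge` helpers and the pattern they use are made up for this illustration; the point is simply that a predictable substitution can be reversed with a single regular expression.

```python
import re


def munge(address):
    """Replace '@' with '[at]', as in the munging example above."""
    return address.replace("@", "[at]")


# Two common munged separators; a harvester may well try many more.
MUNGED_AT = re.compile(r"\s*(?:\[at\]|\(at\))\s*", re.IGNORECASE)


def demunge(text):
    """Undo the simple munging patterns matched by MUNGED_AT."""
    return MUNGED_AT.sub("@", text)


munged = munge("name@example.com")
print(munged)           # name[at]example.com
print(demunge(munged))  # name@example.com
```

Less predictable formats are harder to reverse automatically, which is why contact forms and images offer stronger protection.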
The Data Scraping Future
Whether or not you intend to use data scraping in your work, it’s advisable to educate yourself on the subject, as it is likely to become even more important in the next few years.
There are now data scraping AIs on the market that can use machine learning to keep getting better at recognising inputs which only humans have traditionally been able to interpret, like images.
Big improvements in data scraping from images and videos will have far-reaching consequences for digital marketers. As image scraping becomes more in-depth, we’ll be able to know far more about online images before we’ve seen them ourselves, and this, like text-based data scraping, will help us do lots of things better.
Then there’s the biggest data scraper of all: Google. The whole experience of web search is going to be transformed when Google can accurately infer as much from an image as it can from a page of copy, and that goes double from a digital marketing perspective.
If you’re in any doubt over whether this can happen in the near future, try out Google’s image interpretation API, Cloud Vision, and let us know what you think.