My Info Blog

Data Mining vs Screen-Scraping

Data mining isn’t screen-scraping. I know that some people in the room may disagree with that statement, but they’re actually two almost completely different concepts.

In a nutshell, you might put it this way: screen-scraping allows you to get information, whereas data mining allows you to analyze information. That’s a pretty big simplification, so I’ll elaborate a bit.

The term “screen-scraping” comes from the old mainframe terminal days, when people worked on computers with green-and-black screens containing only text. Screen-scraping was used to extract characters from the screens so that they could be analyzed. Fast-forwarding to the web world of today, screen-scraping now most commonly refers to extracting information from web sites. That is, computer programs can “crawl” or “spider” through web sites, pulling out data. People often do this to build things like comparison shopping engines, archive web pages, or simply download text to a spreadsheet so that it can be filtered and analyzed.
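
To make the web sense of the term concrete, here is a minimal sketch of the extraction step in Python, using only the standard library. The HTML fragment, its class names, and the `PriceScraper` and `scrape` names are all made up for illustration; a real scraper would download pages over the network and cope with much messier markup.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this.
PAGE = """
<ul>
  <li class="product"><span class="name">Widget</span> <span class="price">$4.50</span></li>
  <li class="product"><span class="name">Gadget</span> <span class="price">$12.00</span></li>
</ul>
"""

class PriceScraper(HTMLParser):
    """Collects (name, price) rows from spans marked 'name' and 'price'."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (name, price) tuples
        self._field = None   # which span we are currently inside, if any
        self._current = {}   # fields gathered for the current list item

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the spans we care about.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append((self._current.get("name"), self._current.get("price")))
            self._current = {}

def scrape(html):
    """Pull the raw rows out of the markup; no analysis happens here."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.rows

rows = scrape(PAGE)
```

Notice that the result is just rows of text, ready to drop into a spreadsheet. Nothing has been analyzed yet.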

Data mining, on the other hand, is defined by Wikipedia as the “practice of automatically searching large stores of data for patterns.” In other words, you already have the data, and you’re now analyzing it to learn useful things about it. Data mining often involves lots of complex algorithms based on statistical methods. It has nothing to do with how you got the data in the first place. In data mining you only care about analyzing what’s already there.
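
As a toy illustration of “searching data for patterns,” the Python sketch below counts which items tend to show up together in a set of records. The `BASKETS` data and the `frequent_pairs` function are invented for this example, and real data mining relies on far more sophisticated statistical methods than a simple co-occurrence count. The point is only that the code starts from data that is already there and never asks where it came from.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-collected records: items bought together.
BASKETS = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

def frequent_pairs(baskets, min_count=2):
    """Return item pairs that appear together in at least min_count baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

patterns = frequent_pairs(BASKETS)
```

With this sample data, the pairs that survive the threshold are bread with butter and butter with jam, each seen in two baskets.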

The difficulty is that people who don’t know the term “screen-scraping” will try Googling for anything that resembles it. We include a number of these terms on our web site to help such folks; for example, we created pages entitled Text Data Mining, Automated Data Collection, Web Site Data Extraction, and even Web Site Ripper (I suppose “scraping” is sort of like “ripping”). So it presents a bit of a problem: we don’t necessarily want to perpetuate a misconception (i.e., screen-scraping = data mining), but we also have to use terminology that people will actually use.
