Web scraping in a smart way: making a "Today in History" object in PHP

There are thousands of services on the web that present interesting and educational information, which you can integrate into your own web page or wrap in a widget that others can use seamlessly on their content delivery platforms. In this article I am going to show you how to make a Today-in-History widget with the help of the data provided by Scopesys. You can use this code to make a widget, a trivia app, or whatever you like. But before writing your own scrapers for any service, please, please, please check the copyright of that content carefully. You shouldn’t violate copyright in any case.

In this widget, we will extract the following content from the pages provided by Scopesys and display it in separate categories.
1. Today in history
2. Who’s born today
3. Who died today
4. Where it’s a holiday today
5. Religious observances today
6. Religious history of today

Let’s go 😀


<?php
// todayinhistory.php
// Fetches the scopesys "today" page and prints the sections requested in the query string.

// Marker strings copied from the scopesys page markup. They must match the
// page exactly, so they need updating whenever the page layout changes.
define("MARKER_START","<H3>On this day...</h3>");
define("MARKER_END","<BR><BR><HR><h3>Holidays</h3>");
define("BIRTHDAY_START","</font></center></center>");
define("BIRTHDAY_END","<HR> <br><H3>Deaths which occurred on ".date("F d").":</H3>");
define("DEATH_START","<HR> <br><H3>Deaths which occurred on ".date("F d").":</H3>");
define("DEATH_END","<HR><IMG align=left SRC=\"http://www.scopesys.com/flag.gif\">");
define("HOLIDAYS_START",'<i>Note: Some Holidays are only applicable on a given <b>"day of the week"</b></i><br> <br>');
define("HOLIDAYS_END","<HR> <H3>Religious Observances</H3>");
define("RELIGIOUS_START","<HR> <H3>Religious Observances</H3>");
define("RELIGIOUS_END","<HR> <H3>Religious History </h3>");
define("RELHISTORY_START","<HR> <H3>Religious History </h3>");
define("RELHISTORY_END","<BR><BR><font color=red>");

// Return the text between two markers, or null when either marker is missing.
// $trim drops that many characters from the end of the extracted text.
function extract_section($data, $startMarker, $endMarker, $trim = 0)
{
    $start = strpos($data, $startMarker);
    if ($start === false) {
        return null;
    }
    $start += strlen($startMarker);
    $end = strpos($data, $endMarker, $start);
    if ($end === false) {
        return null;
    }
    return substr($data, $start, max(0, $end - $trim - $start));
}

echo "<h2>Today is ".date("F d, Y")."</h2>";

// Suppress the PHP warning and handle a failed download explicitly instead.
$data = @file_get_contents("http://www.scopesys.com/today");
if ($data === false) {
    exit("<p>Could not load today's data.</p>");
}

// Query parameter => array(heading, start marker, end marker, characters to trim)
$sections = array(
    'history'    => array("Today in history", MARKER_START, MARKER_END, 15),
    'born'       => array("Who's born today", BIRTHDAY_START, BIRTHDAY_END, 0),
    'died'       => array("Who died today", DEATH_START, DEATH_END, 0),
    'holiday'    => array("Where it's a holiday today", HOLIDAYS_START, HOLIDAYS_END, 0),
    'religious'  => array("Religious observance", RELIGIOUS_START, RELIGIOUS_END, 0),
    'relhistory' => array("Religious history", RELHISTORY_START, RELHISTORY_END, 0),
);

foreach ($sections as $param => $section) {
    // Print a section only when it is requested, e.g. todayinhistory.php?born=1
    if (!isset($_GET[$param]) || $_GET[$param] != '1') {
        continue;
    }
    list($title, $startMarker, $endMarker, $trim) = $section;
    echo "<br/><h2 style='color: green'>".$title."</h2>";
    $content = extract_section($data, $startMarker, $endMarker, $trim);
    echo ($content === null) ? "<p>Section not found.</p>" : $content;
}
?>
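
One detail worth checking when you adapt this: two of the markers above (BIRTHDAY_END and DEATH_START) embed date("F d"), and the "d" format pads the day to two digits. I have not verified which form the Scopesys page prints, so treat this as a small sketch of the difference, using a made-up sample date, rather than a fix. If the page writes the day without a leading zero, "F j" would be the format to try.

```php
<?php
// Sketch only: compares two date() formats on a fixed, hypothetical date.
date_default_timezone_set('UTC');

// Sample date chosen for illustration: 5 March 2008.
$sample = mktime(0, 0, 0, 3, 5, 2008);

$withZero    = date("F d", $sample); // "d" pads the day to two digits
$withoutZero = date("F j", $sample); // "j" prints the day without padding

echo $withZero, "\n";    // March 05
echo $withoutZero, "\n"; // March 5
```

Whichever format you pick has to match the page character for character, because strpos() looks for the exact marker string.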

Now if you want to find out who was born today, point your browser to todayinhistory.php?born=1. Mashup, mashup, mashup: that is what many successful web apps are doing these days. And sometimes this is how data collection is done behind the scenes 🙂
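
Since each section is switched on by its own query parameter, several can be requested in one go. Here is a minimal sketch of building such a link from PHP; the $wanted list is just an example I picked, and the relative script path assumes the caller sits next to todayinhistory.php.

```php
<?php
// Sketch only: builds a link that asks todayinhistory.php for several sections.
// The section names are the query parameters the script checks.
$wanted = array('history', 'born', 'holiday');

// Turn the list into array('history' => 1, 'born' => 1, 'holiday' => 1).
$params = array_fill_keys($wanted, 1);

// Pass '&' explicitly so the separator does not depend on php.ini settings.
$url = 'todayinhistory.php?' . http_build_query($params, '', '&');

echo $url, "\n"; // todayinhistory.php?history=1&born=1&holiday=1
```

If you print this URL inside an HTML attribute, run it through htmlspecialchars() first so the ampersands are escaped.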

Writing this code was really as enjoyable as getting a root canal done with a rusty drill (I forget where I read such a nice quote), heh heh. But I am sure you will enjoy it more than that 😉 Happy scraping.