Counting occurrences of a word in a string :: Benchmarking PHP functions

Today I was thinking about the possible ways to count the occurrences of a specific word inside a string. I found several and benchmarked them. Wanna see the result?? I am sure you will find it interesting too.

<?php
// Returns the current Unix timestamp, including microseconds, as a float.
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());

    return ((float)$usec + (float)$sec);
}

$str = "I have three PHP books, first one is 'PHP Tastes Good',
 next is 'PHP in your breakfast' and the last one is 'PHP Nightmare'";

// Note: split() is a POSIX regex function that was removed in PHP 7.0.
$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    $cnt = count(split("PHP", $str)) - 1;
}
$end = microtime_float();
echo "Count by Split+Count took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    preg_match_all("/php/i", $str, $matches);
    $cnt = count($matches[0]);
}
$end = microtime_float();
echo "Count by Preg_Match+Count took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    // The fourth argument receives the number of replacements made.
    str_replace("PHP", "PP", $str, $cnt);
    //echo $cnt;
}
$end = microtime_float();
echo "Count by str_replace took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    str_ireplace("PHP", "PP", $str, $cnt);
    //echo $cnt;
}
$end = microtime_float();
echo "Count By str_ireplace took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    $cnt = count(explode("PHP", $str)) - 1;
    //echo $cnt;
}
$end = microtime_float();
echo "Count By Explode+Count took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    $word_count = (array_count_values(str_word_count(strtolower($str), 1)));
    ksort($word_count);
    $cnt = $word_count['php'];
}
$end = microtime_float();
echo "Count By Array Functions took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    $cnt = count(preg_split("/PHP/i", $str)) - 1;
}
$end = microtime_float();
echo "Count By preg_split+Count took : " . ($end - $start) . " Seconds\n";

$start = microtime_float();
for ($i = 0; $i < 10000; $i++)
{
    $cnt = substr_count($str, "PHP");
}
$end = microtime_float();
echo "Count By substr_count took : " . ($end - $start) . " Seconds\n";

?>
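
The script above repeats the same timing loop for every candidate. As a rough sketch only, the loop could be wrapped in one helper. The `benchmark()` function below is hypothetical (it is not part of the original script), and it assumes PHP 5.3 or later for closures, which is newer than the PHP 5.1.1 used for the numbers in this post. It uses `microtime(true)`, which returns the time as a float directly.

```php
<?php
// Hypothetical helper, not from the original script:
// runs $fn $iterations times and reports the elapsed wall-clock time.
function benchmark($label, $fn, $iterations = 10000)
{
    $start = microtime(true);               // float seconds since the Unix epoch
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn);
    }
    $elapsed = microtime(true) - $start;
    echo $label . " took : " . $elapsed . " Seconds\n";
    return $elapsed;
}

$str = "I have three PHP books, first one is 'PHP Tastes Good', "
     . "next is 'PHP in your breakfast' and the last one is 'PHP Nightmare'";

benchmark("Count By substr_count", function () use ($str) {
    return substr_count($str, "PHP");
});
benchmark("Count By Explode+Count", function () use ($str) {
    return count(explode("PHP", $str)) - 1;
});
```

The closure call adds a small constant overhead to every iteration, so timings from this helper are only comparable with each other, not with the figures below.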

And the result is

First Run
Count by Split+Count took : 0.44112181663513 Seconds
Count by Preg_Match+Count took : 0.46423101425171 Seconds
Count by str_replace took : 0.23512482643127 Seconds
Count By str_ireplace took : 0.39766597747803 Seconds
Count By Explode+Count took : 0.25045800209045 Seconds
Count By Array Functions took : 1.1077101230621 Seconds
Count By preg_split+Count took : 0.30741000175476 Seconds
Count By substr_count took : 0.21060705184937 Seconds

Second Run

Count by Split+Count took : 0.68125295639038 Seconds
Count by Preg_Match+Count took : 0.60020899772644 Seconds
Count by str_replace took : 0.2877471446991 Seconds
Count By str_ireplace took : 0.47500586509705 Seconds
Count By Explode+Count took : 0.31055402755737 Seconds
Count By Array Functions took : 1.3551599979401 Seconds
Count By preg_split+Count took : 0.40205383300781 Seconds
Count By substr_count took : 0.24432802200317 Seconds

Third Run
Count by Split+Count took : 0.50134515762329 Seconds
Count by Preg_Match+Count took : 0.53588891029358 Seconds
Count by str_replace took : 0.25469994544983 Seconds
Count By str_ireplace took : 0.34696006774902 Seconds
Count By Explode+Count took : 0.23176002502441 Seconds
Count By Array Functions took : 1.0504789352417 Seconds
Count By preg_split+Count took : 0.28686618804932 Seconds
Count By substr_count took : 0.20796585083008 Seconds

Fourth Run
Count by Split+Count took : 0.4736020565033 Seconds
Count by Preg_Match+Count took : 0.48813104629517 Seconds
Count by str_replace took : 0.29280996322632 Seconds
Count By str_ireplace took : 0.51396799087524 Seconds
Count By Explode+Count took : 0.34470105171204 Seconds
Count By Array Functions took : 1.4177949428558 Seconds
Count By preg_split+Count took : 0.36489319801331 Seconds
Count By substr_count took : 0.27841401100159 Seconds

In case you are interested in the machine configuration: these tests ran on a laptop with a 1.6GHz Celeron processor and 768 MB of RAM, using PHP 5.1.1.
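
One thing to keep in mind when picking the fastest option: most of the candidates above count substrings, not whole words. The following is a small sketch with a made-up sample string (not the one used in the benchmark) showing how `substr_count()` and a word-boundary regex can disagree.

```php
<?php
// Sample string chosen for illustration only.
$str = "PHP and PHPUnit are different things; php5 is not PHP either.";

// Substring count: also matches the "PHP" inside "PHPUnit" (case-sensitive).
$substrings = substr_count($str, "PHP");

// Whole-word count: \b requires a word boundary on both sides of "PHP".
$words = preg_match_all('/\bPHP\b/', $str, $matches);

echo "substr_count: {$substrings}\n";   // 3
echo "whole words:  {$words}\n";        // 2
```

So if "word" really means a whole word, the regex version may be the right choice even though it is slower.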

My book "WordPress Complete" has been slashdotted 8

Today I got the news from my editor at Packt Publishing, David Barnes, that my book “WordPress Complete” was reviewed on Slashdot today and scored 8 out of 10. I am very happy today.

You can read the review here: http://books.slashdot.org/article.pl?sid=07/04/25/1554235

And you can buy the book here: WordPress Complete at Amazon

Thanks to all my friends, colleagues and editors, and to my friends at Packt, Pageflakes and somewherein. I am very happy today.

Prelude to foundation: It’s time to go for a better PHP Framework

I remember the old days when I had to write everything myself. I wrote huge libraries to work with MySQL. Then I learned PostgreSQL and SQLite but didn’t rewrite my old library to work with them, as I was running short of time. So I gave up the chance to write a db library that works with all of them. Instead, I wrote plain code for the database-specific portions. Oh ya, that was a long time ago.

Soon after that I came to know adoDB, which made my dream come true. I was so happy to get all my db-specific work done in a much smarter way. I got rid of database portability issues. I was very happy at that time.

I learned Smarty soon after I realized that my code was getting ugly with all the inline HTML and PHP. Nothing could be smarter than separating the presentation logic from the business layer. I have been a big Smarty fan ever since. It saved my sleep for many nights.

But I kept suffering from maintainability issues. I was not surprised to find my code becoming a huge, unmanageable giant that took a huge amount of time to refactor. I was very sad in those days. Oh, what a disaster that was.

When working with team members located in remote places, I fell into deep shit. How could we manage and track the changes made by each of us? I was even getting strange code in my routines which I bet was not written by me!! It was a terrific job (more…)

Dear Almighty

Dear Almighty,

Have I told you lately that I love you
Have I told you there’s no one else above you
Fill my heart with gladness
Take away all my sadness
Ease my troubles that’s what you do

For the morning sun in all its glory
Greets the day with hope and comfort too
You fill my life with laughter
And somehow you make it better
Ease my troubles that’s what you do

Vulnerable bug in CodeIgniter which took us hours to fix our corrupted database

We use CodeIgniter internally to develop our web solutions at somewhere in… net limited. The day before yesterday we had a terrible time because of an internal bug in CodeIgniter that corrupted data in some tables of our application database. It then took hours to find the origin of the bug, fix it and repair the corrupted data. Let me explain what happened.

Let’s suppose we have a table named “users” with the following fields

1. user_id
2. username
3. password
4. email

At some point, if you want to update the password field of this table, for a particular user, what will you write in your code?


$data = array("password"=>$new_password);
$this->db->where("user_id",$user_id);
$this->db->update("users", $data);

CodeIgniter’s ActiveRecord creates a query like the following

UPDATE users set password='{$new_password}' where user_id='{$user_id}';

Well, that’s OK and the query seems pretty fine. Now what should happen if you pass a valid user id to this code? Only that user’s password will be updated. But what happens when the passed $user_id is null?? That’s the most pathetic part CodeIgniter’s ActiveRecord plays. Instead of generating the following query,

UPDATE users set password='{$new_password}' where user_id='';

CodeIgniter’s ActiveRecord actually generates the following

UPDATE users set password='{$new_password}' where user_id;

You see the difference between the two queries, right? One contains “where user_id=''” and the other contains just “where user_id”. Now what happens if your backend database is MySQL and this query executes? It replaces every user’s password with the new password instead of failing, because MySQL evaluates a bare “where user_id” as a truth test on the column, and that is true for every row with a non-zero user_id. But if your database is PostgreSQL, the query fails and you are lucky.

So the day before yesterday we hit this problem in our commercial application, and it corrupted our user profile data. We immediately restored the data from our backup db (well, we lost 3 records), then started to find out what actually went wrong and found this dangerous bug in CI.

So we suggest that the CodeIgniter team fix this issue immediately and change their ActiveRecord code so that it generates a query like the following when the passed value is null, because such a query does not match every row in the table. Otherwise, fellow users of CodeIgniter, prepare for doomsday.

UPDATE users set password='{$new_password}' where user_id='';
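
Until the framework itself is changed, one defensive option is to validate the id before it ever reaches the query builder. The following is only a sketch: `require_valid_id()` is a hypothetical helper of my own naming, not a CodeIgniter function, and it assumes that ids are positive integers.

```php
<?php
// Hypothetical guard, not part of CodeIgniter: returns a positive integer id
// or throws, so a null or empty id never reaches the query builder.
function require_valid_id($id)
{
    $valid = filter_var($id, FILTER_VALIDATE_INT, array('options' => array('min_range' => 1)));
    if ($valid === false) {
        throw new InvalidArgumentException("Refusing to build a query without a valid id");
    }
    return $valid;
}

// Usage sketch inside a model method, with the calls from the post:
// $user_id = require_valid_id($user_id);
// $this->db->where("user_id", $user_id);
// $this->db->update("users", $data);

try {
    require_valid_id(null);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage() . "\n";
}
```

Failing loudly here is a deliberate choice: an exception on a missing id is much cheaper than restoring a table from backup.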

CookieJar in CURL – It Sucks

I have been working on LinkedIn authentication these days for one of my projects, where I have to log into LinkedIn with the user’s credentials, fetch personal information and display it in a different form. My code worked properly on my local machine, three different LAMP servers and one Windows server. But when I finally deployed the code to the production box, it failed. I quickly found that the cookie jar was not being created because of some permission problem. I tried to figure out what went wrong, but I couldn’t.

1. My script has the permission to create file in that directory on the fly

so there shouldn’t be any problem with the cookie jar and curl, but there was. I spent hours trying to find the reason and finally decided to go without the cookie jar. Without a cookie jar, I have to manually parse the cookies sent to me after login and set them on my next request, so that LinkedIn recognizes it as a call coming back after login. I did that and my script worked pretty fine.

Fuck cookie jar in curl. Why the hell didn’t the developers provide a way to override cookie management??

If you are interested in how I solved it, let me explain a bit.

I set CURLOPT_HEADER to true so that I get the header info back
curl_setopt($ch, CURLOPT_HEADER, 1);

and then I parsed that header info and extracted the cookies
// $header holds the raw response headers returned by curl_exec()
$end = strpos($header, "Content-Type");
$start = strpos($header, "Set-Cookie");
// explode() replaces split(), which was removed in PHP 7.0
$parts = explode("Set-Cookie: ", substr($header, $start, $end - $start));
$cookies = array();
foreach ($parts as $co)
{
    // keep only the name=value pair that precedes the first ";"
    $cd = explode(";", $co);
    if (!empty($cd[0]))
        $cookies[] = trim($cd[0]);
}

I will replace this section with RegEx
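
A regex-based version of the parsing step above might look like the following sketch. `extract_cookies()` is a hypothetical helper name, and the header string is made-up sample data; it assumes each cookie arrives on its own `Set-Cookie:` line.

```php
<?php
// Hypothetical helper: collects the name=value part of every Set-Cookie header line.
function extract_cookies($header)
{
    $cookies = array();
    // One match per header line; capture everything up to the first ';' or line end.
    if (preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $header, $matches)) {
        foreach ($matches[1] as $pair) {
            $cookies[] = trim($pair);
        }
    }
    return $cookies;
}

// Made-up response headers for illustration.
$header = "HTTP/1.1 200 OK\r\n"
        . "Set-Cookie: session=abc123; Path=/; HttpOnly\r\n"
        . "Set-Cookie: lang=en; Path=/\r\n"
        . "Content-Type: text/html\r\n\r\n";

echo implode("; ", extract_cookies($header)) . "\n";   // session=abc123; lang=en
```

Unlike the strpos() version, this does not depend on a Content-Type header appearing after the cookies.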

and finally I set those cookies on my next request using CURLOPT_COOKIE
curl_setopt($ch, CURLOPT_COOKIE, implode("; ", $cookies));

That’s it!! It works pretty fine without the cookie jar.

NOTE: I know that the cookie jar is a very useful feature for curl users, as it automates cookie management. I am saying “fuck cookiejar” only because I could not find a way to override the cookie management process. If there were a way to use the cookie jar with something other than disk files, it would be beautiful. In fact I am a big fan of curl’s cookie jar feature, except for the selfish automation.
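
For what it’s worth, the PHP manual’s description of CURLOPT_COOKIEFILE says that an empty file name loads no cookies but still enables cookie handling, which should keep cookies in memory for as long as the same handle is reused. I have not tried this on the production box from this post, so treat the following as a sketch only; `new_cookie_handle()` is a hypothetical helper and `$login_url` is an assumed variable.

```php
<?php
// Hypothetical helper: a cURL handle that tracks cookies in memory only,
// without writing a cookie jar file to disk.
function new_cookie_handle()
{
    $ch = curl_init();
    // Empty file name: nothing is read from disk, but cookie handling is enabled.
    curl_setopt($ch, CURLOPT_COOKIEFILE, "");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    return $ch;
}

// Usage sketch: reuse one handle so that cookies set by the login response
// are sent automatically with later requests on the same handle.
// $ch = new_cookie_handle();
// curl_setopt($ch, CURLOPT_URL, $login_url);   // $login_url is assumed
// $page = curl_exec($ch);
```

The cookies are lost once the handle is closed, so this only helps when login and the follow-up requests happen in the same script run.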

WordPress Blogrolls Importer – Opensource

Today I developed this tool to import WordPress blogrolls as an XML document. As you may know, the data you export from wordpress.com doesn’t include the blogroll. So if you want to keep a backup of your blogroll, you can use this tool to import it. It is developed using PHP and cURL.

importer.gif

You can download it and see the code in action here

Opensource WordPress Blogrolls Importer

################

The trick lies here:

    // $url, $username and $password are assumed to be set earlier in the script
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, "./login.jar");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    //curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);  // this line makes it work under https
    curl_setopt($ch, CURLOPT_URL, "{$url}/wp-login.php");
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "log={$username}&pwd={$password}&redirect_to=wp-admin/link-manager.php&wp_submit=");

    $buffer = curl_exec($ch); // execute the curl command


WordPress Plugin for Bengali WordPressians (?)

Ah ya, let me call you a WordPressian if you are a WordPress fan. Last night I developed a small plugin for Bengali WordPress users. If you install this plugin you will be able to write in UniJoy, Phonetic and plain English modes. You will get three buttons as shown here
editor.gif

Then you can type Unicode after selecting either UniJoy or Phonetic
editor2.gif

And you can write English anytime by clicking the English button or pressing Control+C

editor3.gif

That’s it. I will release this plugin under Ekushey.org by tomorrow.

Thanks Omi Azad

The day before yesterday my friend Omi Azad gave me a fantastic gift. He registered the domain http://hasinhayder.net for 10 years and gave me control of it. I am so surprised. He said “Hey hasin, I didn’t give you any wedding gift” – ha ha ha

You can find Omi Azad’s blog at http://omiazad.net. Omi is an MVP in the Windows Shell category and a localization expert. He has been maintaining ekushey.org for a number of years.

Thanks Omi.