How can reverting a wrong merge mess up your code in Git?

You can't really imagine working on a project without a version control system, and Git is my tool of choice. This is the interesting story of how a wrong merge, followed by a revert, messed up our codebase.

How did this problem arise?

I work on one of our products. The stable code lives in the master branch. We also maintain a few customized versions of the product, and for each of them we have created a separate branch off master. Each such branch contains only its own customized modifications, but it must also receive every new feature or update that lands on master. So these branches are regularly synchronized by merging master into them, picking up the latest features.

Well, one of my team members mistakenly merged one of these customized branches into master, bringing all its customized features into master. When he realized the mistake, he reverted the merge and pushed the revert to master. A few days later, while syncing one of my customized branches, I merged master into it. Luckily I took a peek at the code and realized that, *poof*, the customized code was gone and this special branch was now identical to master!

Why did this happen?

As I said earlier, my colleague had mistakenly merged this branch into master and then reverted the merge. The revert commit removed all the specialized code that the wrong merge had brought into master. As far as Git's history is concerned, that revert is just another commit with its own diff. So when I merged master into the special branch, the revert's patch was applied too, removing the customized code from this branch as well.

So, how to fix this?

Well, losing all the changes of a custom branch sounds scary, but the fix is pretty simple: in your customization branch, you revert the revert of the wrong merge. Putting it simply, follow these steps:

  • Find the SHA-1 of the commit that was created when the wrong merge (of your specialized branch into master) was reverted.
  • Check out the customized branch where you have lost your changes.
  • Revert that commit with `git revert {sha1}`.
  • Fix conflicts, if any, and quickly skim through your code to check that everything is fine.
  • Done!
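To make the steps concrete, here is a runnable sketch of the whole scenario and the fix. The branch and file names (`custom`, `custom.txt`) are made up for the demo:

```shell
# Runnable sketch of the scenario and the fix, in a throwaway repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
main=$(git symbolic-ref --short HEAD)    # "master" or "main", depending on git version

echo "stable code" > core.txt
git add core.txt
git commit -qm "stable code"

git checkout -qb custom                  # the customized branch
echo "customized code" > custom.txt
git add custom.txt
git commit -qm "customization"

# The mistake: the customized branch is merged into the stable branch...
git checkout -q "$main"
git merge -q --no-ff --no-edit custom
# ...and then reverted, creating a commit whose diff DELETES custom.txt
git revert --no-edit -m 1 HEAD
revert_sha=$(git rev-parse HEAD)

# Days later, syncing the customized branch replays that revert onto it
git checkout -q custom
git merge -q --no-edit "$main"
test ! -f custom.txt                     # the customization is gone!

# The fix: revert the revert, on the customized branch
git revert --no-edit "$revert_sha"
test -f custom.txt                       # the customization is back
```

Note that after this "revert of the revert", future syncs with master will not remove the customization again, because the fixing commit is now part of the branch's history.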

I hope this will save someone from a shock in this situation.

Feel free to ask any doubts in comments. Do not forget to tell me if you’ve been in this situation too. ;-)

Bug with PHP's pthreads? Threads don't inherit the parent's working directory

I have lately been working with PHP's pthreads extension for one of my applications. I needed a chunk of my code to run in parallel instead of inside a loop, because each iteration was time-consuming and blocking. As each iteration was independent of the others, I decided to use pthreads and, for a quick test, wrapped the loop's body inside a `Thread` class's `run()` method. My code failed to run. When I looked for the cause, I was a bit surprised.

My PHP code used relative paths. But after I wrapped it inside the `run()` method, the relative paths stopped resolving. The reason? They were being resolved against the wrong directory. I checked the thread's working directory, and it was not the directory I ran my script from; it was inside Apache's installation directory! I would presume that each thread created with pthreads should inherit the working directory of the script that started it, and that this is the intended behavior, but I could be wrong.

Here is an example that will help you understand the issue:

<?php
class ThreadExample extends Thread
{
    
    public function run()
    {
        echo "\nThread's Working Directory: ".getcwd();
    }
}
 
echo "\nScript's Working Directory: ".getcwd();
$oThread = new ThreadExample();
$oThread->start();
$oThread->join();
?>

 

Thread's Working Directory: C:\wamp\bin\apache\apache2.4.9
Script's Working Directory: C:\wamp\www\Thread

 

I am going to raise a bug ticket anyway. Feel free to post your thought on this in comments.

 

Implementing Cache in Your PHP based Web Application

If your web application grows, or it involves a lot of computation to generate dynamic content, performance is going to suffer. Some data requires plenty of calculation yet does not change on every request, and it's wise to start caching such data or objects. Recently, in one of the applications I have been working on, there was a considerable, unnecessary delay because some heavy data was being recalculated every time a user sent a request. It was high time I started caching those results.

There are plenty of extensions and frameworks available for caching in PHP. But instead of spending my time looking for the best extension or framework and learning it, I decided to implement the cache on my own, without any fancy tools.

I am going to explain two ways of caching content:

  1. Server Side Caching
  2. Client Side Caching

Server Side Caching

In server-side caching, we cache data and/or objects on the server and reuse them the next time a user requests them, instead of recalculating.

For caching these things, we are going to use files. I assume your calculation takes more time than reading a file; otherwise, there is no point in caching. What we will do is write each calculated piece of data, or each object, to a file stored on the server.
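As a rough sketch of that idea, a file-backed cache can be as small as two functions. The function names (`cache_get`, `cache_set`) and the demo path below are my own, for illustration only:

```php
<?php
//! A minimal file-backed cache sketch.
function cache_get($sPath, $iMaxAge) {
    if (!file_exists($sPath)) {
        return null;                    // cache miss
    }
    if (time() - filemtime($sPath) > $iMaxAge) {
        unlink($sPath);                 // cache expired: purge and miss
        return null;
    }
    return unserialize(file_get_contents($sPath));
}

function cache_set($sPath, $mValue) {
    //! Serialize so that arrays/objects survive the round trip
    file_put_contents($sPath, serialize($mValue));
}

// Demo: cache an "expensive" result and read it back
$sPath = sys_get_temp_dir()."/demo_obj.cache";
cache_set($sPath, ["answer" => 42]);
$aData = cache_get($sPath, 3600);
echo $aData["answer"]."\n";
```

The full example later in this post is essentially this pattern inlined into a request handler.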

It's always a good idea to define a few constants or configuration variables for the settings. You should have at least the following parameters in your configuration; I would make a config file with these constants:

  1. CACHE_ENABLED: A parameter to turn caching on or off
  2. CACHE_PATH: Path to your cache directory. I prefer to create a `cache` directory in my application root. If you are on Linux, don't forget to give your server write permission on the cache folder
  3. CACHE_EXPIRY: A parameter that forces the cache to be invalidated after a specified time (in seconds) from the creation of the cache
<?php
//! Config parameter to turn on/off the caching
define("CACHE_ENABLED", true);
//! Path to store your cache files
define("CACHE_PATH", "/var/www/YourApp/cache/");
//! Cache expiration time (in seconds)
define("CACHE_EXPIRY", 3600);
?>

Now comes the real caching part. Before you start caching, there are a few choices you need to make, considering various factors.

  • What data are you going to cache? Don't cache something that takes less time to compute than to read from or write to a file. You should also not cache something that changes on every request.
  • When do you need to purge the cache? This is just as important. Make a list of all the events or conditions that can invalidate your cache. If you are not purging the data on the appropriate events or conditions, your application may end up with inconsistent or corrupted data or state.

Once you have decided on these things, you can start caching. We will compute each object and store it in a file. It's wise to use a proper naming convention for the stored objects: a good naming convention makes purging the cache much easier.

In my application, there were a few types of objects, and each object's value was different for each user. So I decided to use the following convention: `{OBJ_TYPE}_{OBJ_TYPE_ID}_uid_{UID}.cache`

{UID} is the user's ID. So when the cache for a particular object becomes invalid, I can delete all files that match `{OBJ_TYPE}_{OBJ_TYPE_ID}_*.cache`

And if all cached data for a single user must be invalidated, I can delete all files that match `*_uid_{UID}.cache`
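A purge helper based on this convention could look like the sketch below. The function names, the demo directory, and the object types (`report`, `profile`) are made up for illustration:

```php
<?php
//! Purging by the naming convention {OBJ_TYPE}_{OBJ_TYPE_ID}_uid_{UID}.cache
define("CACHE_PATH", sys_get_temp_dir()."/cache_demo/");

//! Invalidate one object for every user
function purgeObjectCache($sType, $iTypeID) {
    foreach (glob(CACHE_PATH."{$sType}_{$iTypeID}_uid_*.cache") as $sFile) {
        unlink($sFile);
    }
}

//! Invalidate every cached object of one user
function purgeUserCache($iUserID) {
    foreach (glob(CACHE_PATH."*_uid_{$iUserID}.cache") as $sFile) {
        unlink($sFile);
    }
}

// Demo: three cache files for two object types and two users
@mkdir(CACHE_PATH, 0777, true);
touch(CACHE_PATH."report_7_uid_1.cache");
touch(CACHE_PATH."report_7_uid_2.cache");
touch(CACHE_PATH."profile_3_uid_1.cache");

purgeObjectCache("report", 7);   // deletes both report_7_* files
purgeUserCache(1);               // deletes profile_3_uid_1.cache
```

This is exactly why the convention pays off: invalidation becomes a single `glob()` pattern instead of bookkeeping.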

Now let's start caching objects. Suppose we are serving expensive data as JSON for some AJAX request. Here is a sample snippet:

<?php
require_once "loader.php";

//! $oid and $iUserID are assumed to come from the request (e.g. $_GET)
$aExpensiveData = null;

if(CACHE_ENABLED) {

    //! Calculate the cache path based on your naming convention
    $sCachePath = CACHE_PATH."obj_{$oid}_uid_{$iUserID}.cache";
    if(file_exists($sCachePath)) {

        //! Get the last-modified-time of the cache file
        $lastModified = filemtime($sCachePath);

        //! Calculate the oldest acceptable creation time
        $iExpiry = time() - CACHE_EXPIRY;

        //! Purge the cache if it is older than the forced expiry
        if($lastModified < $iExpiry) {
            unlink($sCachePath);
        }
        else {
            $aExpensiveData = unserialize(file_get_contents($sCachePath));
        }

    }
}

//! If we had a cache miss, let's do the expensive calculation
if($aExpensiveData === null) {
    $aExpensiveData = doExpensiveCalculation();

    //! Cache the result if caching is enabled
    if(CACHE_ENABLED) {
        $sCachePath = CACHE_PATH."obj_{$oid}_uid_{$iUserID}.cache";
        //! Don't forget to serialize the object before writing
        file_put_contents($sCachePath, serialize($aExpensiveData));
    }
}

header('Content-Type: application/json');
echo json_encode($aExpensiveData);
?>

 

If you look at the code: we first check whether a cache is already available. If it is available and still valid, we use it instead of doing the expensive computation. After doing the expensive computation, we cache the result so that next time we don't miss the cache.

 

Client Side Caching

Well, you just saved a great amount of computing power by not recalculating an expensive object; that is server-side caching. What if I told you that you can also save bandwidth and data-transfer delay by using client-side caching?

We are still considering the previous example of an AJAX request that expects an expensive object in JSON format. Now suppose it's an expensive as well as a big object. It takes time to transfer the object from server to client, and you are also consuming your network's bandwidth.

You can use client-side caching to save that bandwidth and data-transfer delay. The transfer delay may be slowing down your application's response for clients on slower connections, and they will benefit the most from this.

To cache a "page" on the client, you need to tell the client that the page is valid for a certain number of hours or days. But you don't want the client to use stale data after some event has invalidated it, so we will ask the client to always revalidate its local cache before using it. To make the client's browser do this, it is important to send the Last-Modified and Expires HTTP headers properly.
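The revalidation decision itself boils down to comparing the client's `If-Modified-Since` header against the cache file's modification time. Here is that check in isolation; `shouldSend304()` is a hypothetical helper name of mine, not a PHP built-in:

```php
<?php
//! Sketch of the conditional-GET decision on its own.
function shouldSend304($ifModifiedSince, $lastModified) {
    if ($ifModifiedSince === null || $ifModifiedSince === false) {
        return false;                   // client sent no If-Modified-Since header
    }
    //! The client's copy is fresh if it is at least as new as ours
    return strtotime($ifModifiedSince) >= $lastModified;
}

// $lastModified would normally come from filemtime() of the cache file
$lastModified = strtotime("Mon, 01 Jan 2024 00:00:00 GMT");

var_dump(shouldSend304("Mon, 01 Jan 2024 00:00:00 GMT", $lastModified)); // bool(true)
var_dump(shouldSend304("Sun, 31 Dec 2023 00:00:00 GMT", $lastModified)); // bool(false)
```

When this returns true, you answer with `304 Not Modified` and an empty body; otherwise you send the full payload.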

Let’s implement Client Side Caching in the previous example:

<?php
require_once "loader.php";

//! $oid and $iUserID are assumed to come from the request (e.g. $_GET)
$aExpensiveData = null;
$lastModified = time();

if(CACHE_ENABLED) {

    //! Calculate the cache path based on your naming convention
    $sCachePath = CACHE_PATH."obj_{$oid}_uid_{$iUserID}.cache";
    if(file_exists($sCachePath)) {

        //! Get the last-modified-time of the cache file
        $lastModified = filemtime($sCachePath);

        //! Calculate the oldest acceptable creation time
        $iExpiry = time() - CACHE_EXPIRY;

        if($lastModified < $iExpiry) {
            //! Cache expired: purge it and fall through to recalculation
            unlink($sCachePath);
            $lastModified = time();
        }
        else {
            //! Get the If-Modified-Since header if the client sent one
            $ifModifiedSince = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : false;

            //! Check if the page has changed. If not, send 304 and exit; the client will use its own cache
            if($ifModifiedSince !== false && strtotime($ifModifiedSince) >= $lastModified) {
                header("HTTP/1.1 304 Not Modified");
                exit;
            }

            //! We didn't exit, so the client doesn't have the latest copy of the data
            $aExpensiveData = unserialize(file_get_contents($sCachePath));
        }

    }
}

//! If we had a cache miss, let's do the expensive calculation
if($aExpensiveData === null) {
    $aExpensiveData = doExpensiveCalculation();

    //! Cache the result if caching is enabled
    if(CACHE_ENABLED) {
        $sCachePath = CACHE_PATH."obj_{$oid}_uid_{$iUserID}.cache";
        //! Don't forget to serialize the object before writing
        file_put_contents($sCachePath, serialize($aExpensiveData));
    }
}

//! Set the Last-Modified header so the client can revalidate next time
header("Last-Modified: ".gmdate("D, d M Y H:i:s", $lastModified)." GMT");

//! Tell the client to revalidate its local cache before using it
header('Cache-Control: must-revalidate');

//! And tell it how long the response stays fresh
header('Expires: '.gmdate('D, d M Y H:i:s \G\M\T', $lastModified + CACHE_EXPIRY));

header('Content-Type: application/json');
echo json_encode($aExpensiveData);
?>

This time, we set the Last-Modified header to the creation time of the cache, and we set Expires based on CACHE_EXPIRY.

If the client has a local cache, its modification date will arrive in `$_SERVER['HTTP_IF_MODIFIED_SINCE']`.

In that case you only send HTTP code 304, saying: "Content isn't modified, use your local cache."

 

These examples target a single scenario, but it's easy to apply the same concept to any scenario with little modification, because the techniques remain the same.

The extent of the benefit will depend greatly on what you are caching, how expensive your calculation is, and how often your cache becomes invalid. I got a severalfold performance improvement in my application after implementing these techniques. Feel free to drop your questions and feedback in the comments.

PS: These are very quick-and-dirty examples of the techniques; the main motive of this post is to make developers familiar with caching. If your application is going to use caching seriously, I suggest investing some time in learning the popular extensions or frameworks. And if you plan to implement your own caching, it's better to define proper classes and methods to keep your code structured and easy to maintain.

How can PHP Sessions cause Concurrency Issues?

A web application without sessions is hard to imagine. People use them very liberally to maintain session data, and so do I. But what most people don't know is that sessions can cause issues with your application's concurrency if not used properly. Even though it is an obvious thing, I never knew (or thought!) about it until today.

What’s the problem?

The PHP session lock can block concurrent requests to your application from a single client. If a client sends multiple requests concurrently and each request uses the session, the requests will be served sequentially instead of being processed concurrently.

Why does this happen?

PHP, by default, uses files to store session data. For each new session, PHP creates a file and keeps writing session data to it (hint: blocking I/O). Every time you call `session_start()`, PHP opens your session file and acquires an exclusive lock on it. So if one of your scripts takes time to process a request and your client sends another request that also uses the session, the second request is blocked until the first one completes. The second request also calls `session_start()`, but it has to wait because the first request already holds the exclusive lock on the session file. Once the first request is fulfilled, PHP closes the session file at the end of script execution and releases the lock. Only then does the second process get a chance to acquire the lock and proceed.

However, this causes concurrency issues for the same client only. A request from one client cannot block another client's request this way, because the two clients have different sessions and hence different session files.

When can it become a bottleneck?

This blocking period is hard to notice if your scripts are short (in terms of execution time). But if you have even slightly long-running scripts, you are in trouble. It can become a bottleneck when you use AJAX to fetch data through several requests on the same page, which is quite common in today's web applications.

Consider a scenario where you fetch several pieces of data through different background AJAX requests and display them in the UI, and these requests use the session. All the asynchronous requests are fired immediately and together, but the first request to reach the server acquires the session lock while the others have to wait. So all the requests are processed sequentially even though they do not depend on each other. For example, say 5 requests, each taking approximately 500ms, are sent concurrently. Because of the blocking, they do not execute concurrently: the 5th request starts executing only after 2 seconds and completes after 2.5 seconds, even though it needed only 500ms of processing. This can become serious if some scripts require more processing or the number of requests is larger.

I wouldn't have noticed this if I hadn't added `sleep(2);` to my code on my local machine to simulate usage over a slow connection. My page was sending 5 requests, and each request was being served every 2 seconds, sequentially!

So what’s the solution?

Close the session as soon as you are done using it!

PHP provides a method to close the session for writing: `session_write_close()`. Calling it ends the current session, writes the session data to the file, and releases the lock on the file. So it will not block further requests even while the current script is still processing.

The important thing to note is that once you close the session with `session_write_close()`, you cannot write to the session for the rest of the current script (unless you call `session_start()` again).

 

How do I simulate this problem?

If you want to see this problem in action, try the following code:

Blocking Example
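Something like the endpoint below reproduces the blocking; the file name and the `sleep(2)` stand-in for real work are illustrative:

```php
<?php
//! lock.php: a deliberately slow endpoint that HOLDS the session lock.
session_start();                     // acquires an exclusive lock on the session file
$_SESSION['hits'] = (isset($_SESSION['hits']) ? $_SESSION['hits'] : 0) + 1;
sleep(2);                            // simulate slow processing while the lock is held
// The lock is released only here, when the script ends
echo "served request #".$_SESSION['hits']."\n";
```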


 

Now send 5 AJAX requests to this file (for example, with jQuery's `$.get()` in a loop) and compare when each request was sent with when it completed.

Non-Blocking Example
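The non-blocking variant adds only one call, `session_write_close()`, before the slow part:

```php
<?php
//! lock_free.php: the same work, but the session lock is RELEASED early.
session_start();                     // acquires the lock
$_SESSION['hits'] = (isset($_SESSION['hits']) ? $_SESSION['hits'] : 0) + 1;
session_write_close();               // writes the data and releases the lock
sleep(2);                            // slow work no longer blocks parallel requests
echo "served without holding the session lock\n";
```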


Again, send 5 AJAX requests to this file in the same way and compare the send log with the complete log.

You will be able to see how this small thing can greatly affect your application.

4 Chart Plotting JS Libraries That You Should Check Out

Plain data is boring. But a visualization of the same data is cool and interesting. If your web application ever deals with even a small set of data, you may already be using charts to display it. If you aren't, you should start.

Drawing charts on a canvas isn't an easy job and will require a lot of your time. Let me list 4 of my favorite JS libraries for plotting graphs and charts, to make your life easier.

1. Chart.JS


Chart.js is a very simple and easy-to-use library for plotting charts. It's pretty lightweight and supports the following 6 types of charts:

  1. Line
  2. Bar
  3. Radar
  4. Polar Area
  5. Pie
  6. Doughnut

Chart.JS creates beautiful responsive charts and has a sufficient set of APIs and neat documentation.

Link: Chart.JS

2. Morris.JS


Morris.js is very similar to Chart.JS and creates pretty charts. The main difference between them is that Chart.JS draws on a canvas element while Morris.JS uses SVG. However, Morris.JS supports slightly fewer chart types:

  1. Line
  2. Area
  3. Bar
  4. Donut

Link: Morris.JS

3. Flot


Flot is a very extensive chart-plotting library that draws interactive charts and has several options and features. If you require extensive and comprehensive graphs, use nothing but Flot.

If your web app contains real-time interactive graphs, this library will fit your needs.

Link: Flot

4. jQuery Sparklines


This jQuery-based library is the best choice when it comes to plotting inline charts. It plots very beautiful inline charts; you can use it to get small charts inside a line of text or a paragraph.

It supports following types of charts:

  1. Line
  2. Bar
  3. Stacked
  4. Discrete
  5. Pie
  6. Tristate
  7. Box
  8. Bullets

Link: jQuery Sparklines

Hello World! Welcome to http://lokalhost.in!

I doubt there is a single geek who has never thought of writing a blog regularly. But how many really do blog? Very few, very very few! I am no exception: I have started a few blogs in the past, blogged regularly for some time, and then lost motivation. Well, I am giving blogging another shot this New Year.

So, welcome to http://lokalhost.in! The world of programming is tricky: every day a new problem will hit you, and you'll get busy solving it. But the fact is, many times we software developers/programmers spend hours solving some silly "by-problem" because we have never come across it before, or because the available resources are unclear or not so helpful. And you may not be the only person losing time on it; someone else might have wasted, or may be wasting, their time on it too.

I, too, come across such programming problems often, and I believe I should start writing about them once I have solved them; it may save someone else's time. I will be writing posts on these issues and the solutions I used for such close-to-general problems. I will also post tutorials and how-to articles to help my fellow programmers.

We programmers face such problems quite often, and I hope that will keep me motivated to post more and more. Don't forget to keep an eye on the blog. Stay tuned!

And yes, Happy New Year!!