Copy files over an SSH tunnel

In my working environment I connect to the production servers through a gateway that holds my public key, but sometimes I need to copy files from/to those production servers.

I could copy the files first to the SSH gateway and from there to my machine, but I prefer to do it through an SSH tunnel.

Assuming the SSH gateway is named ssh-server (it should have an entry in /etc/hosts or in DNS) and the production server I want to connect to has the IP 192.168.0.10, I open the tunnel on my laptop like this:
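Something along these lines does the job (the -N flag only forwards the port, without opening a shell on the gateway):

    ssh -N -L 1025:192.168.0.10:22 ssh-server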

Instead of the IP you can use a hostname, if it is recognized by the gateway machine. The tunnel is now open on port 1025 of my localhost, so every connection to this port on my machine will be forwarded to the production server, authenticated with my SSH key.

After that, in another terminal on my laptop I can copy the files that I need from the production server:
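Roughly like this (the remote path and the local destination are just examples):

    scp -P 1025 root@localhost:/var/log/messages .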

The “root” user in this scp line is the user I use to connect to the production server.

That’s all!

git tags use

Today I had to read about git tags in order to be able to explain better why somebody would use tags and for what. The old (and still good) git workflow explains very well how to develop using branches, but doesn’t explain too well why we create tags and how they should be used on production. It just says “that commit on master must be tagged for easy future reference to this historical version”. If you followed that guide and you are now ready to go live with your code changes, you may wonder: what should you do with the tag? Why did you create it? Some people would say: just leave it there, maybe somebody, on a sunny day, will take a look at it; just go to production and do git pull origin master. But I don’t think this is the purpose of the tag.

In my opinion, after you (or an automated process) create the tag, it should be used on the production servers for deployment. So, in production, in the project folder:

We fetch all the git changes:
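Roughly like this (the remote name and the old commit hash are made up; f69423b and the 1.1 tag are the ones referenced below):

    $ git fetch origin
    From example.com:project
       a1b2c3d..f69423b  master     -> origin/master
     * [new tag]         1.1        -> 1.1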


As you can see, we have some changes in the master branch and the newly created tag.

Now I will do:
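Presumably, checking out the new tag by its name:

    git checkout 1.1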

And I will see a message like this:
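The exact wording depends on the git version, but it is roughly this (f69423b is the commit the 1.1 tag points to; the commit subject is omitted):

    Note: checking out '1.1'.

    You are in 'detached HEAD' state. You can look around, make experimental
    changes and commit them, and you can discard any commits you make in this
    state without impacting any branches by performing another checkout.

    If you want to create a new branch to retain commits you create, you may
    do so (now or later) by using -b with the checkout command again. Example:

      git checkout -b new_branch_name

    HEAD is now at f69423b...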

The text above tells us that we are in a detached HEAD state. That means we are no longer on a branch, but have checked out a specific commit from the history (the one with the hash f69423b…). Also, a tag represents immutable content, used only to access it with the guarantee of getting the same content every time. So, any changes you make on the tag (if you have the funny idea of making changes on production) will be lost at the next checkout.

Now your production project is at version 1.1.

Why tagging is good:
1. No conflicts in case somebody changes the production files (the changes will be overwritten).
If you try to check out a new tag, e.g. git checkout 1.2, and you have local changes, you will get a message like:
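Roughly this (the file name is just an example; the wording varies with the git version):

    error: Your local changes to the following files would be overwritten by checkout:
        index.php
    Please, commit your changes or stash them before you can switch branches.
    Aborting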

Anyway, keep in mind not to commit from the tag, because your changes will still be lost when you do git checkout 1.2. In order to save the changes, you should create a branch from the tag, commit the changes into the branch, merge the branch into master and tag master; a sketch of that flow is below.
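Something like this (the branch name, the commit message and the 1.2 tag are only examples):

    git checkout -b fix-from-1.1 1.1      # new branch starting from the tag
    git commit -am "Fix made on top of 1.1"
    git checkout master
    git merge fix-from-1.1
    git tag 1.2
    git push origin master --tags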

2. Easy rollback!
If something is wrong with version 1.1, don’t wait until the bug is fixed, tested, tagged and so on… Just do on production: git checkout 1.0
and you are back on the previous version (it takes less than 5 seconds).

3. Easy for QA and managers to know what version is on production (you can even create tools to show the version online with a nice number like 1.1, not “Dear manager, we deployed version 604a8d2fc3223ea28d752c8d9fc8d1b05ed543dc…”).

4. Very easy when you have to search for changes far back in time or to compare two releases, like 2.12 and 1.1.
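For instance, with the tags used above:

    git log 1.1..2.12 --oneline     # commits between the two releases
    git diff 1.1 2.12 --stat        # files changed between them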

And probably more, but they don’t come to mind right now.

Switch between branches using Nginx

For some of my projects I use two branches, master and dev, and I work mostly in dev. If I need to send clients a URL with some dev/beta features, I would have to create another (sub)domain like dev.myproject.com, and sometimes, if I forget to change some configuration related to the domain name (especially with some “very intelligent” CMSes that keep the configuration in the database), I will get an email back saying “the new feature is not working” :)

In the end, I wanted an easy way to switch between branches while keeping the same domain, so as not to confuse the clients or the QA with beta/dev/test/acceptance subdomains. Everything should stay on http://myproject.com.

So, how did I do it?
My project has the root folder in /var/www/project_dev/web for the dev branch and /var/www/project_master/web for the master branch.

In this case, I have the following nginx config file (/etc/nginx/sites-enabled/project):
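A sketch of what that front config could look like (reconstructed from the description below; the proxy headers and exact layout are assumptions, not the original file):

    # /etc/nginx/sites-enabled/project - front proxy that picks the branch
    server {
        listen 80;
        server_name myproject.com;

        # default: the master branch nginx, listening on 8888
        set $branch_port 8888;

        # if the "branch" cookie is set to "dev", use the dev nginx on 8080
        if ($cookie_branch = "dev") {
            set $branch_port 8080;
        }

        location / {
            proxy_pass http://127.0.0.1:$branch_port;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }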

 

In this config file I check whether there is a cookie called “branch” and read its value. If the value is “dev”, the request is forwarded to the nginx listening on port 8080, called dev. If the cookie doesn’t exist or its value is “master”, the request is sent to the nginx listening on port 8888.

And I also have another two nginx config files: /etc/nginx/sites-enabled/dev and /etc/nginx/sites-enabled/master.

In the dev file I have:
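Roughly something like this (the PHP handling via php5-fpm is an assumption, just to make the sketch complete):

    # /etc/nginx/sites-enabled/dev - serves the dev branch on port 8080
    server {
        listen 8080;
        server_name myproject.com;

        root /var/www/project_dev/web;
        index index.php index.html;

        location / {
            try_files $uri $uri/ /index.php?$args;
        }

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_pass unix:/var/run/php5-fpm.sock;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }
    }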

And in the master file I just changed:
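presumably the dev-specific port and document root:

    listen 8080;
    root /var/www/project_dev/web;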

with
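their master equivalents:

    listen 8888;
    root /var/www/project_master/web;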

After setting up this nginx configuration, I created a file in my project, set.php, through which I create the specific cookie. The PHP code is very simple: I just read the branch GET parameter and set a cookie.

The file set.php should be in /var/www/project_dev/web as well as in /var/www/project_master/web.
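A minimal sketch of set.php (the accepted values and the cookie lifetime are assumptions, not the original code):

    <?php
    // set.php - sets the "branch" cookie read by the nginx front proxy
    $branch = isset($_GET['branch']) ? $_GET['branch'] : 'master';

    // accept only the two known branches
    if (!in_array($branch, array('dev', 'master'))) {
        $branch = 'master';
    }

    // cookie valid for the whole site, for 30 days
    setcookie('branch', $branch, time() + 30 * 24 * 3600, '/');

    // go back to the homepage, now served from the selected branch
    header('Location: /');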

 

Now I can just go to http://myproject.com/set.php?branch=dev and I will see the dev branch, and if I want to switch back to master, I just have to go to http://myproject.com/set.php?branch=master.

In the future I hope to write another blog post presenting a more elegant solution, without three nginx config files and a PHP script to set the cookie.

All the code can be found on Github.

How to use python sched in a daemon process

Let’s assume we want to download a file (or do some task) every 5 seconds, but the condition is to never run the same task twice (or more) at the same moment, even if it takes more than 5 seconds. For example, we have to download a file and this will take 8 seconds. Also, if it takes more than 5 seconds, it should not wait until the next iteration to start again (3 more seconds), but should start the next download immediately.

So, the traditional cronjob/lock file combination was not suitable for my case.

I chose to use Python with python-daemon and sched.

To install python-daemon, you can do it with pip:
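    pip install python-daemon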

The code below should be self-explanatory, but I will do a short presentation. We start the main() function in the daemon context. After 1 second, run_scheduled() is called for the first time. This one calls our “download” function get_file, which can take a random time to finish its execution (we set the random between 1 and 10 seconds). Inside run_scheduled, after the download finishes, we schedule the next run depending on how long the download took, and call the same method again:
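A sketch of that call, using the names from the fuller sketch below:

    scheduler.enter(restart, 1, run_scheduled, ())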

where restart can be between 0 and 4 (either the download took 1 to 4 seconds, so we have to wait 4 to 1 more seconds, or it took 5 seconds or more, which means the next download is scheduled immediately).
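A minimal, self-contained sketch of how the whole script could look (the fake download and all the names are assumptions, not the original code):

    import random
    import sched
    import time

    import daemon  # provided by the python-daemon package

    INTERVAL = 5  # seconds between two downloads
    scheduler = sched.scheduler(time.time, time.sleep)


    def get_file():
        # placeholder for the real download; takes a random 1-10 seconds
        time.sleep(random.randint(1, 10))


    def run_scheduled():
        start = time.time()
        get_file()
        elapsed = time.time() - start
        # wait only the time left from the 5-second slot; if the download
        # took longer than 5 seconds, start the next one immediately
        restart = max(0, INTERVAL - int(elapsed))
        scheduler.enter(restart, 1, run_scheduled, ())


    def main():
        # first run, one second after the daemon starts
        scheduler.enter(1, 1, run_scheduled, ())
        scheduler.run()


    if __name__ == '__main__':
        with daemon.DaemonContext():
            main()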

Of course, there are also other methods to achieve the same result, like threads or a simple while True… loop with a sleep between iterations :)

Crashing Google Chrome on Google website

Google Chrome version used: 31.0.1650.63

OS Version: Ubuntu 12.04.3 LTS, Linux work 3.2.0-57-generic #87-Ubuntu SMP Tue Nov 12 21:35:10 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Go to the Google Trends website http://www.google.com/trends/topcharts?zg=full

Click View #my2013 Gallery

Click any picture

Click the browser’s back button

Click the browser’s back button again

Crash… or the page will take between 20 seconds and 1-2 minutes to refresh.

Making a super fast blog with Drupal 7

Yesterday I decided to try to make my blog load in under 200 ms from my laptop using Drupal 7, without losing the old links to my blog posts, for SEO reasons. I know, some people would say there are better solutions for a blog than Drupal, but… I wanted to give Drupal a try because it has one of the biggest communities, a lot of modules and features, and of course for fun.

So, the first step was to install Drupal 7 on a separate domain and do all the work there.

First of all, I needed the content from my old WordPress blog, so I exported it as a WXR file. In Drupal, I installed the Migrate module. With this module I imported all the content and everything went smoothly. The URL Alias module made an alias for every post, so the links to the old blog posts are the same. Having not so many posts, I preferred to review them manually and fix some design/formatting problems. One of the main problems was the syntax highlighting of the code snippets in my blog posts. In order to have syntax highlighting in Drupal, I installed the GeSHi Filter module, following the video tutorial Highlight Code With GeSHi Filter In Drupal 7.
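For reference, modules like Migrate can be downloaded and enabled from the command line with drush (assuming drush is installed; enabling them from the admin UI works just as well):

    drush dl migrate
    drush en -y migrate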

As a theme I wanted something with a simple design that loads very fast. The Skeleton theme seemed to have these features.

Another requirement was to have a rich-text editor, so I decided to use CKEditor. I had to choose between the Wysiwyg and the CKEditor modules. Either way, for both you need to download CKEditor separately into sites/all/libraries. I chose the CKEditor module, due to some bugs in Wysiwyg (it could not detect the CKEditor version and some code changes were needed).

Another important module I installed was the CDN module, because I had in mind to use a CDN to lower the number of requests for my static files (CSS, JS, images, etc.) hitting my web server. You can read a short comparison between a few of them on the Halothe23 blog.

After I configured the CDN module, I also enabled all the caching options in Drupal, under Configuration -> Performance.

Applying all of this, I managed to get the first page of the blog to load in 1.74 seconds on the first view and 0.65 seconds on the repeat view, according to webpagetest.org. The test was made using Google Chrome, from Los Angeles, US.

Of course, there are many other options to speed up your Drupal website, starting with opcode caching like APC, content caching like memcached and Varnish, database optimisation, and finishing with code optimisation in Drupal.

One good book about how to make Drupal scale is High Performance Drupal by Jeff Sheltren, Narayan Newton and Nathaniel Catchpole.

HLS Video on Demand streaming

First of all, let’s explain shortly what HLS is. HTTP Live Streaming (also known as HLS) is an HTTP-based media streaming communications protocol implemented by Apple. Since it requires only standard HTTP transactions, HTTP Live Streaming is capable of traversing any firewall or proxy server that lets through standard HTTP traffic, unlike UDP-based protocols such as RTP. This also allows content to be delivered over widely available CDNs. In a few words, HLS works by breaking the overall stream into a sequence of small HTTP-based file downloads. At the start of the streaming session, the client downloads an extended M3U (m3u8) playlist containing the metadata for the various available sub-streams, whose media segments are MPEG-TS (.ts) files. You can read more about the HLS architecture on the Apple Developer website.
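For illustration, a minimal video-on-demand playlist could look like this (segment names and durations are made up):

    #EXTM3U
    #EXT-X-VERSION:3
    #EXT-X-TARGETDURATION:10
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.0,
    segment0.ts
    #EXTINF:10.0,
    segment1.ts
    #EXT-X-ENDLIST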
Currently, there are two main solutions for streaming: Adobe RTMP streaming, which we covered in the previous blog post, and Apple HLS streaming. However, both technologies allow you to play your video as you record it, automatically adjust video quality to the available bandwidth, and seek to different parts of a video.
The major difference between the two technologies is that, while Adobe RTMP works only in Flash and requires you to have a dedicated RTMP server installed, Apple HLS works with both Flash and HTML5 and can be used with an ordinary web server.
Furthermore, when Apple decided not to support Flash on iOS (the affected devices being iPhones, iPads, etc.), developers had to think of a solution for the users of these devices.
