Install Piwik

From UNPM.org Wiki

Piwik is a powerful web-analytics tool that can be easily integrated into most sites. Although installation is a little more work and server load a little higher than competing tools such as Google Analytics, Piwik does have the advantage of not sharing user traffic data with outside entities.

Piwik is also a good first web application to install for those who have never installed one, because the process is not very involved and Piwik can be used alongside most web applications installed later.

Configure nginx

Create the package-configs files piwik.conf and piwik_https.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik.conf

Paste into the new file:

location /piwik/ {
    location ~ (index|piwik|js/index)\.php$ {
        include global-configs/php.conf;
    }
    location ~ \.php$ { deny all; }
}

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik_https.conf

Paste into the new file:

location /piwik/ {
    location ~ (index|piwik|js/index)\.php$ {
        include global-configs/php_https.conf;
    }
    location ~ \.php$ { deny all; }
}

Open the site configuration file:

username@servername:~$ sudo nano /etc/nginx/sites-available/example.com

Add to the HTTP server block:

    include package-configs/piwik.conf;

Add to the HTTPS server block:

    include package-configs/piwik_https.conf;

Test and restart nginx.

username@servername:~$ sudo nginx -t
username@servername:~$ sudo service nginx restart
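
The nested regex location in piwik.conf decides which PHP scripts are handed to PHP; every other request ending in .php under /piwik/ is denied. The following is a rough sketch for checking which request paths that pattern accepts. It uses grep -E as a stand-in for nginx's own regex matching, which is an approximation, and the function name piwik_php_allowed and the sample paths are only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: approximates the nested regex location from piwik.conf
# with grep -E. nginx uses PCRE, but this particular pattern reads the same
# way as an extended regular expression.
piwik_php_allowed() {
    printf '%s\n' "$1" | grep -Eq '(index|piwik|js/index)\.php$'
}

# Sample request paths (illustrative only).
for path in /piwik/index.php /piwik/piwik.php /piwik/js/index.php /piwik/core/Version.php; do
    if piwik_php_allowed "$path"; then
        echo "$path: passed to PHP"
    else
        echo "$path: denied"
    fi
done
```

Note that the pattern is not anchored at the start, so any path under /piwik/ that ends in index.php or piwik.php is accepted, not only the three files named in it.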

Create Piwik database and database user

Create a Piwik database and database user:

username@servername:~$ sudo mysql -uroot -p
MariaDB [(none)]> create database databasename default character set utf8 default collate utf8_general_ci;
MariaDB [(none)]> grant all on databasename.* to 'databasenameusername'@'localhost' identified by 'databasenameusernamepassword';
MariaDB [(none)]> exit

Note that the databasename, databasenameusername and databasenameusernamepassword will be required for the Piwik configuration process.
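
The password placeholder in the grant statement should be replaced with a strong value. One possible way to produce one before running the statement is sketched below; it assumes /dev/urandom, head, od and tr are available, and the function name gen_db_password is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints 32 hexadecimal characters built from 16 random
# bytes read from /dev/urandom.
gen_db_password() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

password=$(gen_db_password)
echo "Generated password: $password"
```

Because the output contains only the characters 0-9 and a-f, it can be pasted between the single quotes of the grant statement without escaping.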

Install and configure Piwik

Download and extract Piwik:

username@servername:~$ wget http://builds.piwik.org/latest.zip
username@servername:~$ unzip -d /var/www/example.com/public/ latest.zip
username@servername:~$ rm latest.zip
username@servername:~$ sudo chown -R www-data /var/www/example.com/public/piwik/

Navigate to the secure location https://www.example.com/piwik/ to set up Piwik through a secure session.

Viewing Piwik dashboard in https

Because the Piwik tracking code requires both http and https access to Piwik, it is not possible to force the dashboard to load an https session using nginx. Fortunately, this can be set up in a Piwik configuration file.

Open the configuration file:

username@servername:~$ sudo nano /var/www/example.com/public/piwik/config/config.ini.php

Directly under [General], add:

force_ssl = 1

Navigating to http://www.example.com/piwik/ should force the dashboard to open in a secure session.
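
If the redirect does not happen, it may help to confirm that the setting really sits in the [General] section and not further down the file. The following is a minimal sketch of such a check; the function name has_force_ssl is hypothetical, and the path is the install location assumed throughout this article.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when "force_ssl = 1" appears inside the
# [General] section of the given ini-style file.
has_force_ssl() {
    awk '
        /^\[/ { in_general = ($0 ~ /^\[General\]/) }
        in_general && /^[[:space:]]*force_ssl[[:space:]]*=[[:space:]]*1[[:space:]]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Assumed install location from this article.
config=/var/www/example.com/public/piwik/config/config.ini.php
if [ -r "$config" ] && has_force_ssl "$config"; then
    echo "force_ssl is enabled"
else
    echo "force_ssl is not set under [General], or the file is not readable"
fi
```

The file belongs to www-data after the chown step, so the script may need to be run with sudo to read it.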

Geolocation

The geolocation feature in Piwik provides some of the more useful data on visitors. The PECL option is generally the better module to use.

Install the packages required for this feature to work and edit the necessary files:

username@servername:~$ sudo aptitude install php5-geoip libgeoip-dev
username@servername:~$ sudo pecl install geoip
username@servername:~$ sudo nano /etc/php5/fpm/php.ini

At the bottom of the file, add:

extension=geoip.so
geoip.custom_directory=/var/www/example.com/public/piwik/misc

Restart PHP:

username@servername:~$ sudo service php5-fpm restart

Log into Piwik and navigate to Settings -> Geolocation. Select the GeoIP (PECL) radio button. At the bottom of the page, assuming the free option is to be used, enter the GeoLite City database location into the Location Database, ISP Database and Organization Database fields and click save.

Device detection

Piwik has various plugins that come with the default install, located at Settings -> (Plugins) Installed, though not all of them are activated. The DevicesDetection plugin provides more information on the devices used by visitors and can be enabled by clicking Activate.

Using Piwik

Image tracking code

The image tracking code is used to track visitors when either the page does not load javascript or the users do not have javascript enabled. The code will give very basic information as compared to the javascript code, but will load for all visitors using browsers capable of downloading images.

One problem in Piwik's default image tracking code is that it includes an http (or https) link to the image tracker:

<!-- Piwik Image Tracker -->
<img src="http://www.example.com/piwik/piwik.php?idsite=1&rec=1" style="border:0" alt="" />
<!-- End Piwik -->

This is problematic because if the page is served in an https session and the tracker link uses http, browsers may give warnings about not all elements being secure. Hardcoding either scheme ties the tracker to that scheme regardless of how the page was requested. It's better to use the relative location, which will translate to either http or https depending on the page being served:

<!-- Piwik Image Tracker -->
<img src="/piwik/piwik.php?idsite=1&rec=1" style="border:0" alt="" />
<!-- End Piwik -->

When using the image tracking code as a backup to track visitors that have javascript disabled, use the noscript tag to prevent tracking visitors who will also run the javascript:

<noscript><img src="/piwik/piwik.php?idsite=1&rec=1" style="border:0" alt="" /></noscript>

Most Piwik plugins will add noscript tags by default, though any time the code is being pasted in full into the plugin, the tags should be added.

Privacy tools

Another problem that can arise is being blocked by privacy extensions such as Ghostery. Some webmasters believe that a site is private property and feel justified in tracking users on that property, ostensibly to improve the site. Some visitors do not like to be tracked and are concerned that webmasters are selling the acquired data or managing it irresponsibly. This article does not try to settle that question; it only informs webmasters on how to use Piwik, as there are certainly ethical ways of tracking site usage and maintaining that data.

Ghostery blocks the specific javascript file piwik.js. If piwik.js in the tracking code is replaced with the directory js/, then /piwik/js/index.php is loaded instead and serves piwik.js to the browser without being blocked by Ghostery. This method should allow for tracking of the vast majority of visitors that land on a page.

More advanced visitors may use tools such as Adblock Plus or AdBlock Edge with the EasyPrivacy filter. To bypass the EasyPrivacy filter, create two randomly named symlinks, one to /piwik/ and the other to /piwik/piwik.php and replace /piwik/ in the script tracking code with one symlink and piwik.php in the noscript with the other.

username@servername:~$ ln -s /var/www/example.com/public/piwik/ /var/www/example.com/public/randomsymlinkname
username@servername:~$ ln -s /var/www/example.com/public/piwik/piwik.php /var/www/example.com/public/randomsymlinkname2.php
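
The names randomsymlinkname and randomsymlinkname2.php are placeholders for names that should be hard to guess. As a rough sketch, the two links could be generated and created in one step as follows. The function names random_name and make_tracker_links are hypothetical, and the sketch assumes /dev/urandom, head, od and tr are available and that the user running it can write to the web root.

```shell
#!/bin/sh
# Hypothetical helper: prints a random 12-character lowercase hex name.
random_name() {
    head -c 6 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: creates both symlinks inside the given web root and
# prints their names so they can be copied into the tracking code.
make_tracker_links() {
    webroot=$1
    dir_link=$(random_name)
    php_link=$(random_name).php
    ln -s "$webroot/piwik/" "$webroot/$dir_link" &&
    ln -s "$webroot/piwik/piwik.php" "$webroot/$php_link" &&
    printf '%s\n%s\n' "$dir_link" "$php_link"
}

# Example (assumes the web root used in this article):
# make_tracker_links /var/www/example.com/public
```

The first line printed is the name to use in place of /piwik/ in the script tracking code, and the second is the name to use in place of piwik.php in the noscript tag.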

Sample tracking code. The changes are the randomsymlinkname/ base URL in place of piwik/ and the js/ path in place of piwik.php and piwik.js:

<script type="text/javascript">
  var _paq = _paq || [];
  _paq.push(["trackPageView"]);
  _paq.push(["enableLinkTracking"]);

  (function() {
    var u=(("https:" == document.location.protocol) ? "https" : "http") + "://www.example.com/randomsymlinkname/";
    _paq.push(["setTrackerUrl", u+"js/"]);
    _paq.push(["setSiteId", "1"]);
    var d=document, g=d.createElement("script"), s=d.getElementsByTagName("script")[0]; g.type="text/javascript";
    g.defer=true; g.async=true; g.src=u+"js/"; s.parentNode.insertBefore(g,s);
  })();
</script>
<noscript><img src="/randomsymlinkname2.php?idsite=1&rec=1" style="border:0" alt="" /></noscript>

Note that the actual code used should be based on the code produced in Piwik, then modified with js/ and the symlinks. Do not copy the code used here.

Package-configs edits

Edit piwik.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik.conf

Change:

location /piwik/ {
    location ~ (index|piwik|js/index|randomsymlinkname2)\.php$ {
        include global-configs/php.conf;
    }
    location ~ \.php$ { deny all; }
}

Add to the file:

location /randomsymlinkname/ {
    location ~ (index|piwik|js/index|randomsymlinkname2)\.php$ {
        include global-configs/php.conf;
    }
    location ~ \.php$ { deny all; }
}

Edit piwik_https.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik_https.conf

Change:

location /piwik/ {
    location ~ (index|piwik|js/index|randomsymlinkname2)\.php$ {
        include global-configs/php_https.conf;
    }
}

Add to the file:

location /randomsymlinkname/ {
    location ~ (index|piwik|js/index|randomsymlinkname2)\.php$ {
        include global-configs/php_https.conf;
    }
}

Multiple domains

It is possible to track multiple domains with one installation of Piwik. However, doing this requires each tracked domain to load javascript from the domain hosting Piwik. Privacy and security tools such as NoScript will then block the tracker as foreign javascript, which increases the number of visitors not tracked with javascript and reduces the quality of the usage reports. Piwik does provide an option to mask the foreign domain so tracking can be both obfuscated and run from the same domain.

For domains hosted on one server, there is a simple solution. Create symlinks in the site's /public/ directory that point to the /piwik/ directory and the /piwik/piwik.php file, then identify the site to be tracked using the ID given for the site in the Piwik dashboard.

Example:

username@servername:~$ ln -s /var/www/piwikdomain.com/public/piwik /var/www/targetdomain.com/public/randomsymlinkname
username@servername:~$ ln -s /var/www/piwikdomain.com/public/piwik/piwik.php /var/www/targetdomain.com/public/randomsymlinkname2.php

For the target domain, be sure to add randomsymlinkname and randomsymlinkname2.php to /etc/nginx/package-configs/piwik.conf and /etc/nginx/package-configs/piwik_https.conf as described in the above Privacy tools section.

Log analytics tool

The Piwik log analytics tool will parse the access logs and import the data into Piwik. There are cases where importing the logs can be very useful, but most admins will not need it, as the javascript tool generally provides better data more conveniently.

Log analytics import script

The /piwik/misc/log-analytics/import_logs.py script may be run to import logs into Piwik. There are many options available for running this script, including running it from a separate machine.
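
As a starting point, the sketch below prints a trial invocation for review rather than running it. The install path matches the one used in this article; the nginx access log path, the site ID 1 and the helper name import_command are assumptions. The --url, --idsite and --dry-run options are taken from the script's own option list and should be confirmed against the installed version's --help output.

```shell
#!/bin/sh
# Assumed paths: the Piwik install location used in this article and a
# typical nginx access log location on Debian-style systems.
PIWIK_DIR=/var/www/example.com/public/piwik
ACCESS_LOG=/var/log/nginx/access.log

# Hypothetical helper: prints the import command instead of running it, so it
# can be reviewed first. Arguments: Piwik base URL, site ID.
import_command() {
    printf 'python %s/misc/log-analytics/import_logs.py --url=%s --idsite=%s --dry-run %s\n' \
        "$PIWIK_DIR" "$1" "$2" "$ACCESS_LOG"
}

import_command https://www.example.com/piwik/ 1
```

With --dry-run the script parses the log without sending anything to Piwik; removing the option performs the actual import.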