Install Piwik

From UNPM.org Wiki

Piwik is a powerful web-analytics tool that can be easily integrated into most sites. Although installation takes a little more work and the server load is a little higher than with competing tools such as Google Analytics, Piwik has the advantage of not sharing user traffic data with outside entities.

Piwik is also a good first web application to install for those who have never installed one, because the process is not very involved and Piwik can be used alongside most web applications installed later. This article covers installing Piwik on a LEMP web server set up according to this site's series of articles, which can be reviewed in the Steps to create a UNPM Server.

Configure nginx and PHP

PHP

Make one small change to php.ini:

username@servername:~$ sudo nano /etc/php/5.6/fpm/php.ini

Uncomment:

always_populate_raw_post_data = -1 
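For scripted setups, the same change can be made without an editor. The following is a minimal sketch, assuming the directive ships commented out with a leading semicolon. It works on a scratch file with made-up contents rather than on the real php.ini, so the path would need to be swapped in (and sudo added) for actual use.

```shell
#!/bin/sh
# Sketch: uncomment always_populate_raw_post_data in a php.ini-style file.
# Assumption: the stock line reads ";always_populate_raw_post_data = -1".
# A scratch file stands in for /etc/php/5.6/fpm/php.ini.
ini=$(mktemp)
printf '%s\n' '[PHP]' ';always_populate_raw_post_data = -1' 'expose_php = Off' > "$ini"

# Strip the leading semicolon (and any spaces after it) from that directive only.
sed 's/^;[[:space:]]*\(always_populate_raw_post_data[[:space:]]*=\)/\1/' "$ini" > "$ini.new"
mv "$ini.new" "$ini"

# Show the now-active directive.
grep '^always_populate_raw_post_data' "$ini"
```

Writing to a new file and moving it into place avoids relying on in-place editing flags, which differ between sed implementations.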

nginx

Create the package-configs files piwik.conf and piwik_https.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik.conf

Paste into the new file:

location /piwik/ {
    location ~ (index|piwik|js/index)\.php$ {
        include global-configs/php.conf;
    }
    location ~ \.php$ { deny all; }
}

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik_https.conf

Paste into the new file:

location /piwik/ {
    location ~ (index|piwik|js/index)\.php$ {
        include global-configs/php_https.conf;
    }
    location ~ \.php$ { deny all; }
}

username@servername:~$ sudo nano /etc/nginx/sites-available/example.com

Add to the HTTP server block:

    include package-configs/piwik.conf;

Add to the HTTPS server block:

    include package-configs/piwik_https.conf;

Test the nginx configuration, then reload PHP-FPM and nginx:

username@servername:~$ sudo nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
username@servername:~$ sudo service php5.6-fpm reload
username@servername:~$ sudo service nginx reload

Create Piwik database and database user

Create a Piwik database and database user:

username@servername:~$ sudo mysql -uroot -p
MariaDB [(none)]> create database databasename default character set utf8 default collate utf8_general_ci;
MariaDB [(none)]> grant all on databasename.* to 'databasenameusername'@'localhost' identified by 'databasenameusernamepassword';
MariaDB [(none)]> exit

Note that the databasename, databasenameusername and databasenameusernamepassword will be required for the Piwik configuration process.
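Since the password has to be noted for later, it can help to generate it and the SQL statements together. The following is a rough sketch that only prints the statements for review; the database and user names are the placeholders used above, and the 24-character password length is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: generate a random password and print the SQL from the step above.
# The database and user names are the article's placeholders.
db_name='databasename'
db_user='databasenameusername'

# 24 random alphanumeric characters from the kernel's random source.
db_pass=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 24)

sql=$(cat <<EOF
create database ${db_name} default character set utf8 default collate utf8_general_ci;
grant all on ${db_name}.* to '${db_user}'@'localhost' identified by '${db_pass}';
EOF
)

# Review the statements, then paste them at the MariaDB prompt.
printf '%s\n' "$sql"
```

Restricting the password to letters and digits sidesteps quoting problems inside the SQL string.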

Install and configure Piwik

Download and extract Piwik:

username@servername:~$ wget https://builds.piwik.org/latest.zip
username@servername:~$ unzip -d /var/www/example.com/public/ latest.zip
username@servername:~$ rm latest.zip
username@servername:~$ sudo chown -R www-data /var/www/example.com/public/piwik/

Navigate to the secure location https://www.example.com/piwik/ to set up Piwik through a secure session.

Viewing Piwik dashboard in https

Because the Piwik tracking code requires both http and https access to Piwik, it is not possible to force the dashboard to load an https session using nginx. Fortunately, this can be set up in a Piwik configuration file.

Open the configuration file:

username@servername:~$ sudo nano /var/www/example.com/public/piwik/config/config.ini.php

Directly under [General], add:

force_ssl = 1

Navigating to http://www.example.com/piwik/ should force the dashboard to open in a secure session.
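The same edit can be scripted. The following is a minimal sketch that inserts the line directly under [General] only if no force_ssl setting is present yet. It runs against a scratch file whose contents are made up for illustration; the real config.ini.php path would be substituted for actual use.

```shell
#!/bin/sh
# Sketch: add "force_ssl = 1" directly under [General] in a Piwik-style ini file.
# A scratch file stands in for config/config.ini.php; its contents are made up.
conf=$(mktemp)
printf '%s\n' '[database]' 'host = "127.0.0.1"' '[General]' 'salt = "placeholder"' > "$conf"

# Only insert the line if no force_ssl setting exists yet, so reruns are harmless.
if ! grep -q '^force_ssl[[:space:]]*=' "$conf"; then
    awk '{ print } $0 == "[General]" { print "force_ssl = 1" }' "$conf" > "$conf.new"
    mv "$conf.new" "$conf"
fi

cat "$conf"
```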

Geolocation

The geolocation feature in Piwik provides some of the more useful data on visitors. The PECL option is generally the better module to use.

Install the packages required for this feature to work and edit the necessary files:

username@servername:~$ sudo aptitude install php-geoip libgeoip-dev
username@servername:~$ sudo nano /etc/php/5.6/mods-available/geoip.ini

At the bottom of the file, add:

geoip.custom_directory=/var/www/example.com/public/piwik/misc

Restart PHP:

username@servername:~$ sudo service php5.6-fpm restart

Log into Piwik and navigate to Settings -> Geolocation. Select the GeoIP (PECL) radio button. At the bottom of the page, assuming the free option is to be used, enter the GeoLite City database location into the Location Database, ISP Database and Organization Database fields and click save.

Device detection

Piwik has various plugins that come with the default install, located at Settings -> (Plugins) Installed, though not all of them are activated. The DevicesDetection plugin provides more information on the devices used by visitors and can be enabled by clicking Activate.

Using Piwik

Image tracking code

The image tracking code is used to track visitors when either the page does not load javascript or the users do not have javascript enabled. The code will give very basic information as compared to the javascript code, but will load for all visitors using browsers capable of downloading images.

One problem in Piwik's default image tracking code is that it includes an http (or https) link to the image tracker:

<!-- Piwik Image Tracker -->
<img src="http://www.example.com/piwik/piwik.php?idsite=1&amp;rec=1" style="border:0" alt="" />
<!-- End Piwik -->

This is problematic because the protocol in the link will not always match the protocol of the page. If a page served in an https session loads the image over an http link, browsers may warn that not all elements are secure, and a hard-coded https link on an http page is an equally unnecessary mismatch. It's better to use the relative location, which will translate to either http or https depending on the page being served:

<!-- Piwik Image Tracker -->
<img src="/piwik/piwik.php?idsite=1&amp;rec=1" style="border:0" alt="" />
<!-- End Piwik -->

When using the image tracking code as a backup to track visitors that have javascript disabled, use the noscript tag to prevent tracking visitors who will also run the javascript:

<noscript><img src="/piwik/piwik.php?idsite=1&amp;rec=1" style="border:0" alt="" /></noscript>

Most Piwik plugins will add noscript tags by default, though any time the code is being pasted in full into the plugin, the tags should be added.

Multiple domains

It is possible to track multiple domains with one installation of Piwik. However, if this is done with the default tracking code, each tracked domain has to load javascript from the domain hosting Piwik. This means that privacy and security tools such as NoScript will block the tracker, increasing the number of visitors not tracked with javascript and thus reducing the quality of the usage reports. Piwik does provide an option to mask the foreign domain so tracking can be obfuscated and run from the same domain.

For domains hosted on one server, there is an even simpler solution. Create a symlink in the target site's /public/ directory that points to the /public/piwik/ directory of the Piwik domain, then take the site's tracking code from the dashboard and simply replace the Piwik domain with the target domain.

Create the symlink:

username@servername:~$ ln -s /var/www/piwikdomain.com/public/piwik /var/www/targetdomain.com/public/piwik

Be sure to include the package-configs files in the HTTP and HTTPS server blocks for the target domain, as performed in the Piwik domain in the first step of this article.
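When several sites share the server, the symlink step can be repeated in a loop. The following is a rough sketch that runs against a scratch directory standing in for /var/www; the second target domain name is hypothetical.

```shell
#!/bin/sh
# Sketch: link one Piwik install into several sites on the same server.
# A scratch tree stands in for /var/www; the target domain names are hypothetical.
www=$(mktemp -d)
mkdir -p "$www/piwikdomain.com/public/piwik"

for domain in targetdomain.com anotherdomain.com; do
    mkdir -p "$www/$domain/public"
    link="$www/$domain/public/piwik"
    # Skip domains that already have something at that path.
    if [ ! -e "$link" ] && [ ! -L "$link" ]; then
        ln -s "$www/piwikdomain.com/public/piwik" "$link"
    fi
    ls -ld "$link"
done
```

The existence check keeps the loop safe to rerun and avoids creating a nested link inside an existing piwik directory.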

Piwik trusted hostnames

As a security feature, Piwik includes a setting for trusted hostnames, so loading the Piwik dashboard from any other hostname will result in a prominent warning that the host is not trusted. To view the current trusted hostnames, navigate in the dashboard to Settings -> General Settings -> Trusted Piwik Hostname, and add additional domains as desired.

Privacy tools

Another problem that can arise is being blocked by privacy extensions such as Ghostery. Some webmasters believe that a site is private property and feel justified in tracking users on that property, ostensibly to improve the site. Some visitors do not like to be tracked and are concerned that webmasters are selling the data acquired or managing it in an irresponsible manner. This article does not attempt to settle that issue. It only informs webmasters on how to use Piwik, as there are certainly ethical ways of tracking site usage and maintaining that data.

Ghostery blocks the specific javascript file piwik.js. By replacing the piwik.js script in the tracking code with the directory js/, /piwik/js/index.php will be loaded and then piwik.js will run in the browser without being blocked by Ghostery. This method should allow for tracking of the vast majority of visitors that land on a page.

More advanced visitors may use tools such as Adblock Plus or AdBlock Edge with the EasyPrivacy filter. To bypass the EasyPrivacy filter, create two randomly named symlinks, one to /piwik/ and the other to /piwik/piwik.php and replace /piwik/ in the script tracking code with one symlink and piwik.php in the noscript with the other.

username@servername:~$ ln -s /var/www/example.com/public/piwik /var/www/example.com/public/randomsymlinkname
username@servername:~$ ln -s /var/www/example.com/public/piwik/piwik.php /var/www/example.com/public/randomsymlinkname2.php
username@servername:~$ ln -s /var/www/example.com/public/piwik/piwik.php /var/www/example.com/public/piwik/randomsymlinkname2.php

The same link name is used twice for simplicity, as both point to the same target.
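The link names can be anything that does not reveal the target. The following is a minimal sketch that generates two random names and creates the three symlinks in a scratch directory standing in for /var/www/example.com/public; the random_name helper and the 12-character length are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: pick random link names and create the three symlinks described above.
# A scratch directory stands in for /var/www/example.com/public.
public=$(mktemp -d)
mkdir -p "$public/piwik"
: > "$public/piwik/piwik.php"

# Hypothetical helper: 12 random lowercase letters and digits.
random_name() {
    LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 12
}

dir_link=$(random_name)
php_link="$(random_name).php"

ln -s "$public/piwik" "$public/$dir_link"
ln -s "$public/piwik/piwik.php" "$public/$php_link"
ln -s "$public/piwik/piwik.php" "$public/piwik/$php_link"

printf 'directory link: %s\nscript link: %s\n' "$dir_link" "$php_link"
```

The printed names are the ones to use in the tracking code and in the nginx location blocks.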

Sample tracking code:

<!-- Piwik -->
<script type="text/javascript">
  var _paq = _paq || [];
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  (function() {
    var u="//www.example.com/randomsymlinkname/";
    _paq.push(['setTrackerUrl', u+'randomsymlinkname2.php']);
    _paq.push(['setSiteId', 1]);
    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
    g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'js/'; s.parentNode.insertBefore(g,s);
  })();
</script>
<noscript><p><img src="//www.example.com/randomsymlinkname2.php?idsite=1" style="border:0;" alt="" /></p></noscript>
<!-- End Piwik Code -->

Note the js/ and that the actual code used should be based on the code produced in Piwik, then modified with the symlinks - do not copy the code used here.
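The substitutions can also be applied with sed instead of by hand. The following is a rough sketch, assuming the generated code contains the strings u="//www.example.com/piwik/", u+'piwik.php' and u+'piwik.js'; the input below is an abbreviated stand-in, so the patterns should be checked against the code the dashboard actually produces.

```shell
#!/bin/sh
# Sketch: apply the substitutions described above to tracking code.
# Assumption: the generated code uses u="//www.example.com/piwik/",
# u+'piwik.php' and u+'piwik.js'; check this against the real code first.
dir_link='randomsymlinkname'
php_link='randomsymlinkname2.php'

rewrite_tracking_code() {
    sed -e "s#/piwik/\"#/${dir_link}/\"#" \
        -e "s#u+'piwik\.php'#u+'${php_link}'#" \
        -e "s#u+'piwik\.js'#u+'js/'#" \
        -e "s#/piwik/piwik\.php?#/${php_link}?#"
}

# Abbreviated stand-in for the code copied from the dashboard.
rewritten=$(rewrite_tracking_code <<'EOF'
    var u="//www.example.com/piwik/";
    _paq.push(['setTrackerUrl', u+'piwik.php']);
    g.src=u+'piwik.js';
<noscript><p><img src="//www.example.com/piwik/piwik.php?idsite=1" style="border:0;" alt="" /></p></noscript>
EOF
)

printf '%s\n' "$rewritten"
```

After rewriting, no occurrence of the word piwik should remain in the output, which is a quick way to confirm that every pattern matched.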

Package-configs edits

Edit piwik.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik.conf

Add to the file:

location /randomsymlinkname/ {
    location ~ (randomsymlinkname2|index|piwik|js/index)\.php$ {
        include global-configs/php.conf;
    }
    location ~ \.php$ { deny all; }
}

location ~ randomsymlinkname2\.php$ {
    include global-configs/php.conf;
}

Edit piwik_https.conf:

username@servername:~$ sudo nano /etc/nginx/package-configs/piwik_https.conf

Add to the file:

location /randomsymlinkname/ {
    location ~ (randomsymlinkname2|index|piwik|js/index)\.php$ {
        include global-configs/php_https.conf;
    }
    location ~ \.php$ { deny all; }
}

location ~ randomsymlinkname2\.php$ {
    include global-configs/php_https.conf;
}

Multiple domains

For servers hosting multiple domains, the same symlink file names should be used in each domain to point at the Piwik domain directory and files. This way the same package-configs files may be used in all of the domain server blocks.

Log analytics tool

The Piwik log analytics tool will parse the access logs and import the data into Piwik. There may be cases where importing the logs proves very useful, but most admins will not have a need for it, as the javascript tool generally provides better data more conveniently.

Log analytics import script

The /piwik/misc/log-analytics/import_logs.py script may be run to import logs into Piwik. There are many options available for running this script, including running it from a separate machine.
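As an illustration of what an invocation might look like, the following sketch assembles a command line and prints it rather than running it. The paths, URL, site id and log location are placeholders, and the --url, --idsite and --dry-run options should be confirmed against the script's own --help output for the installed version.

```shell
#!/bin/sh
# Sketch: assemble an import_logs.py command line and print it for review.
# Assumptions: the paths, URL and site id are placeholders, and the
# --url, --idsite and --dry-run options should be checked against --help.
piwik_dir='/var/www/example.com/public/piwik'
piwik_url='https://www.example.com/piwik/'
site_id=1
access_log='/var/log/nginx/access.log'

cmd="python ${piwik_dir}/misc/log-analytics/import_logs.py --url=${piwik_url} --idsite=${site_id} --dry-run ${access_log}"

# Printing instead of executing keeps the sketch side-effect free.
printf '%s\n' "$cmd"
```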

External links

Piwik

Mailinator provides a free, disposable email service that is very convenient for testing.