What Job Offers Tell About Architectures #2

It is time for another look at which technologies companies reveal in their job openings for system administrators.

Today's List

Company | Reference | OS | Languages | Databases | Software | Hardware | Monitoring | Hosting | Automation
1&1 Shared Hosting | [src] | Debian | Perl, Bash | - | Apache, DRBD, Debian Packaging | - | - | - | -
1&1 Virtualisation | [src] | Linux | Perl, Python | - | - | NAS, SAN, iSCSI, NFS | - | VMWare, Xen, KVM, OpenStack, OpenNebula | -
1&1 Monitoring | [src] | Linux, Windows Server | Java, C/C++, Perl, Bash, Ruby | - | - | - | - | - | -
1&1 | [src] | Linux | Perl, Bash, Python | MySQL, Oracle, NoSQL | Apache, Tomcat, Java | - | - | - | -
1&1 | [src] | Linux | Java | - | Apache, Tomcat, JBoss | - | - | - | -
1&1 gmx.de web.de | [src] | Linux | Perl, Bash, Python | MySQL, Oracle, NoSQL, Cassandra | - | - | - | - | Puppet
buch.de | [src] | - | - | MySQL, MS SQL | DRBD, heartbeat | - | - | - | -
First Colo | [src] | Linux | Bash, PHP, Perl, Python | Postgres, MySQL | - | Cisco, Juniper | - | - | -
IKB Bank | [src] | SLES 10/11/12, Solaris 8/10 | Bash, Perl | Postgres, MySQL | - | Fibre Channel SAN, Storages: EMC, HP, IBM | - | - | -
s'Oliver | [src] | AIX | - | - | SAP, Tivoli TSM | IBM SAN | - | PowerVM | -

The Interesting Things

As you can see, there are a lot of 1&1 offers. With 1&1 being one of the large, successful German ISPs and running the leading mail portals GMX and Web.de, it is interesting to see their technology requirements. They seem to be focusing on Java, MySQL and Oracle, with NoSQL and Apache Cassandra being the youngest, most startup-ish tools used. So to me 1&1 makes a somewhat old-school impression, which of course is not necessarily bad. As they indicate no standard monitoring solution, I guess they use a proprietary or commercial one.

What I found noteworthy about the other companies is the usage of DRBD. It is more commonly used than I had expected.

Note: You can find the complete index of all companies so far here: What Job Offers Tell About Architectures.

Large Website Technology Changes in October 2013

Last month I did a first indexing of mostly request header, HTML and DNS based information about the top 200 sites listed by Alexa and the top 100 German websites. All the information is freely available and extracted from the website responses.

The detailed results can be found here:

What Changed In A Month?

Now, roughly a month later, I repeated the scan to see what changed and what trends might be noticeable:

DNS-Prefetching: The HTML header based DNS prefetching is spreading. Two of the large adult content sites added it, and the Chinese search engine baidu.com introduced it too. As in my experience it provides a small but measurable latency improvement on many sites, I'd guess its usage will spread further...

Added by baidu.com, pornhub.com, tube8.com, www.wer-kennt-wen.de
Removed by taobao.com

IPv6: IPv6 support did not spread in the last month. The only change is two sites that dropped their AAAA records.

Removed by www.mail.ru, www.volkswagen.de

Varnish: One of the most significant changes is more and more sites announcing the usage of Varnish as a cache. This is sometimes combined with a CDN, sometimes without.

Introduced by bbc.co.uk, www.otto.de, www.pinterest.com, www.wikipedia.org

Hiding Server Version: Two more websites are now hiding their Apache version.

Removed version now: www.weibo.com, www.ftd.de

XSS Header: XSS headers are not widely spread. No real change in adoption.

Added by www.craigslist.org
Removed by www.pinterest.com


All the results listed above are based on a simple scanning script. The results present a snapshot of the websites and a single response only. This is of course not necessarily an indication of which techniques a site uses in daily operations!
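To give an idea of what such response-based checks can look like, here is a minimal shell sketch. It is not the actual scanning script: the function names and the exact patterns are assumptions chosen for illustration, and each function simply reads a saved response (headers or HTML) from stdin.

```shell
#!/bin/bash
# Hypothetical helpers, not the original scanner. Each one reads a
# response (headers or HTML) from stdin and returns 0 on a match.

# HTML or headers contain a DNS prefetch hint
has_dns_prefetch() {
    grep -Eqi 'rel=.?dns-prefetch|x-dns-prefetch-control'
}

# Response headers mention Varnish (X-Varnish or a Via header naming it)
uses_varnish() {
    grep -Eqi '^(x-varnish:|via:.*varnish)'
}

# A Server header is present but carries no version number
hides_server_version() {
    local server
    server=$(grep -i '^server:' | head -n 1)
    [ -n "$server" ] && ! echo "$server" | grep -q '[0-9]'
}

# An X-XSS-Protection header is set
has_xss_header() {
    grep -qi '^x-xss-protection:'
}
```

It could be used like `curl -sI http://www.example.com/ | uses_varnish && echo "Varnish"`, while an AAAA record can be looked up with `dig +short AAAA www.example.com`. As with the real scan, a single response only tells you what the site announced at that moment.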

Learning from Job Offers

Every other day you get one, skim it and delete it. In a way it says exactly the same thing each time. Or does it?

Surely the different companies you read job offers from are using different technologies. Actually, when announcing a position they are at their most exposed. They have to admit which technologies they use, how heterogeneous they are and sometimes how old-style they are.

From time to time I'll compile some positions available online (not offers I got!) from mostly German, and perhaps more often Berlin-based, companies and add them to an ever-growing list "What Job Offers Tell". Below you find the first 10 company offers with their data.

Today's List

Company | Link | OS | Languages | Databases | Software | Hardware | Monitoring | Hosting | Automation
Fraunhofer HHI | [src] | Ubuntu, Redhat, Debian | C# | - | Windows AD, Windows Terminal Server, DFS | Fibre Channel | - | vSphere 5 | -
Idealo.de | [src] | Debian, Redhat | - | - | - | - | Icinga, Cacti, NewRelic | - | -
Idealo.de | [src] | Debian | - | NoSQL, MongoDB | LAMP | - | Nagios, Zookeeper, Corosync | KVM | -
KPMG | [src] | Windows Server 2003/2008 | - | MS SQL 2005/2008, SQL BI | MS IIS | - | - | - | -
Lusini.de | [src] | Linux | Node.js | MongoDB | nginx, Varnish, Elastic Search | - | NewRelic | - | Puppet, SaltStack
maxdome.de | [src] | Redhat | PHP | MySQL, Postgres, Redis, CouchDB | Tomcat, JBoss, Apache, nginx | - | - | Cloud | Puppet, Foreman, Chef, Rex
Springer Online | [src] | CentOS, Redhat | Bash, Python, Ruby | MySQL, Postgres, MongoDB | Apache, HaProxy, JBoss, Tomcat, Nginx, Varnish | F5, Cisco | Icinga, Graphite | KVM, VMWare, Xen, AWS | chef, Puppet
Teufel | [src] | Windows | Powershell, Bash | Exchange | - | - | - | VMWare, Hyper V | chef, Puppet
T Systems | [src] | Linux | Shell, Perl, PHP | MySQL, Oracle | heartbeat, DRBD, Apache, Tomcat, JBoss, Weblogic | - | - | - | -
Zalando | [src] | Windows | - | - | MS ADS | SAN | - | - | -

The Interesting Parts

It gets interesting where positions go into detail, like T-Systems using heartbeat and DRBD or Springer mentioning the real-time graphing engine Graphite. I also like maxdome using four different automation tools: Puppet, Foreman, Chef and Rex. That's two too many. A more exotic thing is Idealo.de using Apache Zookeeper, or is it more common than I think?

On the other hand some other candidates either have no automation needs, or the lack of hints indicates self-made automation or none at all.

Stay tuned for the next set of companies!

HowTo: Munin and rrdcached on Ubuntu 12.04

Let's assume you already have Munin installed and working and you want to reduce disk I/O and improve responsiveness by adding rrdcached. Here are the complete steps to integrate rrdcached:

Basic Installation

First install the stock package

apt-get install rrdcached

and integrate it with Munin:

  1. Enable the rrdcached socket line in /etc/munin/munin.conf
  2. Disable munin-html and munin-graph calls in /usr/bin/munin-cron
  3. Create /usr/bin/munin-graph with
    nice /usr/share/munin/munin-html "$@" || exit 1
    nice /usr/share/munin/munin-graph --cron "$@" || exit 1

    and make it executable

  4. Add a cron job (e.g. to /etc/cron.d/munin) to start munin-graph:
    10 * * * *      munin if [ -x /usr/bin/munin-graph ]; then /usr/bin/munin-graph; fi

The Critical Stuff

To get Munin to use rrdcached on Ubuntu 12.04, make sure to follow these vital steps:

  1. Add "-s <webserver group>" to $OPTS in /etc/init.d/rrdcached (in front of the first -l switch)
  2. Change "-b /var/lib/rrdcached/db/" to "-b /var/lib/munin" (or wherever you keep your RRDs)

So a patched default Debian/Ubuntu /etc/init.d/rrdcached for use with Apache would have

OPTS="-s www-data -l unix:/var/run/rrdcached.sock"
OPTS="$OPTS -j /var/lib/rrdcached/journal/ -F"
OPTS="$OPTS -b /var/lib/munin/ -B"

If you do not set the socket group with "-s" you will see "Permission denied" in /var/log/munin/munin-cgi-graph.log

[RRD ERROR] Unable to graph /var/lib/munin/
cgi-tmp/munin-cgi-graph/[...].png : Unable to connect to rrdcached: 
Permission denied

If you do not change the rrdcached working directory you will see "rrdc_flush" errors in your /var/log/munin/munin-cgi-graph.log

[RRD ERROR] Unable to graph /var/lib/munin/
cgi-tmp/munin-cgi-graph/[...].png : 
rrdc_flush (/var/lib/munin/[...].rrd) failed with status -1.

Some details on this can be found in the Munin wiki.
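Both vital steps can be checked before restarting the daemon. The following is only a sketch: the helper name check_rrdcached_opts is made up for this example, and it assumes the options are set in lines starting with OPTS= as in the patched script shown above.

```shell
#!/bin/bash
# check_rrdcached_opts FILE [RRD_DIR]
# Hypothetical helper: inspects the OPTS= lines of an rrdcached init script.
check_rrdcached_opts() {
    local file=$1 rrd_dir=${2:-/var/lib/munin}
    local opts before_l

    # Join all OPTS= assignments into a single line
    opts=$(grep '^OPTS=' "$file" | tr '\n' ' ')

    # Everything in front of the first "-l" switch must contain "-s <group>"
    before_l=${opts%%-l *}
    case "$before_l" in
        *"-s "*) ;;
        *) echo "no -s <group> in front of the first -l"; return 1 ;;
    esac

    # The base directory must point at the Munin RRDs (simple prefix match)
    case "$opts" in
        *"-b $rrd_dir"*) ;;
        *) echo "-b does not point to $rrd_dir"; return 1 ;;
    esac

    echo "OPTS look fine"
}
```

Run it as `check_rrdcached_opts /etc/init.d/rrdcached` and pass a second argument if your RRDs live somewhere other than /var/lib/munin.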

Liferea Code Repo Moved to github

I moved the source repo away from SourceForge to GitHub.
It is currently located here:


If in doubt always follow the "Code" link from the website to find the repo.

Sorry if this causes trouble for you. I'll contact everyone with current git write
access directly to see how we can continue on GitHub and who will be able
to merge.

Please keep contributing! I think with github this can actually become
easier and more developers are familiar with its best practices.

GLib GRegex Regular Expression Cheat Sheet

GLib supports PCRE based regular expressions since v2.14 with the GRegex class.


GError *err = NULL;
GMatchInfo *matchInfo;
GRegex *regex;
regex = g_regex_new ("text", 0, 0, &err);
// check for compilation errors here!
g_regex_match (regex, "Some text to match", 0, &matchInfo);

Note how g_regex_new() gets the pattern as its first parameter without any regex delimiters. As the regex is created separately it can and should be reused.

Checking if a GRegex did match

The example above just ran the regular expression, but did not test for a match. To simply test for a match add something like this:

if (g_match_info_matches (matchInfo))
    g_print ("Text found!\n");

Extracting Data

If you are interested in data matched you need to use matching groups and need to iterate over the matches in the GMatchInfo structure. Here is an example (without any error checking):

regex = g_regex_new (" mykey=(\\w+) ", 0, 0, &err);
g_regex_match (regex, content, 0, &matchInfo);

while (g_match_info_matches (matchInfo)) {
   // group 1 is the captured value, group 0 would be the whole match
   gchar *result = g_match_info_fetch (matchInfo, 1);

   g_print ("mykey=%s\n", result);
   g_match_info_next (matchInfo, &err);
   g_free (result);
}
g_match_info_free (matchInfo);
g_regex_unref (regex);

Easy String Splitting

Another nice feature in GLib is regex based string splitting with g_regex_split() or g_regex_split_simple():

gchar **results = g_regex_split_simple ("\\s+",
       "White space separated list", 0, 0);

Use g_regex_split() for a precompiled regex or use the "simple" function to just pass the pattern. Free the returned array with g_strfreev() when you are done.

Chef: How To Debug Active Attributes

If you experience problems with attribute inheritance on a Chef client and watch the chef-client output without knowing which attributes are effective, you can either look at the Chef GUI or check on the console using "shef", or "chef-shell" in newer Chef releases.

So run

chef-shell -z

The "-z" is important to get chef-shell to load the currently active run list for the node that a "chef-client" run would use.

Then enter "attributes" to switch to attribute mode

chef > attributes
chef:attributes >

and query anything you like by specifying the attribute path as you do in recipes:

chef:attributes > default["authorized_keys"]
chef:attributes > node["packages"]

By just querying for "node" you get a full dump of all attributes.

Never Forget _netdev with GlusterFS Mounts

When adding a GlusterFS share to /etc/fstab do not forget to add "_netdev" to the mount options. Otherwise your system will just hang on the next boot!

Actually there doesn't seem to be a timeout. That would be nice too.

As a side note: do not forget that Ubuntu 12.04 doesn't even care about "_netdev". So the network is not guaranteed to be up when mounting, and an additional upstart task or init script is needed anyway. But you still need "_netdev" to prevent hanging on boot.

I also have the impression that this only happens with stock kernel 3.8.x and not with 3.4.x!
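To catch a forgotten "_netdev" before the next reboot, a small check like the following might help. It is a sketch only: the function name is made up, and it assumes the usual fstab column layout with the filesystem type in the third and the mount options in the fourth column.

```shell
#!/bin/bash
# find_gluster_without_netdev [FILE]
# Hypothetical helper: prints device and mount point of every glusterfs
# entry in an fstab-style file whose options lack "_netdev".
find_gluster_without_netdev() {
    awk '$1 !~ /^#/ && $3 == "glusterfs" && $4 !~ /(^|,)_netdev(,|$)/ { print $1, $2 }' \
        "${1:-/etc/fstab}"
}
```

Called without arguments it reads /etc/fstab. No output means every GlusterFS entry carries the option.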

Splunk Cheat Sheet

Basic Searching Concepts

Simple searches look like the following examples. Note that there are literals with and without quoting and that there are field selections with an "=":

Exception                # just the word
One Two Three            # those three words in any order
"One Two Three"          # the exact phrase

# Filter all lines where field "status" has value 500 from access.log
source="/var/log/apache/access.log" status=500

# Give me all fatal errors from syslog of the blog host
host="myblog" source="/var/log/syslog" Fatal

Basic Filtering

Two important filters are "rex" and "regex".

"rex" is for extracting a pattern and storing it as a new field. This is why you need to specify a named extraction group in a Perl-like manner, "(?P<FIELDNAME>...)", for example

source="some.log" Fatal | rex "(?i) msg=(?P<FIELDNAME>[^,]+)"

When running the above query, check the list of "interesting fields". It should now have an entry "FIELDNAME" listing the top 10 fatal messages from "some.log".

What is the difference to "regex" now? Well, "regex" is like grep. Actually you can rephrase

source="some.log" Fatal

as

source="some.log" | regex _raw=".*Fatal.*"

and get the same result. The syntax of "regex" is simply "<field>=<regex>". Using it makes sense once you want to filter for a specific field.
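If the grep comparison helps, here is roughly what the "rex" query from above computes, written as a plain shell pipeline. This is only an analogy and has nothing to do with how Splunk works internally. The function name is made up, and unlike the "(?i)" in the Splunk pattern the sed expression here matches "msg=" case-sensitively.

```shell
#!/bin/bash
# top_fatal_messages FILE
# Hypothetical helper: the ten most frequent msg= values on lines
# that contain "Fatal" (in any case), most frequent first.
top_fatal_messages() {
    grep -i 'Fatal' "$1" \
        | sed -n 's/.* msg=\([^,]*\).*/\1/p' \
        | sort | uniq -c | sort -rn | head -n 10
}
```

grep plays the role of the search term (or of "regex"), while the sed capture group does what the named group in "rex" does: it pulls out the value that is then counted.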


Sum up a field and do some arithmetic:

... | stats sum(<field>) as result | eval result=(result/1000)

Determine the size of log events by checking len() of _raw. The p10() and p90() functions return the 10th and 90th percentiles:

| eval raw_len=len(_raw) | stats avg(raw_len), p10(raw_len), p90(raw_len) by sourcetype

Simple Useful Examples

Splunk usually auto-detects access.log fields so you can do queries like:

source="/var/log/nginx/access.log" HTTP 500
source="/var/log/nginx/access.log" HTTP (200 OR 30*)
source="/var/log/nginx/access.log" status=404 | sort - uri 
source="/var/log/nginx/access.log" | head 1000 | top 50 clientip
source="/var/log/nginx/access.log" | head 1000 | top 50 referer
source="/var/log/nginx/access.log" | head 1000 | top 50 uri
source="/var/log/nginx/access.log" | head 1000 | top 50 method

Emailing Results

By appending "sendemail" to any query you get the result by mail!

... | sendemail to="[email protected]"


Create a timechart from a single field that should be summed up

... | table _time, <field> | timechart span=1d sum(<field>)
... | table _time, <field>, name | timechart span=1d sum(<field>) by name

Index Statistics

List All Indices

 | eventcount summarize=false index=* | dedup index | fields index
 | eventcount summarize=false report_size=true index=* | eval size_MB = round(size_bytes/1024/1024,2)
 | REST /services/data/indexes | table title
 | REST /services/data/indexes | table title splunk_server currentDBSizeMB frozenTimePeriodInSecs maxTime minTime totalEventCount

On the command line you can call

$SPLUNK_HOME/bin/splunk list index

To query the amount written per index, metrics.log can be used:

index=_internal source=*metrics.log group=per_index_thruput series=* | eval MB = round(kb/1024,2) | timechart sum(MB) as MB by series

MB per day per indexer / index

index=_internal metrics kb series!=_* "group=per_host_thruput" monthsago=1 | eval indexed_mb = kb / 1024 | timechart fixedrange=t span=1d sum(indexed_mb) by series | rename sum(indexed_mb) as totalmb

index=_internal metrics kb series!=_* "group=per_index_thruput" monthsago=1 | eval indexed_mb = kb / 1024 | timechart fixedrange=t span=1d sum(indexed_mb) by series | rename sum(indexed_mb) as totalmb

Silencing the Nagios Plugin check_ntp_peer


The Nagios plugin "check_ntp_peer" from the Debian package "nagios-plugins-basic" is not very nice. After a machine reboots it keeps shouting at you about the LI_ALARM bit and negative jitter, despite everything actually being fine. A small wrapper script silences it:


#!/bin/bash

# Run the real plugin and remember its exit code
result=$(/usr/lib/nagios/plugins/check_ntp_peer "$@")
status=$?

if echo "$result" | grep -Eq 'jitter=-1.00000|has the LI_ALARM'; then
	echo "Unknown state after reboot."
	exit 0
fi

echo "$result"
exit $status

Using the above wrapper you get rid of the warnings.
