GeoIP and MySQL

For my own and possibly others’ reference, these are quick notes on how to use GeoIP data from MaxMind in their new split file formats. Older tutorials describe using the GeoIP data from a time when they were one file, it seems now MaxMind have split into two files.

The files are split into Location and Blocks, so we create two tables to accommodate this:

CREATE TABLE location (
 LocId INT(8) NOT NULL,
 PRIMARY KEY (LocId),
 country CHAR(2),
 region VARCHAR(255),
 city VARCHAR(255),
 postalCode VARCHAR(255),
 latitude DECIMAL(20,17),
 longitude DECIMAL(20,17),
 metroCode VARCHAR(50),
 areaCode VARCHAR(50)
) ENGINE=INNODB;
CREATE TABLE blocks (
 startIpNum INT(10) NOT NULL,
 EndIpNum INT(10) NOT NULL,
 LocId INT(8) NOT NULL,
 FOREIGN KEY (LocID) REFERENCES location(LocId) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=INNODB;

Now it’s time to import the data from the files. Since LocId is a foreign key to the blocks table, location must be inserted first.

mysql> LOAD DATA LOCAL INFILE '/root/GeoLiteCity_20090601/GeoLiteCity-Location.csv' INTO TABLE location FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' (LocId, country, region, city, postalCode, latitude, longitude, metroCode, areaCode);
Query OK, 248276 rows affected, 13 warnings (9.65 sec)
Records: 248277 Deleted: 0 Skipped: 1 Warnings: 9
mysql> LOAD DATA LOCAL INFILE '/root/GeoLiteCity_20090601/GeoLiteCity-Blocks.csv' INTO TABLE blocks FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' (startIpNum, EndIpNum, LocId);
Query OK, 4132041 rows affected, 65535 warnings (15 min 20.44 sec)
Records: 4132041 Deleted: 0 Skipped: 0 Warnings: 2739156

That done, we try a select.

mysql> select A.country from location A
  JOIN blocks B on A.LocId = B.LocId
  AND 404232216 >= startIpNum AND 404232216 <= EndIpNum;
+---------+
| country |
+---------+
| US      |
+---------+
1 row in set (2 min 3.58 sec)

As I had no indexes yet, the query took some time. So let’s add them:

mysql> CREATE INDEX startIp_ind ON blocks (startIpNum);
Query OK, 4132041 rows affected (9 min 22.23 sec)
Records: 4132041 Duplicates: 0 Warnings: 0
mysql> CREATE INDEX endIp_ind ON blocks (endIpNum);
Query OK, 4132041 rows affected (9 min 22.33 sec)
Records: 4132041 Duplicates: 0 Warnings: 0
mysql> CREATE INDEX LocId_ind ON location (LocId);
Query OK, 248276 rows affected (4.08 sec)
Records: 248276 Duplicates: 0 Warnings: 0

And voila:

mysql> select A.country from location A
  JOIN blocks B on A.LocId = B.LocId
  AND 404232216 >= startIpNum AND 404232216 <= EndIpNum;
+---------+
| country |
+---------+
| US      |
+---------+
1 row in set (0.55 sec)

Also, a better SQL query could’ve been more efficient, but hey – this is for illustrative purposes.

Ikke autorisert personer!

I dag ble jeg gjort oppmerksom på den flotte oversettelsen på de nye selvbetjeningssidene for (blant annet) å søke om visum til USA (med min utheving):

Er hjemmesiden sikker og privat?

Ja. Denne hjemmesiden blir operert av den Amerikanske myndigheten, og det benyttes teknologi som gjør at ikke autorisert personer får tilgang til informasjonen du legger inn og ser på . I tillegg opererer denne hjemmesiden ut i fra de lover og regler som den Amerikanske Stats lov om personvern. Denne Personvernsbekreftelsen forsikrer deg om at ditt personvern vil bli ivaretattinformation.

Så da burde jo alt være i orden.

Oppdatering – Kollega Knut viste meg popup-en han fikk da han gikk inn på de samme sidene. Med de forutsetningene burde de fleste muligheter være dekket.

Remote syslogging with OpenWRT Kamikaze 8.09

For Kamikaze 8.09 (RC1) the documentation on remote syslogging is a bit inaccurate or outdated – but this ticket pointed me in the right direction. Add the following to /etc/config/system:

log_ip 10.20.30.40

Reboot your unit and watch the logs roll in. Verify by checking what syslog is up to:

root@OpenWRT:~# ps -ef | grep syslog
87 root 1928 S syslogd -C16 -L -R 10.20.30.40

Now, that said, I also found how to set the name servers (for /etc/resolv.conf) the correct way – adding an option entry under a suitable interface definition in /etc/config/network. Note that for several DNS servers the IPs should be space separated, like this:

option ‘dns’ ‘10.0.0.1 10.0.0.2’

This creates an /etc/resolv.conf looking like this one:

nameserver 10.0.0.1
nameserver 10.0.0.2

VLAN woes

I’ve been planning to do this for a long time and now I’m finally there. My home network now consists of two virtual host servers (one Xen and one KVM) and a firewall between them, all nodes understanding VLANs. On top of this, add a small but powerful Linksys switch and a Linksys wireless access point running OpenWRT Kamikaze 8.09 (RC1). Among other things, this setup should facilitate testing Munin and other stuff without breaking the current functionality too much.

I got great help from this article on tweaking VLANs on the Xen host, and this article on VLANs with kvm. Thanks!