Intro to Osmosis and OSMFilter¶

Author: Alex Michels

In this notebook we will discuss how to use Osmosis and OSMFilter to work with OpenStreetMap exports.

Table of Contents¶

  • Obtaining Data
  • Install
    • Osmosis
    • OSMFilter
  • Getting Road Network
  • Filtering Features
    • Airports
    • Offices
    • Coffee in Chambana
In [1]:
import contextily as cx
import matplotlib.pyplot as plt
import osmnx as ox
from shapely.geometry import Polygon

Obtaining Data¶

Our data comes from a full planet export of OpenStreetMap data, available at https://planet.openstreetmap.org/planet/2018/. We specifically downloaded the 180101 dataset.

The data is too large to store on GitHub, so we extracted the Urbana-Champaign area using Osmosis with the following command:

osmosis --read-xml planet-180101.osm \
        --bounding-box  top=40.17102388888516 \
                        left=-88.33484169754672 \
                        bottom=40.0238548208513 \
                        right=-88.16238199485753 \
        completeWays=yes \
        completeRelations=yes \
        --write-xml cu-180101.osm
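
The same extraction can also be scripted from Python. The sketch below is only an illustration, not the code used to produce the data: it assembles the Osmosis argument list from bounds given in Shapely's (minx, miny, maxx, maxy) order. The helper name bounding_box_args is hypothetical, and the actual call is left as a comment because it needs the planet file and Osmosis on the PATH.

```python
def bounding_box_args(bounds, src, dst):
    """Build an osmosis command line for a bounding-box extract.

    bounds follows the order of Shapely's Polygon.bounds:
    (minx, miny, maxx, maxy), i.e. (left, bottom, right, top).
    """
    left, bottom, right, top = bounds
    return [
        "osmosis",
        "--read-xml", src,
        "--bounding-box",
        f"top={top}", f"left={left}", f"bottom={bottom}", f"right={right}",
        "completeWays=yes", "completeRelations=yes",
        "--write-xml", dst,
    ]


# Bounds of the Urbana-Champaign extract used in this notebook.
cu_bounds = (-88.33484169754672, 40.0238548208513,
             -88.16238199485753, 40.17102388888516)
cmd = bounding_box_args(cu_bounds, "planet-180101.osm", "cu-180101.osm")
print(" ".join(cmd))
# To run it for real: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues and keeps the coordinates in one place.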

For this notebook we have extracted a very small segment that covers the Urbana-Champaign area. Let's unzip that data.

In [2]:
!unzip -o cu-180101.osm.zip
Archive:  cu-180101.osm.zip
  inflating: cu-180101.osm           

Let's also create a Shapely polygon matching that bounding box, as it will come in handy later:

In [3]:
bbox = Polygon(
    [(-88.33484169754672, 40.17102388888516),
    (-88.16238199485753, 40.17102388888516),
    (-88.16238199485753, 40.0238548208513),
    (-88.33484169754672, 40.0238548208513)]
)
bbox
Out[3]:

Install¶

Unfortunately, Osmosis and OSMFilter are not currently supported by EasyBuild, but they are relatively simple to install, so we will do that in the next few steps. Osmosis does require Java, which is not available in the kernel by default, but we can use Lmod to load the EasyBuild-installed Java 11.0.2.

In [4]:
import sys, os
sys.path.insert(0, os.path.join(os.environ['MODULESHOME'], "init"))
from env_modules_python import module

Because Java is not loaded into the Python environment by default, we use Lmod to load it:

In [5]:
# we catch the outputs here so they aren't printed, but don't care about them unless there is an error
(dont, care) = module("load", "Java/11.0.2")

After loading, let's list the loaded modules and verify that Java is there (it should be last).

In [6]:
# list the loaded modules; the returned values only matter if there is an error
(dont, care) = module("list")
Currently Loaded Modules:
  1) GCCcore/8.3.0                      44) x265/3.2-GCCcore-8.3.0
  2) zlib/1.2.11-GCCcore-8.3.0          45) util-linux/2.34-GCCcore-8.3.0
  3) binutils/2.32-GCCcore-8.3.0        46) fontconfig/2.13.1-GCCcore-8.3.0
  4) GCC/8.3.0                          47) xorg-macros/1.19.2-GCCcore-8.3.0
  5) numactl/2.0.12-GCCcore-8.3.0       48) X11/20190717-GCCcore-8.3.0
  6) XZ/5.2.4-GCCcore-8.3.0             49) FriBidi/1.0.5-GCCcore-8.3.0
  7) libxml2/2.9.9-GCCcore-8.3.0        50) FFmpeg/4.2.1-GCCcore-8.3.0
  8) libpciaccess/0.14-GCCcore-8.3.0    51) pixman/0.38.4-GCCcore-8.3.0
  9) hwloc/1.11.12-GCCcore-8.3.0        52) libffi/3.2.1-GCCcore-8.3.0
 10) OpenMPI/3.1.4-GCC-8.3.0            53) GLib/2.62.0-GCCcore-8.3.0
 11) OpenBLAS/0.3.7-GCC-8.3.0           54) cairo/1.16.0-GCCcore-8.3.0
 12) gompi/2019b                        55) GMP/6.1.2-GCCcore-8.3.0
 13) FFTW/3.3.8-gompi-2019b             56) nettle/3.5.1-GCCcore-8.3.0
 14) ScaLAPACK/2.0.2-gompi-2019b        57) libdrm/2.4.99-GCCcore-8.3.0
 15) foss/2019b                         58) LLVM/9.0.0-GCCcore-8.3.0
 16) bzip2/1.0.8-GCCcore-8.3.0          59) libunwind/1.3.1-GCCcore-8.3.0
 17) ncurses/6.1-GCCcore-8.3.0          60) Mesa/19.1.7-GCCcore-8.3.0
 18) gettext/0.20.1-GCCcore-8.3.0       61) libGLU/9.0.1-GCCcore-8.3.0
 19) libpng/1.6.37-GCCcore-8.3.0        62) gzip/1.10-GCCcore-8.3.0
 20) libreadline/8.0-GCCcore-8.3.0      63) lz4/1.9.2-GCCcore-8.3.0
 21) Szip/2.1.1-GCCcore-8.3.0           64) zstd/1.4.4-GCCcore-8.3.0
 22) HDF5/1.10.5-gompi-2019b            65) GRASS/7.8.3-foss-2019b
 23) cURL/7.66.0-GCCcore-8.3.0          66) MPICH/3.3.2-GCC-8.3.0
 24) netCDF/4.7.1-gompi-2019b           67) RHESSysEastCoast/7.2.0-foss-2019b
 25) expat/2.2.7-GCCcore-8.3.0          68) netCDF-Fortran/4.5.2-gompi-2019b
 26) GEOS/3.8.0-GCC-8.3.0               69) SUMMA/3.0.3-foss-2019b
 27) Tcl/8.6.9-GCCcore-8.3.0            70) TauDEM/5.3.8-foss-2019b
 28) SQLite/3.29.0-GCCcore-8.3.0        71) WRF/4.2.1-foss-2019b-dmpar
 29) NASM/2.14.02-GCCcore-8.3.0         72) WPS/4.2-foss-2019b-dmpar
 30) libjpeg-turbo/2.0.3-GCCcore-8.3.0  73) find_inlets/20191210-foss-2019b
 31) JasPer/2.0.14-GCCcore-8.3.0        74) Boost/1.71.0-gompi-2019b
 32) LibTIFF/4.0.10-GCCcore-8.3.0       75) Xvfb/1.20.8-GCCcore-8.3.0
 33) PCRE/8.43-GCCcore-8.3.0            76) protozero/1.7.0-GCCcore-8.3.0
 34) PROJ/6.2.1-GCCcore-8.3.0           77) sparsehash/2.0.3-GCCcore-8.3.0
 35) libgeotiff/1.5.1-GCCcore-8.3.0     78) libosmium/2.15.6-foss-2019b
 36) libtirpc/1.2.6-GCCcore-8.3.0       79) SoPlex/4.0.1-foss-2019b
 37) HDF/4.2.14-GCCcore-8.3.0           80) PostgreSQL/12.4-GCCcore-8.3.0
 38) GDAL/3.0.2-foss-2019b              81) protobuf/3.10.0-GCCcore-8.3.0
 39) FreeXL/1.0.5-GCCcore-8.3.0         82) protobuf-c/1.3.3-GCCcore-8.3.0
 40) libspatialite/4.3.0a-GCC-8.3.0     83) PostGIS/3.1.2-foss-2019b
 41) freetype/2.10.1-GCCcore-8.3.0      84) cybergisx/0.9.0
 42) x264/20190925-GCCcore-8.3.0        85) Java/11.0.2
 43) LAME/3.100-GCCcore-8.3.0

 



Osmosis¶

Osmosis is a Java-based command line application for processing OpenStreetMap data!

We will install the latest (as of writing) release of Osmosis from GitHub with the following lines of code (documentation here):

In [7]:
!mkdir -p ~/.local/osmosis
!wget -P ~/.local/osmosis https://github.com/openstreetmap/osmosis/releases/download/0.48.3/osmosis-0.48.3.tgz
!cd ~/.local/osmosis && tar xvfz osmosis-0.48.3.tgz
!rm ~/.local/osmosis/osmosis-0.48.3.tgz
!cp ~/.local/osmosis/bin/osmosis ~/.local/bin
!chmod a+x ~/.local/osmosis/bin/osmosis
--2024-05-21 20:23:40--  https://github.com/openstreetmap/osmosis/releases/download/0.48.3/osmosis-0.48.3.tgz
Resolving github.com (github.com)... 140.82.112.4
Connecting to github.com (github.com)|140.82.112.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/2564522/c71b8480-f1b7-11ea-95c1-1f22abeac411?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20240521%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240521T202340Z&X-Amz-Expires=300&X-Amz-Signature=bb16da6de0483cf2510e744399d4b5e42f2d39b3bdd32e737229efdc4c8ee323&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=2564522&response-content-disposition=attachment%3B%20filename%3Dosmosis-0.48.3.tgz&response-content-type=application%2Foctet-stream [following]
--2024-05-21 20:23:40--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/2564522/c71b8480-f1b7-11ea-95c1-1f22abeac411?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20240521%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240521T202340Z&X-Amz-Expires=300&X-Amz-Signature=bb16da6de0483cf2510e744399d4b5e42f2d39b3bdd32e737229efdc4c8ee323&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=2564522&response-content-disposition=attachment%3B%20filename%3Dosmosis-0.48.3.tgz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15169299 (14M) [application/octet-stream]
Saving to: ‘/home/jovyan/.local/osmosis/osmosis-0.48.3.tgz’

osmosis-0.48.3.tgz  100%[===================>]  14.47M  --.-KB/s    in 0.1s    

2024-05-21 20:23:41 (126 MB/s) - ‘/home/jovyan/.local/osmosis/osmosis-0.48.3.tgz’ saved [15169299/15169299]

copying.txt
bin/
bin/osmosis-extract-mysql-0.6
bin/osmosis.bat
bin/osmosis
bin/osmosis-extract-apidb-0.6
config/
config/plexus.conf
script/
script/fix_line_endings.sh
script/pgsimple_schema_0.6_action.sql
script/pgsnapshot_schema_0.6_changes.sql
script/pgsnapshot_schema_0.6.sql
script/pgsnapshot_load_0.6.sql
script/pgsnapshot_schema_0.6_action.sql
script/pgsimple_load_0.6.sql
script/pgsnapshot_schema_0.6_linestring.sql
script/munin/
script/munin/README
script/munin/osm_replication.conf
script/munin/osm_replication_lag
script/pgsimple_schema_0.6.sql
script/contrib/
script/contrib/CreateGeometryForWays.sql
script/contrib/replicate_osm_file.sh
script/contrib/dump_apidb.sh
script/contrib/apidb_0.6.sql
script/contrib/apidb_0.6_osmosis_xid_indexing.sql
script/pgsnapshot_schema_0.6_bbox.sql
script/pgsnapshot_schema_0.6_upgrade_5-6.sql
script/pgsimple_schema_0.6_linestring.sql
script/pgsimple_schema_0.6_bbox.sql
script/pgsnapshot_schema_0.6_upgrade_4-5.sql
script/pgsnapshot_and_pgsimple.txt
changes.txt
lib/
lib/default/
lib/default/osmosis-areafilter-0.48.3.jar
lib/default/xercesImpl-2.12.0.jar
lib/default/osmosis-pbf2-0.48.3.jar
lib/default/commons-csv-1.7.jar
lib/default/guava-26.0-jre.jar
lib/default/checker-qual-2.5.2.jar
lib/default/osmosis-pgsimple-0.48.3.jar
lib/default/osmosis-tagtransform-0.48.3.jar
lib/default/commons-logging-1.0.4.jar
lib/default/protobuf-java-3.12.2.jar
lib/default/error_prone_annotations-2.1.3.jar
lib/default/osmosis-replication-0.48.3.jar
lib/default/woodstox-core-5.1.0.jar
lib/default/osmosis-pbf-0.48.3.jar
lib/default/postgresql-42.2.5.jar
lib/default/jpf-1.5.jar
lib/default/osmosis-osm-binary-0.48.3.jar
lib/default/commons-dbcp-1.4.jar
lib/default/spring-beans-5.1.0.RELEASE.jar
lib/default/osmosis-pgsnapshot-0.48.3.jar
lib/default/osmosis-apidb-0.48.3.jar
lib/default/commons-codec-1.11.jar
lib/default/spring-tx-5.1.0.RELEASE.jar
lib/default/commons-io-2.6.jar
lib/default/spring-jcl-5.1.0.RELEASE.jar
lib/default/osmosis-hstore-jdbc-0.48.3.jar
lib/default/j2objc-annotations-1.1.jar
lib/default/xml-apis-1.4.01.jar
lib/default/netty-3.10.6.Final.jar
lib/default/spring-core-5.1.0.RELEASE.jar
lib/default/commons-compress-1.18.jar
lib/default/stax2-api-4.1.jar
lib/default/animal-sniffer-annotations-1.14.jar
lib/default/osmosis-replication-http-0.48.3.jar
lib/default/postgis-jdbc-2.2.1.jar
lib/default/osmosis-set-0.48.3.jar
lib/default/plexus-classworlds-2.5.2.jar
lib/default/osmosis-dataset-0.48.3.jar
lib/default/mysql-connector-java-8.0.12.jar
lib/default/jsr305-3.0.2.jar
lib/default/commons-pool-1.5.4.jar
lib/default/osmosis-xml-0.48.3.jar
lib/default/osmosis-tagfilter-0.48.3.jar
lib/default/osmosis-core-0.48.3.jar
lib/default/spring-jdbc-5.1.0.RELEASE.jar
lib/default/osmosis-extract-0.48.3.jar
readme.txt

With the software installed, we need to add it to our PATH so the computer knows where to look:

In [8]:
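# expand "~" explicitly, since it is not reliably expanded inside PATH entries
os.environ["PATH"] = f"{os.path.expanduser('~/.local/osmosis/bin')}:{os.environ['PATH']}"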

Let's check that the software installed correctly. The command should print out the help text for Osmosis:

In [9]:
!osmosis --help
osmosis

Example Usage

Import a planet file into a local PostgreSQL database.

osmosis --read-xml file=~/osm/planbet/planet.osm --write-apidb host="x" database="x" user="x" password="x"

Export a planet file from a local PostgreSQL database.

osmosis --read-apidb host="x" database="x" user="x" password="x" --write-xml file="planet.osm"

Derive a change set between two planet files.

osmosis --read-xml file="planet2.osm" --read-xml file="planet1.osm" --derive-change --write-xml-change file="planetdiff-1-2.osc"

Derive a change set between a planet file and a database.

osmosis --read-mysql host="x" database="x" user="x" password="x" --read-xml file="planet1.osm" --derive-change --write-xml-change file="planetdiff-1-2.osc"

Apply a change set to a planet file.

osmosis --read-xml-change file="planetdiff-1-2.osc" --read-xml file="planet1.osm" --apply-change --write-xml file="planet2.osm"

Sort the contents of a planet file.

osmosis --read-xml file="data.osm" --sort type="TypeThenId" --write-xml file="data-sorted.osm"

The above examples make use of the default pipe connection feature, however a simple read and write planet file command line could be written in two ways. The first example uses default pipe connection, the second explicitly connects the two components using a pipe named "mypipe". The default pipe connection will always work so long as each task is specified in the correct order.

osmosis --read-xml file="planetin.osm" --write-xml file="planetout.osm"

osmosis --read-xml file="planetin.osm" outPipe.0="mypipe" --write-xml file="planetout.osm" inPipe.0="mypipe"

Full usage details are available at: http://wiki.openstreetmap.org/wiki/Osmosis/Detailed_Usage


OSMFilter¶

OSMFilter is a command line tool for filtering OpenStreetMap data based on tags. Information on the software can be found here.

The following line installs OSMFilter (documentation here).

In [10]:
!cd ~/.local/bin/ && wget -O - http://m.m.i24.cc/osmfilter.c |cc -x c - -O3 -o osmfilter
--2024-05-21 20:23:48--  http://m.m.i24.cc/osmfilter.c
Resolving m.m.i24.cc (m.m.i24.cc)... 92.205.48.195, 2a00:1169:103:5a80::
Connecting to m.m.i24.cc (m.m.i24.cc)|92.205.48.195|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 240002 (234K) [text/plain]
Saving to: ‘STDOUT’

-                   100%[===================>] 234.38K   561KB/s    in 0.4s    

2024-05-21 20:23:48 (561 KB/s) - written to stdout [240002/240002]

Since we are installing directly to ~/.local/bin (which is already in our PATH), we don't need to do anything to tell our computer where the software is!
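
Before calling either tool, it can be reassuring to confirm from Python that both executables are actually discoverable. The sketch below is an optional check, not part of the original workflow; find_tool is a hypothetical helper built on the standard library's shutil.which, and the two directories are the install locations used in this notebook.

```python
import os
import shutil


def find_tool(name, extra_dirs=()):
    """Return the full path of an executable, or None if it is not found.

    extra_dirs are searched before the directories already on PATH.
    "~" is expanded explicitly because it is not reliably expanded
    when it appears inside a PATH entry.
    """
    dirs = [os.path.expanduser(d) for d in extra_dirs]
    search_path = os.pathsep.join(dirs + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search_path)


for tool in ("osmosis", "osmfilter"):
    location = find_tool(tool, extra_dirs=["~/.local/osmosis/bin", "~/.local/bin"])
    print(tool, "->", location or "not found")
```

If either tool is reported as not found, the install cell for that tool most likely failed or wrote to a different directory.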

Let's verify that worked by checking the help text:

In [11]:
!osmfilter --help
osmfilter 1.4.6

THIS PROGRAM IS FOR EXPERIMENTAL USE ONLY.
PLEASE EXPECT MALFUNCTION AND DATA LOSS.
SAVE YOUR DATA BEFORE STARTING THIS PROGRAM.

This program filters OpenStreetMap data.

The input file name must be supplied as command line argument. The
file must not be a stream. Redirections from standard input will not
work because the program needs random access to the file. You do not
need to specify the input format, osmfilter will recognize these
formats: .osm (XML), .osc (OSM Change File), .osh (OSM Full History),
.o5m (speed-optimized) and .o5c (speed-optimized Change File).

The output format is .osm by default. If you want a different format,
please specify it using the appropriate command line parameter.

--keep=OBJECT_FILTER
        All object types (nodes, ways and relations) will be kept
        if they meet the filter criteria. Same applies to dependent
        objects, e.g. nodes in ways, ways in relations, relations in
        other relations.
        Please look below for a syntax description of OBJECT_FILTER.

--keep-nodes=OBJECT_FILTER
--keep-ways=OBJECT_FILTER
--keep-relations=OBJECT_FILTER
--keep-nodes-ways=OBJECT_FILTER
--keep-nodes-relations=OBJECT_FILTER
--keep-ways-relations=OBJECT_FILTER
        Same as above, but just for the specified object types.

--drop=OBJECT_FILTER
        All object types (nodes, ways and relations) which meet the
        supplied filter criteria will be dropped, regardless of
        meeting the criteria of a keep filter (see above).
        Please look below for a syntax description of OBJECT_FILTER.

--drop-nodes=OBJECT_FILTER
--drop-ways=OBJECT_FILTER
--drop-relations=OBJECT_FILTER
--drop-nodes-ways=OBJECT_FILTER
--drop-nodes-relations=OBJECT_FILTER
--drop-ways-relations=OBJECT_FILTER
        Same as above, but just for the specified object types.

--keep-tags=TAG_FILTER
        The in TAG_FILTER specified tags will be allowed on output.
        Please look below for a syntax description of TAG_FILTER.

--keep-node-tags=TAG_FILTER
--keep-way-tags=TAG_FILTER
--keep-relation-tags=TAG_FILTER
--keep-node-way-tags=TAG_FILTER
--keep-node-relation-tags=TAG_FILTER
--keep-way-relation-tags=TAG_FILTER
        Same as above, but just for the specified object types.

--drop-tags=TAG_FILTER
        The specified tags will be dropped. This overrules the
        previously described parameter --keep-tags.
        Please look below for a syntax description of TAG_FILTER.

--drop-node-tags=TAG_FILTER
--drop-way-tags=TAG_FILTER
--drop-relation-tags=TAG_FILTER
--drop-node-way-tags=TAG_FILTER
--drop-node-relation-tags=TAG_FILTER
--drop-way-relation-tags=TAG_FILTER
        Same as above, but just for the specified object types.

--modify-tags=TAG_MODIFICATION_LIST
        The specified tags will be modified. This is done after any
        filtering (see --keep, --keep-tags, --drop, --drop-tags).
        Please look below for a description of TAG_MODIFICATION_LIST.

--modify-node-tags=TAG_MODIFICATION_LIST
--modify-way-tags=TAG_MODIFICATION_LIST
--modify-relation-tags=TAG_MODIFICATION_LIST
--modify-node-way-tags=TAG_MODIFICATION_LIST
--modify-node-relation-tags=TAG_MODIFICATION_LIST
--modify-way-relation-tags=TAG_MODIFICATION_LIST
        Same as above, but just for the specified object types.

--drop-author
        For most applications the author tags are not needed. If you
        specify this option, no author information will be written:
        no changeset, user or timestamp.

--drop-version
        If you want to exclude not only the author information but
        also the version number, specify this option.

--drop-nodes
--drop-ways
--drop-relations
        According to the combination of these parameters, no members
        of the referred section will be written.

--emulate-osmosis
--emulate-pbf2osm
        In case of .osm output format, the program will try to use
        the same data syntax as Osmosis, resp. pbf2osm.

--fake-author
        If you have dropped author information (--drop-author) that
        data will be lost, of course. Some programs however require
        author information on input although they do not need that
        data. For this purpose, you can fake the author information.
        o5mfiler will write changeset 1, timestamp 1970.

--fake-version
        Same as --fake-author, but - if .osm xml is used as output
        format - only the version number will be written (version 1).
        This is useful if you want to inspect the data with JOSM.

--fake-lonlat
        Some programs depend on getting longitude/latitude values,
        even when the object in question shall be deleted. With this
        option you can have osmfilter to fake these values:
           ... lat="0" lon="0" ...
        Note that this is for XML files only (.osc and .osh).

-h
        Display a short parameter overview.

--help
        Display this help.

--ignore-dependencies
        Usually, all member nodes of a way which meets the filter
        criteria will be included as well. Same applies to members of
        included relations. If you activate this option, all these
        dependencies between OSM objects will be ignored.

--raw-comparison
        By default, values are compared numerically if they start
        with a digit. Use this option to prevent this behaviour.
        Please note that this option will not apply to filter
        expressions which have already been entered left to it.

--out-key=KEYNAME
        The output will contain no regular OSM data but only
        statistics: a list of all used keys is assembled. Left to
        each key, the number of occurrences is printed.
        If KEYNAME is given, the program will list all values which
        are used in connections with this key.
        You may use wildcard characters for KEYNAME, but only at the
        beginning and/or at the end. For example:  --out-key=addr:*

--out-count=KEYNAME
        Same as --out-key=, but the list is sorted by the number of
        occurrences of the keys resp. values.

--out-osm
        Data will be written in .osm format. This is the default
        output format.

--out-osc
        The OSM Change format will be used for output. Please note
        that OSM objects which are to be deleted are represented by
        their ids only.

--out-osh
        For every OSM object, the appropriate 'visible' tag will be
        added to meet 'full planet history' specification.

--out-o5m
        The .o5m format will be used. This format has the same
        structure as the conventional .osm format, but the data are
        stored as binary numbers and are therefore much more compact
        than in .osm format. No packing is used, so you can pack .o5m
        files using every file packer you want, e.g. lzo, bz2, etc.

--out-o5c
        This is the change file format of .o5m data format. All
        <delete> tags will not be performed as delete actions but
        converted into .o5c data format.

-o=<outfile>
        Standard output will be rerouted to the specified file.
        If no output format has been specified, the program will
        proceed according to the file name extension.

-t=<tempfile>
        osmfilter uses a temporary file to process interrelational
        dependencies. This parameter defines the name prefix. The
        default value is "osmfilter_tempfile".

--parameter-file=FILE
        If you want to supply one ore more command line arguments
        by a parameter file, please use this option and specify the
        file name. Within the parameter file, parameters must be
        separated by empty lines. Line feeds inside a parameter will
        be converted to spaces.
        Lines starting with "// " will be treated as comments.

-v
--verbose
        With activated 'verbose' mode, some statistical data and
        diagnosis data will be displayed.
        If -v resp. --verbose is the first parameter in the line,
        osmfilter will display all input parameters.

OBJECT_FILTER
        Some of the command line arguments need a filter to be
        specified. This filter definition consists of key/val pairs
        and uses the following syntax:
          "KEY1=VAL1 OP KEY2=VAL2 OP KEY3=VAL3 ..."
        OP is the Boolean operator, it must be either "and" or "or".
        As usual, "and" will be processed prior to "or". If you
        want to influence the sequence of processing, you may use
        brackets to do so. Please note that brackets always must be
        padded by spaces. Example: lit=yes and ( note=a or source=b )
        Instead of each "=" you may enter one of these comparison
        operators: != (not equal), <, >, <=, >=
        The program will use ASCII-alphabetic comparison unless you
        compare against a value which is starting with a digit.
        If there are different possible values for the same key, you
        need to write the key only once. For example:
          "amenity=restaurant =pub =bar"
        It is allowed to omit the value. In this case, the program
        will accept every value for the defined key. For example:
          "highway= and lit=yes"
        You may use wildcard characters for key or value, but only at
        the beginning and/or at the end. For example:
          wikipedia:*=  highway=*ary  ref_name=*central*
        Please be careful with wildcards in keys since only the first
        key which meets the pattern will be processed.
        There are three special keys which represent object id, user
        id and user name: @id, @uid and @user. They allow you to
        search for certain objects or for edits of specific users.

TAG_FILTER
        The tag filter determines which tags will be kept and which
        will be not. The example
          --keep-tags="highway=motorway =primary"
        will not accept "highway" tags other than "motorway" or
        "primary". Note that neither the object itself will be
        deleted, nor the remaining tags. If you want to drop every
        tag which is not mentioned in a list, use this example:
          all highway= amenity= name=

TAG_MODIFICATION_LIST
        The tag modification list determines which tags will be
        modified. The example
          --modify-tags="highway=primary to =secondary"
        will change every "primary" highway into "secondary".
        You can also use comparisons or add additional tags:
          --modify-way-tags="maxspeed>200 add highspeed=yes"

Examples

./osmfilter europe.o5m --keep=amenity=bar -o=new.o5m
./osmfilter a.osm --keep-nodes=lit=yes --drop-ways -o=light.osm
./osmfilter a.osm --keep="
    place=city or ( place=town and population>=10000 )" -o=b.osm
./osmfilter region.o5m --keep="bridge=yes and layer>=2" -o=r.o5m

Tuning

To speed-up the process, the program uses some main memory for a
hash table. By default, it uses 1800 MB for storing a flag for every
possible node, 180 for the way flags, and 20 relation flags.
Every byte holds the flags for 8 ID numbers, i.e., in 1800 MB the
program can store 14400 million flags. As there are less than 7400
million IDs for nodes at present (Mar 2020), 925 MB would suffice.
So, for example, you can decrease the hash sizes to e.g. 1000, 120
and 4 MB using this option:

  --hash-memory=1000-120-4

But keep in mind that the OSM database is continuously expanding. For
this reason the program-own default value is higher than shown in the
example, and it may be appropriate to increase it in the future.
If you do not want to bother with the details, you can enter the
amount of memory as a sum, and the program will divide it by itself.
For example:

  --hash-memory=3000

These 3000 MB will be split in three parts: 2700 for nodes, 270 for
ways, and 30 for relations.

Because we are taking hashes, it is not necessary to provide all the
suggested memory; the program will operate with less hash memory too.
But, in this case, the border filter will be less effective, i.e.,
some ways and some relations will be left in the output file although
they should have been excluded.
The maximum value the program accepts for the hash size is 4000 MiB;
If you exceed the maximum amount of memory available on your system,
the program will try to reduce this amount and display a warning
message.

Limitations

When filtering whole OSM objects (--keep...=, --drop...=), the input
file must contain the objects ordered by their type: first, all nodes
nodes, next, all ways, followed by all relations.

Usual .osm, .osc, .o5m and o5c files adhere to this condition. This
means that you do not have to worry about this limitation. osmfilter
will display an error message if this sequence is broken.

The number of key/val pairs in each filter parameter is limited to
1000, the length of each key or val is limited to 100.

There is NO WARRANTY, to the extent permitted by law.
Please send any bug reports to marqqs@gmx.eu


Getting Road Network¶

To get the road network from our export, we can use OSMFilter to keep everything tagged with the highway key. The name is confusing, but the highway key "is the main key used for identifying any kind of road, street or path."

In [12]:
!osmfilter cu-180101.osm --keep="highway=" -o="cu-180101-roads.osm"
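
To make the key-only filter concrete, here is a small sketch of what "has a highway tag" means at the XML level. It uses only the standard library and a tiny hand-written stand-in for an .osm file (not taken from the real data); ids_with_key is a hypothetical helper, and it only finds the directly matching elements.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written stand-in for an .osm file (not the real data).
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="40.10" lon="-88.20"/>
  <node id="2" lat="40.11" lon="-88.21"/>
  <node id="3" lat="40.12" lon="-88.22"><tag k="amenity" v="cafe"/></node>
  <way id="10"><nd ref="1"/><nd ref="2"/><tag k="highway" v="residential"/></way>
  <way id="11"><nd ref="2"/><nd ref="3"/><tag k="building" v="yes"/></way>
</osm>"""


def ids_with_key(osm_xml, key):
    """Return {element type: [ids]} for elements that carry the tag key."""
    root = ET.fromstring(osm_xml)
    matches = {}
    for element in root:
        keys = {tag.get("k") for tag in element.findall("tag")}
        if key in keys:
            matches.setdefault(element.tag, []).append(element.get("id"))
    return matches


print(ids_with_key(SAMPLE_OSM, "highway"))
```

Unlike this sketch, OSMFilter also keeps the objects a match depends on (for example the nodes of a kept way) unless --ignore-dependencies is given, as described in its help text.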

Next we use Osmosis to reject the kinds of ways we don't want in a road network, such as elevators, footways, and private or service roads:

In [13]:
!osmosis --read-xml cu-180101-roads.osm \
    --tf reject-ways highway=abandoned,bus_guideway,construction,corridor,cycleway,elevator,escalator,footway,path,pedestrian,planned,platform,proposed,raceway,service,steps \
    --tf reject-ways access=private \
    --tf reject-ways motor_vehicle=no \
    --tf reject-ways motorcar=no \
    --tf reject-ways service=alley,driveway,emergency_access,parking,parking_aisle,private \
    --tf reject-relations \
    --used-node \
    --write-xml  cu-180101-roads-cleaned.osm
May 21, 2024 8:23:57 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Osmosis Version 0.48.3
May 21, 2024 8:23:57 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Preparing pipeline.
May 21, 2024 8:23:57 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Launching pipeline execution.
May 21, 2024 8:23:57 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline executing, waiting for completion.
May 21, 2024 8:24:00 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Pipeline complete.
May 21, 2024 8:24:00 PM org.openstreetmap.osmosis.core.Osmosis run
INFO: Total execution time: 3790 milliseconds.
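
The rejection logic can be pictured with plain Python dictionaries. This is a rough sketch of the idea, assuming that a reject-ways rule drops a way as soon as one of its tags matches a listed key/value pair; it is not how Osmosis is implemented, the rule set is only a subset of the cell above, and the example ways are made up.

```python
# A subset of the rules from the cell above: key -> values that cause rejection.
REJECT_RULES = {
    "highway": {"cycleway", "elevator", "footway", "path", "service", "steps"},
    "access": {"private"},
    "motor_vehicle": {"no"},
    "motorcar": {"no"},
}


def is_rejected(tags, rules=REJECT_RULES):
    """True if any tag on the way matches a rejected key/value pair."""
    return any(tags.get(key) in values for key, values in rules.items())


# Made-up example ways, each represented by its tag dictionary.
ways = [
    {"highway": "residential", "name": "Montclair Road"},
    {"highway": "footway"},
    {"highway": "secondary", "access": "private"},
]
kept = [way for way in ways if not is_rejected(way)]
print(kept)
```

Only the residential road survives: the footway fails the highway rule and the secondary road fails the access rule.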

Now the data is clean enough that we can load it with OSMnx!

In [14]:
roads = ox.graph_from_xml("cu-180101-roads-cleaned.osm")

We can also convert that data to GeoDataFrames:

In [15]:
nodes, edges = ox.graph_to_gdfs(roads, nodes=True, edges=True)
In [16]:
print(len(nodes))
nodes.head()
4902
Out[16]:
osmid     y          x           highway  ref  geometry
37945125  40.095100  -88.209586  NaN      NaN  POINT (-88.20959 40.09510)
37945135  40.095139  -88.204865  NaN      NaN  POINT (-88.20486 40.09514)
37945137  40.116374  -88.270283  NaN      NaN  POINT (-88.27028 40.11637)
37945139  40.117389  -88.270301  NaN      NaN  POINT (-88.27030 40.11739)
37945140  40.118360  -88.270312  NaN      NaN  POINT (-88.27031 40.11836)
In [17]:
print(len(edges))
edges.head()
13921
Out[17]:
u         v         key  osmid      name               highway      oneway  reversed  length   geometry                                           maxspeed  ref  access  bridge  lanes  junction  tunnel  width  est_width
37945125  38011738  0    143324749  South Race Street  secondary    False   False     110.196  LINESTRING (-88.20959 40.09510, -88.20958 40.0...  NaN       NaN  NaN     NaN     NaN    NaN       NaN     NaN    NaN
37945125  38011734  0    143324749  South Race Street  secondary    False   True      152.683  LINESTRING (-88.20959 40.09510, -88.20961 40.0...  NaN       NaN  NaN     NaN     NaN    NaN       NaN     NaN    NaN
37945125  37945135  0    5324504    Montclair Road     residential  False   False     401.615  LINESTRING (-88.20959 40.09510, -88.20924 40.0...  NaN       NaN  NaN     NaN     NaN    NaN       NaN     NaN    NaN
37945135  38082685  0    5341486    South Vine Street  residential  False   False     52.845   LINESTRING (-88.20486 40.09514, -88.20485 40.0...  NaN       NaN  NaN     NaN     NaN    NaN       NaN     NaN    NaN
37945135  38033922  0    5341486    South Vine Street  residential  False   True      50.817   LINESTRING (-88.20486 40.09514, -88.20487 40.0...  NaN       NaN  NaN     NaN     NaN    NaN       NaN     NaN    NaN

And of course we can plot the data. You'll notice the long highways stretching well past the CU area. This is because the Osmosis extract was made with completeWays=yes, so every way that crosses the bounding box was kept in full.

In [18]:
ox.plot_graph(roads, figsize=(24, 12))
Out[18]:
(<Figure size 2400x1200 with 1 Axes>, <Axes: >)
In [19]:
ox.plot_graph_folium(roads)
Out[19]:
Make this Notebook Trusted to load map: File -> Trust Notebook
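
If the long highways noted above get in the way, one option is to keep only the points that fall inside the bounding box. The sketch below shows the test itself with plain tuples and the standard library; the helper name inside is hypothetical, the second point is made up, and on a GeoDataFrame the same check would normally go through the Shapely bbox polygon instead.

```python
# (left, bottom, right, top) of the Urbana-Champaign extract.
CU_BOUNDS = (-88.33484169754672, 40.0238548208513,
             -88.16238199485753, 40.17102388888516)


def inside(lon, lat, bounds=CU_BOUNDS):
    """True if a longitude/latitude pair lies within the bounding box."""
    left, bottom, right, top = bounds
    return left <= lon <= right and bottom <= lat <= top


points = {
    "node 37945125": (-88.209586, 40.095100),  # first node in the nodes table
    "far away": (-87.65, 41.85),               # made-up point outside the box
}
for label, (lon, lat) in points.items():
    print(label, "inside" if inside(lon, lat) else "outside")
```

Note the argument order: longitude (x) comes first and latitude (y) second, matching the coordinate order in the geometry column.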

Filtering Features¶

The OSM export contains all OSM data from the area, not just the roads. We can use OSMFilter to extract various kinds of features:


Airports¶

Let's use our data to find the airport in Chambana (Willard Airport). The aeroway key can help us find a variety of aviation- and spaceflight-related features.

In [20]:
!osmfilter cu-180101.osm \
    --keep="aeroway=" \
    --keep-ways="aeroway=" \
    --keep-relations="aeroway=" \
    -o="cu-180101-aeroway.osm"

Then load the data with OSMnx as a GeoDataFrame. This time we will use our bounding box to keep only the data within Chambana.

In [21]:
# in newer versions of OSMnx this function is named features_from_xml
aeroway = ox.geometries_from_xml("cu-180101-aeroway.osm", polygon=bbox)
print(len(aeroway))
aeroway.head()
47
Out[21]:
element_type  osmid       ele  name                             aeroway    addr:state  gnis:created  gnis:feature_id  gnis:county_name  gnis:feature_type  geometry                                           source  ...  faa  iata  icao  type  is_in  name_1  name:en  operator  wikipedia  closest_town
node          369052439   232  University of Illinois Heliport  helipad    IL          09/01/1992    427128           Champaign         Airport            POINT (-88.24172 40.09336)                         NaN     ...  NaN  NaN   NaN   NaN   NaN    NaN     NaN      NaN       NaN        NaN
node          369053016   245  Andrew RLA Airport               aerodrome  IL          09/01/1992    427125           Champaign         Airport            POINT (-88.30846 40.16250)                         NaN     ...  NaN  NaN   NaN   NaN   NaN    NaN     NaN      NaN       NaN        NaN
node          2968897591  NaN  Willard Airport Terminal         terminal   NaN         NaN           NaN              NaN               NaN                POINT (-88.26396 40.03633)                         Bing    ...  NaN  NaN   NaN   NaN   NaN    NaN     NaN      NaN       NaN        NaN
node          2968942162  NaN  Flightstar Corporation           terminal   NaN         NaN           NaN              NaN               NaN                POINT (-88.27048 40.03951)                         Bing    ...  NaN  NaN   NaN   NaN   NaN    NaN     NaN      NaN       NaN        NaN
way           27216523    NaN  B                                taxiway    NaN         NaN           NaN              NaN               NaN                LINESTRING (-88.27256 40.04000, -88.27288 40.0...  FAA     ...  NaN  NaN   NaN   NaN   NaN    NaN     NaN      NaN       NaN        NaN

5 rows × 26 columns
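
Before plotting, it can help to see which kinds of aeroway features make up the 47 rows. With the GeoDataFrame this is aeroway["aeroway"].value_counts(); the standard-library sketch below does the same tally directly on OSM XML. The sample document is invented for illustration and is far smaller than the real extract.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up OSM XML document (not taken from cu-180101-aeroway.osm).
SAMPLE = """<osm version="0.6">
  <node id="1" lat="40.09" lon="-88.24"><tag k="aeroway" v="helipad"/></node>
  <node id="2" lat="40.03" lon="-88.26"><tag k="aeroway" v="terminal"/></node>
  <node id="3" lat="40.04" lon="-88.27"/>
  <way id="10"><nd ref="3"/><tag k="aeroway" v="taxiway"/></way>
  <way id="11"><nd ref="3"/><tag k="aeroway" v="taxiway"/></way>
</osm>"""


def count_tag_values(root, key):
    """Count the values of `key` across nodes, ways, and relations."""
    counts = Counter()
    for element in root:
        if element.tag not in ("node", "way", "relation"):
            continue
        for tag in element.findall("tag"):
            if tag.get("k") == key:
                counts[tag.get("v")] += 1
    return counts


counts = count_tag_values(ET.fromstring(SAMPLE), "aeroway")
print(counts.most_common())
```

For the real file, ET.parse("cu-180101-aeroway.osm").getroot() can be passed in place of the parsed sample. Note that this counts everything in the file, including features outside the bounding box.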

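And finally, plot our aeroways! I didn't realize there were this many airports in CU! Keep in mind that the rows also include parts of airports, such as taxiways, terminals, and helipads, not only the airports themselves.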

In [22]:
ax = aeroway.plot(figsize=(24, 12), column="aeroway", legend=True, legend_kwds={'loc': "lower right"})
cx.add_basemap(ax, crs=aeroway.crs.to_string())

The GeoPandas explore method lets us examine the data on an interactive map:

In [23]:
aeroway.explore()
Out[23]:

Offices¶

Let's try to identify all companies. This could be useful for identifying high-traffic areas, since offices draw both employees and customers.

We will do this using the office key.

In [24]:
!osmfilter cu-180101.osm \
    --keep="office=" \
    --keep-ways="office=" \
    --keep-relations="office=" \
    -o="cu-180101-office.osm"

Then load the data with OSMnx as a GeoDataFrame. As before, we pass our bounding box so that only the data within Chambana is kept.

In [25]:
# geometries_from_xml is named features_from_xml in newer versions of OSMnx
office = ox.geometries_from_xml("cu-180101-office.osm", polygon=bbox)
print(len(office))
office.head()
92
Out[25]:
name office geometry contact:fax contact:phone contact:website fixme contact:email fax email ... building operator description roof:shape building:levels alt_name healthcare start_date social_facility contact:tty
element_type osmid
node 2414005491 Teresa D. Stoerger, CPA accountant POINT (-88.24900 40.11547) NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2415529891 American Family Insurance insurance_agent POINT (-88.24759 40.11515) +1 217 3568956 +1 217 3568920 http://insurance-agency.amfam.com/IL/kenney-da... NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2423728504 Ortho Technologies, Ltd. company POINT (-88.23897 40.11407) NaN NaN NaN survey NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2423728506 Ragle Dental Laboratory Inc. company POINT (-88.23897 40.11412) +1 217 3988098 +1 217 3980090 http://www.raglelab.com/ NaN info@raglelab.com NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2423728507 Roland Realty Leasing Office apartments POINT (-88.23893 40.10768) +1 217 3281253 +1 217 3518900 http://www.roland-realty.com/ NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 40 columns
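
The comments in the loading cells note that the loader is named features_from_xml in newer OSMnx releases. One way to make a notebook tolerate both names is to look the function up at runtime. The helper below is a hypothetical sketch; it is demonstrated on a stand-in object so that it runs without OSMnx installed.

```python
from types import SimpleNamespace


def first_available(module, names):
    """Return the first attribute of `module` that exists among `names`."""
    for name in names:
        func = getattr(module, name, None)
        if func is not None:
            return func
    raise AttributeError(f"none of {names} found on {module!r}")


# Stand-in for an older OSMnx that only has the old name (illustration only).
fake_ox = SimpleNamespace(geometries_from_xml=lambda path, polygon=None: "old API")

loader = first_available(fake_ox, ["features_from_xml", "geometries_from_xml"])
print(loader("cu-180101-office.osm"))
```

With the real library, the same lookup would be first_available(ox, ["features_from_xml", "geometries_from_xml"]), and the result would be called as loader("cu-180101-office.osm", polygon=bbox).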

Now we can plot the various kinds of offices in town.

In [26]:
ax = office.plot(figsize=(24, 12), column="office", legend=True, legend_kwds={'loc': "lower right"})
cx.add_basemap(ax, crs=office.crs.to_string())

As with the aeroways, the GeoPandas explore method lets us examine the data on an interactive map:

In [27]:
office.explore()
Out[27]:

Coffee in Chambana¶

As our last example, let's look at amenities in Chambana, and specifically coffee shops. The amenity key captures a wide variety of places; its description reads:

top-level tag describing useful and important facilities for visitors and residents, such as toilets, telephones, banks, pharmacies, prisons and schools.

In [28]:
!osmfilter cu-180101.osm \
    --keep="amenity=" \
    --keep-ways="amenity=" \
    --keep-relations="amenity=" \
    -o="cu-180101-amenity.osm"

Then load the data with OSMnx as a GeoDataFrame. Once again, we pass our bounding box so that only the data within Chambana is kept.

In [29]:
# geometries_from_xml is named features_from_xml in newer versions of OSMnx
amen = ox.geometries_from_xml("cu-180101-amenity.osm", polygon=bbox)
# keep only the rows that actually carry an amenity tag
amen = amen[amen["amenity"].notna()]
print(len(amen))
amen.head()
1212
Out[29]:
highway geometry name amenity cuisine fax phone website addr:postcode opening_hours ... postal_code food bin bench park_ride age_range leisure unsigned_ref ways type
element_type osmid
node 287470185 NaN POINT (-88.20955 40.11257) Siam Terrace restaurant thai NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
287774106 NaN POINT (-88.20713 40.11136) Urbana Post Office post_office NaN +1 217 3376435 +1 217 3679629 https://tools.usps.com/go/POLocatorDetailsActi... 61801-9998 Mo-Fr 08:00-17:00;Sa 09:30-12:00 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
287900592 NaN POINT (-88.21330 40.11075) NaN post_box NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
287965039 NaN POINT (-88.21786 40.10919) NaN post_box NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
287965785 NaN POINT (-88.21892 40.11034) Marathon fuel NaN NaN NaN NaN NaN 24/7 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 146 columns

And finally plot our amenities:

In [30]:
amen.plot(figsize=(24, 12), column="amenity", legend=True)
Out[30]:
<Axes: >

Using this data, we can filter for just the coffee shops and cafes and plot them:

In [31]:
# places like Dunkin don't count as cafes (they are fast_food), but have the "coffee_shop" cuisine tag still
# Example: Dunkin' on Green, https://www.openstreetmap.org/node/1641482845
cafes = amen[(amen["amenity"] == "cafe") | (amen["cuisine"] == "coffee_shop")]
ax = cafes.plot(figsize=(12, 10), column="name", legend=True, legend_kwds={'loc': 'upper left'})
cx.add_basemap(ax, crs=cafes.crs.to_string())
plt.savefig("CoffeeShops.png", bbox_inches='tight')
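
One caveat with the equality test on cuisine: OSM tags can hold several values separated by semicolons (for example coffee_shop;donut), and such rows would not match == "coffee_shop". A minimal sketch of a more tolerant check follows; the function name is hypothetical and the sample values are invented.

```python
def has_cuisine(value, wanted="coffee_shop"):
    """True if `wanted` is one of the semicolon-separated cuisine values."""
    if not isinstance(value, str):
        return False  # missing values show up as NaN (a float) in pandas
    return wanted in (part.strip() for part in value.split(";"))


samples = ["coffee_shop", "coffee_shop;donut", "donut; coffee_shop", "thai", float("nan")]
print([has_cuisine(v) for v in samples])
# prints [True, True, True, False, False]
```

With the GeoDataFrame, amen["cuisine"].apply(has_cuisine) could stand in for the second condition of the filter above.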