car2go Traffic Analysis tool.
2018-01-14 22:18:23 +01:00
data Initial commit. 2018-01-14 03:23:42 +01:00
old Initial commit. 2018-01-14 03:23:42 +01:00
sql Initial commit. 2018-01-14 03:23:42 +01:00
.editorconfig Added EditorConfig. 2018-01-14 22:17:51 +01:00
.gitignore Initial commit. 2018-01-14 03:23:42 +01:00
calc_trips.py Initial commit. 2018-01-14 03:23:42 +01:00
getData.sh Initial commit. 2018-01-14 03:23:42 +01:00
import.py Bugfixes. 2018-01-14 22:17:28 +01:00
init_db.sh Initial commit. 2018-01-14 03:23:42 +01:00
Pipfile Added Pipfile for pipenv. 2018-01-14 22:18:23 +01:00
Pipfile.lock Added Pipfile for pipenv. 2018-01-14 22:18:23 +01:00
README.md Bugfixes. 2018-01-14 22:17:28 +01:00
requirements.txt Initial commit. 2018-01-14 03:23:42 +01:00
testquery.sh Initial commit. 2018-01-14 03:23:42 +01:00

Car Sharing Analyser

This is a collection of scripts and tools to gather, prepare and analyse data about a German car sharing company.

Requirements

  • Python 3.6 or newer (3.5 will probably work, too)
    • geopy
  • SQLite 3.7.11 or newer (command-line client)

Gathering

The script getData.sh downloads a JSON dump from the webpage every 30 seconds and stores it in the data/ folder. A week's worth of data is about 3.3 GiB.

Edit the script first to configure your desired city, then leave it running in a tmux session.
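The polling loop described above can be sketched in Python. This is a minimal sketch, not the repository's getData.sh: the function name collect_dumps, its parameters and the file naming scheme are assumptions made for illustration. The download itself is passed in as a callable, because the actual URL is configured inside getData.sh.

```python
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional


def collect_dumps(
    fetch: Callable[[], bytes],
    out_dir: Path,
    interval_s: float = 30.0,
    max_dumps: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Store the result of fetch() as a new JSON file every interval_s seconds.

    Runs forever when max_dumps is None; otherwise stops after max_dumps files
    and returns the number of files written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    while max_dumps is None or count < max_dumps:
        # Hypothetical naming scheme: UTC timestamp plus a running counter,
        # so that file names sort chronologically and never collide.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        path = out_dir / f"{stamp}_{count:06d}.json"
        path.write_bytes(fetch())
        count += 1
        sleep(interval_s)
    return count
```

For a real run, `fetch` could wrap `urllib.request.urlopen(url).read()` for whatever URL getData.sh is configured with, and `max_dumps` would stay at `None`.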

Preparing

init_db.sh will create a SQLite database file according to sql/dbschema.sql.

Make sure you have Python 3 installed. Install required Python packages by running:

sudo -H pip3 install -r requirements.txt

(Or use virtualenv, pipenv, venv, etc.)

Then run import.py to import all the JSON dumps into the database. If you keep collecting data, you can run import.py again later to import the new dumps. Successfully imported JSON dumps can be deleted, of course.

Importing a week's worth of data takes about 5 minutes and results in a 14.5 MiB SQLite3 database.
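To put these figures in relation to each other, here is a short back-of-the-envelope calculation. It only uses the numbers quoted in this README and assumes that no poll was missed during the week.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60
POLL_INTERVAL_S = 30                   # one dump every 30 seconds
RAW_BYTES_PER_WEEK = 3.3 * 1024 ** 3   # about 3.3 GiB of JSON dumps
DB_BYTES_PER_WEEK = 14.5 * 1024 ** 2   # about 14.5 MiB SQLite database
IMPORT_SECONDS = 5 * 60                # about 5 minutes import time

dumps_per_week = SECONDS_PER_WEEK // POLL_INTERVAL_S          # 20160
avg_dump_kib = RAW_BYTES_PER_WEEK / dumps_per_week / 1024     # average size of one dump
dumps_per_second = dumps_per_week / IMPORT_SECONDS            # import throughput
shrink_factor = RAW_BYTES_PER_WEEK / DB_BYTES_PER_WEEK        # raw JSON vs. database

print(f"dumps per week:     {dumps_per_week}")
print(f"average dump size:  {avg_dump_kib:.0f} KiB")
print(f"import throughput:  {dumps_per_second:.1f} dumps/s")
print(f"raw data / database: {shrink_factor:.0f}x")
```

So each dump is on the order of 170 KiB, the import handles several dozen dumps per second, and the database is more than two orders of magnitude smaller than the raw JSON.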

Analysing

Run calc_trips.py to analyse each car's state changes and calculate possible(!) trips from them. The trip data is written back into the database. (Trips shorter than 70 seconds are filtered out. See notes below.) A week's worth of trip data increases the database size by about 4 MiB.
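The idea behind deriving trips can be sketched as follows. This is not the code of calc_trips.py: the Sighting and Trip records and the function possible_trips are hypothetical, and the sketch assumes that a car only appears in a dump while it is available, so that a gap in its sightings is a possible trip. Only the 70-second threshold is taken from this README.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator, NamedTuple

MIN_TRIP = timedelta(seconds=70)  # shorter gaps are filtered out


class Sighting(NamedTuple):
    """One appearance of a car in a dump (hypothetical record layout)."""
    plate: str
    seen_at: datetime
    fuel: int


class Trip(NamedTuple):
    """A possible trip: the car was not listed between start and end."""
    plate: str
    start: datetime
    end: datetime
    fuel_spent: int

    @property
    def duration_minutes(self) -> float:
        return (self.end - self.start).total_seconds() / 60


def possible_trips(
    sightings: Iterable[Sighting], min_duration: timedelta = MIN_TRIP
) -> Iterator[Trip]:
    """Yield one possible trip per gap of at least min_duration.

    Expects the sightings of a single car, in any order.
    """
    ordered = sorted(sightings, key=lambda s: s.seen_at)
    for before, after in zip(ordered, ordered[1:]):
        if after.seen_at - before.seen_at >= min_duration:
            yield Trip(before.plate, before.seen_at, after.seen_at,
                       before.fuel - after.fuel)
```

With dumps taken every 30 seconds, a 70-second threshold means that a car missing from a single dump (a 60-second gap) is not counted as a trip. That reading of the threshold is an interpretation, not something this README states.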

You can also use this script as a starting point for your own analysis scripts. (Pull requests are welcome.)

For working with the database itself, I recommend DB Browser for SQLite.

Example Queries

To see the history of a specific car, you can run e.g. (XYZ = number plate):

SELECT * FROM car_history WHERE plate='XYZ';

To see cars in a specific area, you can run:

SELECT *
FROM car_history
WHERE latitude>=52.515652 AND longitude>=13.372373
AND   latitude<=52.516813 AND longitude<=13.378115;
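Both queries can also be run from Python with the standard sqlite3 module. The following is a sketch under assumptions: the helper names are made up, and demo_connection builds a tiny in-memory stand-in that contains only the three car_history columns used in the queries above. The real table is defined in sql/dbschema.sql and has more columns.

```python
import sqlite3


def car_history(conn: sqlite3.Connection, plate: str) -> list:
    """Return all car_history rows for one number plate."""
    return conn.execute(
        "SELECT * FROM car_history WHERE plate = ?", (plate,)
    ).fetchall()


def cars_in_area(conn: sqlite3.Connection, south: float, west: float,
                 north: float, east: float) -> list:
    """Return all car_history rows inside a latitude/longitude bounding box."""
    return conn.execute(
        "SELECT * FROM car_history "
        "WHERE latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ?",
        (south, north, west, east),
    ).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE car_history (plate TEXT, latitude REAL, longitude REAL)"
    )
    conn.executemany(
        "INSERT INTO car_history VALUES (?, ?, ?)",
        [
            ("XYZ", 52.516000, 13.375000),  # inside the example area
            ("ABC", 52.520000, 13.400000),  # outside the example area
        ],
    )
    return conn


conn = demo_connection()
print(car_history(conn, "XYZ"))
print(cars_in_area(conn, 52.515652, 13.372373, 52.516813, 13.378115))
```

The `?` placeholders let sqlite3 bind the values, which avoids quoting mistakes when a plate or coordinate comes from user input. `BETWEEN` is inclusive on both ends, so it matches the `>=` / `<=` comparisons of the SQL example.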

Also check out the views in the database.

Notes

  • distance_km is the beeline (straight-line) distance from the starting point to the end point. If somebody runs errands and parks in the exact same spot again, the distance is (almost) 0.
  • The following cases all show a distance of (almost) zero and no petrol spent, so you can't distinguish between them:
    • a car that made a short trip and was parked in the same spot,
    • a car that has been taken out of service for a while,
    • a car that has been reserved (possible for up to 30 minutes) but whose reservation expired.
  • "Trips" over several days are most probably cars that have been taken out of service.
  • A car that has been "reserved" (for up to 30 minutes) disappears from the list of cars. The reservation time is included in the trip's duration_minutes.
  • The calculated prices don't account for additional fees (airport, drop-off) or for the time a car was "reserved". They are also calculated for cars taken offline, i.e. where the customer paid nothing.
  • Small negative fuel_spent values probably occur because the car was parked on a slope.
  • The id of a car is its number plate. In theory, a plate could be moved to another vehicle (with a different VIN), but this is very unlikely.
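The first two notes can be illustrated with a small sketch. It uses the haversine formula from the standard library as a stand-in for the geopy distance functions, so its results may differ slightly from the values stored in distance_km. The function names and the 50 m threshold in looks_ambiguous are assumptions chosen for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def beeline_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (beeline) distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def looks_ambiguous(distance_km: float, fuel_spent: float,
                    max_km: float = 0.05) -> bool:
    """True if a "trip" barely moved the car and used no fuel.

    Such a record may be a round trip, an expired reservation or a car
    taken out of service; the data alone cannot tell which one.
    """
    return distance_km <= max_km and fuel_spent <= 0


# A round trip that ends in the same spot has a beeline distance of zero,
# no matter how far the car was actually driven.
round_trip = beeline_km(52.516000, 13.375000, 52.516000, 13.375000)
print(round_trip, looks_ambiguous(round_trip, fuel_spent=0))
```

Small negative fuel_spent values are deliberately treated as "no fuel spent" here, in line with the note about cars parked on a slope.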