5

I am having the following situation:

On an embedded linux board (a Raspberry Pi - but might be replaced in future by a SBC with similar h/w-specs) a Python application is reading sensor data every five seconds.

I want to extend the application by some long-term monitoring which can be accessed through a web-interface showing the long-term activity nicely with graphs, etc.

After doing some research I ended up with a couple of possible databases:

  • MySQL
  • SQLite
  • RRD-Tool

I am however a bit afraid of using one of the above databases on a Raspberry Pi. The reasons are the following:

  1. Data will be stored for one or two years. With having new data every 5 seconds this will make a total of more than 6 million entries! - For a single year only.

  2. I am not sure if the Raspberry Pi (or similar devices) will handle MySQL or SQLite databases of that size.

One solution might be lowering the resolution like you'll find it build-in when using the RRD-Tool as the database. But due to the fact that the RRD-Tool works with a concept of hearbeat and step, data will be averaged every time a new entry is being inserted into the database. And I haven't figured out how to turn that off.

So which database is best for:

  • Millions of entries
  • Or: lowering resolution after given time but without building averages, etc.
  • Working fluently on Raspberry (similar devices) when accessing over Web-Interface an 2-3 Users in parallel

Raspberry is used for:

  • Reading the sensors data
  • storing it into a database
  • providing web-interface to visualize data
Simon Kemper
  • 193
  • 2
  • 7
  • Why a database? Surely a file with readings appended is all you need. Download the file and import it into a database if you actually need to process the data. – joan Jun 26 '16 at 11:59
  • 2
    "millions of entries" is peanuts for databases (or even data files as @joan suggests) nowadays. Let's say you need to store a 64-byte datavalue (maybe readings from a few sensors). If you store that a million times, you've used up 64MB of space. Even the RPi doesn't think twice about that file size. And sorting through 1M records is a breeze for any proper db. Once you start getting in the billions of records in several tables with complex joins - then your database choice might play a role. Until then - use whatever you like. MySQL is straightforward and well supported. – Phil B. Jun 26 '16 at 12:08
  • @PhilB. thanks! You can turn that into an answer because that's the answer I was looking for! – Simon Kemper Jun 26 '16 at 12:10
  • I agree with @PhilB.'s answer. While I don't think the RPi is as frail as some users like to argue, I would argue that your choice of backing up the data is bigger concern that the storage style. Having it occationally offload the data, or store it on two different storage mediums would be ideal. Multiple years of continuous unerring runtime is a lot to request from any consumer grade hardware. – Jacobm001 Jun 29 '16 at 15:11
  • 1
    Additionally, I would ensure that the database is on a network drive or attached HDD, not on the SD card. – agurney Jun 28 '16 at 19:36
  • @agurney I don't think using an external drive is going to provide any advantage here. The amount of data is very small -- as the OP and Phil point out, this is ~6.3 million writes dispersed over a year for a total volume of probably < a few hundred megabytes. So the (slightly) faster speed of a USB device is not going to matter, nor is that volume going to wear out a GB+ sized SD card [in a reasonable period of time](http://raspberrypi.stackexchange.com/a/28844/5538). – goldilocks Jun 29 '16 at 15:42

1 Answers1

5

"Millions of entries" is peanuts for databases (or even data files as @joan suggests) nowadays. Let's say you need to store a 64-byte datavalue (maybe readings from a few sensors). If you store that a million times, you've used up 64MB of space. Even the RPi doesn't think twice about that file size. And sorting through 1M records is a breeze for any proper db. Once you start getting in the billions of records in several tables with complex joins - then your database choice might play a role. Until then - use whatever you like. MySQL is straightforward and well supported

Phil B.
  • 5,013
  • 14
  • 30