Big Data Temperature

Have you ever considered that the availability of big data might also determine the way you process it, both immediately and afterwards?

Well, I would like to introduce the term “big data temperature”, i.e. a way to determine the zone in which big data is best processed. Let me introduce the three big data temperature zones.

Cold Big Data – Data that you typically receive on an hourly, daily or weekly basis. It does not really matter how fast the data is delivered. This might be because your data delivery site or your data broker is not able to transfer the data more frequently. It could also be caused by the technology receiving the data chunks (e.g. Hadoop prefers to process large blocks of data at once).
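
To make the cold zone a little more concrete, here is a minimal Python sketch (all names and the block size are hypothetical, not any particular product's API) that buffers incoming records into one large block before handing it over, mimicking the way Hadoop prefers a few big files over many small ones:

    import os

    BLOCK_SIZE = 128 * 1024 * 1024  # hypothetical target block size, roughly one HDFS block

    class ColdZoneWriter:
        """Buffers incoming records and flushes them to disk as one large block."""

        def __init__(self, out_dir):
            os.makedirs(out_dir, exist_ok=True)
            self.out_dir = out_dir
            self.buffer = []
            self.buffered_bytes = 0
            self.block_no = 0

        def ingest(self, record: bytes):
            self.buffer.append(record)
            self.buffered_bytes += len(record)
            # Only write once a large batch has accumulated, never per record.
            if self.buffered_bytes >= BLOCK_SIZE:
                self.flush()

        def flush(self):
            if not self.buffer:
                return
            path = os.path.join(self.out_dir, "block-%06d.dat" % self.block_no)
            with open(path, "wb") as f:
                f.write(b"\n".join(self.buffer))
            self.block_no += 1
            self.buffer, self.buffered_bytes = [], 0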

Warm Big Data – Whenever you need to process big data that has to be available on demand, but not instantaneously, this is the right landing zone for your data. Warm big data can be stored in in-memory data grids or fast NoSQL databases. Whenever your warm big data needs to be retained for a longer time (getting colder), it is recommended to let the data flow into the cold big data zone.
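
The “getting colder” movement can be expressed as a simple aging policy. Here is a minimal sketch, assuming a plain dictionary as a stand-in for an in-memory data grid and a callback that ships aged records to the cold zone (the retention window is a made-up value):

    import time

    RETENTION_SECONDS = 24 * 3600  # hypothetical warm retention window: one day

    class WarmStore:
        """On-demand key/value store that ages its entries into the cold zone."""

        def __init__(self, cold_sink):
            self.data = {}            # key -> (timestamp, value); stand-in for a data grid
            self.cold_sink = cold_sink

        def put(self, key, value):
            self.data[key] = (time.time(), value)

        def get(self, key):
            entry = self.data.get(key)
            return entry[1] if entry else None

        def age_out(self):
            """Let entries older than the retention window flow into the cold zone."""
            now = time.time()
            aged = [k for k, (ts, _) in self.data.items() if now - ts > RETENTION_SECONDS]
            for key in aged:
                _, value = self.data.pop(key)
                self.cold_sink(key, value)  # e.g. hand over to a cold-zone batch writer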

Hot Big Data – You have to make instant decisions at the moment you receive the data. This is critical when you process data-in-motion sources, e.g. sensor or location data. Technologies such as Complex Event Processing or Low Latency Messaging are tailored to that purpose. In case you need to retain the data for subsequent access, or you want to apply patterns identified in historic data, it is recommended to move the data to the warm or cold zone.
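
Strip away the engine, and the hot-zone idea boils down to evaluating rules at the very moment an event arrives. Here is a minimal sketch with a hypothetical temperature-sensor rule as a stand-in for a real CEP pattern, reusing the WarmStore above for retention:

    def on_sensor_event(event, warm_store):
        """Decide instantly, then optionally retain the event for later access."""
        # Instant decision: a simple threshold rule as a stand-in for a CEP pattern.
        if event["sensor"] == "temperature" and event["value"] > 90.0:
            print("ALERT: overheating at %s" % event["location"])
        # Retention: let the event flow into the warm zone for on-demand access.
        warm_store.put(event["id"], event)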

The idea behind big data temperature zones should not be seen in isolation, but as a flowing-data concept that is always supported by best-of-breed technology – assuming that, for the near future, we will not have one technology that covers all zones.
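
Wiring the hypothetical sketches above together shows that flow: every event enters hot, is retained warm, and eventually settles cold, with a different best-of-breed technology standing behind each step:

    # All components are the illustrative sketches from above, not real products.
    writer = ColdZoneWriter("/tmp/cold")
    warm = WarmStore(cold_sink=lambda key, value: writer.ingest(repr(value).encode()))

    event = {"id": "e1", "sensor": "temperature", "value": 95.2, "location": "rack-7"}
    on_sensor_event(event, warm)  # hot: instant decision, then retained in the warm zone
    warm.age_out()                # warm: entries past retention flow into the cold zone
    writer.flush()                # cold: whatever has arrived is persisted as a large block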