Wednesday, March 18, 2009

open usage of sequences of data points

a lot of important data out there is simple "time series" data, meaning a sequence of one or more data points that change over time.

a service to share and use such data is Timetric. currently the number of users and of useful time series is small, but in general such a nice platform can host common time series from many different domains.

problem

who takes care that the shared data is correct and therefore valuable to use? all services based on public contribution and usage face the same issue. do you trust the data you see? do you trust Wikipedia? in general you should not. you have to check at least 2 different sources before you use the provided data.

in addition, once you have double-checked your data, you have to make sure that its quality stays guaranteed over time. that is much more difficult.
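as a rough illustration of the double check, here is a minimal sketch that compares the same time series taken from two independent sources. the function name `cross_check`, the dict layout (timestamp string to value), the sample numbers and the 1% tolerance are all assumptions made up for this example, not anything Timetric provides.

```python
from math import isclose


def cross_check(source_a, source_b, rel_tol=0.01):
    """Return the points on which two series disagree.

    Both arguments are dicts mapping a timestamp to a numeric value
    (a hypothetical layout chosen for this sketch). A point is reported
    when only one source has it or when the values differ by more than
    the relative tolerance.
    """
    mismatches = {}
    for ts in sorted(set(source_a) | set(source_b)):
        a = source_a.get(ts)
        b = source_b.get(ts)
        if a is None or b is None or not isclose(a, b, rel_tol=rel_tol):
            mismatches[ts] = (a, b)
    return mismatches


# made-up sample data from two imaginary sources
series_a = {"2009-01": 100.0, "2009-02": 102.0, "2009-03": 105.0}
series_b = {"2009-01": 100.2, "2009-02": 110.0}

print(cross_check(series_a, series_b))
```

an empty result only says that the two sources agree with each other. if both copy from the same upstream source, they will agree even when the data is wrong.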

solution?

if the data is mission critical, you should not use it before validating it. in the case of non-static data, you have to validate the data each time it changes. this means that you need either more than one data service, not all based on the same underlying source, or a commercial service that takes care of the provided data, or you request the service from the organization that owns or collects the data. each of those solutions requires special handling for the particular domain.
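the "validate each time it changes" part could look roughly like the sketch below: keep a fingerprint of the last version that passed validation and run the check again only when the fingerprint differs. the class `ValidatedSeries`, the helper `fingerprint` and the simple non-negative check are hypothetical names and rules invented for this example. the real validation depends on the particular domain.

```python
import hashlib
import json


def fingerprint(series):
    """Stable hash of a series (dict of timestamp -> value)."""
    payload = json.dumps(series, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ValidatedSeries:
    """Remembers which version of a series was last validated."""

    def __init__(self, validate):
        self._validate = validate          # domain specific check, supplied by the caller
        self._trusted_fingerprint = None   # nothing is trusted at the start

    def accept(self, series):
        """Return True if the series may be used.

        The validation function runs only when the data differs from
        the last version that passed.
        """
        current = fingerprint(series)
        if current == self._trusted_fingerprint:
            return True
        if self._validate(series):
            self._trusted_fingerprint = current
            return True
        return False


def non_negative(series):
    # placeholder rule for the sketch: no value may be negative
    return all(value >= 0 for value in series.values())


checker = ValidatedSeries(non_negative)
print(checker.accept({"2009-01": 100.0}))   # first sight: validated
print(checker.accept({"2009-01": 100.0}))   # unchanged: no new validation
print(checker.accept({"2009-01": -5.0}))    # changed: validated again
```

the fingerprint only detects that something changed. whether the new data is correct still has to be decided by the validation step, for example by a comparison against a second independent source.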

summary

the availability of public data services is promising, but I currently do not see an available model to trust in. therefore usage is pretty limited, only for some kind of "outline view".
