They say you can never be too rich or too thin (don’t know about that!), but you can definitely have too much data. If your firm is accumulating more data than you can process, then you must reconsider your data management strategy.
Data storage costs money, and hoarding data that will never be accessed is a misuse of resources. Reducing the data inflow without losing the valuable insights and information that could be gleaned from the data is yet another challenge facing data scientists.
Rick Braddy explains in this excerpt from his article in NetworkWorld:
“Figuring out what data you want to keep and how the remaining data you’re collecting should be processed is just one piece of the puzzle. You also need to work out where the processing and data reduction is going to take place…In many cases it will prove more cost effective to reduce data at the edge, as close as possible to where it’s generated. This is a good way of reducing storage requirements and network traffic by only sending forward what you need for analysis. The trick is accurately identifying what you need, but as machine learning advances we’ll be able to progress beyond educated guessing.
“As the mountain of data grows ever larger, failing to act is asking for trouble. You need a smart cloud data management strategy to drive innovation and it will rely on the data collection and processing foundation you build…Use your current business performance and future goals to identify the data you need, find ways to process that data at the edge where practical, and weigh up the value of analysis versus storage. The ideal data strategy is going to take time to figure out, and will differ from organization to organization, but what’s certain is that data hoarding is no longer a viable approach.”
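To make the idea of reducing data at the edge a little more concrete, here is a minimal sketch in Python. It is an illustration only, not something taken from Braddy's article: the function names `summarize_window` and `reduce_at_edge`, the window size, and the anomaly threshold are all hypothetical choices. The sketch assumes a device that collects numeric readings and forwards one compact summary per window instead of every raw value.

```python
from statistics import mean


def summarize_window(readings, threshold):
    """Reduce a window of raw readings to a compact summary.

    Hypothetical policy: keep the count, min, max, and mean, plus any
    individual readings above `threshold`. Everything else is dropped.
    """
    if not readings:
        return None
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        "anomalies": [r for r in readings if r > threshold],
    }


def reduce_at_edge(stream, window_size, threshold):
    """Yield one summary for every `window_size` raw readings."""
    window = []
    for reading in stream:
        window.append(reading)
        if len(window) == window_size:
            yield summarize_window(window, threshold)
            window = []
    if window:  # flush the final, partially filled window
        yield summarize_window(window, threshold)


if __name__ == "__main__":
    raw = [1, 2, 3, 10, 4, 5]  # made-up sample readings
    for summary in reduce_at_edge(raw, window_size=3, threshold=8):
        print(summary)
```

In this sketch six raw readings become two summaries, and only the unusual value is forwarded in full. Deciding what counts as "what you need" (here, the summary fields and the threshold) is exactly the hard part the excerpt points to, and it will vary from one organization to the next.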