Introduction
According to the Journal of Business Research (Janssen, 2017), “Big data refers to datasets that are
both big and high in variety and velocity, which makes them difficult to handle using traditional tools
and techniques”. My own, simpler definition is that the term describes vast amounts of both
structured and unstructured data drawn from internal and external data points. This data can be used
in business analytics to provide companies with advanced knowledge and wisdom.
The company SAS states that the importance of big data lies in how the data is utilised rather than
in how much of it a company has. It can be used to determine the root causes of failures, issues and
defects in near-real time, to generate coupons at the point of sale based on customers’ buying
habits, and to recalculate entire risk portfolios in minutes (SAS Institute, 2017).
The Journal of the Association for Information Systems (Abbasi, et al., 2016) states that “its four “V”
characteristics have had a profound impact on the people, processes, and technologies related to the
information value chain.” In other words, information has to be derived from the raw big data, and
only once the challenges posed by the four V’s have been handled can companies use it to support
data analytics.
Standard information systems that do not follow a big data model generally record only the specific
data they need, and that data almost always serves a single purpose, such as collecting receipts,
sales data, customer information, work rota data or primary research data. The data is then used for
that specific function and to investigate a pre-existing hypothesis. Big data inverts this rule: the
data collected is used to form hypotheses, which in turn prompts more information to be gathered
and knowledge to be gained from the big data. In direct contrast, the hypothesis in older
information systems is already set, and the data collected is mostly used to answer that one specific
hypothesis (Khan, 2015). The advantage of the standard method is that businesses aren’t
overwhelmed by the four V’s and, given correct data, the answer is easily accessible. Collecting no
excess data also means lower costs for storing and obtaining the data. Big data, by contrast, will
only strengthen a company financially if it is properly utilised: having all the data in the world is
meaningless if no information or knowledge is gained from it. Used effectively and efficiently,
however, big data can expand the financial potential of any company in the world.
SAS Institute, already cited above, explains how big data works, who uses it, its importance and its
history; here I refer to how it works (SAS Institute, 2017). SAS Institute says that big data
generally comes from three kinds of source: streaming data (data that reaches company IT systems
from a web of connected devices), social media data (unstructured or semi-structured data from
social media sources, gathered around specific parameters, e.g. a certain location or the name of the
company) and publicly available sources (data available through open data sources such as data.gov
and the European Union Open Data Portal). All of these lead the company to consider how to store
and manage the newly acquired data; this storage is the volume aspect of big data. A company must
also decide how much of the data should be analysed. If it has sufficiently high-performance
technology, such as grid computing or in-memory analytics, then all of the data collected could be
analysed. It is said that only 0.5% of all aggregated data is ever analysed (What Exactly Is Big
Data?, 2017). Finally, the company must know how to use the newly found information, so a strategy
should be put in place to optimise the information and convert it into business knowledge.
A BBC article (BBC News, 2011) includes the statement “It’s about bringing analytics to specific
business problems. We had very good success with this in the retail space, and also helping banks
fighting credit card fraud.” It adds: “There is an explosion in the understanding in the value
of analytics. One problem is actually acquiring enough talent to deal with the demand.” This shows
that even when new insights are available, the data must be analysed by a trained team in order to
maximise the potential for cost reductions; otherwise the data is essentially meaningless.
As an extra resource, the company insideBIGDATA gives its definition of veracity (Normandeau,
2013): “Big Data Veracity refers to the biases, noise and abnormality in data. Is the data that is
being stored, and mined meaningful to the problem being analysed. Inderpal feel veracity in data
analysis is the biggest challenge when compared to things like volume and velocity. In scoping out
your big data strategy you need to have your team and partners work to help keep your data clean
and processes to keep ‘dirty data’ from accumulating in your systems”. The statement that
“Inderpal feel veracity in data analysis is the biggest challenge” is evidence that, if veracity is
monitored poorly, the value of the analysis will be greatly diminished.
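One way to picture a process that keeps ‘dirty data’ from accumulating is a validation step that runs before records are stored. The Python sketch below is only an illustration under assumed conditions: the field names (`transaction_id`, `customer_id`, `amount`), the sample records and the three rules (non-empty identifiers, a non-negative numeric amount, no repeated transaction) are hypothetical and do not come from the cited source.

```python
def is_clean(record):
    """Return True if a record passes the (hypothetical) validity rules."""
    transaction_id = record.get("transaction_id")
    customer_id = record.get("customer_id")
    amount = record.get("amount")
    # Identifiers must be non-empty strings.
    for value in (transaction_id, customer_id):
        if not isinstance(value, str) or not value.strip():
            return False
    # The amount must be a real number (not a bool) and not negative.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return False
    return amount >= 0


def split_clean_dirty(records):
    """Separate usable records from dirty ones, treating repeats as dirty."""
    clean, dirty, seen_ids = [], [], set()
    for record in records:
        if is_clean(record) and record["transaction_id"] not in seen_ids:
            seen_ids.add(record["transaction_id"])
            clean.append(record)
        else:
            dirty.append(record)
    return clean, dirty


# Hypothetical sample data, invented purely for illustration.
RECORDS = [
    {"transaction_id": "t1", "customer_id": "c1", "amount": 19.99},
    {"transaction_id": "t2", "customer_id": "", "amount": 5.00},     # missing customer
    {"transaction_id": "t3", "customer_id": "c2", "amount": -4.00},  # negative amount
    {"transaction_id": "t1", "customer_id": "c1", "amount": 19.99},  # duplicate
    {"transaction_id": "t4", "customer_id": "c3", "amount": 42},
]

clean, dirty = split_clean_dirty(RECORDS)
print(f"{len(clean)} clean records, {len(dirty)} dirty records")
```

In practice the rules would have to be agreed with the analysts and partners responsible for the data, since what counts as ‘dirty’ depends on the problem being analysed.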
Correlation does not equal causation. For example, if sales rise when the temperature is over 20
degrees, it does not automatically mean that warm weather makes people more likely to buy the
company’s products; another cause could underlie the data, such as a new advertising campaign that
boosts sales over the same period. If data is analysed incorrectly and money is then spent on the
strength of unreliable conclusions, the company is effectively going in blind, under the false
impression that it has gained knowledge.
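The temperature example can be made concrete with a small Python sketch. All figures below are invented for illustration: eight hypothetical days on which an advertising campaign happens to run only when the temperature is over 20 degrees. The sketch computes the Pearson correlation between temperature and sales for all days together, and then separately for days with and without the campaign.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: (temperature in degrees, campaign running?, units sold).
DAYS = [
    (14, False, 99),
    (16, False, 101),
    (18, False, 101),
    (20, False, 99),
    (21, True, 149),
    (23, True, 151),
    (25, True, 151),
    (27, True, 149),
]


def temperature_sales_correlation(rows):
    """Correlate temperature with sales for the given subset of days."""
    temperatures = [temperature for temperature, _, _ in rows]
    sales = [units for _, _, units in rows]
    return pearson(temperatures, sales)


overall = temperature_sales_correlation(DAYS)
without_campaign = temperature_sales_correlation([d for d in DAYS if not d[1]])
with_campaign = temperature_sales_correlation([d for d in DAYS if d[1]])

print(f"all days:         {overall:.2f}")
print(f"without campaign: {without_campaign:.2f}")
print(f"with campaign:    {with_campaign:.2f}")
```

In this made-up dataset, temperature and sales are strongly correlated across all days, yet within each group the correlation disappears, which suggests that the campaign rather than the weather accounts for the higher sales. Real data would rarely separate this cleanly, so the sketch illustrates the reasoning and is not a recipe for analysis.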
This new data requires companies either to outsource their data analysis or to expand their own
staff with in-house data analysts, whom they must support with high-quality, high-performance
technology. Outsourcing is likely to be more expensive in the long run, but for companies that
cannot afford the start-up costs of building their own analyst team it may be the only option. SAS
Institute (SAS Institute, 2017) lists considerations that help big data analysis run smoothly:
‘Cheap, abundant storage, faster processors, affordable open source, distributed big data platforms
such as Hadoop (an open-source big data framework), parallel processing, clustering, MPP (massively
parallel processing), virtualisation, large grid environments, high connectivity and throughputs,
and finally cloud computing and other flexible resource allocation arrangements’. These technologies
will cost a large amount of money for the majority of companies that are not already heavily
invested in IT.
The costs can be recouped through smart implementation of the ideas and processes derived from the
analysed big data, but it is vital that the analysts understand what the company needs and that the
veracity of the data is thoroughly checked before it is used in future developments.
Conclusion
The advantages available through big data are potentially huge, as long as all four V’s are
monitored and managed regularly, quickly and attentively. The information provided must be fully
utilised and thoroughly examined by hard-working and knowledgeable data analysts to turn it
into knowledge for the business in question. Only then will the benefits be fully realised and the
company be likely to see financial gains from these practices.
The costs associated with implementing a big data system are large, especially when considering all
of the technological tools and systems necessary to fully optimise the analysis of the ever-growing
volume of data, but knowledge on such a large scale is worth the money required. The acquired
knowledge is likely to improve nearly every aspect of the business.
More important even than the financial aspect of big data is that applying the acquired knowledge
leads to smarter and better-informed decision making, which is the bedrock of any safe, stable and
trustworthy company.