Data Analytics: Practical Guide to Leveraging the Power of Algorithms, Data Science, Data Mining, Statistics, Big Data, and Predictive Analysis to Improve Business, Work, and Life, by Arthur Zhang
Data Analytics Practical Guide to Leveraging the Power of Algorithms, Data Science, Data Mining, Statistics, Big Data, and Predictive Analysis to Improve Business, Work, and Life By: Arthur Zhang Legal notice This book is copyright (c) 2017 by Arthur Zhang. All rights are reserved. This book may not be duplicated or copied, either in whole or in part, via any means including any electronic form of duplication such as recording or transcription. The contents of this book may not be transmitted, stored in any retrieval system, or copied in any other manner regardless of whether use is public or private without express prior permission of the publisher. This book provides information only. The author does not offer any specific advice, including medical advice, nor does the author suggest the reader or any other person engage in any particular course of conduct in any specific situation. This book is not intended to be used as a substitute for any professional advice, medical or of any other variety. The reader accepts sole responsibility for how he or she uses the information contained in this book. Under no circumstances will the publisher or the author be held liable for damages of any kind arising either directly or indirectly from any information contained in this book. Table of Contents Introduction Chapter 1: Why Data is Important to Your Business Data Sources How Data Can Improve Your Business Chapter 2: Big Data Big Data – A New Advantage Big Data Creates Value Big Data is a Big Deal Chapter 3: Development of Big Data Chapter 4: Considering the Pros and Cons of Big Data The Pros New methods of generating profit Improving Public Health Improving Our Daily Environment Improving Decisions: Speed and Accuracy Personalized Products and Services The Cons Privacy Big Brother Stifling Entrepreneurship Data Safekeeping Erroneous Data Sets and Flawed Analyses Conclusions Chapter 5: Big Data for Small Businesses? Why not? 
The Cost Effectiveness of Data Analytics Big Data can be for Small Businesses Too Where can Big Data improve the Cost Effectiveness of Small Businesses? What to consider when preparing for a New Big Data Solution Chapter 6: Important training for the management of big data Present level of skill in managing data Where big data training is necessary The Finance department The Human Resources department The supply and logistics department The Operations department The Marketing department The Data Integrity, Integration and Data Warehouse department The Legal and Compliance department Chapter 7: Steps Taken in Data Analysis Defining Data Analysis Actions Taken in the Data Analysis Process Phase 1: Setting of Goals Phase 2: Clearly Setting Priorities for Measurement Determine What You’re Going to be Measuring Choose a Measurement Method Phase 3: Data Gathering Phase 4: Data Scrubbing Phase 5: Analysis of Data Phase 6: Result Interpretation Interpret the Data Precisely Chapter 8: Descriptive Analytics Descriptive Analytics- What is It? How Can Descriptive Analysis Be Used? Measures in Descriptive Statistics Inferential Statistics Chapter 9: Predictive Analytics Defining Predictive Analytics Different Kinds of Predictive Analytics Predictive Models Descriptive Modeling Decision Modeling Chapter 10: Predictive Analysis Methods Machine Learning Techniques Regression Techniques Linear Regression Logistic Regression The Probit Model Neural Networks Radial Basis Function Networks Support Vector Machines Naive Bayes Instance-Based Learning Geospatial Predictive Modeling Hitachi’s Predictive Analytic Model Predictive Analytics in the Insurance Industry Chapter 11: R - The Future In Data Analysis Software Is R A Good Choice? Types of Data Analysis Available with R Is There Other Programming Language Available? 
Chapter 12: Predictive Analytics & Who Uses It Analytical Customer Relationship Management (CRM) The Use Of Predictive Analytics In Healthcare The Use Of Predictive Analytics In The Financial Sector Predictive Analytics & Business Keeping Customers Happy Marketing Strategies *Fraud Detection Processes Insurance Industry Shipping Business Controlling Risk Factors Staff Risk Underwriting and Accepting Liability Freedom Specialty Insurance: An Observation of Predictive Analytics Used in Underwriting Positive Results from the Model The Effects of Predictive Analytics on Real Estate The National Association of Realtors (NAR) and Its Use of Predictive Analytics The Revolution of Predictive Analysis across a Variety of Industries Chapter 13: Descriptive and predictive analysis Chapter 14: Crucial factors for data analysis Support by top management Resources and flexible technical structure Change management and effective involvement Strong IT and BI governance Alignment of BI with business strategy Chapter 15: Expectations of business intelligence Advances in technologies Hyper targeting The possibility of big data getting out of hand Making forecasts without enough information Sources of information for data management Chapter 16: What is Data Science? Skills Required for Data Science Mathematics Technology and Hacking Business Acumen What does it take to be a data scientist? 
Data Science, Analytics, and Machine Learning Data Munging Chapter 17: Deeper Insights about a Data Scientist’s Skills Demystifying Data Science Data Scientists in the Future Chapter 18: Big Data and the Future Online Activities and Big Data The Value of Big Data Security Risks Today Big Data and Impacts on Everyday Life Chapter 19: Finance and Big Data How a Data Scientist Works Understanding More Than Numbers Applying Sentiment Analysis Risk Evaluation and the Data Scientist Reduced Online Lending Risk The Finance Industry and Real-Time Analytics How Big Data is Beneficial to the Customer Customer Segmentation is Good for Business Chapter 20: Marketers profit by using data science Reducing costs to increasing revenue Chapter 21: Use of big data benefits in marketing Google Trends does all the hard work The profile of a perfect customer Ascertaining correct big data content Lead scoring in predictive analysis Geolocations are no longer an issue Evaluating the worth of lifetime value Big data advantages and disadvantages Making comparisons with competitors Patience is important when using big data Chapter 22: The Way That Data Science Improves Travel Data Science in the Travel Sector Travel Offers Can be personalized because of Big Data Safety Enhancements Thanks to Big Data How Up-Selling and Cross-Selling Use Big Data Chapter 23: How Big Data and Agriculture Feed People How to Improve the Value of Every Acre One of the Best Uses of Big Data How Trustworthy is Big Data? Can the Colombian Rice Fields be saved by Big Data? Up-Scaling Chapter 24: Big Data and Law Enforcement Data Analytics, Software Companies, and Police Departments: A solution? 
Analytics Decrypting Criminal Activities Enabling Rapid Police Response to Terrorist Attacks Chapter 25: The Use of Big Data in the Public Sector United States Government Applications of Big Data Data Security Issues The Data Problems of the Public Sector Chapter 26: Big Data and Gaming Big Data and Improving Gaming Experience Big Data in the Gambling Industry Gaming the System The Expansion of Gaming Chapter 27: Prescriptive Analytics Prescriptive Analytics- What is It? What Are its Benefits? What is its Future? Google’s “Self-Driving Car” Prescriptive Analytics in the Oil and Gas Industry Prescriptive Analytics and the Travel Industry Prescriptive Analytics in the Healthcare Industry Data Analysis and Big Data Glossary A B C D E F G H I K L M N O P Q R S T U V Conclusion Introduction How do you define the success of a company? It could be by the number of employees or level of employee satisfaction. Perhaps the size of the customer base is a measure of success or the annual sales numbers. How does management play a role in the operational success of the business? How critical is it to have a data scientist to help determine what’s important? Is fiscal responsibility a factor of success? To determine what makes a business successful, it is important to have the necessary data about these various factors. If you want to find out how employees contribute to your success, you will need a headcount of all the staff members to determine the value they contribute to business growth. On the other hand, you will need a bank of information about customers and their transactions to understand how they contribute to your success. Data is important because you need information about certain aspects of your business to determine the state of that aspect and how it affects overall business operations. For example, if you don’t keep track of how many units you sell per month, there is no way to determine how well your business is doing. 
There are many other kinds of data that are important in determining business success that will be discussed throughout this book. Collecting the data isn’t enough, though. The data needs to be analyzed and applied to be useful. If losing a customer isn’t important to you, or you feel it isn’t critical to your business, then there’s no need to analyze data. However, a continual lack of appreciation for customer numbers can impact the ability of your business to grow because the number of competitors who do focus on customer satisfaction is growing. This is where predictive analytics becomes important and how you employ this data will distinguish your business from competitors. Predictive analytics can create strategic opportunities for you in the business market, giving you an edge over the competition. The first chapter will discuss how data is important in business and how it can increase efficiency in business operations. The subsequent chapters will outline the steps and methods involved in analyzing business data. You will gain a perspective on techniques for predictive analytics and how it can be applied to various fields from medicine to marketing and operations to finance. You will also be presented with ways that big data analysis can be applied to gaming and retail industries as well as the public sector. Big data analysis can benefit private businesses and public institutions such as hospitals and law enforcement, as well as increase revenue for companies to create a healthier climate within cities. One section will focus on descriptive analysis as the most basic form of data analysis and how it is necessary to all other forms of analysis – like predictive analysis – because without examining available data you can’t make predictions. Descriptive analysis will provide the basis for predictive and inferential analysis. 
The fields of data analysis and predictive analytics are vast and complex, having so many sub-branches that add to the complexity of understanding business success. One branch, prescriptive analysis, will be covered briefly within the pages of this book. The bare necessities of the fields of analytics will be covered as you read on. This method is being employed by a variety of industries to find trends and determine what will happen in the future and how to prevent or encourage certain events or activities. The information contained in this book will help you to manage data and apply predictive analytics to your business to maximize your success. Chapter 1: Why Data is Important to Your Business Have you ever been fascinated with ancient languages, perhaps those now known as “dead” languages? The complexity of these languages can be mesmerizing, and the best part about them is the extent to which ancient peoples went to preserve them. They used very monotonous methods to preserve texts that are anywhere from a few hundred years old to some that are several thousands of years old. Scribes would copy these texts several times to ensure they were preserved, a process that could take years. Using ink made from burned wood, water, and oil they copied the text to papyrus paper. Some used tools to chisel the text into pottery or stone. While these processes were tedious and probably mind-numbing, the people of the time determined this information was so valuable and worth preserving that certain members of a society dedicated their entire lives to copying the information. What is the commonality between dead languages and business analytics? The answer is data. Data is everywhere and flows through every channel of our lives. Think about social media platforms and how they help shape the marketing landscape for companies. Social media can provide companies with analytics that help them measure how successful – or unsuccessful – company content may be. 
Many platforms provide this data for free, yet there are other platforms that charge high prices to provide a company with high-quality data about what does or doesn’t work on their website. When it comes to business, product and market data can provide an edge over the competition. That makes this data worth its weight in gold. Important data can include weather, trends, customer tendencies, historical events, outliers, products, and anything else relevant to an aspect of business. What is different about today is how data can be stored. It no longer has to be hand-copied to papyrus or chiseled into stone. It is an automatic process that requires very little human involvement and can be done on a massive scale. Sensors are connected to today’s modern scribes. This is the Internet of Things. Most of today’s devices are connected, constantly collecting, recording, and transmitting usage and performance data. Sensors collect environmental data. Cities are connected to record data relevant to traffic and infrastructure information to ensure they are operating efficiently. Delivery vehicles are connected to monitor their location and functionality, and if mechanical problems arise they can usually be addressed early. Buildings and homes are connected to monitor energy usage and costs. Manufacturing facilities are connected in ways that allow automatic communication of critical data sets. This is the present – and the future – state of “things.” The fact that data is important isn’t a new concept, but the way in which we collect the data is. We no longer need scribes; they have been replaced with microprocessors. The ways to collect data, as well as the types of data to be collected, is an ever-changing field itself. To be ahead of the game when it comes to business, you’ve got to be up-to-date about how you collect and use data. 
The product or service provided can establish a company in the market, but data will play the critical role in sustaining the success of the business. The technology-driven world in which we live can make or break a business. There are large companies that have disappeared in a short amount of time because they failed to monitor their customer base or progress. In contrast, there are smaller startup businesses that have flourished because of the importance they’ve placed on customer expectations and their numbers. Data Sources Sources of data for a business can range from customer feedback to sales figures to product or service demands. Here are a few sources of data a business may utilize: Social media: LinkedIn, Twitter, and Facebook can provide insight into the kind of customer traffic your web page receives. These platforms also provide cost-effective ways to conduct surveys about customer satisfaction with products or services and customer preferences. Online Engagement Reporting: Using tools such as Google Analytics or Crazy Egg can provide you with data about how customers interact with your website. Transactional Data: This kind of data will include information collected from sales reports, ledgers, and web payment transactions. With a customer relationship management system, you will also be able to collect data about how customers spend their money on your products. How Data Can Improve Your Business By now you’ve realized that proper and efficient use of data can improve your business in many ways. Here are just a few examples of data playing an important role in business success. Improving Marketing Strategies: Based on the types of data collected, it can be easier to find attractive and innovative marketing strategies. If a company knows how customers are reacting to current marketing techniques, it will allow them to make changes that will fall in line with trends and expectations of their customers. 
Identifying Pain Points: If a business is driven by predetermined processes or patterns, data can help identify points of deviation. Small deviations from the norm can be the reason behind increased customer complaints, decreased sales, or a decrease in productivity. By collecting and analyzing data regularly, you will be able to catch a problem early enough to prevent irreversible damage.

Detecting Fraud: In the absence of proper data management, fraud can run rampant and seriously affect business success. With sales numbers in hand, it is much easier to detect when and where fraud may be occurring. For instance, if you have a purchase invoice for 100 units but your sales reports show that only 90 units have been sold, and the remaining ten are not in stock, you know that ten units are unaccounted for and where to start looking. Many companies are silent victims of fraud because they fail to use their data to realize that fraud is even occurring.

Identifying Data Breaches: The ever-increasing availability of data streams creates another opening for fraudulent practices. The impacts of a data breach can be subtle yet far-reaching, affecting accounting, payroll, retail, and other company systems. Attackers are becoming more sophisticated in their attacks on data systems. Data analytics can help a company spot a possible breach and prevent further compromises that might completely cripple the business. Data analytics tools can help a company develop and implement tests that detect early signs of fraudulent activity. Standard fraud tests do not suit every circumstance, so tailored tests may be necessary to detect fraud in specific systems. In the past, it was common for companies to wait to investigate possible fraudulent activity and implement breach safeguards until the financial impacts became too large to ignore.
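The inventory example under Detecting Fraud can be sketched in a few lines of code. This is only an illustration under stated assumptions: the record layout (`StockRecord`), the function names, and the second sample product are hypothetical and are not taken from any particular accounting system. The idea is simply to compare units purchased against units sold plus units still on hand and to flag any shortfall.

```python
from dataclasses import dataclass


@dataclass
class StockRecord:
    """Purchase, sales, and stock counts for one product (hypothetical schema)."""
    product: str
    units_purchased: int
    units_sold: int
    units_on_hand: int = 0


def unaccounted_units(record: StockRecord) -> int:
    """Units that were bought but are neither sold nor still in stock."""
    return record.units_purchased - record.units_sold - record.units_on_hand


def flag_discrepancies(records, tolerance=0):
    """Return (product, missing) pairs whose shortfall exceeds the tolerance."""
    flagged = []
    for record in records:
        missing = unaccounted_units(record)
        if missing > tolerance:
            flagged.append((record.product, missing))
    return flagged


if __name__ == "__main__":
    records = [
        # The case from the text: 100 units invoiced, 90 sold, none left in stock.
        StockRecord("widget", units_purchased=100, units_sold=90),
        # A made-up product that reconciles: 40 sold and 10 still on the shelf.
        StockRecord("gadget", units_purchased=50, units_sold=40, units_on_hand=10),
    ]
    for product, missing in flag_discrepancies(records):
        print(f"{product}: {missing} units unaccounted for")
```

A check like this only shows that something is missing. Deciding whether the cause is fraud, breakage, or a bookkeeping error still requires investigation.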
With the amount of data available today, waiting until losses become too large to ignore is no longer a wise, or necessary, way to deal with data breaches. Data now moves around the world so quickly that a breach can spread from one point to the next, crippling a company from the inside out on a worldwide scale. Data analytics testing can help prevent data destruction by revealing characteristics or parameters that may indicate fraud has entered the system. Regular testing gives companies the insight they need to protect the data they are entrusted to keep secure.

Improving Customer Experience: Data can also be gathered from customers in the form of feedback about certain aspects of the business. This information allows a company to alter business practices, services, or products to better satisfy the customer. By maintaining a bank of customer feedback and continually asking for more, you are better able to customize your product or service as customers’ needs change. Some companies send customized emails to their customers, creating the feeling that they genuinely care about them. Effective data management is most likely what makes this possible.

Making Decisions: Many important business decisions require data about market trends, customer bases, and the prices competitors charge for the same or similar products or services. If data does not inform the decision-making process, it could cost the company immensely. For example, launching a new product without considering the price of a competitor’s product might leave your product overpriced, making it harder to increase sales. Data should inform not only decisions about products or services but also other areas of business management. Certain datasets will show how many employees a department needs to function efficiently. This will allow you to determine where you are understaffed or overstaffed.
Hiring Process: Using data to select the right personnel seems to be neglected by many corporations. For effective business operation, it is crucial to put the right candidate in the right position. Using data to hire the most qualified person for a position will ensure the business will remain highly successful. Large companies with even larger budgets use big data to seek out and choose skilled people for their open positions. Smaller companies would benefit from using big data from the beginning to staff appropriately to further the successes of a startup or small business. This method of using gathered data during hiring has been proven to be a lucrative practice for various sizes of organizations. Data scientists can extract and interpret specific data needed from the human resources department for hiring the right person. Job Previews: By providing an accurate description of an open position, a job seeker will be better prepared about what to expect should they be hired for the position. Pre-planning the hiring process utilizing data about the open position is critical in appealing to the right candidate. Trial and error are no doubt a part of learning a new job, but it slows down the learning process. It will take the new employee longer to catch up to acceptable business standards which also slows their ability to become a valuable company resource. By incorporating job preview data into the hiring process, the learning curve is reduced, and the employee will become more efficient faster. Innovative Methods for Gathering Data for Hiring: Using new methods of data collection in the hiring process can prove to be beneficial in hiring the right professional. Social sites that collect data, such as Google+, Twitter, Facebook, and LinkedIn can give you additional resources for recruiting potential candidates. A company can search these sites for relevant data from posts made by the users to connect to qualified applicants. 
Keywords are the driving force for online searches. Using the most visible keywords in a job description will increase the number of views your job posting will receive. Traditionally, software and computers have been used to determine if an employee would be better suited for another position within the company or to terminate employment. However, using this type of resource can also help to find the right candidate for a job outside of the company. Basic standards such as IQ or skills tests can be limiting, but focusing on personality traits may open the field of potential candidates. By identifying personality characteristics, it will help to filter out candidates based on traits that will not be beneficial to the company. If a person is argumentative or prefers to be isolated, they certainly wouldn’t thrive in a team-oriented environment. By eliminating mismatches between candidates and job expectations, it will save the company time, training materials, and other resources. By utilizing this type of data collection, it would not only find candidates with the right skills but also with the right personalities to align with current company culture. Being sociable and engaging will foster the new employee as they learn their new role. It’s important that new candidates fit well with seasoned employees to reinforce working relationships. The health of the working environment greatly influences how productive the company is overall. Using Social Media to Recruit: Social media platforms are chock full of data sources for finding highly qualified individuals to fill positions within a company. On Twitter, recruiters can follow people who tweet about a certain industry. A company can then find and recruit ideal candidates based on their interest and knowledge of an industry or a specific position within that industry. If someone is constantly tweeting about new ideas or innovations about an industry aspect, they could make a valuable contribution to your company. 
Facebook is also valuable for this kind of public recruitment. It’s a cost-effective way to collect social networking data for companies who are seeking to expand their employee base or fill a position. By “liking” and following certain groups or individuals a company can establish an online presence. Then when the company posts a job ad, it is likely to be seen by many people. It is also possible to promote ads for a small fee on Facebook. This means your ad will be placed more often in more places, increasing your reach among potential candidates. It’s a geometrical equation – furthering your reach with highly effective job data posts increases the number of skilled job seekers who will see your ad, resulting in a higher engagement of people who will be a great fit for your company. Niche Social Groups: By joining certain groups on social media platforms recruiters will have access to a pool of candidates who most likely already possess certain specific skills. For instance, if you need to hire a human resources manager, joining a group comprised of human resource professionals can potentially connect you with your next hire. Within this group, you can post engaging and descriptive job openings your company has. Even if your potential candidate isn’t in the group, other members will most likely have referrals. Engaging in these kinds of groups is a very cost-effective method to advertise open positions. Gamification: This is an underused data tool but can be effective if the hiring process requires multiple steps or processes. By rewarding candidates with virtual badges or other goods, it will motivate candidates to put forth effort during the selection process. This will allow their relevant skills in performing the job to be highlighted and is a fun experience when applying for a job which is typically a rather boring process. 
These are only a few of the ways in which data can help companies and human resources departments streamline the hiring process and save resources. As you can see, data is very important for effective business functioning, and it has a multitude of uses in the hiring process alone. This is why proper data utilization is critical to decision making in every other aspect of your business.

Chapter 2: Big Data

Across the globe, data and technology are interwoven into society and the things we do. Like other factors of production, such as human capital and hard assets, data is now something that many parts of modern economic activity could not happen without. Big data is, in short, the large amounts of data that are gathered in order to be analyzed. From this data, we can find patterns that will better inform future decisions. This data, and what can be learned from it, will shape how companies compete and grow in the near future. Productivity will be greatly improved as well. Significant value will be created in the world economy as the quality of services and products increases and waste is reduced.

Big data has been around for some time, but until recently it only really excited people who were already interested in data. As times have changed, more and more people are getting excited by the amount of data that we are generating, mining, and storing. Data is now one of the most important economic factors for many different people. Today we can look back at trends in IT innovation and investment, see the impact they have had on productivity and competitiveness, and see how big data can make similarly large changes in our modern lives. Like previous IT-enabled innovations, big data has the same requirements for moving productivity further. For example, innovations in technology need to be closely followed by complementary management innovations.
Suppliers of big data technology and advanced analytic capabilities are likely to have just as much of an impact on productivity as suppliers of other technologies have had. Businesses around the world will need to start taking big data seriously because of its potential to create real value. Some retail companies are already putting big data to work because of its potential to increase operating margins.

Big Data – A New Advantage

Big data is becoming an incredibly important way for companies to outperform each other. Even new entrants into the market will be able to leverage data-driven strategies to compete, innovate, and capture real value. Companies of all kinds, new and established, will be able to compete on the same level. Examples of this competition are already everywhere.

In the healthcare industry, data pioneers are analyzing the outcomes of some widely prescribed pharmaceuticals. From this analysis, they have learned of risks and benefits that were not seen in the limited trials that companies had run. Other industries are using sensors embedded in their products, from children’s toys to large-scale industrial goods, to gather data on how the products are used in real life. With this data, companies can improve their products based on how people actually use them, making them better for future users.

Big data will help create new growth opportunities and new companies that specialize in aggregating and analyzing data. A good proportion of these companies will sit in the middle of large information flows, receiving data from many sources in order to analyze it.
Managers and company leaders who are thinking ahead need to start building their companies’ capability to deal with big data, and they will need to be aggressive about it. What matters is not only the amount of data but also its high frequency and real-time nature. One example is “nowcasting,” the practice of estimating metrics such as consumer confidence right away. Such information used to be available only after a delay. Nowcasting is being used more and more, adding a lot of potential to the ways that companies make predictions. High-frequency data also allows users to test theories and analyze the results in ways that were not possible before.

Studies of major industries have found several ways that big data can be used:

Big data can unlock serious value for industries by making information transparent. A great deal of data is still not recorded or stored, and much information cannot easily be found. Some people spend a quarter of their time searching for very specific data and then storing it, sometimes digitally, which is highly inefficient.

As more companies store transaction data digitally, they can collect accurate and detailed information on everything from inventory levels to the number of sick days employees take. Some companies are already using this data to run experiments and make better-informed management decisions.

Big data allows companies to divide their customers into smaller groups and tailor the services and products they offer to each group.

More sophisticated analytics are also allowing for better decision making.
They reduce risk and bring to light information and insights that might otherwise never have been seen. Big data can also be used to create a brand new generation of services and products that would not otherwise have been possible. Some manufacturers are already using the data collected from their sensors to design more efficient and useful after-sales services.

Big Data Creates Value

Using the US healthcare system as an example, we can look at ways that big data can create real value. If the healthcare system used big data to improve the efficiency and quality of its services, it could create $300 billion of value every year. About 70% of that value would come from a cut in expenditures, and the expenditures cut would amount to only about 8% of current spending. If you look at developed European economies instead, you can see a different way that big data creates value. Government administrations could use big data to improve operational efficiency, which would result in about €100 billion worth of value every year. This is just one area. If governments also used advanced analytics to boost tax revenue collection, they would create even more value just from cutting down on errors and fraud in the system.

Even though we’ve been looking at companies and governments so far, they aren’t the only ones that will benefit from big data. Consumers will benefit as well. Services that use location data could generate a consumer surplus of up to $600 billion. This can be seen especially in systems and apps that use real-time traffic information for smart routing. These systems are some of the most used on the market, and they rely on location data. More and more people are using smartphones, and those who have them are taking advantage of the free map apps that are available.
With an increase in demand, it’s likely that the number of apps that use smart routing will increase. By the year 2020, more than 70% of mobile phones are expected to have GPS capabilities built in; in 2010, this number was only 20%. Because of the increase in GPS-capable devices, we can expect that smart routing will have the potential to create savings of around $500 billion in fuel and in time that people would otherwise spend on the road. That is equivalent to around 20 billion driving hours, or about 15 hours a year for each driver, along with about $150 billion in fuel.

The examples above involve specific pools of data, but big data has even greater potential when pools of data are combined. The US healthcare system is a good way to look at the potential future of big data. The healthcare system has four distinct data pools: clinical, medical, pharmaceutical products; research and development; activity and cost; and patient data. Each data pool is captured and managed by a different portion of the healthcare system. If big data were used to its full potential, the annual productivity of the healthcare system could be improved by around 0.7%. But it would take the combination of data from all these different sources to create that improved efficiency. The unfortunate part is that some of the data would need to come from places that do not share their data at scale right now. Data such as clinical claims and patient records would need to be integrated into the system somehow. The patient, in turn, would have better access to more of their healthcare information and would be able to compare physicians, treatments, and drugs. This would allow patients to choose their medications and treatments based on the statistics available to them. However, in order to get these kinds of benefits, patients would have to accept giving up some of their privacy.
Data security and privacy are two of the biggest roadblocks in the way of this. We must find a way around them if we ever want to see the true benefits of using big data. The most pressing challenge right now is the shortage of people who are skilled in analyzing big data properly. By 2018, the US will be facing a shortage of 140,000 to 190,000 people with training in deep analysis. It will also be facing a shortage of roughly 1.5 million people who have the quantitative skills and managerial experience needed to interpret the analyses correctly and to base their decisions on the data.

There are many technological issues that will need to be resolved as well before big data can be used effectively by more companies. Incompatible formats and standards, along with legacy systems, are stopping people from integrating data and from using sophisticated analytical tools to examine the data sets. Ultimately, technology will be needed at every layer, from computing and storage through to visualization and analytical software, and all of it will have to be available as a stack in order to be effective.

In order to take true advantage of big data, there has to be better access to data, and that means all of it. Many organizations will need access to data stores maintained by third parties, such as customers or business partners, so that they can combine that data with their own. This means that companies that really need data will have to come up with interesting proposals for suppliers, consumers, and possibly even competitors in order to get their hands on it.
As long as governments and companies understand big data, its potential to deliver better productivity will give companies an incentive to take the actions needed to get over the barriers standing in the way. In getting around these barriers, companies will find new ways to be competitive, both within their industries and against individual rivals. There will be greater productivity and efficiency all around, which will result in better services, even when money is tight.

Big Data Brings Value to Businesses Worldwide

Big data has been bringing value to businesses internationally for a while, and the amount of value that it will continue to bring is almost immeasurable. There are several ways that big data has impacted the world so far. It has created a brand new career field in data science. Data interpretation has changed drastically because of big data. The healthcare industry has improved quickly and considerably since it added predictive analytics to part of its business. Laser scanning technology has changed, and is still changing, the way that law enforcement officers reconstruct crime scenes. Predictive analytics are changing how caregivers and patients interact. Data models are now being built to look at business problems and help find solutions. Predictive analytics has also had an impact on the way that the real estate industry conducts business.

Big Data is a Big Deal

Besides bringing so much value to so many different companies and industries, data is also opening up a whole new set of management principles that companies can use. Early on in professional management, corporate leaders discovered that one of the key factors for competitive success was a minimum scale of efficiency. Comparatively, one of the modern factors for competitive success is going to be capturing higher quality data and using that data more efficiently at scale.
For company executives who doubt how much big data is going to help them, the following five questions will help them figure out how big data can benefit them and their organizations.

What can we expect to happen in a world that is “transparent,” meaning that data is readily available? Over time, information is becoming more accessible in all sectors. The fact that data is coming out of the shadows means that organizations that have relied heavily on data as a competitive asset are potentially going to feel threatened. This can be seen especially in the real-estate industry, which has typically provided a gateway to transaction data and a knowledge of bids and buyer behaviors that has not been available elsewhere. Gaining access to all of that requires quite a bit of money and even more effort. In recent years, online specialists have been bypassing the agents to create a parallel resource for real-estate data. This data is obtained directly from buyers and sellers, and it is available to those same groups. Pricing and cost data has also seen a spike in availability in several industries. Some companies are even using readily available satellite imagery, together with processing and analysis, to look at the physical facilities of their competitors. That information can provide insights into the expansion plans or physical constraints that their competitors are facing.

But with all that data comes a challenge: the data is being kept within departments. Engineering, R&D, service operations, and manufacturing each have their own information, and it is stored in different ways depending on the department. Because all this information is kept in these little pockets, the data cannot be used and analyzed in a timely manner. This can cause all sorts of problems for companies.
For example, financial institutions don’t share data across departments like money management, financial markets, or lending. This segmentation means that the customers have been compartmentalized. The institutions don’t see the customer across all of these different areas, but only as separate images. Some companies in the manufacturing business are trying to stop this separation of data. They’re integrating data from their different systems and asking their smaller units to collaborate in order to help their data flow. They’re even looking for data and information outside of their groups to see if there’s anything else out there that might help them develop better products and services. The automotive industry has suppliers all around the world making components that are then used in its cars. Integrating data across all of these would allow the companies and their supply chain partners to work together at the design stage instead of later on.

Can testing decisions change the way that companies compete? Gaining the ability to really test decisions would cut down on costs and improve a company’s competitiveness. These automotive companies would be able to test and experiment with the different components. By going through this process, they’ll be able to gain results and data that will guide their decisions about operational changes and investments. Experimentation will allow companies and their managers to see the difference between correlation and causation while also boosting financial performance and producing more effective products. The experiments that companies use to collect data can take several forms. Some online companies are always testing and running experiments. In particular cases, they set aside a portion of their web page views to test the factors that drive sales and higher usage.
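The kind of web page experiment described above is often analyzed as a simple comparison of two conversion rates. The following is a minimal sketch in Python, not a method taken from this book: the function name, the visitor counts, and the choice of a pooled two-proportion z-test are all assumptions made purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of two page variants.

    conv_a, conv_b: number of visitors who bought (hypothetical counts).
    n_a, n_b: number of visitors shown each variant.
    Returns (z, p_value) for a two-sided pooled z-test.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 10,000 page views per variant.
z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up counts the test gives a z value of roughly 2.8 and a p-value below 0.01, which would suggest that the difference between the two variants is unlikely to be chance alone. Randomly assigning visitors to the two variants is what lets an experiment like this separate causation from mere correlation.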
Companies with physical products also use tests to help make decisions; however, big data can take these experiments even further. McDonald’s put devices in some of its stores that track customer interaction, traffic, and ordering patterns. The data gained through these devices can help the company make decisions about its menus, the design of its restaurants, and many other things. Companies that can’t use controlled experiments may turn to natural experiments to figure out which variables are in play. One government organization collected data on different groups of employees who were working in various places but doing similar jobs. This data was made available, and the workers who were lagging were pushed to improve their performance.

What effect will big data have on business if it is used for real-time customization? Companies that deal with the public have been dividing and targeting specific customers for quite a while now. Big data is taking that further than ever by making real-time personalization possible. Retailers may become able to track individual customers and their behaviors by monitoring their internet click streams. Knowing this, they will be able to make small changes on websites that help move the customer toward a purchase. They will be able to see when a customer is making a decision on something they might buy, and from there they will be able to “nudge” the customer towards buying. They could offer bundled products, benefits, and reward programs. This is real-time targeting. Real-time targeting also brings in data from loyalty groups, which can help increase higher-end purchases made by the most valuable customers. The retail industry is likely to be the most driven by data. Because retailers are keeping track of internet purchases, conversations taking place on social media, and location data pulled from smartphones, they have tons of data at their fingertips.
Besides the data, they now have better analytical tools that can divide customers into smaller segments for even better targeting.

Will big data just help management or will it eventually replace it? Big data opens up new ways for algorithms and analysis, mediated by machines, to be used. Manufacturers are using algorithms to analyze the data collected from sensors on the production line. This data and analysis help the manufacturers regulate their processes, reduce waste, increase output, and even cut down on potentially expensive and dangerous human intervention. There are “digital oilfields,” where sensors monitor the wellheads, pipelines, and mechanical systems all the time. The data is fed into computers, which turn it into results for the operation centers, where the oil flows are adjusted to boost production and reduce downtime for the whole process. One of the largest oil companies has managed to increase oil production by five percent, while also reducing staff and operating costs by ten and twenty-five percent.

Products ranging from photocopiers to jet engines now track data that helps people understand their usage. Manufacturers are able to analyze this data and fix problems, whether that means simply fixing glitches in software or sending out a repair representative. The data is even being used to predict when products will fail and to schedule repairs before that happens. It’s obvious that big data can create huge improvements in performance and make risk management easier. The data could even be used to find errors that would otherwise go unseen. Because of the increasing demand for analytics software, communication devices, and sensors, prices for these things are falling fast. More and more companies will be able to find the time and money to get involved in collecting data.

Will big data be used for the creation of brand new business models?
Big data has already been responsible for the creation of new industries surrounding the analysis and use of the information it holds. But big data is also producing new categories of companies whose business models are driven entirely by data. Many of these companies are intermediaries in a value chain that generate valuable “exhaust” data from transactions. A major transport company was keeping data about its own business, but it was also collecting vast amounts of data about what products were being shipped where. It took the opportunity and began selling that data to supplement economic and business forecasts. Another global company was learning a lot by looking at its own data. From doing the analysis for itself, it eventually decided to branch out and create a business that analyzes data for other organizations. The business aggregates supply chain and shop floor data for manufacturers. It also sells the relevant software tools that a company needs to improve its own performance. This side business is outperforming the company’s manufacturing business, and that is because of the value of big data.

Big data is creating a whole new support model for the markets that already exist today. Companies have all sorts of new data needs, and they need qualified people to support that data. As a result, if you own a business, you may need an outside firm to analyze and interpret the data you’re producing. These specialized firms can take large amounts of data in various forms and break it down for you. They exist because larger companies in many different industries need this kind of support. The employees they hire are trained to locate and capture data in systems and processes. By doing the data aggregation, they allow larger companies to focus on their own work.
They assimilate, analyze, and interpret trends in the data, and then they report anything notable back to the company. A company that doesn’t want to hire an outside firm has the option of creating a data support department of its own. This would be more cost-effective than hiring an entire outside firm, but it does require very specific and specialized skills within the company. The department would focus on taking the data flow and analyzing it, interpreting it, and finding new ways to use it. These new applications and the new data department would also monitor existing data for fraud, triggers, or issues.

Big data has created a whole new field of study in colleges and other institutions of higher learning. People are training in the latest methods of gathering, analyzing, and interpreting big data. This path will lead people to critical positions in the newly trending data support companies. Big data has created all sorts of changes, and it will continue to create even more. In education, big data will influence and change the way that teachers are hired. The data can be used to examine recruiting processes, and predictive analytics can identify the traits that the most effective teachers need in order to maximize the learning experience.

Chapter 3: Development of Big Data

While most of our data collection and analysis has happened only in the last couple of years, the term “big data” has been in our vocabulary since 2005. Analysis of data has been around for as long as we have been able to count. Accounting in ancient Mesopotamia tracked the increases and decreases of herds and crops, and even then we were trying to find patterns in that data. In the 17th century, John Graunt published a book, “Natural and Political Observations Made upon the Bills of Mortality,” that was the first large-scale example of data analysis.
It provided insight into the causes of death at the time, and the book was meant to help stop the bubonic plague. Graunt’s book and the way he approached the data were a revolution. Statistics as we know it now was invented at that time, even though we couldn’t use it fully before the invention of computers. Modern data analysis arrived in the 20th century, when the information age really began. There were many examples of data analysis and collection even in its early days. In 1887, Herman Hollerith invented a machine that could analyze data; it was used to organize census data. Roosevelt’s administration used big data for the first time to keep track of the social security contributions of millions of Americans.

The first real data processing machine came during World War 2, when British intelligence wanted to decipher Nazi codes. The machine, Colossus, processed 5,000 characters per second to find the patterns in coded messages. The task of deciphering went from weeks to just hours. This was a huge victory for technology and a massive improvement for statistical analysis. In 1965, the electronic storage of information started, as another idea of the American government. The system was put in place to store tax return claims and fingerprints. However, the project went unfinished because of the worries of the American people, who thought of it as something similar to “Big Brother.” Even so, the electronic storage of information had already begun, and it would be impossible to stop the flow of information.

The invention of the Internet was what sparked the true revolution in data storage and analysis. Tim Berners-Lee couldn’t have known what he had started in the world. However, it was in the 90s that his system was turned into the monster that it is today. In 1995, the first supercomputer was made. The machine was capable of doing in a single second what a human with a calculator could do in 30,000 years.
This was the next great stride in data analysis. In 2005, Roger Mougalas used the term “big data” as a way of saying that traditional tools could not deal with the amount of data that was being collected. In that year, Hadoop was invented to index the internet. Today, this tool is used by companies to go through their own data. Eric Schmidt said at a conference in 2010 that the amount of information created between the dawn of time and 2003 (roughly 10 exabytes) was equal to the amount of information being created in just two days in 2010. Data had become deeply ingrained in our everyday lives.

Hundreds of upstarts are attempting to take on big data. Thousands of businesses are using the data to optimize their business models. Almost every industry is using the inferences made by analyzing big data. That information has become the most valuable currency in the world; the second most valuable thing is the people who are able to use it properly. As the world becomes more caught up in the digital realm, it will bring us closer to each other. It has also brought more of our lives into the public eye. Data collection will become more and more important. Companies will be using all of this data to find new ways to sell people products and services. There’s no doubt that the government will also be using it to improve the environment, get votes, and keep the people in check.

Even with big data, the future is still a mystery, since it may go either way. The future could be changed for the better by big data; it could also be hurt by the ways that private corporations are using that data now. Having more data out in the open gives more and more power to governments, and one day it may lead to the realization of the people’s “Big Brother” fears.
Chapter 4: Considering the Pros and Cons of Big Data

Back in the day, whenever a crisis such as a recession or the bursting of a bubble hit, no one could truly understand it, and all anyone could come up with would be: “something went wrong somewhere.” Nowadays, however, due to the rise of big data, it is much easier to precisely describe socio-economic, political, and other types of factors. Though many might think that this quantum leap in our capabilities to measure and analyze data would be nothing but positive, there may be some negative implications that we must keep in mind.

The discussion on the existence of “big data” and how it shapes and will continue to shape our future has been never-ending, ever since the very concept of “big data” was introduced. Very few would dispute that big data has proved to be the catalyst of many positive changes and developments in our everyday lives. What many don’t see, however, are the various harms that big data has been introducing into our lives as well. Some economic and social studies experts have posited that the reduction in personal privacy, thanks to the often unfettered access that public and private corporations have to our personal information, is among the least of the drawbacks. Even national politicians in Washington have become aware of the growing unease that big data may be negatively impacting the lives of the average Joe. The White House has addressed the issue, stating that big data must strike a balance between its socio-economic value and the privacy it may be violating in order to become one of the strongest catalysts for socio-economic progress. Here we examine some of the pros and cons of big data in our modern society.

The Pros

As was stated earlier, big data may be an immense help to both the private and public sectors, but how exactly? Here are some of the more common ways.
New methods of generating profit

This one may not be directly beneficial to most, as the ones who benefit the most are company owners and employers, but greater revenues lead to a stronger economy, which means that more people can keep their jobs. It is crucial for companies to be able to turn a profit in order for them to be able to employ people. Big data may open new opportunities for companies in any sector. Companies that directly use big data gather valuable information that is desired by other companies as well. The raw data as well as the analyzed and interpreted form of such data may be sold to other companies, generating even more revenue.

Improving Public Health

Going into more specific examples, healthcare is a component of the public sector that is a beneficiary of big data. The improved ability to gather and analyze massive amounts of data about hospital staff, patients, and even the wants and needs of the public has allowed experts to better develop methods and policies that will be more responsive to public needs. Perhaps even more important is the merger of big data and the science of genetics. This merger is one of the things that will revolutionize the world. It may someday be possible to include a patient’s genetic code in their health records. It may be possible to analyze these genetic maps in order to discover the genetic bases of certain illnesses. The possibilities are endless, and no one knows just how much the partnership between big data and the health sector will be able to benefit us. Unlike most other industries, healthcare has been lax in following the trend of personalized services, but the arrival of big data will help in picking up the slack, and will indubitably shift the trend and bring us closer to the age of truly personalized medicine.

Improving Our Daily Environment

How many trash cans are needed on the street? How many street lamps are needed? At what point in the day do traffic jams occur?
These are all questions that are easily answered through the use of big data. Thanks to the development of modern data gathering systems, it has never been easier to find out what happens in our public spaces. This data can be used not only to save vast amounts of money, but to create significant and concrete impacts in our daily lives. The city of Oslo in Norway has been able to greatly reduce the amount of energy used in its street lighting. Portland, Oregon has used big data to reduce its carbon dioxide emissions. Even the police department of Memphis, Tennessee has reduced the serious crime rate in its area by 30% through big data. Big data revolutionizes how we run our cities, and this is only the beginning. In the future, it may be possible for a central mainframe to gather and analyze the data in real time, and use this data to tweak the performance of the city’s services. Imagine the improvements that this may bring to our cities. More and more cities are beginning to incorporate big data into their systems, and eventually, every city will be using it to improve our daily lives.

Improving Decisions: Speed and Accuracy

Regardless of the industry, and no matter what the final target may be, whether increased security, revenue, or healthcare, the existence of big data lets us respond faster. Big data affords anyone using it the ability to make more informed decisions, from how to market to individual customers to providing adequate healthcare for everyone. As the big data industry evolves, we become better and better at analyzing data in real time, which gives us rapid results and assists our decision-making.

Personalized Products and Services

Companies develop products based on what they think customers may buy. Now, with the advent of big data, companies are better able to find out about people’s interests and preferences.
One of the services that sees great use today is Google Trends, which allows companies to find out what people are searching for on the World Wide Web. This data allows companies to develop personalized products and services that are even more responsive to consumer needs.

The Cons

As was mentioned earlier, while big data has many benefits, it is not without its negative side. The positive aspects are extremely helpful in developing society and moving progress forward, but there are certain aspects of it that give people legitimate cause for concern. Like anything that seems too good to be true, big data can be a double-edged sword.

Privacy

The greatest critics of big data have been civil rights activists and people who maintain the belief that privacy is more valuable than any advantage that big data grants us. Big data collects personal data, and this allows companies to learn numerous things about any individual user. This enables marketers to use this knowledge to sell products by manipulating the subconscious of unsuspecting users. There are numerous methods used by marketers that allow them to convince us to buy products we would probably not have bought, and most of these methods make good use of what big data says about us. Detractors of big data say that this constitutes an unjustifiable invasion of our privacy, especially when carried out by the private sector. This argument carries a lot of weight, and should be considered. Big data tells marketers so much about us that companies can even tell what color a product should be so that people are more inclined to buy it. While this may sound like a magic trick, it’s quite real, and it shows one of the dark sides of big data being commercialized.

Big Brother

Ever since its introduction into mainstream culture by George Orwell, the concept of a “Big Brother” has been a constant specter looming over everyone. We know that our governments observe us and carry out certain activities to ensure that we are kept “in check”.
Some believe this more strongly than others, with certain conspiracy theorists positing that a small cabal of extremely powerful men and women runs the world from the shadows. Though many dismiss this, even the most moderate of us know and understand that governments really do collect a lot of personal data, some of which they may not have any right to collect. The fear of “Big Brother” is very prominent in Western societies and, with the advent of modern technologies, the specter seems more and more likely to turn into reality. In most American cities, one cannot walk more than a few blocks without being filmed by numerous cameras. Most, if not all, of our devices, such as phones and even cars, have GPS signals that an unscrupulous entity may be able to take advantage of. Even satellite footage has become more accessible to governments, raising the question: are we ever really alone?
The sheer amount of data collected has done some good in the world, as we saw earlier, but naturally, people desire some measure of privacy, even when they have nothing serious to hide. Recent leaks have revealed the existence of phone tapping, social media monitoring, and other such forms of government surveillance. This leads to a sense of distrust and unease, even for citizens who are not up to anything malicious. There has to be a balance. People are afraid that if the government knows too much about the personal lives of its citizens, it will hold too much power, because information, especially in our modern age, is power. This is why regulations are key to limiting the access of public agencies to big data. Even in a democratic system, a government with so much information holds a lot of power over its citizens, and citizens should at least have the right to be asked what information they want accessible.

Stifling Entrepreneurship

Small businesses are not banned from using big data; far from it. However, with the sheer amount of resources and capacity large corporations can bring to bear, it is well-nigh impossible for a small business to compete. One method that small businesses have always been able to use in order to compete is the personalization of their services and products. With the dozens or even hundreds of data scientists large corporations have at their disposal, they can easily sift through extensive amounts of data to better target their market. This neutralizes any comparative advantage a small business may once have had, and there will be virtually no way for a small business to offer something a big corporation can’t.

Data Safekeeping

Given the massive amount of data gathered, there is no feasible way to store it in traditional physical media. All this data is stored on computers, all accessible through the internet.
It may already be bad enough that a corporation has all of your personal information, but how much worse does it become once they decide to sell it? What if hackers decide to steal it from them? These are all legitimate concerns held by many people. In fact, it is well known that companies do share user data among themselves, regardless of whether doing so is legal. This leads to the very real possibility of someone’s personal information landing in the hands of a company that they have never even interacted with. The possibilities get even worse when one accounts for hackers who may be able to access user data such as pictures, addresses, or even credit card numbers. Companies are constantly upgrading their systems to protect against this possibility, but hackers also constantly find ways around these defenses. There is no surefire way to keep the data out of unwanted hands, which means that perhaps some sort of limit should be imposed on this type of data collection.

Erroneous Data Sets and Flawed Analyses

We already know that big data has major potential in shaping the directions that businesses, corporations, and the public sector take. This assumes, however, that the proper data is collected and that it is properly analyzed. Nothing is perfect, and neither data collection systems nor analysis systems are anywhere near perfect. These flaws mean that businesses, corporations, or anyone else intending to use big data should use their data judiciously, as over-reliance on it may prove to be counterproductive. There are many opportunities for flaws to interfere with the process. Beginning with data collection, collection systems may not collect sufficient data, or the data gathered may be skewed or biased, and therefore intrinsically flawed. Even if the analysis of such data were perfect, you would end up with incorrect conclusions and inferences, which may cost millions per error. And that is assuming the data analysis is flawless.
It is also possible, even likely, that the analysis would be flawed, leading to imperfect conclusions. In fact, it has been found that over 15% of data analysis projects are flawed enough that the very existence of the companies using them may be in jeopardy. In addition, around 30% of other projects may end up with a net loss. This just goes to show that while big data is useful and potentially game-changing, it must be used with caution; otherwise it may do more harm than good.

Conclusions

At the end of the day, big data has both its pros and its cons. As with any other important issue, careful observation is needed, and both its benefits and drawbacks must be considered. There have been countless ways that big data has improved our lives, but there have also been causes for concern. Big data will most likely be the subject of numerous discussions and political debates, and the extent of its regulation, or lack thereof, remains to be seen. We can be sure, however, that our lives are becoming more and more open every day, no matter what we do. We are being monitored more than ever, but we must keep in mind that the bulk of this data, especially that collected by the public sector, has been used to improve our lives and make them safer. The private sector’s data collection tends to be through voluntary means, and it is up to the government to regulate the private sector’s use of it.

Chapter 5: Big Data for Small Businesses? Why not?

Small businesses sometimes lag behind large companies when it comes to cutting-edge technologies, mainly because they simply cannot afford them. Many large corporations have embraced big data, and this has led people to believe that it is one of the cutting-edge technologies that small businesses will have trouble adopting. The truth is that employing data analytics does not require complex systems that are resource hogs.
Small businesses may employ big data, as the most important components are based on developing human resources: how skillful the data analysts are, how much attention they pay to detail, and how well-organized they are. Any company that has the capability to gather relevant data and analyze it critically will be able to create and seize new opportunities. Big data may be used by small businesses to improve the quality of their products and services, better tailor their marketing strategies, and even foster better customer relations. These all affect a business’s bottom line, and the use of big data to achieve them does not have to be prohibitively expensive.

The Cost Effectiveness of Data Analytics

Why is cost an issue, anyway? Costs are always accounted for before a vital business decision is made because cost has a direct effect on the decision’s profit potential, and therefore its viability. If a small business increases its revenue by 30% through new techniques, but its costs go up by 35%, net profit can decrease when costs already consume most of the revenue, and the innovations introduced will have turned out to be detrimental. For example, a business with revenue of 100 and costs of 90 would see its profit fall from 10 to 8.5. However, it will spell doom for a business if it does not innovate while its competitors adopt new techniques and technologies. Most companies have begun to use data analytics, and it would be folly to ignore the opportunity to make use of this field as well. As said earlier, the use of big data helps create opportunities, and oftentimes these opportunities reduce costs and increase revenues. Small businesses should consider using it, as they grow much as large corporations do: by taking advantage of new business opportunities.

Big Data can be for Small Businesses Too

There are certain traditions that come with running a business, no matter the size. Oftentimes, the inertia proves too much for a new technology to effectively gain a foothold.
Developers try to get around this by pushing their ideas on large companies, as these have deeper pockets and larger profit margins, and can therefore afford to test new concepts. This is what happened when big data was first introduced. The software solutions marketed to businesses were based on the advantages of economies of scale. Small businesses have problems creating economies of scale, however, as they often struggle to build base capacity to begin with. This leads many to think that big data isn’t for them, especially as many of the existing software solutions require a large capital outlay in order to operate properly. However, innovators have developed data solutions that are viable for small companies as well. These solutions give small businesses the appropriate tools for their operations. They can then begin to catch up with the bigger companies, as big data increases efficiency and productivity, allowing a company to expand its market share. Unless smaller companies start to adopt these solutions, however, larger companies will eventually run them out of town.

Where can Big Data improve the Cost Effectiveness of Small Businesses?

Social media is a new development that allows for greater and greater connectivity between people. Small businesses can use social media sites such as Facebook, Instagram, Twitter, and other similar websites to gauge consumer trends. This can help businesses get ahead of the curve and rapidly capitalize on emerging trends. A strong social media presence can also contribute to rapid growth of the client base, especially with effective use of the site’s consumer analytics and ad platforms. This moves away from traditional marketing and requires different strategies, such as targeted advertising that displays ads to those most likely to buy your company’s product. Another method would be the launch of online services.
Any website can be set up so that the administrator can analyze how often customers visit the site, as well as study their habits, such as their most frequented pages. These little details can help a company develop a website that better appeals to its customer base. These are all cost-effective solutions when done properly, and they can easily be as beneficial to small businesses as they are to large corporations.

What to consider when preparing for a New Big Data Solution

As a small business, you will not be properly served by a mass-produced data solution. A small business needs a data solution tailored to its specific needs, one that helps gather relevant data and helps analyze it. This solution should also be streamlined, in order to do away with superfluous functions and reduce costs. A good data solution should also be able to work alongside the business’s pre-existing solutions and systems. At the end of the day, a small business needs a solution that is integrated and conveniently packaged. A solution requiring a complete overhaul of the business’s pre-existing system may not be the best choice, as it would be very costly and would take much time and many resources to return to a “business as usual” state. One way to address this is to bypass the massive integrated solutions and go for smaller, modular data solutions. These can be chosen for specific departments. Each department lays out its system requirements, justifies its costs, and conducts its own research into which solution best addresses its needs, rather than the business buying a costly data solution that applies to the whole organization. This would allow a small business to make use of multiple solutions, each tailor-fit to the relevant department.
This allows a department to increase its efficiency while minimizing interference with other departments and keeping costs down. Another thing to remember is to obtain a data solution that can be easily deployed. This also means that it should be relatively easy for end users to use. A data solution that takes more than a few weeks to deploy, from testing stage to final integration, may not suit the needs of a small business. Small businesses cannot afford the loss in productivity that a prolonged period of downtime would bring. Small businesses will also have a hard time using a data solution that requires too much specialization, as they may not have the resources to train or hire personnel who can properly make use of the system. In addition, acquiring a new big data solution will be pointless if it is too complicated to use properly, as the entire point of obtaining a data solution is to improve efficiency and effectiveness, and to create new opportunities to be taken advantage of. The best data solution would be one easy enough to use that a new user can learn it quickly, rather than having to undergo specific training just to employ it. For a small business with limited resources, one of the most important things to consider is the overall cost. A versatile data solution would be best, as more of its capabilities can be used as the company begins to expand. The data solution should be designed so that a company pays only for the capabilities that it actually uses, rather than paying for all the bells and whistles. As the company expands, however, more of the data solution’s capabilities will need to be employed, so there should be a licensing option that allows access to these capabilities as the company grows. It is possible, and very practical, to make the switch from a traditional intuition-based business to an analytics-based business, even when the business in question is small.
All that is needed is the proper identification of an IT solution that will suit the size and purpose of the business involved. Once this is done, the business will be poised to reap the main benefit associated with the use of big data: the chance to create and take advantage of new opportunities.

Chapter 6: Important training for the management of big data

Just as teachers impart education and police uphold law and order, the systems of an organization direct its operating processes. In the same way that the various departments of government are upgraded, organizations also make changes to keep up with modern trends and technologies. New systems are important to ensure optimal output of the organization, and they must be fully understood for effective management. Army personnel attend refresher courses to update their skills, and companies hold conventions to introduce new remedies to medical practitioners. In the same way, talented people are selected and trained to handle the management of big data. Management must be fully cognizant of the implications and impact new systems have on the overall functioning of operations. From time to time, new systems replace or update processes in order to meet fresh demands. No matter how large or small an organization is, the ability to use big data has immense benefits. Supporting innovation ensures increased efficiency and productivity while accelerating growth and reducing costs. When a system change arrives, former processes become out of date and handlers have to be trained to manage the new functions. Big data supports the distribution of information even when departments work independently of each other. Departments interact more efficiently due to the availability of swiftly shared data. Updated facts improve collaboration and help department heads make smart, informed, on-the-spot decisions.
A lot of importance used to be placed on the acquisition of data, but since the internet has made information so accessible, data is no longer protected. Important data is now only useful when it is gathered, arranged in a specific way, and classified. It can then be used where it is most pertinent. This kind of information sorting often takes place without any rules or regulations and depends on the sorter’s intuition and judgement. This creates a deficit in information and impedes people’s capacity to make the best use of such incomplete data. It is essential that only those individuals in an organization who are able to use big data productively are trained in handling this information. Some criteria must be formalized for training in information gathering and sorting. Those employees who handle big data may need to use Hadoop, an open-source software framework for distributed data processing. They should get the highest priority when it comes to training. The Human Resources department usually selects those employees who have an aptitude for training. Their qualifications, skills, and motivation are checked. Department heads are consulted since they understand their people best and can make informed recommendations. These officers can identify the individuals who handle data well during heavy traffic, and they know whether training these persons will benefit the organization. The head of the IT division makes the final selection, having frequent dealings with the shortlisted candidates and knowing their caliber. The head has to be thorough in this final assessment and must choose only those candidates who will bring dynamism to the department, drive the new technology, and help employees to become more efficient.

Present level of skill in managing data

Data management is becoming increasingly important to organizations, and yet surveys continue to reveal how far below par operators are when it comes to handling information.
It is noted in these reports that a large number of IT staff are deficient in data management and analysis skills. They struggle with similar big data issues: long-term storage, data organization and management, knowledge of data management plans, and a need for consultation and instruction. Most employees lack information about data management facilities. A few departments have little or no representation. The surveys showed that employees were interested in streamlined systems, while IT staff indicated that they may not be getting appropriate guidance and support and requested more training in their field. The findings point to the need for improved training for staff managing big data. A conscious effort must be made to raise the level of data collection, management, and distribution, so that the installation of advanced systems benefits the organization.

Where big data training is necessary

The IT department

Technology changes constantly. If it is incorporated in the workplace, the staff handling this area must remain current. This should be a matter of course, since trained employees lead to the success of an organization and move it forward. The IT department in particular should receive all the training it can get, since it bears the brunt of complaints from the rest of the organization when employees fail to manage their systems. It is members of the IT division who are beset by irate colleagues with unending complaints about the inefficiency of the department and system. When connections between departments stop working correctly, it is these poor individuals who are held accountable for the glitches, whether it is their fault or not. Demands to restore and update systems put stress on the IT department, even though it is generally the user who is at fault.
The blame for under-functioning systems must lie with the organization and department heads, who must recognize the problems and ensure all departments are kept up to date. Proper training, policies, and working practices should be in place to maximize the return on systems. It is vitally important that the right persons are selected for training so that organizational efforts are not wasted. Well-trained IT staff fully understand the value of their system and optimize its usage to benefit the organization’s productivity. IT security is compromised when a system is affected by theft of or damage to its hardware, its software, or the data on it. The security of systems has become a major concern in spite of all the new guidelines on the correct handling of data. The IT department must be aware of any threat to the security of its system and ensure its protection from attack. Hacking and breaches of applications are a constant threat, and the department must know how to preempt such hazards. Organizations suffer significant losses due to the theft of data, and some establishments have created security systems supported by legal measures to ensure their data isn’t compromised.

The Product Development department

Training is essential to the product development staff since they not only create new products but also re-engineer existing merchandise. They have to be proactive and innovative, and the whole department is involved in research and development, which requires fresh, original thinking. This department must come up with new products that will be readily accepted by competitive markets, and the team must also interact with other departments to get their feedback. This is a huge responsibility since the future of the company is directed by this department. Failure of a product can lead to the failure of the organization. Product development staff must have a complete understanding of product development issues.
Re-engineering products and inventing new ones requires the support of big data, which must be well analyzed, critically evaluated, and properly utilized. For this reason, staff in this department must be trained well. They must be able to recognize and take advantage of the benefits big data provides, principally in data integration across the touch points of product development, such as design, software applications, manufacture, and quality control. Training promotes the productive use of big data, which assists the generation of new ideas and solutions while creating a greater level of interdepartmental cooperation that benefits the entire organization.

The Finance department

This department is almost as vital as the IT department; some would argue it is the most vital area of any organization. The finance department is all about the money. This is understandable because business is about making money, and organizations would collapse without substantial profits. Anyone who cannot deliver the money will be considered a liability for the organization. Staff in this department must know whether there is financial value in deploying certain assets. Clearly, they must be well trained in using big data in order to make important monetary decisions. With the influx of information, financial functions have become intricate, making training in big data a necessity. Employees in the finance department must be trained to make use of big data platforms in order to construct financial models that will sustain the organization. Money will be depleted without controlled spending and well-funded projects, and the financial team must keep the money coming in. Big data technologies must be understood and harnessed to support the goals of the organization. Training prepares these employees to make use of big data in their central roles of business planning, auditing, accounting, and overall regulation of finances.
When the work is managed successfully, the organization’s finances are significantly improved. Well-trained staff generate accurate statements regarding cash flow, compliance with finance standards, cost modeling, price realization, and so on.

The Human Resources department

In today’s fast-changing world, the way in which a human resources department operates goes a long way in defining the success of the organization. The human resources department must be able to rank information by relevance in order to make strategic decisions about how the organization functions. Training in big data brings skills like data analysis, visualization, and problem solving, covering everything from operational reporting to strategic analytics. Skillful application of big data improves personnel quality by evaluating the capabilities of current workers and determining the competencies required in future hires. Employment is no longer the main concern of the human resources department. Its scope has widened, and it now has to understand and analyze all matters concerning its department. Human resources staff members must be able to use available tools for data analysis, because such capabilities will eventually help in resolving issues such as staff retention, staff-customer relations that affect sales, skills and deficiencies within the organization, the quality of candidates for employment, and a host of other matters related to the department and organization. Intuition, experience, and a sound data-driven approach benefit the human resources department and, in turn, the whole organization.

The supply and logistics department

Business will come to a standstill if supplies are delivered late or not at all due to stock and logistical failures. Late deliveries and breakages in transit will cause irate customers to cancel their orders and spread complaints about poor service. The aims of this department are speed and agility, saving costs, and improving performance.
Its staff capture and track different forms of data to improve operational efficiency, improve customer experience, and create new and better business models. These factors help organizations to conserve resources, build a better brand name, and create a systematic process for their supply chain and logistics. Logistics and supply staff must also have the benefit of training in big data in order to devise and employ tactics for attaining departmental objectives. Planning and scheduling are perhaps the most vital parts of any supply chain. A great deal of money can be lost or spent in scheduling and planning, and with big data this process can be optimized. Big data can help predict demand so that money, space, and time are not wasted. Levels of inventory can be maintained and market trends observed. Logistics and supply departments have transaction-based processes that generate an enormous amount of data. Big data has applications at all levels of business, and the supply and logistics department must appreciate its advantages.

The Operations department

Operations are the organized daily activities of a business. Operations departments administer business practices to create the highest level of efficiency within an organization. Operations divisions ensure that companies can effectively meet customer requirements using the least amount of resources necessary, and they manage customer support. Their undertakings also include product creation, development, production, and distribution. Operations play a distinctive role that contributes to the overall success of the organization. Organizations have been using data in various ways to improve their operations. With the emergence of new technologies, big data is becoming an inherent part of modern operations, where challenges are overcome by using big data analytics. Its concepts and applications help businesses predict product and customer behavior.
Organizations realize that big data brings value through continuous improvements in their operations departments.

The Marketing department

Numbers are critical in marketing, whether they concern the number of customers, the percentage of market share, or sales revenue. There are many more figures and statistics in the marketing department, and all these numbers keep the team busy. There is a huge amount of data generated by market activities. Some of it is positive and useful, while other data can possibly damage your brand. The marketing department must collect data that is relevant, analyze this information, discard what is unimportant, and preserve key details that can be utilized to benefit the brand. Without the skills to handle big data, it is difficult to understand the massive amount of information on the internet, particularly in an age when digital marketing is part of the business process. Training in big data enables marketing staff to measure with relative accuracy the response advertisements get from the market, as well as the volume of impressions, click-through rates, return on investment, and other factors that affect marketing. There are plenty of these statistics on social media, and while they may look unimportant to the casual eye, this information is gold for the marketing department. The large amount of data that is available through the various levels of customer activity and social media, as well as evidence generated during sales, is useful to trained marketing employees who are able to make good use of it. Platforms such as Hadoop are very useful for framing marketing indicators. Big data training also helps in retrospection, so that marketing staff can judge the performance of their brand against competing products. Big data tools even allow marketing teams to check out competitors’ websites and do a bit of helpful text mining. All this information can be used to improve their own brand.
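The marketing figures mentioned in this section, click-through rate and return on investment, can be worked out from a handful of campaign counts. The short Python sketch below is only an illustration under stated assumptions: the Campaign record, the function names, and all of the numbers are hypothetical and do not come from any particular analytics platform.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """One advertising campaign. All figures used here are made up."""
    name: str
    impressions: int  # number of times the ad was shown
    clicks: int       # number of times the ad was clicked
    cost: float       # amount spent on the campaign
    revenue: float    # sales attributed to the campaign


def click_through_rate(c: Campaign) -> float:
    """Fraction of impressions that resulted in a click."""
    if c.impressions <= 0:
        raise ValueError("impressions must be positive")
    return c.clicks / c.impressions


def return_on_investment(c: Campaign) -> float:
    """Net gain divided by cost; 0.5 means a 50% return."""
    if c.cost <= 0:
        raise ValueError("cost must be positive")
    return (c.revenue - c.cost) / c.cost


def rank_by_roi(campaigns):
    """Return the campaigns ordered from highest to lowest ROI."""
    return sorted(campaigns, key=return_on_investment, reverse=True)


if __name__ == "__main__":
    # Hypothetical campaigns used only to exercise the functions.
    campaigns = [
        Campaign("search_ad", 8000, 320, 400.0, 600.0),
        Campaign("social_ad", 20000, 400, 500.0, 1250.0),
    ]
    for c in rank_by_roi(campaigns):
        print(f"{c.name}: CTR={click_through_rate(c):.1%}, "
              f"ROI={return_on_investment(c):.0%}")
```

Ranking by return on investment rather than by click-through rate is a deliberate choice in this sketch: a campaign can attract many clicks and still lose money, so the two figures are best read together.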
The Data Integrity, Integration and Data Warehouse department

It is important for organizations to have teams of experts to monitor and sort through the immense amount of data that collects on various company systems. This team must be well trained and current, and be aware of the possible dangers that may be linked with accessing this data. They must also be able to warehouse this data and make structured sense of it. These teams need to know the rules that apply to customer protection and privacy laws if they are to work with and interpret this data.

The Legal and Compliance department

It has become necessary for the legal department in an organization to be aware of existing privacy and retention procedures, since new legal and compliance rules that cover data are enacted regularly. Companies that do not monitor and report their data can run into problems with the law and must have policies in place to safeguard themselves against possible litigation and breaches. These departments must work together to understand data privacy laws and make sure that their files are secure, warehoused, and handled correctly.

Chapter 7: Steps Taken in Data Analysis

In this chapter, we’ll break down the process of data modeling into steps and look at each one separately, but before that, we’ll define it.

Defining Data Analysis

We need to know exactly what data analysis is before we can understand the process. Data analysis is the procedure of first setting goals as to what data you need and what questions you’re hoping it will answer, then collecting the information, and then inspecting and interpreting the data, with the aim of sorting out the bits that are useful in order to suggest conclusions and help various users with decision making. It focuses on knowledge discovery for predictive and descriptive purposes, sometimes discovering new trends and sometimes confirming or disproving existing ideas.
Actions Taken in the Data Analysis Process

Business intelligence requirements may be different for every business, but the majority of the underlying steps are similar for most:

Phase 1: Setting of Goals

This is the first step in the data modeling procedure. It’s vital that understandable, simple, short, and measurable goals are defined before any data collection begins. These objectives might be set out in question format. For example, if your business is struggling to sell its products, some relevant questions may be, “Are we overpricing our goods?” and “How is the competition’s product different to ours?” Asking these kinds of questions at the outset is vital because your collection of data will depend on the type of questions you have. So, to answer your question, “How is the competition’s product different to ours?” you will need to gather information from customers regarding what it is they prefer about the other company’s product, and also launch an investigation into that product’s specs. To answer your question, “Are we overpricing our goods?” you will have to gather data regarding your production costs, as well as details about the price of similar goods on the market. As you can appreciate, the type of data you’ll be collecting will differ hugely depending on what questions you need answered. Data analysis is a lengthy and sometimes costly procedure, so it’s essential that you don’t waste time and money by gathering data that isn’t relevant. It’s vital to ask the right questions so the data modeling team knows what information you need.

Phase 2: Clearly Setting Priorities for Measurement

Once your goals have been defined, your next step is to decide what it is you’re going to be measuring, and what methods you’ll use to measure it.

Determine What You’re Going to be Measuring

At this point, you’ll need to determine exactly what type of data you’ll need in order to answer your questions.
Let’s say you want to answer the question, “How can we cut down on the number of people we employ without a reduction in the quality of our product?” The data you’ll need will be along these lines: the number of people the business is currently employing; how much the business pays these employees each month; other benefits the employees receive that are a cost to the company, such as meals or transport; the amount of time these employees are currently spending on actually