Sunday 6 August 2017

GLOSSARY: BIG DATA


Source:

https://en.wikipedia.org/wiki/Analytics


File: GLOSSARY-BIG DATA

Analytics: https://en.wikipedia.org/wiki/Analytics


Analytics is the discovery, interpretation, and communication of meaningful patterns in data. Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance.

In a fast-moving space like big data, it’s critical to separate the jargon from meaning and (more importantly) to recognize the difference between the hype and the true value proposition. The following glossary covers many of the most common – and sometimes misunderstood – big data terms and concepts.

ALGORITHM

An algorithm is mathematical “logic” or a set of rules used to make calculations. Starting with an initial input (which may be zero or null), the logic or rules are coded or written into software as a set of steps to be followed in conducting calculations, processing data or performing other functions, eventually leading to an output.
Teradata Take: Within the context of big data, algorithms are the primary means for uncovering insights and detecting patterns. Thus, they are essential to realizing the big data business case.
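As a purely illustrative sketch (not from any Teradata product), the short Python function below expresses an algorithm as explicit steps: take an input list of values, apply a fixed rule to each window, and emit an output. The sales figures and window size are made up.

```python
# A minimal algorithm: fixed steps that turn an input (daily sales figures)
# into an output (a moving average). Data and window size are invented.

def moving_average(values, window=3):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    averages = []
    for i in range(len(values) - window + 1):
        chunk = values[i:i + window]          # step 1: take the next slice
        averages.append(sum(chunk) / window)  # step 2: apply the rule
    return averages                           # step 3: emit the output

print(moving_average([10, 12, 9, 14, 20, 18]))  # approx. [10.33, 11.67, 14.33, 17.33]
```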

ANALYTICS PLATFORM
An analytics platform is a full-featured technology solution designed to address the needs of large enterprises. Typically, it joins different "tools and analytics systems together with an engine to execute, a database or repository to store and manage the data, data mining processes, and techniques and mechanisms for obtaining and preparing data that is not stored. This solution can be conveyed as a software-only application or as a cloud-based software as a service (SaaS) provided to organizations in need of contextual information that all their data points to, in other words, analytical information based on current data records." Source: Techopedia

APACHE HIVE
Apache Hive is an open-source data warehouse infrastructure that provides tools for data summarization, query and analysis. It is specifically designed to support the analysis of large datasets stored in Hadoop files and compatible file systems, such as Amazon S3. Hive was initially developed by data engineers at Facebook in 2008, but is now used by many other companies.

ARTIFICIAL INTELLIGENCE (AI)
AI is a long-established branch of computer science concerned with software that simulates human decision-making. It mimics "learning" and "problem solving" through advanced algorithms and machine learning. AI has grown popular across many industries, with use case examples that include personalization of marketing offers and sales promotions, anti-virus security, equities trading, medical diagnosis, fraud detection and self-driving cars. Big data coupled with deep neural networks and fast parallel processing is currently driving AI growth.
Teradata Take: Teradata's Sentient Enterprise vision recommends widespread use of automated machine learning algorithms. Business leaders should focus on specific use cases, not the term "AI" itself. After all, algorithms are not human: they don't think and they are not truly intelligent or conscious. AI applications require fresh data and ongoing program maintenance to improve accuracy and reduce risk, so it's best to be skeptical of Hollywood renderings of AI and of general marketing hype.

BEHAVIORAL ANALYTICS
Behavioral Analytics is a subset of business analytics that focuses on understanding what consumers and applications do, as well as how and why they act in certain ways. It is particularly prevalent in the realm of eCommerce and online retailing, online gaming and Web applications. In practice, behavioral analytics seeks to connect seemingly unrelated data points and explain or predict outcomes, future trends or the likelihood of certain events. At the heart of behavioral analytics is such data as online navigation paths, clickstreams, social media interactions, purchases or shopping cart abandonment decisions, though it may also include more specific metrics.
Teradata Take: But behavioral analytics can be more than just tracking people. Its principles also apply to the interactions and dynamics between processes, machines and equipment, even macroeconomic trends.
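A rough pandas sketch of the idea, with invented session data and column names: group clickstream events by session and flag sessions that added to a cart but never purchased (cart abandonment).

```python
import pandas as pd

# Invented clickstream events; real data would come from web logs.
events = pd.DataFrame({
    "session_id": [1, 1, 1, 2, 2, 3],
    "event": ["view", "add_to_cart", "purchase", "view", "add_to_cart", "view"],
})

per_session = events.groupby("session_id")["event"].apply(set)
abandoned = per_session.apply(lambda s: "add_to_cart" in s and "purchase" not in s)
print(abandoned)  # session 2 is flagged: added to cart, never purchased
```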

BIG DATA
“Big data is an all-encompassing term for any collection of data sets so large or complex that it becomes difficult to process them using traditional data-processing applications.” Source: Wikipedia
Teradata take: What is big data? Big data is often described in terms of several “V’s” – volume, variety, velocity, variability, veracity – which speak collectively to the complexity and difficulty in collecting, storing, managing, analyzing and otherwise putting big data to work in creating the most important “V” of all – value.

BIG DATA ANALYTICS
“Big data analytics refers to the strategy of analyzing large volumes of data … gathered from a wide variety of sources, including social networks, videos, digital images, sensors and sales transaction records. The aim in analyzing all this data is to uncover patterns and connections that might otherwise be invisible, and that might provide valuable insights about the users who created it. Through this insight, businesses may be able to gain an edge over their rivals and make superior business decisions.” Source: Techopedia
Teradata Take: What is big data analytics? Big data analytics isn’t one practice or one tool. Big data visualizations are needed in some situations, while connected analytics are the right answer in others.


BUSINESS INTELLIGENCE
“Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.” Source: Gartner “Companies use BI to improve decision making, cut costs and identify new business opportunities. BI is more than just corporate reporting and more than a set of tools to coax data out of enterprise systems. CIOs use BI to identify inefficient business processes that are ripe for re-engineering.” Source: CIO.com

CASCADING
Cascading is a platform for developing Big Data applications on Hadoop. It offers a computation engine, systems integration framework, data processing and scheduling capabilities. One important benefit of Cascading is that it offers development teams portability, so they can move existing applications without incurring the cost of rewriting them. Cascading applications run on and can be ported between different platforms, including MapReduce, Apache Tez and Apache Flink.

CLOUD COMPUTING
Cloud computing refers to the practice of using a network of remote servers to store, manage and process data (rather than an on-premise server or a personal computer) with access to such data provided through the Internet (the cloud). Programs, applications and other services may also be hosted in the cloud, which frees companies from the task and expense of building and maintaining data centers and other infrastructure. There are a few types of common cloud computing models. Private clouds provide access to data and services via dedicated data centers or servers for specific audiences (e.g., a company’s employees). They may offer customized infrastructure, storage and networking configurations. Often used by small and medium-sized businesses with fluctuating computing requirements, public clouds are typically based on shared hardware, offering data and services on-demand usually through “pay-as-you-go” models that eliminate maintenance costs. Hybrid clouds combine aspects of both private and public clouds. For example, companies can use the public cloud for data, applications and operations that are not considered mission critical and the private cloud to ensure dedicated resources are available to support core processes and essential computing tasks.
Teradata take: Effective cloud computing capabilities have become essential elements in the most effective Big Data environments.

CLUSTER ANALYSIS
Cluster analysis or clustering is a statistical classification technique or activity that involves grouping a set of objects or data so that those in the same group (called a cluster) are similar to each other, but different from those in other clusters. It is essential to data mining and discovery, and is often used in the context of machine learning, pattern recognition, image analysis and in bioinformatics and other sectors that analyze large data sets.
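A small, hypothetical example of clustering with scikit-learn's k-means; the two-dimensional points are fabricated simply to show how similar objects end up with the same cluster label.

```python
import numpy as np
from sklearn.cluster import KMeans

points = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],   # one natural group
    [8.0, 8.2], [7.8, 8.1], [8.3, 7.9],   # another natural group
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # points in the same group share a label
print(kmeans.cluster_centers_)  # the center of each discovered cluster
```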

COGNITIVE COMPUTING
Cognitive computing is a subset of artificial intelligence. It combines natural language processing with machine learning, rules, and interactive “stateful” programming. It is often used in spoken question-and-answer dialogs. Interactive cognitive systems “remember” the context of the current dialog and use that information to refine the next answer. Cognitive computing requires constant program maintenance and new data to improve the knowledge base. Examples of cognitive technology include Apple Siri, Amazon Alexa and IBM Watson.
Teradata Take: Cognitive computing is still in the early stages of maturity. It requires enormous investment, skill and patience for businesses to apply it effectively. Cognitive systems typically make many mistakes when interacting with humans. We expect cognitive computing to mature rapidly for specific tasks in the next decade. But, again, it’s best to be wary of Hollywood and marketing hype about cognitive computing.

COMPARATIVE ANALYSIS
Comparative analysis refers to the comparison of two or more processes, documents, data sets or other objects. Pattern analysis, filtering and decision-tree analytics are forms of comparative analysis. In healthcare, comparative analysis is used to compare large volumes of medical records, documents, images, sensor data and other information to assess the effectiveness of medical diagnoses.

CONNECTION ANALYTICS
Connection analytics is an emerging discipline that helps to discover interrelated connections and influences between people, products, processes, machines and systems within a network by mapping those connections and continuously monitoring interactions between them. It has been used to address difficult and persistent business questions relating to, for instance, the influence of thought leaders, the impact of external events or players on financial risk, and the causal relationships between nodes in assessing network performance.

CONCURRENCY/CONCURRENT COMPUTING
Concurrency or concurrent computing refers to a form of computing in which multiple computing tasks occur simultaneously or at overlapping times. These tasks can be handled by individual computers, specific applications or across networks. Concurrent computing is often used in Big Data environments to handle very large data sets. For it to work efficiently and effectively, careful coordination is necessary between systems and across Big Data architectures relative to scheduling tasks, exchanging data and allocating memory.
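A minimal sketch of the concept using Python's standard library: several chunks of a made-up data set are summarized at overlapping times by a pool of worker processes rather than one after another.

```python
from concurrent.futures import ProcessPoolExecutor

def summarize(chunk):
    """Stand-in for an expensive per-chunk calculation."""
    return sum(chunk) / len(chunk)

if __name__ == "__main__":
    chunks = [list(range(i, i + 1000)) for i in range(0, 4000, 1000)]
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(summarize, chunks))  # chunks processed concurrently
    print(results)
```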

CORRELATION ANALYSIS
Correlation analysis refers to the application of statistical analysis and other mathematical techniques to evaluate or measure the relationships between variables. It can be used to define the most likely set of factors that will lead to a specific outcome – like a customer responding to an offer or the performance of financial markets.
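An illustrative correlation analysis in pandas with fabricated numbers: measure how strongly advertising spend and email volume move together with sales.

```python
import pandas as pd

data = pd.DataFrame({
    "ad_spend":    [10, 20, 30, 40, 50],
    "emails_sent": [5, 9, 4, 15, 20],
    "sales":       [12, 22, 28, 43, 52],
})

# Pearson correlation matrix: values near +1 or -1 indicate strong relationships.
print(data.corr())
```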

DATA ANALYST
The main tasks of data analysts are to collect, manipulate and analyze data, as well as to prepare reports, which may include graphs, charts, dashboards and other visualizations. Data analysts also generally serve as guardians or gatekeepers of an organization's data, ensuring that information assets are consistent, complete and current. Many data analysts and business analysts are known for having considerable technical knowledge and strong industry expertise.
Teradata Take: Data analysts serve the critical purpose of helping to operationalize big data within specific functions and processes, with a clear focus on performance trends and operational information.

DATA ARCHITECTURE
“Data architecture is a set of rules, policies, standards and models that govern and define the type of data collected and how it is used, stored, managed and integrated within an organization and its database systems. It provides a formal approach to creating and managing the flow of data and how it is processed across an organization’s IT systems and applications.” Source: Techopedia
Teradata Take: Teradata Unified Data Architecture is the first comprehensive big data architecture. This framework harnesses relational and non-relational repositories via SQL and non-SQL analytics. Consolidating data into data warehouses and data lakes enables enterprise-class architecture. Teradata unifies big data architecture through cross-platform data access for all analytic tools and the ability to "push down" functions to the data, rather than moving data to the function. See data gravity.

DATA CLEANSING
Data cleansing, or data scrubbing, is the process of detecting and correcting or removing inaccurate data or records from a database. It may also involve correcting or removing improperly formatted or duplicate data or records. The inaccurate, malformed or duplicate records removed in this process are often referred to as "dirty data." Data cleansing is an essential task for preserving data quality. Large organizations with extensive data sets or assets typically use automated tools and algorithms to identify such records and correct common errors (such as missing zip codes in customer records).
Teradata take: The strongest Big Data environments have rigorous data cleansing tools and processes to ensure data quality is maintained at scale and confidence in data sets remains high for all types of users.
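A hedged pandas sketch of basic cleansing on a made-up customer table: drop duplicate records, normalize name casing and flag records with missing ZIP codes for follow-up.

```python
import pandas as pd

customers = pd.DataFrame({
    "name": ["Ann Lee", "Ann Lee", "bob jones", None],
    "zip":  ["02139", "02139", None, "10001"],
})

cleaned = (
    customers
    .drop_duplicates()                             # remove duplicate records
    .assign(name=lambda d: d["name"].str.title())  # fix inconsistent casing
)
missing_zip = cleaned[cleaned["zip"].isna()]       # "dirty" records to fix or drop
print(cleaned)
print(missing_zip)
```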

DATA GRAVITY
Data gravity appears as the volume of data in a repository grows along with the number of uses for it. At some point, the ability to copy or migrate data becomes onerous and expensive, so the data tends to pull services, applications and other data into its repository. Primary examples of data gravity are data warehouses and data lakes. Data in these systems has inertia. Scalable data volumes often break existing infrastructure and processes, which requires risky and expensive remedies. Thus, the best-practice design is to move processing to the data, not the other way around.
Teradata Take: Data gravity has affected terabyte- and petabyte-size data warehouses for many years. It is one reason scalable parallel processing of big data is required. This principle is now extending to data lakes which offer different use cases. Teradata helps clients manage data gravity.

DATA MINING
“Data mining is the process of analyzing hidden patterns of data according to different perspectives for categorization into useful information, which is collected and assembled in common areas, such as data warehouses, for efficient analysis, data mining algorithms, facilitating business decision making and other information requirements to ultimately cut costs and increase revenue. Data mining is also known as data discovery and knowledge discovery.” Source: Techopedia

DATA MODEL / DATA MODELING
“Data modeling is the analysis of data objects that are used in a business or other context and the identification of the relationships among these data objects. A data model can be thought of as a diagram or flowchart that illustrates the relationships between data.” Source: TechTarget
Teradata Take: Data models that are tailored to specific industries or business functions can provide a strong foundation or “jump-start” for big data programs and investments.

DATA WAREHOUSE
"In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used for creating trending reports for senior management reporting such as annual and quarterly comparisons. The data stored in the warehouse is uploaded from the operational systems (such as marketing, sales, etc.)." Source: Wikipedia

DESCRIPTIVE ANALYTICS
Considered the most basic type of analytics, descriptive analytics involves the breaking down of big data into smaller chunks of usable information so that companies can understand what happened with a specific operation, process or set of transactions. Descriptive analytics can provide insight into current customer behaviors and operational trends to support decisions about resource allocations, process improvements and overall performance management. Most industry observers believe it represents the vast majority of the analytics in use at companies today.
Teradata Take: A strong foundation of descriptive analytics – based on a solid and flexible data architecture – provides the accuracy and confidence in decision making most companies need in the big data era (especially if they wish to avoid being overwhelmed by large data volumes). More importantly, it ultimately enables more advanced analytics capabilities – especially predictive and prescriptive analytics.

ETL
Extract, Transform and Load (ETL) refers to the process in data warehousing that concurrently reads (or extracts) data from source systems; converts (or transforms) the data into the proper format for querying and analysis; and loads it into a data warehouse, operational data store or data mart. ETL systems commonly integrate data from multiple applications or systems that may be hosted on separate hardware and managed by different groups or users. ETL is commonly used to assemble a temporary subset of data for ad-hoc reporting, migrate data to new databases or convert databases into a new format or type.
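A toy end-to-end ETL sketch using only the Python standard library; the file name, column names and table are hypothetical. It extracts rows from a CSV, transforms them into a consistent format and loads them into a SQLite table standing in for a warehouse.

```python
import csv
import sqlite3

def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Normalize country codes and cast amounts to numbers.
    return [(r["order_id"], r["country"].upper(), float(r["amount"])) for r in rows]

def load(records, db_path="warehouse.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS sales "
                 "(order_id TEXT, country TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract("orders.csv")))  # orders.csv is a placeholder file
```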

EXABYTE
An extraordinarily large unit of digital data, one Exabyte (EB) is equal to 1,000 Petabytes or one billion gigabytes (GB). Some technologists have estimated that all the words ever spoken by mankind would be equal to five Exabytes.

HADOOP
Hadoop is a distributed data management platform or open-source software framework for storing and processing big data. It is sometimes described as a cut-down distributed operating system. It is designed to manage and work with immense volumes of data, and scale linearly to large clusters of thousands of commodity computers. It was originally developed for Yahoo!, but is now available free and publicly through Apache Software Foundation, though it usually requires extensive programming knowledge to be used.

INTERNET OF THINGS (IOT):
A concept that describes the connection of everyday physical objects and products to the Internet so that they are recognizable (through unique identifiers) by, and can relate to, other devices. The term is closely identified with machine-to-machine communications and the development of, for example, "smart grids" for utilities, remote monitoring and other innovations. Gartner estimates 26 billion devices will be connected by 2020, including cars and coffee makers.
Teradata Take: Big data will only get bigger in the future, and the IoT will be a major driver. The connectivity from wearables and sensors means bigger volumes, more variety and higher-velocity feeds.

MACHINE LEARNING
“Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. It focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data. The process of machine learning is similar to that of data mining. Both systems search through data to look for patterns. However, instead of extracting data for human comprehension – as is the case in data mining applications – machine learning uses that data to improve the program's own understanding. Machine learning programs detect patterns in data and adjust program actions accordingly.” Source: TechTarget
Teradata Take: Machine learning is especially powerful in a big data context in that machines can test hypotheses using large data volumes, refine business rules as conditions change and identify anomalies and outliers quickly and accurately.
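An illustrative scikit-learn snippet: the model "learns" a decision rule from labelled examples instead of being explicitly programmed with one. The features and labels are fabricated.

```python
from sklearn.linear_model import LogisticRegression

# Features: [visits_last_month, avg_basket_value]; label: 1 = responded to an offer.
X = [[1, 10], [2, 12], [8, 90], [9, 85], [3, 20], [10, 120]]
y = [0, 0, 1, 1, 0, 1]

model = LogisticRegression().fit(X, y)
print(model.predict([[7, 80]]))        # predicted class for a new customer
print(model.predict_proba([[7, 80]]))  # the probability behind that prediction
```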

METADATA
“Metadata is data that describes other data. Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier. For example, author, date created and date modified and file size are very basic document metadata. In addition to document files, metadata is used for images, videos, spreadsheets and web pages.” Source: TechTarget
Teradata Take: The effective management of metadata is an essential part of solid and flexible big data “ecosystems” in that it helps companies more efficiently manage their data assets and make them available to data scientists and other analysts.

MONGODB
MongoDB is a cross-platform, open-source database that uses a document-oriented data model, rather than a traditional table-based relational database structure. This type of database structure is designed to make the integration of structured and unstructured data in certain types of applications easier and faster.
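A brief example with the PyMongo driver (the connection string, database and field names are placeholders, and it assumes a MongoDB server is running locally): documents with different shapes can live in the same collection, unlike rows in a fixed relational table.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")  # placeholder connection
db = client["demo"]

db.products.insert_one({"sku": "A100", "name": "kettle", "price": 29.99})
db.products.insert_one({"sku": "B200", "name": "blender",
                        "specs": {"watts": 600, "jars": 2}})  # extra nested field

for doc in db.products.find({"price": {"$lt": 50}}):
    print(doc)
```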

NATURAL LANGUAGE PROCESSING
A branch of artificial intelligence, natural language processing (NLP) deals with making human language (in both written and spoken forms) comprehensible to computers. As a scientific discipline, NLP involves tasks such as identifying sentence structures and boundaries in documents, detecting key words or phrases in audio recordings, extracting relationships between documents, and uncovering meaning in informal or slang speech patterns. NLP can make it possible to analyze and recognize patterns in verbal data that is currently unstructured.
Teradata Take: NLP holds a key for enabling major advancements in text analytics and for garnering deeper and potentially more powerful insights from social media data streams, where slang and unconventional language are prevalent.
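A deliberately simple sketch of one NLP building block, tokenizing text and counting words, using only the standard library; production NLP (parsing, entity extraction, slang handling) goes far beyond this.

```python
import re
from collections import Counter

text = "Loving the new phone!! battery life is gr8, camera is amazing"
tokens = re.findall(r"[a-z0-9']+", text.lower())  # crude word boundaries
print(Counter(tokens).most_common(3))             # most frequent tokens
```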

PATTERN RECOGNITION
Pattern recognition occurs when an algorithm locates recurrences or regularities within large data sets or across disparate data sets. It is closely linked and even considered synonymous with machine learning and data mining. This visibility can help researchers discover insights or reach conclusions that would otherwise be obscured.

PETABYTE
An extremely large unit of digital data, one Petabyte is equal to 1,000 Terabytes. Some estimates hold that a Petabyte is the equivalent of 20 million tall filing cabinets or 500 billion pages of standard printed text.

PREDICTIVE ANALYTICS
Predictive analytics refers to the analysis of big data to make predictions and determine the likelihood of future outcomes, trends or events. In business, it can be used to model various scenarios for how customers react to new product offerings or promotions and how the supply chain might be affected by extreme weather patterns or demand spikes. Predictive analytics may involve various statistical techniques, such as modeling, machine learning and data mining.
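An illustrative sketch with fabricated history: fit a simple linear trend to past demand with NumPy and project the next period. Real predictive analytics would use richer features, validation and more appropriate models.

```python
import numpy as np

months = np.array([1, 2, 3, 4, 5, 6])
demand = np.array([100, 108, 118, 125, 137, 144])     # invented history

slope, intercept = np.polyfit(months, demand, deg=1)  # simple linear trend
print(slope * 7 + intercept)                          # projected demand, month 7
```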

PRESCRIPTIVE ANALYTICS
A type or extension of predictive analytics, prescriptive analytics is used to recommend or prescribe specific actions when certain information states are reached or conditions are met. It uses algorithms, mathematical techniques and/or business rules to choose among several different actions that are aligned to an objective (such as improving business performance) and that recognize various requirements or constraints.
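A toy rule layer on top of a prediction (thresholds, budget and actions are invented) showing how a prescriptive step chooses among actions subject to a constraint.

```python
def recommend_action(predicted_churn_risk, discount_budget_left):
    """Pick an action given a predicted outcome and a business constraint."""
    if predicted_churn_risk > 0.8 and discount_budget_left >= 50:
        return "offer 20% retention discount"
    if predicted_churn_risk > 0.5:
        return "schedule a service call"
    return "no action"

print(recommend_action(0.85, discount_budget_left=120))  # discount is affordable
print(recommend_action(0.85, discount_budget_left=10))   # falls back to a call
```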

R
R is an open-source programming language for statistical analysis. It includes a command line interface and several graphical interfaces. Popular algorithm types include linear and nonlinear modeling, time-series analysis, classification and clustering. According to Gartner research, more than 50% of data science teams now use R in some capacity. R language competes with commercial products such as SAS and Fuzzy Logix.
Teradata Take: Many R language algorithms yield inaccurate results when run in parallel. Teradata partnered with Revolution Analytics to convert many R algorithms to run correctly in parallel. Teradata Database runs R in-parallel via its scripting and language support feature. Teradata Aster R runs in-parallel as well. Both solutions eliminate open source R’s limitations around memory, processing and data.

SEMI-STRUCTURED DATA
Semi-structured data refers to data that is not captured or formatted in conventional ways, such as those associated with traditional database fields or common data models. It is also not raw or totally unstructured and may contain some data tables, tags or other structural elements. Graphs and tables, XML documents and email are examples of semi-structured data, which is very prevalent across the World Wide Web and is often found in object-oriented databases.
Teradata Take: As semi-structured data proliferates, and because it contains some relational data, companies must account for it within their big data programs and data architectures.
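A short example of handling semi-structured data: a JSON document carries tags and nesting (some structure) but no fixed schema, so fields are navigated by name and may simply be absent. The field names are invented.

```python
import json

raw = '{"user": "c123", "orders": [{"sku": "A1", "qty": 2}], "note": "gift wrap"}'
record = json.loads(raw)

print(record["user"])                     # navigate by tag, not by column
for order in record.get("orders", []):
    print(order["sku"], order["qty"])
print(record.get("loyalty_tier", "n/a"))  # fields may simply not exist
```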

SENTIMENT ANALYSIS
Sentiment analysis involves the capture and tracking of opinions, emotions or feelings expressed by consumers in various types of interactions or documents, including social media, calls to customer service representatives, surveys and the like. Text analytics and natural language processing are typical activities within a process of sentiment analysis. The goal is to determine or assess the sentiments or attitudes expressed toward a company, product, service, person or event.
Teradata Take: Sentiment analysis is particularly important in tracking emerging trends or changes in perceptions on social media. Within big data environments, sentiment analysis combined with behavioral analytics and machine learning is likely to yield even more valuable insights.
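A minimal lexicon-based sketch (toy word lists, not a production approach): count positive and negative words to give a rough sentiment score for a piece of text.

```python
import re

POSITIVE = {"great", "love", "amazing", "fast"}
NEGATIVE = {"broken", "slow", "terrible", "refund"}

def sentiment_score(text):
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment_score("Love the camera, amazing battery"))      # positive (+2)
print(sentiment_score("Screen arrived broken, want a refund"))   # negative (-2)
```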

STRUCTURED DATA
Structured data refers to data sets with strong and consistent organization. Structured data is organized into rows and columns with known and predictable contents. Each column contains a specific data type, such as dates, text, money or percentages. Data not matching that column’s data type is rejected as an error. Relational database tables and spreadsheets typically contain structured data. A higher semantic level of structure combines master data and historical data into a data model. Data model subject areas include topics such as customers, inventory, sales transactions, prices and suppliers. Structured data is easy to use and data integrity can be enforced. Structured data becomes big data as huge amounts of historical facts are captured.
Teradata Take: All important business processes and decisions depend on structured data. It is the foundation of data warehouses, data lakes and applications. When integrated into a data model, structured data provides exponential business value.

TERABYTE
A relatively large unit of digital data, one Terabyte (TB) equals 1,000 Gigabytes. It has been estimated that 10 Terabytes could hold the entire printed collection of the U.S. Library of Congress, while a single TB could hold 1,000 copies of the Encyclopaedia Britannica.

UNSTRUCTURED DATA
Unstructured data refers to unfiltered information with no fixed organizing principle. It is often called raw data. Common examples are web logs, XML, JSON, text documents, images, video, and audio files. Unstructured data is searched and parsed to extract useful facts. As much as 80% of enterprise data is unstructured. This means it is the most visible form of big data to many people. The size of unstructured data requires scalable analytics to produce insights. Unstructured data is found in most but not all data lakes because of the lower cost of storage.
Teradata Take: There is more noise than value in unstructured data. Extracting the value hidden in such files requires strong skills and tools. There is a myth that relational databases cannot process unstructured data. Teradata's Unified Data Architecture embraces unstructured data in several ways. Teradata Database and competitors can store and process XML, JSON, Avro and other forms of unstructured data.

THE V’S:
Big data – and the business challenges and opportunities associated with it – are often discussed or described in the context of multiple V’s:
·         Value: the most important “V” from the perspective of the business, the value of big data usually comes from insight discovery and pattern recognition that lead to more effective operations, stronger customer relationships and other clear and quantifiable business benefits
·         Variability: the changing nature of the data companies seek to capture, manage and analyze – e.g., in sentiment or text analytics, changes in the meaning of key words or phrases
·         Variety: the diversity and range of different data types, including unstructured data, semi-structured data and raw data
·         Velocity: the speed at which companies receive, store and manage data – e.g., the specific number of social media posts or search queries received within a day, hour or other unit of time
·         Veracity: the “truth” or accuracy of data and information assets, which often determines executive-level confidence
·         Volume: the size and amounts of big data that companies manage and analyze

***** 


BUILDING A TEAM FOR BIG DATA SUCCESS

Most business people know that big data success takes more than just the latest technology. The right big data strategy (aligned to broader, bigger-picture corporate objectives), strong big data processes (in reporting and governance, for example) and a big data culture (with a strong commitment to data-driven decision making) are critical ingredients, too.
Still, big data strategy discussions too often focus on – even obsess over – the ginormous data volumes, the dizzying range of data infrastructure options and the shiny new technology du jour. And they almost always overlook one crucial variable: the people who generate the critical insights that reveal game-changing opportunities.

PEOPLE AS BIG DATA DIFFERENCE MAKERS
In fact, having the right people and teams may be the most important big data best practice. But, according to a 2014 IDG Enterprise survey of 750 IT decision makers, 40% of big data projects are challenged by a skills shortage.
It's not just a specific technical big data skill set or single discipline that companies need, but rather a range of expertise and knowledge. Yes, technical chops are a must-have. But a broader understanding of big data best practices in specific operational contexts – from sales and service, to finance and the supply chain – is also essential.

Required big data skills and roles for a successful big data strategy and organization:


EXECUTIVE SPONSORS
Senior leaders who can craft a clear vision and rally the troops as to why big data is so important, how it can be used to transform the business and what the major impacts will be; such leaders are necessary to build data cultures as well.
Because many big data initiatives are every bit as transformative as other strategic, enterprise-wide change programs, strong senior leadership is an absolute requirement for success. The potential for disruption – in both the positive and negative senses of that word – is high.
Therefore, effective executive sponsorship (very much including the C-suite) may be the biggest and most important big data best practice of them all.
BUSINESS ANALYSTS
People who know the right questions to ask relative to specific operations and functions, with a real focus on performance trends; they will regularly interrogate big data to identify how specific metrics fit within the broader strategic context and relate to megatrends.
Who they are and why businesses need them: even as big data analytics technologies and platforms have matured, there has been an increasing recognition that people and skills are just as important in winning with big data. And the essential resource in big data just might be the business analyst.
DATA SCIENTISTS
Viewed in some circles as having "the sexiest job of the 21st century," data scientists are most likely to have advanced degrees and training in math and statistics. They will often lead the deep data-driven expeditions and bold explorations into the largest and most diverse data sets, seeking the subtlest patterns.
 MARKETING PROFESSIONALS
Because so much of the potential value of big data comes from consumer-facing operations, marketing executives can and should ramp up (and rapidly) on the full range of big data practices to optimize digital advertising, customer segmentation and promotional offerings.
Where the big data action (and value) is: once upon a time, big data was viewed as the domain of IT. Today, however, delivering on its game-changing potential means it is very much a front-line, customer-facing phenomenon. And that means marketing, sales and service.

*****
CIO: Big data applications; http://www.cio.com/category/big-data/



How scientists are using big data to discover rare mineral deposits

Searching the earth for valuable mineral deposits has never been easy, but big data is allowing scientists to glean the signal from the noise.

Big data is shaking up the ways our entrepreneurs start their businesses, our healthcare professionals deliver care, and our financial services firms process their transactions. Now, big data's reach has expanded so far that it's revolutionizing the way our scientists search for gas, oil, and even valuable minerals.
Searching under the surface of the earth for valuable mineral deposits has never been easy, but by exploiting recent innovations in big data that allow scientists to glean the signal from the noise, experts are now capable of discovering and categorizing new minerals more efficiently than ever before.

A new type of mining

By mining big data, or crunching huge quantities of numbers to predict trends, scientists are now capable of mapping mineral deposits in new and exciting ways. Network theory, which has been used with great success in fields ranging from healthcare to national security, is one big data tool that scientists are coming to rely on more and more.
In a recent research paper, researchers categorized minerals as nodes and the coexistence of different types of minerals as "lines," or connections. By visualizing their data like this, they created an extraordinarily useful mapping process that could help determine which areas have a higher likelihood of possessing large mineral deposits.
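A hypothetical sketch of that network idea using the networkx library: minerals become nodes, observed coexistence becomes an edge, and well-connected minerals stand out. The mineral pairs below are invented for illustration, not taken from the study.

```python
import networkx as nx

# Invented coexistence observations: each pair was "found together" somewhere.
cooccurrences = [("quartz", "gold"), ("quartz", "pyrite"),
                 ("pyrite", "gold"), ("calcite", "quartz")]

G = nx.Graph()
G.add_edges_from(cooccurrences)

# Degree = how many other minerals each one is found alongside.
print(sorted(G.degree, key=lambda kv: kv[1], reverse=True))
```
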
While researchers used to suffer from limitations on how much data they could process in a given time period, today's computers can effortlessly handle the math while the researchers are freed up to focus on more specialized tasks. As minerals often form in clusters under the surface of the earth, researchers can tap into their computers' predictive analytical capabilities to gain a better understanding of which areas may be dry and which may be literal goldmines.
While geologists often used to rely on luck when figuring out where mineral deposits lay, they can now take their fates into their own hands. The benefits of big data aren't constrained to mere minerals, either; scientists have successfully used big data in similar fashion to find deposits of gold and oil, as well as other resources.

Using data to lower cost

While geologists, miners, and virtually everyone else seeking to make a living off the earth's minerals have relied on data in the past, only recently have innovations made the process of using big data so cheap that it's available to nearly everyone. Goldcorp's CEO stunned the industry in 2000 when he released the company's proprietary data to the public in an effort to harness the public's innovative capabilities.
By offering a prize of a little over a million dollars, Goldcorp ended up discovering more than $6 billion in underground deposits, entirely because of contributors to its competition who relied on big data to map the area and find the valuable treasures stowed away below. As big data's potential continues to grow, crowdsourcing operations like these will become more commonplace, as companies such as QP software come to realize the incredible value of their data and understand that they can use the public to make use of it.
The sophisticated application of big data to create 3D maps is only one of the ways it's fundamentally reshaping the prospecting industry. As companies develop new and greater abilities to categorize the minerals they detect underground, mining operations will find it cheaper and easier than ever before to locate the highly valued prizes they seek. These kinds of developments will come to fill in the gaps that exist with current data-analysis techniques, to the great benefit of the industry and its consumers.
As big data's ability to network and visualize huge sums of information continues to grow, more mineral deposits that have never before been unearthed are likely to be discovered. As advances in chemistry make it easier to determine the makeup of the minerals they uncover, companies will rapidly come to discover deposits in areas they previously overlooked, or which earlier tests determined to be unworthy of their time.
Big data is showing no signs of slowing down as it continues on its crusade to reshape the world as we know it. By tapping into this wondrous phenomenon, industries of all stripes are revolutionizing how they collect and use information to their benefit. While big data doesn't hold the answer to every problem facing the world, it's already made itself invaluable to the public, and will likely continue to grow.
Gary Eastwood has over 20 years' experience as a science and technology journalist, editor and copywriter; writing on subjects such as mobile & UC, smart cities, ICT, the cloud, IoT, clean technology, nanotechnology, robotics & AI and science & innovation for a range of publications. Outside his life as a technology writer and analyst, Gary is an avid landscape photographer who has authored two photography books and ghost-written two others.

****

Data governance in the world of “data everywhere”

With data sources, uses and solutions on the rise, data governance is becoming even more important. But what are the phases of a successful and scalable data governance program?

In the world of "data everywhere," data governance is becoming even more important. Organizations that develop a data warehouse "single source of truth" need data governance to ensure that a Standard Business Language (SBL) is developed and agreed to, and that the various sources of data are integrated with consistent and reliable definitions and business rules. Decisions around who can use what data, and validation that the data being used (and how it's used) meets regulatory and compliance requirements, are also important.
As enterprise data management solutions grow and broaden, incorporating Enterprise Application Integration (EAI), Master Data Management (MDM), increasing use of external data, real-time data solutions, data lakes, the cloud and more, data governance becomes even more important. While there may be value in simply having data, if it isn't accurate, isn't managed and no one can use it, then its value, wherever it resides, diminishes greatly.
The foundational and implementation activities needed to initiate and successfully scale a Data Governance capability remain the same:
·         a discovery phase to assess sentiment, define the current and future data landscape, identify stakeholders, prioritize opportunities (and business value) and focus areas, and start to develop goals and a Data Governance roadmap
·         a foundational implementation phase to put the organization around data governance in place, communicate and educate stakeholders, secure executive support, define metrics for success and begin with an initial project, process or data set
·         a scalable implementation that includes tools, workflows and a focus on continuous improvement
Upcoming articles will describe approaches to each of these phases.  Working through these phases with the desired future state in mind, and with a high level roadmap to get there, will provide you with a greater probability of establishing a data governance capability that will scale in the long run.
Nancy Couture has more than 30 years of experience leading enterprise data management at Fortune 500 companies and midsize organizations in both healthcare and the financial services industries. Nancy is delivery enablement lead for Datasource Consulting in Denver.
****
By Gary Eastwood, CIO | JUL 18, 2017 6:00 AM PT
Opinions expressed by ICN authors are their own.

How big data is driving technological innovation

While businesses have analyzed data for decades, recent developments in computing have opened new doors and unleashed big data’s potential.

Big data analytics, or the collection and analysis of huge sums of data to discover underlying trends and patterns, is increasingly shaking the foundations of the business world. As the field continues to grow at an explosive rate, many innovators are asking themselves how they can exploit big data to optimize their businesses.
While businesses have analyzed data for decades, recent developments in computing have opened new doors and unleashed big data's potential. A report from SNS Research details the breadth of big data's impact; it's now a $57 billion market, and is expected to continue to grow.
So how exactly is big data driving changes in the marketplace, and what does the future of this exciting industry hold?

Big data demands a skilled workforce

Savvy firms are using big data to foster increased consumer engagement, target new audiences with their advertisements, and hone the efficiency of their operations. A company can’t make use of this exciting new technology, however, if they don’t have the necessary human capital to exploit it.
Businesses are increasingly looking for skilled workers intimately familiar with data collection and analysis. These talented data gurus are being scooped up in droves by firms hoping to one day be on the Fortune 500 list, with some firms even employing training to ensure their teams are up to snuff. While college-educated employees are already highly valued, the workplace of tomorrow will demand even greater academic credentials and familiarity with tech from its workers.
Consider Teradata’s 2017 Data and Analytics Trend Report, which highlights the fact that nearly half of global businesses are facing a dearth of employees with data skills. As the gargantuan big data market continues to grow, firms will need more innovative workers who aren’t intimidated by disruptive technologies.
As big data’s capabilities grow to be more impressive, companies will need to ensure their workforce is up to the task of analyzing it to make better decisions. The last thing any innovative firm needs is to be left in the dust due to the lackluster performance of its human employees.

Big data’s disruptive impact

The disruptive nature of big data has led it to revolutionize a number of key industries. The financial industry, which is predicted to become heavily automated in the coming years, now relies on software that can crunch astonishingly large amounts of data to predict market trends and detect inefficiencies in companies' financial operations.
Emerging industries like credit services, autonomous vehicles and smart homes, too, are being fueled by the emergence of big data. The impressive smart cars of tomorrow rely on the collection and interpretation of localized data to avoid crashes and optimize their routes, for instance.
Many existing business behemoths owe their place in the market to big data, as well. Netflix, a service so ubiquitous it's almost taken for granted, reshaped the home entertainment industry largely thanks to its collection and analysis of user data. The company can determine which of its shows will be the most successful in any given market, predict which pilots it should fund, and even forecast how many entertainment awards it may win by crunching ever-growing amounts of data.

Utilizing big data for business success

As more companies and governmental organizations see the benefits of big data, there's no doubt they'll pour more funding into it to better exploit it. Insurance companies eager to determine who among their clients is most likely to get into accidents will employ increasingly advanced algorithms to detect risk. Tech giants like Apple and Google will employ analytics to determine how their latest gadgets might sell among their existing customers. Big data's opportunities are virtually limitless.
IBM's innovation report points to how emerging industries like 3D printing and wearable tech will use big data to detect flaws in their operations or gauge users' opinions on the products they buy. One of the most important factors in a business' success, they point out, is how CEOs and CIOs invest early in analytics to better optimize their firms and forecast the future.
As the internet of things continues to grow at a dizzying pace, firms will find more sources of valuable data waiting to be collected and interpreted. There's an entire marketplace waiting to be exploited by those companies wise enough to invest now in analytical forecasting.
While the visions of the digital world’s future are often grim, highlighting increased levels of automation and pointing to existing markets which may be disrupted, they seldom capture the full potential of big data. This extraordinary phenomenon will soon find itself being used in the manufacturing, marketing, and delivering of virtually every product and service.
Individuals and companies who don't want to be left behind should appreciate the future of the information marketplace, and prepare for it while they can. There's a brave new world waiting to be capitalized on, and it belongs to those who embrace big data.
****


BUILDING A DIGITAL ENTERPRISE: http://www.cio.com/blog/building-a-digital-enterprise/

****

Using Big Data to Hack Autism

Researchers scour datasets for clues to autism—needles in a genetic haystack of 20,000 people
·         By Simon Makin, Spectrum, on July 6, 2017

DATA ANALYTICS (DA):


DEFINITION

data analytics (DA)




Contributor(s): Craig Stedman



Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions and by scientists and researchers to verify or disprove scientific models, theories and hypotheses.
As a term, data analytics predominantly refers to an assortment of applications, from basic business intelligence (BI), reporting and online analytical processing (OLAP) to various forms of advanced analytics. In that sense, it's similar in nature to business analytics, another umbrella term for approaches to analyzing data -- with the difference that the latter is oriented to business uses, while data analytics has a broader focus. The expansive view of the term isn't universal, though: In some cases, people use data analytics specifically to mean advanced analytics, treating BI as a separate category.
Data analytics initiatives can help businesses increase revenues, improve operational efficiency, optimize marketing campaigns and customer service efforts, respond more quickly to emerging market trends and gain a competitive edge over rivals -- all with the ultimate goal of boosting business performance. Depending on the particular application, the data that's analyzed can consist of either historical records or new information that has been processed for real-time analytics uses. In addition, it can come from a mix of internal systems and external data sources.


Types of data analytics applications

At a high level, data analytics methodologies include exploratory data analysis (EDA), which aims to find patterns and relationships in data, and confirmatory data analysis (CDA), which applies statistical techniques to determine whether hypotheses about a data set are true or false. EDA is often compared to detective work, while CDA is akin to the work of a judge or jury during a court trial -- a distinction first drawn by statistician John W. Tukey in his 1977 book Exploratory Data Analysis.
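An illustrative contrast between the two, with invented numbers: EDA summarizes the data to look for structure, while CDA applies a statistical test (here a t-test from SciPy) to a specific hypothesis.

```python
import pandas as pd
from scipy import stats

group_a = pd.Series([12.1, 11.8, 12.6, 13.0, 12.4])  # e.g. response times, variant A
group_b = pd.Series([13.2, 13.5, 12.9, 14.1, 13.8])  # variant B

# EDA: summarize and eyeball the data for patterns.
print(group_a.describe())
print(group_b.describe())

# CDA: test the hypothesis that the two groups share the same mean.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(t_stat, p_value)  # a small p-value argues against "no difference"
```
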
Data analytics can also be separated into quantitative data analysis and qualitative data analysis. The former involves analysis of numerical data with quantifiable variables that can be compared or measured statistically. The qualitative approach is more interpretive -- it focuses on understanding the content of non-numerical data like text, images, audio and video, including common phrases, themes and points of view.
At the application level, BI and reporting provides business executives and other corporate workers with actionable information about key performance indicators, business operations, customers and more. In the past, data queries and reports typically were created for end users by BI developers working in IT or for a centralized BI team; now, organizations increasingly use self-service BI tools that let execs, business analysts and operational workers run their own ad hoc queries and build reports themselves.
More advanced types of data analytics include data mining, which involves sorting through large data sets to identify trends, patterns and relationships; predictive analytics, which seeks to predict customer behavior, equipment failures and other future events; and machine learning, an artificial intelligence technique that uses automated algorithms to churn through data sets more quickly than data scientists can do via conventional analytical modeling. Big data analytics applies data mining, predictive analytics and machine learning tools to sets of big data that often contain unstructured and semi-structured data. Text mining provides a means of analyzing documents, emails and other text-based content.
Data analytics initiatives support a wide variety of business uses. For example, banks and credit card companies analyze withdrawal and spending patterns to prevent fraud and identity theft. E-commerce companies and marketing services providers do clickstream analysis to identify website visitors who are more likely to buy a particular product or service based on navigation and page-viewing patterns. Mobile network operators examine customer data to forecast churn so they can take steps to prevent defections to business rivals; to boost customer relationship management efforts, they and other companies also engage in CRM analytics to segment customers for marketing campaigns and equip call center workers with up-to-date information about callers. Healthcare organizations mine patient data to evaluate the effectiveness of treatments for cancer and other diseases.

Inside the data analytics process

Data analytics applications involve more than just analyzing data. Particularly on advanced analytics projects, much of the required work takes place upfront, in collecting, integrating and preparing data and then developing, testing and revising analytical models to ensure that they produce accurate results. In addition to data scientists and other data analysts, analytics teams often include data engineers, whose job is to help get data sets ready for analysis.


The analytics process starts with data collection, in which data scientists identify the information they need for a particular analytics application and then work on their own or with data engineers and IT staffers to assemble it for use. Data from different source systems may need to be combined via data integration routines, transformed into a common format and loaded into an analytics system, such as a Hadoop clusterNoSQL database or data warehouse. In other cases, the collection process may consist of pulling a relevant subset out of a stream of raw data that flows into, say, Hadoop and moving it to a separate partition in the system so it can be analyzed without affecting the overall data set.



BUILDING A TEAM FOR BIG DATA SUCCESS

Most business people know that big data success takes more than just the latest technology. The right big data strategy (aligned to broader corporate objectives), strong big data processes (in reporting and governance, for example) and a big data culture (with a strong commitment to data-driven decision making) are critical ingredients, too.
Still, big data strategy discussions too often focus on – even obsess over – the enormous data volumes, the dizzying range of data infrastructure options and the shiny new technology du jour. And they almost always overlook one crucial variable: the people who generate the critical insights that reveal game-changing opportunities.

PEOPLE AS BIG DATA DIFFERENCE MAKERS
In fact, having the right people and teams may be the most important big data best practice of all. Yet, according to a 2014 IDG Enterprise survey of 750 IT decision makers, 40% of big data projects are challenged by a skills shortage.
It’s not just a specific technical big data skill set or single discipline that companies need, but rather a range of expertise and knowledge. Yes, technical chops are a must-have. But a broader understanding of big data best practices in specific operational contexts – from sales and service, to finance and the supply chain – is also essential.

Required big data skills and roles for a successful big data strategy and organization:


EXECUTIVE SPONSORS
Senior leaders who can craft a clear vision and rally the troops as to why big data is so important, how it can be used to transform the business and what the major impacts will be; such leaders are also necessary to build data cultures.
Because many big data initiatives are every bit as transformative as other strategic, enterprise-wide change programs, strong senior leadership is an absolute requirement for success. The potential for disruption – in both the positive and negative senses of that word – is high.
Therefore, effective executive sponsorship (very much including the C-suite) may be the biggest and most important big data best practice of them all.
BUSINESS ANALYSTS
People who know the right questions to ask relative to specific operations and functions, with a real focus on performance trends; they will regularly interrogate big data to identify how specific metrics fit within the broader strategic context and relate to megatrends.
Who they are and why businesses need them: Even as big data analytics technologies and platforms have matured, there has been an increasing recognition that people and skills are just as important to winning with big data. And the essential resource in big data just might be the business analyst.
DATA SCIENTISTS
Viewed in some circles as holding “the sexiest job of the 21st century,” data scientists are the most likely to have advanced degrees and training in math and statistics. They will often lead the deep data-diving expeditions and bold explorations into the largest and most diverse data sets, seeking the subtlest patterns.
MARKETING PROFESSIONALS
Because so much of the potential value of big data comes from consumer-facing operations, marketing executives can and should get ramped up (and rapidly) on the full range of big data practices to optimize digital advertising, customer segmentation and promotional offerings.
Where the big data action (and value) is: Once upon a time, big data was viewed as the domain of IT. Today, however, delivering on its game-changing potential makes it very much a front-line, customer-facing phenomenon. And that means marketing, sales and service.

*****
CIO: Big data applications; http://www.cio.com/category/big-data/

Big Data: Big Data news, analysis, research, how-to, opinion, and video


How scientists are using big data to discover rare mineral deposits

Searching the earth for valuable mineral deposits has never been easy, but big data is allowing scientists to glean the signal from the noise.

Big data is shaking up the way entrepreneurs start their businesses, healthcare professionals deliver care, and financial services firms process their transactions. Now, big data’s reach has expanded so far that it’s revolutionizing the way scientists search for gas, oil and even valuable minerals.
Searching under the surface of the earth for valuable mineral deposits has never been easy, but by exploiting recent innovations in big data that allow scientists to glean the signal from the noise, experts can now discover and categorize new minerals more efficiently than ever before.

A new type of mining

By mining big data, or crunching huge sums of numbers to predict trends, scientists are now capable of mapping mineral deposits in new and exciting ways. Network theory, which has been used with great success in fields ranging from healthcare to national security, is one big data tool that scientists are coming to rely on more and more.
As outlined in their research paper, researchers recently categorized minerals as nodes and the coexistence of different types of minerals as “lines,” or connections. By visualizing their data this way, they created an extraordinarily useful mapping process that could help determine which areas have a higher likelihood of possessing large mineral deposits.
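To make the node-and-connection idea concrete, here is a minimal sketch in Python using the networkx library. The observation table and mineral names are hypothetical stand-ins, not the researchers' actual data or code; it only shows how co-occurrence at a site can be encoded as weighted edges and then queried.

import networkx as nx
from itertools import combinations

observations = {                      # hypothetical field observations
    "site_A": {"quartz", "gold", "pyrite"},
    "site_B": {"quartz", "pyrite"},
    "site_C": {"gold", "pyrite", "calcite"},
}

G = nx.Graph()
for minerals in observations.values():
    for m1, m2 in combinations(sorted(minerals), 2):
        # minerals are nodes; co-occurrence at a site adds (or reinforces) an edge
        w = G.get_edge_data(m1, m2, {"weight": 0})["weight"]
        G.add_edge(m1, m2, weight=w + 1)

# Minerals that frequently co-occur with gold hint at where gold-bearing
# deposits might be found.
print(sorted(G["gold"].items(), key=lambda kv: -kv[1]["weight"]))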
While researchers used to be limited by how much data they could process in a given time period, today’s computers can effortlessly handle the math while the researchers focus on more specialized tasks. Because minerals often form in clusters under the surface of the earth, researchers can tap into their computers’ predictive analytical capabilities to gain a better understanding of which areas may be dry and which may be literal goldmines.
While geologists used to rely on luck when figuring out where mineral deposits lay, they can now take their fates into their own hands. The benefits of big data aren’t constrained to minerals, either; scientists have successfully used big data in similar fashion to find deposits of gold and oil, as well as other resources.

Using data to lower cost

While geologists, miners and virtually everyone else seeking to make a living off of the earth’s minerals have relied on data in the past, only recently have innovations made the process of using big data cheap enough that it’s available to nearly everyone. Goldcorp’s CEO stunned the industry in 2000 when he released the company’s proprietary data to the public in an effort to harness the public’s innovative capabilities.
By offering a prize of a little over a million dollars, Goldcorp ended up discovering more than $6 billion in underground deposits, entirely because of contributors to its competition who relied on big data to map the area and find the valuable treasures stowed away below. As big data’s potential continues to grow, crowdsourcing operations like these will become more commonplace, as companies such as QP Software come to realize the incredible value of their data and understand that they can enlist the public to make use of it.
The sophisticated application of big data to create 3D maps is only one of the ways it’s fundamentally reshaping the prospecting industry. As companies develop new and greater abilities to categorize the minerals they detect underground, mining operations will find it cheaper and easier than ever before to locate the highly valued prizes they seek. These kinds of developments will fill in the gaps in current data-analysis techniques, to the great benefit of the industry and its consumers.
As big data’s ability to network and visualize huge sums of information continues to grow, more mineral deposits that have never before been unearthed are likely to be discovered. And as advances in chemistry make it easier to determine the makeup of the minerals they uncover, companies will rapidly discover deposits in areas they previously overlooked, or which earlier tests deemed unworthy of their time.
Big data shows no signs of slowing down as it continues to reshape the world as we know it. By tapping into this phenomenon, industries of all stripes are revolutionizing how they collect and use information. While big data doesn’t hold the answer to every problem facing the world, it has already made itself invaluable, and its influence will likely continue to grow.
Gary Eastwood has over 20 years' experience as a science and technology journalist, editor and copywriter; writing on subjects such as mobile & UC, smart cities, ICT, the cloud, IoT, clean technology, nanotechnology, robotics & AI and science & innovation for a range of publications. Outside his life as a technology writer and analyst, Gary is an avid landscape photographer who has authored two photography books and ghost-written two others.

****

Data governance in the world of “data everywhere”

With data sources, uses and solutions on the rise, data governance is becoming even more important. But what are the phases of a successful and scalable data governance program?

In the world of “data everywhere,” data governance is becoming even more important. Organizations that develop a data warehouse “single source of truth” need data governance to ensure that a Standard Business Language (SBL) is developed and agreed to, and that the various sources of data are integrated with consistent and reliable definitions and business rules. Decisions about who can use which data, and validation that the data being used and the ways it is used meet regulatory and compliance requirements, are important as well.
As enterprise data management solutions grow and broaden to incorporate Enterprise Application Integration (EAI), Master Data Management (MDM), increasing use of external data, real-time data solutions, data lakes, the cloud and more, data governance becomes even more important. There may be value in having data, but if it isn’t accurate, can’t be used and isn’t managed, then its value, wherever it resides, diminishes greatly.
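As a rough illustration of what “consistent definitions and business rules” can look like in practice, here is a small Python sketch. The field names, definitions and rules are purely hypothetical; the point is only that an agreed Standard Business Language can be expressed as something every data source is validated against before it is used.

standard_business_language = {
    "customer_id": {"definition": "Unique enterprise-wide customer key",
                    "rule": lambda v: isinstance(v, str) and len(v) == 10},
    "annual_revenue": {"definition": "Trailing 12-month revenue in USD",
                       "rule": lambda v: isinstance(v, (int, float)) and v >= 0},
}

def validate(record: dict) -> list:
    """Return the SBL fields in a record that violate their agreed rules."""
    return [field for field, spec in standard_business_language.items()
            if field in record and not spec["rule"](record[field])]

print(validate({"customer_id": "CUST000042", "annual_revenue": -5}))
# -> ['annual_revenue']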
The foundational and implementation activities needed to initiate and successfully scale a Data Governance capability remain the same:
·         a discovery phase to assess sentiment, define the current and future data landscape, identify stakeholders, prioritize opportunities (and business value) and focus areas, and start to develop goals and a Data Governance roadmap
·         a foundational implementation phase to put the organization around data governance in place, communicate and educate stakeholders, secure executive support, define metrics for success and begin with an initial project, process or data set
·         a scalable implementation that includes tools, workflows and a focus on continuous improvement
Upcoming articles will describe approaches to each of these phases. Working through these phases with the desired future state in mind, and with a high-level roadmap to get there, will give you a greater probability of establishing a data governance capability that scales in the long run.
Nancy Couture has more than 30 years of experience leading enterprise data management at Fortune 500 companies and midsize organizations in both the healthcare and financial services industries. Nancy is delivery enablement lead for Datasource Consulting in Denver.
****
By Gary Eastwood, star Advisor, CIO | JUL 18, 2017 6:00 AM PT
Opinions expressed by ICN authors are their own.

How big data is driving technological innovation

While businesses have analyzed data for decades, recent developments in computing have opened new doors and unleashed big data’s potential.

Big data analytics, or the collection and analysis of huge sums of data to discover underlying trends and patterns, is increasingly shaking the foundations of the business world. As the field continues to grow at an explosive rate, many innovators are asking themselves how they can exploit big data to optimize their businesses.
While businesses have analyzed data for decades, recent developments in computing have opened new doors and unleashed big data’s potential. A report from SNS Research details the breadth of big data’s impact; it’s now a $57 billion market and is expected to continue to grow.
So how exactly is big data driving changes in the marketplace, and what does the future of this exciting industry hold?

Big data demands a skilled workforce

Savvy firms are using big data to foster increased consumer engagement, target new audiences with their advertisements, and hone the efficiency of their operations. A company can’t make use of this exciting new technology, however, if it doesn’t have the necessary human capital to exploit it.
Businesses are increasingly looking for skilled workers intimately familiar with data collection and analysis. These talented data gurus are being scooped up in droves by firms hoping to one day be on the Fortune 500 list, with some firms even investing in training to ensure their teams are up to snuff. While college-educated employees are already highly valued, the workplace of tomorrow will demand even greater academic credentials and familiarity with tech from its workers.
Consider Teradata’s 2017 Data and Analytics Trend Report, which highlights the fact that nearly half of global businesses are facing a dearth of employees with data skills. As the gargantuan big data market continues to grow, firms will need more innovative workers who aren’t intimidated by disruptive technologies.
As big data’s capabilities grow to be more impressive, companies will need to ensure their workforce is up to the task of analyzing it to make better decisions. The last thing any innovative firm needs is to be left in the dust due to the lackluster performance of its human employees.

Big data’s disruptive impact

The disruptive nature of big data has led it to revolutionize a number of key industries. The financial industry, which is predicted to become heavily automated in the coming years, now relies on software that can crunch astonishingly large amounts of data to predict market trends and detect inefficiencies in companies’ financial operations.
Emerging industries like credit services, autonomous vehicles and smart homes are also being fueled by big data. The impressive smart cars of tomorrow rely on the collection and interpretation of localized data to avoid crashes and optimize their routes, for instance.
Many existing business behemoths owe their place in the market to big data as well. Netflix, a service so widespread it’s almost taken for granted, reshaped the home entertainment industry largely thanks to its collection and analysis of user data. The company can determine which of its shows will be most successful in any given market, predict which pilots it should fund, and even forecast how many entertainment awards it may win by crunching ever-growing amounts of data.

Utilizing big data for business success

As more companies and governmental organizations see the benefits of big data, there’s no doubt they’ll pour more funding into it to exploit it better. Insurance companies eager to determine which of their clients are most likely to get into accidents will employ increasingly advanced algorithms to detect risk. Tech giants like Apple and Google will employ analytics to determine how their latest gadgets might sell among their existing customers. Big data’s opportunities are virtually limitless.
IBM’s innovation report points to how emerging industries like 3D printing and wearable tech will use big data to detect flaws in their operations or gauge users’ opinions of the products they buy. One of the most important factors in a business’s success, it points out, is how early CEOs and CIOs invest in analytics to optimize their firms and forecast the future.
As the internet of things continues to grow at a dizzying pace, firms will find more sources of valuable data waiting to be collected and interpreted. There’s an entire marketplace waiting to be exploited by those companies wise enough to invest now in analytical forecasting.
While visions of the digital world’s future are often grim, highlighting increased levels of automation and pointing to existing markets that may be disrupted, they seldom capture the full potential of big data. This extraordinary phenomenon will soon find itself being used in the manufacturing, marketing and delivery of virtually every product and service.
Individuals and companies who don’t want to be left behind should appreciate the future of the information marketplace and prepare for it while they can. There’s a brave new world waiting to be capitalized on, and it belongs to those who embrace big data.
****


BUILDING A DIGITAL ENTERPRISE: http://www.cio.com/blog/building-a-digital-enterprise/

****

Using Big Data to Hack Autism

Researchers scour datasets for clues to autism—needles in a genetic haystack of 20,000 people
·         By Simon Makin, Spectrum, on July 6, 2017

DATA ANALYTICS (DA)

DEFINITION

Contributor(s): Craig Stedman
Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions and by scientists and researchers to verify or disprove scientific models, theories and hypotheses.
As a term, data analytics predominantly refers to an assortment of applications, from basic business intelligence (BI), reporting and online analytical processing (OLAP) to various forms of advanced analytics. In that sense, it's similar in nature to business analytics, another umbrella term for approaches to analyzing data -- with the difference that the latter is oriented to business uses, while data analytics has a broader focus. The expansive view of the term isn't universal, though: In some cases, people use data analytics specifically to mean advanced analytics, treating BI as a separate category.

Data analytics initiatives can help businesses increase revenues, improve operational efficiency, optimize marketing campaigns and customer service efforts, respond more quickly to emerging market trends and gain a competitive edge over rivals -- all with the ultimate goal of boosting business performance. Depending on the particular application, the data that's analyzed can consist of either historical records or new information that has been processed for real-time analytics uses. In addition, it can come from a mix of internal systems and external data sources.

Types of data analytics applications

At a high level, data analytics methodologies include exploratory data analysis (EDA), which aims to find patterns and relationships in data, and confirmatory data analysis (CDA), which applies statistical techniques to determine whether hypotheses about a data set are true or false. EDA is often compared to detective work, while CDA is akin to the work of a judge or jury during a court trial -- a distinction first drawn by statistician John W. Tukey in his 1977 book Exploratory Data Analysis.
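A minimal Python sketch of the two approaches, using a synthetic, hypothetical customer table: the exploratory steps simply look for structure, while the confirmatory step tests one stated hypothesis. The column names and the "returning customers spend more" hypothesis are invented for illustration.

import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(18, 70, 200),
    "spend": rng.normal(100, 25, 200),
    "segment": rng.choice(["new", "returning"], 200),
})

# Exploratory: look for patterns and relationships without a fixed hypothesis.
print(df.describe())
print(df[["age", "spend"]].corr())

# Confirmatory: test a specific hypothesis, e.g. "returning customers spend more."
new, returning = (df.loc[df.segment == s, "spend"] for s in ("new", "returning"))
t, p = stats.ttest_ind(returning, new, equal_var=False)
print(f"t={t:.2f}, p={p:.3f}")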
Data analytics can also be separated into quantitative data analysis and qualitative data analysis. The former involves analysis of numerical data with quantifiable variables that can be compared or measured statistically. The qualitative approach is more interpretive -- it focuses on understanding the content of non-numerical data like text, images, audio and video, including common phrases, themes and points of view.
At the application level, BI and reporting provides business executives and other corporate workers with actionable information about key performance indicators, business operations, customers and more. In the past, data queries and reports typically were created for end users by BI developers working in IT or for a centralized BI team; now, organizations increasingly use self-service BI tools that let execs, business analysts and operational workers run their own ad hoc queries and build reports themselves.
More advanced types of data analytics include data mining, which involves sorting through large data sets to identify trends, patterns and relationships; predictive analytics, which seeks to predict customer behavior, equipment failures and other future events; and machine learning, an artificial intelligence technique that uses automated algorithms to churn through data sets more quickly than data scientists can do via conventional analytical modeling. Big data analytics applies data mining, predictive analytics and machine learning tools to sets of big data that often contain unstructured and semi-structured data. Text mining provides a means of analyzing documents, emails and other text-based content.
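As one small illustration of the data mining idea, the sketch below uses scikit-learn's k-means algorithm to let the computer surface groupings in synthetic "customer" records. The numbers and the two clusters are invented stand-ins for the kinds of patterns such tools look for, not a recommended configuration.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
# Two hypothetical customer populations: low-spend/low-visit and high-spend/high-visit.
spend = np.concatenate([rng.normal(40, 5, 100), rng.normal(200, 20, 100)])
visits = np.concatenate([rng.normal(2, 1, 100), rng.normal(12, 3, 100)])
X = np.column_stack([spend, visits])

# Let the algorithm find the groupings on its own.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centers (spend, visits):")
print(km.cluster_centers_.round(1))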
Data analytics initiatives support a wide variety of business uses. For example, banks and credit card companies analyze withdrawal and spending patterns to prevent fraud and identity theft. E-commerce companies and marketing services providers do clickstream analysis to identify website visitors who are more likely to buy a particular product or service based on navigation and page-viewing patterns. Mobile network operators examine customer data to forecast churn so they can take steps to prevent defections to business rivals; to boost customer relationship management efforts, they and other companies also engage in CRM analytics to segment customers for marketing campaigns and equip call center workers with up-to-date information about callers. Healthcare organizations mine patient data to evaluate the effectiveness of treatments for cancer and other diseases.

Inside the data analytics process

Data analytics applications involve more than just analyzing data. Particularly on advanced analytics projects, much of the required work takes place upfront, in collecting, integrating and preparing data and then developing, testing and revising analytical models to ensure that they produce accurate results. In addition to data scientists and other data analysts, analytics teams often include data engineers, whose job is to help get data sets ready for analysis.


The analytics process starts with data collection, in which data scientists identify the information they need for a particular analytics application and then work on their own or with data engineers and IT staffers to assemble it for use. Data from different source systems may need to be combined via data integration routines, transformed into a common format and loaded into an analytics system, such as a Hadoop cluster, NoSQL database or data warehouse. In other cases, the collection process may consist of pulling a relevant subset out of a stream of raw data that flows into, say, Hadoop and moving it to a separate partition in the system so it can be analyzed without affecting the overall data set.
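A minimal sketch of that collection and integration step, assuming two small, hypothetical source extracts and using pandas; a real pipeline would land the combined result in a Hadoop cluster, NoSQL database or data warehouse rather than a local file.

import pandas as pd

# Two hypothetical source extracts with inconsistent column names and types.
crm = pd.DataFrame({"cust_id": [1, 2], "signup": ["2017-01-03", "2017-02-11"]})
billing = pd.DataFrame({"CUSTOMER": [1, 2], "amount_usd": ["120.50", "80.00"]})

# Transform to a common format: consistent keys, column names and data types.
billing = billing.rename(columns={"CUSTOMER": "cust_id"})
billing["amount_usd"] = billing["amount_usd"].astype(float)
crm["signup"] = pd.to_datetime(crm["signup"])

# Integrate and hand off to the analytics system (a local file stands in here).
combined = crm.merge(billing, on="cust_id", how="left")
combined.to_csv("analytics_ready.csv", index=False)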

Once the data that's needed is in place, the next step is to find and fix data quality problems that could affect the accuracy of analytics applications. That includes running data profiling and data cleansing jobs to make sure that the information in a data set is consistent and that errors and duplicate entries are eliminated. Additional data preparation work is then done to manipulate and organize the data for the planned analytics use, and data governance policies are applied to ensure that the data hews to corporate standards and is being used properly.
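A small sketch of what profiling and cleansing can look like with pandas, again on invented records; real data quality work also covers standardization, reference-data checks and governance policies beyond what is shown here.

import pandas as pd

df = pd.DataFrame({
    "cust_id": [1, 2, 2, 3],
    "email": ["a@x.com", "b@x.com", "b@x.com", None],
    "amount_usd": [120.5, 80.0, 80.0, -10.0],
})

# Profile: how complete and consistent is each column?
print(df.isna().mean())              # share of missing values per column
print((df["amount_usd"] < 0).sum())  # count of out-of-range values

# Cleanse: drop exact duplicates and set aside rows that break the rules.
clean = df.drop_duplicates()
clean = clean[clean["amount_usd"] >= 0]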

At that point, the data analytics work begins in earnest. A data scientist builds an analytical model, using predictive modeling tools or other analytics software and programming languages such as Python, Scala, R and SQL. The model is initially run against a partial data set to test its accuracy; typically, it's then revised and tested again, a process known as "training" the model that continues until it functions as intended. Finally, the model is run in production mode against the full data set, something that can be done once to address a specific information need or on an ongoing basis as the data is updated.
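A minimal sketch of that cycle with scikit-learn and synthetic data: hold part of the data back to test accuracy, revise until the model behaves as intended, then score the full set. The feature count and model choice here are placeholders, not a recommended configuration.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=12, random_state=1)

# "Training" against a partial data set, holding some back to test accuracy.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=1)
model = LogisticRegression(max_iter=500).fit(X_train, y_train)
print("test accuracy:", round(accuracy_score(y_test, model.predict(X_test)), 3))

# Once it functions as intended, run in production mode against the full set.
scores = model.predict_proba(X)[:, 1]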
In some cases, analytics applications can be set to automatically trigger business actions -- for example, stock trades by a financial services firm. Otherwise, the last step in the data analytics process is communicating the results generated by analytical models to business executives and other end users to aid in their decision-making. That usually is done with the help of data visualization techniques, which analytics teams use to create charts and other infographics designed to make their findings easier to understand. Data visualizations often are incorporated into BI dashboard applications that display data on a single screen and can be updated in real time as new information becomes available.
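As a simple stand-in for a dashboard panel, the sketch below turns invented model scores into a chart with matplotlib; in practice the same result would usually be published through a BI or dashboard tool, and the "churn risk" framing is only an example.

import numpy as np
import matplotlib.pyplot as plt

scores = np.random.default_rng(3).beta(2, 5, 2000)  # hypothetical model scores

plt.hist(scores, bins=30)
plt.title("Predicted churn risk across the customer base")
plt.xlabel("model score")
plt.ylabel("number of customers")
plt.savefig("churn_risk_panel.png")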
***

  

