What is a storage repository that holds a vast amount of raw data in its original format until the business needs it?

The extent of detail within the information (fine and detailed or coarse and abstract).

1) Individual
2) Department
3) Enterprise

1) Document
2) Presentation
3) Spreadsheet
4) Database

Information Granularities

1) Detail (fine)
2) Summary
3) Aggregate (coarse)
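The three granularities above can be illustrated by rolling the same records up step by step. This is a minimal sketch; the transaction rows and region names are made-up sample data, not from any real system.

```python
from collections import defaultdict

# Detail (fine): one row per hypothetical transaction (date, region, amount).
transactions = [
    ("2024-01-05", "East", 120.0),
    ("2024-01-05", "West", 80.0),
    ("2024-01-06", "East", 50.0),
]

# Summary: totals per region.
summary = defaultdict(float)
for _day, region, amount in transactions:
    summary[region] += amount

# Aggregate (coarse): one total for the whole company.
aggregate = sum(summary.values())

print(dict(summary))
print(aggregate)
```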

Four Primary Traits of the Value of Information

1) Information Type
2) Information Timeliness
3) Information Quality
4) Information Governance

Two Primary Types of Information

1) Transactional
2) Analytical

Immediate, up-to-date information.

Provides real-time information in response to requests.

Information Inconsistency

Occurs when the same data element has different values.

Information Integrity Issues

Occurs when a system produces incorrect, inconsistent, or duplicate data.

Five Common Characteristics of High-Quality Information

1) Accurate
2) Complete
3) Consistent
4) Timely
5) Unique

Occurs when a company examines its data to determine if it can meet business expectations, while identifying possible data gaps or where missing data might exist.

The management and oversight of an organization’s data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.

Responsible for ensuring the policies and procedures are implemented across the organization and acts as a liaison between the MIS department and the business.

Refers to the overall management of the availability, usability, integrity, and security of company data.

Master Data Management (MDM)

The practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems.

Includes the tests and evaluations used to determine compliance with data governance policies to ensure correctness of data.

Maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses).

Database Management System (DBMS)

Creates, reads, updates and deletes data in a database while controlling access and security.
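The four operations named above (create, read, update, delete) might look like this in practice. This is only a sketch using Python's built-in sqlite3 module as a stand-in DBMS; the "products" table and its values are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Create
conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("Widget", 9.5))

# Read
price_before = conn.execute(
    "SELECT price FROM products WHERE name = ?", ("Widget",)
).fetchone()[0]

# Update
conn.execute("UPDATE products SET price = ? WHERE name = ?", (10.0, "Widget"))
price_after = conn.execute(
    "SELECT price FROM products WHERE name = ?", ("Widget",)
).fetchone()[0]

# Delete
conn.execute("DELETE FROM products WHERE name = ?", ("Widget",))
remaining = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

conn.close()
```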

Primary Tools Available for Retrieving Information in Databases

1) Query-by-example (QBE) tool
2) Structured query language (SQL)

Query-by-Example (QBE) Tool

Helps users graphically design the answer to a question against a database.

Structured Query Language (SQL)

Users write lines of code to answer questions against a database.
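As a rough illustration of answering a question with SQL, the sketch below asks "Which customers are in Denver?" against a small in-memory database. The "customers" table and its rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical "customers" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Denver"), ("Ben", "Boston"), ("Cara", "Denver")],
)

# The business question, written as a SQL query.
denver_customers = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Denver",)
    )
]
print(denver_customers)
conn.close()
```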

Data Element
(or Data Field)

The smallest or basic unit of information.

Logical data structures that detail the relationships among data elements using graphics or pictures.

Compiles all of the metadata about the data elements in the data model.

DBMS Uses Three Primary Data Models for Organizing Information

1) Hierarchical database
2) Network database
3) Relational database

Relational Database Model

Stores information in the form of logically related two-dimensional tables.

Relational Database Management System

Allows users to create, read, update, and delete data in a relational database.

Primary Concepts of a Relational Database Model

1) entities
2) attributes
3) keys
4) relationships

Stores information about a person, place, thing, transaction, or event.

Attributes
(or Columns or Fields)

The data elements associated with an entity.

Collection of related data elements.

A field (or group of fields) that uniquely identifies a given record in a table.

A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables.
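A small sketch of how a primary key and a foreign key link two tables. The "customers" and "orders" tables, their columns, and the sample rows are hypothetical; SQLite is used here only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,          -- primary key
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),  -- foreign key
    total       REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
""")

# The foreign key provides the logical relationship used by the join.
orders_by_customer = conn.execute(
    "SELECT c.name, COUNT(o.order_id), SUM(o.total) "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.customer_id "
    "GROUP BY c.customer_id ORDER BY c.name"
).fetchall()
print(orders_by_customer)
conn.close()
```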

Business Advantages of a Relational Database

1) Increased flexibility
2) Increased scalability and performance
3) Reduced information redundancy
4) Increased information integrity
5) Increased information security

Physical View of Information

The physical storage of information on a storage device.

Logical View of Information

Shows how individual users logically access information to meet their own particular business needs.

The time it takes for data to be stored or retrieved.

The duplication of data, or the storage of the same data in multiple places.

A measure of the quality of information.

Rules that help ensure the quality of information.

Two Types of Integrity Constraints

1) Relational
2) Business-critical

Relational Integrity Constraint

Rules that enforce basic and fundamental information-based constraints.

Defines how a company performs a certain aspect of its business and typically results in either a yes/no or true/false answer.

Business-Critical Integrity Constraint

Enforces business rules vital to an organization’s success and often requires more insight and knowledge than relational integrity constraints.
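The two kinds of constraint can be contrasted in a short sketch. The relational constraints here (an order must reference an existing customer, and a quantity must be positive) are enforced by the database itself; the business-critical rule is written as ordinary code. The table names and the 15-day return window are illustrative assumptions, not rules taken from any real company.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " quantity INTEGER NOT NULL CHECK (quantity > 0))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana')")


def try_insert_order(customer_id, quantity):
    """Return True if the database accepts the order, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
            (customer_id, quantity),
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid_order = try_insert_order(1, 2)      # existing customer, positive quantity
orphan_order = try_insert_order(99, 2)    # no such customer: foreign key violation
negative_order = try_insert_order(1, -5)  # CHECK constraint violation


def return_allowed(purchased, returned, window_days=15):
    """Hypothetical business-critical rule: returns only within window_days of purchase."""
    return (returned - purchased).days <= window_days
```

For example, `return_allowed(date(2024, 1, 1), date(2024, 1, 10))` evaluates to `True`, while a return on `date(2024, 2, 1)` is rejected.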

A broad administrative area that deals with identifying individuals in a system (such as a country, a network, or an enterprise) and controlling their access to resources within that system by associating user rights and restrictions with the established identity.

The person responsible for creating the original website content.

The person responsible for updating and maintaining website content.

Includes fixed data that are not capable of change in the event of a user action.

Includes data that change based on user actions.

An area of a website that stores information about products in a database.

An interactive website kept constantly updated and relevant to the needs of its customers using a database.

Reasons Business Analysis is Difficult from Operational Databases

1) Inconsistent data definitions
2) Lack of data standards
3) Poor data quality
4) Inadequate data usefulness
5) Ineffective direct data access

A central location in which data is stored and managed.

A logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks.

Data Warehousing Components

1) Data mart
2) Information cleansing
3) Business intelligence

The collection of data from various sources for the purpose of data processing.

Extraction, Transformation, and Loading (ETL)

A process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse.
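The three ETL steps can be sketched in a few lines. This is a toy example under stated assumptions: the two source lists, their field names, and the "sales" warehouse table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Extract: two hypothetical sources that record the same kind of sale differently.
store_sales = [{"cust": "ana", "amount_usd": "19.99"}, {"cust": "Ben", "amount_usd": "5.00"}]
web_sales = [{"customer_name": "CARA", "amount_cents": 1250}]


def transform(record):
    """Map either source layout onto one common definition: (customer, amount in dollars)."""
    if "amount_cents" in record:
        return record["customer_name"].title(), record["amount_cents"] / 100
    return record["cust"].title(), float(record["amount_usd"])


# Transform: apply the common definition to every extracted record.
rows = [transform(r) for r in store_sales + web_sales]

# Load: write the uniform rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?)", rows)
total = warehouse.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
warehouse.close()
```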

Contains a subset of data warehouse information.

Erroneous or flawed data.

1) Duplicate data
2) Misleading data
3) Incorrect data
4) Non-formatted data
5) Violates business rules data
6) Non-integrated data
7) Inaccurate data

Information Cleansing or Scrubbing

A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
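A minimal cleansing sketch, assuming made-up contact records: it fixes inconsistent formatting, discards incomplete rows, and drops duplicates. Real cleansing tools apply far richer rules; the `cleanse` function here is only illustrative.

```python
raw = [
    {"name": " Ana Ruiz ", "email": "ANA@EXAMPLE.COM"},
    {"name": "Ana Ruiz", "email": "ana@example.com"},    # duplicate of the first row
    {"name": "Ben Lee", "email": ""},                    # incomplete: no email
    {"name": "cara diaz", "email": "cara@example.com"},  # inconsistent capitalization
]


def cleanse(records):
    """Normalize formatting, discard incomplete rows, and keep one row per email."""
    seen = set()
    clean = []
    for rec in records:
        name = " ".join(rec["name"].split()).title()  # fix spacing and capitalization
        email = rec["email"].strip().lower()
        if not name or not email:
            continue  # discard incomplete information
        if email in seen:
            continue  # discard duplicates
        seen.add(email)
        clean.append({"name": name, "email": email})
    return clean


cleaned = cleanse(raw)
print(cleaned)
```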

An individual item on a graph or a chart.

A business that collects personal information about consumers and sells that information to other organizations.

A storage repository that holds a vast amount of raw data in its original format until the business needs it.

Identifies the primary locations where data is collected.

Examples: invoices, spreadsheets, time-sheets, transactions, and electronic sources such as other databases.

An organized collection of data.

Compares two or more data sets to identify patterns and trends.

Comparative Analysis Decisions can be Based On

1) data sets
2) experience
3) knowledge
4) combination of all three

When a company keeps tabs on its competitor’s activities on the web using software that automatically tracks all competitor website activities such as discounts and new products.

A technique for establishing a match, or balance, between the source data and the target data warehouse.

Data-Driven Decision Management

An approach to business governance that values decisions that can be backed up with verifiable data.

Four Common Characteristics of Big Data

1) Variety
2) Veracity
3) Volume
4) Velocity

Different forms of structured and unstructured data.

The uncertainty of data, including biases, noise and abnormalities.

The analysis of streaming data as it travels around the internet.

Processes and manages algorithms across many machines in a computing environment.

Creates multiple “virtual” machines on a single computing device.

Business Focus Areas of Big Data

1) Data Mining
2) Data Analysis
3) Data Visualization

The process of analyzing data to extract information not offered by the raw data alone.

Data Mining Process Model Overview

1) Business understanding
2) Data understanding
3) Data preparation
4) Data modeling
5) Evaluation
6) Deployment

Gain a clear understanding of the business problem that must be solved and how it impacts the company.

Analysis of all current data along with identifying any data quality issues.

Gather and organize the data in the correct formats and structures for analysis.

Apply mathematical techniques to identify trends and patterns in the data.

Analyze the trends and patterns to assess the potential for solving the business problem.

Deploy the discoveries to the organization for work in everyday business.

The process of collecting statistics and information about data in an existing source.

The process of sharing information to ensure consistency between multiple data sources.

A data-mining algorithm that analyzes a customer’s purchases and actions on a website and then uses the data to recommend complementary products.

1) Estimation Analysis
2) Affinity Grouping Analysis
3) Cluster Analysis
4) Classification Analysis

Determines values for an unknown continuous variable behavior or estimated future value.

Predict numeric outcomes based on historical data.

Affinity Grouping Analysis

Reveals the relationship between variables along with the nature and frequency of the relationships.

Affinity Grouping Algorithms

Association rule generators which create rules to determine the likelihood of events occurring together at a particular time or following each other in a logical progression.

Evaluates such items as websites and checkout scanner information to detect customers’ buying behavior and predict future behavior by identifying affinities among customers’ choices of products and services.
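One simple way to look for affinities among purchases is to count how often items appear together and compute the confidence of a rule such as "customers who buy butter also buy bread". The baskets below are invented sample data, and this counting approach is a simplified sketch rather than a full association rule generator.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
]

# How often each item, and each pair of items, appears across baskets.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(confidence("butter", "bread"))
print(confidence("bread", "butter"))
```

In this sample every basket with butter also has bread, while only two of the three bread baskets have butter, so the rule is stronger in one direction than the other.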

A technique used to divide information sets into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible.
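As one possible illustration of this technique, the sketch below uses a bare-bones one-dimensional k-means procedure to split customers into two mutually exclusive groups by spending. The spending figures and the starting centroids are assumptions chosen for the example; k-means is only one of several clustering methods.

```python
def k_means_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then move centroids to group means."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a group ends up empty.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return groups, centroids


# Hypothetical monthly spend per customer.
spend = [12, 15, 14, 80, 85, 90]
groups, centers = k_means_1d(spend, centroids=[0.0, 100.0])
print(groups)
print(centers)
```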

The process of organizing data into categories or groups for its most effective and efficient use.

Uses a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making.

A statement about what will happen or might happen in the future, for example, predicting future sales or employee turnover.

Data Mining Modeling Techniques for Predictions

1) Optimization model
2) Forecasting model
3) Regression model

A statistical process that finds the way to make a design, system or decision as effective as possible, for example, finding the values of controllable variables that determine maximal productivity or minimal waste.

Predictions based on time-series information allowing users to manipulate the time series for forecasting activities.

Includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables.
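The simplest case, one dependent variable and one independent variable, can be sketched with ordinary least squares. The advertising and sales numbers below are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (independent) and sales (dependent).
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)
```

Here the fitted line is sales = 2 * ad_spend + 1, since the sample points lie exactly on that line.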

Time-stamped information collected at a particular frequency.
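A very simple forecast from time-series information is a moving average of the most recent periods. The sketch below assumes invented monthly sales figures and a three-month window; real forecasting models are usually more elaborate.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window


# Hypothetical monthly sales, collected once per month.
monthly_sales = [100, 110, 120, 130]
next_month = moving_average_forecast(monthly_sales)
print(next_month)
```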

The common term for the representation of multidimensional information.

1) the layers
2) the rows
3) the columns

Mathematical formulas placed in software that performs an analysis on a data set.

The science of fact-based decision making that uses software-based algorithms and statistics to derive meaning from data.

The process of identifying rare or unexpected items or events in a data set that do not conform to other items in the data set.

Data value that is numerically distant from most of the other data points in a set of data.
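One common way to flag such values is a z-score test: measure how many standard deviations each point lies from the mean. The readings and the threshold of 2.0 below are assumptions chosen for the example, and other detection rules exist.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing is distant
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical readings with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_outliers(readings)
print(outliers)
```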

The application of big data analytics to smaller data sets in near-real or real-time in order to solve a problem or create business value.

Extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes and other relevant information.

1) Behavioral Analysis
2) Correlation Analysis
3) Exploratory Data Analysis
4) Pattern Recognition Analysis
5) Social Media Analysis
6) Speech Analysis
7) Text Analysis
8) Web Analysis

Infographics (Information Graphics)

Presents the results of data analysis, displaying the patterns, relationships, and trends in a graphical format.

A business analytics specialist who uses visual tools to help people understand complex data.

Occurs when the user goes into an emotional state of over-analysis (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome.

Describes technologies that allow users to “see” or visualize data to transform information into a business perspective.

Moves beyond Excel graphs and charts into sophisticated analysis techniques such as pie charts, controls, instruments, maps, time-series graphs, etc.

Business Intelligence Dashboard

Tracks corporate metrics such as critical success factors and key performance indicators and includes advanced capabilities such as interactive controls, allowing users to manipulate data for analysis.

What is a storage repository that holds a vast amount of raw data in its original format until the business needs it?

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications. While a traditional data warehouse stores data in hierarchical dimensions and tables, a data lake uses a flat architecture to store data, primarily in files or object storage.

What is a storage repository that holds a vast amount of raw data in its original format until the business needs it: data broker, data lake, data map, or data point?

If you're not already familiar with the term, a “data lake” is generally defined as an expansive collection of data that's held in its original format until needed. Data lakes are repositories of raw data, collected over time, and intended to grow continually.

What is a storage repository that holds?

A storage repository is essentially logical disk space made available through a file system on top of physical storage hardware.

What is a collection of large complex data?

Big data is commonly characterized by greater variety, increasing volume, and more velocity, also known as the three Vs. Put simply, big data is larger, more complex data sets, especially from new data sources. These data sets are so voluminous that traditional data processing software just can't manage them.